Printing of unsigned numbers

That sounds like a perfect use case for FixedPointNumbers. Whether the C lib uses unsigned integers doesn’t matter if the backing type is the same.

It matters because each hexadecimal digit corresponds to exactly 4 bits, which is very useful if you’re manipulating bits, and important numbers like typemax(T) have obvious representations.


Digits? But are you using printed text? As I said above, sorry but I’ve no idea on what you guys are talking about.

Say you wanted to extract the 4 most significant bits of a UInt8. In hexadecimal you would write this as x & 0xf0. In decimal that would be x & 240. Can you really tell from the latter what’s going on at the bit level? Printing this number as 0xf0 makes the bit pattern much more obvious: if you want to know the value of the nth bit (counting from 0 at the least significant bit), you always know you have to look at the (n÷4)th hex digit, counted from the right.
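To make the masking example concrete, here is a minimal sketch. The value 0xb7 and the variable names are arbitrary choices for illustration, and bits are assumed to be counted from 0 at the least significant end.

```julia
x = 0xb7                       # UInt8 with bit pattern 1011 0111

# Keep the 4 most significant bits. The hex mask shows which bits survive.
top = x & 0xf0                 # 0xb0
same = x & 240                 # identical mask written in decimal (result is an Int)

# Bit n lives in hex digit n ÷ 4, counting digits from the right.
n = 5
digit = (x >> (4 * (n ÷ 4))) & 0x0f    # the hex digit that holds bit n (0x0b)
bit_from_digit = (digit >> (n % 4)) & 0x01
bit_direct = (x >> n) & 0x01

println(repr(top), " ", bitstring(top))
```

Both ways of reading bit 5 agree, and the hex digit `b` (1011) can be decoded by eye, which is the point being made about 0xf0 versus 240.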


As @simeonschaub says, that’s not a reason to avoid the FixedPointNumber types, because their bit-level representation is Unsigned and that’s all the C code sees. You get two other advantages:

  • conversion between floating-point and integer types “just works”: normally you’re supposed to convert a UInt8 image to floating point by dividing by 255, and a UInt16 image by dividing by 65535. In contrast, if you use FixedPointNumber types, it’s always just convert(T, x) regardless of T and x.
  • does anyone in your field ever use 10- or 12-bit cameras? If so, how do you detect saturated pixels? Normally these are represented as the low bits of a UInt16, and so the heuristic x == 65535 doesn’t work for a 10- or 12-bit camera. You can do it if you pass camera metadata down to every function that might use this, but this is insanely awkward. In contrast, FixedPointNumbers contains the N6f10 and N4f12 types, and the check for saturation is always just x == oneunit(x).
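For readers who have not hit these two problems, here is a plain-Julia sketch of what the bullets describe. It deliberately does not use FixedPointNumbers; the helper names `to_unit`, `is_saturated_naive` and `is_saturated` are hypothetical and exist only for this illustration.

```julia
# Converting raw intensities to the range 0..1 needs a different divisor per type.
to_unit(x::UInt8)  = x / 255
to_unit(x::UInt16) = x / 65535

# A saturated pixel from a 10-bit camera, stored in the low bits of a UInt16.
raw = UInt16(1023)             # 0x03ff

# The usual heuristic assumes all 16 bits are used, so it misses this pixel.
is_saturated_naive(x::UInt16) = x == typemax(UInt16)

# The correct check needs the bit depth passed in as extra metadata.
is_saturated(x::UInt16, nbits::Integer) = x == (UInt16(1) << nbits) - 0x0001

println(is_saturated_naive(raw), " ", is_saturated(raw, 10))
```

The `nbits` argument is exactly the camera metadata that would otherwise have to be threaded through every function. With a type such as N6f10, that information is carried by the element type instead.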

This is some random text to circumvent discourse’s blocking me from posting it (I first posted in the other conversation, then a refresh showed this topic had been split out, so I deleted the post without realizing discourse had also transferred my reply. Once deleted, discourse didn’t like me attempting to re-post the same content.)


OK, I removed that pirate show because I don’t want to push it onto other people, but I will have to find a replacement, because I simply cannot understand why Unsigneds are printed in an illegible way (c’mon, who can read hexadecimal).

Regarding the invalidations, I will try to dig into them, but with low expectations. Anyway, thanks for pointing them out.


Sorry, I meant to move this post to the top of this thread since it came first chronologically, and was in response to:

I kinda take offense at that. I had to print floating-point numbers in hexadecimal in my thesis to guarantee 100% reproducibility, because I was running into problems with chaotic effects. It is not that much of a problem.

Presumably you didn’t do your thesis in Julia?
Julia uses Ryu for printing floating point.

Ryu generates the shortest decimal representation of a floating point number that maintains round-trip safety.
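A quick way to see the round-trip property in practice (a small sketch; the sample values are arbitrary):

```julia
x = 0.1 + 0.2
s = string(x)                  # shortest decimal string that still identifies x

# Parsing the printed text gives back exactly the same bits.
roundtrip_ok = parse(Float64, s) === x

# The same holds for a batch of random values.
all_ok = all(parse(Float64, string(v)) === v for v in rand(1000))

println(s, " ", roundtrip_ok, " ", all_ok)
```

So decimal output is already lossless for Float64 in Julia, and hexadecimal floats are not needed for reproducibility here.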


No, I will not, both because I am using Julia but also because I am not working in the same subject anymore, XD

So, according to your example/experience, all floating-point numbers should be printed in hexadecimal by default :slightly_smiling_face:

Of course there are several reasons to print in hex, but what I don’t understand is why we are all obliged to see unsigned numbers printed in hexadecimal by default.

I think the disconnect here is that for many of us “unsigned” does not mean “mainly used for counters” or “mainly used for intensities”, where “positive base-10 number” is what you frequently want. As Tim mentioned, if you just need a counter, use a signed integer; if you need an intensity, use something with more convenient convert properties like FixedPointNumbers.

Rather, for all the people that are chiming in here, it is much more common to approach “unsigned” as the type to use for bitmasks. It is my belief that your use case for “unsigned” is the minority, and my use case is in the majority, hence it makes a lot of sense to me that the default printing style is hexadecimal.

On the other hand, Python’s ctypes, NumPy, and TensorFlow agree with your preferred style (they print unsigned numbers in base 10). Python’s bytes and bytearray are on the fence, using either hex or ASCII, but not base 10. I have not checked Python’s array and buffer interfaces for their conventions, nor PyTorch and JAX.


No, that was not my point. I just wanted to say that hexadecimal is not that hard to understand. Maybe I should have only quoted “(c’mon, who can read hexadecimal)”. In C++, I maybe would prefer that behavior, to guarantee that serialization destined to reuse would end up lossless without any extra work/attention, but in Julia Ryu takes care of that.

However, I prefer the current behavior of printing unsigned as hex (it just has nothing to do with my previous comment). Printing in hex makes it clear the number is unsigned, and when you are using unsigned values, most of the time you want to check them against some mask, which is easier this way. Also, if you print a sequence of UInt8 values, you can just concatenate them and you have the larger number they would represent as UInt{16,32,64}, which I also find practical.
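The concatenation trick can be checked with a short sketch. The byte values are made up, and the bytes are assumed to be in most-significant-first order.

```julia
bytes = UInt8[0x12, 0x34, 0xab, 0xcd]

# Concatenate the two-digit hex form of each byte.
hex = join(string.(bytes; base = 16, pad = 2))

# Build the corresponding UInt32 by shifting each byte into place.
combined = foldl((acc, b) -> (acc << 8) | b, bytes; init = UInt32(0))

println(hex, " ", repr(combined))
```

The hex digits of `combined` are the per-byte digits glued together, which does not work with the decimal values 18, 52, 171 and 205.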

Almost all of my use for <:Unsigned benefits from their hexadecimal display. To see decimal digits, I use

dec(x::Unsigned) = Int128(x)
dec(x::UInt128) = BigInt(x)

Not so easy when it is VSCode that controls the display. The wrapped types use unsigned as counters because that’s what they are on the C side, and converting them to FixedPoint is not a solution here (they are used in loops).

Just one personal preference here, but I disagree with avoiding unsigned for counters, and I found the way Julia prints unsigned jarring at first. If you know your variable can’t be less than 0, using an unsigned communicates that intent.

I say found in the past tense, because I’ve got used to it, and I understand that bitmasks are a common usage of unsigned, which are easier to understand in hex. When printing, there’s usually an option to pick the base. Might be nice to be able to set a global for non performance critical printing, like show perhaps? I wouldn’t be surprised if someone has already written a package to do just that :laughing:
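For reference, picking the base at the call site looks like this in Base Julia (a small sketch; the value is arbitrary):

```julia
x = 0xf7

dec_str = string(x)                    # decimal is the default for string
hex_str = string(x; base = 16)         # hex digits without the 0x prefix
bin_str = string(x; base = 2, pad = 8) # binary, zero-padded to 8 digits

println(dec_str, " ", hex_str, " ", bin_str)
```

This only affects the strings you build yourself. It does not change how the REPL or an editor displays values.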


Seems this was already discussed here


That seems to be a VSCode problem: it does not give you a way to customize this printing. Did your type piracy fix this problem? If yes, then you could have it only in your development code, so as to make debugging in your specific workflow easier, but not include it in the production code.
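If piracy is a concern, one alternative is a small wrapper type that owns its own show method. This is only a sketch: the name `Dec` is hypothetical, and whether it helps with the VSCode display depends on being able to wrap the values before they are shown.

```julia
# Hypothetical wrapper: displays an unsigned value in decimal without
# redefining Base.show for types this code does not own.
struct Dec{T<:Unsigned}
    value::T
end

# print on an unsigned integer already uses decimal, so delegate to it.
Base.show(io::IO, d::Dec) = print(io, d.value)

a = UInt8[0xf7 0xeb; 0x02 0x8c]
wrapped = Dec.(a)              # wrap every element for display purposes

println(repr(Dec(0xf7)))
```

Because the method is defined on a type from your own code, it cannot cause invalidations or change printing for anyone else.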


I’ve added the pirate line to my startup.jl file and now they all look more familiar to my eyes :slightly_smiling_face:

But I still find strange things around the unsigneds. Why does print show scalars in decimal and arrays in hex?

julia> a = rand(UInt8, 2,2)
2×2 Matrix{UInt8}:
 0xf7  0xeb
 0x02  0x8c

julia> print(a[1])
247
julia> print(a)
UInt8[0xf7 0xeb; 0x02 0x8c]

This is a fun one; it reminds me of the differences and inconsistencies between str and repr in Python:

julia> a = rand(UInt8, 2,2)
2×2 Matrix{UInt8}:
 0x41  0x91
 0x63  0x71

julia> a[1]
0x41

julia> print(a[1])
65
julia> repr(a[1])
"0x41"

Back in 2015, it was decided to make print use decimal for unsigned values: unsigned integers interpolate in hex format · Issue #3450 · JuliaLang/julia · GitHub

However, print of an array calls the fallback method of print that calls show. Probably the reasoning is similar to this discussion of tuple printing (which also uses show), where @jeff.bezanson wrote:

The reason we do this is that numbers have canonical printed representations, but tuples do not. As soon as we’re printing ( ,) , we’re using julia syntax and might as well stick with it. If we printed (17185,) , that would be a representation of a julia object, but of a different julia object, which seems like a strange thing to do.
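The behaviours discussed above can be collected in one sketch, with the output captured as strings so the differences are easy to compare (the value 0x41 is taken from the REPL session earlier in the thread):

```julia
x = 0x41

scalar_print  = string(x)                  # print path for a scalar: decimal
scalar_show   = repr(x)                    # show path for a scalar: Julia literal syntax
interpolated  = "$x"                       # interpolation uses the print path
array_print   = sprint(print, UInt8[x])    # print of an array falls back to show
tuple_print   = sprint(print, (x,))        # same for tuples

println(scalar_print, " ", scalar_show, " ", array_print, " ", tuple_print)
```

Only the bare scalar goes through the decimal print method. Containers are printed as Julia syntax, so their elements appear as hex literals.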
