Showing a BitVector in NamedTuple

I’ve found a bit of an inconsistency, probably related to how a NamedTuple is shown:

julia> v = [1, 0, 1.];

julia> nt = (bitvec = Bool.(v),) # the following output is not accurate
(bitvec = Bool[1, 0, 1],)

julia> nt.bitvec
3-element BitVector:
 1
 0
 1

julia> nt.bitvec::Vector{Bool}
ERROR: TypeError: in typeassert, expected Vector{Bool}, got a value of type BitVector
Stacktrace:
 [1] top-level scope
   @ REPL[4]:1

julia> bitvec = Bool[1, 0, 1]
3-element Vector{Bool}:
 1
 0
 1

julia> bitvec::Vector{Bool}
3-element Vector{Bool}:
 1
 0
 1

could this be somehow improved?

The issue is not necessarily related to NamedTuples. It’s simply that the short-form representation of a BitVector is Bool[...]:

julia> println(nt.bitvec)  # show(stdout, nt.bitvec)
Bool[1, 0, 1]

julia> display(nt.bitvec)  # show(stdout, MIME("text/plain"), nt.bitvec)
3-element BitVector:
 1
 0
 1

I would assume it’s possible to change this to e.g. BitVector[1, 0, 1] without too much trouble.

See also Show container type by default when showing AbstractArrays that are not Arrays by LilithHafner · Pull Request #48829 · JuliaLang/julia · GitHub.


So long as this is not executable anyway, why not Bit[1, 0, 1]?
That way we would know it is something other than Bool[1, 0, 1].


There isn’t a requirement for print-ed expressions to evaluate to an equal object, or to evaluate at all, even if that would be preferable. For AbstractArrays, the printed form seems to prioritize the element type and values over the exact concrete type.

julia> Number[1,2.2] |> println
Number[1, 2.2]

julia> Number[1,2.2] |> display
2-element Vector{Number}:
 1
 2.2

julia> using StaticArrays  # SVector is from the StaticArrays.jl package

julia> SVector{2}(Number[1, 2.2]) |> println
Number[1, 2.2]

julia> SVector{2}(Number[1, 2.2]) |> display
2-element SVector{2, Number} with indices SOneTo(2):
 1
 2.2

I disagree with these because the appearance of an array literal implies an element type of BitVector or the nonexistent Bit, and the expressions throw errors if evaluated. Bool.([1, 0, 1]) or BitVector([1, 0, 1]) (as suggested in the linked issue) both do that job now, though I’m concerned that could be unreliable or misleading somehow.
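For reference, a quick check (assuming a recent Julia version) of how these forms behave:

```julia
# the literal form implies a Vector{BitVector}, so it fails if evaluated:
# BitVector[1, 0, 1]   # throws a conversion MethodError

# both working alternatives evaluate back to an equal BitVector:
v = BitVector([1, 0, 1])
Bool.([1, 0, 1]) isa BitVector   # true; broadcasting Bool already yields a BitVector
Bool.([1, 0, 1]) == v            # true
```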


Also, there is

julia> v = BitVector([1, 0, 1]); show(v)
Bool[1, 0, 1]

The docstring for show (with one argument) says:

The representation used by show […] should be parseable Julia code when possible.

I find the possibility to generate parseable output very useful – more precisely, parseable output that reproduces the argument to show. I think show(v) should give BitVector([1, 0, 1]) in this case.
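Purely as an illustration of what that could look like (note that overloading a Base method on a Base type like this is type piracy, so the change would belong in Base itself, not user code):

```julia
# hypothetical sketch of a parseable 1/2-argument show for BitVector;
# doing this in user code would be type piracy, shown only for illustration
function Base.show(io::IO, v::BitVector)
    print(io, "BitVector([")
    join(io, Int.(v), ", ")
    print(io, "])")
end

show(BitVector([1, 0, 1]))   # prints BitVector([1, 0, 1])
```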


If we’re talking more about reflection than printing, it’s worth pointing out that dump recursively digs into structs up to a maxdepth (default 8), until it reaches a few types of values that it prints directly, possibly inside arrays.

julia> struct Blah
         b::BitVector
         c::Complex{Int}
       end

julia> Blah([1,0], 1im) |> println # defaults to the Bool[...] issue
Blah(Bool[1, 0], 0 + 1im)

julia> Blah([1,0], 1im) |> display
Blah(Bool[1, 0], 0 + 1im)

julia> Blah([1,0], 1im) |> dump
Blah
  b: BitVector
    chunks: Array{UInt64}((1,)) UInt64[0x0000000000000001]
    len: Int64 2
    dims: Tuple{Int64}
      1: Int64 6
  c: Complex{Int64}
    re: Int64 0
    im: Int64 1

Note that dump still falls back to print for some structs in arrays, like Complex, which is not its default recursive behavior.

julia> Complex{Int}.(1:2, 1:2) |> dump
Array{Complex{Int64}}((2,)) Complex{Int64}[1 + 1im, 2 + 2im]

julia> struct MyComplex{T}
         re::T
         im::T
       end

julia> MyComplex{Int}.(1:2, 1:2) |> dump
Array{MyComplex{Int64}}((2,))
  1: MyComplex{Int64}
    re: Int64 1
    im: Int64 1
  2: MyComplex{Int64}
    re: Int64 2
    im: Int64 2

Why though? I don’t think getting the type to match is worth the extra method and line noise.

I think that for testing and debugging it’s helpful if one can show any variable in a form that one can feed back to Julia via copy-paste. This function may or may not be named show. Unlike for the other variants of show, the goal would not primarily be readability by users.

For some things, this may be much longer than you might expect. For example, non-default NaN bitpatterns would need some form of reinterpret from an unsigned integer added to its printing in order to get the same instance of e.g. Float64 back.
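To make that concrete, a sketch of the round-trip problem for a non-default NaN bitpattern:

```julia
# a NaN with a non-default payload: exponent bits all set, significand
# nonzero, but a different bitpattern from the NaN constant
x = reinterpret(Float64, 0x7ff8000000000001)

isnan(x)                  # true
x === NaN                 # false; different bitpattern
repr(x)                   # "NaN", so the payload is lost in printing
reinterpret(UInt64, x)    # 0x7ff8000000000001, only this round-trips
```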


What is a NaN bitpattern? How does it come into being?

That’s not possible or useful in many cases.

  1. Evaluating expressions to instantiate a mutable type semantically makes a different object from the original input, and sometimes an equal object is not enough. This applies to this example too; we don’t always want a copy of the array.
  2. Some types deliberately omit constructors, so the remaining expressions that do instantiate them do a very poor job of showing the values.
  3. Some values aren’t intended to be instantiated normally. These may print as explanatory expressions (e.g. enum display), qualified global variables (e.g. enum print), unqualified global variables (e.g. NaNs), or some combination (e.g. function display). In many cases, the global variables wouldn’t be imported into the module where the print call occurs.
  4. Values with high precision are truncated for reasonable printouts (fractional sums of powers of 2 have a trailing 5 in decimal, which we don’t often see). This isn’t a problem for Base floating point types because enough digits are printed to be unambiguously closest to the original value, but that’s not necessarily the case for other types. We could just as easily decide that a struct containing several floating point values can’t afford to print as many digits.
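Point 4 can be checked with a Base float, where the printed digits are only the shortest decimal that parses back to the same value, not the exact stored value (a small sketch):

```julia
x = 0.1
repr(x)                         # "0.1", the shortest round-trippable decimal
parse(Float64, repr(x)) === x   # true: Base floats round-trip through printing
big(x) == big"0.1"              # false: the stored value isn't exactly 0.1
```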

Printing then copy-pasting is just a generally insufficient way to access or copy values, that’s what returns and assignments in the actual program are for. The main purpose is to give us useful information within a typical screen, which is why there are debates on how much to show e.g. whether component arrays should show their concrete type.

The IEEE 754 standard for floating point types reserves some values for things that aren’t real numbers. NaN is “not a number” at all, which can be used for many purposes like missing values or invalid operations, though languages that adhere to IEEE 754 may choose to handle those cases differently. Julia does still make 0/0 produce a NaN value, though it’s NOT the same NaN value assigned to Base.NaN for convenience.

But NaN will not occur in a BitVector?

julia> a = Bool.([0, 1, 0]);

julia> a[1] = a[1] / a[3]
ERROR: InexactError: Bool(NaN)

julia> a[1] = a[2] / a[3]
ERROR: InexactError: Bool(Inf)

julia> a
3-element BitVector:
 0
 1
 0

No, Bool is not a floating point type and doesn’t have a separate NaN specification either. A BitVector stores each element with a single bit; there’s only room for the values 0 and 1, nothing left for a NaN.
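That packing is visible via the internal chunks field that the dump output above already exposed (chunks is an implementation detail, so don’t rely on it in real code):

```julia
v = BitVector(undef, 100)   # 100 one-bit elements

# internal storage: bits packed into 64-bit words (implementation detail)
length(v.chunks)   # 2, i.e. ceil(100/64) UInt64 chunks
sizeof(v.chunks)   # 16 bytes for all 100 elements
```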

That’s true, but I think that shouldn’t stop us from having such an output where it can be done easily. The more often it works, the more helpful it will be.

The main purpose is to give us useful information within a typical screen

I see this as the purpose of show(io, MIME"text/plain"(), x) as opposed to show(x) or show(io, x). Already at present, show(x) prints out all elements of a vector or dictionary x, no matter how large it is.


That’s somewhat suggested in the 2-argument show’s docstring: “should be parseable Julia code when possible”, though it doesn’t say the code should evaluate to an equal value (e.g. [0/0] != [NaN]) or the same type (e.g. the original example).
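A quick demonstration of the equal-value part with NaN (assuming the printed code is simply parsed and evaluated as-is):

```julia
x = [NaN]
repr(x)                                 # "[NaN]", which is parseable
eval(Meta.parse(repr(x))) == x          # false, since NaN != NaN
isequal(eval(Meta.parse(repr(x))), x)   # true under isequal's NaN rules
```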

My apologies, my comment was a bit off-topic :slight_smile: This is not related to BitVector, but only to the representation of various instances of floating point numbers and how they are printed; specifically, how any floating point value for which isnan is true is printed.

Under the hood, a Float64 follows IEEE 754, which means that it consists of 64 bits (representable by a UInt64): the sign bit, an exponent, and a mantissa (or significand, stored without the implicit leading one). Together they make up the 1+11+52=64 bits of a Float64. Now, if you take a Float64 and want to know its binary representation, you can do reinterpret(UInt64, f) and you’ll get a UInt64 with the exact bits making up that exact Float64. This is also known as a “bitpattern”, because those 64 bits uniquely identify that specific Float64.

How does this relate to isnan? Well, there is more than one bitpattern for which isnan returns true. In IEEE 754, this is the case whenever the 11 bits belonging to the exponent are all 1 and the significand is nonzero (if the significand is zero, you have an infinity). This means that there are 2^53-2 distinct bitpatterns encoding NaN, all of which are currently printed as NaN. For most use cases, this is perfectly fine, but in the reply above the printed representation of NaN doesn’t map back 1:1 to the bitpattern anymore, hence my comment that the printed representation would need to include reinterpret in order to establish this. At the moment, using the constant NaN in your code will always lead to the same bitpattern. As far as I know, there is no other “blessed” way (apart from Core.bitcast) to create these non-standard NaN values, so you mostly don’t have to worry about them.
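The NaN condition above can be verified directly on the bits (a sketch; the masks follow the 1+11+52 layout described):

```julia
bits = reinterpret(UInt64, NaN)

sbit  = bits >> 63                   # 1 sign bit
ebits = (bits >> 52) & 0x7ff         # 11 exponent bits
fbits = bits & 0x000fffffffffffff    # 52 significand bits

(ebits == 0x7ff) && (fbits != 0)     # true, the isnan condition
```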

Related thread:

Counterexample:

julia> NaN === -NaN
false

julia> bitstring(NaN)
"0111111111111000000000000000000000000000000000000000000000000000"

julia> bitstring(-NaN)
"1111111111111000000000000000000000000000000000000000000000000000"

Right, but those still have the same significand :slight_smile:
