Undef consistency: x == undef vs isassigned(x)

r_v · August 22, 2018, 5:26pm

julia> a = resize!([], 1)
1-element Array{Any,1}:
 #undef

julia> b = [undef]
1-element Array{UndefInitializer,1}:
 array initializer with undefined values

So, I would assume that if I can

julia> b[1] == undef
true

That I should be able to do the same on a. However, that is not the case.

julia> a[1] == undef
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex(::Array{Any,1}, ::Int64) at ./array.jl:731
 [2] top-level scope at none:0

I am told that in this case, I must use isassigned instead of a typical == test for undef.

julia> isassigned(a, 1)
false

julia> isassigned(b, 1)
true

Why is there this apparent inconsistency in usage when an array is created with an undef vs when one is extended?

Thank you
rv

mbauman · August 22, 2018, 5:28pm

undef is not itself an undefined value, but rather a special array initializer that flags it can initialize with undefined values.

(I edited your post to improve the formatting — use triple backticks ``` for code blocks)

rdeits · August 22, 2018, 5:47pm

Or, another way of putting this: Julia (unlike, say, Javascript) does not have a defined undefined value that you can compare undefined things against. Which seems reasonable to me, because having a defined undefined seems…weird?

r_v · August 22, 2018, 5:59pm

Thanks for reformatting. Is there a way I can preview my posts for cleanliness before posting (like in GitHub) so that you don’t have to cleanup behind me ?

I understand the definition of undef - but I am asking a different question… For instance, let’s say,

julia> c = vcat(a, b...)
2-element Array{Any,1}:
 #undef                                    
    array initializer with undefined values

From a user’s perspective, there are two undefined entries… one is #undef, and the other is UndefInitializer . But, from that perspective, the elements are treated differently.

julia> c[1]
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex(::Array{Any,1}, ::Int64) at ./array.jl:731
 [2] top-level scope at none:0

julia> c[2]
array initializer with undefined values


julia> typeof(c[1])
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex(::Array{Any,1}, ::Int64) at ./array.jl:731
 [2] top-level scope at none:0

julia> typeof(c[2])
UndefInitializer

To defensively program, testing with isassigned for every element seems more expensive than testing for undef (it might not really be… I’m just trying to understand here)

julia> @code_llvm c[2] == undef

; Function ==
; Location: operators.jl:83
define i8 @"julia_==_35430"() {
top:
  ret i8 1
}

julia> @code_llvm isassigned(c, 1)

; Function isassigned
; Location: array.jl:204
define i8 @julia_isassigned_35332(%jl_value_t addrspace(10)* nonnull align 16 dereferenceable(40), i64) {
top:
; Location: array.jl:205
; Function -; {
; Location: int.jl:803
; Function -; {
; Location: int.jl:52
  %2 = add i64 %1, -1
;}}
; Location: array.jl:206
; Function length; {
; Location: array.jl:199
  %3 = addrspacecast %jl_value_t addrspace(10)* %0 to %jl_value_t addrspace(11)*
  %4 = bitcast %jl_value_t addrspace(11)* %3 to %jl_array_t addrspace(11)*
  %5 = getelementptr inbounds %jl_array_t, %jl_array_t addrspace(11)* %4, i64 0, i32 1
  %6 = load i64, i64 addrspace(11)* %5, align 8
;}
; Function <; {
; Location: int.jl:427
  %7 = icmp ult i64 %2, %6
;}
  br i1 %7, label %L17, label %L16

L16:                                              ; preds = %top
  ret i8 0

L17:                                              ; preds = %top
; Location: array.jl:207
  %8 = bitcast %jl_value_t addrspace(11)* %3 to %jl_value_t* addrspace(13)* addrspace(11)*
  %9 = load %jl_value_t* addrspace(13)*, %jl_value_t* addrspace(13)* addrspace(11)* %8, align 8
  %10 = getelementptr %jl_value_t*, %jl_value_t* addrspace(13)* %9, i64 %2
  %11 = load %jl_value_t*, %jl_value_t* addrspace(13)* %10, align 8
  %12 = icmp ne %jl_value_t* %11, null
; Function ==; {
; Location: promotion.jl:350
; Function ==; {
; Location: promotion.jl:425
  %13 = zext i1 %12 to i8
;}}
  ret i8 %13
}

Keno · August 22, 2018, 6:02pm

This is equivalent to

@code_llvm undef == undef

and probably not what you wanted to look at.

r_v · August 22, 2018, 6:03pm

Probably not! What is the correct way to get the @code_llvm for that case?

Keno · August 22, 2018, 6:05pm

Depends on exactly what you want, but

f(c) = c[2] == undef
@code_llvm f(c)

is what I imagine you were looking for, though to have it be comparable, to your other one, you probably want to include the indexing calculation:

f(c, i) = c[i] == undef
@code_llvm f(c, 2)

yuyichao · August 22, 2018, 6:08pm

No. There’s one undefined entry. undef is not undefined, period. Having it in an array is nothing different from any other types. It’s not undefined in the same sense you wouldn’t say a nothing entry is undefined.

It has been a long time since this come up but no, don’t use code_llvm or code_native unless you understand exactly what each instruction is about. Counting the number of lines, even when you use them correctly, does not tell you how fast things are.

yuyichao · August 22, 2018, 6:24pm

isassigned is basically never slower than indexing.

Depending on what you mean by “defensive program” (or what you are defending against) you obviously need to do different test, though basically none of them have anything specific to undef. Things that you may want to defend against include,

Undefined entries. This is exactly what isassigned is testing. In a valid program, if isassigned(a, i) is true, a a[i] immediate following it should not raise an exception. (and before anyone mentions it, data race is undefined behavior).
Entries of wrong type. Unless there’s a good reason, you usually should just put constraint on the eltype of the array. If your array type is Vector{Int}, no entry will have a type other than Int and nothing will be of value undef.

If you need to accept an array with an abstract eltype, you usually either don’t care about the type, or you want to filter out some specific types and you should do just that, still with nothing specific to undef.

mbauman · August 22, 2018, 7:48pm

Instead of looking so deep into the internals with @code_llvm, you might be more interested in peeling back a few less layers and look at Meta.@lower.

The syntax c[2] == undef is just a function call with two arguments. You can equivalently write it as ==(c[2], undef). The syntax c[2] is also just a function call: it’s equivalent to getindex(c, 2). All arguments are evaluated before getting passed to a function. So c[2] == undef is really doing something like x = getindex(c, 2); ==(x, undef).

This is precisely what Meta.@lower shows, using a special intermediate representation where %X is a variable name:

julia> Meta.@lower c[2] == undef
:($(Expr(:thunk, CodeInfo(
 1 ─ %1 = (Base.getindex)(c, 2)
 │   %2 = %1 == undef
 └──      return %2
))))

Finally, those special lines that print as #undef entries aren’t values at all in Julia. They’re undefined, so you can never assign them to anything or even work with them at all. They’ll just error as soon as you try to access them. Which, of course, Julia needs to do before it’s able to use the == function.

The undef constant, on the other hand, is a value. In fact, it’s just a convenient short-hand name for UndefInitializer()… and that’s just defined with an empty struct X end definition. That’s it. We just give it a meaning by adding an Array method that dispatches on its type to return uninitialized memory. We don’t prevent you from passing this thing around, so you can store it in arrays or dictionaries or anywhere else, just like any other thing in Julia.

Instead of defensively putting isassigned around all your indexing calls, you can instead just never ask for undefined memory in the first place. Instead of Array, you can use fill or comprehensions. And instead of using resize! to grow vectors, use append! or push! as appropriate. You can also check out using the special missing singleton as a first-class way to work with missing values.

r_v · August 22, 2018, 8:15pm

Thank you for your detailed explanation.

Just fyi - I don’t ask these questions because I do them. I ask these questions because they are, in my reading of the documentation, unanswered.

I generally go with the presumption that the goal of the forum is to help people understand the design or usage choices (which you precisely did your response) as opposed to making them feel stupid for asking questions that others here might instinctively know the answer to. The former attracts more people to contribute while latter leads to people just avoiding such forums and going about their way contributing to other things where they are made to feel welcome.

My sum total experience with julia is about 4 days, give or take a couple… and I found that while I was trying to work on #28415 that doing a unique(...) on a resized Array{Any,1} was barfing and that led to my set of experiments to understand the nature of undef - and posed the question having gone through the documentation and not finding adequate clarity in it.

This was really specifically an issue that I encountered only with Array{Any,1} being resized. I could not cause #undef to occur in other cases of a specific type.

Anyway, thank you for the response.

Keno · August 22, 2018, 9:19pm

FWIW, I do feel that the current way undef works is a bit of a wart on the language (issue at More explicit representation of undef-ability of fields · Issue #25519 · JuliaLang/julia · GitHub), so perhaps we’ll be able to do something about it in the 2.0 timeframe.

r_v · August 22, 2018, 9:27pm

That would be a welcome usability change.

If nothing else, a design change that could be considered is - an expansion caused by a resize!(Array{Any,1}, N) could cause undefs to be added, instead of #undef. That would be intuitive, imo, since an expansion caused new undefined data to be added, instead of new #undef entries that one cannot really work with.

Keno · August 22, 2018, 9:30pm

No, that would be completely wrong. undef is just a magic token to pass to the array constructor to have it produce #undef. It has no other semantics. In particular, you can have an #undef entry for any Array{Foo} if Foo is a mutable struct, but you cannot have an undef entry in such an array.

r_v · August 22, 2018, 9:30pm

Good point.

r_v · August 22, 2018, 9:31pm

So what do you suggest ?

Keno · August 22, 2018, 9:32pm

What I suggested in the issue above. Have an uninhabited Undef type. And avoid #undef as much as possible for semantics (use Union{T, Nothing} instead, since that’s fast now).

Topic		Replies	Views
Has `undef` lost its undefedness? General Usage	4	1906	July 12, 2018
Undefs.jl: Convenience and Experiment Package Announcements package , array , undef , undefinitializer	10	628	January 25, 2023
Is accessing an `undef` array undefined behavior? General Usage	47	1258	July 23, 2023
Subsetting uninitialized arrays Internals & Design question	7	704	June 21, 2019
What does #undef mean? General Usage question , dictionary	17	2061	June 6, 2022

Undef consistency: x == undef vs isassigned(x)

Related topics