Inconsistent results from using #undef

vktr = Vector{T}(undef,4)

If T is Float64 or Int64, this fills in a number instead of #undef.
If T is Complex, Symbol, String, etc., this makes a Vector of #undef.

julia> versioninfo()
Julia Version 1.12.0
Commit b907bd0600f (2025-10-07 15:42 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: macOS (x86_64-apple-darwin24.0.0)
  CPU: 8 × Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, skylake)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 8 virtual cores)

julia> flt = Vector{Float64}(undef,4)
4-element Vector{Float64}:
 2.130317216e-314
 2.1303172166e-314
 2.470579428e-314
 2.248986567e-314

julia> ine = Vector{Int64}(undef,4)
4-element Vector{Int64}:
 5057780960
 4552074256
 4552110096
 4552078288

julia> cpx = Vector{Complex}(undef,4)
4-element Vector{Complex}:
 #undef
 #undef
 #undef
 #undef

julia> uci = Vector{Union{Complex,Int64}}(undef,4)
4-element Vector{Union{Int64, Complex}}:
 #undef
 #undef
 #undef
 #undef

This is similar to other posts on the subject, but I think it crystallizes the issue better.

This is already answered in What does #undef mean? - #2 by gdalle

Initializing with undef means you cannot say anything about the values. There is nothing “inconsistent” about that.

3 Likes

The whole point of undef is that for simple types (bits types, like Int, ComplexF64, NTuple{3,UInt}), the array isn’t filled in. It’s just allocated, and not initialized with any values. This saves time. For pointer types, null pointers are filled in (they are printed as #undef, and raise an error if accessed).

julia> a = ["a","b"]
2-element Vector{String}:
 "a"
 "b"

julia> unsafe_store!(Ptr{UInt}(pointer(a)), UInt(0), 2)
Ptr{UInt64}(0x00007704ceec4478)

julia> a
2-element Vector{String}:
    "a"
 #undef

julia> a[2]
ERROR: UndefRefError: access to undefined reference
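
To see the split across several element types at once, here is a small sketch that uses only isbitstype and isassigned from Base. The particular list of types is just my pick for illustration:

```julia
# Survey how `undef` construction behaves for a few element types.
for T in (Float64, Int64, ComplexF64, Complex, String, Symbol)
    v = Vector{T}(undef, 4)
    # isbitstype(T): elements are stored inline, so every slot holds *some* bits.
    # isassigned(v, i): false when the slot is a null reference (shown as #undef).
    println(rpad(string(T), 12), " isbitstype = ", isbitstype(T),
            "   isassigned(v, 1) = ", isassigned(v, 1))
end
```

For the three bits types, isassigned is true and the contents are whatever happened to be in memory; for the others, every slot starts out unassigned.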

1 Like

I think it helps to clarify that “undefined” has several meanings in programming in general, and this one means undefined behavior: something you’re not supposed to do but it’s simpler for the language specification to avoid describing what happens.

It is cheap to calculate and allocate a chunk of memory for values of isbits types or an array of such, prior to initializing the values. Accessing the uninitialized data is obviously not supposed to happen, but if the language mandated a program error, the allocation would need extra space for flags indicating initialization AND every value access would pay for checking them. Those overheads would be present even if you did everything properly. So, it really is simpler to leave that behavior undefined, and the Constructors page of the Manual even explicitly acknowledges it. The usual approach is to do basically nothing about it; even when you’re printing the vector, it just displays the garbage data along with the other initialized values.
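
A minimal sketch of the intended usage pattern, assuming you just need a buffer that is overwritten before anything reads it (the variable names are mine):

```julia
# `undef` skips initialization, so only use it when every slot
# is written before it is read.
buf = Vector{Float64}(undef, 4)
for i in eachindex(buf)
    buf[i] = i^2          # each element is assigned here, before any read
end

# If you need known starting values, ask for them explicitly instead.
z = zeros(Float64, 4)     # all 0.0
f = fill(1.5, 4)          # all 1.5
```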

For non-isbits types, however, there are underlying pointers, and the fields or elements that directly use them are called “references” to the associated values. It would be easy to not define access here either, but the risk of easily reading or writing data to an unpredictable memory location was unacceptable. So, the page does mandate an access error, forcing implementations to add those overheads. Printing the vector does check the pointer of each element, and #undef is printed when access isn’t safe.
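
In user code, the mandated error can be handled in two ways. This is only a sketch, and the "<unassigned>" placeholder string is something I made up for the example:

```julia
v = Vector{String}(undef, 3)
v[1] = "a"

# Option 1: check each slot with isassigned before reading it.
present = [isassigned(v, i) ? v[i] : "<unassigned>" for i in eachindex(v)]

# Option 2: read the slot and handle the UndefRefError.
second = try
    v[2]
catch err
    err isa UndefRefError || rethrow()
    "<unassigned>"
end
```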

2 Likes

The way this is implemented is quite interesting. The undef is just an instance of the singleton type UndefInitializer. I.e. somewhere in the Base module there’s this:

struct UndefInitializer end
const undef = UndefInitializer()

Then there are array constructors defined for this type, equivalent to:

Vector{Float64}(::UndefInitializer, size) = <allocate memory>
Vector{String}(::UndefInitializer, size) = <allocate and zero-fill memory>
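
The same pattern is easy to reproduce in user code. In the sketch below, ZeroInitializer, zeroinit and make_vector are hypothetical names of mine, not anything in Base; only undef and UndefInitializer are real:

```julia
# `undef` really is the one instance of a field-less struct type.
@assert undef isa UndefInitializer
@assert undef === UndefInitializer()

# Hypothetical analogue: a second singleton selecting a different strategy.
struct ZeroInitializer end
const zeroinit = ZeroInitializer()

# Hypothetical helper; dispatch on the marker's type picks the method.
make_vector(::Type{T}, ::UndefInitializer, n::Integer) where {T} = Vector{T}(undef, n)
make_vector(::Type{T}, ::ZeroInitializer, n::Integer) where {T} = zeros(T, n)

a = make_vector(Float64, undef, 3)     # allocated, contents arbitrary
b = make_vector(Float64, zeroinit, 3)  # allocated and set to 0.0
```

Since the marker carries no data, all of the information is in its type, and the choice of method is resolved by dispatch.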

It’s not possible, I think, to set an entry to undef:

julia> v = ["a", "b"]
2-element Vector{String}:
 "a"
 "b"

julia> v[2] = undef
ERROR: MethodError: Cannot `convert` an object of type UndefInitializer to an object of type String

It could have been made possible, e.g. with this stuff (this is type piracy, and shouldn’t be done outside of Base):

julia> Base.setindex!(v::Vector{String}, ::UndefInitializer, i::Int) = unsafe_store!(Ptr{UInt}(pointer(v)), UInt(0), i)

julia> v[2] = undef;

julia> isassigned(v, 2)
false
1 Like

It’s not possible, I think, to set an entry to undef:

The canonical way is

julia> d=[""]; Base._unsetindex!(d, 1); d
1-element Vector{String}:
 #undef

I’m not sure whether this is officially public yet or still counts as “internal API”, but it should be pretty safe. See Make `Base._unsetindex!` (or a function like it) Public · Issue #58943 · JuliaLang/julia · GitHub

If you are a stickler for the rules, you can use ccall(:jl_arrayunset, Cvoid, (Any, Csize_t), d, 0) – that is officially deprecated, but it used to be official API, so it will be supported forever.

It is sometimes important to do this to allow an object to be garbage collected if you don’t need it any longer.
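
If you would rather stay entirely within public API, one alternative sketch is to make “empty” an ordinary value by widening the element type to Union{Nothing,T}. This is a different technique from unsetting a slot (the slot stays assigned), but overwriting it also drops the reference, so the old object can be collected:

```julia
# Element type admits `nothing` as an explicit "no value" marker.
slots = Union{Nothing,String}["a", "b"]

slots[2] = nothing             # the vector no longer references "b"

still_assigned = isassigned(slots, 2)   # the slot holds a real value: nothing
is_empty = slots[2] === nothing
```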

1 Like

Minor correction: As of Julia 1.11, accessing undefined struct fields, or undefined slots in an Array or Memory is not undefined behaviour. Its behaviour is:

  • If the element is not a bitstype, throw an UndefRefError
  • Else, load an arbitrary value. Future loads of the same index must return the same value.

Although the behaviour is not well documented.

2 Likes

It may be safe for array elements, definitely unsafe for struct fields (although I haven’t heard of even an internal function that does so).
The @code_warntype output shows that getfield on mutable structs takes into account whether there are constructors with partial initialization. If all of the inner constructors fully initialize the struct, getfield assumes that field access is always safe and omits null checks. If there is a constructor with partial initialization, field access goes through a check, so accessing an undef field at least does not cause a segfault.
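
A small sketch of the partially initialized case, using a hypothetical Node type of my own:

```julia
# Hypothetical type whose inner constructor initializes nothing.
mutable struct Node
    label::String   # reference field: starts out undefined
    count::Int      # bits field: starts out with arbitrary contents
    Node() = new()
end

n = Node()

label_defined = isdefined(n, :label)   # false until the field is assigned
count_defined = isdefined(n, :count)   # bits fields always count as defined

# Reading the undefined reference field throws instead of crashing.
label_or_error = try
    n.label
catch err
    err
end

n.label = "root"    # after assignment the field reads normally
```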

1 Like

The latter rather means “a value represented by an arbitrary bit pattern”. It may not be a proper value, in the sense that it may violate any constraint imposed by the constructor. Accessing such values is still UB.

No, that’s not correct. Creating values that violate the inner constructor is not UB*.
For example, suppose I make a type like this

struct LessThanThree
    x::UInt8 # invariant: Only 0x00, 0x01, or 0x02 is allowed!

    function LessThanThree(x::UInt8)
        x > 0x02 && throw(ArgumentError("Must be less than three!"))
        new(x)
    end
end

function get_name(x::LessThanThree)
    return @inbounds ("one", "two", "three")[x.x + 0x01]
end

Here, the function get_name is incorrect, because Julia does NOT say it’s UB to circumvent the inner constructor of LessThanThree, and construct a value with the bitpattern 0xff (or any other bitpattern).
Of course, the function get_name itself can invoke UB precisely because it dangerously assumes the bitpatterns of LessThanThree.

Now of course, in any user code base, any author is free to state: “My code assumes you don’t circumvent the constructor of my types and you are going to bump into UB if you do”. But that UB is not caused by circumventing the constructor - the blame is squarely at the feet of the author.
This is no different from saying: “In my codebase, I assume that integer overflow never happens and if it does, my code will invoke UB”. That doesn’t actually make integer overflow UB itself.

* the exception is Bool, where constructing Bools using reinterpret with any other bitpattern than 0x00 and 0x01 is UB.
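
For concreteness, one way to obtain such a value without calling the constructor is to reinterpret a byte array. This is only a sketch: get_name_checked is a hypothetical bounds-checked variant of get_name, and the type definition is repeated so that the snippet runs on its own:

```julia
struct LessThanThree
    x::UInt8
    function LessThanThree(x::UInt8)
        x > 0x02 && throw(ArgumentError("Must be less than three!"))
        new(x)
    end
end

# Reinterpreting raw bytes never runs the inner constructor.
bad = reinterpret(LessThanThree, UInt8[0xff])[1]

# Hypothetical safe variant: no @inbounds, so an out-of-range index
# throws a BoundsError instead of reading out of bounds.
get_name_checked(x::LessThanThree) = ("one", "two", "three")[Int(x.x) + 1]

outcome = try
    get_name_checked(bad)
catch err
    err
end
```

Dropping @inbounds moves the responsibility back to the function, which is the point made above: the invariant was only ever an assumption of get_name.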

1 Like

Only if it’s a version of Complex without a bitstype type parameter. So, e.g., Complex{Float64} will not lead to #undef.

I could take some time later to look, but do you have a link on hand to this 1.11 decision?

I think it technically doesn’t count as undefined behavior because there is a significant restriction on what can happen, e.g., accessing uninitialized isbits data can’t be implemented to throw an error. I think an arbitrary value is unspecified behavior, but I’m not sure.

Here: julia/doc/src/devdocs/ub.md at c2237a5fef0cb0d0bc3f89c3e9a491e9c899ced0 · JuliaLang/julia · GitHub

I can’t find any source for it being well-defined for pointerful types, but the manual seems to suggest it’s an error, and I remember Stefan Karpinski writing it was well defined.

… It could really use some documentation

1 Like

Well, I should’ve said “using such values would be UB” in the broad sense that any documented behavior for that type may be wrong for such values unless explicitly stated otherwise.
Also, I am starting to appreciate the formulation for the isbits types. In effect, if at some point it is decided to initialize undefined entries with something specific, the rule still holds.
Also, it is unclear whether there is a rule for unions of isbits types. I guess it’s the same as for isbits, i.e. entries hold some arbitrary value. It seems that the current implementation sets the type tag to some default value and the value part retains garbage.
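
A quick way to probe this, as a sketch only. Which values and which default type tag you actually get is an implementation detail that I am not asserting here:

```julia
# Union of two bits types: elements are stored inline with a type tag.
U = Union{Int64,Float64}
v = Vector{U}(undef, 4)

no_undef_slots = all(i -> isassigned(v, i), eachindex(v))
first_type = typeof(v[1])   # Int64 or Float64, depending on the implementation

# By contrast, a union with a non-concrete member is stored as references.
w = Vector{Union{Complex,Int64}}(undef, 4)
w_assigned = isassigned(w, 1)
```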

Note, no inconsistency [with what you should be using]:

julia> Vector{ComplexF64}(undef,4)
4-element Vector{ComplexF64}:
 2.5e-323 + 3.0e-323im
      0.0 + 0.0im
      0.0 + 0.0im
      0.0 + 0.0im

help?> Complex
search: Complex complex ComplexF64 ComplexF32 ComplexF16
..
  ComplexF16, ComplexF32 and ComplexF64 are aliases for Complex{Float16}, Complex{Float32} and Complex{Float64} respectively.

Complex is not a concrete type (it is a UnionAll, Complex{T} where T<:Real), so it should only be used where an abstract type is helpful, e.g. in method signatures. It should not be used as a field type in real structs or as the element type of Vectors/Arrays/Matrices, since that would be slow (though it is still possible…). Nor should it be used in a Union; in fact, even a Union with ComplexF64 is unnecessary there, since you can put integers or reals “into” the complex types…
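
A short sketch of both points, using only Base functions:

```julia
# Complex alone is not concrete; the ComplexF64 alias is.
abstract_like = isconcretetype(Complex)
concrete = isconcretetype(ComplexF64)

# Integers and reals are converted on the way in, so no Union is needed.
v = ComplexF64[1, 2.5, 3 + 4im]
```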

1 Like