Inconsistent results from using #undef

vktr = Vector{T}(undef,4)

If T is Float64 or Int64, the vector displays arbitrary numbers instead of #undef.
If T is Complex, Symbol, String, etc., the vector displays #undef for every element.

julia> versioninfo()
Julia Version 1.12.0
Commit b907bd0600f (2025-10-07 15:42 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: macOS (x86_64-apple-darwin24.0.0)
  CPU: 8 × Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, skylake)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 8 virtual cores)

julia> flt = Vector{Float64}(undef,4)
4-element Vector{Float64}:
 2.130317216e-314
 2.1303172166e-314
 2.470579428e-314
 2.248986567e-314

julia> ine = Vector{Int64}(undef,4)
4-element Vector{Int64}:
 5057780960
 4552074256
 4552110096
 4552078288

julia> cpx = Vector{Complex}(undef,4)
4-element Vector{Complex}:
 #undef
 #undef
 #undef
 #undef

julia> uci = Vector{Union{Complex,Int64}}(undef,4)
4-element Vector{Union{Int64, Complex}}:
 #undef
 #undef
 #undef
 #undef

This is similar to other posts on related subjects, but I think it crystallizes the issue better.

This is already answered in What does #undef mean? - #2 by gdalle

Initializing with undef means you cannot say anything about the values. There is nothing “inconsistent”

The whole point with undef is that for simple types (bits types, like Int, ComplexF64, NTuple{3,UInt}), the array isn’t filled in. It’s just allocated, and not initialized with any values. This saves time. For pointer types, null pointers are filled in (they are printed as #undef, and accessing one raises an error).

julia> a = ["a","b"]
2-element Vector{String}:
 "a"
 "b"

julia> unsafe_store!(Ptr{UInt}(pointer(a)), UInt(0), 2)
Ptr{UInt64}(0x00007704ceec4478)

julia> a
2-element Vector{String}:
    "a"
 #undef

julia> a[2]
ERROR: UndefRefError: access to undefined reference
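
As a rough way to predict which of the two behaviours a given element type gets, one can compare isbitstype(T) with isassigned on a freshly allocated vector. This is only a sketch, and the helper name undef_summary is made up for illustration:

```julia
# Hypothetical helper (not part of Base): report how Vector{T}(undef, n) behaves.
function undef_summary(T::Type, n::Integer = 4)
    v = Vector{T}(undef, n)
    return (elt = T, isbits = isbitstype(T), assigned = isassigned(v, 1))
end

for T in (Float64, Int64, ComplexF64, Complex, String, Symbol)
    s = undef_summary(T)
    println(rpad(string(s.elt), 12), " isbitstype = ", s.isbits,
            "  isassigned(v, 1) = ", s.assigned)
end
```

Note that isbitstype alone is not the full story: small unions of bits types such as Union{Int64,Nothing} are stored inline as well, so isassigned is the more direct check.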

I think it helps to clarify that “undefined” has several meanings in programming in general, and this one means undefined behavior: something you’re not supposed to do but it’s simpler for the language specification to avoid describing what happens.

It is cheap to calculate and allocate a chunk of memory for values of isbits types or an array of such, prior to initializing the values. Accessing the uninitialized data is obviously not supposed to happen, but if the language mandated a program error, every value would need extra memory for flags indicating initialization AND every access would pay for checking them. Those overheads would be present even if you did everything properly. So, it really is simpler to leave that behavior undefined, and the Constructors page of the Manual even explicitly acknowledges it. The usual approach is to do basically nothing about it; even when you’re printing the vector, it just displays the garbage data along with the other initialized values.

For non-isbits types, however, there are underlying pointers, and the fields or elements that directly use them are called “references” to the associated values. It would be easy to not define access here either, but the risk of easily reading or writing data to an unpredictable memory location was unacceptable. So, the page does mandate an access error, forcing implementations to add those overheads. Printing the vector does check the pointer flag of each element, and #undef is printed when access isn’t safe.
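
A small sketch of the mandated access error for references: only the first slot of a String vector is assigned, and each slot is then probed. The variable names are just for illustration.

```julia
v = Vector{String}(undef, 3)
v[1] = "set"   # only the first slot holds a valid reference

# Probe each slot; reading an unassigned reference throws UndefRefError.
status = map(eachindex(v)) do i
    try
        v[i]
        :ok
    catch err
        err isa UndefRefError ? :undef : rethrow()
    end
end

println(status)   # [:ok, :undef, :undef]
```

In real code, isassigned(v, i) answers the same question without relying on exceptions.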

The way this is implemented is quite interesting. The undef is just an instance of the singleton type UndefInitializer. I.e. somewhere in the Base module there’s this:

struct UndefInitializer end
const undef = UndefInitializer()

Then there are array constructors defined for this type, equivalent to:

Vector{Float64}(::UndefInitializer, size) = <allocate memory>
Vector{String}(::UndefInitializer, size) = <allocate and zero-fill memory>
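
The same dispatch pattern can be reused for your own types. A minimal sketch, assuming a made-up wrapper type Buffer (not anything from Base):

```julia
# Hypothetical wrapper type, used only to illustrate dispatch on `undef`.
struct Buffer{T}
    data::Vector{T}
end

# Selected when the first argument is `undef`, mirroring the array constructors.
Buffer{T}(::UndefInitializer, n::Integer) where {T} = Buffer{T}(Vector{T}(undef, n))

b = Buffer{String}(undef, 3)
println(undef === UndefInitializer())   # true: the type has a single instance
println(length(b.data))                 # 3
```

Because undef carries no data, the argument exists purely to pick the method; that is why the signature names only its type.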

It’s not possible, I think, to set an entry to undef:

julia> v = ["a", "b"]
2-element Vector{String}:
 "a"
 "b"

julia> v[2] = undef
ERROR: MethodError: Cannot `convert` an object of type UndefInitializer to an object of type String

It could have been made possible, e.g. with this stuff (this is type piracy, and shouldn’t be done outside of Base):

julia> Base.setindex!(v::Vector{String}, ::UndefInitializer, i::Int) = unsafe_store!(Ptr{UInt}(pointer(v)), UInt(0), i)

julia> v[2] = undef;

julia> isassigned(v, 2)
false

“It’s not possible, I think, to set an entry to undef:”

The canonical way is

julia> d=[""]; Base._unsetindex!(d, 1); d
1-element Vector{String}:
 #undef

I’m not sure whether this is officially exported yet or still counts as “internal API”, but it should be pretty safe. See the GitHub issue “Make `Base._unsetindex!` (or a function like it) Public” (Issue #58943, JuliaLang/julia).

If you are a stickler for the rules, you can use ccall(:jl_arrayunset, Cvoid, (Any, Csize_t), d, 0) – that is officially deprecated, but it used to be official API, so it will be supported forever.

It is sometimes important to do this to allow an object to be garbage collected if you don’t need it any longer.
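
A sketch of that garbage-collection use case: a fixed-capacity stack whose pop clears the vacated slot, so the backing vector no longer keeps the popped object alive. SlotStack, push_slot! and pop_slot! are hypothetical names, and Base._unsetindex! is the internal function mentioned above, so this may break between Julia versions.

```julia
# Hypothetical fixed-capacity stack; only slots 1:len are assigned.
mutable struct SlotStack{T}
    slots::Vector{T}
    len::Int
end
SlotStack{T}(capacity::Integer) where {T} = SlotStack{T}(Vector{T}(undef, capacity), 0)

function push_slot!(s::SlotStack, x)
    s.len < length(s.slots) || throw(ArgumentError("stack is full"))
    s.len += 1
    s.slots[s.len] = x
    return s
end

function pop_slot!(s::SlotStack)
    s.len > 0 || throw(ArgumentError("stack is empty"))
    x = s.slots[s.len]
    Base._unsetindex!(s.slots, s.len)   # internal API: drop the stored reference
    s.len -= 1
    return x
end

s = SlotStack{String}(2)
push_slot!(s, "a")
push_slot!(s, "b")
popped = pop_slot!(s)
println(popped, " ", isassigned(s.slots, 2))
```

Without the unset call, slot 2 would still point at "b" after the pop and keep it reachable.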

Minor correction: As of Julia 1.11, accessing undefined struct fields, or undefined slots in an Array or Memory is not undefined behaviour. Its behaviour is:

  • If the element is not a bits type, throw an UndefRefError
  • Else, load an arbitrary value. Future loads of the same index must return the same value.

Although the behaviour is not well documented.
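
To illustrate the second rule, here is a small sketch that reads the same uninitialized slots twice. Int64 is used on purpose: with Float64 an arbitrary bit pattern could be NaN, and NaN == NaN is false, which would muddy the comparison.

```julia
v = Vector{Int64}(undef, 4)   # bits type: contents are arbitrary

first_read  = copy(v)         # load every slot once
second_read = copy(v)         # load every slot again

# Under the rule described above, repeated loads agree with each other,
# even though the values themselves are unpredictable.
println(first_read == second_read)
```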

It may be safe for array elements, definitely unsafe for struct fields (although I haven’t heard of even an internal function that does so).
The @code_warntype output shows that getfield on mutable structs acknowledges whether there are constructors with partial initialization. If all of the inner constructors have full initialization, getfield assumes that field access is always safe and omits null checks. If there is a constructor with partial initialization, field access goes through a check, so accessing an undef at least does not cause a segfault.
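
For reference, a sketch of what partial initialization looks like from the user's side. Node is a made-up example type, not something from Base.

```julia
mutable struct Node          # hypothetical example type
    value::Int
    next::Node
    Node(value) = new(value)               # partial: `next` stays undefined
    Node(value, next) = new(value, next)   # full initialization
end

n = Node(1)
println(isdefined(n, :next))   # false

caught = try
    n.next
    false
catch err
    err isa UndefRefError      # the checked access throws instead of crashing
end
println(caught)                # true

n.next = Node(2)               # assigning the field makes it defined
println(isdefined(n, :next))   # true
```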

The latter rather means “a value represented by an arbitrary bit pattern”. It may not be a proper value, in the sense that it may violate any constraint imposed by the constructor. Accessing such values is still UB.

No, that’s not correct. Creating values that violate the inner constructor is not UB*.
For example, suppose I make a type like this:

struct LessThanThree
    x::UInt8 # invariant: Only 0x00, 0x01, or 0x02 is allowed!

    function LessThanThree(x::UInt8)
        x > 0x02 && throw(ArgumentError("Must be less than three!"))
        new(x)
    end
end

function get_name(x::LessThanThree)
    return @inbounds ("one", "two", "three")[x.x + 0x01]
end

Here, the function get_name is incorrect, because Julia does NOT say it’s UB to circumvent the inner constructor of LessThanThree, and construct a value with the bitpattern 0xff (or any other bitpattern).
Of course, the function get_name itself can invoke UB precisely because it dangerously assumes the bitpatterns of LessThanThree.

Now of course, in any user code base, any author is free to state: “My code assumes you don’t circumvent the constructor of my types and you are going to bump into UB if you do”. But that UB is not caused by circumventing the constructor - the blame is squarely at the feet of the author.
This is no different from saying: “In my codebase, I assume that integer overflow never happens and if it does, my code will invoke UB”. That doesn’t actually make integer overflow UB itself.

* The exception is Bool, where constructing a Bool with reinterpret from any bit pattern other than 0x00 or 0x01 is UB.
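
A sketch of the point about get_name, under two assumptions of mine: the constructor is bypassed by reinterpreting a byte vector, and the lookup is rewritten without @inbounds (checked_name is a hypothetical name). With bounds checking left on, the out-of-range bit pattern produces an ordinary error rather than UB.

```julia
struct LessThanThree
    x::UInt8 # invariant: only 0x00, 0x01, or 0x02 is allowed

    function LessThanThree(x::UInt8)
        x > 0x02 && throw(ArgumentError("Must be less than three!"))
        new(x)
    end
end

# Bounds-checked variant of get_name (hypothetical): no @inbounds.
checked_name(x::LessThanThree) = ("one", "two", "three")[x.x + 0x01]

# Bypass the inner constructor by reinterpreting raw bytes.
bad = reinterpret(LessThanThree, UInt8[0x07])[1]

result = try
    checked_name(bad)
catch err
    err
end
println(result isa BoundsError)   # true
```

The inner constructor still rejects the same value when called normally, e.g. LessThanThree(0x07) throws an ArgumentError.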