[ANN] ArrayAllocators.jl: Integrating calloc and aligned memory into Array construction

ArrayAllocators.jl

I am happy to announce ArrayAllocators.jl, a registered package that provides new mechanisms of array allocation. ArrayAllocators.jl provides new values that can take the place of undef when constructing arrays via the Array constructor: Array{T}(allocator, n, m, ...).

Quick Start Example

For example, you can now do the following.

using ArrayAllocators
malloced_array = Array{Int}(malloc, 16, 32, 8)
calloced_array = Array{Int}(calloc, 1024)
aligned_array = Array{Int}(MemAlign(2^16), 1024, 2048)

Faster Zeros with Calloc

A few months ago I came across several microbenchmarks in which NumPy’s zeros appeared faster than Julia’s zeros. Investigating this revealed that NumPy uses the C standard function calloc, which allocates memory and guarantees that it will be initialized to 0. calloc, as exposed to Julia via Libc.calloc, allows for fast array allocation and lazy initialization. However, the initialization cost may only be deferred, resulting in slower performance when the data is first accessed.

ArrayAllocators.jl integrates calloc into the Array constructor as follows:

julia> using ArrayAllocators

julia> @time A = Array{UInt8}(undef, 1024^3);
  0.001379 seconds (2 allocations: 1.000 GiB, 98.36% gc time)

julia> @time Z = zeros(UInt8, 1024^3);
  0.463365 seconds (2 allocations: 1.000 GiB, 1.25% gc time)

julia> @time C = Array{UInt8}(calloc, 1024^3);
  0.000026 seconds (5 allocations: 1.000 GiB)

julia> @time sum(Z)
  0.226251 seconds
0x0000000000000000

julia> @time sum(C)
  0.312937 seconds
0x0000000000000000

julia> @time sum(C)
  0.171955 seconds
0x0000000000000000

julia> isequal(Z, C)
true

For a detailed discussion, see the earlier thread.

Aligned Memory

Aligning memory can allow certain vectorized operations to be accelerated. Julia typically allocates memory on 16-byte or 64-byte boundaries depending on the size of the array.

julia> A = Array{UInt8}(undef, 1024^2);

julia> reinterpret(UInt, pointer(A)) % 64
0x0000000000000000

@stevegj previously provided a mechanism to use posix_memalign to allocate aligned memory. posix_memalign allows alignment along 16-byte boundaries or any larger power of 2. Thanks to @carstenbauer for bringing this to my attention.

On Windows, I have implemented aligned memory using VirtualAlloc2, but this requires alignment on 64 kilobyte boundaries or greater. I am considering adding a version based on _aligned_malloc which would provide more granularity, but use of the C-runtime on Windows can get complicated.

With ArrayAllocators.jl, you can create aligned memory and explicitly specify the allocation via the following mechanism:

julia> using ArrayAllocators

julia> alignment = 2^16
65536

julia> memalign = MemAlign(alignment)
ArrayAllocators.POSIX.PosixMemAlign{ArrayAllocators.ByteCalculators.CheckedMulByteCalculator}(65536)

julia> aligned_array = Array{UInt8}(memalign, 1024^3);

julia> pointer(aligned_array)
Ptr{UInt8} @0x00007fbadd9c0000

julia> reinterpret(UInt, pointer(aligned_array)) % alignment
0x0000000000000000

The underlying platform specific versions of MemAlign can also be accessed.

julia> using ArrayAllocators.POSIX

julia> posix_memalign = PosixMemAlign(32)
PosixMemAlign{ArrayAllocators.ByteCalculators.CheckedMulByteCalculator}(32)

julia> posix_aligned = Array{Int}(posix_memalign, 1024, 1024);

julia> pointer(posix_aligned)
Ptr{Int64} @0x000000000383f780

julia> reinterpret(UInt, pointer(posix_aligned)) % 32
0x0000000000000000
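As a rough illustration of what an aligned allocator has to do on a POSIX system, the sketch below calls posix_memalign through ccall and wraps the result with unsafe_wrap. The helper name aligned_vector is hypothetical and this is not ArrayAllocators.jl's actual implementation; it assumes a platform whose libc provides posix_memalign, so it will not work on Windows.

```julia
# Sketch only: `aligned_vector` is a hypothetical helper, not part of ArrayAllocators.jl.
# Assumes a POSIX libc that provides posix_memalign.
function aligned_vector(::Type{T}, alignment::Integer, n::Integer) where {T}
    isbitstype(T) || throw(ArgumentError("$T is not a bitstype"))
    nbytes = Base.checked_mul(Int(n), sizeof(T))  # guard against integer overflow
    ref = Ref{Ptr{Cvoid}}(C_NULL)
    status = ccall(:posix_memalign, Cint,
                   (Ptr{Ptr{Cvoid}}, Csize_t, Csize_t),
                   ref, alignment, nbytes)
    status == 0 || error("posix_memalign failed with status $status")
    # own = true asks Julia to call free on the pointer once the array is collected
    return unsafe_wrap(Array, Ptr{T}(ref[]), n; own = true)
end

A = aligned_vector(Float64, 256, 100)        # contents are uninitialized
is_aligned = reinterpret(UInt, pointer(A)) % 256 == 0
```

The memory returned by posix_memalign is not zeroed, which is why this sketch rejects non-bitstypes.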

Overflow Detection

Integer overflow can occur when calculating the number of bytes needed for an array, leading to erroneous results.

https://wiki.sei.cmu.edu/confluence/display/c/MEM07-C.+Ensure+that+the+arguments+to+calloc()%2C+when+multiplied%2C+do+not+wrap

julia> D = typemax(Int)
9223372036854775807

julia> D * (D-2) * 300
900

ArrayAllocators.jl defaults to using Base.checked_mul to check for integer overflow via ArrayAllocators.ByteCalculators.CheckedMulByteCalculator aliased as ArrayAllocators.DefaultByteCalculator.

julia> using ArrayAllocators

julia> Array{Int16}(calloc, D÷2, 4)
ERROR: OverflowError: The product of the dimensions results in integer overflow.
Stacktrace:
...

julia> Array{UInt8}(calloc, D, D-2, 300)
ERROR: OverflowError: The product of the dimensions results in integer overflow.
Stacktrace:
...

julia> Array{Int}(calloc, D÷2)
ERROR: OverflowError: The product of array length and element size will cause an overflow.
...
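The idea behind a checked byte calculation can be sketched with Base.checked_mul alone. The function checked_nbytes below is a hypothetical helper written for illustration and is not the package's CheckedMulByteCalculator; the error message it produces is Base's, not the one shown above.

```julia
# Hypothetical helper: bytes needed for an array with element type T and the
# given dimensions, throwing OverflowError if any intermediate product wraps.
function checked_nbytes(::Type{T}, dims::Int...) where {T}
    nbytes = sizeof(T)
    for d in dims
        nbytes = Base.checked_mul(nbytes, d)
    end
    return nbytes
end

D = typemax(Int)
small = checked_nbytes(Int16, 4, 8)   # 2 * 4 * 8 bytes
wrapped = D * (D - 2) * 300           # unchecked arithmetic wraps around silently

overflowed = try
    checked_nbytes(UInt8, D, D - 2, 300)
    false
catch err
    err isa OverflowError
end
```

Checking every intermediate product matters because, as in the example above, the final wrapped value can look like a perfectly plausible small size.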

The AbstractByteCalculator used is a parameter of AbstractAllocator. Alternative ways of calculating the number of bytes can be used or the overflow detection can be unsafely disabled:

julia> using ArrayAllocators, ArrayAllocators.ByteCalculators

julia> unsafe_calloc = CallocAllocator{UnsafeByteCalculator}()
CallocAllocator{UnsafeByteCalculator}()

julia> bad_array = Array{UInt8}(unsafe_calloc, D, D-2, 300);

julia> size(bad_array)
(9223372036854775807, 9223372036854775805, 300)

julia> length(bad_array)
900

julia> bad_array
9223372036854775807×9223372036854775805×300 Array{UInt8, 3}:
[:, :, 1] =

signal (11): Segmentation fault
...

Allocating non-bitstypes

All allocators can allocate bitstypes. Non-bitstypes can only be allocated by allocators that initialize their arrays to 0 such as calloc.

julia> using ArrayAllocators

julia> mutable struct NotaBitstype
           somefield
       end

julia> isbitstype(NotaBitstype)
false

julia> ArrayAllocators.iszeroinit(typeof(malloc))
false

julia> Array{NotaBitstype}(malloc, 16);
ERROR: ArgumentError: NotaBitstype is not a bitstype
Stacktrace:
...

julia> ArrayAllocators.iszeroinit(typeof(calloc))
true

julia> Array{NotaBitstype}(calloc, 16)
16-element Vector{NotaBitstype}:
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef

Other Allocators

ArrayAllocators.jl also implements malloc, the singleton instance of MallocAllocator, and UndefAllocator, which wraps around undef.

malloc may allow the use of alternative memory allocators as discussed below.

Generally, allocators that do not require additional dependencies can be added to ArrayAllocators.jl. This includes other mechanisms in libc or native to specific operating systems. Feel free to open an issue or pull request for your favorite allocator.

Extensions

Allocators that do require additional dependencies to be cross platform should go into dependent packages. For example, NumaAllocators.jl is a package that implements allocators for Non-Uniform Memory Access that is currently being registered. I will post a separate package announcement for this when it is registered.

Extensions can subtype AbstractArrayAllocator which provides abstract functions to make array construction easier. The interface currently consists of allocate, iszeroinit, and extending Base.unsafe_wrap.

Summary

ArrayAllocators.jl overloads the Array constructor to provide additional allocation options. This allows for fine-grained control of memory allocation for arrays and easy access to specialized memory allocation procedures that can impact performance.

Here is a cross reference to the Twitter thread which has some discussion of the package:

I apologize beforehand if my question is very naive. I’m one of those people who use Julia because they really don’t want to worry about internals.

I was told in another thread that Julia arrays are not guaranteed to be contiguous in memory. In particular - as a consequence - if I create a Julia array e.g. Vector{UInt8}(undef, 8*N) and then extract the pointer, then create a new Vector{Float64} that points to the same memory, then this is unsafe. Something like this:

_C = zeros(UInt8, 100 * sizeof(Float64))
ptr = Base.unsafe_convert(Ptr{Float64}, _C)
C = Base.unsafe_wrap(Array, ptr, 100)

(note that I’m proposing to remember _C so that the GC doesn’t release the memory.)

Now, if I were to use your Vector{UInt8}(calloc, 8*N) instead, i.e.,

_C = Vector{UInt8}(calloc, 100 * sizeof(Float64))

is it guaranteed that the block of memory to which this vector points is contiguous? I.e. the procedure proposed above would be safe?

The docs for DenseArray state that

The elements of a dense array are stored contiguously in memory.

so I’m surprised to learn that an Array may not be contiguous in memory.

I read the other thread through a few times, and the only person who said anything about an array being contiguous or not is you. The other concept that was mentioned was alignment, which is a distinct property from an array being contiguous. I think there may have been a misunderstanding.

Contiguous

Generally an array instance, A, of type Array{T, N} is contiguous in Julia if

isbitstype(eltype(A)) == isbitstype(T) == true

Bitstypes are generally primitive types and immutable concrete structs whose fields are themselves bitstypes.

For emphasis, I specifically mean Core.Array here. You can prove it for yourself with the following:

julia> A = Array{Float64,1}(undef, 100);

julia> for i in eachindex(A)
           println(pointer(A, i))
       end
Ptr{Float64} @0x00007f72ede3a6c0
Ptr{Float64} @0x00007f72ede3a6c8
Ptr{Float64} @0x00007f72ede3a6d0
Ptr{Float64} @0x00007f72ede3a6d8
Ptr{Float64} @0x00007f72ede3a6e0
Ptr{Float64} @0x00007f72ede3a6e8
...

julia> all(diff(reinterpret(UInt, pointer.(Ref(A), 1:100))) .== sizeof(eltype(A)))
true

Alignment

The second concept that arose was alignment. When we allocate an array, Julia will generally align the array on a 16-byte or 64-byte boundary based on the size of the array. That is, the address where the array's memory starts is a multiple of 16 or 64 bytes. We may want to do this to align memory to the processor’s cache line. You can verify this by taking the remainder of the pointer address:

julia> A = Array{Float64,1}(undef, 100);

julia> pointer(A)
Ptr{Float64} @0x00007f72eefe5fc0

julia> reinterpret(UInt, pointer(A))
0x00007f72eefe5fc0

julia> reinterpret(UInt, pointer(A)) % 16
0x0000000000000000

julia> reinterpret(UInt, pointer(A)) % 64
0x0000000000000000

julia> import Base.%

julia> x::Ptr % y::Int = reinterpret(UInt, x) % y # define % for Ptr
rem (generic function with 146 methods)

julia> pointer(A) % 64
0x0000000000000000

ArrayAllocators.jl and Alignment

ArrayAllocators.jl does not make any general guarantees about alignment by itself. It is essentially using unsafe_wrap on pointers obtained from various low-level memory allocation routines. unsafe_wrap enforces alignment according to the element type. Memory allocated using calloc is not guaranteed to be aligned to a certain byte boundary. That depends entirely on Libc.calloc and thus may be operating system specific.

ArrayAllocators.jl does allow you to allocate memory with a specific alignment as long as it is at least a minimum value and a power of 2.

julia> using ArrayAllocators

julia> ArrayAllocators.min_alignment(MemAlign)
8

julia> A = Array{Float64}(MemAlign(256), 100);

julia> pointer(A)
Ptr{Float64} @0x0000000005fc0800

julia> pointer(A) % 256
0x0000000000000000

Note that on Windows the minimum alignment is quite large since I’m using VirtualAlloc2 to implement memory alignment.

ReinterpretArray

Base.ReinterpretArray allows you to wrap an existing array at any offset. It offers no pointer alignment guarantees and thus can make fewer assumptions overall about its associated memory. This is why it can be slower in many areas such as bounds checking. Observe that all of the following are valid.

julia> A = Array{UInt8}(undef, 100);

julia> reinterpret(Int64, @view(A[1:end-4]));

julia> reinterpret(Int64, @view(A[2:end-3]));

julia> reinterpret(Int64, @view(A[3:end-2]));

julia> reinterpret(Int64, @view(A[4:end-1]));

julia> reinterpret(Int64, @view(A[5:end]));
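For the use case raised earlier in the thread (viewing a byte buffer as Float64 values), a minimal sketch using only reinterpret might look as follows. It assumes a ReinterpretArray is acceptable in place of a true Array; the variable names are illustrative.

```julia
buf = zeros(UInt8, 100 * sizeof(Float64))  # GC-managed byte buffer
C = reinterpret(Float64, buf)              # 100-element view; no copy is made
C[1] = 1.5                                 # writes through to the bytes of buf
keeps_parent = parent(C) === buf           # the view holds a reference to buf
```

Because the view keeps its parent alive, there is no separate reference to carry around for the garbage collector.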

Contrast that with unsafe_wrap below.

unsafe_wrap

Base.unsafe_wrap is used by ArrayAllocators.jl and does require pointers to be aligned. You will get an error otherwise. Contrast this with reinterpret above.

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,1)), 12);

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,2)), 12);
ERROR: ArgumentError: unsafe_wrap: pointer 0x7f731b1d4359 is not properly aligned to 8 bytes
Stacktrace:
 [1] #unsafe_wrap#89
   @ ./pointer.jl:89 [inlined]
 [2] unsafe_wrap(::Type{Array}, p::Ptr{Int64}, d::Int64)
   @ Base ./pointer.jl:89
 [3] top-level scope
   @ REPL[678]:1

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,3)), 12);
ERROR: ArgumentError: unsafe_wrap: pointer 0x7f731b1d435a is not properly aligned to 8 bytes
...

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,4)), 12);
ERROR: ArgumentError: unsafe_wrap: pointer 0x7f731b1d435b is not properly aligned to 8 bytes
...

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,5)), 12);
ERROR: ArgumentError: unsafe_wrap: pointer 0x7f731b1d435c is not properly aligned to 8 bytes
...

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,6)), 12);
ERROR: ArgumentError: unsafe_wrap: pointer 0x7f731b1d435d is not properly aligned to 8 bytes
...

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,7)), 12);
ERROR: ArgumentError: unsafe_wrap: pointer 0x7f731b1d435e is not properly aligned to 8 bytes
...

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,8)), 12);
ERROR: ArgumentError: unsafe_wrap: pointer 0x7f731b1d435f is not properly aligned to 8 bytes
...

julia> unsafe_wrap(Array, Ptr{Int64}(pointer(A,9)), 12);

Suggestion: Use Libc.malloc and unsafe_wrap.

My suggestion for what you were trying to do in the previous thread is to use Libc.malloc or Libc.calloc directly to allocate memory and then use unsafe_wrap to wrap an Array around it. You can then manage the memory yourself. You’ll have to call Libc.free manually when you are done with it, but you do not have to worry about carrying around a reference. This memory is managed outside of the garbage collector.

julia> p = Libc.malloc(800)
Ptr{Nothing} @0x000000000532c200

julia> A = unsafe_wrap(Array{Float64}, Ptr{Float64}(p), 100);

julia> A = nothing

julia> Libc.free(p)

ArrayAllocators.jl is basically just doing the above with the own = true keyword to unsafe_wrap or adding a custom finalizer when we need another method to free memory other than Libc.free. It seems to me that you want to manage your own memory like you would do in C. Thus, you should manage your own memory like in C via malloc, calloc, and free.
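The own = true variant can be sketched with Base and Libc alone. This is an approximation for illustration, not the package's source; in this form Julia takes over responsibility for calling free, so Libc.free must not be called manually.

```julia
n = 100
p = Ptr{Float64}(Libc.calloc(n, sizeof(Float64)))  # zero-initialized block
p == C_NULL && throw(OutOfMemoryError())

# own = true hands the pointer to Julia, which calls free once A is collected
A = unsafe_wrap(Array, p, n; own = true)
was_zeroed = all(iszero, A)

A .= 1:n
total = sum(A)
```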

Summary

Julia’s Array type is contiguous for bitstypes. In the previous thread, you may have confused issues regarding contiguous memory with aligned memory. reinterpret does not enforce pointer alignment. unsafe_wrap does enforce pointer alignment.

ArrayAllocators.jl uses unsafe_wrap. It either uses the own = true keyword to unsafe_wrap or implements its own finalizer. The memory allocation is done by lower level routines from either Libc or the Windows operating system kernel. These routines provide contiguous memory blocks. ArrayAllocators.jl also offers options to allocate aligned memory.

Note that

julia> isabstracttype(DenseArray)
true

julia> Array <: DenseArray
true

Now let’s discuss how the above is a lie when the element type is not a bitstype.

julia> mutable struct Foo
           x::Int
           y::Int
           z::Int
       end

julia> isbitstype(Foo)
false

julia> sizeof(Foo)
24

julia> A = Array{Foo}(undef, 100)
100-element Vector{Foo}:
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
   ⋮
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef

julia> pointer(A, 1)
Ptr{Foo} @0x00007f73a6060038

julia> pointer(A, 2)
Ptr{Foo} @0x00007f73a6060040

julia> pointer(A, 2) - pointer(A,1)
0x0000000000000008

julia> sizeof(A)
800

Above we see that Foo is not a bitstype because it is a mutable struct. It has three Int64 fields, 8 bytes each, so its size is 24 bytes. When we allocate an Array{Foo}, we see that the elements are all “undefined” and that the pointer references are only 8 bytes apart rather than the 24 bytes needed to pack Foos into the memory. That’s strange. Let’s play with this more.

julia> A[1] = Foo(1,2,3)
Foo(1, 2, 3)

julia> unsafe_load(pointer(A, 1)) # We were expecting Foo(1,2,3)
Foo(140134682854992, 0, 0)

julia> unsafe_load(unsafe_load(Ptr{Ptr{Foo}}(pointer(A)))) # There it is!
Foo(1, 2, 3)

In this case A is really an array of pointers to Foo. What really should have resulted from pointer(A) is a Ptr{Ptr{Foo}}, not Ptr{Foo}. Fixing that allows us to retrieve the correct Foo(1, 2, 3) that we put there.

What are all these #undef entries? They are NULL pointers.

julia> A[2]
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex(A::Vector{Foo}, i1::Int64)
   @ Base ./array.jl:861
 [2] top-level scope
   @ REPL[31]:1

julia> unsafe_load(Ptr{Ptr{Foo}}(pointer(A, 2)))
Ptr{Foo} @0x0000000000000000

julia> A[2] = Foo(4, 5, 6)
Foo(4, 5, 6)

julia> unsafe_load(Ptr{Ptr{Foo}}(pointer(A, 2)))
Ptr{Foo} @0x00007f73a67783b0

julia> unsafe_load(unsafe_load(Ptr{Ptr{Foo}}(pointer(A, 2))))
Foo(4, 5, 6)

Thus in the case of a non-bitstype, the memory for the Foo instances themselves is not contiguous.
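The same difference in layout can be observed without dereferencing any pointers by comparing per-element storage sizes. The sketch below uses Base.elsize, which is not exported, and two hypothetical types, InlineFoo and BoxedFoo, that stand in for the immutable and mutable versions of Foo.

```julia
struct InlineFoo          # immutable with bits fields only: stored inline
    x::Int
    y::Int
    z::Int
end

mutable struct BoxedFoo   # mutable: the array stores one pointer per element
    x::Int
    y::Int
    z::Int
end

inline_stride = Base.elsize(Vector{InlineFoo})   # bytes per element slot
boxed_stride  = Base.elsize(Vector{BoxedFoo})    # size of a pointer
boxed_bytes   = sizeof(Vector{BoxedFoo}(undef, 100))
```

On a 64-bit system this should give 24 bytes per InlineFoo slot but only 8 bytes per BoxedFoo slot, which matches the pointer spacing seen above.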

What does ArrayAllocators.jl do when you try to allocate a non-bitstype?

In some circumstances, ArrayAllocators.jl will fail:

julia> Array{Foo}(malloc, 100)
ERROR: ArgumentError: Foo is not a bitstype

If the allocator guarantees zero initialization, this will succeed.

julia> Array{Foo}(calloc, 100)
100-element Vector{Foo}:
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
   ⋮
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef

You’re confusing memory contiguity of the array with inlineability of the mutable struct. Every element of the array, which is an array of pointers due to the mutability of the elements, lies next to its neighbors in memory - in contrast to e.g. a linked list or even tree structure of pointers. Not storing inline is a necessary consequence of mutability, as the following has to work:

julia> a = Foo(1,2,3)
Foo(1, 2, 3)

julia> arr = [a]
1-element Vector{Foo}:
 Foo(1, 2, 3)

julia> a.x = 5
5

julia> arr
1-element Vector{Foo}:
 Foo(5, 2, 3)

If mutable structs were stored inline, that wouldn’t work anymore because you’d have to somehow go back to all previously existing references to a and change them to point into your newly created array. Things only get more complicated when, after arr, you define another array b_arr = [a] - now to which array should the original a point?
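The two-array situation described here can be sketched directly. Counter is a hypothetical mutable struct standing in for Foo.

```julia
mutable struct Counter   # hypothetical stand-in for the mutable Foo
    x::Int
end

a = Counter(1)
arr = [a]
b_arr = [a]      # a second array referring to the same object

a.x = 5          # one mutation, visible through both arrays

same_object = arr[1] === a && b_arr[1] === a
```

Both arrays hold a reference to the one object, so neither has to be the "real" home of a.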

The (as far as I can tell) only time when you can store mutable structs inline would be when it’s guaranteed that no elements of the array can possibly survive longer than the array itself, which is a semantic guarantee Julia doesn’t make (use Rust if you absolutely need that). That’s why these sorts of allocation workarounds require runtime support to work properly with the rest of the language and that’s also why isbitstype requires immutability.


For almost all cases where people reach for pointer, they should reach for Ref instead as it handles all that stuff for you - whether something is a pointer or not under the hood is an implementation detail (and I think allowed to change as long as the change is not visible, e.g. when the array elements don’t escape the lifetime of the array). Don’t reach for pointer and you don’t have to reach into unsafe_* territory to accomplish anything:

julia> mutable struct Foo
           x::Int
           y::Int
           z::Int
       end

julia> A = Array{Foo}(undef, 10)
10-element Vector{Foo}:
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef

julia> a = Foo(1,2,3)
Foo(1, 2, 3)

julia> a_ref = Ref(A, 1)
Base.RefArray{Foo, Vector{Foo}, Nothing}(Foo[#undef, #undef, #undef, #undef, #undef, #undef, #undef, #undef, #undef, #undef], 1, nothing)

julia> a_ref[]
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex
   @ ./essentials.jl:13 [inlined]
 [2] getindex(b::Base.RefArray{Foo, Vector{Foo}, Nothing})
   @ Base ./refpointer.jl:181
 [3] top-level scope
   @ REPL[5]:1

julia> A[1] = a
Foo(1, 2, 3)

julia> a_ref[] === a
true

Would it be worthwhile to add this detail to the docstring of DenseArray? The current docstring

DenseArray{T, N} <: AbstractArray{T,N}

`N`-dimensional dense array with elements of type `T`. The elements of a dense array are stored contiguously in memory.

seems to indicate that the elements of type Foo are stored contiguously in memory, whereas, IIUC, it’s really the pointers to Foo that are contiguous.

No, I don’t think so. You’re not interacting with the pointers directly in regular use, modulo the unsafe_* APIs - they’re an implementation detail. Also, for all intents and purposes, the pointer is the object itself. You can’t separate the object from the pointer that’s pointing to it.

Thank you for the extremely thorough explanations

I’m not sure if that would be the best place to document the memory layout. DenseArray is frankly an obscure implementation detail. A broader topic on array data layouts is needed.

Arrays of structs is a challenging topic because a number of topics have been confused as Sukera pointed out. I am writing from the perspective of the user where the underlying use of pointers is far from transparent. In fact, the return type of pointer when used with non-bitstypes like mutable structs is misleading.

Compare this to the use of an “immutable” struct. Notice how by the end of the example, we actually have modified the value of the fields.

julia> struct Foo
           x::Int
           y::Int
           z::Int
       end

julia> A = Array{Foo}(undef, 10)
10-element Vector{Foo}:
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)
 Foo(0, 0, 0)

julia> B = reinterpret(Int, A);

julia> B .= 1:30
30-element reinterpret(Int64, ::Vector{Foo}):
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
  ⋮
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30

julia> A
10-element Vector{Foo}:
 Foo(1, 2, 3)
 Foo(4, 5, 6)
 Foo(7, 8, 9)
 Foo(10, 11, 12)
 Foo(13, 14, 15)
 Foo(16, 17, 18)
 Foo(19, 20, 21)
 Foo(22, 23, 24)
 Foo(25, 26, 27)
 Foo(28, 29, 30)

It’s not misleading, it’s an extremely unfortunate bug. Sadly non-trivial and also breaking to fix, which is just one more reason to not use pointer, ever, and use Ref instead because that can at least properly track whether it’s coming from a single-element-store or references an element in an array (RefValue from Ref(x) vs. RefArray from Ref(arr, idx)).

You’re not changing the value of any fields. You’re removing the type restriction by saying “those are all Ints now” and then saving new Ints in their place. The immutability of the type wasn’t broken since you didn’t go through the type in the first place. This has the effect of replacing the fields as well, but you will run into problems as soon as you have padding:

julia> struct Foo
           x::Int8
           y::Int16
           z::Int8
       end

julia> A = Array{Foo}(undef, 10);

julia> B = reinterpret(Int, A);
ERROR: ArgumentError: cannot reinterpret an `Foo` array to `Int64` whose first dimension has size `10`.
The resulting array would have non-integral first dimension.
[..]

julia> B = reinterpret(UInt8, A);

julia> B
60-element reinterpret(UInt8, ::Vector{Foo}):
Error showing value of type Base.ReinterpretArray{UInt8, 1, Foo, Vector{Foo}, false}:
ERROR: Padding of type UInt8 is not compatible with type Foo.
[...]

reinterpret only allows you to do that if the padding agrees and because those immutables are not saved by reference to heap memory in the first place. Accessing any index of the array via e.g. A[1] copies the immutable value out of the array (shown here with the earlier three-Int definition of Foo):

julia> a_foo = A[1]
Foo(0, 0, 0)

julia> B = reinterpret(Int, A);

julia> B .= 1:30;

julia> a_foo
Foo(0, 0, 0)

julia> A[1]
Foo(1, 2, 3)

By definition, a copy of an immutable value cannot be distinguished from the original value.

That is an interesting example. We can still manipulate the underlying memory of the array though.

julia> struct Foo
           x::Int8
           y::Int16
           z::Int8
       end

julia> A = Array{Foo}(undef, 10)
10-element Vector{Foo}:
 Foo(-48, 3161, 0)
 Foo(0, -30640, 89)
 Foo(0, 0, 8)
 Foo(12, 0, 0)
 Foo(80, 4347, 0)
 Foo(0, 8, 12)
 Foo(0, 0, 8)
 Foo(12, 0, 0)
 Foo(1, 0, 0)
 Foo(0, -1, -1)

julia> for i=0x01:0x3c
           unsafe_store!(Ptr{UInt8}(pointer(A)), i, i)
       end

julia> A
10-element Vector{Foo}:
 Foo(1, 1027, 5)
 Foo(7, 2569, 11)
 Foo(13, 4111, 17)
 Foo(19, 5653, 23)
 Foo(25, 7195, 29)
 Foo(31, 8737, 35)
 Foo(37, 10279, 41)
 Foo(43, 11821, 47)
 Foo(49, 13363, 53)
 Foo(55, 14905, 59)

It may be more apparent what happened here if we switch to UInt8 and UInt16 as the fields.

julia> struct Bar
           x::UInt8
           y::UInt16
           z::UInt8
       end

julia> reinterpret(Bar, A)
10-element reinterpret(Bar, ::Vector{Foo}):
 Bar(0x01, 0x0403, 0x05)
 Bar(0x07, 0x0a09, 0x0b)
 Bar(0x0d, 0x100f, 0x11)
 Bar(0x13, 0x1615, 0x17)
 Bar(0x19, 0x1c1b, 0x1d)
 Bar(0x1f, 0x2221, 0x23)
 Bar(0x25, 0x2827, 0x29)
 Bar(0x2b, 0x2e2d, 0x2f)
 Bar(0x31, 0x3433, 0x35)
 Bar(0x37, 0x3a39, 0x3b)

julia> sizeof(Foo)
6

Foo is padded to six bytes. The second and sixth byte were added as padding.
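The padding can also be read off from the field offsets. In this sketch Padded is a hypothetical copy of the Foo layout, defined separately so that it does not clash with the definitions above.

```julia
struct Padded   # hypothetical copy of the Int8/Int16/Int8 layout
    x::Int8
    y::Int16
    z::Int8
end

# Zero-based byte offset of each field within the struct
offsets = [Int(fieldoffset(Padded, i)) for i in 1:fieldcount(Padded)]

payload = sum(sizeof, fieldtypes(Padded))   # bytes occupied by actual fields
padding = sizeof(Padded) - payload          # bytes added for alignment
```

The Int16 field must start on a 2-byte boundary, which accounts for the padding byte after x, and the total size is rounded up to a multiple of 2, which accounts for the one after z.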

@jw3126 's Accessors.jl may be of interest here.

julia> using Accessors

julia> @set A[1].x = 90
10-element Vector{Foo}:
 Foo(90, 1027, 5)
 Foo(7, 2569, 11)
 Foo(13, 4111, 17)
 Foo(19, 5653, 23)
 Foo(25, 7195, 29)
 Foo(31, 8737, 35)
 Foo(37, 10279, 41)
 Foo(43, 11821, 47)
 Foo(49, 13363, 53)
 Foo(55, 14905, 59)

julia> @set A[1].y = 100
10-element Vector{Foo}:
 Foo(1, 100, 5)
 Foo(7, 2569, 11)
 Foo(13, 4111, 17)
 Foo(19, 5653, 23)
 Foo(25, 7195, 29)
 Foo(31, 8737, 35)
 Foo(37, 10279, 41)
 Foo(43, 11821, 47)
 Foo(49, 13363, 53)
 Foo(55, 14905, 59)

This is a JuliaCon presentation of SetField.jl, the predecessor of Accessors.jl.

Using unsafe_* obviously does not fall under any type restrictions. It’s unsafe_ after all, you can do anything. reinterpret just gives a somewhat more safe interface around that, while at least still respecting padding (which shouldn’t ever be read/written - depending on how it’s implemented internally, you can trigger UB from LLVM).

These kinds of packages don’t do that though - they return a new object and require special handling for custom immutable structs. E.g. from the docs of Accessors.jl:

julia> using Accessors

julia> x = (greeting = "Hello", name = "World")
(greeting = "Hello", name = "World")

julia> @set x.greeting = "Hi"
(greeting = "Hi", name = "World")

julia> x # still the same. Accessors did not overwrite x, it just created an updated copy
(greeting = "Hello", name = "World")

You can find similar explanations for SetField.jl. In the case of accessing an Array, the lenses (as they’re called/implemented) mutate the array by assigning a new instance, which is fair game.
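What "assigning a new instance" amounts to can be sketched in plain Julia without any lens package. Point is a hypothetical immutable struct, and this is only an illustration of the idea, not how Accessors.jl or SetField.jl are implemented.

```julia
struct Point   # hypothetical immutable struct
    x::Int
    y::Int
end

A = [Point(1, 2), Point(3, 4)]
old = A[1]                   # a copy of the immutable value
A[1] = Point(90, old.y)      # overwrite the array slot with a new instance
```

The array, which is mutable, changes; the value previously copied out of it does not.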