Array of Structs: getindex, slicing, and broadcasting getfield

I have a struct like

mutable struct State
    ID::Int64
    kind::Symbol
    val::Float64
end

that contains info about one object, but what I’m interested in is States, a collection of multiple State objects. I am combining the individual State objects into an array, for example States = Array{State}(undef, 5). But now it’s difficult to broadcast over a field, as in States.ID. I was thinking of overloading getindex along the lines of

Base.getindex(states::Array{State,1}, n, ::Type{Val{:ID}}) = states[n].ID

but this doesn’t work.

What’s standard / best practice, and how can I ensure that updates to particular fields are reflected in the full object?

One solution is using https://github.com/piever/StructArrays.jl.

I would suggest defining an accessor function and then broadcasting that function:

id(s::State) = s.ID

id.(states)

Does that work for your use case?
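
To make the accessor suggestion concrete, here is a minimal sketch using only Base Julia. The variable names (states, ids, same, also) are just for illustration. It shows that broadcasting returns a fresh vector, so writing into that vector does not touch the structs, while writing through the array of mutable structs does.

```julia
mutable struct State
    ID::Int64
    kind::Symbol
    val::Float64
end

states = [State(i, :s, 1.0) for i in 1:3]

# Accessor function, as suggested above.
id(s::State) = s.ID

ids  = id.(states)                  # allocates a new Vector{Int64}
same = getproperty.(states, :ID)    # same values, no named accessor needed
also = [s.ID for s in states]       # comprehension form

ids[1] = 100        # modifies only the copy returned by the broadcast
states[1].ID        # the stored State still has ID 1

states[2].ID = 200  # State is mutable, so this updates the stored object
id.(states)         # a later broadcast sees the new ID
```

Each call to id.(states) builds a new result vector, which is the allocation cost to keep in mind if the field is read repeatedly.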

Thanks @rdeits and @mohamed82008! I’ll try out the StructArrays package first, since a quick @btime suggests that broadcasting allocates a new array on every call to id.(states), whereas StructArrays allocates the ID column only once.

StructArrays will allocate all your ids contiguously, so if you’re accessing a lot of them at a time, it will probably be faster because the access can be vectorized and take advantage of SIMD, while the array of structs cannot. The same goes for the vals in your State.
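
The layout difference can be sketched by hand in Base Julia. This is only an illustration of the "one vector per field" idea, not how StructArrays is implemented; the names cols, total_aos, and total_soa are made up for the example.

```julia
mutable struct State
    ID::Int64
    kind::Symbol
    val::Float64
end

states = [State(i, :s, Float64(i)) for i in 1:4]

# Array of structs: every field read goes through one State object.
total_aos = sum(s.val for s in states)

# Struct of arrays, built by hand: one contiguous vector per field.
cols = (ID   = [s.ID for s in states],
        kind = [s.kind for s in states],
        val  = [s.val for s in states])

# The reduction now runs over a plain Vector{Float64}.
total_soa = sum(cols.val)
```

Both sums give the same number; the difference is only in how the values are laid out in memory and how much indirection each access needs.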

As a follow up, can I use StructArrays in a nested way, for example:

mutable struct State
    ID::Int64
    kind::Symbol
    val::Float64
end
mutable struct Object
    st::State
    other::Int64
end
O1 = Object(State(1, :s1, 1.0), 10)
O2 = Object(State(2, :s2, 3.0), 20)
O3 = Object(State(3, :s2, 3.0), 30)
Os = StructArray([O1; O2; O3])

so that I can access Os.st.val?

I can’t assign StructArray(Os.st) to Os.st, but I can access the field as StructArray(Os.st).val. However, this feels awkward.

EDIT: it looks like it’s possible to define Cell from here https://github.com/piever/StructArrays.jl/issues/11

IIUC, you want to access the fields in a view. Use MappedArrays:

using MappedArrays
Os_st = mappedarray((x) -> x.st, Os)

Alternatively, Os = StructArray(objects; unwrap = t -> t <: State) works for objects = [O1; O2; O3]!

Following up on usage of StructArrays, what is the best practice for updating values? For example in Os, I want to update a single element,

Os.st[1].ID = 100

but this isn’t reflected in Os after the assignment.

Yes, I think what you want here is StructArrays with the unwrap option. Performance-wise StructArrays works best if your struct is immutable. The trick is that you can change your “columns” even though the struct is immutable as follows:

julia> using StructArrays

julia> struct State
           ID::Int64
           kind::Symbol
           val::Float64
       end

julia> s = StructArray(State(i, :test, 1.2) for i in 1:3)
3-element StructArray{State,1,NamedTuple{(:ID, :kind, :val),Tuple{Array{Int64,1},Array{Symbol,1},Array{Float64,1}}}}:
 State(1, :test, 1.2)
 State(2, :test, 1.2)
 State(3, :test, 1.2)

julia> s.val[2] = 0;

julia> s
3-element StructArray{State,1,NamedTuple{(:ID, :kind, :val),Tuple{Array{Int64,1},Array{Symbol,1},Array{Float64,1}}}}:
 State(1, :test, 1.2)
 State(2, :test, 0.0)
 State(3, :test, 1.2)

That is to say, write Os.st.ID[1] = 100 rather than Os.st[1].ID = 100. This way you are changing the underlying array, and a later getindex call (such as Os[1]) will return the updated value.

Thanks @piever! Is there a “best” way to do Os.st.ID[1] = 100 so that Os changes, or if I’m doing this, is it best then to just broadcast functions like id.() per @rdeits above? That is, when I look at Os[1], I’d like its State ID to be 100 now, too.

os.st.ID[1] = 100 changes the underlying array, so:

julia> using StructArrays

julia> struct State
           ID::Int64
           kind::Symbol
           val::Float64
       end

julia> struct Object
           st::State
           other::Int64
       end

julia> o1 = Object(State(1, :s1, 1.0), 10);

julia> o2 = Object(State(2, :s2, 3.0), 20);

julia> o3 = Object(State(3, :s2, 3.0), 30);

julia> os = StructArrays.StructArray([o1, o2, o3]; unwrap = t -> t <: State);

julia> os.st.ID[1]
1

julia> os.st.ID[1] = 100
100

julia> os.st.ID[1]
100

julia> os[1]
Object(State(100, :s1, 1.0), 10)

The idea is that os.st.ID is the vector where all the IDs are stored. The structs themselves are not stored at all; they are generated on the fly when you do, say, os[3]. So if you do os.st.ID[3] = 10, you are changing the underlying vector, and the next time the struct is generated it will reflect the change.
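
For anyone who wants to see the "generated on the fly" idea in isolation, here is a toy sketch in Base Julia. ColumnStates is a hypothetical type written for this post, not part of StructArrays and not its actual implementation; it only mimics the behaviour described above for a single, non-nested struct.

```julia
struct State
    ID::Int64
    kind::Symbol
    val::Float64
end

# Hypothetical column-backed vector: stores one vector per field, no State objects.
struct ColumnStates <: AbstractVector{State}
    ID::Vector{Int64}
    kind::Vector{Symbol}
    val::Vector{Float64}
end

Base.size(c::ColumnStates) = size(c.ID)

# Each State is constructed from the columns at the moment it is requested.
Base.getindex(c::ColumnStates, i::Int) = State(c.ID[i], c.kind[i], c.val[i])

# Writing a whole element scatters its fields back into the columns.
function Base.setindex!(c::ColumnStates, s::State, i::Int)
    c.ID[i] = s.ID
    c.kind[i] = s.kind
    c.val[i] = s.val
    return c
end

cs = ColumnStates([1, 2, 3], [:s1, :s2, :s2], [1.0, 3.0, 3.0])

cs.ID[1] = 100                 # mutate the column directly
first_state = cs[1]            # rebuilt from the columns, so it carries ID 100

cs[2] = State(7, :s9, 0.5)     # element-wise write updates all three columns
```

Because State is immutable here, something like cs[1].ID = 5 would throw an error rather than silently modify a temporary, which is one reason writing through the column (cs.ID[1] = 5) is the natural update path.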

Ah, I think I missed this. Thank you!!

I can highly recommend StructArrays (kudos to @piever).