Mutating fields in StructArrays

lmiq · April 27, 2021, 12:59pm

I am starting to use StructArrays in a package, and the benefits clearly outweigh the costs. However, there is one thing that I have, for the moment, to let go, which is the mutation of fields using the most natural syntax.

I mean, for example, with an array of mutable structs, I can do:

julia> mutable struct Atom
         index
         name
       end

julia> v1 = [ Atom(1,"C"), Atom(2,"N") ]
2-element Vector{Atom}:
 Atom(1, "C")
 Atom(2, "N")

julia> v1[1].name = "N"  # <--- mutate field name of first element
"N"

julia> v1
2-element Vector{Atom}:
 Atom(1, "N")
 Atom(2, "N")

With a StructArray, however, that does not work anymore:

julia> v2 = StructArray(v1)
2-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype Atom:
 Atom(1, "N")
 Atom(2, "N")

julia> v2[1].name = "C"   # <-- same syntax, no error
"C"

julia> v2
2-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype Atom:
 Atom(1, "N")  # but no mutation :-(
 Atom(2, "N")

I know I can do

julia> v2.name[1] = "C"
"C"

julia> v2
2-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype Atom:
 Atom(1, "C")
 Atom(2, "N")

But that would be a breaking change. Is it possible, easy, and reasonable to recover the original mutation syntax while still using StructArrays? This was the only thing that broke from what I have been doing with these arrays of structs, everything else worked perfectly, and better.

Skoffer · April 27, 2021, 1:11pm

Well… If you allow yourself to use immutable structures…

using StructArrays
using Setfield

struct Atom
  index
  name
end

v1 = [ Atom(1,"C"), Atom(2,"N") ]
v2 = StructArray(v1)

julia> @set! v2[1].name = "N"
2-element Vector{Atom}:
 Atom(1, "N")
 Atom(2, "N")

julia> v2 # to confirm that it really has changed
2-element Vector{Atom}:
 Atom(1, "N")
 Atom(2, "N")

Ok, in reality, it works with mutable structures too. But immutables are so much better.

What is happening in your example is that v2[1] creates a new object, something like Atom(v2.index[1], v2.name[1]). When you mutate it, nothing happens to the array, because the object is new. When using @set!, it basically goes through the following:

# v2[1] builds a fresh Atom from the field arrays
atom = Atom(v2.index[1], v2.name[1])

# @set atom.name = "N"
atom = Atom(atom.index, "N")

# @set!
v2[1] = atom

You can use Setfield for the last operation, or you can write your own macro that does the same. Or you can write a macro that swaps the operations, i.e. it should do something like

@swapset! v2[1].name = "N"

# it should translate to
v2.name[1] = "N"

The last one is better, I suppose, because it mutates the field vector directly, which is more efficient than materializing the structure and dematerializing it again (even if the compiler is smart enough to skip some operations).
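A minimal sketch of such a macro follows. The name @swapset! comes from the post above and is hypothetical; it is not part of Setfield or StructArrays. To keep the sketch runnable without extra packages, a NamedTuple of column vectors stands in for the StructArray here, on the assumption that the container exposes its columns through property access (as v2.name does in the original question).

```julia
# Hypothetical macro: rewrites `@swapset! v[i].field = x` into `v.field[i] = x`.
macro swapset!(ex)
    (ex isa Expr && ex.head === :(=)) || error("expected an assignment")
    lhs, rhs = ex.args
    (lhs isa Expr && lhs.head === :.) || error("expected `v[i].field = x`")
    indexed, field = lhs.args              # `v[i]` and the quoted field name
    (indexed isa Expr && indexed.head === :ref) || error("expected `v[i].field = x`")
    container = indexed.args[1]
    indices = indexed.args[2:end]
    # Build `container.field[indices...] = rhs`
    target = Expr(:ref, Expr(:., container, field), indices...)
    return esc(Expr(:(=), target, rhs))
end

# Stand-in for a StructArray: one vector per field (assumption for this sketch).
cols = (index = [1, 2], name = ["C", "N"])

@swapset! cols[1].name = "N"   # expands to cols.name[1] = "N"
```

After the call, cols.name holds ["N", "N"]. The macro never materializes an element, so it only works for containers whose fields are reachable as v.field; on a plain Vector of structs it would fail.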

Henrique_Becker · April 27, 2021, 1:41pm

The short answer would be no. StructArrays works when you simply access or replace an element because it can redefine getindex and setindex! for the StructArray object. However, if you try to change a single field of an element, what you have is a getindex (which creates a new object on the fly) followed by a setproperty! on this newly created object. That object is not stored in the StructArray, because the StructArray does not store structs; it stores arrays of fields.
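The sequence of calls can be spelled out with a toy container. This is only an illustration: AtomColumns is a hypothetical stand-in for a StructArray, written so that the example runs without the package, and it does not reflect how StructArrays is implemented internally.

```julia
mutable struct Atom
    index::Int
    name::String
end

# Hypothetical struct-of-arrays container: stores one vector per field.
struct AtomColumns
    index::Vector{Int}
    name::Vector{String}
end

# Reading an element builds a brand-new Atom from the field vectors.
Base.getindex(c::AtomColumns, i::Int) = Atom(c.index[i], c.name[i])

# Replacing an element scatters its fields back into the vectors.
function Base.setindex!(c::AtomColumns, a::Atom, i::Int)
    c.index[i] = a.index
    c.name[i] = a.name
    return c
end

cols = AtomColumns([1, 2], ["C", "N"])

# `cols[1].name = "N"` is lowered to these two calls:
tmp = getindex(cols, 1)
setproperty!(tmp, :name, "N")   # mutates only the temporary object

unchanged = cols.name[1]        # still "C": the container was never touched

cols[1] = tmp                   # explicit write-back goes through setindex!
```

Only the final line changes the stored data, which is why the one-line field assignment appears to succeed but leaves the array as it was.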

Theoretically this could be made to work, but then one of the main features of StructArray would need to be broken, i.e., that it returns an unwrapped struct of the object you want to store. If the returned struct could be wrapped, then it could store a reference to the array, and update the value there.
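A rough sketch of what such a wrapped element could look like, assuming the columns are available as a NamedTuple of vectors. RowRef is a hypothetical name used only for this illustration.

```julia
# Hypothetical wrapper: remembers the columns and a position instead of copying values.
struct RowRef{C}
    columns::C   # NamedTuple of column vectors (assumption for this sketch)
    i::Int
end

# Field reads are forwarded to the matching column at position i.
# getfield is used internally because getproperty is overridden.
Base.getproperty(r::RowRef, name::Symbol) =
    getfield(r, :columns)[name][getfield(r, :i)]

# Field writes update the column in place.
Base.setproperty!(r::RowRef, name::Symbol, x) =
    (getfield(r, :columns)[name][getfield(r, :i)] = x)

cols = (index = [1, 2], name = ["C", "N"])
row = RowRef(cols, 1)

row.name = "N"   # writes through to cols.name[1]
```

The price is that row is no longer an Atom, so functions that dispatch on the element type will not accept it without extra methods.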

~~Note that you can do the following ridiculous pattern, which works both with the StructArray and without it, but may affect performance.~~

sa[i] = sa[i].field = 5

The code above does not work, in fact. Unfortunately, it seems that it is not trivial to capture the object generated on the fly during an assignment.

Or define a generated function/macro that checks the type and takes the right action.

lmiq · April 27, 2021, 2:30pm

Thank you both.

My problem here is mostly what to expose to the user. Originally I wanted to allow them to mutate the objects using the most common syntax

atom[1].name = "C"

but apparently I will have to give up on that if I want to use StructArrays. If I decide to give up on that, then it is probably a good idea in my case to set everything to immutable and expose @set! or some custom function that just calls @set! to the user.

lmiq · April 27, 2021, 2:37pm

Now that I think about it: does it make any difference to use mutable or immutable structs there? In the end, after conversion to a StructArray I will get an immutable struct anyway. Keeping the original struct as mutable is just a matter of convenience for building up the data.

Henrique_Becker · April 27, 2021, 3:07pm

Well, if the struct is mutable then what StructArray returns is a mutable object. If you always deal with such objects in the context of the StructArray, then I believe using mutable structs brings little to the table. But if you need to pass these objects to other functions that may need to mutate them, then they need to be mutable, and afterwards you can assign them back to the StructArray. This seems like a source of subtle bugs, though: updating the objects will not update them in the array, so you always need to remember to assign them back.

lmiq · April 27, 2021, 3:14pm

Apparently StructArrays will always return an immutable object, but since it wraps the fields of the structs in the form of arrays, those arrays are always mutable anyway:

julia> mutable struct A
         x::Int
       end

julia> using StructArrays

julia> v = [A(1), A(2)]
2-element Vector{A}:
 A(1)
 A(2)

julia> vsa = StructArray(v)
2-element StructArray(::Vector{Int64}) with eltype A:
 A(1)
 A(2)

julia> ismutable(vsa)
false

julia> vsa.x[1] = 3
3

julia> vsa
2-element StructArray(::Vector{Int64}) with eltype A:
 A(3)
 A(2)

What remains from the original struct is only the types and field names. So in any case one must take care with how that data is handled in other functions.

Skoffer · April 27, 2021, 3:32pm

Not exactly. vsa[1] in your example builds an object of type A, and as such it will be either mutable or immutable, depending on A. Of course, if you never use vsa[1], then it doesn’t matter.

Henrique_Becker · April 27, 2021, 3:32pm

I think you misunderstood what I said, I meant that:

julia> typeof(vsa[1])
A

julia> isimmutable(vsa[1])
false

And if A is a mutable struct then the object returned by vsa[1] is mutable:

julia> first_obj = vsa[1]
A(1)

julia> first_obj.x = 10
10

julia> first_obj
A(10)

There is a difference between the returned element object being mutable and the fact that changing it does not change the StructArray object.

lmiq · April 27, 2021, 4:18pm

Yes, of course, thanks.

That actually makes it even more important to try to keep them immutable if possible, otherwise one will be copying stuff all the time to and from the heap. Although in minimal examples it seems that the compiler is smart enough to not allocate a new instance of the struct, even if it is mutable:

julia> using StructArrays

julia> mutable struct A
         x::Int
       end

julia> vA = StructArray([A(1), A(2)])
2-element StructArray(::Vector{Int64}) with eltype A:
 A(1)
 A(2)

julia> f(v) = sum(v[i].x for i in eachindex(v))
f (generic function with 1 method)

julia> @btime f($vA)
  3.137 ns (0 allocations: 0 bytes)
3

It is probably better to check whether this actually matters in the specific example one is dealing with.

lmiq · April 27, 2021, 5:17pm

Actually this does not work. Note that @set! converted the StructArray into a regular array:

julia> using StructArrays

julia> using Setfield

julia> struct Atom
         index
         name
       end

julia> v1 = [ Atom(1,"C"), Atom(2,"N") ]
2-element Vector{Atom}:
 Atom(1, "C")
 Atom(2, "N")

julia> v2 = StructArray(v1)
2-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype Atom:
 Atom(1, "C")
 Atom(2, "N")

julia> @set! v2[1].name = "N"
2-element Vector{Atom}:
 Atom(1, "N")
 Atom(2, "N")

piever · April 27, 2021, 8:20pm

This issue seems to be popping up a lot recently, and there is a PR to clarify the docs: "Expand documentation, add discussion on counterintuitive behavior" by jlchan (Pull Request #188, JuliaArrays/StructArrays.jl).

@Henrique_Becker’s explanation is 100% correct: the structures are not stored anywhere but generated on the fly. Once you generate it, it can be mutated but that won’t affect the underlying array.

You could try to use LazyRow(s) to generate custom row objects with a special setproperty! method that updates the array; see the docs in the JuliaArrays/StructArrays.jl repository on GitHub.

LazyRow(s) work even if the underlying struct is immutable, because they act directly on the component array, so it is strongly recommended to use immutable structs (generating them on the fly should be much more efficient).
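A short sketch of that approach, assuming StructArrays is installed. The concrete field types on Atom are an addition for this example; the structs in the original question leave their fields untyped.

```julia
using StructArrays

# Immutable element type, as recommended above (field types are an assumption).
struct Atom
    index::Int
    name::String
end

atoms = StructArray([Atom(1, "C"), Atom(2, "N")])

# A LazyRow refers to position 1 of `atoms`; setting a property on it
# writes directly into the corresponding field array.
row = LazyRow(atoms, 1)
row.name = "N"

# LazyRows iterates over such row views, so fields can be updated in a loop.
for r in LazyRows(atoms)
    r.index += 10
end
```

After this runs, atoms.name is ["N", "N"] and atoms.index is [11, 12], even though Atom itself is immutable. The syntax is close to the original atoms[1].name = "N", except that the row has to be requested explicitly.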
