You could also define

(s::Symbol)(data) = getproperty(data, s)

which would let you say


Would you mind applying this to the above example with X, T, and T1?


julia> :T1.(:T.([X...]))  .+ 0

Thanks :)

This is type piracy, though, and might have unexpected consequences.

Is there a way to improve the definition?

I don’t know of an easy and equally concise way. One way to avoid the piracy is to wrap the Symbol in your own type, e.g.:

julia> struct MySym
           s::Symbol
       end

julia> (s::MySym)(data) = getproperty(data, s.s)

julia> MySym(:T1).(MySym(:T).([X...]))  .+ 0
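
For a self-contained version of the wrapper approach, here is a minimal sketch. The sample data below is hypothetical (the thread's actual X is not shown here); only the field names T and T1 are taken from the example above.

```julia
# Wrapper type that owns the call overload, so nothing is added to Base's Symbol.
struct MySym
    s::Symbol
end

# Calling a MySym on some data reads the wrapped property from it.
(s::MySym)(data) = getproperty(data, s.s)

# Hypothetical stand-in for X: elements with a field T that has a field T1.
data = [(T = (T1 = 1,),), (T = (T1 = 2,),)]

# Broadcast the inner getter first, then the outer one.
result = MySym(:T1).(MySym(:T).(data)) .+ 0
```

Because the method is defined on a type you own, loading this alongside other packages cannot change how plain Symbols behave.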

One severe pitfall that hasn’t been mentioned is that .. has a pretty low operator precedence, equal to that of the range operator :. So unfortunately we’re stuck with

julia> :( A..x .^ 2 ) == :( A..(x .^ 2) )

Which is probably a good thing, because this confirms that .. was intended by the language designers to be used as a :-like range operator, so it’s unlikely that different packages will use it with wildly different semantics…
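
The precedence claim can be checked without any packages by comparing parsed expressions. This is only a small sketch; A, a, and x are placeholder names that are parsed but never evaluated.

```julia
# `..` binds more loosely than `.^`, so the power is applied to x first.
same_pow = Meta.parse("A..x .^ 2") == Meta.parse("A..(x .^ 2)")

# The same holds for `+`, which matches how the range operator `:` behaves.
same_plus  = Meta.parse("A..x + 1") == Meta.parse("A..(x + 1)")
same_colon = Meta.parse("a:x + 1") == Meta.parse("a:(x + 1)")
```

All three comparisons should come out true, which is why an expression like A..x .^ 2 cannot mean (A..x) .^ 2 without explicit parentheses.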

I’m working a lot with structures of the form


and often want to get an array of x. I can do


X ⋄ :a ⋄ :x

which is OK, but something like X..a..x would be much better.


julia> X=[

julia> using Accessors

julia> (@optic _.a.x).(X)
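
Since the definition of X is cut off above, here is a runnable sketch of the same idea. It assumes Accessors.jl is installed and reuses the two-element X that is printed further down in this thread.

```julia
using Accessors

# Sample data with the nested shape used in this thread.
X = [(a = (x = 1, y = 1), b = ()), (a = (x = 2, y = 3), b = ())]

# An optic is an ordinary callable, so it can be stored and broadcast.
getx = @optic _.a.x
xs = getx.(X)

# It extracts the same values as a hand-written closure.
same = xs == map(el -> el.a.x, X)
```

The optic is a value rather than syntax, so it can also be passed to map, filter, or sort(by = ...) like any other function.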

That’s really cool. Thank you.

Would it be difficult to extend it to something closer to X.a.x, say:

@optic X _.a.x

julia> using Accessors

julia> macro mapoptic(e, v)
           :(map((@optic $e), $v))
       end
@mapoptic (macro with 1 method)

julia> @mapoptic _.a.x X

Though if the interface you really want is just X.a.x, there should be a way to do that using StructArrays or something.

Indeed StructArrays.jl works well here:

julia> using StructArrays

julia> X = [(a=(x=1, y=1), b=()), (a=(x=2, y=3), b=())]
2-element Vector{@NamedTuple{a::@NamedTuple{x::Int64, y::Int64}, b::Tuple{}}}:
 (a = (x = 1, y = 1), b = ())
 (a = (x = 2, y = 3), b = ())

julia> Y = StructArray(X, unwrap=t -> t <: NamedTuple)
2-element StructArray(StructArray(::Vector{Int64}, ::Vector{Int64}), ::Vector{Tuple{}}) with eltype @NamedTuple{a::@NamedTuple{x::Int64, y::Int64}, b::Tuple{}}:
 (a = (x = 1, y = 1), b = ())
 (a = (x = 2, y = 3), b = ())

julia> Y[1]
(a = (x = 1, y = 1), b = ())

julia> Y.a.x

StructArrays are definitely the efficient and convenient data structure for this kind of columnar manipulation! You can store the dataset as a StructArray from the beginning and avoid any conversions.

Btw, Accessors now exports @o as an alias for @optic, to further encourage using this macro (:
map((@o _.a.x), X) works for all arrays, and for StructArrays it can be made as efficient as X.a.x. This optimization is more experimental, and for now is only in AccessorsExtra.jl, not in Accessors.jl proper.


This is brilliant.
Thank you

Is it possible to get StructArray-style syntax but maintain coupling with the underlying vector of structs?

Of course it’s possible, thanks to Julia’s composability (:
One approach is to make separate views of each component of your original array, and put them into a StructArray:

julia> using StructArrays, FlexiMaps, Accessors

julia> X = [(a=1, b=2), (a=3, b=4)]
2-element Vector{@NamedTuple{a::Int64, b::Int64}}:
 (a = 1, b = 2)
 (a = 3, b = 4)

julia> Y = StructArray(
           a=mapview((@o _.a), X),
           b=mapview((@o _.b), X),
           # repeat for all components you need
       )
2-element StructArray(::FlexiMaps.MappedArray{Int64, 1, PropertyLens{:a}, Vector{@NamedTuple{a::Int64, b::Int64}}}, ::FlexiMaps.MappedArray{Int64, 1, PropertyLens{:b}, Vector{@NamedTuple{a::Int64, b::Int64}}}) with eltype @NamedTuple{a::Int64, b::Int64}:
 (a = 1, b = 2)
 (a = 3, b = 4)

julia> Y[2]
(a = 3, b = 4)

# the new Y array actually refers to X values, and updates correspondingly
julia> Y.a .= [5, 6]
2-element FlexiMaps.MappedArray{Int64, 1, PropertyLens{:a}, Vector{@NamedTuple{a::Int64, b::Int64}}}:
 5
 6

julia> X
2-element Vector{@NamedTuple{a::Int64, b::Int64}}:
 (a = 5, b = 2)
 (a = 6, b = 4)

julia> Y.b[2] = 10

julia> X
2-element Vector{@NamedTuple{a::Int64, b::Int64}}:
 (a = 5, b = 2)
 (a = 6, b = 10)

But I personally think it’s better to just store your data in a StructArray in the first place. Are there any specific reasons you prefer a basic Vector here?

I use functions to operate on the underlying structures.
The basic vector is coupled to the underlying structs. The StructArray isn’t.

@kwdef mutable struct X

SA = StructArray( [X(1,1), X(2,2)] )

function f!( x::X )
    x.a += 10
end

SA.a[1] #1

v  = collect(SA)
v[1].a  #11

Unless you define it as you’ve done above.
My underlying structs might have a dozen fields.
Is there any shorthand to do the below for all fields?

Y = StructArray(
           a=mapview((@o _.a), X),
           b=mapview((@o _.b), X),
           # repeat for all components you need
)
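
One possible shorthand, offered as an untested sketch rather than an established recipe: build the columns programmatically from fieldnames. It assumes the StructArrays, FlexiMaps, and Accessors packages used above, constructs the PropertyLens type (seen in the printed output earlier) directly, and uses the small NamedTuple vector from the earlier example instead of a dozen-field struct.

```julia
using StructArrays, FlexiMaps, Accessors

X = [(a = 1, b = 2), (a = 3, b = 4)]

# One lazy, writable view per field of the element type.
names = fieldnames(eltype(X))
cols = map(n -> mapview(Accessors.PropertyLens{n}(), X), names)

# Label the views with the field names and wrap them in a StructArray.
Y = StructArray(NamedTuple{names}(cols))

# Writes go through the views to the original vector.
Y.a[1] = 5
```

The lenses are constructed from runtime symbols here, so this is likely less type-stable than spelling out each @o _.field by hand; whether that matters depends on how hot the code is.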

This is a common issue with StructArrays; see the "Some counterintuitive behaviors" page of the StructArrays documentation.

When you do SA[1] it creates an X struct on the fly. The f!.(SA) call operates on these on-the-fly structs so it has no effect: the on-the-fly structs are discarded at the end of the f! call.

To work around this, you can work on “lazy rows” rather than temporary structs:

using StructArrays

@kwdef mutable struct X

SA = StructArray( [X(1,1), X(2,2)] )

function f!( x )
    x.a += 10
end

SA.a[1] #11
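
Since the struct definition and the call site are cut off above, here is a complete sketch of the lazy-row workaround. The struct P and its field names a and b are hypothetical stand-ins for the thread's X; LazyRows is the StructArrays wrapper whose elements read and write the underlying columns.

```julia
using StructArrays

# Hypothetical two-field mutable struct standing in for the thread's X.
Base.@kwdef mutable struct P
    a::Int = 0
    b::Int = 0
end

SA = StructArray([P(1, 1), P(2, 2)])

# Untyped argument, so it accepts both P and lazy rows.
function f!(x)
    x.a += 10
end

# Iterating SA builds temporary P objects, so these mutations are lost.
foreach(f!, SA)
unchanged = SA.a[1]

# Lazy rows forward property writes to the columns of SA.
foreach(f!, LazyRows(SA))
updated = SA.a[1]
```

Note that f! has to drop the ::X annotation for this to work, because a lazy row is not an instance of the struct itself.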