Hi Julians,
I’ve found myself wanting convenient ways to manipulate arrays of struct-like objects, such as:
julia> df = [(x = 1,), (x = 2,), (x = 3,), (x = 4,), (x = 5,)];
julia> “df.y = df.x .^ 2” # pseudocode
5-element Vector{NamedTuple{(:x, :y), Tuple{Int64, Int64}}}:
(x = 1, y = 1)
(x = 2, y = 4)
(x = 3, y = 9)
(x = 4, y = 16)
(x = 5, y = 25)
Think of it like broadcasting the getproperty call df.x to produce getproperty.(df, :x), and doing something similar for the assignment.
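To make the idea concrete, here is a minimal sketch of the same operation written out in plain Julia, without any macro. Using merge for the assignment step is just one way to do it, chosen for illustration; the names xs, ys and df2 are made up for this example.

```julia
df = [(x = 1,), (x = 2,), (x = 3,)]

# Reading a field: broadcast getproperty over the array.
xs = getproperty.(df, :x)

# "Assigning" a field: NamedTuples are immutable, so build a new array
# by merging a one-field NamedTuple into each element.
ys = xs .^ 2
df2 = map((row, y) -> merge(row, (y = y,)), df, ys)
```

Here df2 holds the same rows as df with an extra y field, which is the result the pseudocode above is after.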
The style is similar to using DataFramesMeta.jl,
julia> using DataFrames, DataFramesMeta
julia> @transform(DataFrame(df), :y = :x.^2)
5×2 DataFrame
 Row │ x      y
     │ Int64  Int64
─────┼──────────────
   1 │     1      1
   2 │     2      4
   3 │     3      9
   4 │     4     16
   5 │     5     25
except that DataFrames are not quite as flexible because their columns are constrained to one dimension. On the other hand, Julia’s native broadcasting allows you to extend dimensions easily, and isn’t as fussy about preserving lengths.
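As a small illustration of what "extending dimensions" means here, the sketch below combines a length-3 column with a 1×2 row using ordinary broadcasting. The variable names col, row, z and grid are only for this example.

```julia
col = 1:3        # behaves like a length-3 column
row = (1:2)'     # adjoint of a range: a 1×2 row

# Broadcasting expands both operands to a common 3×2 shape.
z = col .* row

# The same rule applies when each element is a NamedTuple:
# broadcast an anonymous function over the two axes.
grid = ((a, b) -> (x = a, y = b, z = a * b)).(col, row)
```

Both z and grid end up as 3×2 matrices, something a table with one-dimensional columns cannot represent directly.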
This train of thought led to the following few lines of Julia, which define a “dot broadcasting” macro called @.. (two dots).
Macro definition
using MacroTools: prewalk, @capture

# Allow "setting" a field of an immutable NamedTuple by building a new one.
# An existing field is replaced; a new field is appended.
function setfield(nt::NamedTuple, value, field)
    names = Tuple(keys(nt) ∪ (field,))
    NamedTuple{names}(k == field ? value : nt[k] for k ∈ names)
end

# Rewrite every `x.k = y` into a broadcast setfield call,
# and every other `x.k` into a broadcast getindex call.
broadcast_dot_operator(expr) = prewalk(expr) do node
    if @capture(node, x_.k_ = y_)
        :( $x = $setfield.($x, $y, $(Meta.quot(k))) )
    elseif @capture(node, x_.k_)
        :( getindex.($x, $(Meta.quot(k))) )
    else
        node
    end
end

macro var".."(expr)
    broadcast_dot_operator(esc(expr))
end
which allows you to do such things as
julia> df = [(x = 0,)]; # start with single ‘data point’
julia> @.. df.x = 1:3 # easily extend dimensions
3-element Vector{NamedTuple{(:x,), Tuple{Int64}}}:
(x = 1,)
(x = 2,)
(x = 3,)
julia> @.. begin # easily add dimensions
           df.y = (1:2)'
           df.z = df.x .* df.y
       end
3×2 Matrix{NamedTuple{(:x, :y, :z), Tuple{Int64, Int64, Int64}}}:
(x = 1, y = 1, z = 1) (x = 1, y = 2, z = 2)
(x = 2, y = 1, z = 2) (x = 2, y = 2, z = 4)
(x = 3, y = 1, z = 3) (x = 3, y = 2, z = 6)
julia> @.. df.z = [df;;; df].z .* [1;;; 100]
3×2×2 Array{NamedTuple{(:x, :y, :z), Tuple{Int64, Int64, Int64}}, 3}:
[:, :, 1] =
(x = 1, y = 1, z = 1) (x = 1, y = 2, z = 2)
(x = 2, y = 1, z = 2) (x = 2, y = 2, z = 4)
(x = 3, y = 1, z = 3) (x = 3, y = 2, z = 6)
[:, :, 2] =
(x = 1, y = 1, z = 100) (x = 1, y = 2, z = 200)
(x = 2, y = 1, z = 200) (x = 2, y = 2, z = 400)
(x = 3, y = 1, z = 300) (x = 3, y = 2, z = 600)
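For reference, the rewriting that the macro performs can be written out by hand. The sketch below repeats the setfield helper from the macro definition (it only needs Base) and then spells out roughly what two simple @.. calls turn into. It is a hand expansion for illustration, not the literal output of @macroexpand.

```julia
# setfield as defined in the macro definition above (Base only).
function setfield(nt::NamedTuple, value, field)
    names = Tuple(keys(nt) ∪ (field,))
    NamedTuple{names}(k == field ? value : nt[k] for k ∈ names)
end

df = [(x = 0,)]

# `@.. df.x = 1:3` becomes, roughly:
df = setfield.(df, 1:3, :x)

# `@.. df.y = df.x .^ 2` becomes, roughly:
df = setfield.(df, getindex.(df, :x) .^ 2, :y)
```

The first call replaces the existing x field while stretching the one-element array to length 3; the second appends a new y field to every element.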
I’m wondering if a comparable kind of broadcasting for getproperty and setproperty is already defined in some package. Is this kind of notation already in use? If not, should it be a package?