Recursive Accessor descend condition

Is it possible to write this with Accessors.Recursive?

using Accessors

modifyrec(f, x) =
    if hasproperty(x, :bs)
        (@modify(x.bs |> Elements()) do b
            modifyrec(f, b)
        end) |> f
    else
        f(x)
    end

let data = (;a=1, bs=[(;a=2, bs=[(;a=3, bs=[(;a=4)])])])
    modifyrec(data) do x
        @set x.a = 100 * x.a
    end
end
# (a = 100, bs = [(a = 200, bs = [(a = 300, bs = [(a = 400,)])])])

Imo Accessors.Recursive isn’t the most generic or straightforward to use…

I’d suggest this:

julia> using AccessorsExtra

julia> data = (;a=1, bs=[(;a=2, bs=[(;a=3, bs=[(;a=4)])])])

julia> @modify(a -> 100*a, data |> RecursiveOfType(NamedTuple, order=:pre) |> _.a)
(a = 100, bs = [(a = 200, bs = [(a = 300, bs = [(a = 400,)])])])

I would still like to specify that recursion happens on _.bs |> Elements() if present. Is that possible? Ie, the “descend condition” should be hasproperty(_, :bs) and the way to descend is examining the elements of _.bs.

Use either RecursiveOfType(NamedTuple) |> If(x -> hasproperty(x, :bs)) or RecursivePred(x -> hasproperty(x, :bs)).

Use RecursivePred(x -> hasproperty(x, :bs), Elements() ∘ (@maybe _.bs), order=:pre) or analogously with RecursiveOfType.
Btw, I never – not even once – needed to explicitly specify the way to descend in practice. Maybe it’s not necessary in your case as well?


Note: inference may start suffering with complex queries at some point…

How should a case like this be treated?

data = (;a=1, bs=[(;a=2, bs=[(;a=3, bs=[(;a=4)])])],cs=(;a=5,bs=(;a=6)))

Right, good question.

modifyrec(data) do x
  @set x.a = 100 * x.a
end
(
  a = 100, 
  bs = [(a = 200, bs = [(a = 300, bs = [(a = 400,)])])], 
  cs = (a = 5, bs = (a = 6,)),
)
modify(f, data, RecursivePred(Returns(true), Elements()∘(@maybe _.bs); order=:post)) # good
#  (a = 100, bs = [(a = 200, bs = [(a = 300, bs = [(a = 400,)])])], cs = (a = 5, bs = (a = 6,)))

I think in my cases the key or propertyname is how I know the semantics of the value inside. The typeof a value does not capture much about its interpretation; it’s the interface and more importantly its relation to its context – namely its access path optic _.bs – that tells me what the data means and hence what modifying it would mean.

For example if it’s some big json I can follow a path of .children[*].children[*].children[*]... but I wouldn’t want to apply a transformation all over a whole tree *[*].*[*].* lest it accidentally grab some other field .children[100].stepchildren[*].children[] either in the present document or added in a later edition of the data.

By the way what’s happening with the order=nothing case here? I was guessing it would default to one or the other but it seems to do something else.

julia> let data = (;a=1, bs=[(;a=2, bs=[(;a=3, bs=[(;a=4)])])],cs=(;a=5,bs=(;a=6)))
           f(x) = @set x.a = 100 * x.a
           modifyrec(f, data)
           [
               modify(f, data, RecursivePred(x -> x.a ≤ 3, Elements()∘(@maybe _.bs); order=:pre)) ,
               modify(f, data, RecursivePred(x -> x.a ≤ 3, Elements()∘(@maybe _.bs); order=:post)) ,
               modify(f, data, RecursivePred(x -> x.a ≤ 3, Elements()∘(@maybe _.bs); order=nothing)) 
           ]
       
       end
3-element Vector{@NamedTuple{a::Int64, bs::Vector{@NamedTuple{a::Int64, bs::Vector{@NamedTuple{a::Int64, bs::Vector{@NamedTuple{a::Int64}}}}}}, cs::@NamedTuple{a::Int64, bs::@NamedTuple{a::Int64}}}}:
 (a = 100, bs = [(a = 200, bs = [(a = 300, bs = [(a = 4,)])])], cs = (a = 5, bs = (a = 6,)))
 (a = 100, bs = [(a = 200, bs = [(a = 300, bs = [(a = 4,)])])], cs = (a = 5, bs = (a = 6,)))
 (a = 100, bs = [(a = 2, bs = [(a = 3, bs = [(a = 4,)])])], cs = (a = 5, bs = (a = 6,)))

is it possible to apply (in a “simply” way with the Accessor syntax) the function f() to every key :a that has :bs as a parent?

order=nothing doesn’t descend into matching values at all. This is arguably the most natural and intuitive default: you generally want modify(x -> 2*x, data, RecursiveOfType(Number)) to multiply numbers by two only once, not several times for stuff like complex(10u"m").

I haven’t documented this flag originally because wasn’t sure that order=:pre/:post/nothing is the right/useful set of choices. Over time it seems quite natural, probably it should finally be documented… And ideally, when the semantics of recursive optics is fleshed out and clear, they can be upstreamed to Accessors proper.

I like the idea of specifying traversal order and the predicate specification, as you’re showing there.

That said I was worried about .stepchildren and @rocco_sprmnt21’s .cs above and I think complex(10m) is another example. Early stopping with order=nothing is one way to solve it but it still makes me nervous to specify “what you’ll find there” versus “how to get there” because there might be false positives.

My instinct might be something like

RecursivePre(p, optic) = ...
RecursivePost(p, optic) = ...
RecursivePre(optic) = RecursivePre(Returns(true), optic)
RecursivePost(optic) = RecursivePost(Returns(true), optic)

because

  • I can specify how to descend
  • don’t need to specify a stopping condition if I don’t need one
  • it avoids flag kwargs
  • nothing isn’t really an order

Sure, probably best to call it :nodescent or something along those lines :slight_smile:

Idk, why having three types that dispatch to the same code is better than having a kwarg with three possible values?

Could you please clarify and motivate the differences from the current state?
I catch two differences:

  • kwargs with three possible values → three types
  • 1-arg constructor fixes the first argument (predicate) instead of the second (optic).

Anything I missed?

I’m not against changes here, but would like to understand both the actual suggestions and the underlying reasoning.

Yeah I think that’s all I’m saying there, though the “yolo”-style recursion I was knocking above makes me think we must be coming from totally different planets, and I should hold my tongue till I understand your position more. I’m curious what kind of data and applications justify that approach.


If I understand, the current RecursivePred is more or less

children(x::NamedTuple) = collect(x)
children(x::Vector) = copy(x)
children(x::Int) = Int[]

Accessors.set(x::NamedTuple, ::typeof(children), val::Array) = NamedTuple{propertynames(x)}(Tuple(val))
Accessors.set(x::Array, ::typeof(children), val::Array) = copy(val)
Accessors.set(x::Int, ::typeof(children), val::Array) = x


rec_nodescend(f, p, x) = let
    if p(x)
        f(x)
    else
        @set children(x) = rec_nodescend.(f, p, children(x))
    end
end

rec_pre(f, p, x) = let
    x = @set children(x) = rec_pre.(f, p, children(x))
    p(x) ? f(x) : x
end

let
    data = (;a=1, bs=[(;a=2, bs=[(;a=3, bs=[(;a=4)])])],cs=(;a=5,bs=(;a=6)))
    [
        rec_nodescend((x->100x), (x -> x isa Number), data),
        rec_pre((x->100x), (x -> x isa Number), data)
    ]
end
2-element Vector{@NamedTuple{a::Int64, bs::Vector{@NamedTuple{a::Int64, bs::Vector{@NamedTuple{a::Int64, bs::Vector{@NamedTuple{a::Int64}}}}}}, cs::@NamedTuple{a::Int64, bs::@NamedTuple{a::Int64}}}}:
 (a = 100, bs = [(a = 200, bs = [(a = 300, bs = [(a = 400,)])])], cs = (a = 500, bs = (a = 600,)))
 (a = 100, bs = [(a = 200, bs = [(a = 300, bs = [(a = 400,)])])], cs = (a = 500, bs = (a = 600,)))

My general tendency is to avoid flag kwargs for somewhat “taste”-type reasons. Part of the motivation for this practice is to let the type checker help, and because I think “stringly”-typed stuff is just a little ugly.

But also for “inversion of control” reasons—do we have a proof that there are only two (or three) “orders” for these operations such that users won’t want to supply their own?:

“Order” is currently being used for both (1) deciding whether f applies to x or y where y = @set children(x) = f.(children(x)), and (2) whether to apply f to the children of x, in the case of :nodescend. For example does p need to be evaluated before f is applied to x's children or could we decide whether to apply f to the current value based on the result of apply f to x’s children?: p(y) ? f(y) : y. Do we also want :noascend?

If this isn’t a closed enum of recursion strategies, extensible external control might be warranted, versus internal branching on a symbol?

I’m more or less just going from my gut here, so please take all these comments with a large grain of salt.

Correct, for the modify(f, x, RecursivePred) part. As an optic, it also supports getall() and setall(), and composes with others.

Indeed, I think I remember your opinion that it’s “illegal” to access arbitrary properties in Julia :slight_smile: This question may be debatable in principle, but here I look from a more pragmatic PoV.

Just searched for RecursiveOfType occurrences in my code, and all the constructor calls are single-argument – never (literally not even once!) I had to specify the second parameter, customizing the descent rule. Feels like a good argument for this being the default :slight_smile:

Some examples include:

# conversion:
modify(Float32, x, RecursiveOfType(Float64))
modify(String, x, RecursiveOfType(AbstractString))

# replace distributions/uncertainties with some representative value
modify(mean, x, RecursiveOfType(Distribution))
modify(value, x, RecursiveOfType(Uncertain.Value))

# in https://github.com/baggepinnen/MonteCarloMeasurements.jl/pull/173/files
@getall x |> RecursiveOfType(Particles) |> length(_.particles)
modify(x, RecursiveOfType(Particles)) do p
    p.particles[i]
end

# for quick ad-hoc optimization:
rawvals = getall(x, RecursiveOfType(AbstractFloat))
... rawvals is a vector/tuple of floats ...
setall(x, RecursiveOfType(AbstractFloat), rawvals)

# AST manipulation
# get all function calls:
@getall x |> RecursivePred(e -> Base.isexpr(e, :call); order=:pre) |> _.args[1]
# wrap all function calls as myfunc(f, args...):
modify(x, RecursivePred(e -> Base.isexpr(e, :call); order=:pre)) do call
    :(myfunc($(call.args...)))
end

# JSON3 to Dictionary conversion
modify(Dictionary, JSON3.read(...), RecursiveOfType(JSON3.Object; order=:pre))

It’s just so convenient to tell what kind of object you want to extract/modify, no matter how exactly it can be reached within the structure!

I wonder what your potential usecases are, and why you think automatic descent won’t work there.

1 Like

I guess order isn’t really the best name here (suggestions welcome!).
It defines one thing: what to do when at some step the “current” object x matches the predicate. Options are (for modify):

  1. replace x with f(x)
  2. replace x with f(x); then descend into f(x) and replace it with the result
  3. descend into x and replace it with the result x'; then replace it with f(x')

I think I only used #1 (nothing) and #3 (:pre).
These three seem to be reasonably generic and tightly related, so they are set with a single argument now.

Thanks for digging these up. I think these snippets show the kind of values sought, rather than the context they’re in, and it’s the context that I’m mostly worried about. I’m inferring that in most of those cases, you’re confident its safe to match on types because

  • (1) you control the data values and (2) what types they have, and (3) you don’t expect the data types to gain or lose propertynames in the future, or
  • in the json case, you don’t control the data, but (5) the types are restricted and (6) you’re not doing semantically significant changes (just converting from Object to Dictionary)

Is that basically your correctness argument, or is there something else?

I don’t want to bore you to death by rehashing my concerns, but since you ask…:slight_smile: :

False positive matches

How can I be confident that in all the data that this code will run on, there won’t be any numbers other than the ones I want?

  • False positive data. Say I get a large JSON response from some web API. I’m only interested in one descent pathway .children and I don’t want to read the docs for all the other potentially many other fields and subfields that optionally appear in this object. A hidden embedded .stepchildren field spoils my .children genealogy:
data = (;age=1, children=[(;age=2, children=[(;age=3, children=[(;age=4, children=[(;children=[(;age=5, children=[], stepchildren=[(;age=100, children=[])])])])])])])

mean(getall(data, RecursiveOfType(Int))) # secret stepchild
  • False positive metadata. Dictionary has plenty of internal Numbers like .indices.hashes etc and I wouldn’t want to accidentally getall those. This returns all sorts of internal numbers:
let d = dictionary([:a => dictionary([:x => [100]]), :b => dictionary([:x => [101]])])
    getall(d, RecursiveOfType(Int))
end

False negative matches

  • Data-value types: If I’m querying a JSON for RecursiveOfType(Integer) and then JSON3.read decides to parse json "3.0" as Float64 instead of Int64, I’ll miss it and get the wrong value.

  • Library changes: How can I be confident that the values I want will always have type JSON3.Object rather than having been replaced by a JSON3.NewObject when JSON3.jl introduced a new type, and then they’ll be missed and I’ll get the wrong result?

Thanks for listening. :slight_smile:

Isn’t #3 a post-order traversal, since it applies f on x' instead of on x?

Would BreadthFirst() be another acceptable order?

Indeed, that was my point :slight_smile: Looking forward to a real/realistic examples when this automatic descent doesn’t work!
Preferably, concrete usecases one may encounter in practice.

Not really… The main reason to use recursive descent here is because the type and exact structure of the data may change, and I don’t want to manually write and update all relevant paths.
Types (2) and propertynames (3) may easily change: from the most basic case of namedtuples with arbitrary properties, to more involved changes to structs in question.

I guess this was the case in my JSON example indeed…
Also remember that it’s not just modify: one could write a reasonable query like

@getall JSON3.read(...) |> RecursiveOfType(JSON3.Object) |> If(x -> x[:kind] == :mykind) |> _[:mykey]
1 Like

Well, if something unexpected can change in your data or library, you need to check and potentially update your code. Doesn’t seem specific to optics or recursive descent at all!

Then don’t use catch-all recursive descent on these types :slight_smile: Remember that using Children() to descend is just the default, one is free to provide any optic there.

For containers specifically, it makes sense to define children as elements instead of properties. This is easy to add and already done for arrays (AccessorsExtra.jl/src/recursive.jl at master · JuliaAPlavin/AccessorsExtra.jl · GitHub) as they don’t have properties at all.

Sure, you are right, thanks for catching!

Idk, what would be the difference in the result? And do you have some potential applications in mind?