Lazy and eager API design

This is a somewhat general API design question, and I am still wondering about the idiomatic “Julian” way of formulating it, so I thought I would ask here.

Suppose I have a vocabulary of verbs f, g, h, which operate on some objects. These operations can be either lazy or eager (think e.g. Base.filter vs Iterators.filter). Both the lazy and the eager versions can be advantageous, and the user should feel free to decide when to actually instantiate intermediate objects (e.g. after benchmarking or profiling; assume there is no good general answer because it depends on dimensions etc.).

I can imagine various approaches:

  1. Use a wrapper type for the lazy version, with the capitalized type name for the lazy one and the lowercase name for the eager one, e.g. F vs f. This can be neat, but it exposes the implementation (perhaps the user should not care about the wrapper type).

  2. Put the lazy and eager versions in submodules, e.g. ThatModule.f and ThatModule.Lazy.f. This can be cumbersome and confusing, and precludes having both in the same namespace.

  3. Govern laziness with a keyword, e.g. f(x; lazy = true). This is not type stable (unless we rely on constant propagation).
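To make options 1 and 3 concrete, here is a minimal sketch for a made-up verb that doubles every element. The names Doubled, doubled and doubled_kw are hypothetical and exist only for this illustration.

```julia
# Option 1: the capitalized wrapper type is the lazy version.
# `Doubled` is a hypothetical name used only for this sketch.
struct Doubled{T}
    parent::T
end

Base.length(d::Doubled) = length(d.parent)
# Do not promise an element type; let `collect` work it out from the values.
Base.IteratorEltype(::Type{<:Doubled}) = Base.EltypeUnknown()

function Base.iterate(d::Doubled, state...)
    next = iterate(d.parent, state...)
    next === nothing && return nothing
    x, s = next
    return 2x, s
end

# ... and the lowercase function is the eager version.
doubled(xs) = collect(Doubled(xs))

# Option 3: one function, laziness selected by a keyword.
# The return type depends on the *value* of `lazy`, so a caller only gets a
# concrete inferred type if the compiler propagates the constant.
doubled_kw(xs; lazy = false) = lazy ? Doubled(xs) : collect(Doubled(xs))
```

With this sketch, doubled([1, 2, 3]) returns a Vector, while doubled_kw([1, 2, 3]; lazy = true) returns a Doubled wrapper that does no work until it is iterated.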

What would you recommend?

How about making f(x) lazy by default and adding a run(f(x)) for the eager one (so eagerness is explicit)? It is a bit similar to the mutating-function approach, where the non-mutating version calls the mutating one on an internal copy of the input. I consider lazy the more ‘generic’ case as it has state.
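A rough sketch of that analogy, with hypothetical names throughout (scale!, scale, lazy_square, and run_eager as a stand-in for the proposed run, chosen so it does not clash with Base.run):

```julia
# Mutating pattern: the non-mutating verb calls the mutating one
# on an internal copy of the input.
scale!(v, a) = (v .*= a; v)
scale(v, a) = scale!(copy(v), a)

# Analogous lazy pattern: the verb itself is lazy ...
lazy_square(xs) = (x^2 for x in xs)   # a Base.Generator; nothing is computed yet

# ... and eagerness is an explicit, separate step.
# `run_eager` is a hypothetical stand-in for the `run` suggested above.
run_eager(itr) = collect(itr)

v = [1, 2, 3]
squares = run_eager(lazy_square(v))   # the squares are only computed here
```

In both patterns the "primitive" is the more general operation (mutating, or lazy), and the convenient version is a thin wrapper around it.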

I like 3. as well; for the lazy/eager dispatch you could create methods with a different number of arguments or dispatch on some singleton…
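Dispatching on a singleton could look roughly like the following. The types Lazy and Eager and the verb keep are made up for this sketch; only filter and Iterators.filter come from Base.

```julia
# Hypothetical singleton types that select the evaluation strategy.
abstract type Evaluation end
struct Lazy <: Evaluation end
struct Eager <: Evaluation end

# The strategy is part of the argument *types*, so each method has a
# single, inferable return type (unlike a Bool keyword).
keep(::Lazy, pred, xs) = Iterators.filter(pred, xs)
keep(::Eager, pred, xs) = filter(pred, xs)

# Method with fewer arguments; defaulting to eager is an arbitrary choice here.
keep(pred, xs) = keep(Eager(), pred, xs)

evens_lazy = keep(Lazy(), iseven, 1:6)    # an Iterators.Filter, no work done yet
evens_eager = keep(Eager(), iseven, 1:6)  # a Vector
```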

Lazy-as-default seems sane to me – if the user wants the eager computation then they can wrap the lazy function call in Base.collect or YourModule.materialize or something similar.
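For instance, with the Base functions mentioned at the top of the thread, wrapping the lazy call in collect gives the same result as the eager call. This is only a small illustration; Iterators.map needs a reasonably recent Julia (1.6 or later, as far as I know).

```julia
xs = 1:10

lazy_evens = Iterators.filter(iseven, xs)   # lazy: just wraps `xs`
eager_evens = filter(iseven, xs)            # eager: allocates a Vector

# Materializing the lazy version gives the same elements.
same = collect(lazy_evens) == eager_evens

# Lazy verbs compose without intermediate arrays; only the final
# `collect` allocates.
pipeline = Iterators.map(x -> x^2, Iterators.filter(iseven, xs))
result = collect(pipeline)
```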

This is probably somewhere that could use some coordination across packages. In IndexedTables all operations are eager by default, but lazy versions are often also possible, just not exposed to the user, and I’m curious what would be a nice API. Lazy by default here would be very breaking unfortunately, even though it feels like the cleanest design (with some collect or copy to materialize it).

Many interfaces in Julia (including Base, the standard libraries, and various packages) offer both lazy and eager versions (in some cases via a package complementing an existing API).

I agree that coordinating some general syntax for this would be nice, so the solution would ideally be one which is free from type piracy (so keywords are out).

I am somewhat reluctant to take a general stand on whether being eager or lazy should be the default, as I think it depends on the problem. An ideal API would just have both.

LazyArrays.jl now has Applied, much like Broadcasted, to represent generic call trees. There is a PR (or rather my suggestion) for creating non-materialized Broadcasted and Applied objects with a macro @~. You can get a lazy object from, e.g., @~ f(g(h.(x) .+ y), z).

I think it would be nice to have a common syntax like the @~ macro and a common representation like Applied for lazy APIs. You can then specialize materialize for Applied{<:ApplyStyle, typeof(YOUR_FUNCTION)} to evaluate the call tree. It may be nice to also have something equivalent to Broadcast.instantiate to turn Applied{<:ApplyStyle, typeof(YOUR_FUNCTION)} into the public lazy API of your package.
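The Broadcasted half of this can already be seen in Base without any package. The sketch below builds the lazy tree for sin.(x) .+ y by hand and materializes it later. It does not use LazyArrays or the proposed @~ macro, and it relies on Base.Broadcast names that are not exported, so treat it as an illustration of the idea rather than a recommended public API.

```julia
using Base.Broadcast: Broadcasted, broadcasted, instantiate, materialize

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]

# Lazy call tree for `sin.(x) .+ y`; nothing is evaluated here.
bc = broadcasted(+, broadcasted(sin, x), y)

# The tree records the function and its (possibly nested) arguments,
# which is what makes specializing on `typeof(f)` possible.
outer_f = bc.f               # the function `+`
inner = bc.args[1]           # another Broadcasted, wrapping `sin`

# `instantiate` computes and checks the axes; `materialize` evaluates the tree.
checked = instantiate(bc)
result = materialize(bc)
```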

Are there relevant iterators that cannot be lazy by design?

Either way, why didn’t we go for a lazy-by-default pattern to get people used to collecting/materializing? That way all functions that return iterators of some kind would only materialize where expected.
Maybe even make iterators that cannot be lazy (if such exist) hide the full result, and specialize collect for them so that it returns the full result immediately.