FLoops.jl provides a macro `@floop` that offers an alternative “backend” for the `for` loop syntax, based on the mechanism provided by Transducers.jl. It can be used to generate fast generic iteration over complex collections.
I think the iteration mechanism of Transducers.jl (`foldl`) has many advantages over the way the `for` loop is currently implemented (`iterate`). However, I’ve realized that the functional style of Transducers.jl can be a cognitive overhead that impedes its adoption. By lowering the familiar `for` syntax to `foldl`, I’m hoping that `foldl` becomes much more accessible to many Julia users.
This package is not registered yet. I thought I’d post it here first to gauge interest. I have used raw `foldl` so much that I’m still not sure whether I personally really need this. However, I think it’d be great if FLoops.jl could give authors of data collections some incentive to define `foldl`.

Run `] add https://github.com/tkf/FLoops.jl.git` in the REPL to install the package.
Quoting the Usage section of the README:
Simply wrap a `for` loop and its initialization part with `@floop begin ... end`:

```julia
julia> using FLoops  # exports @floop macro

julia> @floop begin
           s = 0
           for x in 1:3
               s += x
           end
       end
       s
6
```
When accumulating into pre-defined variables, simply list them between `begin` and `for`. `@floop` also works with multiple accumulators.

```julia
julia> @floop begin
           s
           p = 1
           for x in 4:5
               s += x
               p *= x
           end
       end
       s
15

julia> p
20
```
The `begin ... end` block can be omitted if the `for` loop does not require local variables to carry the state:

```julia
julia> @floop for x in 1:3
           @show x
       end
x = 1
x = 2
x = 3
```
`@floop` is better because `foldl` is better than `iterate`. Here are some demonstrations. This is a recap if you have already heard about Transducers.jl.
@floop is fast for complex collections
This is the ratio (`baseline / target`) of the time it takes to run

```julia
@floop begin
    acc = 0.0
    for x in xs
        acc += x
    end
end
```

with and without `@floop` (so a larger value means `@floop` is better).
The input collections are generated by

```julia
using BlockArrays  # provides `mortar`
using FillArrays   # provides `Zeros`

floats = randn(1000)
dataset = [
    "Vector" => floats,
    "filter" => Iterators.filter(!ismissing, ifelse.(floats .> 2, missing, floats)),
    "flatten" => Iterators.flatten((floats, (1, 2), 3:4, 5:0.2:6, Zeros(1000))),
    "BlockVector" => mortar([floats, floats]),
]
```
As you can see, `@floop` is beneficial for collections with more complex structure. In particular, `@floop` is much faster for chunked/blocked collections like `BlockVector`.
Deterministic setup and teardown
`foldl` is also useful for building robust and correct APIs. For example, `eachline(::AbstractString)` is not safe to use with `break` and is also not exception-safe; the file object is not closed deterministically. Note that this is not because the implementation of `eachline` is careless. It is simply a limitation of `iterate`.

Defining a safer version of `eachline(::AbstractString)` is much simpler with `foldl`:
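```julia
using Transducers
using Transducers: @next, complete

function safe_eachline(filename::AbstractString; keep=false)
    return AdHocFoldable(filename) do rf, acc, filename
        # `open(...) do` closes the file when the block exits,
        # including on an error or early termination of the fold.
        open(filename) do io
            while !eof(io)
                # Feed one line to the reducing function `rf`.
                acc = @next(rf, acc, readline(io; keep=keep))
            end
            return complete(rf, acc)
        end
    end
end

@floop for ln in safe_eachline(".gitignore")
    @show ln
end
```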
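To see why a fold makes cleanup deterministic, here is a minimal sketch that uses only `Base`. The helper name `fold_lines` is hypothetical and not part of Transducers.jl or FLoops.jl. The point is that the collection drives the loop, so it can wrap the whole iteration in `open(...) do ... end`:

```julia
# Hypothetical helper (not from any package): left-fold over the lines of a file.
function fold_lines(step, init, filename::AbstractString)
    # `open(f, filename)` closes the file when `f` returns or throws.
    open(filename) do io
        acc = init
        while !eof(io)
            acc = step(acc, readline(io))
        end
        return acc
    end
end

# Usage: sum the line lengths of a small temporary file.
path, io = mktemp()
write(io, "a\nbb\nccc\n")
close(io)

nchars = fold_lines(0, path) do acc, ln
    acc + length(ln)
end

rm(path)
```

With `iterate`, the caller drives the loop instead, so the collection has no single place from which to guarantee that the file is closed.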
This mechanism is useful for any kind of container that needs some resources during the loop (e.g.,
How it works
`@floop` works by converting the native Julia `for` loop syntax to `foldl` defined by Transducers.jl. Unlike `foldl` defined in `Base`, `foldl` defined by Transducers.jl is powerful enough to cover the `for` loop semantics and more.
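As a rough illustration of the idea (this is not the actual expansion produced by `@floop`), a `for` loop with accumulators can be read as a left fold, and `break` can be modelled by letting the step function signal early termination. The names `Done` and `myfoldl` below are hypothetical stand-ins written with `Base` only; they are not the Transducers.jl API.

```julia
# Hypothetical marker meaning "stop folding and return this value".
struct Done{T}
    value::T
end

# A hand-written left fold that honors early termination.
function myfoldl(step, init, xs)
    acc = init
    for x in xs
        acc = step(acc, x)
        acc isa Done && return acc.value
    end
    return acc
end

# The loop `s = 0; for x in 1:3; s += x; end` written as a fold:
s = myfoldl((acc, x) -> acc + x, 0, 1:3)

# A loop that breaks once the running sum exceeds 5:
t = myfoldl(0, 1:100) do acc, x
    acc += x
    acc > 5 ? Done(acc) : acc
end
```

Here `s` is the plain sum, and `t` stops after the third element. A fold without such a termination signal cannot express `break`, which is one reason a richer `foldl` is needed to cover `for` loop semantics.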
It may be nice to extend `@floop` to parallel loops. However, this is where a `(map)reduce`-like approach is more appropriate, and I cannot come up with a syntax that naturally expresses `(map)reduce` (the parallel version of