Julia has a cumsum
function that operators on arrays, but not on lists. This seems to work:
cumsum2(itr) = foldl((x,y)->(x...,x[end]+y), itr)
Is this the best way to accomplish the task?
Julia has a cumsum
function that operators on arrays, but not on lists. This seems to work:
cumsum2(itr) = foldl((x,y)->(x...,x[end]+y), itr)
Is this the best way to accomplish the task?
You can collect
the iterator into an array: cumsum(collect(itr))
.
I’m not sure why there is no cumsum(itr)
definition.
Thanks, @yurvish. Yes, that works, too. I’m hoping for a solution that returns a list, mainly because I’m trying to better understand how to use functions like foldl
. But to be honest, I’m still unsure about when I should be using lists, and when I should be using arrays.
What do you mean by lists? That’s not a standard piece of terminology in the Julia world.
Is tuple
the right terminology?
Tuple support for cumsum was just merged:
Probably that’s what you wanted to use. Julia creates a tuple with (x...,)
:
julia> x = [1, 2, 3]
3-element Array{Int64,1}:
1
2
3
julia> y = (x...,)
(1, 2, 3)
julia> typeof(y)
Tuple{Int64,Int64,Int64}
If you are unsure if you need tuples or arrays, I think using arrays (vectors) is OK for most cases. You’d need tuples when you want to improve your program (e.g., performance).
BTW, you were very close to defining cumsum. One way to finish what you wrote is to use init=(0,)
and then remove it after foldl
is done:
julia> cumsum2(itr) = foldl((x,y)->(x...,x[end]+y), itr; init=(0,))[2:end]
cumsum2 (generic function with 1 method)
julia> cumsum2((1, 2, 3))
(1, 3, 6)
Seems like the init=(0,)
option isn’t needed:
cumsum2(itr) = foldl((x,y)->(x...,last(x)+y), itr)
julia> cumsum2((1,2,3))
(1, 3, 6)
@tkf, you mentioned the possibility of improved performance using Tuples. Where might I read more about this?
Note that your implementation works only when the elements of itr
are numbers (although it’s also the case with my init=(0,)
version). What I wanted to illustrate was that you should use init
most of the time when using foldl
with custom op
, and specifying it could be tricky sometimes. If you want to make it work with something other than numbers, e.g., cumsum2(([1], [2], [3]))
, I think you’d need to do something like what I did in the PR.
https://docs.julialang.org/en/latest/manual/performance-tips/ is a great resource for performance tips in general. But it looks like it doesn’t have much for tuples. The parts mentioning type-stability is somewhat relevant. For example, Julia can infer the type of last((1, "a"))
but not last([1, "a"])
. Another reason why tuple could be useful for performance is that it can be stack-allocated. You can probably find something relevant if you dig into StaticArray
-vs-Array
or struct
-vs-mutable struct
.
If you come from the lazy functional programming perspective, the usual tricks of defining a function on lists recursively work quite well with small tuples in julia. You can use first
and Base.tail
to split a tuple into the first element and the remaining ones. Using multiple dispatch, you can have separate procedures for a general tuple (t::Tuple
) and for the empty tuple (t::Tuple{}
).
For example here you could do:
# we cheat by adding an init argument, as it makes the recursion easier
function cumsum2(init, t::Tuple)
f, ls = first(t), Base.tail(t)
return (init, cumsum2(init + f, ls)...)
end
cumsum2(init, ::Tuple{}) = (init,)
cumsum2(t::Tuple) = cumsum2(first(t), Base.tail(t))
cumsum2(::Tuple{}) = () # we need this as first(t) would error on the empty tuple
This will be extremely performant on small tuples (for constant inputs, the computation even happens at compilation time), but you may need to be careful that compilation times become problematic for large tuples.
Thanks for the helpful responses, @tkf and @piever!
This will be extremely performant on small tuples (for constant inputs, the computation even happens at compilation time), but you may need to be careful that compilation times become problematic for large tuples.
Roughly, what qualifies as a “small” tuple?
I think in julia Base generally they use this trick only for tuples up to 16 elements, so I guess that’s a reasonable cut off. If it’s much longer than that, you are probably better off using arrays.