Let’s say we have a simple 2D matrix
a = [0 1 2 3 4;
5 6 7 8 9;
10 11 12 13 14;
15 16 17 18 19]
It looks like you can apply a function to rows and columns like this
julia> map(sum, eachrow(a))
and you get the expected results.
You can also use a list comprehension like
julia> [sum(x) for x in eachrow(a)]
But if you try to do the same thing with pipes, it doesn’t work: the result looks as if you had asked for the column sums
a |> eachrow |> sum
and using eachcol instead gives you the row sums
julia> a |> eachcol |> sum
What’s all that about?
As a relative beginner, these results are quite confusing. This doesn’t seem like it makes sense.
If you broadcast the piped sum over each row it seems to work alright:
a |> eachrow |> x -> sum.(x)
# same values as sum(a, dims=2), which returns a 4×1 matrix rather than a vector
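To make the comparison with the dims keyword concrete, here is a minimal sketch using the matrix from the first post. The names row_sums and dims_sums are only illustrative.

```julia
a = [0 1 2 3 4;
     5 6 7 8 9;
     10 11 12 13 14;
     15 16 17 18 19]

# Broadcasting sum over the rows gives one number per row, as a plain vector.
row_sums = a |> eachrow |> x -> sum.(x)

# Reducing along dimension 2 gives the same numbers, but as a 4×1 matrix.
dims_sums = sum(a, dims=2)

println(row_sums)                     # [10, 35, 60, 85]
println(size(dims_sums))              # (4, 1)
println(row_sums == vec(dims_sums))   # true
```

Note that vec is needed for the comparison because a vector and a 4×1 matrix have different shapes.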
Thanks @rafael.guerra. In my mental model, sum receives the rows one at a time, so I’m not sure why you would need broadcasting in that case.
Can you offer any insights into why the naive pipe approach did not work, or why it seemed to be operating on columns and not rows?
eachrow(a) is an iterator of vectors, and + is defined for vectors. So sum is applying + to an accumulator.
julia> [1, 2] + [3, 4] # elementwise: [4, 6]
julia> sum([[1, 2], [3, 4]]) # vector of vectors, kind of like eachrow; also [4, 6]
Try collect(eachrow(a)) to see the vector of vectors that is piped into sum.
My best take on this would be to look at what the iterator created by “eachrow” yields:
julia> b = collect.(eachrow(a))
[0, 1, 2, 3, 4]
[5, 6, 7, 8, 9]
[10, 11, 12, 13, 14]
[15, 16, 17, 18, 19]
If you collect the result you see that you get a vector of vectors. Now you can check what happens if you apply sum to such an object:
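As a sketch of that check, using the same a and b as above (total is just an illustrative name): summing a vector of vectors adds the vectors to each other elementwise, which is why the result looks like column sums.

```julia
a = [0 1 2 3 4;
     5 6 7 8 9;
     10 11 12 13 14;
     15 16 17 18 19]

b = collect.(eachrow(a))   # a vector holding the 4 rows as plain vectors

# sum(b) adds the rows together elementwise.
total = sum(b)

println(total)                              # [30, 34, 38, 42, 46]
println(total == b[1] + b[2] + b[3] + b[4]) # true
println(total == vec(sum(a, dims=1)))       # true: same values as the column sums
```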
And you can check which function is called for it:
julia> @which sum(b)
sum(a::AbstractArray; dims, kw...) in Base at reducedim.jl:873
Thanks @DorianT. So in short, eachrow is not generating vectors as such, but a generator which pumps out vectors?
But it’s not immediately clear why sum is given something different in a |> eachcol |> sum than in [sum(x) for x in eachrow(a)]. I think that’s what I’m finding particularly confusing.
As in, why the need for collect when using pipes, but not when using a list comprehension?
In the pipe version, sum gets the whole iterator at once and adds its elements together. In your list comprehension, you apply sum once to each element of the iterator. You can use .|> sum though.
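A minimal sketch of the broadcast pipe, assuming the matrix a from the first post (row_sums and row_sums_direct are illustrative names): .|> applies sum to each element coming out of eachrow, instead of handing sum the whole collection.

```julia
a = [0 1 2 3 4;
     5 6 7 8 9;
     10 11 12 13 14;
     15 16 17 18 19]

# Broadcast pipe: each row is piped into sum separately.
row_sums = a |> eachrow .|> sum

# The same thing without the first pipe.
row_sums_direct = eachrow(a) .|> sum

println(row_sums)                                  # [10, 35, 60, 85]
println(row_sums == [sum(x) for x in eachrow(a)])  # true
```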
Let’s say the rows of A are r1, r2, and r3. Then the following have the same meaning:
A |> eachrow |> sum
sum([r1, r2, r3]) # Sum of a list of 3 elements (each element being a vector)
r1 + r2 + r3
Your list comprehension does something else… These are equivalent:
[sum(r) for r in eachrow(A)] # Calculate sum(r) for each row r
[sum(r1), sum(r2), sum(r3)]
Here instead of calculating the sum of three arrays we calculate three sums!
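One way to see the difference is to look at the lengths of the two results. The sketch below assumes a small 3×2 matrix A made up for illustration, so that its rows play the roles of r1, r2, and r3.

```julia
# Hypothetical 3×2 example matrix; its rows are r1, r2, r3.
A = [1 2;
     3 4;
     5 6]

r1, r2, r3 = eachrow(A)

# One sum of three arrays: the result has one entry per column.
piped = A |> eachrow |> sum

# Three sums, one per array: the result has one entry per row.
per_row = [sum(r) for r in eachrow(A)]

println(piped)                    # [9, 12]
println(piped == r1 + r2 + r3)    # true
println(per_row)                  # [3, 7, 11]
```

With a square matrix the two results have the same length, which makes the mix-up harder to spot.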