Let’s say we have a simple 2D matrix
a = [0 1 2 3 4;
5 6 7 8 9;
10 11 12 13 14;
15 16 17 18 19]
map
It looks like you can apply a function to rows and columns like this
julia> map(sum, eachrow(a))
4-element Vector{Int64}:
10
35
60
85
and you get the expected results.
list comprehension
You can also use a list comprehension like
julia> [sum(x) for x in eachrow(a)]
4-element Vector{Int64}:
10
35
60
85
pipe
But if you try to achieve the same thing with pipes, it doesn’t work: you seemingly get the result you’d expect from asking for the column sums
julia> a |> eachrow |> sum
5-element Vector{Int64}:
30
34
38
42
46
and using eachcol instead gives you the row sums
julia> a |> eachcol |> sum
4-element Vector{Int64}:
10
35
60
85
What’s all that about?
As a relative beginner, these results are quite confusing. This doesn’t seem like it makes sense.
1 Like
If you broadcast the piped sum over each row it seems to work alright:
a |> eachrow |> x -> sum.(x)
# same values as vec(sum(a, dims=2)); sum(a, dims=2) itself returns a 4×1 Matrix
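To make the comparison concrete, here is a small sketch of three ways to get one sum per row, assuming the same matrix a from the first post. The variable names are only for illustration.

```julia
a = [0 1 2 3 4;
     5 6 7 8 9;
     10 11 12 13 14;
     15 16 17 18 19]

row_sums_broadcast = sum.(eachrow(a))      # broadcast sum over each row
row_sums_map       = map(sum, eachrow(a))  # the map version from the first post
row_sums_dims      = sum(a, dims=2)        # a 4×1 Matrix rather than a Vector

# vec drops the singleton dimension so all three can be compared directly
row_sums_broadcast == row_sums_map == vec(row_sums_dims)
```

The first two return a Vector, while the dims form keeps the result two-dimensional, which is why vec is needed for the comparison.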
1 Like
Thanks @rafael.guerra. In my mental model, sum is receiving the rows of a one at a time, so I’m not sure why you would need to use broadcasting in that case.
Can you offer any insights into why the naive pipe approach did not work, or why it seemed to be operating on columns and not rows?
eachrow(a) is an iterator of vectors, and + is defined for vectors. So sum is applying + to an accumulator, adding the row vectors together element by element.
julia> [1, 2] + [3, 4]
2-element Vector{Int64}:
4
6
julia> sum([[1, 2], [3, 4]]) # vector of vectors kind of like eachrow
2-element Vector{Int64}:
4
6
You can run collect(eachrow(a)) to see the vector of vectors that is piped into sum.
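As a rough illustration of the accumulator idea, the loop below adds the rows together by hand. The function name accumulate_rows is hypothetical, and this is only a stand-in for what sum does conceptually, not how Base implements it.

```julia
a = [0 1 2 3 4; 5 6 7 8 9; 10 11 12 13 14; 15 16 17 18 19]

# Hypothetical helper: add every row of m into a running total
function accumulate_rows(m)
    acc = zeros(eltype(m), size(m, 2))  # one slot per column
    for row in eachrow(m)
        acc = acc + row                 # vector + vector, element by element
    end
    return acc
end

by_hand = accumulate_rows(a)
by_hand == sum(eachrow(a))  # both add the four rows together
```

Because each row has five entries, the running total has five entries too, which is why the result looks like column sums.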
1 Like
My best take on this would be to look at what the iterator created by eachrow yields:
julia> b = collect.(eachrow(a))
4-element Vector{Vector{Int64}}:
[0, 1, 2, 3, 4]
[5, 6, 7, 8, 9]
[10, 11, 12, 13, 14]
[15, 16, 17, 18, 19]
If you collect the result you see that you get a vector of vectors. Now you can check what happens if you apply sum to such an object:
julia> sum(b)
5-element Vector{Int64}:
30
34
38
42
46
And you can check which function is called for it:
julia> @which sum(b)
sum(a::AbstractArray; dims, kw...) in Base at reducedim.jl:873
1 Like
Thanks @DorianT. So in short, eachrow is not generating vectors as such, but a generator which pumps out vectors?
But it’s not immediately clear why what sum is being provided with is different when doing a |> eachcol |> sum and [sum(x) for x in eachrow(a)]? I think that’s what I’m finding particularly confusing.
As in, why the need for collect when using pipes, but not when using a list comprehension?
jules
July 24, 2021, 5:08pm
9
In the pipe version, sum gets the whole iterator at once and adds up its elements (the row vectors). In your list comprehension, you apply one sum to each element of the iterator. You can use the broadcasting pipe .|> sum to get the per-row behavior, though.
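For example, with the matrix a from the first post (variable names here are only illustrative), the two pipe forms can be put side by side:

```julia
a = [0 1 2 3 4; 5 6 7 8 9; 10 11 12 13 14; 15 16 17 18 19]

# Plain pipe: sum receives the whole iterator, i.e. sum(eachrow(a))
col_sums = a |> eachrow |> sum

# Broadcasting pipe: sum is applied to each row, i.e. sum.(eachrow(a))
row_sums = a |> eachrow .|> sum
```

Here col_sums has five entries (one per column) and row_sums has four (one per row), matching the outputs shown earlier in the thread.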
3 Likes
sijo
July 24, 2021, 5:17pm
10
Let’s say the rows of A are r1, r2, r3. Then the following have the same meaning:
A |> eachrow |> sum
sum(eachrow(A))
sum([r1, r2, r3]) # Sum of a list of 3 elements (each element being a vector)
r1 + r2 + r3
Your list comprehension does something else… These are equivalent:
[sum(r) for r in eachrow(A)] # Calculate sum(r) for each row r
[sum(r1), sum(r2), sum(r3)]
Here instead of calculating the sum of three arrays we calculate three sums!
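A quick way to check both sets of equivalences is to try them on a small matrix. The 3×2 matrix below is just an assumed example, chosen so that A has exactly three rows r1, r2, r3.

```julia
A = [1 2;
     3 4;
     5 6]              # assumed example with three rows

r1, r2, r3 = eachrow(A)  # destructure the three rows

# One sum over three vectors: the rows are added element by element
sum_of_rows = sum(eachrow(A))
sum_of_rows == r1 + r2 + r3

# Three separate sums: one number per row
sums_per_row = [sum(r) for r in eachrow(A)]
sums_per_row == [sum(r1), sum(r2), sum(r3)]
```

The first result has as many entries as A has columns, the second as many as A has rows.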
4 Likes