Map vs list comprehension

mzaffalon · December 13, 2016, 12:50pm

What is the difference between map(v -> f(v), lst) and [f(v) for v in lst]?
I have seen this discussion but the focus was on speed and memory allocation.

dpsanders · December 13, 2016, 2:11pm

Conceptually, there is no difference. Note that you can simply write

map(f, lst)

if f is a previously-defined function.

mzaffalon · December 13, 2016, 2:31pm

Why are there two versions?

dfdx · December 13, 2016, 2:39pm

They are not quite the same. For example, with a list comprehension you can additionally filter elements:

[x for x in 1:10 if x % 2 == 0]

But map may be shorter and more convenient, e.g.:

[f(input) for input in inputs]
map(f, inputs)

In addition, map (mostly) preserves the type of a collection, while a list comprehension doesn’t. Compare:

map(x -> x + 1, Set([1, 2, 3]))
[x + 1 for x in Set([1, 2, 3])]
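
A small sketch of the two points above, written against current Julia (1.x) rather than the 0.5 release under discussion. It uses a tuple instead of a Set because, as far as I know, later Julia versions define map on sets to throw an error, so the Set comparison is version-dependent. The variable names are illustrative only.

```julia
# Filtering is built into the comprehension syntax.
evens = [x for x in 1:10 if x % 2 == 0]

# map on a tuple returns a tuple, while a comprehension over the
# same tuple always builds an Array.
t = (1, 2, 3)
mapped = map(x -> x + 1, t)
comprehended = [x + 1 for x in t]

println(typeof(mapped))        # a Tuple type
println(typeof(comprehended))  # a one-dimensional Array type
```

With map, the equivalent of the filtered comprehension needs a separate filter call on the input or the result.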

mzaffalon · December 23, 2016, 6:42am

I am still having a hard time with all this (Julia v0.5).

Comprehension on an Array{Tuple} is fine

julia> a=[(1,2),(3,4)];
julia> [x for (x,y) in a]
2-element Array{Int64,1}:

but not map

julia> map((x,y)->x, a)
ERROR: MethodError: no method matching (::##3#4)(::Tuple{Int64,Int64})

even though

(x,y) = a[1]

is fine.

Also filter raises an error on

julia> filter((x,y)->x==3, a)
ERROR: MethodError: no method matching (::##5#6)(::Tuple{Int64,Int64})

because a is not an associative collection (for associative collections two arguments, the key and the value, are passed to the function: this is specified in the manual). Indeed, filter works here

julia> filter((x,y)->x==3, Dict(a))
Dict{Int64,Int64} with 1 entry:

Then again neither one is valid:

julia> foreach((x,y)->println(x), a)
ERROR: MethodError: no method matching (::##9#10)(::Tuple{Int64,Int64})
julia> foreach((x,y)->println(x), Dict(a))
ERROR: MethodError: no method matching (::##11#12)(::Pair{Int64,Int64})

Instead

julia> for (x,y) in a
       println(x)
       end

is OK.

Then I come across this:

julia> filter((x,y)->begin println(typeof(x)); x[1]==3; end, Dict(a))
Int64
Int64
Dict{Int64,Int64} with 1 entry:

but it should be an error because Int64 has no getindex.

I am very confused, but there must have been a good reason to have it this way and I cannot see it. How can I picture all this in a more systematic way?

(And all this because of this post.)

mzaffalon · December 23, 2016, 9:29am

julia> a = Int64(6); a[1]
6

Rather unexpected…

Evizero · December 23, 2016, 12:23pm

I suspect (but don’t actually know) that one reason for Int64 to provide a getindex implementation is to make broadcast work in a general way.

pfitzseb · December 23, 2016, 12:32pm

The key difference between loops/comprehensions and the anonymous functions used in filter, map etc. seems to be the implicit tuple destructuring that only happens in the former case (which makes sense for dispatch).

Imho the outlier here is filter for associative iterables, also see
https://github.com/JuliaLang/julia/issues/17886

So apart from that, the system appears to be consistent: In loops, comprehensions and assignments you get automatic tuple destructuring if you want it, and otherwise you don’t.
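
To make the destructuring point concrete, here is a minimal sketch using the array a from the question. The one-argument forms should work on any version; the last line relies on argument destructuring in anonymous functions, which to my knowledge requires Julia 0.7 or later. The result names are made up for illustration.

```julia
a = [(1, 2), (3, 4)]

# Comprehensions (and for loops) destructure each element for you.
firsts_comp = [x for (x, y) in a]

# map and filter call the function with ONE argument, the whole tuple,
# so index into it or use `first`.
firsts_map = map(t -> t[1], a)
firsts_first = map(first, a)
threes = filter(t -> t[1] == 3, a)

# Julia 0.7 and later: destructure in the argument list. Note the
# extra parentheses and the trailing comma.
firsts_destructured = map(((x, y),) -> x, a)
```

Writing (x, y) -> x instead declares a function of two arguments, which is why the calls in the question fail with a MethodError.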

stevengj · December 23, 2016, 3:04pm

No, broadcast doesn’t need this (in 0.6, it works on arbitrary “scalar” types that don’t have getindex).

I think that largely this is the Matlab legacy; in Matlab, numbers are “really” 1x1 matrices internally, and it is quite common to write functions that are supposed to work on either scalars or arrays of numbers in order to vectorize. To simplify the process of writing such generic scalar/vector code, you can access numbers as if they were 0-dimensional arrays in Julia.

I think that a lot of the need for this should be gone now with 0.5’s dot-call syntax: in the cases where you would previously have written a generic vector/scalar function, you should now just write the scalar function f(x), and then apply it to arrays A with f.(A). This is not only easier, it is also faster because it can fuse with other elementwise operations and the result can be assigned in-place with .=.
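
A short sketch of that pattern, assuming Julia 0.6 or later for the fused in-place form. The function f and the array names are hypothetical.

```julia
# Write only the scalar version of the function.
f(x) = 3x^2 + 1

A = [1.0, 2.0, 3.0]

# Apply it elementwise with dot-call syntax.
B = f.(A)

# Fuse with another elementwise operation and write into a
# preallocated array, without allocating a temporary.
out = similar(A)
out .= f.(A) .+ 1
```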

See also: make numbers non-iterable? · Issue #7903 · JuliaLang/julia · GitHub

mzaffalon · December 24, 2016, 7:07am

Oh, nice explanation, thank you. The implicit tuple destructuring is the bit I was missing.

Nosferican · June 18, 2017, 7:30pm

Should be:

map((elem) -> elem[1], object)

since the elements of the object are tuples and you want to select the first element of the tuple.

dpsanders · June 18, 2017, 7:48pm

Note that there is a function first:

julia 0.6> first((1,2))
1

So you can just write

julia 0.6> a = [(1, 2), (3, 4)]
2-element Array{Tuple{Int64,Int64},1}:
 (1, 2)
 (3, 4)

julia 0.6> first.(a)
2-element Array{Int64,1}:
 1
 3

e3c6 · June 30, 2017, 9:01am

Is there a reason for this behavior?

dfdx · June 30, 2017, 9:24am

By definition, a list comprehension builds a list. You could possibly have a kind of “collection comprehension” that tries to preserve the collection type. But I can see little value in it and a number of hard design choices, e.g. what syntax this feature should have, how to do type dispatching (which is a solved issue for map in Julia), how to handle filtering in general collections (i.e. [x for x in xs if condition(x)] for lists), etc.

Evizero · June 30, 2017, 9:29am

might make more sense to call it array comprehension then, since it understands shape.

julia> A = rand(2,3)
2×3 Array{Float64,2}:
 0.05249   0.251237  0.911031
 0.461673  0.73201   0.854654

julia> [a^2 for a in A]
2×3 Array{Float64,2}:
 0.0027552  0.0631202  0.829977
 0.213142   0.535838   0.730434

StefanKarpinski · July 1, 2017, 2:08am

We don’t call them list comprehensions nor do we call the data structure lists – that’s Python terminology. Julia’s random access n-dimensional data type is an array and the comprehensions that construct them are array comprehensions.

jlapeyre · July 1, 2017, 3:28pm

map does not preserve the type of an Array in v0.6.

Julia v0.6

julia> typeof(map(identity,Any[1,2,3]))
Array{Int64,1}

Julia v0.5

julia> typeof(map(identity,Any[1,2,3]))
Array{Any,1}

Operating on an Array, map in v0.6 appears to return an array of the least common (non-proper) supertype of the elements.

This is one of the thousand cuts Symata.jl has suffered under v0.6. (Not that I’m complaining, I knew the API was in flux.)

ohsonice · July 2, 2017, 10:03pm

Would pre-allocation solve that problem? In that case, it is explicit that you are preserving the type:

 x = Any[1,2,3]
 y = similar(x)
 map!(identity,y,x)

jlapeyre · July 2, 2017, 11:21pm

Yes, preallocation solves the problem, or my problem, at any rate. This was relatively easy to fix once I discovered the origin of the bad behavior.
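
For reference, a minimal sketch comparing the options, assuming Julia 0.6 or later. The typed comprehension at the end is an alternative not mentioned above; the variable names are illustrative.

```julia
x = Any[1, 2, 3]

# map picks the element type from the values it produces.
narrowed = map(identity, x)

# Preallocating fixes the element type up front.
y = similar(x)              # same element type as x, i.e. Any
map!(identity, y, x)        # fills y in place

# A typed comprehension also fixes the element type.
typed = Any[v for v in x]

println(eltype(narrowed))
println(eltype(y))
println(eltype(typed))
```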

pint · July 3, 2017, 7:19am

how is that a bad behavior though? you want map to treat identity as special case?
