Documentation about using anonymous and named functions with pmap

question

#1

I noticed that anonymous functions behave differently from regular functions with respect to serialization and code availability on other processes. For example, when Julia is run with multiple processes,

f(x) = x^2
pmap(f, 1:10)

throws an error, but

f = x->x^2
pmap(f, 1:10)

works fine. Is there documentation about the specific differences between anonymous and normal named functions that allow them to be used in this way?

(I sort of already understand that it is simpler to serialize anonymous functions because they only have 1 method, but I would like to understand the design decision so that I can make my code future-proof)


#2

The first version should be

@everywhere f(x) = x^2
pmap(f, 1:10)

The @everywhere macro defines the function on every process, so the workers can call it. It is documented in the Distributed standard library.

I'm not a parallel computing expert, but I assume pmap simply passes the anonymous function itself to every process, which is why it works without the @everywhere.
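
To make the two working variants concrete, here is a minimal sketch. It assumes a session with two local workers; the addprocs(2) call and the names g, named_result and anon_result are my own choices for illustration, not something from the original post.

```julia
using Distributed

# Assumption: start two local worker processes if none exist yet.
nprocs() == 1 && addprocs(2)

# Named function: define it on every process, including the workers.
@everywhere f(x) = x^2
named_result = pmap(f, 1:10)

# Anonymous function: defined only on the calling process,
# yet pmap can still run it on the workers.
g = x -> x^2
anon_result = pmap(g, 1:10)
```

As far as I understand the Serialization standard library, an anonymous function defined in Main is sent together with its method definitions, whereas a named function is sent as a reference to its name, which the worker then has to resolve itself. Treat that as my reading of the current implementation rather than a documented guarantee.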


#3

It is not true in general that anonymous functions have only one method; you can add methods to them:

julia> f = (x) -> x^2
#16 (generic function with 1 method)

julia> (::typeof(f))(x::Int) = 10x

julia> f
#16 (generic function with 2 methods) # Note that f now has 2 methods

julia> f(2.)
4.0

julia> f(2)
20
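
The #16 shown above is the compiler-generated name of the anonymous function, and adding a method does not change it. One way to guess whether a function is anonymous is to check for that leading '#'. This is only a heuristic sketch: looks_anonymous is a hypothetical helper, not part of Base, and the generated name is an implementation detail that could change.

```julia
# Hypothetical helper: anonymous functions get a generated name
# that starts with '#', e.g. Symbol("#16").
looks_anonymous(f::Function) = startswith(String(nameof(f)), "#")

h = x -> x^2      # anonymous function
k(x) = x^2        # ordinary named function

h_is_anonymous = looks_anonymous(h)
k_is_anonymous = looks_anonymous(k)
```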

#4

Thanks, @Karajan! Yes, I understand how to get it working (though sometimes @everywhere won’t suffice: https://github.com/timholy/ProgressMeter.jl/issues/125); I’m trying to understand why the decision to have anonymous functions behave differently was made, whether this is desired behavior that is going to carry on into the future, and what differences between anonymous and regular functions make this different behavior reasonable. (also, is there a way to tell if a function is anonymous? the output in the REPL for a function always just says “generic function”)


#5

Ah, good point… one less hypothesis for why the behavior is different. Maybe it has to do with anonymous functions being nameless? There is no risk that an anonymous function will clash with a name already defined on the other process?


#6

I actually can’t reproduce the error on 1.0.3. What error are you getting?


#7

Were you perhaps executing the code with just one process?

julia> using Distributed; addprocs(2)
2-element Array{Int64,1}:
 2
 3

julia> f(x) = x^2
f (generic function with 1 method)

julia> pmap(f, 1:10)
ERROR: On worker 2:
UndefVarError: #f not defined

(this error is, of course, expected; the question is “why is the behavior of anonymous functions different, and will this always be the case?”)


#8

Ah yes, that wasn’t very smart of me.