I really like the current trend in Base of functions that take a function as their first argument, and have been adopting it for my own code. My question is: If I define the no-op function
f(x) = x
can I happily pass this into other functions of the form:
function f1(f::Function, x1, x2, ...)
... #some stuff
z1 = f(x)
... #some other stuff
end
and get zero efficiency loss versus the case where I don’t include a function as an argument?
My understanding is that in the above pseudo-code, the z1 = f(x) is compiled down to z1 = x, and so this is indeed the case, but I wanted to double-check before adopting this style wholesale. (Yes, I did try looking at the code_lowered output, but I quickly confused myself, and I also wasn’t certain whether what I was seeing would hold up in more complex cases.)
Bonus question: does Julia have a standard no-op function for use in cases like this, e.g. if I wanted to define the shortcut method:
If you have no missing data or don’t care about speed in the next couple of weeks until this gets fixed (not by me, I don’t dare touch codegen), then you have no problem; but this issue makes some benchmarks misleading in the meantime.
Functions all have their own types. Just like how Julia specializes on 1 + 1 and 1.0 + 1.0 to call integer and floating point addition, respectively, when you call f1(identity, x, y, z), it’ll specialize on the types of those arguments and do all sorts of optimizations since it knows what all the types are. You can even dispatch on specific function types:
julia> f(::typeof(identity)) = 1
f(::typeof(sin)) = 2
f (generic function with 2 methods)
julia> f(identity)
1
julia> f(sin)
2
julia> f(cos)
ERROR: MethodError: no method matching f(::typeof(cos))
Closest candidates are:
f(::typeof(sin)) at REPL[1]:2
f(::typeof(identity)) at REPL[1]:1
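To tie this back to the original question, here is a minimal sketch of the "function as first argument" pattern with a shortcut method that falls back to Base's identity. The name sum_mapped is made up for illustration and is not a Base function; the sketch assumes a reasonably recent Julia.

```julia
# Hypothetical example: `sum_mapped` is an illustrative name, not part of Base.
# Apply `f` to every element of `xs` and accumulate the results.
function sum_mapped(f::Function, xs::AbstractVector{<:Real})
    total = 0.0
    for x in xs
        # The method is specialized on typeof(f), so for f === identity
        # the compiler knows exactly which function is being called here.
        total += f(x)
    end
    return total
end

# Shortcut method: no function supplied, so forward to the no-op `identity`.
sum_mapped(xs::AbstractVector{<:Real}) = sum_mapped(identity, xs)

xs = [1.0, 2.0, 3.0]
sum_mapped(xs)          # 6.0
sum_mapped(abs2, xs)    # 14.0
```

If you want to check the "zero overhead" claim on your own machine, one option is to compare @code_typed sum_mapped(identity, xs) against the same loop written without f.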
That’s a neat trick! I had no idea you could do that. I’m guessing this is a fairly new feature? I seem to remember that a year or two ago typeof(sum) would evaluate to Function, so that typeof(sum) == typeof(identity) would evaluate to true. (I just verified that it now evaluates to false.)
So every defined function can now implicitly be thought of as its own type? Or perhaps a better analogy would be its own parametric type… something like Function{T} where {T<:Union{sum, identity, ...}}, so that we can still write things like f1(f::Function, x) and have it work for any function f?
I just played around a bit and realised the same holds for anonymous functions too, and you can dispatch on them, with the caveat that f1 = (x -> identity(x)) and f2 = (x -> identity(x)) are different function types. Does this mean that anonymous functions are all created and stored in global scope, such that you couldn’t write a loop that creates anonymous functions indefinitely, since they would never be garbage collected?
Sorry, I just realised that is a lot of questions. I should probably re-read the manual at some point. I think the last version I read was v0.3…
The manual is definitely your friend here, but this has changed a lot in the last couple of years.
Before Julia v0.5, Function was the concrete type of every function (as you remember correctly), and anonymous functions were inherently slower than ordinary functions. As of Julia v0.5, Function is now an abstract type, and each named and anonymous function is a separate concrete type which is <: Function. You can certainly still write f1(f::Function, x) and any named or anonymous function will work for f, but now you can also dispatch on the type of a particular function (if you want). This change is also what made anonymous functions just as fast as regular functions, which in turn enabled fast broadcast fusion and lots of other fun features. Here’s the most relevant PR: "WIP: redesign closures, then generic functions" by JeffBezanson (Pull Request #13412, JuliaLang/julia on GitHub).
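A few quick checks make the type relationships concrete. This is a small sketch assuming Julia 1.x, where isabstracttype and isconcretetype are available:

```julia
# Function itself is abstract; every function gets its own concrete type.
isabstracttype(Function)            # true
isconcretetype(typeof(identity))    # true
typeof(sum) <: Function             # true
typeof(sum) == typeof(identity)     # false

# Two anonymous functions with identical bodies still get distinct types.
g1 = x -> identity(x)
g2 = x -> identity(x)
typeof(g1) == typeof(g2)            # false
typeof(g1) <: Function              # true
```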
As of v0.5 and above, an anonymous function creates a new, callable type. Closures are just callable types with fields containing their closed-over values. You can actually see this:
# We have to put this in a function in order to create a real
# closure instead of just a function that references some global
# variable named `i`
julia> function closure_demo()
           i = 1
           f = x -> x + i
       end
closure_demo (generic function with 1 method)
julia> f = closure_demo()
(::#11) (generic function with 1 method)
julia> f.i
1
julia> f(2)
3
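To make "closures are just callable types with fields" more tangible, here is roughly what a hand-written equivalent of x -> x + i could look like. The struct name AddI is made up for illustration; the type the compiler actually generates has an automatically chosen name.

```julia
# Hypothetical hand-written counterpart of the closure `x -> x + i`.
# Subtyping Function lets instances be passed where `f::Function` is required.
struct AddI{T} <: Function
    i::T    # the captured ("closed-over") value
end

# Make instances of AddI callable, just like a function.
(a::AddI)(x) = x + a.i

add_one = AddI(1)
add_one.i     # 1
add_one(2)    # 3
```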
As far as I know, the compiled code from a function is indeed never garbage collected. However, that code is only generated once per anonymous function definition. So, for example, we can do:
julia> fs = [x -> x + i for i in 1:10]
10-element Array{##18#20{Int64},1}:
#18
#18
#18
#18
#18
#18
#18
#18
#18
#18
julia> fs[1].i
1
julia> fs[2].i
2
Each element of fs is just a lightweight instance of the same type (#18) with a different captured value of i. They all share the same compiled code.
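You can confirm the "one type, many instances" point programmatically. This is just a sketch; the printed type name will differ between sessions and Julia versions, so the checks below avoid relying on it.

```julia
fs = [x -> x + i for i in 1:10]

# All ten closures are instances of a single concrete type...
length(unique(typeof.(fs)))                 # 1
isconcretetype(eltype(fs))                  # true

# ...and they differ only in the captured value of `i`.
[f.i for f in fs] == collect(1:10)          # true
[f(100) for f in fs] == collect(101:110)    # true
```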
On the other hand, if you were to create lots of new functions in a loop (you’d have to do something like call eval() in your loop to do this), then you would indeed run into trouble because the compiled code for those functions would not be garbage collected.
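For contrast, here is a small sketch of the eval() case. It only builds three functions, but the same thing in an unbounded loop is what would keep accumulating types and compiled code:

```julia
# Each call to eval() defines a brand-new anonymous function,
# so every element ends up with its own distinct type.
gs = [eval(:(x -> x + $i)) for i in 1:3]

length(unique(typeof.(gs)))    # 3: one new type per eval
```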
I knew that anonymous functions had been made fast, but I had no understanding of how they did it. That is a fantastic write-up of it, thank you very much. If you’re on StackOverflow, I’d be happy to post this as a question, and you can cut-and-paste your answer from here, since I think it is a great resource.