Nested functions pros and cons

I’ve learned that nested functions are useful in Matlab to avoid passing tons of arguments to external functions when optimizing/fitting some data.

Is this still true in Julia? Since lambda functions are very cheap, is there any point in nesting a function? What if I’m nesting 2 or 3 layers deep? Then repeatedly calling a top nested function will re-“build” all the deeper nested functions many times over…

This is not how it works. The function is “pulled out” and defined when the code is getting lowered.

Some code examples of what you mean would help clarify your point. To me, it just seems like you are talking about a closure which indeed is a way to avoid passing arguments to functions getting passed into other routines, like optimization algorithms.
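
To illustrate both points, here is a minimal sketch with made-up names (`make_loss`, `xs`, `ys` are not from the original post). The nested function captures its data instead of taking it as arguments, and two closures created by separate calls share one type, which suggests the inner function is not rebuilt on each call:

```julia
# Hypothetical example: `loss` captures xs and ys from the enclosing scope,
# so whoever receives it only has to pass the parameter `a`.
function make_loss(xs, ys)
    loss(a) = sum(abs2, a .* xs .- ys)
    return loss
end

f1 = make_loss([1.0, 2.0], [2.0, 4.0])
f2 = make_loss([1.0, 3.0], [0.0, 1.0])

# Each call to make_loss only creates a small object holding the captured
# variables; the method body itself is defined once.
println(typeof(f1) === typeof(f2))
println(f1(2.0))
```

Something like `f1` could then be handed to an optimizer that expects a one-argument function.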

I think you got me right. I was about to create a contrived but clear example, but then I figured I might as well just put the actual code here, and test it!

Code

Nested:

function nested(retina, distance, aperture, morphz)
    layers, rc, ous, nodalz, focallengths = createobj(morphz)
    r = Ray()
    function θ2signal(θ)
        l = Light(distance, aperture, rc, θ, nodalz)
        function getsignal(b)
            ous[retina].medium.signal.photoreceptor = 0.0
            try
                l(r, b[1], b[2])
                raytrace!(r, ous)
            catch ex
                ex isa RayTraceEllipsoids.DeadRay || throw(ex)
            end
            ous[retina].medium.signal.photoreceptor
        end
        s, _ = hcubature(getsignal, [0, 0], [1, 0.5], initdiv=10, maxevals=10^4)
        s
    end
    s0 = θ2signal(0)
    fun(θ) = (θ2signal(θ) - s0/2)^2
    res = optimize(fun, 1e-4, 0.45)
    Optim.minimizer(res)
end

Not Nested:

function getsignal(b, ous, retina, l, r)
    ous[retina].medium.signal.photoreceptor = 0.0
    try
        l(r, b[1], b[2])
        raytrace!(r, ous)
    catch ex
        ex isa RayTraceEllipsoids.DeadRay || throw(ex)
    end
    ous[retina].medium.signal.photoreceptor
end
function θ2signal(θ, distance, aperture, rc, nodalz, ous, retina, r)
    l = Light(distance, aperture, rc, θ, nodalz)
    s, _ = hcubature(b -> getsignal(b, ous, retina, l, r), [0, 0], [1, 0.5], initdiv=10, maxevals=10^4)
    s
end
function notnested(retina, distance, aperture, morphz)
    layers, rc, ous, nodalz, focallengths = createobj(morphz)
    r = Ray()
    s0 = θ2signal(0, distance, aperture, rc, nodalz, ous, retina, r)
    res = optimize(θ -> θ2signal(θ, distance, aperture, rc, nodalz, ous, retina, r), 1e-4, 0.45)
    Optim.minimizer(res)
end

Benchmarks

Nested:

BenchmarkTools.Trial: 
  memory estimate:  105.94 MiB
  allocs estimate:  4444955
  --------------
  minimum time:     2.583 s (0.59% GC)
  median time:      2.584 s (0.54% GC)
  mean time:        2.584 s (0.54% GC)
  maximum time:     2.584 s (0.48% GC)
  --------------
  samples:          2
  evals/sample:     1

Not nested:

BenchmarkTools.Trial: 
  memory estimate:  16.72 MiB
  allocs estimate:  752272
  --------------
  minimum time:     2.836 s (0.10% GC)
  median time:      2.837 s (0.05% GC)
  mean time:        2.837 s (0.05% GC)
  maximum time:     2.839 s (0.00% GC)
  --------------
  samples:          2
  evals/sample:     1

Conclusions

So, not nested is ~10% slower but allocates ~6 times less memory (16.72 MiB vs. 105.94 MiB).

Note that you’re only actually running two samples in your benchmark, so a 10% difference in timing may not be all that statistically significant.
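
A rough way to collect more samples using only Base and the Statistics standard library is sketched below. The helper `time_samples` and the toy workload are made up for illustration; with BenchmarkTools, raising its time budget (the `seconds` parameter) should have a similar effect.

```julia
using Statistics

# Hypothetical helper: run `f` once to trigger compilation, then time n calls.
function time_samples(f; n = 10)
    f()  # warm-up call so compilation time is not measured
    return [(@elapsed f()) for _ in 1:n]
end

# Toy workload standing in for nested(...) / notnested(...)
times = time_samples(() -> sum(abs2, rand(10^5)); n = 20)
println("min = ", minimum(times), " s, median = ", median(times), " s")
```

With 20 or more samples, the spread between minimum and median gives a better feel for whether a 10% difference is noise.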

In general there should be no performance penalty for nested functions (which behave exactly like any other anonymous function), but you do run the risk of running into the issue “performance of captured variables in closures” (Issue #15276, JuliaLang/julia on GitHub). The way to check for this is to run @code_warntype. Have you tried that, and does the result look OK?
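
As a minimal sketch of what to look for (the functions `boxed_sum` and `plain_sum` are hypothetical, not taken from the code above): a variable that is reassigned inside a closure typically shows up as `Core.Box` in the @code_warntype output.

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

# Hypothetical example: `s` is reassigned inside the closure, so it is
# expected to be boxed.
function boxed_sum(xs)
    s = 0.0
    foreach(x -> (s += x), xs)
    return s
end

# Same computation without a captured, reassigned variable.
function plain_sum(xs)
    s = 0.0
    for x in xs
        s += x
    end
    return s
end

@code_warntype boxed_sum([1.0, 2.0, 3.0])  # look for `Core.Box` here
@code_warntype plain_sum([1.0, 2.0, 3.0])
```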

Also, I’m pretty sure try...catch blocks are pretty slow in Julia, particularly if the catch path is taken. Are you sure that’s not slowing down your code?

I have not. I’ll look into that.

Waaaat…? Didn’t know that. I’ve been using it as a means to terminate an iteration that is fruitless. Since the mechanics of such an event are nested well down in the code, I can’t elicit a break or some such. But I can most certainly propagate some failure in another way. Hmm!!!

Using try catch for control flow is not great. It has performance implications and makes the code pretty confusing to read. Using a return value from raytrace! might work just as well here.

Wow, I did run into that. Thanks for the flag. OK, not nested it is.

Previously, I also used a nested function definition, but later I realized that I can get much better performance if I define the other function outside the main function and use parametric types to pass the extra information instead, so that the lowered code is simplified for pre-compilation. This would probably be faster.
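
One possible reading of this suggestion is a callable struct with type parameters, defined at top level. The sketch below is an assumption about what is meant; `QuadraticLoss` and its fields are made up for illustration:

```julia
# Hypothetical callable struct: the "extra information" lives in typed fields
# instead of being captured by a nested function.
struct QuadraticLoss{T<:Real, V<:AbstractVector{T}}
    offset::T
    weights::V
end

# Calling the object evaluates the loss for a parameter θ.
(p::QuadraticLoss)(θ) = sum(abs2, p.weights .* θ .- p.offset)

loss = QuadraticLoss(1.0, [1.0, 2.0])
println(loss(1.0))
```

An object like `loss` can be passed anywhere a one-argument function is expected, and its field types are concrete and known to the compiler.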

Don’t take this to mean “closures are slow” or that you shouldn’t use them if they suit your particular problem. It’s nearly always possible (if somewhat annoying) to avoid issue 15276 using a let block, as long as your closure (i.e. your nested function) doesn’t need to affect any bindings in the parent scope. But if you are running into that issue and you don’t need a closure, then yes, defining your methods externally is another easy way to fix it.
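
A minimal sketch of the let-block workaround, using a made-up function `make_scaler`: the argument `k` is reassigned before the closure is created, so capturing it directly would be expected to box it, while the let block gives the closure a fresh binding that is assigned only once.

```julia
# Hypothetical example of the let-block pattern.
function make_scaler(k)
    if k < 0
        k = -k          # reassignment: capturing `k` directly may box it
    end
    scale = let k = k   # new binding, assigned exactly once
        x -> x * k
    end
    return scale
end

println(make_scaler(-3)(2))
```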

If you need variables from the parent scope, the nested function is relatively short, and specific to where it’s used (can’t be reused elsewhere), I think nesting is a good idea.

Disadvantages of nesting: If the nested function grows, readability of the parent function can be compromised (compare your notnested with nested above). Also, it’s harder to unit test a nested function in isolation.

@goto is your friend. Just write @label error_foo at the position where you want to break to, and then write error_condition && @goto error_foo. In that case you need to also remember scoping: If you need some loop-local variable in the error handler, then don’t make it loop-local (initialize it to some dummy value of the correct type outside of the loop; if you need to transfer a for-loop counter, then write it to something that is visible at the @label before triggering the @goto).

This is typically significantly more readable than “iterated break” where each loop checks errors of the inner loop and possibly breaks again.

In julia, @goto is not necessarily a code smell, but instead is more fundamental than while and for (as evidenced by @code_lowered, which rewrites control flow in terms of @goto).
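
A minimal sketch of the pattern described above, with a made-up function `find_pair`: the loop results are initialized outside the loops so they are visible at the label.

```julia
# Hypothetical example: leave two nested loops at once with @goto.
function find_pair(A, target)
    found_i, found_j = 0, 0   # dummy values, visible at the label
    for i in axes(A, 1)
        for j in axes(A, 2)
            if A[i, j] == target
                found_i, found_j = i, j
                @goto found
            end
        end
    end
    return nothing            # target not present

    @label found
    return (found_i, found_j)
end

println(find_pair([1 2; 3 4], 3))
```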

This is no more or less true in julia than in any other language that I’m aware of. Loops are also lowered into conditional and unconditional branches in C compilers, which isn’t any different from what you get in julia. They just may not have an easy way to inspect it or, in the case of LLVM (and probably most other optimizing compilers), use a completely different language. FWIW, the lowered AST isn’t strictly the same language as the surface julia syntax, and so while and for in julia are not implemented as @goto.

In any case, the usual advice against goto has nothing to do with what it does (an unconditional branch, which is always going to be needed in these languages) but with the way it is or can be used. It’s just so general and flexible that it can be used to confuse both the compiler and the reader. So the advice is always to use it in a clear and predictable way/pattern.

The use of @goto to break out of deeply nested loops is certainly fine, and it’s one of the few patterns that I know of that are widely accepted. But this should definitely not be generalized to @goto being more encouraged in julia than anywhere else.

This is generally true in other languages as well**. Pretty much all optimizations around exceptions are based on the assumption that they don’t happen often, so they are allowed to be slow. (Of course, on top of that, our try-catch is even more expensive for the sake of C interop, which is only needed in very few cases…)

** At least in compiled/optimized languages. It probably doesn’t make much difference with an interpreter, and I’ve certainly seen cases where throwing an exception as control flow is faster than a normal branch in python…

Cool, I tried it now, but it seems that it won’t work across functions, right? I can’t @goto from one function to a @label in another function, right? If correct then this won’t work for me – breaking events occur inside functions other than the functions I want to exit from. Kind of like this:

function fun(x) 
    x < 2 && @goto kaka
    x = 1
end
function fun2(x)
    g = fun(x)
    g+1
end
function fun3(x)
    return fun2(x)
    @label kaka
    false
end

Even if you could (you cannot) – you would not want to do that. That is how “spaghetti code” is cooked.

I don’t have anything against gotos per se, but I’ve met a lot of programmers, even experienced ones, who have a very strong aversion to them, and will consider any usage of goto a sign of poor quality code and a lazy/inexperienced developer. Even if you don’t agree with that, if your code is being judged (e.g. in a pull request or job interview), it’s something worth paying attention to.

In your example, goto doesn’t sound like the right approach. Neither does an exception. Hard to tell with such an artificial example, but if the breaking event is a dead ray, perhaps you should have a special return value indicating that, which you can test/propagate up the call hierarchy. You could either simply use a constant value for that (e.g. if you’re using floats, you could use NaN), or more elegantly you could have a small RayStatus struct and a method like dead(rs::RayStatus) which indicates status.

There will be a few extra lines propagating the status, but on the other hand, since the DeadRay exception is just caught and ignored in your example above, it seems like that whole try/catch block can go away with this approach.

Indeed that’s how I solved it. Some of the relevant functions now return failure, result instead of just result. I then propagate the failure accordingly. It works very nicely now.
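
A minimal sketch of this failure, result pattern, with made-up stand-ins (`trace`, `signal`, `total_signal`) rather than the actual ray tracing functions:

```julia
# Hypothetical stand-in for the deeply nested step that can fail.
function trace(x)
    x < 0 && return (true, 0.0)    # failure flag first, dummy result second
    return (false, sqrt(x))
end

# An intermediate function propagates the failure upwards.
function signal(x)
    failure, r = trace(x)
    failure && return (true, 0.0)
    return (false, 2r)
end

# The top-level loop skips failed items instead of catching an exception.
function total_signal(xs)
    s = 0.0
    for x in xs
        failure, v = signal(x)
        failure || (s += v)
    end
    return s
end

println(total_signal([4.0, -1.0, 9.0]))
```

Keeping the dummy result the same type as a real one (here `0.0`) keeps the return type concrete.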

The point of try/catch is that they are capable of unwinding a call stack that is not known at compile time. This is expensive (especially with respect to what the compiler can optimize).

If your special condition is triggered and handled in the same stack frame, then you should not pay this price, and @goto is strictly preferable to try/catch. A well-placed @goto is really no different than continue or break, and is imo the right way of “break twice”.

There is one theoretical performance advantage of exceptions: If the special condition applies rarely and can be raised by a CPU trap, then you don’t need to compile the branch, i.e. checking for the special condition is almost free (it forbids some optimizations but incurs not a single instruction).

Afaik julia currently doesn’t really exploit that hardware mechanism in user code.

Very cool. Yea, in my specific case it happens about 5% of the “time”. I think. So it’s not ultra rare.

The alternative way of handling this is to use ‘Missing’, ‘Union’, ‘nothing’, etc…
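
For instance, a minimal sketch using `nothing` as the failure marker (the names `safe_sqrt` and `total` are made up for illustration):

```julia
# Hypothetical example: return `nothing` to signal failure.
safe_sqrt(x) = x < 0 ? nothing : sqrt(x)

function total(xs)
    s = 0.0
    for x in xs
        r = safe_sqrt(x)
        r === nothing && continue   # skip failed items
        s += r
    end
    return s
end

println(total([4.0, -1.0, 9.0]))
```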

(maybe admins should split this into a new topic - “try catch versus return break” or some such)

Hmm, I tried adding @simd in some loops and realized I can’t have break or continue in such loops. So one solution is not to break the loop but write versions of the functions in the loop that short-circuit on specific values (values that indicate a failure).

So for instance, if something deeply nested returned a failure then it should return something like a Val(true) (true for failure), and I can add a method for one of the top functions that does nothing when the failure argument is true:

fun(arguments..., ::Val{false}) = <calculations...>
fun(arguments..., ::Val{true}) = nothing

What do you think about that?
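
Spelled out as a runnable sketch, with a made-up function `contribution` standing in for `fun`:

```julia
# Hypothetical example: dispatch on a Val-encoded failure flag.
contribution(x, ::Val{false}) = sqrt(x)   # normal path: do the calculation
contribution(x, ::Val{true}) = nothing    # failure path: do nothing

println(contribution(4.0, Val(false)))
println(contribution(4.0, Val(true)))
```

Note that Val(true) and Val(false) are different types, so a function that decides between them at run time returns a union of the two, and the method to call is then selected at run time as well.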