Why does Julia have multiple ways to define functions?

hatmatrix · August 2, 2023, 11:31pm

The docs present at least three ways: the standard form, “compact” form, anonymous form, etc. Then there are the do blocks…

In that you can assign the anonymous function to a symbol, the compact form seems redundant? The anonymous form also doesn’t permit multiple expressions (a perennial gripe with Python) unless you use a compound expression or couple it with a begin...end block with local keyword?

Coming from R, I wonder why it was not made simpler. For instance, in R, function(x) x^2 + 2x - 1 can be an anonymous function if used as an argument to a function; the canonical way to define a function is to assign this to a variable: foo <- function(x) x^2 + 2x - 1. When the function includes multiple statements/expressions, you use curly braces and this can remain anonymous or assigned. Julia eschews braces but does permit the end keyword on the same line as function - should we think of the compact form as just syntactic sugar on the standard form (using function), so that we can just use the function syntax everywhere?

In the process of coding, you might start with a single line function then later need to change it to include multiple lines and having to change the syntax seems a high cost penalty. Is there a rationale for having these different options?

Elrod · August 2, 2023, 11:37pm

Note that you’d need

const foo = x -> x^2 + 2x - 1

to be mostly equivalent to

foo(x) = x^2 + 2x - 1

if you’re going to ever be referring to foo as a global variable.
This is for performance reasons. Non-const globals are bad.

Note that const globals cannot really be redefined, especially not if it changes the type.
A new anonymous function will have a new type.

However, Julia has special support and handling so that redefining regular functions (defined with the short or long form) works as expected.
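A minimal sketch of the type difference described above, assuming it is run at top level in a REPL or script (the variable names are illustrative and not from the thread). It compares the types of two textually identical anonymous functions, then checks that a named function keeps its type when a method is overwritten:

```julia
# Two anonymous functions with identical bodies still get distinct types.
a = x -> x^2 + 2x - 1
b = x -> x^2 + 2x - 1
same_type = typeof(a) == typeof(b)       # false

# A named function keeps its type when one of its methods is overwritten.
foo(x) = x^2 + 2x - 1
type_before = typeof(foo)
foo(x) = x^2 + 2x                        # overwrites the existing method
kept_type = typeof(foo) == type_before   # true
```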

This isn’t answering your question, I’m just trying to make it clear that you shouldn’t use R’s style here.

You could always just stick to the short form.

foo(x) = begin
   tmp = x^2 - 1
   return tmp + 2x
end

You don’t really need to switch if it gets too long.

Then you could use JuliaFormatter.jl to fix your formatting problems.

stevengj · August 2, 2023, 11:54pm

Yes.

hatmatrix · August 3, 2023, 12:02am

Thanks for the clarification on the use of the anonymous function. Yet you show me another way to define a function!

I thought begin...end did not create a new scope, but I see that here one is created by the function. The form you show seems to be more “universal” - unfortunately it appears to make the built-in function syntax redundant.

danielwe · August 3, 2023, 12:20am

One limitation of anonymous functions, even when assigned to a variable/constant, is that you can’t ~~easily~~ obviously add methods to them:

f(x::Int) = x^2
f(x::Float64) = x^3
# vs
const f = x::Int -> x^2
# now what?

Benny · August 3, 2023, 12:23am

It doesn’t, it’s the function that creates a local scope.
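A small sketch of that distinction, assuming it is run at top level (REPL or script); the names leaked, scoped and tmp are only illustrative:

```julia
# begin...end on its own does not introduce a scope:
# at top level this assignment creates a global.
begin
    leaked = 1
end
leaked_is_global = @isdefined(leaked)    # true

# Here the local scope comes from the function definition, not from `begin`.
scoped(x) = begin
    tmp = x^2 - 1
    tmp + 2x
end
result = scoped(3)                       # 14
tmp_is_global = @isdefined(tmp)          # false: tmp stayed inside the function
```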

Yes you can

julia> const g = x::Int -> x^2
#6 (generic function with 1 method)

julia> g(x, y) = x^2+y^2
#6 (generic function with 2 methods)

If the variable is not constant, you use the type:

julia> h = g
#6 (generic function with 2 methods)

julia> (::typeof(h))(x, y, z) = x^2+y^2+z^2

julia> h
#6 (generic function with 3 methods)

danielwe · August 3, 2023, 12:26am

Fair, edited to replace easily with obviously (at least it was never obvious to me)

algunion · August 3, 2023, 12:28am

My advice is to progressively use the syntax that feels convenient/safe as you progress and become more confident. Julia has a friendly surface and simple enough syntax that can get you a long way.

However, the manual/documentation is not opinionated (as you would have the legacy-R vs Wickham’s sugars). We can agree that presenting all the facets and possibilities might look too much for an introductory reading.

Now, to address some of your worries.

Here is why this is not redundant: multiple dispatch.

Bear with me a little.

If you plan to stick with one definition, then the following two are not going to be different (let’s say, for calling purposes):

const f = x -> x + 1
ff(x) = x + 1

However, what if you want to define a new method? For ff you would just go and specialize per your needs:

ff(x::MyCustomType) = ... something

How about the anonymous function? There we have a global constant that is set in stone (any other option would compromise the performance). Now we cannot assign to f in a weird additive manner, but I can do this (which is actually the short-form definition):

f(x::MyCustomType) = ... something

So, now we are forced to mix them anyway - so why use the anonymous and do the extra keystrokes to have a const in the first place?

The point is there is no good reason to define anonymous functions and assign them to global constants in the first place: that is not their purpose.

The redundancy would hold only if the two forms actually did the same thing, and they do not. If you give up the short-form definition, the global constant naming gets in the way of defining multiple methods for your function, so you would need to rely on the standard definition for your additional methods even if they are just “one expression” long. Conversely, the short form requires a name, so you cannot use it to pass an ad-hoc function value to another function, as in the map(x -> x + 1, 1:10) scenario.

Conclusion: no redundancy in essence (although we can admit the existence of a corner-case scenario where they seem interchangeable).

So, I hope that it is pretty clear at this point that standard form (with its single-(compound)-expression syntactic sugar) is essentially different from anonymous functions, and there is no redundancy.
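To make the contrast concrete, here is a rough sketch. Celsius and bump are hypothetical names invented for this illustration and stand in for the MyCustomType example above:

```julia
# Hypothetical custom type, for illustration only.
struct Celsius
    degrees::Float64
end

# Short form: a named generic function that can grow new methods.
bump(x) = x + 1
bump(c::Celsius) = Celsius(c.degrees + 1)

# Anonymous form: built on the spot and passed straight to another function.
shifted = map(x -> x + 1, 1:3)
```

Here bump ends up with two methods selected by dispatch, while the anonymous function exists only for the duration of the map call.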

There is a cost not worth paying only if you are frequently doing the multiple lines switch - and if you are doing that frequently, then it is clear that you are going to be more aware of your code design decisions (and use the standard form to start with). On the other hand, if you only occasionally refactor a function to include multiple lines, then it is a small penalty of switching to compound expression or standard form (but you already saved a lot of keystrokes by defining a myriad of no-need-to-change functions using the short-form). I know that Julia is putting lots of responsibilities on the developer’s shoulders - in a way, it is too powerful (and dangerous sometimes). Imagine that it is powerful enough to allow the encoding of R-like syntax (at least the Hadley Wickham flavor).

But what about begin ... end and let ... end forms?

begin and let blocks are not specifically related to functions. There is convenience syntax in Julia that allows the developer to achieve all kinds of stuff. You can also use begin ... end and let ... end to define variables.

If someone doesn’t have anything in principle against begin and let blocks, then… why should there be an issue that the language allows to glue things together in a syntactically valid fashion?

Yes, we can agree that Julia is waaaay more complex than R - and although there is idiomatic Julia code, there are still a lot of gray areas where kind of anything goes (usually the idiomatic Julia is to prevent certain pitfalls, especially performance-related).

Benny · August 3, 2023, 1:12am

Me neither. The (::typeof(h)) thing is actually very logical if you look up the couple sections on functors and realize that a function is just an instance of a singleton type, that’s not the part that I complain about.
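A short sketch of that idea, with made-up names (square, Scale, double). Note that Base.issingletontype is an unexported helper, so treat its use here as an assumption about the Julia version at hand rather than stable API:

```julia
square(x) = x^2

# The function is the single instance of its own type.
is_singleton = Base.issingletontype(typeof(square))
is_instance = typeof(square).instance === square

# Adding a method through the type, exactly like the (::typeof(h)) example.
(::typeof(square))(x, y) = x^2 + y^2

# A functor: an ordinary struct made callable with the same syntax.
struct Scale
    factor::Float64
end
(s::Scale)(x) = s.factor * x

double = Scale(2.0)
```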

I would also prefer it if there were only 1 way to do things, but if you try to stick to 1 form instead of the typical usages, you hit parser issues. Let’s take the 4 forms: named function, anonymous function, named =, and anonymous -> (the latter two take single expressions, not single lines):

return type issue for anonymous `function` and `->` forms

julia> function f(x::Int)::Int 0 end
f (generic function with 1 method)

julia> function (x::Int)::Int 0 end # comma or more arguments does not help
ERROR: syntax: ambiguous signature in function definition. Try adding a comma if this is a 1-argument anonymous function...

julia> f(x::Int)::Int = 0
f (generic function with 1 method)

julia> ((x::Int)::Int) -> 0 # even parentheses doesn't help
ERROR: syntax: "x::Int" is not a valid function argument name around...

return type and `where` clause issue for anonymous `function`, `=`, and `->` forms

julia> function f(x::T)::Int where T 0 end
f (generic function with 1 method)

julia> function (x::T)::Int where T 0 end # comma or more arguments does not help
ERROR: syntax: ambiguous signature in function definition. Try adding a comma if this is a 1-argument anonymous function...

julia> f(x::T)::Int where T = 0
ERROR: UndefVarError: T not defined

julia> (f(x::T)::Int) where T = 0 # need parentheses to work
f (generic function with 1 method)

julia> ((x::T)::Int) where T -> 0 # parentheses doesn't help
ERROR: syntax: invalid variable expression in "where" around...

single-line parentheses, brackets, or braces expression issue for named and anonymous `function` forms

julia> function f(x) (x,) end
ERROR: syntax: space before "(" not allowed in "f(x) (" at...

julia> function f(x)(x,) end # ^it thought (x,) was the arguments expression
ERROR: syntax: invalid function name "f(x)" around

julia> function f(x); (x,) end # need new line or ; to work
f (generic function with 1 method)

julia> function (x); (x,) end # need new line or ; to work
#1 (generic function with 1 method)

julia> f(x) = (x,)
f (generic function with 1 method)

julia> (x) -> (x,)
#3 (generic function with 1 method)

I think it should be possible to patch these issues, since we can identify the proper rules: {optional name or (var::type) parentheses}{no space here!!}{arguments parentheses}::{optional return type} where {optional arguments’ type parameters brace}{function body} {end}. But patching the parser is probably really hard. I think if somebody (me) would like to use 1 form for everything, it’d be a patched function block.

Nathan_Boyer · August 3, 2023, 2:25pm

Yes, you only need to use the other forms if they are more convenient.

Anonymous functions are convenient when you want to construct the function at the same time you pass it to another function:

julia> v = 1:3
1:3

julia> v² = map(x->x^2, v)
3-element Vector{Int64}:
 1
 4
 9

Compact form is convenient when the function is short and you want it to look like math:

julia> f(x) = (x + 3) / 2
f (generic function with 1 method)

julia> @show f(7);
f(7) = 5.0

Compact form is also convenient if you want to add convenience methods to a longer function:

julia> function my_func(a, b, c)
           # does
           # fancy
           # stuff
       end
       my_func(nt::NamedTuple) = my_func(nt.a, nt.b, nt.c)
my_func (generic function with 2 methods)

The do function syntax I have only really used with opening files, since it has the benefit of closing the file if there is an error during execution. See docs here for an example.
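For reference, a minimal sketch of the do syntax. The block after do becomes an anonymous function passed as the first argument; mktemp is used here only to get a throwaway file path for the illustration:

```julia
# Equivalent to map(x -> x^2, 1:3).
squares = map(1:3) do x
    x^2
end

# open(f, path, mode) closes the file after the block runs, even on error.
path, io = mktemp()
close(io)
open(path, "w") do file
    println(file, "hello")
end
contents = read(path, String)
rm(path)
```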

If you don’t think you need any of the above convenience, then it is perfectly fine to always use the full function form.

hatmatrix · August 3, 2023, 9:16pm

Indeed this is probably the crux of it. In these cases it is maybe useful to stick to the standard form, though the begin...end form is a way to extend the compact syntax and maybe I wish it had been the standard way to do it. (Though the other form is more similar to MATLAB and Fortran.)

sadish-d · August 4, 2023, 4:10am

This might be controversial, but as someone new to Julia myself, a lot of Julia’s syntactic sugar feels like false promises to me. I’ve got into trouble with mixing anonymous functions with |> and with ternary operators condition ? this : that. I tend to be cautious and wrap things in parentheses. Like (anonymous function) |> something (this is actually a suggestion in the official documentation) or (condition) ? (this) : (that). Being able to write things in one line like function f(x) x end instead of blocks also feels like a false promise sometimes.

For functions, I default to using the function ... end format and always specifying return. So your example function f(x) (x,) end would work if you did function f(x) return (x,) end.

I also don’t typically specify return type because it essentially just calls the convert() function. It can hide errors that you might want exposed. Take, f(c)::Integer = c for example. f("x") will give you an error but f('x') will not.

eahenle · August 4, 2023, 4:34am

I have also run into the problem of the ternary and pipe operators not working as I expected; it was the result of “over-sugaring” an expression. In that kind of situation, you either have to use less sugar, or cut it with some salt, i.e. use parentheses.
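A quick sketch of the kind of surprise being described; the variable names are illustrative. The body of -> extends as far to the right as it can, and the ternary operator binds more loosely than arithmetic:

```julia
# Without parentheses the pipe ends up inside the lambda body,
# so g is a function equivalent to x -> sqrt(x^2).
g = x -> x^2 |> sqrt

# With parentheses, the lambda is one stage of the pipeline.
piped = 4 |> (x -> x^2) |> sqrt          # 4.0

# The `+ 10` attaches to the last branch only.
loose = true ? 1 : 2 + 10                # 1
tight = (true ? 1 : 2) + 10              # 11
```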

Why wouldn’t you just define f(c::Number)::Int = c?

algunion · August 4, 2023, 5:10am

Short disclaimer: My comments below are not the same as saying that Julia is flawless in general. So please read my comments as a focused reply to some of your affirmations, not as an exaggerated apologetics of the language.

My reply is below:

To some extent it is understandable to get into trouble when starting Julia and even to feel that some things do not work as advertised. Also, if Julia is not the very first language you are picking up, I suspect you are also bringing some background bias/expectation.

So I would be very careful when saying that something is a false promise. A genuine false promise would be something along the lines of the syntactic sugar not working as described/advertised by Julia documentation.

Getting into problems caused by improper usage of Julia’s syntax is not the same as being the victim of some false promise - even more so if that improper usage is somehow enforced by past non-Julia experiences.

Regarding the following:

… it is pretty difficult to understand why this might be a false promise since you also mention that the documentation correctly captured the corner case and indicated the proper usage.

Regarding the return type usage, the documentation specifies that:

Return type declarations are rarely used in Julia: in general, you should instead write “type-stable” functions in which Julia’s compiler can automatically infer the return type.

I am pretty sure the syntax sugar that works well in most of the cases should not be labeled as a false promise only because some corner cases can confuse the Julia parser (and are also documented accordingly).

I also understand that some syntax might not be intuitive, and things might not always work as expected. The good news is that you should not be ashamed of abusing the crutches you find along the way (documentation + community) and make steps towards becoming independent.

When you start using the language, you might be slow and less productive anyway - so the syntax sugar will not do much for you at that stage (in the same way, a faster/better car is not making much of a difference for the ones learning to drive).

sadish-d · August 4, 2023, 5:13am

I am drawing a distinction between what :: does in 1.0::Integer and what it does in f(c)::Integer. In the former it is an assertion, but in the latter it is a call to convert().

If I wanted to convert the returned value, I would write f(c::Number)= convert(Int64, c).
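A small sketch of the two meanings, using a made-up function name (as_integer) in place of f:

```julia
# On a function signature, :: converts the returned value.
as_integer(c)::Integer = c

converted = as_integer(2.0)      # goes through convert(Integer, 2.0)
from_char = as_integer('x')      # also converts instead of erroring

# On a value, :: is a type assertion and throws when it does not hold.
asserted = try
    1.0::Integer
    :no_error
catch err
    err isa TypeError ? :type_error : :other_error
end

# A String has no conversion to Integer, so this call errors.
string_fails = try
    as_integer("x")
    false
catch
    true
end
```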

algunion · August 4, 2023, 5:15am

So why bother to annotate your function f(x)::Integer in the first place?

From documentation:

A return type can be specified in the function declaration using the :: operator. This converts the return value to the specified type.

Return type declarations are rarely used in Julia: in general, you should instead write “type-stable” functions in which Julia’s compiler can automatically infer the return type.

sadish-d · August 4, 2023, 5:32am

I don’t typically annotate my function return values. Benny’s examples have functions that annotate the return type. I’m explaining why there’s reason to be cautious with that practice. And this is not my original thinking. I learned from others in the community (don’t have a citation).

I did not say that Julia makes false promises. I said it “feels” like it. As someone coming from other languages, I can relate to some of the frustrations that OP feels (and might feel in the future) when using some of Julia’s syntax. I am sure a lot of thought has gone into how the language is designed. And if I don’t know the sequence in which Julia evaluates operators (eg. -> vs |>) then that’s not a fault in the language. But I do try to caution people about too much sugar. Another example: implicit multiplication: a=[1,2]; a'a gives you 5. But [1,2]'[1,2] gives 2. As a new user, I found that confusing.
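The implicit-multiplication example can be checked with a few lines. This sketch only restates the case above and adds the explicit form for comparison:

```julia
a = [1, 2]

implicit = a'a                   # juxtaposition multiplies: a' * a
indexed = [1, 2]'[1, 2]          # indexes the 1×2 adjoint at row 1, column 2
explicit = [1, 2]' * [1, 2]      # spelling out * avoids the ambiguity
```

The first and third lines give 5, while the second gives 2 because the trailing brackets are read as an index.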

algunion · August 4, 2023, 5:44am

Understood - no point in debating feelings, then.

Related to return types, there can be cases where it actually makes sense (which is not the same as recommending using it all over the place without good reason):

floatproducer(x) = x + 1.0
int8producer(x) = convert(Int8, x)

function manyexitpoints(x)::Integer
    x > 10 && return floatproducer(x)
    x < 10 && return int8producer(x)
    # dozens of additional return points 
    # all compatible with convert(Integer, n)
    x
end

Imagine a function where you have multiple return locations, and you also need (for any reason) to return a particular type from your function (yes, it is obvious that you could just convert the result - but let’s enforce the constraint). Now, it would not be very practical to call convert for each return location - so using a return type will do the job.

By the way - I never needed anything close to the above example

Barget · August 4, 2023, 10:24am

There’s an interesting notion behind this, namely whether or not the syntax should be congruent with its usage.

Two (dumb) examples: gotos and includes. It has become common knowledge not to use gotos everywhere, because spaghetti code bad. Same for nested includes, which should be avoided or kept to a minimum.

On the other hand, if-else statements (despite being compiled to “gotos” at the assembly level) provide much less flexibility than gotos but are much easier to reason about.

My point being that some syntaxes impose a certain usage, which you don’t have to learn separately, while other syntaxes provide so much freedom in use that you also need to learn the use cases.

This has been addressed in some messages above (e.g. use anonymous functions when you pass them as an argument, use the single-line style when it’s convenient, etc.), but I wanted to highlight this notion of syntax vs. use cases (some syntaxes impose a certain usage, others don’t), so that it can be applied to other aspects of the language (e.g. when to do explicit typing, which some languages impose; how to structure a project with includes; using an array of structs vs. a struct of arrays; etc.), acknowledging the fact that there are many solutions to the same problem.

To conclude, I think this is one hard aspect of Julia: you have so much freedom that you need to build a personal intuition of “when to use what”, which takes time. (At least, that was the case for me.)

hatmatrix · August 4, 2023, 11:30pm

Very interesting points and examples.

I suppose if-else statements can be written with gotos and so the former form is in principle redundant, but we would hardly think of if-else as syntactic sugar and never think to forgo this control structure.

As was demonstrated in this thread, it’s possible to bind anonymous functions and extend them for multiple dispatch, but most would agree that this is not a good move. Apart from user responsibility and personal intuition, there is also some community effort to establish consensus and define what is idiomatic in any language. Apart from Numpy/Pandas, where there are a gajillion ways to do the same thing, Pythonistas have been fairly good at broadcasting what the “Pythonic” way of doing certain things is (was it in the documentation?), but in other communities it’s not clear how one goes about learning these language idioms systematically.