Hi there, I’m still new to Julia and trying to figure out why some behavior is not what I would expect.
In the dataframes example for the split-apply-combine strategy here https://dataframes.juliadata.org/stable/man/split_apply_combine/
there is an example:
combine(iris_gdf, :PetalLength => mean => :myMean)
which works fine and as intended, but when I use an anonymous function like this
combine(iris_gdf, :PetalLength => x -> mean(x) => :myMean)
I get a data frame like this:
Which seems like a bug.
The question is: how do I get a new column with a proper name and proper content from a custom function, without renaming the column in an extra step?
My “bad” solution is like this:
rename(combine(iris_gdf, :PetalLength => x -> mean(x)), "PetalLength_function" => "myMean")
(mean) is just the same as mean, while x -> mean(x) => :x is not the same as (x -> mean(x)) => :x, in the same way that (1) is the same as 1 but 1 + 2 * 3 is not the same as (1 + 2) * 3.
The Pair(s) are resolved before being passed to the combine function. Let’s look below at what the combine function is receiving as its second argument.
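As a rough sketch of that, the two spellings can be built and inspected without DataFrames at all. The names p_bad and p_good are only illustrative, and the vector [1.0, 2.0, 3.0] is made-up data standing in for a column:

```julia
using Statistics

# Without parentheses, the anonymous function body swallows the trailing `=> :myMean`.
p_bad = :PetalLength => x -> mean(x) => :myMean

# With parentheses, the function ends before the second `=>`.
p_good = :PetalLength => (x -> mean(x)) => :myMean

# p_bad is source => function, and that function returns a Pair.
println(p_bad.second isa Function)        # true
println(p_bad.second([1.0, 2.0, 3.0]))    # 2.0 => :myMean

# p_good is source => (function => target name).
println(p_good.second isa Pair)           # true
println(p_good.second.first([1.0, 2.0, 3.0]))  # 2.0
println(p_good.second.second)             # myMean
```

So in the unparenthesized form, combine never sees a target column name; it only sees a function whose return value happens to be a Pair.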
Your original version is a single Pair from the column name to a function output. (The “var” thing is denoting an anonymous function.) The output of the anonymous function is a Pair, which you can see in your provided PetalLength_function column screenshot.
Julia reads your version as if you had put parentheses around the final Pair, i.e. :PetalLength => x -> (mean(x) => :myMean). Written this way, you have only provided combine with a source column and an operation function, but no target column name.
There is no -> character in that expression. It is an issue of parsing order between the => and -> symbols.
As nilshg said, think of it like 2 * 3 + 4 * 5 vs. 2 * 3 * 5. Adding parentheses changes the meaning of the first one, 2 * (3 + 4) * 5, but not of the second one, 2 * (3) * 5.
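Putting this together, a minimal sketch of the one-line fix is to wrap the anonymous function in parentheses. The small data frame here is invented for illustration and stands in for the iris data; only the parenthesization matters:

```julia
using DataFrames, Statistics

# Hypothetical stand-in for the iris data set.
df = DataFrame(Species = ["a", "a", "b", "b"],
               PetalLength = [1.0, 3.0, 2.0, 6.0])
gdf = groupby(df, :Species)

# Parentheses end the anonymous function before `=> :myMean`,
# so combine receives source => function => target name.
result = combine(gdf, :PetalLength => (x -> mean(x)) => :myMean)

println(names(result))   # ["Species", "myMean"]
println(result.myMean)   # [2.0, 4.0]
```

With this form there is no need for the extra rename call.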