One thing I would also like in my dream chain function is a way to alter a global state without breaking the chain
@chain df begin
fun1(1)
@eval_in_local_scope begin
x = 5
end
fun2(x)
end
I'm sure this makes some anti-chain purists' heads explode, but it's something I've always wanted in R.
That already works: the whole expression is just one big let block.
Can you give an example? Is it just that expressions of the form Expr(:call, ...) are replaced and other ones are not?
Check the README, it lists the different ways expressions are replaced. You can also try it with macroexpand. Every expression just gets prepended with a newvar =, that's why error highlighting works etc. So you can do anything you can do in normal code.
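For instance, a minimal sketch of inspecting the rewrite, assuming Chain.jl is installed. The temporary variable names in the expansion are generated, so the printed output will differ between runs and is not shown here:

```julia
using Chain

# Look at what @chain rewrites the block into. The expansion contains
# generated variable names, so it is printed rather than compared.
expanded = @macroexpand @chain [1, 4, 9] begin
    sum
    sqrt
end
println(expanded)

# The chain itself behaves like the equivalent nested call sqrt(sum(...)).
result = @chain [1, 4, 9] begin
    sum
    sqrt
end
```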
Ah okay, this is what people are discussing above about @! and @aside. ftw. I like @aside.
I love it. In my work the data transform stages often stack pretty deep, so the pipe symbol at the end (though cool generally) becomes burdensome.
- Agree with eliminating _; good idea to make that the default, with explicit _ allowed first.
- Error handling: this is a huge improvement.
- @! for the debugging flag: not a good choice, it semantically overlaps with the mutating function syntax (and means kind of the opposite).
  - Other options (really anything but !): @tee, @x (for exclude), @bypass, @(), @0, @<
- The begin seems superfluous, except that an end is needed. Is it sensible to swallow that as well?
+1 for @bypass
I like that too.
The begin-end block is needed so that multiple statements are parsed as a single expression. If you have something like this:
@chain df
transform
end
it won't parse correctly. The parser will assume that the macro call ends after df, due to the subsequent newline character.
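You can see this without Chain.jl at all. A rough sketch using only Base's Meta.parse (the variable names src, ex, pos and ex2 are just for illustration):

```julia
# Without `begin`, the parser ends the macro call at the newline after `df`.
src = "@chain df\ntransform\nend"
ex, pos = Meta.parse(src, 1)   # parse only the first expression in src
# ex is the macro call `@chain df`; everything from index pos on is left over.

# With `begin ... end`, the whole block becomes one argument of the macro call.
ex2 = Meta.parse("@chain df begin\ntransform\nend")
block = ex2.args[end]          # the begin-end block handed to the macro
```

The leftover `transform` and `end` in the first case are what trip up the parser afterwards.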
I agree with renaming @!: it is not easy to read, and it could be confusing. For me, @aside and @bypass are both nice and not confusing at all.
+1 for @bypass
I'm suggesting the @chain macro absorb the "begin" keyword; the parsed result would still contain it.
There are plusses and minuses to this. Basically, users would have to treat the @chain macro as the beginning of a block.
Parsing happens before macro expansion, so that's not actually possible. In other words, you can't use arbitrary syntax in a macro. You can only use syntax that the Julia parser knows how to parse.
I've decided to use @aside. I think it both describes its purpose best and is easiest to understand without additional knowledge.
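A minimal sketch of how @aside reads in practice, assuming Chain.jl is installed. The vector seen is a made-up name, used only to record the side effect:

```julia
using Chain

seen = Int[]   # hypothetical log of intermediate lengths

result = @chain [1, 2, 3, 4] begin
    map(x -> x^2, _)
    @aside push!(seen, length(_))   # side effect only, its result is not passed on
    filter(isodd, _)
    sum
end
```

The filter step still receives the squared vector, not the return value of push!.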
As I like both Chain.jl and Underscores.jl, I did a POC to combine the two.
Borrow functions (and the name of @_) from Underscores.jl and we can define anonymous functions in the pipe block as expressions of _ or _1,_2,... (or _₁,_₂,...).
Examples
The macro @_ for the POC is based on @chain and uses __ instead of _ as the placeholder.
@_ [1:5, 4:10] begin
map(_[end]^2, __)
filter(isodd, __)
end
using DataFrames
df = DataFrame(x = [1, 3, 2, 1], y = 1:4)
@_ df begin
filter(_.x > 1 && isodd(_.y) , __)
transform([:x, :y] => ByRow(_1 *100 + _2) => :z)
end
The original @chain would look like:
@chain [1:5, 4:10] begin
map(x -> x[end]^2, _)
filter(isodd, _)
end
@chain df begin
filter(row -> row.x > 1 && isodd(row.y) , _)
transform([:x, :y] => ByRow((a, b) -> a *100 + b) => :z)
end
I like the documentation example:
@chain df begin
dropmissing
filter(:id => >(6), _)
groupby(:group)
combine(:age => sum)
end
The only remaining underscore is in filter.
This could be removed as well using:
@where( :id .> 6 )
Unfortunately you then have to vectorise the condition.
Is there a way to avoid this and just have where( :id > 6 ) ?
Is there a Chain equivalent of vectorised piping, e.g.:
@pipe [1 2 3] .|>
log .|>
_^2
No, that's a trade-off that comes from rewriting to temporary variables and keeping the expressions on each line otherwise intact. Also, there's no symbol between lines that could signal this. That means broadcasting fusion does not happen across lines, but it's not so common in the DataFrames scenario.
You can prefix function symbols like @. log though, and of course use normal broadcasting like _ .^ 2, just remember there's no fusion across lines.
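A small sketch of the normal-broadcasting route, assuming Chain.jl is installed. It only uses dot calls with an explicit _, so each line allocates its own intermediate array:

```julia
using Chain

result = @chain [1.0, 2.0, 3.0] begin
    log.(_)    # elementwise log, materialised as a temporary
    _ .^ 2     # elementwise square of that temporary, no fusion with the line above
end
```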
In the below, the insertcols! function gives an error if the first argument _ is removed.
@chain DataFrame(A=["a/b","c/d","x/y"]) begin
insertcols!( _, ([:C1,:C2] .=> split.( _.A, '/' ) |> invert )...)
end
Does the first-argument removal feature not work for ! functions?
Many thanks for your answer by the way.
I'm asking all these dumb questions because I love the package.
This breaks because whenever you have a _ in the expression, the "first argument" rule ceases to apply. You always need the _ in the correct places whenever you have any _.
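A small sketch of that rule with plain vectors, assuming Chain.jl is installed. Here push! stands in for insertcols! just to keep the example free of DataFrames; it is not from the original post:

```julia
using Chain

v = @chain [3, 1, 2] begin
    sort                  # bare symbol: becomes sort(previous)
    push!(4)              # no _ anywhere: previous result is inserted as first argument
    push!(_, length(_))   # a _ appears, so every position must be written out
end
```

So ! functions get the first argument inserted like any other call, as long as the line contains no _ at all.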