I tried to understand how this script here works and I tried to reproduce it in other ways:
This version with insertcols! seems to work:
df=DataFrame(name=["James Smith 39","Michel Smith 41","Maria Garcia 19"],
freq=[12345,23431,11322])
insertcols!( df,([:first,:last,:age] .=>collect.(zip(split.(df.name)...)))...)
Trying to do something similar with transform!, however, does not work:
transform!(df, :name => (x-> collect.(zip(split.(x)...))).=>[:first,:last,:age])
Could someone help me find the error in the latter version?
You have an extra broadcast at the end (the dot in .=>). You want:
julia> transform!(df, :name => (x-> collect.(zip(split.(x)...)))=>[:first,:last,:age])
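To see why the dot matters, here is a small sketch in plain Julia (no DataFrames needed) comparing the two pair expressions. The names `f`, `cols`, `plain` and `dotted` are placeholders of mine; `f` stands in for the anonymous function above. With `=>` the function is paired with the whole vector of names, while `.=>` broadcasts over the names and yields a vector of pairs, which is what insertcols! wants after splatting but apparently not a form transform! accepts in that position.

```julia
# Plain Julia; `f` is only a stand-in for the anonymous function in the thread.
f = identity
cols = [:first, :last, :age]

# Without the dot: :name => (f => [:first, :last, :age])
plain = :name => f => cols

# With the dot: :name => [f => :first, f => :last, f => :age]
dotted = :name => f .=> cols

println(plain.second isa Pair)     # the function is paired with the whole vector
println(dotted.second isa Vector)  # one pair per column name
println(length(dotted.second))
```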
Thanks!
I had been led by analogy with the insertcols! syntax to use the form with broadcasting.
I'm trying to understand what happens in each step.
For example, I have verified that the collect is not necessary:
transform!(df, :name => (x-> (zip(split.(x)...)))=>[:first,:last,:age])
I'm also still trying to understand the various possibilities of the transform! function.
One section of the manual explains that you can also pass a vector of pairs (symbol => function).
For this I tried the following form:
transform!(df, [:name => (x->cs)=>cd for (cs,cd) in zip(split.(df.name),[:first,:last,:age])])
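For comparison, a minimal sketch of the vector-of-pairs form in which each pair extracts one token position per row with ByRow. The index-based closure and the names `cols`, `specs` and `df2` are my own assumptions, not something taken from the manual, and the sketch assumes every name splits into at least three whitespace-separated tokens.

```julia
using DataFrames

df = DataFrame(name=["James Smith 39", "Michel Smith 41", "Maria Garcia 19"],
               freq=[12345, 23431, 11322])

cols = [:first, :last, :age]

# One pair per target column: the i-th pair keeps the i-th token of every row.
specs = [:name => ByRow(s -> split(s)[i]) => col for (i, col) in enumerate(cols)]

# transform (without !) returns a new data frame and leaves df untouched.
df2 = transform(df, specs)
println(df2)
```

This splits each string once per output column, so it is more of an illustration of the vector-of-pairs syntax than an efficient way to do the split.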
I'm continuing to study the possibilities of the transform! function.
This is very close to what I was looking for in order to handle splitting a column in a general way:
transform!(df, :name => (x-> split.(x))=>AsTable)
transform!(df, :name => (x-> split.(x))=>[string("col",i) for i in 1:length(split(df.name[1]))])
The most succinct is probably
julia> transform(df, :name => ByRow(split) => AsTable)
3×5 DataFrame
 Row │ name             freq   x1         x2         x3
     │ String           Int64  SubStrin…  SubStrin…  SubStrin…
─────┼────────────────────────────────────────────────────────
   1 │ James Smith 39   12345  James      Smith      39
   2 │ Michel Smith 41  23431  Michel     Smith      41
   3 │ Maria Garcia 19  11322  Maria      Garcia     19
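If you want the names from the original question rather than x1, x2, x3, a sketch along the same lines is to give a vector of column names as the target instead of AsTable. The final step that parses the age into an integer is my own addition and assumes the third token is always a number; `df2` is just a name I picked.

```julia
using DataFrames

df = DataFrame(name=["James Smith 39", "Michel Smith 41", "Maria Garcia 19"],
               freq=[12345, 23431, 11322])

# ByRow(split) returns one vector of tokens per row;
# the target vector supplies the names of the resulting columns.
df2 = transform(df, :name => ByRow(split) => [:first, :last, :age])

# Optional: replace the age column (substrings) with parsed integers.
transform!(df2, :age => ByRow(a -> parse(Int, a)) => :age)
println(df2)
```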