Sorry if my questions sound a bit confusing. I’ve been trying to figure out how to work with DataFrames using a pipe of some kind. It could be @linq, as below, or one of the alternatives.
using DataFrames
using DataFramesMeta
using CSV
using Statistics
data = "PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked\n87,0,3,\"Ford, Mr. William Neal\",male,16.0,1,3,W./C. 6608,34.375,,S\n89,1,1,\"Fortune, Miss. Mabel Helen\",female,23.0,3,2,19950,263.0,C23 C25 C27,S\n371,1,1,\"Harder, Mr. George Achilles\",male,25.0,1,0,11765,55.4417,E50,C\n421,0,3,\"Gheorgheff, Mr. Stanio\",male,,0,0,349254,7.8958,,C\n498,0,3,\"Shellard, Mr. Frederick William\",male,,0,0,C.A. 6212,15.1,,S\n511,1,3,\"Daly, Mr. Eugene Patrick\",male,29.0,0,0,382651,7.75,,Q\n538,1,1,\"LeRoy, Miss. Bertha\",female,30.0,0,0,PC 17761,106.425,,C\n627,0,2,\"Kirkland, Rev. Charles Leonard\",male,57.0,0,0,219533,12.35,,Q\n781,1,3,\"Ayoub, Miss. Banoura\",female,13.0,0,0,2687,7.2292,,C\n855,0,2,\"Carter, Mrs. Ernest Courtenay (Lilian Hughes)\",female,44.0,1,0,244252,26.0,,S\n"
df = CSV.read(IOBuffer(data))
df = @linq df |>
deletecols([:Name, :SibSp, :Parch, :Cabin, :Embarked, :Ticket, :Fare]) |>
rename(:PassengerId => :id, :Survived => :survived, :Pclass => :class, :Sex => :sex, :Age => :age)
df.age = replace(df.age, missing => median(skipmissing(df.age)))
df = @transform(df, survived = convert.(Bool, :survived))
df.sex = categorical(df.sex)
df.class = categorical(df.class)
df
The code above works, but ideally all of those transformations would be in a single pipe, and I couldn’t get that to work. There is no particular need for it, except that it would look nicer. Even then, it still would not be the best possible solution.