I'm quite new to DataFrames and tried to translate some of my R code to Julia.
Here is my R code:
df |>
  left_join(df2, by="customer") |>
  left_join(df3, by="order") |>
  left_join(df4, by="item") |>
  group_by(order) |>
  mutate(total_income = sum(wholesale_income),
         profit = total - total_income,
         profit_count = sum(profit > 0)) |>
  arrange(profit_count) |>
  first()
This code joins several data frames, computes some new variables per order, sorts by the profit count, and outputs the first row.
This is what I tried in Julia:
@chain df begin
    leftjoin(df2, on=:customer, matchmissing=:equal)
    leftjoin(df3, on=:order, matchmissing=:equal)
    leftjoin(df4, on=:item, matchmissing=:equal)
    groupby(:order)
    @transform(:total_income = sum(skipmissing(:wholesale_income)))
    @transform(:profit = :total .- :total_income)
    @transform(:profit_count = sum(skipmissing(:profit) .> 0))
    @orderby(:profit_count)
    first
end
It looks pretty similar, but the results are very different, especially the profit counts.
This may be because of the missing values. It is still a mystery to me how these are treated in Julia; in R this seems to happen automatically.
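To narrow this down, here is a minimal sketch on made-up data. The table `toy` and its values are hypothetical and are not the data frames above. It uses plain DataFrames.jl `transform` rather than the `@transform` macro, on the assumption that the macro behaves the same way here. It checks two things: how `missing` behaves in sums and comparisons, and whether the grouping from `groupby` survives a `transform` step. As far as I can tell, `transform` on a grouped data frame returns an ungrouped `DataFrame` by default, so later steps in a chain may no longer run per order, which could explain the differing profit counts.

```julia
using DataFrames

# Hypothetical toy data, only for illustration.
toy = DataFrame(order = [1, 1, 2, 2],
                total = [10, 10, 5, 5],
                wholesale_income = [3, missing, 4, 4])

# `missing` propagates through sums and comparisons unless it is skipped.
plain_sum   = sum(toy.wholesale_income)               # missing
skipped_sum = sum(skipmissing(toy.wholesale_income))  # 11
comparison  = missing > 0                             # missing, not false

# Per-order sum, as in the first @transform of the chain.
gdf = groupby(toy, :order)
step1 = transform(gdf, :wholesale_income => (x -> sum(skipmissing(x))) => :total_income)

# By default the result is a plain DataFrame: the grouping is gone.
grouping_dropped = step1 isa DataFrame

# Passing ungroup = false keeps the result grouped.
step1_grouped = transform(gdf, :wholesale_income => (x -> sum(skipmissing(x))) => :total_income;
                          ungroup = false)
grouping_kept = step1_grouped isa GroupedDataFrame

step2 = transform(step1, [:total, :total_income] => ByRow(-) => :profit)

# Counting on the ungrouped table gives one count for the whole table ...
ungrouped = transform(step2, :profit => (p -> sum(p .> 0)) => :profit_count)

# ... while grouping again gives one count per order.
regrouped = transform(groupby(step2, :order),
                      :profit => (p -> sum(p .> 0)) => :profit_count)
```

If `ungrouped.profit_count` and `regrouped.profit_count` differ on this toy data, then grouping again before each per-order step (or keeping the result grouped with `ungroup = false`) would be the first thing to try, before looking further into the missing values.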