I understand what you're saying, but I mean putting some of those things in a specific thread, organized as a collection of exercises, problems, contests, summaries…
And I’m not talking about natural language processing or astrophysics problems, just simple problems, though some of the former could be used as examples.
For example, I once needed an efficient way to reshape huge data from long to wide format using R and data.table.
I got this answer from Uwe, because the usual methods consume too much memory:
library(data.table)
# add unique row number to join on later
# (leave `ID` col as placeholder for all other id.vars)
mydata[, rn := seq_len(.N)]
# define columns to be reshaped
measure_cols <- stringr::str_subset(names(mydata), "_\\d$")
# melt with only one id.vars column
molten <- melt(mydata, id.vars = "rn", measure.vars = measure_cols)
# split column names of measure.vars
# Note that "variable" is reused to save memory
molten[, c("variable", "measure") := tstrsplit(variable, "_")]
# coerce names to factors in the same order as the columns appeared in mydata
molten[, variable := forcats::fct_inorder(variable)]
# remove columns no longer needed in mydata _before_ joining to save memory
mydata[, (measure_cols) := NULL]
# final dcast and right join
result <- mydata[dcast(molten, ... ~ variable), on = "rn"]
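To make it easier to see what those steps do, here is a minimal sketch that runs them on a tiny made-up table. The toy columns (`ID`, `a_1`, `a_2`, `b_1`, `b_2`) are my own assumption, not the real data. To keep it dependent on data.table only, I swapped `stringr::str_subset` for base `grep` and `forcats::fct_inorder` for a plain `factor()` call, and I pass `value.var` explicitly, so treat it as an illustration rather than Uwe's exact code.

```r
library(data.table)

# Hypothetical toy data: two measures (a, b) observed at two points (_1, _2)
mydata <- data.table(
  ID  = c("x", "y"),
  a_1 = c(1, 2),
  a_2 = c(3, 4),
  b_1 = c(5, 6),
  b_2 = c(7, 8)
)

# add a unique row number to join on later
mydata[, rn := seq_len(.N)]

# columns to be reshaped (base grep used here instead of stringr::str_subset)
measure_cols <- grep("_\\d$", names(mydata), value = TRUE)

# melt with only one id.vars column
molten <- melt(mydata, id.vars = "rn", measure.vars = measure_cols)

# split "a_1" into "a" and "1"; "variable" is overwritten to save memory
molten[, c("variable", "measure") := tstrsplit(variable, "_")]

# keep the original column order (base factor() instead of forcats::fct_inorder)
molten[, variable := factor(variable, levels = unique(variable))]

# drop the reshaped columns from mydata before joining
mydata[, (measure_cols) := NULL]

# final dcast and right join
result <- mydata[dcast(molten, ... ~ variable, value.var = "value"), on = "rn"]
print(result)
```

Each original row ends up as one row per `measure`, with the `a` and `b` values side by side.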
It would be great to have a collection of similar things for Julia.