When I run the code below, the memory usage blows up:

    function func(df::DataFrame)
        X = df[:time]
        indices = findall(X .> 0)
    end

    # read in R data
    rds = "blablab.rds"
    objs = load(rds);
    params = collect(0.5:0.005:0.7);

    for i in 1:length(objs)
        cols = [string(name) for name in names(objs.data[i]) if occursin("blabla", string(name))]
        hypers = [(a, b) for a in cols, b in params]
        results = [DataFrame() for _ in 1:length(hypers)]

        # HERE IS WHERE THE MEMORY BLOWS UP
        Threads.@threads for hi in 1:length(hypers)
            name, val = hypers[hi]
            results[hi] = func(objs.data[i])
        end
    end

The DataFrame is about 0.7 GB, but when I run this piece of code my memory usage goes up to ~30 GB! It seems like just accessing a column inside
func() is copying the whole thing?
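
One way to test the copying suspicion directly is to compare the two column-access forms in current DataFrames.jl and measure what a single call allocates. This is only a minimal sketch: the small `df` built here is a hypothetical stand-in for `objs.data[i]`, `func_nocopy` is an illustrative name that is not part of the code above, and it assumes a DataFrames.jl version that supports `df[!, :col]` indexing.

```julia
using DataFrames

# Hypothetical stand-in for objs.data[i]; the real data comes from the .rds file.
df = DataFrame(time = randn(1_000_000), other = randn(1_000_000))

col_nocopy = df[!, :time]   # the stored column vector itself, no copy
col_copy   = df[:, :time]   # a freshly allocated copy of the column

@assert col_nocopy === df.time   # same object as the stored column
@assert col_copy !== df.time     # a different object with equal contents
@assert col_copy == df.time

# Illustrative variant of func: reads the column without copying it and
# passes a predicate to findall, so no temporary mask is built by `X .> 0`.
func_nocopy(df::DataFrame) = findall(>(0), df[!, :time])

func_nocopy(df)                       # warm-up call so compilation is not measured
bytes = @allocated func_nocopy(df)    # bytes allocated by one call
println("allocated per call: ", bytes, " bytes")
```

If the per-call figure is small compared with the size of the DataFrame, column access is probably not what is copying the data, and the growth is more likely coming from what accumulates across the threaded loop iterations.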