Hi, how’s it going? I have a function that takes in a dataframe, groups it by a column, and takes a random sample of up to 20 rows from each group. I want to add a condition that says “if this sub-dataframe has fewer than 20 rows, don’t return anything; otherwise, return the random sample.”
Here’s the code:
new_df = by(df,:group) do sdf
sample_df = sdf[StatsBase.sample(axes(sdf, 1), min(size(sdf,1),20); replace = false, ordered = true), :]
#here is where I need the condition that says "if you've got less than 20 rows, I don't need you in my returned dataframe"
return sample_df
end
return new_df
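In case it helps pin down what I'm after, here is a rough sketch of the behavior I want, written as a plain loop over the groups instead of inside the `do` block. The name `sample_large_groups` and the `n` argument are made up for illustration, and it assumes `df` has a `:group` column as in my code above. I'd still like to know whether the same thing can be done inside `by` itself.

```julia
using DataFrames, StatsBase

# Hypothetical helper: keep only groups with at least `n` rows,
# and take `n` random rows (kept in their original order) from each of them.
function sample_large_groups(df::AbstractDataFrame, n::Integer = 20)
    parts = DataFrame[]
    for sdf in groupby(df, :group)
        nrow(sdf) < n && continue  # skip groups that are too small
        rows = StatsBase.sample(axes(sdf, 1), n; replace = false, ordered = true)
        push!(parts, sdf[rows, :])
    end
    # If no group is large enough, return an empty dataframe with the same columns.
    return isempty(parts) ? df[1:0, :] : reduce(vcat, parts)
end
```

With this, `new_df = sample_large_groups(df)` should contain exactly 20 rows for every group that has at least 20, and nothing for the smaller groups.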
Thank you for your help as always!