I'm using DataFrames and RollingFunctions. When dfData is about 2000 rows there are no issues: the code executes and does what is expected. But when I add all of the IDs, dfData grows to about 100k rows, and the above code throws an error:
nested task error:
Bad window span (12) for length 10.
which seems to originate in the splitapplycombine.jl part of the DataFrames package, but I can't quite figure out why. Is it some sort of memory limit I am hitting with how DataFrames can be used in Julia?
Thanks!
edit: issue seems to happen only when df gets over 8000 rows or so, and then the error starts showing up…
Could you post some runnable code so people can reproduce this on their own machines? Code to make some fake data in dfData and to include all the modules you’re using would be ideal.
struggling to reproduce the error with random data df…
I'm using the DataFrames, Query, and RollingFunctions packages:
Query for the LINQ-style query that reads the df, DataFrames to actually store the query result as a df, and RollingFunctions for the "running" method.
Unfortunately the below works just fine, so I'm starting to think this isn't purely a size issue…
It's the 12 in there, the window span. Some ID categories have only up to 10 rows, so it breaks the moment it tries to apply a 12-row window to a groupby category where only 10 rows exist.
Hmmm… is there a simple way to tell it to ignore those? Alternatively, I guess I can filter the original df to ensure those don't occur.
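Both ideas can be sketched roughly as follows. This is only an illustration, not the original code: the toy dfData, the column names ID and Value, the output column RunMean, and the helper running_mean are all made up here. The helper is a plain-Julia stand-in with a window that shrinks at the start of each group. It is similar in spirit to a tapered running mean, but it is not the RollingFunctions implementation.

```julia
using DataFrames
using Statistics: mean

# Hypothetical stand-in for dfData: ID "a" has 15 rows, ID "b" only 10.
dfData = DataFrame(ID = vcat(fill("a", 15), fill("b", 10)),
                   Value = collect(1.0:25.0))

window = 12

# Option 1: count rows per ID, then keep only IDs with at least `window` rows.
counted = transform(groupby(dfData, :ID), nrow => :n)
dfLong  = filter(:n => >=(window), counted)   # only ID "a" survives

# Option 2: a running mean whose window shrinks near the start of the vector,
# so a group shorter than `window` is averaged over what is available
# instead of raising an error. (Illustrative helper, not a library function.)
running_mean(v, w) = [mean(@view v[max(1, i - w + 1):i]) for i in eachindex(v)]

dfAll = transform(groupby(dfData, :ID),
                  :Value => (v -> running_mean(v, window)) => :RunMean)
```

Option 1 drops the short IDs entirely, which is the "filter the original df" route. Option 2 keeps every ID but means the first rows of each group are averaged over fewer than 12 values, so which one fits depends on whether partial windows are acceptable for your data.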