I am running experiments that involve Monte Carlo sampling on clusters, and I am collecting the mean and variance using OnlineStats.
The runs take a long time and I get timeout errors on our clusters, so I want to save the state and resume in another job.
Since the number of samples is huge, I want to store only the mean, the variance, and the sample size (not the whole data set) for the restart.
I am aware of the algorithm (Online estimation of variance with limited memory - Cross Validated), but if I were willing to implement it myself, I would not be using OnlineStats.
If I do
mystat = Series(Mean(),Variance())
├─ Mean: n=10 | value=0.542621
└─ Variance: n=10 | value=0.077484
If I could save mystat and load it back as a "Julia variable", as in MATLAB, that would be fine, but it seems to be tricky (see: What is the preferred way to save variables? - #17 by FHell).
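For reference, a minimal sketch of that save/load idea using the Serialization standard library. It assumes both jobs run the same Julia and OnlineStats versions, since the serialized format is not guaranteed to be portable across versions. The file name mystat.jls and the random data are placeholders, not part of my actual setup.

```julia
using OnlineStats, Serialization

# First job: accumulate statistics, then write a checkpoint.
mystat = Series(Mean(), Variance())
fit!(mystat, randn(1000))

path = joinpath(mktempdir(), "mystat.jls")  # placeholder checkpoint path
serialize(path, mystat)

# Later job: restore the checkpoint and keep accumulating.
restored = deserialize(path)
newstat = Series(Mean(), Variance())
fit!(newstat, randn(1000))
merge!(restored, newstat)  # restored now covers both batches
```

Whether this is robust enough for long-lived checkpoints is exactly the open question; it only shows the shape of the workflow.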
value(mystat) gives the mean and variance, and nobs(mystat) gives the sample size, which I can save to a .txt file and read back in another run.
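The text-file route could look roughly like the sketch below. The file name checkpoint.txt, the one-line "mean variance n" layout, and the random data are assumptions made for illustration.

```julia
using OnlineStats

mystat = Series(Mean(), Variance())
fit!(mystat, randn(10))

m, v = value(mystat)  # tuple: (mean, variance)
n = nobs(mystat)      # sample size

path = joinpath(mktempdir(), "checkpoint.txt")  # placeholder file name

# Write the three numbers on one line, separated by spaces.
open(path, "w") do io
    println(io, m, " ", v, " ", n)
end

# In another run: read the line back and parse the three fields.
fields = split(readline(path))
m_saved = parse(Float64, fields[1])
v_saved = parse(Float64, fields[2])
n_saved = parse(Int, fields[3])
```

Julia prints a Float64 with enough digits to recover the same value when parsed, so nothing is lost by going through text.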
But given the mean, variance, and sample size, I don't know how to create a Series mystat holding the same information, so that I can merge! it in another run of my experiment.
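To make clear what I want the merge to compute, here is a hand-rolled sketch of combining two (n, mean, variance) summaries with the standard pairwise update formula. RunSummary, summarize, and combine are hypothetical names, not part of OnlineStats, and the sketch assumes the sample (n - 1) variance with at least two observations per batch.

```julia
using Statistics

# Hypothetical container for the three saved numbers.
struct RunSummary
    n::Int
    mean::Float64
    var::Float64  # sample variance (divides by n - 1)
end

summarize(x) = RunSummary(length(x), mean(x), var(x))

# Combine two summaries without access to the raw samples.
function combine(a::RunSummary, b::RunSummary)
    n = a.n + b.n
    δ = b.mean - a.mean
    μ = a.mean + δ * b.n / n
    # Sum of squared deviations about the combined mean.
    ss = a.var * (a.n - 1) + b.var * (b.n - 1) + δ^2 * a.n * b.n / n
    return RunSummary(n, μ, ss / (n - 1))
end

x = randn(1000)
y = randn(500) .+ 2
c = combine(summarize(x), summarize(y))
```

The combined summary should agree, up to floating-point error, with mean(vcat(x, y)) and var(vcat(x, y)); what I am after is the same result through merge! on a Series rebuilt from the saved numbers.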