Julia equivalent of boost accumulator

Is there a Julia package similar to the Boost accumulators library in C++? A Boost accumulator can calculate statistical properties (average, standard error, etc.) of a data stream on the fly without storing all the data.

Check out OnlineStats.jl or RollingFunctions.jl.


By the way, I’ve found the JuliaHub AskAI chatbot (GPT-based) to be very useful for questions like this. Feeding it your question as the prompt yields:

Yes, there is a Julia package called OnlineStats that provides functionality similar to the boost accumulator in C++. OnlineStats allows you to perform statistical calculations on streaming or large datasets using online algorithms. These algorithms process data one observation at a time and use O(1) memory, which means they can calculate statistical properties on the fly without storing all the data.
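To make the "one observation at a time, O(1) memory" idea concrete, here is a minimal sketch of Welford's running mean and variance in plain Julia. It is only an illustration of the general technique, not the code OnlineStats uses internally; the names RunningStats, update!, sample_var and std_error are hypothetical and the sample values are made up.

```julia
# Hypothetical illustration type; not part of OnlineStats.
mutable struct RunningStats
    n::Int         # number of observations seen so far
    mean::Float64  # running mean
    m2::Float64    # running sum of squared deviations from the mean
end

RunningStats() = RunningStats(0, 0.0, 0.0)

# Welford's update: fold one observation into the running state.
function update!(s::RunningStats, x::Real)
    s.n += 1
    delta = x - s.mean
    s.mean += delta / s.n
    s.m2 += delta * (x - s.mean)
    return s
end

# Unbiased sample variance; NaN until at least two observations have arrived.
sample_var(s::RunningStats) = s.n > 1 ? s.m2 / (s.n - 1) : NaN

# Standard error of the mean.
std_error(s::RunningStats) = sqrt(sample_var(s) / s.n)

# Made-up sample values standing in for a real stream.
s = RunningStats()
for x in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
    update!(s, x)
end
println((s.mean, sample_var(s), std_error(s)))
```

The state is three numbers no matter how many observations arrive, which is what lets this kind of accumulator run over a stream without storing it.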

OnlineStats provides a wide range of statistical functions, including mean, variance, standard deviation, quantiles, histograms, and more. It also supports various types of data, such as numerical, categorical, and time series data.

To use OnlineStats, install it by pressing ] at the Julia REPL to enter the package manager and running add OnlineStats. Once installed, you can load the package and create an accumulator object to perform the desired calculations. Here’s a simple example that calculates the mean and standard deviation of a data stream:

using OnlineStats
using Statistics  # mean and std

# Stand-in for a real data stream (made-up values)
data_stream = randn(1000)

# Create an accumulator; Variance tracks the running mean as well
acc = Variance()

# Process data stream
for observation in data_stream
    fit!(acc, observation)
end

# Get the calculated mean and standard deviation
mean_val = mean(acc)
std_dev = std(acc)

In this example, Variance() is an accumulator object provided by OnlineStats; it tracks the running mean alongside the variance. The fit! function updates the accumulator with each observation from the data stream. Finally, you can retrieve the calculated mean and standard deviation using the mean and std functions, respectively. To track several statistics at once, group them in a Series, e.g. Series(Mean(), Variance()).
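Since the original question also asked about the standard error, here is a short sketch of how it could be derived from a Variance accumulator, plus a Series that updates several statistics in one pass. This assumes a recent OnlineStats release; the data values are made up, and the standard error is computed by hand from std and nobs rather than by a dedicated OnlineStats function.

```julia
using OnlineStats
using Statistics  # for mean/std, in case they are not already in scope

# Made-up sample values standing in for a real stream.
data_stream = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Variance tracks the running mean and variance together.
o = Variance()
for x in data_stream
    fit!(o, x)
end

# Standard error of the mean, computed from the running statistics.
std_err = std(o) / sqrt(nobs(o))

# Series updates several statistics in a single pass over the data.
s = Series(Mean(), Variance())
fit!(s, data_stream)
m, v = value(s)  # one value per statistic, in the order given to Series
```

Calling fit! with a whole collection, as in the Series case, is equivalent to fitting its elements one at a time, so the same code works whether the data arrives in batches or singly.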

OnlineStats is a powerful package that allows you to perform various statistical calculations on streaming or large datasets efficiently. It provides similar functionality to the boost accumulator in C++ and is a great choice for real-time or memory-efficient statistical analysis in Julia.

Use this answer at your own discretion.


OnlineStats.jl is exactly what I need. Thanks! (and I feel stupid for not asking GPT first lol)