Step Detection

I would like to detect steps in a signal.
I wonder whether a Julia package exists that can do this job. I found the Python package ruptures, which does the job, but it has two shortcomings:

  • It is slow
  • It cannot handle large vectors

It took me a while to understand why I kept getting crashes: my data
vector is simply too long (in my case it contains more than 3e5 elements).
Is there another package that can do the same on longer vectors as well?

Here is a code snippet for those who are interested in experimenting with this package:

using PyCall, NumPyArrays
rpt = PyCall.pyimport("ruptures")
# --- signal:
n_ = 1000
ones_ = ones(n_,1)
zeros_ = zeros(n_,1)
x_ = repeat(vcat(ones_, zeros_), 12)  # 12 alternating blocks of ones and zeros (24000×1)
signal_ = NumPyArray(x_)

# --- detection
algo = rpt.Pelt(model="rbf").fit(signal_)
result = algo.predict(pen=2)

I’m not an expert in these things, just some search results which might help:

Yes, I saw this package, but it seems to be no longer maintained. Maybe I will give it a try anyhow.
Currently I am writing my own function. The logic is the same: you compare two neighboring moving windows (cohorts), and when the two differ significantly you have probably detected a step.

For detecting the peaks, you could use Peaks.jl

There are a few utilities for change-point detection using matrix profiles implemented in MatrixProfile.jl (GitHub: baggepinnen/MatrixProfile.jl), a package for time-series analysis using the matrix profile in Julia.

The matrix profile is quite useful for a lot of different time series tasks, but can be expensive to compute for very long vectors.

Linking related thread.

Thanks! I also saw your package, but I am struggling to apply it to my use case:
a set-point changes up and down from time to time, and other variables follow it,
some with a time shift, others without. What would be a practical approach with your package to separate cohorts with similar values while cutting out the transition periods?

Do you have a good dynamical model of how the system behaves? If so, something like a Kalman filter is likely a better approach for multivariate time series. By monitoring the prediction error of the Kalman filter, you detect a step when the prediction error becomes large in terms of the posterior covariance matrix. A quick google search returned tons of results for

kalman filter for change point detection

This review article covers this use case.

Kalman filters, and other state estimators, are implemented in several Julia packages.
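
To make the prediction-error idea concrete, here is a minimal sketch in plain Julia, not taken from any of the packages mentioned. It assumes a scalar random-walk model (the state stays constant apart from small process noise), which is only a stand-in for a real dynamical model. The function name kalman_steps and the parameters q, r and nsigma are made up for this illustration. A step is flagged when the squared prediction error exceeds nsigma² times its predicted variance, and the filter is then re-initialized at the new level.

```julia
# Scalar Kalman filter with an assumed random-walk model:
#   x[k] = x[k-1] + w,  w ~ N(0, q)   (process noise)
#   y[k] = x[k]   + v,  v ~ N(0, r)   (measurement noise)
# kalman_steps, q, r and nsigma are hypothetical names for this sketch.
function kalman_steps(y::AbstractVector{<:Real}; q::Real=1e-6, r::Real=1e-2, nsigma::Real=5.0)
    xhat = float(y[1])   # state estimate
    p = float(r)         # variance of the state estimate
    steps = Int[]
    for k in 2:length(y)
        p_pred = p + q           # predicted state variance
        e = y[k] - xhat          # prediction error (innovation)
        s = p_pred + r           # predicted variance of the innovation
        if e^2 > nsigma^2 * s
            # Innovation is too large for the model: report a step and restart
            push!(steps, k)
            xhat = float(y[k])
            p = float(r)
        else
            g = p_pred / s       # Kalman gain
            xhat += g * e
            p = (1 - g) * p_pred
        end
    end
    return steps
end

# Synthetic test signal: three plateaus with a small deterministic ripple
y_demo = vcat(fill(1.0, 200), fill(0.0, 200), fill(1.0, 200)) .+ 0.01 .* sin.(1:600)
demo_steps = kalman_steps(y_demo)
```

For a multivariate system with known dynamics, the same test would use the innovation vector and its covariance matrix instead of the scalars e and s.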

Thanks for the link!
Yesterday I also stumbled over the keyword Kalman filter a few times.
Today I need something quick and dirty: I have an idea of the length of the transition periods and of the minimum duration of the steady periods.
I hope I find the time to study the Kalman filter approach later on as well.

This one is really slow!!!

Here is my quick and dirty approach. With the tuning parameter indx_offset I can shift the step indicator in the right direction, if needed:

# x_ and y_ are given; x_ might be the time vector.
using Statistics  # provides mean

# Compare the means of two neighboring moving windows (cohorts) and
# report a step where they differ by more than Δthrhld.
function step_changes(_signal::AbstractVector; _cohort_length::Int=100, Δthrhld::Number=0.0125, _step_gap::Int=5, _Δstep::Int=100, indx_offset::Int=10)
    _steps = Int[]
    wait_counter = 0  # samples to skip before the next step may be reported
    for i_step = 1:length(_signal) - (2 * _cohort_length + _step_gap)
        wait_counter = max(0, wait_counter - 1)
        cohort_end = i_step + _cohort_length - 1
        cohort_a = @view _signal[i_step:cohort_end]
        cohort_start = cohort_end + _step_gap
        cohort_b = @view _signal[cohort_start:cohort_start + _cohort_length - 1]
        Δmean = abs(mean(cohort_a) - mean(cohort_b))
        if Δmean > Δthrhld && wait_counter == 0
            # indx_offset shifts the reported index relative to the end of cohort_a
            push!(_steps, i_step + _cohort_length - indx_offset)
            wait_counter = _Δstep  # suppress detections for the next _Δstep samples
        end
    end
    return _steps
end

step_changes_starts = step_changes(y_; _cohort_length = 100, Δthrhld = 0.01, _Δstep = 10000, indx_offset=50)
time_stamps_power_up = x_[step_changes_starts]

step_changes_stops = step_changes(reverse(y_); _cohort_length = 100, Δthrhld = 0.01, _Δstep = 10000, indx_offset=60)
time_stamps_power_down = reverse(x_)[step_changes_stops]

By calling the function twice, once on the signal and once on the reversed signal, I can distinguish between the start and the stop index of one stair tread.
For my data it works perfectly with the correct parameters.
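
For very long vectors, the window means do not have to be recomputed from scratch at every index. Here is a sketch of a variant that uses a cumulative sum so each window mean costs O(1). The name step_changes_cumsum and the renamed keywords are made up for this illustration; the detection rule is meant to match the two-window comparison described above, not to replace it with something different.

```julia
# Two-window step detector using a cumulative sum (hypothetical variant).
# Keywords: cohort_length (window size), threshold (minimum difference of means),
# step_gap (offset between the windows), min_distance (samples to wait after a
# detection), index_offset (shift of the reported index).
function step_changes_cumsum(signal::AbstractVector{<:Real}; cohort_length::Int=100,
        threshold::Real=0.0125, step_gap::Int=5, min_distance::Int=100, index_offset::Int=10)
    c = cumsum(vcat(0.0, signal))                   # c[i + 1] == sum(signal[1:i])
    window_mean(a, b) = (c[b + 1] - c[a]) / (b - a + 1)  # mean of signal[a:b]
    steps = Int[]
    wait_counter = 0
    for i in 1:length(signal) - (2 * cohort_length + step_gap)
        wait_counter = max(0, wait_counter - 1)
        a_end = i + cohort_length - 1
        b_start = a_end + step_gap
        Δmean = abs(window_mean(i, a_end) - window_mean(b_start, b_start + cohort_length - 1))
        if Δmean > threshold && wait_counter == 0
            push!(steps, a_end + 1 - index_offset)
            wait_counter = min_distance
        end
    end
    return steps
end

# Synthetic example: one plateau of ones between two plateaus of zeros
demo_signal = vcat(zeros(500), ones(500), zeros(500))
demo_result = step_changes_cumsum(demo_signal; cohort_length=100, threshold=0.5,
                                  step_gap=5, min_distance=300, index_offset=0)
```

Note that the reported indices sit somewhat before the true edges, because the threshold is crossed as soon as enough of the second window has moved past the step. That is what the offset parameter is there to compensate. Cumulative sums can lose some precision on very long signals with large values, so the results may differ marginally from the direct mean computation.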
