Julia implementation of seqperiod (from MATLAB)

Hi! I’ve been playing with Julia for a while. Suppose you are given a 2000-by10000 Matrix of Float64 type and were told to find the periodicity of each of the 10k columns and store it in some variable. Note that, very close values are assumed to be identical. For that, you are provided with a tolerance value. Here periodicity means the minimum length of cycling…

Ex: 1 2 3 1 2 3 1 2 will give periodicity 3
1.000001 1.999999 1 2 0.999999 will give periodicity 2 for tolerance tol = 0.001.

I have written one code:

function seqper(x; tol =0.001)
    ind1=findall(x-> x<tol,[abs.(x.-x[1])...]); 
    period=length(x)
    for i=2:length(ind1)
        if (maximum(abs.(x[ind1[i]:end].-x[1:end-ind1[i]+1]))<=tol)
        period=ind1[i]-1
        break
        end
    end 
    return period
end

Can anyone help me with an performant implementation?

Why I asked: I am finding reasons for migrating from MATLAB to Julia i.e. translating all my MATLAB codes to Julia and check if it gives me more speed. Recently, I am working on a discrete dynamical system (DDS). I need to solve the system for a range of parameter values (say 100) and (trajectory in DynamicalSystems) and store the last few iterations (say 2000) of one state variable in an Array. This gives me a 2000-by-100 array. I need to find the periodicity of these 100 columns.

vec(abs.(x .- x[1])) is much better, you don’t want to splat huge amount of elements

@views maximum(abs, x[ind1[i]:end] .- x[1:end-ind1[i]+1]) should reduce some allocation for you. You can also turn this into a loop to eliminate allocation.

also I don’t quite understand your algorithm, I think it may be not well defined what you want to find:

julia> seqper([1,2,1,2,3, 1,2,1,2,3,4])
ind1 = [1, 3, 6, 8]
11

julia> seqper([1,2,1,2, 1,2,1,2,3])
ind1 = [1, 3, 5, 7]
9

I’d expect periodicity to be 5 in first case and 2 or 4 in second

A quick fix for some allocations:

function seqper(x; tol =0.001)
    m = period = length(x)
    ind1 = findall(abs(x[i]-x[1]) < tol for i = 1:m)
    for i = 2:length(ind1)
        period = ind1[i] - 1
        any(abs(x[j]-x[j-ind1[i]+1]) <= tol for j = ind1[i]:m) && return period
    end
    return period
end
1 Like

In my use-case, I have consider a time-series data and see whether it repeats periodically or not. The last element ‘3’ in your first example breaks that repeating sequence.

why 12123 as a whole can’t be a pattern? does your data guaranteed not to “cross” itself in the middle of a pattern?

As far as I know, yes. I have never seen any irregularity and the numerical solutions are quite smooth.

Jupyter notebook: https://github.com/genkuroki/public/blob/main/0017/seqper.ipynb

Generate test data X and Y:

using BenchmarkTools

function testvec(n, per; tol=1e-3)
    X = [float(mod1(k, per)) for k in 1:n]
    @. X + tol * (rand() - 0.5)
end

X = testvec(10^4, 999);
a = zeros(length(X));
a[end] = 2e-3
Y = X + a;

Original version:

function seqper(x; tol=0.001)
    ind1 = findall(≤(tol), [abs.(x .- x[1])...])
    period = length(x)
    for i = 2:length(ind1)
        if maximum(abs.(x[ind1[i]:end] .- x[1:end-ind1[i]+1])) ≤ tol
            period = ind1[i] - 1
            break
        end
    end 
    return period
end

@show seqper(X)
@btime seqper($X)
@show seqper(Y)
@btime seqper($Y);

Results:

seqper(X) = 999
  228.700 μs (10016 allocations: 602.69 KiB)
seqper(Y) = 10000
  314.900 μs (10061 allocations: 1.42 MiB)

My revised simple zero-allocated version:

function seqper1(x; tol=1e-3)
    @inbounds for k in 2:length(x)
        if abs(x[k] - x[1]) ≤ tol
            all(j -> abs(x[j] - x[j-k+1]) ≤ tol, k:lastindex(x)) && return k - 1
        end
    end 
    return length(x)
end

@show seqper1(X)
@btime seqper1($X)
@show seqper1(Y)
@btime seqper1($Y);

Results:

seqper1(X) = 999
  6.640 μs (0 allocations: 0 bytes)
seqper1(Y) = 10000
  36.400 μs (0 allocations: 0 bytes)

Case X: 230 μs → 6.64 μs
Case Y: 315 μs → 36.4 μs

In optimization, aiming for zero-allocation is very important in many cases (Measure performance with @time and pay attention to memory allocation).

4 Likes

Thank you @Seif_Shebl . With some minor changes to your code, I got 5x speed.

function seqper(x; tol =0.001)
    m = period =length(x)
    ind1 = findall([abs(x[i]-x[1]) <= tol for i = 1:m]); 
    for i=2:length(ind1)
        period=ind1[i]-1
        all([abs(x[j]-x[j-ind1[i]+1])<=tol for j = ind1[i]:m]) && return period
    end 
    return period
end

Replaced ‘any’ with ‘all’ and added a couple of square brackets inside ‘findall’.

Thanks a lot, @genkuroki . By far this one is the fastest. I will check the links you provided. It’s a huge relief. Here is a part of my code :

         NTr=10000;
         NIter = 2000;
         xpts = ytps = 101
         par1range=range(1, 9, length = xpts)
        par2range = range(0, 1, length = ytps)
        sol_last =zeros(Float64, NIter, xpts*ypts);
        @sync Threads.@threads for i = 1:xpts 
            # println("i = $(i)")
            for  j = 1:ypts
                tr = trajectory(vigi_dis(init; b=par1range[i], v=par2range[j]), NIter; Ttr = NTr);
                sol_last[:,(i-1)*xpts+j] = tr[2:end,1];
            end
        end
       
        @time periods = seqper1.(eachcol(sol_last),tol=0.001)  

        # ss = mxarray(sol_last) # @time ss = seqper.(eachcol(sol_last),tol=0.001) 
        # xdata, ydata = mxcall(:meshgrid, 2, par1range, par2range)
        # mxcall(:seqperiod, 1, ss)

    

It’s working faster than ever. Can it improved more?

Edit: Sorry, TAB key published my unfinished writing.

you don’t need this.

use @view tr[2:end,1].

depending how heavy this computation is, you can also multi-thread this

@jling , Using @view gives error:

                sol_last[:,(i-1)*xpts+j] = @view tr[2:end,1];
ERROR: TaskFailedException

    nested task error: ArgumentError: Dataset views only accept indices on one dimension
    Stacktrace:
     [1] view(::Dataset{2, Float64}, ::UnitRange{Int64}, ::Int64)
       @ DelayEmbeddings C:\Users\MHBook\.julia\packages\DelayEmbeddings\DdOZb\src\subdataset.jl:40
     [2] macro expansion
       @ c:\Users\MHBook\Desktop\ljul\isoperiodic2.jl:49 [inlined]
     [3] (::var"#445#threadsfor_fun#138"{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, Matrix{Float64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, UnitRange{Int64}})(onethread::Bool)
       @ Main .\threadingconstructs.jl:81
     [4] (::var"#445#threadsfor_fun#138"{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, Matrix{Float64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, UnitRange{Int64}})()
       @ Main .\threadingconstructs.jl:48
Stacktrace:

wait at .\task.jl

threading_run(func::Function) at .\threadingconstructs.jl

macro expansion at .\threadingconstructs.jl

top-level scope at c:\Users\MHBook\Desktop\ljul\isoperiodic2.jl

Threads.@threads [schedule] for ... end
A macro to parallelize a for loop to run with multiple threads. Splits the iteration space among multiple tasks and runs those tasks on threads according to a scheduling policy. A barrier is placed at the end of the loop which waits for all tasks to finish execution.

The schedule argument can be used to request a particular scheduling policy. The only currently supported value is :static, which creates one task per thread and divides the iterations equally among them. Specifying :static is an error if used from inside another @threads loop or from a thread other than 1.

The default schedule (used when no schedule argument is present) is subject to change.

!!! compat "Julia 1.5" The schedule argument is available as of Julia 1.5.

Base.Threads.@threads is a Function. 1 method for function Base.Threads.@threads

@threads(__source__::Core.LineNumberNode, __module__::Core.Module, args::Core.Vararg where N) in Threads at C:\Users\MHBook\AppData\Local\Programs\Julia-1.6.1\share\julia\base\threadingconstructs.jl:118

Any documentation on how and when to use it properly?

oh well I didn’t know that’s not a Julia matrix…forget about it then