Julia implementation of seqperiod (from MATLAB)

MathGuy · August 21, 2021, 8:56pm

Hi! I’ve been playing with Julia for a while. Suppose you are given a 2000-by10000 Matrix of Float64 type and were told to find the periodicity of each of the 10k columns and store it in some variable. Note that, very close values are assumed to be identical. For that, you are provided with a tolerance value. Here periodicity means the minimum length of cycling…

Ex: 1 2 3 1 2 3 1 2 will give periodicity 3
1.000001 1.999999 1 2 0.999999 will give periodicity 2 for tolerance tol = 0.001.

I have written one code:

function seqper(x; tol =0.001)
    ind1=findall(x-> x<tol,[abs.(x.-x[1])...]); 
    period=length(x)
    for i=2:length(ind1)
        if (maximum(abs.(x[ind1[i]:end].-x[1:end-ind1[i]+1]))<=tol)
        period=ind1[i]-1
        break
        end
    end 
    return period
end

Can anyone help me with an performant implementation?

Why I asked: I am finding reasons for migrating from MATLAB to Julia i.e. translating all my MATLAB codes to Julia and check if it gives me more speed. Recently, I am working on a discrete dynamical system (DDS). I need to solve the system for a range of parameter values (say 100) and (trajectory in DynamicalSystems) and store the last few iterations (say 2000) of one state variable in an Array. This gives me a 2000-by-100 array. I need to find the periodicity of these 100 columns.

jling · August 21, 2021, 10:02pm

vec(abs.(x .- x[1])) is much better, you don’t want to splat huge amount of elements

@views maximum(abs, x[ind1[i]:end] .- x[1:end-ind1[i]+1]) should reduce some allocation for you. You can also turn this into a loop to eliminate allocation.

jling · August 21, 2021, 10:11pm

also I don’t quite understand your algorithm, I think it may be not well defined what you want to find:

julia> seqper([1,2,1,2,3, 1,2,1,2,3,4])
ind1 = [1, 3, 6, 8]
11

julia> seqper([1,2,1,2, 1,2,1,2,3])
ind1 = [1, 3, 5, 7]
9

I’d expect periodicity to be 5 in first case and 2 or 4 in second

Seif_Shebl · August 21, 2021, 10:11pm

A quick fix for some allocations:

function seqper(x; tol =0.001)
    m = period = length(x)
    ind1 = findall(abs(x[i]-x[1]) < tol for i = 1:m)
    for i = 2:length(ind1)
        period = ind1[i] - 1
        any(abs(x[j]-x[j-ind1[i]+1]) <= tol for j = ind1[i]:m) && return period
    end
    return period
end

MathGuy · August 21, 2021, 10:26pm

In my use-case, I have consider a time-series data and see whether it repeats periodically or not. The last element ‘3’ in your first example breaks that repeating sequence.

jling · August 21, 2021, 10:32pm

why 12123 as a whole can’t be a pattern? does your data guaranteed not to “cross” itself in the middle of a pattern?

MathGuy · August 21, 2021, 10:41pm

As far as I know, yes. I have never seen any irregularity and the numerical solutions are quite smooth.

genkuroki · August 21, 2021, 11:03pm

Jupyter notebook: https://github.com/genkuroki/public/blob/main/0017/seqper.ipynb

Generate test data X and Y:

using BenchmarkTools

function testvec(n, per; tol=1e-3)
    X = [float(mod1(k, per)) for k in 1:n]
    @. X + tol * (rand() - 0.5)
end

X = testvec(10^4, 999);
a = zeros(length(X));
a[end] = 2e-3
Y = X + a;

Original version:

function seqper(x; tol=0.001)
    ind1 = findall(≤(tol), [abs.(x .- x[1])...])
    period = length(x)
    for i = 2:length(ind1)
        if maximum(abs.(x[ind1[i]:end] .- x[1:end-ind1[i]+1])) ≤ tol
            period = ind1[i] - 1
            break
        end
    end 
    return period
end

@show seqper(X)
@btime seqper($X)
@show seqper(Y)
@btime seqper($Y);

Results:

seqper(X) = 999
  228.700 μs (10016 allocations: 602.69 KiB)
seqper(Y) = 10000
  314.900 μs (10061 allocations: 1.42 MiB)

My revised simple zero-allocated version:

function seqper1(x; tol=1e-3)
    @inbounds for k in 2:length(x)
        if abs(x[k] - x[1]) ≤ tol
            all(j -> abs(x[j] - x[j-k+1]) ≤ tol, k:lastindex(x)) && return k - 1
        end
    end 
    return length(x)
end

@show seqper1(X)
@btime seqper1($X)
@show seqper1(Y)
@btime seqper1($Y);

Results:

seqper1(X) = 999
  6.640 μs (0 allocations: 0 bytes)
seqper1(Y) = 10000
  36.400 μs (0 allocations: 0 bytes)

Case X: 230 μs → 6.64 μs
Case Y: 315 μs → 36.4 μs

In optimization, aiming for zero-allocation is very important in many cases (Measure performance with @time and pay attention to memory allocation).

MathGuy · August 21, 2021, 11:10pm

Thank you @Seif_Shebl . With some minor changes to your code, I got 5x speed.

function seqper(x; tol =0.001)
    m = period =length(x)
    ind1 = findall([abs(x[i]-x[1]) <= tol for i = 1:m]); 
    for i=2:length(ind1)
        period=ind1[i]-1
        all([abs(x[j]-x[j-ind1[i]+1])<=tol for j = ind1[i]:m]) && return period
    end 
    return period
end

Replaced ‘any’ with ‘all’ and added a couple of square brackets inside ‘findall’.

MathGuy · August 21, 2021, 11:18pm

Thanks a lot, @genkuroki . By far this one is the fastest. I will check the links you provided. It’s a huge relief. Here is a part of my code :

         NTr=10000;
         NIter = 2000;
         xpts = ytps = 101
         par1range=range(1, 9, length = xpts)
        par2range = range(0, 1, length = ytps)
        sol_last =zeros(Float64, NIter, xpts*ypts);
        @sync Threads.@threads for i = 1:xpts 
            # println("i = $(i)")
            for  j = 1:ypts
                tr = trajectory(vigi_dis(init; b=par1range[i], v=par2range[j]), NIter; Ttr = NTr);
                sol_last[:,(i-1)*xpts+j] = tr[2:end,1];
            end
        end
       
        @time periods = seqper1.(eachcol(sol_last),tol=0.001)  

        # ss = mxarray(sol_last) # @time ss = seqper.(eachcol(sol_last),tol=0.001) 
        # xdata, ydata = mxcall(:meshgrid, 2, par1range, par2range)
        # mxcall(:seqperiod, 1, ss)

It’s working faster than ever. Can it improved more?

Edit: Sorry, TAB key published my unfinished writing.

jling · August 21, 2021, 11:32pm

you don’t need this.

use @view tr[2:end,1].

depending how heavy this computation is, you can also multi-thread this

MathGuy · August 21, 2021, 11:45pm

@jling , Using @view gives error:

                sol_last[:,(i-1)*xpts+j] = @view tr[2:end,1];

ERROR: TaskFailedException

    nested task error: ArgumentError: Dataset views only accept indices on one dimension
    Stacktrace:
     [1] view(::Dataset{2, Float64}, ::UnitRange{Int64}, ::Int64)
       @ DelayEmbeddings C:\Users\MHBook\.julia\packages\DelayEmbeddings\DdOZb\src\subdataset.jl:40
     [2] macro expansion
       @ c:\Users\MHBook\Desktop\ljul\isoperiodic2.jl:49 [inlined]
     [3] (::var"#445#threadsfor_fun#138"{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, Matrix{Float64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, UnitRange{Int64}})(onethread::Bool)
       @ Main .\threadingconstructs.jl:81
     [4] (::var"#445#threadsfor_fun#138"{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, Matrix{Float64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, UnitRange{Int64}})()
       @ Main .\threadingconstructs.jl:48
Stacktrace:

wait at .\task.jl

threading_run(func::Function) at .\threadingconstructs.jl

macro expansion at .\threadingconstructs.jl

top-level scope at c:\Users\MHBook\Desktop\ljul\isoperiodic2.jl

Threads.@threads [schedule] for ... end
A macro to parallelize a for loop to run with multiple threads. Splits the iteration space among multiple tasks and runs those tasks on threads according to a scheduling policy. A barrier is placed at the end of the loop which waits for all tasks to finish execution.

The schedule argument can be used to request a particular scheduling policy. The only currently supported value is :static, which creates one task per thread and divides the iterations equally among them. Specifying :static is an error if used from inside another @threads loop or from a thread other than 1.

The default schedule (used when no schedule argument is present) is subject to change.

!!! compat "Julia 1.5" The schedule argument is available as of Julia 1.5.

Base.Threads.@threads is a Function. 1 method for function Base.Threads.@threads

@threads(__source__::Core.LineNumberNode, __module__::Core.Module, args::Core.Vararg where N) in Threads at C:\Users\MHBook\AppData\Local\Programs\Julia-1.6.1\share\julia\base\threadingconstructs.jl:118

Any documentation on how and when to use it properly?

jling · August 21, 2021, 11:50pm

github.com

JuliaDynamics/DelayEmbeddings.jl/blob/458166d507f27552979e26dcc08072db053ffb0c/src/subdataset.jl#L37


      
          
          Base.parent(sd::SubDataset) = sd.parent
          Base.parentindices(sd::SubDataset) = parentindices(sd.data)
          
          """
              view(d::Dataset, indices)
          
          Return a view into the parent dataset `d`, as a [`SubDataset`](@ref)
          that contains the datapoints of `d` referred to by `indices`.
          """
          Base.view(d::AbstractDataset, i) = SubDataset(d, view(d.data, i))
          
          function Base.view(::AbstractDataset, ::Any, ::Any, ::Vararg)
              throw(ArgumentError("Dataset views only accept indices on one dimension"))
          end

oh well I didn’t know that’s not a Julia matrix…forget about it then

Topic		Replies	Views
Solving difference equation: Part 2 Performance difference-equations	4	931	August 29, 2021
Simple speed comparison with Matlab New to Julia question	33	1262	May 6, 2025
Julia is significantly slower (~18 x) than Matlab in vector and matrix algebra New to Julia	32	1924	June 25, 2023
Pairwise computation slower than Python (Cython) code (BallTree very slow!) Performance question , tree	27	2064	June 18, 2021
Julia vs Matlab - building a Jacobian matrix New to Julia	30	4226	March 28, 2019

Julia implementation of seqperiod (from MATLAB)

Related topics