[ANN] UcrDtw.jl: a "fast" implementation of Dynamic Time Warping

I don’t think this is true. The registry contains a git URL, but I don’t think it cares where that URL is hosted. Sean won’t be able to use Attobot to do releases (which is pretty convenient for maintainers) but that shouldn’t affect user’s ability to install the package as normal.

9 Likes

This is correct, and things will eventually change so that Gitlab and other git platforms are supported for automatic merging into General.

2 Likes

8 posts were split to a new topic: Licensing of UcrDtw.jl (split from its announcement)

I think everyone would agree that ultimately we are going to need a way to register and update packages that are not on github, regardless.

6 Likes

I’m super glad to see more DTW support. DTW is really useful in a lot of audio processing contexts, and the other implementations don’t seem like they’re maintained, so this is a welcome development. Thanks for putting the work in!

3 Likes

Very interesting!

I have several questions:

  1. How do you handle noisy edges, when S is a squeezed version of Q?

  2. According to the project description,

Return the maximum distance encountered during the search. This allows for mapping the distance values onto a similarity range [0, 1].

What other normalization methods do you use to threshold the resulted similarity? How about transforming DTW distance to the correlation coefficient? I know that Euclidean distance can be easily transformed to it, given running means, std and window length.

  1. What’s the speed difference comparing to the original C implementation?

  2. What about smoothing path projections, like shapeDTW, in particular DTW of subsequences / motifs?

1 Like

That isn’t covered yet, but I did find a promising paper for this use-case, which I document in an issue.

The goal of the similarity was to give a more human-readable number. Do people generally intuitively understand the correlation coefficient?

Working on that. First, I want to finish the multi-variate code, then I’ll start the comparison.

I’ve never heard of shapeDTW! It seems quite promising. I’ll curious how it compares to Matrix Profile, since they both seem concerned with sub-sequences. However, it does seem easier to implement than Matrix Profile…

2 Likes

As far as I know, the main idea of this is that you can transform your series into some feature series/sequence and then optimize distance in feature space, just like multi-variate DTW.

One simple feature series is to pick not just one i-th sample, but some short motif of neighbouring samples (a region [i-d : i+d]). More than that, this region can be further decimated just to 3-dot motif: x[i-d], x[i], x[i+d] (if it is smooth enough). See my implementation of this, maybe it will help optimizing it:

Source code
@inline eucdist(x, y) = abs2(x-y) #(x-y)*(x-y)
@inline absdist(x, y) = abs(x-y)

# multi-variate version
function _dtw!(D::Matrix{T}, x1::AbstractArray{T}, x2::AbstractArray{T}, w::Int, dist = eucdist) where T<:Real
    len1, len2 = size(x1, 1), size(x2, 1)
    d1 = ndims(x1) == 1 ? 1 : size(x1, 2)
    d  = ndims(x2) == 1 ? 1 : size(x2, 2)
    d == d1 || error("DTW: Dimensions mismatch")
    size(D,1) >= len1 && size(D,2) >= len2 || error("DTW: Maximum length exceeded") # проверка превышения длины
    # dist = eucdist

    # Initialize first column and first row
    D[1,1] = dist(x1[1,1], x2[1,1])
    @inbounds for k=2:d
        D[1,1] += dist(x1[1,k], x2[1,k])
    end
    prevD = D[1,1]
    @inbounds for r=2:len1
        c = max(r-w, 1); # c = 1
        D[r,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) #(x1[r] - x2[1])*(x1[r] - x2[1])
        end
        prevD = D[r,c]
    end
    prevD = D[1,1]
    @inbounds for c=2:len2
        r = max(c-w, 1); # c = 1
        D[1,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) #(x1[1] - x2[c])*(x1[1] - x2[c])
        end
        prevD = D[r,c]
    end
    # Complete the cost matrix
    @inbounds for c=2:len2, r = max(c-w+1, 2) : min(c+w-1, len1) # r=2:len1
        D[r,c] = min(D[r-1,c], D[r-1,c-1], D[r,c-1])
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) # (x1[r] - x2[c])*(x1[r] - x2[c])
        end
    end
    D[len1, len2]
end

# 3-dot version
function _dtw!(D::Matrix{T}, x1::AbstractArray{T}, x2::AbstractArray{T}, w::Int, dt::Int, dist = eucdist) where T<:Real
    len1, len2 = size(x1, 1), size(x2, 1)
    d1 = ndims(x1) == 1 ? 1 : size(x1, 2)
    d  = ndims(x2) == 1 ? 1 : size(x2, 2)
    d == d1 || error("DTW: Dimensions mismatch")
    size(D,1) >= len1 && size(D,2) >= len2 || error("DTW: Maximum length exceeded") # проверка превышения длины

    dt2 = 2*dt
    len1 -= dt2
    len2 -= dt2

    # Initialize first column and first row
    D[1,1] = dist(x1[1,1], x2[1,1]) + dist(x1[1+dt,1], x2[1+dt,1]) + dist(x1[1+dt2,1], x2[1+dt2,1])
    @inbounds for k=2:d
        D[1,1] += dist(x1[1,k], x2[1,k]) + dist(x1[1+dt,k], x2[1+dt,k]) + dist(x1[1+dt2,k], x2[1+dt2,k])
    end
    prevD = D[1,1]
    @inbounds for r=2:len1
        c = max(r-w, 1); # c = 1
        D[r,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
        end
        prevD = D[r,c]
    end
    prevD = D[1,1]
    @inbounds for c=2:len2
        r = max(c-w, 1); # r = 1
        D[r,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
        end
        prevD = D[r,c]
    end
    # Complete the cost matrix
    @inbounds for c=2:len2
        for r = max(c-w+1, 2) : min(c+w-1, len1) # r=2:len1
            D[r,c] = min(D[r-1,c], D[r-1,c-1], D[r,c-1])
            for k=1:d
                D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
            end
        end
    end
    D[len1, len2]
end

What about Matrix Profile, there are many domain-specific performance gains not covered in it (as far as I know):

  • online clustering optimizations,
  • template shape restrictions;
  • template-specific “cheap” rules for prior region exclusion (or interest region selection).

Such optimizations can dramatically improve performance for many concrete applications - in a sense that your number of actual DTW checks tend to num_of_templates x num_of_matches, or even less…

1 Like

So currently multivariate data isn’t supported?

This is the approach I’m familiar from using DTW on audio. Generally you’re not matching up individual samples but some feature vector (usually some kind of spectral features).

Yeah, I haven’t finished writing that for-loop yet…

This method seems really interesting! Can it deal with multivariate timeseries?

Not yet, but I’m working on it. Gotta resolve a small bug first though.

3 Likes

sorry for not reading the entire thread, seems really exciting!

Just merged a PR which now allows for searching a multi-variate time-series!

10 Likes

It seems this package is no longer at the URL in the original post. Is it still available?

Unfortunately, the company I work for wanted to delay open-sourcing the code, so I had to make it private again. :frowning_face:

1 Like

Is it possible to host a copy of the UcrDtw.jl on GitHub? Would be more convenient.

So… @Seanny123 did you retrack the package? The link that the package was originally hosted does not work anymore.

Does anyone have a copy somewhere that I can find? If it was released as open source then we should be able to have a copy on GitHub, license-wise.

2 Likes

Any news on the release of this package? I would be keen on trying it :slight_smile:

2 Likes

I brushed up the abandoned TimeWarp.jl and made some changes to the interface as well as some performance improvements in DynamicAxisWarping.jl. The repo does currently not contain any of the known optimizations for searching for a short pattern in a long time series, but I might add some if I find that I need them it contains most of them by now.

My plans are to experiment with combinations of optimal transport and dynamic time warping, and stuff like soft-DTW are probably going to end up higher on the TODO-list.

Here is an example (notebook) that measures distance between spectrograms by combining dynamic time warping with a transport-based cost along the frequency axis using SpectralDistances.jl.

8 Likes