[ANN] UcrDtw.jl: a "fast" implementation of Dynamic Time Warping

package
announcement

#1

Dynamic Time Warping (DTW) is a way to search a time series for a given pattern. Instead of comparing the pattern to the time series with Euclidean Distance (ED), DTW allows the pattern to stretch or compress in time, which yields more useful matches.
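
(For readers new to DTW, here is a minimal, hedged sketch of the naive algorithm next to plain ED; it is only an illustration, not UcrDtw.jl's API, and all names are made up.)

# Classic O(n*m) DTW with squared pointwise cost.
function naive_dtw(q::AbstractVector, s::AbstractVector)
    n, m = length(q), length(s)
    D = fill(Inf, n + 1, m + 1)
    D[1, 1] = 0.0
    for i in 2:n+1, j in 2:m+1
        cost = (q[i-1] - s[j-1])^2
        # each point may match a stretched/compressed region of the other series
        D[i, j] = cost + min(D[i-1, j], D[i, j-1], D[i-1, j-1])
    end
    return sqrt(D[n+1, m+1])
end

# Euclidean distance, point-by-point, equal lengths only.
euclidean(q, s) = sqrt(sum(abs2, q .- s))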

Typically, DTW is significantly slower than ED if implemented naively. Luckily, Eamonn Keogh has spent a chunk of his career optimizing the heck out of it. This package, UcrDtw.jl, is a re-implementation/translation of the UCR Suite of optimizations, previously published as C code. It fixes some bugs and adds some features.

The term “fast” is in quotation marks, because I haven’t done that much profiling, but I have implemented all the optimizations. Help in this area would be appreciated!
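
(To give a flavour of what those optimizations look like, here is a hedged sketch of one of them, the LB_Keogh lower bound, as I understand it from the UCR papers; it is not lifted from UcrDtw.jl and the names are mine.)

# LB_Keogh: a cheap lower bound on the (squared) DTW distance within a warping window r.
# If the bound already exceeds the best-so-far distance, the expensive DTW call can be skipped.
function lb_keogh(q::AbstractVector, c::AbstractVector, r::Int)
    n = length(q)
    @assert length(c) == n
    lb = 0.0
    for i in 1:n
        lo, hi = max(1, i - r), min(n, i + r)
        U = maximum(@view q[lo:hi])   # upper envelope of q (usually precomputed once per query)
        L = minimum(@view q[lo:hi])   # lower envelope of q
        if c[i] > U
            lb += (c[i] - U)^2
        elseif c[i] < L
            lb += (L - c[i])^2
        end
    end
    return lb
end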


Dynamic Time Warping in Julia
#2

I don’t understand the choice of using GitLab. This means that you won’t be able to register it in the general registry and we will never be able to do ]add UcrDtw. What is the rationale for this?


#3

My company is using GitLab, so I have to use it.


#4

I don’t think this is true. The registry contains a git URL, but I don’t think it cares where that URL is hosted. Sean won’t be able to use Attobot to do releases (which is pretty convenient for maintainers), but that shouldn’t affect users’ ability to install the package as normal.
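
(As a concrete, hedged example of what “install as normal” looks like for an unregistered package; the GitLab path below is only a placeholder, not the package’s actual location:)

pkg> add https://gitlab.com/<group>/UcrDtw.jl.git

# or, equivalently, from the Julia REPL:
using Pkg
Pkg.add(PackageSpec(url="https://gitlab.com/<group>/UcrDtw.jl.git"))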


#5

This is correct, and things will eventually change so that Gitlab and other git platforms are supported for automatic merging into General.


#6

One thing I’m wondering is how this package’s license (GNU LGPL v3) will work with other packages which desire to depend on it. I’m not going to bash your choice of license; I’m a fan of the GPL/LGPL/AGPL license family in general. Is the license that you’ve chosen intended to signify intent for how this package should be used?

P.S. This package looks awesome, and regardless of the license or registry situation, I look forward to giving it a spin! :smile:


#7

I think everyone would agree that ultimately we are going to need a way to register and update packages that are not on github, regardless.


#8

I’m super glad to see more DTW support. DTW is really useful in a lot of audio processing contexts, and the other implementations don’t seem like they’re maintained, so this is a welcome development. Thanks for putting the work in!


#9

I’m not very good at this whole licensing thing. I chose the LGPL because it would let people include the code in a project without restriction (as opposed to the GPL), but would require them to share any improvements they made to the library itself. What usage intent have I signified?


#10

After reading into the LGPL a bit more, I think my original assumptions about how it works are incorrect, so nevermind! :stuck_out_tongue:

For reference: I had somehow assumed that an LGPL-licensed library is only allowed to be used with other non-*GPL code when that code is dynamically linked to the LGPL library (which Julia does not do with package code; it’s more accurate to consider JIT’ing to be static linking). So I was worried that the only other packages that could be used with your package were other *GPL packages. But from my more recent reading, I don’t think that’s how the LGPL works.

EDIT: My confusion was that I conflated the LGPL with the “linking exception” used by some GPL software.


#11

Please consider a MIT license (or the equivalent ISC) like the rest of the ecosystem! That saves every Julian from thinking twice before depending on your nice work!


#12

Definitely will do that for my personal work! This work was done via my employer, Emagin, so this was the most liberal license I could get.


#13

I don’t think that LGPL is meaningfully different from GPL in Julia since we don’t dynamically link anything (although it may allow Julia to dynamically link GPL-incompatible dynamic libraries, e.g. MKL). So basically the effect of this is that any Julia program that uses this package will have to be effectively GPLv3 in the end but may depend on GPL-incompatible dynamic libraries. You may also want to consider the Apache 2.0 license.


#14

Very interesting!

I have several questions:

  1. How do you handle noisy edges, when S is a squeezed version of Q?

  2. According to the project description,

Return the maximum distance encountered during the search. This allows for mapping the distance values onto a similarity range [0, 1].

What other normalization methods do you use to threshold the resulting similarity? How about transforming the DTW distance into a correlation coefficient? I know that Euclidean distance can easily be transformed into it, given running means, stds and the window length (a sketch of that identity follows after this list).

  3. What’s the speed difference compared to the original C implementation?

  4. What about smoothing path projections, like shapeDTW, in particular DTW of subsequences / motifs?
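
For reference, the Euclidean-distance-to-correlation transform mentioned in question 2 is the standard identity d² = 2m·(1 − ρ) for z-normalized windows of length m. Here is a hedged numerical check (helper names are mine, nothing here comes from UcrDtw.jl):

using Statistics

# z-normalize with the population (1/m) standard deviation
znorm(x) = (x .- mean(x)) ./ std(x, corrected=false)

q, s = randn(64), randn(64)
m = length(q)
d2 = sum(abs2, znorm(q) .- znorm(s))   # squared ED of the z-normalized windows
ρ  = cor(q, s)                          # Pearson correlation coefficient
@assert isapprox(d2, 2m * (1 - ρ); rtol=1e-6)

Whether an analogous closed-form transform exists for the DTW distance is exactly the open question being asked here.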


#15

Of course I’m not a lawyer, but my understanding of the LGPL is that it doesn’t require a derived work to be released under the same terms as a whole. The requirements should be to state the license of the LGPL library and to provide any modifications to its source code.


#16

1. That isn’t covered yet, but I did find a promising paper for this use-case, which I document in an issue.

2. The goal of the similarity was to give a more human-readable number. Do people generally have an intuitive understanding of the correlation coefficient?

3. Working on that. First, I want to finish the multi-variate code, then I’ll start the comparison.

4. I’ve never heard of shapeDTW! It seems quite promising. I’m curious how it compares to Matrix Profile, since they both seem concerned with sub-sequences. However, it does seem easier to implement than Matrix Profile…


#17

As far as I know, the main idea of this is that you can transform your series into some feature series/sequence and then optimize distance in feature space, just like multi-variate DTW.

One simple feature series is to pick not just the single i-th sample, but some short motif of neighbouring samples (a region [i-d : i+d]). More than that, this region can be further decimated to just a 3-dot motif: x[i-d], x[i], x[i+d] (if it is smooth enough). See my implementation below; maybe it will help with optimizing it:

Source code
@inline eucdist(x, y) = abs2(x-y) #(x-y)*(x-y)
@inline absdist(x, y) = abs(x-y)

# multi-variate version
function _dtw!(D::Matrix{T}, x1::AbstractArray{T}, x2::AbstractArray{T}, w::Int, dist = eucdist) where T<:Real
    len1, len2 = size(x1, 1), size(x2, 1)
    d1 = ndims(x1) == 1 ? 1 : size(x1, 2)
    d  = ndims(x2) == 1 ? 1 : size(x2, 2)
    d == d1 || error("DTW: Dimensions mismatch")
    size(D,1) >= len1 && size(D,2) >= len2 || error("DTW: Maximum length exceeded") # length bounds check
    # dist = eucdist

    # Initialize first column and first row
    D[1,1] = dist(x1[1,1], x2[1,1])
    @inbounds for k=2:d
        D[1,1] += dist(x1[1,k], x2[1,k])
    end
    prevD = D[1,1]
    @inbounds for r=2:len1
        c = max(r-w, 1); # c = 1
        D[r,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) #(x1[r] - x2[1])*(x1[r] - x2[1])
        end
        prevD = D[r,c]
    end
    prevD = D[1,1]
    @inbounds for c=2:len2
        r = max(c-w, 1); # r = 1
        D[r,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) #(x1[1] - x2[c])*(x1[1] - x2[c])
        end
        prevD = D[r,c]
    end
    # Complete the cost matrix
    @inbounds for c=2:len2, r = max(c-w+1, 2) : min(c+w-1, len1) # r=2:len1
        D[r,c] = min(D[r-1,c], D[r-1,c-1], D[r,c-1])
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) # (x1[r] - x2[c])*(x1[r] - x2[c])
        end
    end
    D[len1, len2]
end

# 3-dot version
function _dtw!(D::Matrix{T}, x1::AbstractArray{T}, x2::AbstractArray{T}, w::Int, dt::Int, dist = eucdist) where T<:Real
    len1, len2 = size(x1, 1), size(x2, 1)
    d1 = ndims(x1) == 1 ? 1 : size(x1, 2)
    d  = ndims(x2) == 1 ? 1 : size(x2, 2)
    d == d1 || error("DTW: Dimensions mismatch")
    size(D,1) >= len1 && size(D,2) >= len2 || error("DTW: Maximum length exceeded") # length bounds check

    dt2 = 2*dt
    len1 -= dt2
    len2 -= dt2

    # Initialize first column and first row
    D[1,1] = dist(x1[1,1], x2[1,1]) + dist(x1[1+dt,1], x2[1+dt,1]) + dist(x1[1+dt2,1], x2[1+dt2,1])
    @inbounds for k=2:d
        D[1,1] += dist(x1[1,k], x2[1,k]) + dist(x1[1+dt,k], x2[1+dt,k]) + dist(x1[1+dt2,k], x2[1+dt2,k])
    end
    prevD = D[1,1]
    @inbounds for r=2:len1
        c = max(r-w, 1); # c = 1
        D[r,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
        end
        prevD = D[r,c]
    end
    prevD = D[1,1]
    @inbounds for c=2:len2
        r = max(c-w, 1); # r = 1
        D[r,c] = prevD
        for k=1:d
            D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
        end
        prevD = D[r,c]
    end
    # Complete the cost matrix
    @inbounds for c=2:len2
        for r = max(c-w+1, 2) : min(c+w-1, len1) # r=2:len1
            D[r,c] = min(D[r-1,c], D[r-1,c-1], D[r,c-1])
            for k=1:d
                D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
            end
        end
    end
    D[len1, len2]
end
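
Not part of the snippet above, but a hedged driver showing how one might call it (workspace shape, window, and dt values are my own assumptions):

# Univariate series of unequal length; the window w must exceed |len1 - len2|
# so that the Sakoe-Chiba band reaches the final cell.
x1, x2 = randn(300), randn(290)
D = zeros(length(x1), length(x2))   # preallocated cost-matrix workspace
w = 20
cost = _dtw!(D, x1, x2, w)          # multi-variate version (here with one dimension)

# Multi-variate series as len × dims matrices (e.g. frames × features)
X1, X2 = randn(300, 4), randn(290, 4)
D2 = zeros(size(X1, 1), size(X2, 1))
cost_mv = _dtw!(D2, X1, X2, w)

# 3-dot motif version with spacing dt = 5
cost3 = _dtw!(D, x1, x2, w, 5)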

As for Matrix Profile, there are many domain-specific performance gains not covered by it (as far as I know):

  • online clustering optimizations;
  • template shape restrictions;
  • template-specific “cheap” rules for prior region exclusion (or interest-region selection).

Such optimizations can dramatically improve performance for many concrete applications, in the sense that the number of actual DTW checks tends toward num_of_templates × num_of_matches, or even fewer…


#18

So currently multivariate data isn’t supported?

This is the approach I’m familiar with from using DTW on audio. Generally you’re not matching up individual samples but some feature vectors (usually some kind of spectral features).
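
(As a hedged illustration of that workflow, here is one way to turn a signal into a frames × features matrix of log-magnitude spectra, which could then be fed to a multivariate DTW such as the _dtw! snippet earlier in the thread. Frame length, hop, and the helper name are all my own choices, not any package’s API.)

using FFTW

# Slice a signal into overlapping frames and take the log-magnitude spectrum of each.
function spectral_features(x::AbstractVector{<:Real}; framelen::Int=256, hop::Int=128)
    @assert length(x) >= framelen
    nframes = fld(length(x) - framelen, hop) + 1
    F = Matrix{Float64}(undef, nframes, framelen ÷ 2 + 1)
    for i in 1:nframes
        frame = x[(i-1)*hop + 1 : (i-1)*hop + framelen]
        F[i, :] = log1p.(abs.(rfft(Float64.(frame))))
    end
    return F
end

Fq = spectral_features(randn(8_000))    # query features, frames × features
Fs = spectral_features(randn(32_000))   # longer series to search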


#19

Yeah, I haven’t finished writing that for-loop yet…