I don’t understand the choice of using GitLab. This means that you won’t be able to register it in the general registry and we will never be able to do ]add UcrDtw
. What is the rationale for this?
[ANN] UcrDtw.jl: a "fast" implementation of Dynamic Time Warping
My company is using GitLab, so I have to use it.
I don’t think this is true. The registry contains a git URL, but I don’t think it cares where that URL is hosted. Sean won’t be able to use Attobot to do releases (which is pretty convenient for maintainers) but that shouldn’t affect user’s ability to install the package as normal.
This is correct, and things will eventually change so that Gitlab and other git platforms are supported for automatic merging into General.
I think everyone would agree that ultimately we are going to need a way to register and update packages that are not on github, regardless.
I’m super glad to see more DTW support. DTW is really useful in a lot of audio processing contexts, and the other implementations don’t seem like they’re maintained, so this is a welcome development. Thanks for putting the work in!
Very interesting!
I have several questions:

How do you handle noisy edges, when S is a squeezed version of Q?

According to the project description,
Return the maximum distance encountered during the search. This allows for mapping the distance values onto a similarity range [0, 1].
What other normalization methods do you use to threshold the resulted similarity? How about transforming DTW distance to the correlation coefficient? I know that Euclidean distance can be easily transformed to it, given running means, std and window length.

What’s the speed difference comparing to the original C implementation?

What about smoothing path projections, like shapeDTW, in particular DTW of subsequences / motifs?
That isn’t covered yet, but I did find a promising paper for this usecase, which I document in an issue.
The goal of the similarity was to give a more humanreadable number. Do people generally intuitively understand the correlation coefficient?
Working on that. First, I want to finish the multivariate code, then I’ll start the comparison.
I’ve never heard of shapeDTW! It seems quite promising. I’ll curious how it compares to Matrix Profile, since they both seem concerned with subsequences. However, it does seem easier to implement than Matrix Profile…
As far as I know, the main idea of this is that you can transform your series into some feature series/sequence and then optimize distance in feature space, just like multivariate DTW.
One simple feature series is to pick not just one ith sample, but some short motif of neighbouring samples (a region [id : i+d]). More than that, this region can be further decimated just to 3dot motif: x[id], x[i], x[i+d] (if it is smooth enough). See my implementation of this, maybe it will help optimizing it:
Source code
@inline eucdist(x, y) = abs2(xy) #(xy)*(xy)
@inline absdist(x, y) = abs(xy)
# multivariate version
function _dtw!(D::Matrix{T}, x1::AbstractArray{T}, x2::AbstractArray{T}, w::Int, dist = eucdist) where T<:Real
len1, len2 = size(x1, 1), size(x2, 1)
d1 = ndims(x1) == 1 ? 1 : size(x1, 2)
d = ndims(x2) == 1 ? 1 : size(x2, 2)
d == d1  error("DTW: Dimensions mismatch")
size(D,1) >= len1 && size(D,2) >= len2  error("DTW: Maximum length exceeded") # проверка превышения длины
# dist = eucdist
# Initialize first column and first row
D[1,1] = dist(x1[1,1], x2[1,1])
@inbounds for k=2:d
D[1,1] += dist(x1[1,k], x2[1,k])
end
prevD = D[1,1]
@inbounds for r=2:len1
c = max(rw, 1); # c = 1
D[r,c] = prevD
for k=1:d
D[r,c] += dist(x1[r,k], x2[c,k]) #(x1[r]  x2[1])*(x1[r]  x2[1])
end
prevD = D[r,c]
end
prevD = D[1,1]
@inbounds for c=2:len2
r = max(cw, 1); # c = 1
D[1,c] = prevD
for k=1:d
D[r,c] += dist(x1[r,k], x2[c,k]) #(x1[1]  x2[c])*(x1[1]  x2[c])
end
prevD = D[r,c]
end
# Complete the cost matrix
@inbounds for c=2:len2, r = max(cw+1, 2) : min(c+w1, len1) # r=2:len1
D[r,c] = min(D[r1,c], D[r1,c1], D[r,c1])
for k=1:d
D[r,c] += dist(x1[r,k], x2[c,k]) # (x1[r]  x2[c])*(x1[r]  x2[c])
end
end
D[len1, len2]
end
# 3dot version
function _dtw!(D::Matrix{T}, x1::AbstractArray{T}, x2::AbstractArray{T}, w::Int, dt::Int, dist = eucdist) where T<:Real
len1, len2 = size(x1, 1), size(x2, 1)
d1 = ndims(x1) == 1 ? 1 : size(x1, 2)
d = ndims(x2) == 1 ? 1 : size(x2, 2)
d == d1  error("DTW: Dimensions mismatch")
size(D,1) >= len1 && size(D,2) >= len2  error("DTW: Maximum length exceeded") # проверка превышения длины
dt2 = 2*dt
len1 = dt2
len2 = dt2
# Initialize first column and first row
D[1,1] = dist(x1[1,1], x2[1,1]) + dist(x1[1+dt,1], x2[1+dt,1]) + dist(x1[1+dt2,1], x2[1+dt2,1])
@inbounds for k=2:d
D[1,1] += dist(x1[1,k], x2[1,k]) + dist(x1[1+dt,k], x2[1+dt,k]) + dist(x1[1+dt2,k], x2[1+dt2,k])
end
prevD = D[1,1]
@inbounds for r=2:len1
c = max(rw, 1); # c = 1
D[r,c] = prevD
for k=1:d
D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
end
prevD = D[r,c]
end
prevD = D[1,1]
@inbounds for c=2:len2
r = max(cw, 1); # r = 1
D[r,c] = prevD
for k=1:d
D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
end
prevD = D[r,c]
end
# Complete the cost matrix
@inbounds for c=2:len2
for r = max(cw+1, 2) : min(c+w1, len1) # r=2:len1
D[r,c] = min(D[r1,c], D[r1,c1], D[r,c1])
for k=1:d
D[r,c] += dist(x1[r,k], x2[c,k]) + dist(x1[r+dt,k], x2[c+dt,k]) + dist(x1[r+dt2,k], x2[c+dt2,k])
end
end
end
D[len1, len2]
end
What about Matrix Profile, there are many domainspecific performance gains not covered in it (as far as I know):
 online clustering optimizations,
 template shape restrictions;
 templatespecific “cheap” rules for prior region exclusion (or interest region selection).
Such optimizations can dramatically improve performance for many concrete applications  in a sense that your number of actual DTW checks tend to num_of_templates
x num_of_matches
, or even less…
So currently multivariate data isn’t supported?
This is the approach I’m familiar from using DTW on audio. Generally you’re not matching up individual samples but some feature vector (usually some kind of spectral features).
Yeah, I haven’t finished writing that forloop yet…
This method seems really interesting! Can it deal with multivariate timeseries?
Not yet, but I’m working on it. Gotta resolve a small bug first though.
sorry for not reading the entire thread, seems really exciting!
Just merged a PR which now allows for searching a multivariate timeseries!
It seems this package is no longer at the URL in the original post. Is it still available?
Unfortunately, the company I work for wanted to delay opensourcing the code, so I had to make it private again.
Is it possible to host a copy of the UcrDtw.jl on GitHub? Would be more convenient.
So… @Seanny123 did you retrack the package? The link that the package was originally hosted does not work anymore.
Does anyone have a copy somewhere that I can find? If it was released as open source then we should be able to have a copy on GitHub, licensewise.