DamerauLevenshtein() vs Levenshtein(): why the same distance?

I expected Damerau-Levenshtein to give a shorter distance here because two letters are exchanged. Why are the results the same?

julia> compare("martha", "martht", DamerauLevenshtein())
0.8333333333333334

julia> compare("martha", "martht", Levenshtein())
0.8333333333333334

Paul

This does not really look like a Julia question.
At the very least you should specify which package you are calling these functions from, and why you would expect a different result.

I suppose you are referring to StringDistances.jl (but I might be wrong).
By the way, I believe the result is correct: the strings “martha” and “martht” differ only by a substitution of the final ‘a’ with ‘t’, hence you get 1 - 1/6 = 0.8333... in both cases.
Where you do see a difference between the two metrics is when comparing e.g. “martha” and “martah”: the Damerau-Levenshtein metric lets you swap adjacent letters, so the two strings are separated by a single edit; with the Levenshtein metric you have to apply 2 separate substitutions, ‘h’->‘a’ and ‘a’->‘h’, and hence you get a bigger distance.

julia> compare("martha", "martah", DamerauLevenshtein())
0.8333333333333334

julia> compare("martha", "martah", Levenshtein())
0.6666666666666667

Here Levenshtein returns the smaller value because, as stated in the docs, the function compare is defined as 1 minus the normalized distance between two strings; it therefore measures “closeness” between the two strings rather than distance.
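
If it helps to see where the numbers come from, here is a minimal self-contained sketch that does not use the package at all. The names `edit_distance` and `similarity` are made up for illustration. The transposition rule is the simple "restricted" (optimal string alignment) variant, which may not be exactly what the package implements, and normalizing by the length of the longer string is an assumption that happens to match the 1 - 1/6 arithmetic above.

```julia
# Hypothetical helper (not part of any package): edit distance via dynamic programming.
# With transpositions=false this is plain Levenshtein; with transpositions=true
# a swap of two adjacent characters also counts as a single edit
# (restricted / optimal string alignment variant).
function edit_distance(s::AbstractString, t::AbstractString; transpositions::Bool=false)
    a, b = collect(s), collect(t)
    m, n = length(a), length(b)
    # d[i+1, j+1] holds the distance between the first i chars of a and the first j chars of b
    d = zeros(Int, m + 1, n + 1)
    d[:, 1] .= 0:m
    d[1, :] .= 0:n
    for i in 1:m, j in 1:n
        cost = a[i] == b[j] ? 0 : 1
        best = min(d[i, j+1] + 1,    # deletion
                   d[i+1, j] + 1,    # insertion
                   d[i, j] + cost)   # substitution (or match)
        if transpositions && i > 1 && j > 1 && a[i] == b[j-1] && a[i-1] == b[j]
            best = min(best, d[i-1, j-1] + 1)  # adjacent swap
        end
        d[i+1, j+1] = best
    end
    return d[m+1, n+1]
end

# Hypothetical helper: 1 minus the distance normalized by the longer length
# (assumed normalization; two empty strings are treated as identical).
function similarity(s::AbstractString, t::AbstractString; kwargs...)
    len = max(length(s), length(t))
    len == 0 && return 1.0
    return 1 - edit_distance(s, t; kwargs...) / len
end

for other in ("martht", "martah")
    println(other, ": plain = ", similarity("martha", other),
            ", with swaps = ", similarity("martha", other; transpositions=true))
end
```

For “martht” both variants need exactly one substitution, so the two scores coincide; for “martah” the swap rule saves one edit, which is the only case where the two metrics can disagree.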
