Based on past experience, I have a feeling that I’ve horrendously over-engineered a solution to what seems like a simple problem. I have a vector of codes (numbers represented as strings) that I need to ‘collapse’ down. For example, given the following vector:
codes = ["11", "13", "112", "11213", "113", "213", "21345", "21376", "7567", "75679"]
The function should produce this result:
["11", "13", "213", "7567"]
Basically, it should find the shortest codes in the vector, eliminate any longer codes that start with one of them, and then repeat with the next-shortest codes that remain. So “112”, “11213”, and “113” all fall under the “11” category and, as such, should be removed from the vector. “21345” and “21376” fall under the “213” category, so they should be removed, and so on.
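Put another way, a code gets dropped whenever some strictly shorter code in the vector is a prefix of it. Here is a minimal sketch of just that check, using `startswith` from Base. The name `isredundant` is made up for illustration, and this is a brute-force statement of the rule (every code compared against every other), not a tuned solution:

```julia
# Hypothetical helper: true when some strictly shorter code in `codes` is a prefix of `c`
isredundant(c, codes) = any(p -> length(p) < length(c) && startswith(c, p), codes)

codes = ["11", "13", "112", "11213", "113", "213", "21345", "21376", "7567", "75679"]

isredundant("112", codes)  # true: "11" is a prefix of "112"
isredundant("213", codes)  # false: neither "11" nor "13" is a prefix of "213"

# Keep only the codes that no shorter code covers
kept = filter(c -> !isredundant(c, codes), codes)
```

For the sample vector, `kept` comes out as the four codes listed above.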
The Frankenstein recursive function that I’ve written to do this looks like this:
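function collapsecodes(v::Vector{String}, minlength::Int = 0)
    # Prefix length for this pass: the shortest code on the first call, then one longer each time
    m = minlength == 0 ? minimum(length.(v)) : minlength
    # Codes of exactly length m act as the prefixes for this pass
    s = filter(x -> length(x) == m, v)
    diff = setdiff(v, s)
    # Remaining codes that start with one of the prefixes in s
    c = vcat([filter(x -> length(x) >= m && x[1:m] == s[i], diff) for i in 1:length(s)]...)
    a = setdiff(v, c)
    # Stop once a pass removes nothing; otherwise try the next prefix length
    return length(setdiff(v, a)) == 0 ? a : collapsecodes(a, m+1)
end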
# REPL
julia> @time collapsecodes(codes)
0.085379 seconds (167.69 k allocations: 7.989 MiB)
4-element Array{String,1}:
"11"
"13"
"213"
"7567"
I have a feeling one of you amazing Julians can put together a solution that is 10 times more elegant and 10 times more performant.