Advice on over- vs. proper use of type parameters

I am writing an immutable type in Julia and have realized while working on it that I have an opportunity to go a little crazy with the type parameters. Here’s a contrived example that deals with the same problem. I’m curious about advice for when this kind of use of parametric types is worth it and when it might be a bad idea. My own intuitions on the problem, for what they are worth, are below.

This question is a good resource but only a partial answer.

Problem. I want to write an efficient immutable type StringTrie <: AbstractSet{String} that implements a set of strings as a trie. A simple implementation might be like this:

struct TrieNode
    endsword::Bool
    children::Tuple
end
function TrieNode(s::String) 
    if length(s) == 0
        return TrieNode(true, ())
    else 
        return TrieNode(false, (s[1] => TrieNode(s[2:end]),))
    end
end
function _find(n::TrieNode, c::Char)
    for (ii,(k,v)) in enumerate(n.children)
        k == c && return ii
    end
    return 0
end
Base.in(s::String, n::TrieNode) = begin
    length(s) == 0 && return n.endsword
    (c,s) = (s[1], s[2:end])
    ii = _find(n, c)
    return ii == 0 ? false : in(s, n.children[ii][2])
end        
push(n::TrieNode, s::String) = begin
    if length(s) == 0
        return n.endsword ? n : TrieNode(true, n.children)
    else
        (c,s) = (s[1], s[2:end])
        ii = _find(n, c)
        if ii == 0
            return TrieNode(n.endsword, (n.children..., c => TrieNode(s)))
        else
            (k,v) = n.children[ii]
            new_v = push(v, s)
            new_v === v && return n
            ch = n.children
            new_ch = (ch[1:ii-1]..., k=>new_v, ch[ii+1:end]...)
            return TrieNode(n.endsword, new_ch)
        end
    end
end

struct StringTrie <: AbstractSet{String}
    n::Int
    root::Union{TrieNode,Nothing}
end
StringTrie() = StringTrie(0, nothing)
StringTrie(s::Vararg{String,N}) where {N} = reduce(push, s, init=StringTrie())
Base.length(t::StringTrie) = t.n
Base.in(s::String, t::StringTrie) = in(s, t.root)
push(t::StringTrie, s::String) = begin
    t.root === nothing && return StringTrie(1, TrieNode(s))
    new_root = push(t.root, s)
    return new_root === t.root ? t : StringTrie(t.n+1, new_root)
end

A simple test confirms this mostly works:

strings = ["foo", "bar", "baz", "qux", "qu", "ba", "foobar", ""]
st = StringTrie(strings...)
([s in st for s in strings], [reverse(s) in st for s in strings])
#[Out] := (Bool[1, 1, 1, 1, 1, 1, 1, 1], Bool[0, 0, 0, 0, 0, 0, 0, 1])

An alternative to this method might be to lean really further into the type system in order to give the compiler lots of information. For this, one could go with the following code.

struct TrieNode{T <: Tuple}
    endsword::Bool
    children::T
end
function TrieNode(endsword::Bool, children::T) where {T<:Tuple}
    return TrieNode{T}(endsword, children)
end

The rest of this implementation is actually almost identical to the original, but it gives the compiler all the type information it could possibly want. As a consequence, however, the typeof(trie.root) for even a small trie becomes something awful:

strings = ["foo", "bar", "baz", "qux", "qu", "ba", "foobar", ""]
st = StringTrie(strings...)
typeof(st.root)
# [Out] := TrieNode{Tuple{Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{}}}}}}}}}}}}}}}}}},Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{}}},Pair{Char,TrieNode{Tuple{}}}}}}}}},Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{Pair{Char,TrieNode{Tuple{}}}}}}}}}}}

Note, however, that the API of StringTrie remains unharmed by this, so the problem is only potentially a performance one.

My understanding is that this particular implementation of TrieNode should have a completely flat profile of Chars and Bools in memory with none of the data hierarchy explicitly needed, which is a big plus. However, it’s unclear to me how much compile-time I’m sacrificing by going this route. A trie like this with thousands of entries seems untenable, instinctively.

I also suspect that with some clever technomancy one could come up with similar examples where the extent to which the code leans on the compiler is less than in the last case but more than in the first two. What’s the ideal to aim for here? Also, what is the best way to benchmark? (Does @benchmark include/ignore compile time? Can they be benchmarked separately?). If I have code that generates types everywhere, will the llvm eventually garbage collect them or will they build up indefinitely?

Thanks in advance!

1 Like

Very interesting question! I have a partial answer only, but maybe helpful:

My experience with chained types shows that you can get “interactively acceptable” compile times for up to a few hundred total type parameters. E.g. This test of Plugins.jl creates a chained list of 100 types, where each has 4 parameters (one of them links to the next item), and using that it merges 100 small functions into a single one. All of this is done in ~0.67 secs on my machine. For length 1000 it needs ~18 secs.

If the compilation runs only once, as usual, then @benchmark will not be useful. But it seems possible to force compilation using Val types in every round, and get out usable data. For a rough decision about usability, @time is good enough in my opinion.

As I remember, there is no garbage collection of compiled code and types, but recently I tried to search for this info and found nothing.

1 Like

I think, you can avoid excessively long type signatures if you use linked lists or BSTs to store children.

Should be something like

struct List{T}
    data::T
    next::Union{List{T}, Nothing}
end

struct TrieNode
    endsword::Bool
    children::Union{List{Pair{Char, TrieNode}}, Nothing}
end

The performance will not be as good as with tuples, I think, but should compile faster.

1 Like