So I need a data structure implementing a tiny associative array from integers to another type T (which type is not completely clear, but assume lists of integers, implemented as Vector{Int}, for now. Or it could be strings.) The number of (key, value) pairs will be very small (usually 2 and almost never more than 10), so using complex types with hashing (Dict) or trees (SortedDict) is likely not too efficient (I actually use Dict right now and a large part of my time is spent hashing keys).
So stdlib contains SparseArrays, which implements a SparseVector type using a list of keys and a list of values: this is space- and time-efficient and would be perfect for my usage. However, getindex(::SparseVector, non_existent_key) returns zero(T), which in my case is an error.
Of course, I could wrap Vector{Int} inside a struct ZeroableVector <: AbstractVector such that zero(ZeroableVector) returns [], but this seems a bit too ad-hoc; more precisely, since setindex!(::SparseVector, ...) calls iszero, I would also need to add a method to that function, which would conflict with the legitimate method inherited from AbstractVector.
So:
is there a relatively standard structure implementing a tiny associative array? (e.g. a spiritual successor to GitHub - andyferris/AssociativeArray.jl ?) something like LISP’s alists?
otherwise: did somebody already tweak SparseVector for something resembling this? (e.g. by overwriting get, adding a haskey method, etc.)?
I’m not aware of an existing implementation, but I think you could implement your own in a pretty efficient way. If you’re willing to pay the O(N) lookup cost, it could be as simple as:
struct SmallMap{K, V} <: AbstractDict{K, V}
keys::Vector{K}
vals::Vector{V}
SmallMap{K, V}() where {K, V} = new{K, V}(K[], V[])
end
function Base.getindex(m::SmallMap, key)
for (i, k) in enumerate(m.keys)
if k == key
return m.vals[i]
end
end
throw(KeyError(key))
end
function Base.setindex!(m::SmallMap, value, key)
for (i, k) in enumerate(m.keys)
if k == key
m.vals[i] = value
return m
end
end
push!(m.keys, key)
push!(m.vals, value)
return m
end
function Base.iterate(m::SmallMap, i=1)
if i > length(m.keys)
return nothing
else
return (m.keys[i] => m.vals[i], i + 1)
end
end
Base.length(m::SmallMap) = length(m.keys)
Usage:
julia> m = SmallMap{String, Int}()
SmallMap{String, Int64}()
julia> m["hello"] = 1
1
julia> m["world"] = 2
2
julia> m
SmallMap{String, Int64} with 2 entries:
"hello" => 1
"world" => 2
(O(N) is not much of a problem given my values of N indeed…)
Yes, this is more-or-less what I did (also I realized that writing the type was not much longer than my post here; I wrote almost the same as what you did, plus keys(), values() , haskey, a generic constructor, and that’s it.) Thanks for the answer!
You could also sort key-value pairs, and use searchsortedfirst etc for a lookup, which is O(log(N)), plus, of course, the one-time cost of sorting — whether it is worth it depends on your application.