I’d like to announce the StaticDictTrees.jl package. It provides a high-performance dictionary with Tuples as keys, and with the added functionality of providing views into any data subset identified by an incomplete key.
If you’ve ever used a Dict{Tuple, T} and wished you could instantly isolate all entries sharing a specific prefix without an expensive O(N) filter, this package is for you.
An example is worth a million words:
using StaticDictTrees
# Create a static dict tree
metrics = SDTree((:prod, :web, :cpu) => 42.5,
(:prod, :web, :ram) => 78.1,
(:prod, :db, :cpu) => 12.0,
(:stage, :web, :cpu) => 8.4)
# Standard full-key lookup
metrics[:prod, :web, :cpu] # 42.5
# Take a zero-allocation, type-stable view of a sub-branch
prod_web = view(metrics, (:prod, :web))
# The view behaves exactly like a relative AbstractDict ...
prod_web[:ram] # 78.1
# ... and any modification in a view is reflected in the original tree
prod_web[:ram] = 99.
metrics[:prod, :web, :ram] # 99.
Deletion is feasible, it works, its availability do not harm the performance of other operations (lookup, insert and updates), and most importantly it is clearly mentioned that it is by far the slowest operation.
In short: it can be useful to delete just one entry rather than recreating a huge structure from scratch.
So why refusing?
I was not aware of Base._unsetindex!, thanks for te suggestion!
But I want to use the Array exposed interface, rather than relying on internal machinery.
This would be in contrast with the need to always have all the values stored in a contiguous vector.
Well, there possibly is a misunderstanding here: this is not a 5-minute chat with Gemini being copy/pasted and published, it’s something it took me a lot of time to conceive and implement despite the help from AI, and I share it in the hope it is useful to someone else.
Also, I don’t think the code is sloppy at all (but I am surely biased…).
But I understand your concerns, and I am curious to know whether there is a clear guideline on registering / advertising packages whose code has been developed with AI help (besides pointing it out).
But I want to use the Array exposed interface, rather than relying on internal machinery.
You shouldn’t be afraid of using the internal machinery here. Everybody uses something like Base.unset_index! for this job. Without that, it would not be possible to write things like Dict. If you must avoid unexported APIs (e.g. due to policy), then you can ccall – that was exported in 1.0 and therefore will remain valid.
I apologize, I misunderstood the nature of your package. In that case, keep on sharing and having fun! And sorry for me being a spoilsport
(…but I think the sharing should be for educational purposes, not for “use this as dependency”)
It’s not the code, it’s the underlying algorithm / datastructure that looks sloppy.
You can also do deletion by moving: If the key/value at index idx needs to go, then do
oldkey = sdtree.keys[idx]
oldlen = length(sdtree.keys)
movedkey = sdtree.keys[oldlen]
sdtree.keys[idx] = movedkey
sdtree.values[idx] = sdtree.values[oldlen]
pop!(sdtree.keys)
pop!(sdtree.values)
sdtree.lookup[movedkey] = idx
delete!(sdtree.lookup, oldkey)
#do the same for branch_lookup
I modified the delete and prune algorithms to follow @foobar_lv2 suggestion (“deletion by moving”). This destroys the order on the internal values vector but the perfomance improvements are huge!;
The SDTree and SDBranch inherit from AbstractDict hence it is fine for values() to return an iterator;
I also implemented the values_view() method to return a view on the internal vector with correct order (namely the order elements were inserted in the tree).