Get element of a SortedDict by index instead of getting by key

FemtoTrader · December 15, 2016, 4:32pm

Hello,

I have a SortedDict like

julia> using DataStructures: SortedDict
julia> v = SortedDict(Dict{Float64,Float64}(), Base.Forward)
julia> v[1.32]=10
julia> v[1.33]=20
julia> v
DataStructures.SortedDict{Float64,Float64,Base.Order.ForwardOrdering} with 2 entries:
  1.32 => 10.0
  1.33 => 20.0

I’d like to get a value using its index (its position) instead of its key.

How can I do this using Julia ?

Kind regards

rdeits · December 15, 2016, 4:41pm

You can always collect() the keys of the dict to get a vector into which you can index:

keys_as_vector = collect(keys(v))
keys_as_vector[2]

But this allocates a new array to hold the keys, so it may not be the most efficient approach possible.

However, if you’re always accessing the elements by index instead of by key, then perhaps a SortedDict isn’t really the right data structure for your case? Would a vector of Pairs or Tuples work instead?

FemtoTrader · December 15, 2016, 4:46pm

I need to access values both by index and by key so a vector of Pairs or Tuples can’t work.

I’d prefer to avoid to allocates a new array as it can be inefficient.

ChrisRackauckas · December 15, 2016, 5:26pm

I know you can do this with an OrderedDict. I don’t think it’s possible without creating a temporary for the indexing because the internal indexing of a SortedDict is much more difficult. This is because OrderedDict stores everything in arrays while SortedDict stores everything in a BalancedTree. So you have to really step through the tree to find out what the next value is, which is what collecting the iterator does.

Maybe it could be better for you to sort it yourself and use an OrderedDict for this reason? Or maybe find a way to save what the indexes to the BalancedTree mean, and mutate this when a value is added?

kevin.squire · December 16, 2016, 3:53am

Your best bet would probably be to use an OrderedDict, and resort it whenever you need to access an element by position.

If you’re resorting frequently, you should use TimSort from the SortingAlgorithms package, which is faster than other sorts when the data is mostly sorted.

A SortedDict is implemented with a tree, and it might be possible to use a different tree structure which makes indexing by position efficient, by maintaining counts of nodes on each branch, but I’m not specifically familiar with such a tree.

Stephen_Vavasis · December 16, 2016, 4:53am

If your purpose is to be able to quickly answer a query such as: what is the 145th largest item in the SortedDict, then you are out of luck-- the data structure does not currently provide an efficient algorithm for this type of query. (An inefficient algorithm would be: step through items 1 to 144 by advancing a token.) With a substantial amount of code development, it would be possible to support a query like this in O(log n) time. The code development would involving adding a field to all internal tree nodes that keeps track of the number of descendants of the node.

If your purpose is, given a key-value pair already found, store something like an integer so that you can get direct (i.e., O(1)) access to the (k,v) pair again later on, then you should read up on tokens and semitokens in the documentation.

FemtoTrader · December 17, 2016, 8:34am

My goal is to simulate an orderbook (with price as key and volume as value).
So I need to be able to update this orderbook by level ie (for example) add or substract a given volume for a given price.
I also need to calculate price for a given volume so I need to iterate increasing (for asks prices) prices and so volumes up to requested volume.

Stephen_Vavasis · December 18, 2016, 1:26am

If I understand correctly, you have two main operations: given a new (p,v), where p=price and v=volume, you want to insert this into the sorted dict. If there is already (p,v’) in the sorted dict (same p), then instead you wish to update (p,v’) to (p,v+v’)…

The other operation is, given a volume V, you want to find and delete pairs (p,v) starting from lowest to highest p, until the sequence of v’s removed adds up to V.

Both of these are possible with SortedDict. For the first operation, you find p; if found under semitoken s, then you update sd[s], and if not found, the you push! or insert! the new item. For the second operation, you use a while-loop with the startof function and delete! function.

Your username ‘femtotrader’ suggests that you are interested in high-performance software for use in high-speed financial trades. In this case, you should know that SortedDict in Julia is not as fast as map in C++, primarily because the latter has no safeguards or error-checks, at least when used in the default way.

Topic		Replies	Views
Indexing `OrderedDict`s by index instead of key Internals & Design indexing , dictionary	3	2211	March 14, 2022
Values(dict) returns key-value pairs after sorting New to Julia dictionary	5	2494	August 11, 2019
Mutable sorted collections General Usage sort , collection	7	1918	July 20, 2021
How to get two dictionaries with the same ordering of the keys New to Julia question	5	284	June 19, 2023
Sorting nested dictionary General Usage question , sort , dictionary	11	1701	May 15, 2019

Get element of a SortedDict by index instead of getting by key

Related topics