[ANN] DictionaryIndexing.jl

kool7d · April 14, 2021, 10:58am

Dictionary indexing can be useful if, for instance, you don’t remember the key for something but you know its position. Warning: this package commits type piracy, since it adds functionality to existing types (AbstractDicts). So only pirates should use this.

It imports only one package (OrderedCollections) and defines one function, which enables indexing dictionaries using function-call syntax.

Example:

using DictionaryIndexing

d = OrderedDict(:Apl => "apple",
                :Brc => "birch",
                :Cnd => "candle",
                :Drn => "dragon",
                :Exp => "expensive",
                :Frg => "forage",
                :Gra => "grain",
                :Hlt => "health")

dxs = d(2, 4:5, [7,8])

OrderedDict{Symbol,String} with 5 entries:
  :Brc => "birch"
  :Drn => "dragon"
  :Exp => "expensive"
  :Gra => "grain"
  :Hlt => "health"
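
For anyone curious what this selection amounts to without the call syntax, here is a minimal sketch using only OrderedCollections. It is an illustration of the idea, not the package's implementation, and the variable names `ks` and `idx` are just for this example.

```julia
using OrderedCollections

d = OrderedDict(:Apl => "apple", :Brc => "birch", :Cnd => "candle",
                :Drn => "dragon", :Exp => "expensive", :Frg => "forage",
                :Gra => "grain", :Hlt => "health")

# Keys in insertion order, so that a position can be mapped back to a key.
ks = collect(keys(d))

# Flatten the requested positions: 2, 4:5, [7, 8] becomes [2, 4, 5, 7, 8].
idx = reduce(vcat, (2, 4:5, [7, 8]); init = Int[])

# Build a new OrderedDict from the pairs found at those positions.
dxs = OrderedDict(k => d[k] for k in ks[idx])
```

The result holds the same five entries as the example above, in the same order.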

kool7d · April 16, 2021, 3:18am

Example:

using DictionaryIndexing

dd = OrderedDict(:Apl => "apple",
                 :Brc => "birch",
                 :Cnd => "candle",
                 :Drn => "dragon",
                 :Exp => "expensive",
                 :Frg => "forage",
                 :Gra => "grain",
                 :Hlt => "health",
                 :Irn => "irony",
                 :Jak => "jackal")     # length is 10

dxs = dd(2, 4:5, [7,8], 5:length(dd), [8,5])    # instead of `end` we get the last index with `length`

OrderedDict{Symbol, String} with 8 entries:
  :Brc => "birch"        #  2
  :Drn => "dragon"       #  4
  :Exp => "expensive"    #  5
  :Gra => "grain"        #  7
  :Hlt => "health"       #  8
  :Frg => "forage"       #  6
  :Irn => "irony"        #  9
  :Jak => "jackal"       #  10

Overlaps are handled such that the same key => value pair is not added again, so by default the first occurrence is kept. This behavior is inherent to Dicts. If you want to keep the last occurrence instead, use the keyword argument keep = :last (or keep = "last").

dxs = dd(2, 4:5, [7,8], 5:length(dd), [8,5]; keep = :last)

OrderedDict{Symbol, String} with 8 entries:
  :Brc => "birch"        #  2
  :Drn => "dragon"       #  4
  :Frg => "forage"       #  6
  :Gra => "grain"        #  7
  :Irn => "irony"        #  9
  :Jak => "jackal"       #  10
  :Hlt => "health"       #  8
  :Exp => "expensive"    #  5
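
The two orderings can be reproduced on the flattened index list alone. This is a sketch in plain Base Julia of what "keep first" and "keep last" mean here. It makes no claim about how the package does it internally, and the names `keep_first` and `keep_last` are made up for the example.

```julia
# The positions requested in the example, flattened in order (length(dd) is 10).
idx = reduce(vcat, (2, 4:5, [7, 8], 5:10, [8, 5]); init = Int[])

# Keep the first occurrence of each position: this is what `unique` does.
keep_first = unique(idx)

# Keep the last occurrence: deduplicate the reversed list, then reverse back.
keep_last = reverse(unique(reverse(idx)))
```

`keep_first` follows the index comments in the first output above, and `keep_last` follows those in the second.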

If you want to do more complicated things, like filtering the collected indices, you can use the filter keyword with any filtering function.

dd(2, 4:5, [7,8], 5:length(dd), [8,5]; filter = x->in(x,5:6))

OrderedDict{Symbol, String} with 2 entries:
  :Exp => "expensive"    #  5
  :Frg => "forage"       #  6

dd(2, 4:5, [7,8], 5:length(dd), [8,5]; keep = :last, filter = x->in(x,5:6))

OrderedDict{Symbol, String} with 2 entries:
  :Frg => "forage"       #  6
  :Exp => "expensive"    #  5

Ordinarily keep = :last is applied after filtering, but if for some reason you want it to happen before the filter, use keep = :lastbefore. Likewise, if you want to keep the first instance before filtering, use keep = :firstbefore. This may be desirable if your filter function depends on the number of occurrences.
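
To see why the order can matter, here is a small sketch in plain Base Julia. How the package hands indices to the filter is not shown in this post, so the `repeated` helper below is a hypothetical stand-in: a predicate that looks at how often each position was requested in whatever list it is given.

```julia
# The positions requested in the examples above, flattened in order.
idx = reduce(vcat, (2, 4:5, [7, 8], 5:10, [8, 5]); init = Int[])

# Hypothetical multiplicity-dependent filter: keep positions requested more than once.
repeated(xs) = filter(x -> count(==(x), xs) > 1, xs)

# Filter first, then drop duplicates (keeping the first occurrence).
filter_then_dedup = unique(repeated(idx))

# Drop duplicates first, then filter: every count is now 1, so nothing survives.
dedup_then_filter = repeated(unique(idx))
```

With a predicate that ignores multiplicity, such as x->in(x,5:6), both orders select the same positions and only the resulting order can differ.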

gustaphe · April 16, 2021, 5:52am

What’s the piracy, and why?

kool7d · April 16, 2021, 6:00am

It just means that existing types are being used. So instead of making new types or new objects, I am “pirating” the AbstractDict supertype (usually OrderedDict) by making it work in another way, which for big types could clash with some existing system if done haphazardly. Which is why you have to summon it with using DictionaryIndexing.

gustaphe · April 16, 2021, 6:30am

I know what piracy is, I wondered what the piracy was. But okay, you’re defining (d::AbstractDict)(indices).

This would probably be better as getindex(::OrderedDict, ::Integer). It makes no sense for AbstractDicts in general, because they have no order.

And if you want to get items by index, why not use a Vector?

kool7d · April 16, 2021, 6:48am

Then I would have to convert my Dict to a Vector.
I usually use vectors but have more recently been exploring Dictionaries, and it seemed handy.
Also, it doesn’t necessarily matter if the Dict is ordered. It can be used to easily construct arbitrary dictionaries from other dictionaries. I am testing it for plot recipes. It is a little shorter and cleaner to write

dd(2:length(dd)) than 
getindex(dd, 2:length(dd)) or 
dd(2:2:18, 3:3:27; keep=:last, filter=x->!iseven(div(x,9))) compared to (?)

And if your keys happen to be a little annoying like “4N60l1ll”, or contain some symbol character that you can’t figure out how to type or are too lazy to look up, here ya go. It is at least simple. Convenience is the real point of it.

If there are bugs or reasons to change it to just OrderedDicts, it can be changed then.

SteffenPL · April 16, 2021, 7:42am

But if the keys are of type Integer, wouldn’t d[1] be ambiguous?

gustaphe · April 16, 2021, 10:11am

And if you have a callable CallableDict <: AbstractDict then d(1) will be ambiguous.

I don’t know if getindex(::Dict{Int, Any}, ::Int) has priority over (is considered more specific than) getindex(::Dict{Any, Any}, ::Int). But I think that implementing this indexing operation as a direct function call is a syntactical lie. There’s a reason Julia chose to have x[3] != x(3).

IterTools.jl has nth related to this:

Return the nth element of xs. This is mostly useful for non-indexable collections.
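
For comparison, a similar positional lookup is possible with Base alone by lazily skipping the first n-1 pairs. This is only a sketch: `nth_pair` is a hypothetical name, not the IterTools function, and it assumes `n` is within bounds.

```julia
using OrderedCollections

d = OrderedDict(:apple => 1, :banana => 2)

# Skip the first n-1 pairs without collecting them, then take the next one.
nth_pair(d, n) = first(Iterators.drop(d, n - 1))

p = nth_pair(d, 2)
```

Like `nth`, this gives back a single `Pair` rather than a dictionary.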

gustaphe · April 16, 2021, 10:26am

Experimenting a bit, it would appear that defining getindex(::OrderedDict{<:Any, <:Any}, ::Int) would still allow you to index into integer-keyed dictionaries by key as before, meaning the only thing you could not do is linear-index into an integer-keyed dictionary:

OrderedDict(:apple=>1,:banana=>2)[2] == 2
OrderedDict(2=>1, 1=>2)[2] == 1

which I think would be the most natural solution of this ambiguity anyway.

OrderedDict(2=>1, 1=>2)(2) == 2 seems a lot more confusing than helpful.

kool7d · April 16, 2021, 10:42am

It doesn’t do this (no wonder it seems confusing), it returns another OrderedDict. You are slicing (so DictionarySlicing might actually be a more accurate name).

OrderedDict(2=>1, 1=>2)(2) -> OrderedDict(1=>2)

I think you are missing the point of the package TBH.
Also, what would be the proper way of doing this:

dd(2:2:18, 3:3:27; keep=:last, filter=x->!iseven(div(x,9)))

kool7d · April 16, 2021, 10:46am

nth(OrderedDict(:apple => 1, :banana => 2), 2)   ->   :banana => 2
which is a Pair{Symbol, Int64} not a Dict

gustaphe · April 16, 2021, 11:18am

Well, yeah.

Whether you are slicing or indexing, the direct function call is the wrong syntax. Two non-pirating options:

Define slice(::OrderedDict, args...)
Define struct LinearIndexableDict <: AbstractDict
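
A rough sketch of both options, with loud caveats: `slice` and `Positional` are hypothetical names, the positions are deduplicated keeping the first occurrence, and the second option is simplified to a plain wrapper that does not subtype AbstractDict (a real subtype would also need to implement the dictionary interface).

```julia
using OrderedCollections

# Option 1: a new function owned by your own module. No method is added
# to a function or type that belongs to another package.
function slice(d::OrderedDict, positions...)
    ks = collect(keys(d))                                # keys in insertion order
    idx = unique(reduce(vcat, positions; init = Int[]))  # flatten, keep first occurrence
    return OrderedDict(ks[i] => d[ks[i]] for i in idx)
end

# Option 2 (simplified): a wrapper type you own. Making your own type
# callable is not piracy, because the type belongs to you.
struct Positional{K,V}
    dict::OrderedDict{K,V}
end
(p::Positional)(positions...) = slice(p.dict, positions...)

dd = OrderedDict(2 => "two", 1 => "one", 3 => "three")
a = slice(dd, 2:3)
b = Positional(dd)(2:3)
```

With integer keys, as here, both calls select by position, so the result holds the second and third entries (keys 1 and 3), and ordinary `dd[2]` lookups by key are left untouched.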

I’m not sure I agree that’s an operation that needs to be simple. You’ve objectively chosen the wrong data type. But something along the lines of

k = collect(keys(dd))[vcat(2:2:18, 3:3:27)]
filter(x -> first(x) in k && !iseven(div(last(x),9)), dd)

(for the subtleties of order, the Iterators.filter function can be used on Generators, so

k = unique(collect(keys(dd))[vcat(2:2:18, 3:3:27)])
v = Iterators.filter(x -> !iseven(div(x,9)),  dd[kk] for kk in k)
OrderedDict(k .=> v)

)

kool7d · April 16, 2021, 5:56pm

The simplicity of it for the user is the point. It is easier. It is shorter, and harder to screw up. It was also easy to make. One import and one function.
You are getting me thinking though. I changed it to work only with OrderedDicts since I could just convert any other dict to an ordered one for it. Defining slice(::OrderedDict, args...) looks good. That might be what I am looking for and is almost as short. If there is an existing package that the slice function would belong in, LMK. I’m not married to DictionaryIndexing.

dd(2:2:18, 3:3:27; keep=:last, filter=x->!iseven(div(x,9)))

gustaphe:

I’m not sure I agree that’s an operation that needs to be simple. You’ve objectively chosen the wrong data type. But something along the lines of
k = collect(keys(dd))[vcat(2:2:18, 3:3:27)]
filter(x -> first(x) in k && !iseven(div(last(x),9)), dd)

You don’t agree it should be simple? Now to me, that doesn’t make sense. If something is simpler it is better unless it creates problems. Objectively chosen the wrong data type? I am already using the dictionaries. You seem to subjectively dislike working with dictionaries.
The second part there didn’t work when I just tested it. I get a MethodError.

3 lines and 3 new variables instead of 1? This is objectively much more code.

gustaphe · April 16, 2021, 6:57pm

The verbosity is a feature. If it was easy to do, people would actually do it.

kool7d · April 16, 2021, 9:53pm

Now you’re getting it.

What is a syntactical lie?

It is making new syntax, so it is correct by definition.
