[ANN] DictionaryIndexing.jl

Dictionary indexing can be useful if, for instance, you don’t remember the key for something but you know what position it is in. Warning: this package commits type piracy, since it adds functionality to existing types (AbstractDicts). So only pirates should use it.

It imports only 1 package (OrderedCollections) and has 1 function that enables indexing dictionaries using function syntax.

Example:

using DictionaryIndexing

d = OrderedDict(:Apl => "apple",
                :Brc => "birch",
                :Cnd => "candle",
                :Drn => "dragon",
                :Exp => "expensive",
                :Frg => "forage",
                :Gra => "grain",
                :Hlt => "health")

dxs = d(2, 4:5, [7,8])

OrderedDict{Symbol,String} with 5 entries:
  :Brc => "birch"
  :Drn => "dragon"
  :Exp => "expensive"
  :Gra => "grain"
  :Hlt => "health"

Example:

using DictionaryIndexing

dd = OrderedDict(:Apl => "apple",
                 :Brc => "birch",
                 :Cnd => "candle",
                 :Drn => "dragon",
                 :Exp => "expensive",
                 :Frg => "forage",
                 :Gra => "grain",
                 :Hlt => "health",
                 :Irn => "irony",
                 :Jak => "jackal")     # length is 10

dxs = dd(2, 4:5, [7,8], 5:length(dd), [8,5])    # instead of `end` we get the last index with `length`

OrderedDict{Symbol, String} with 8 entries:
  :Brc => "birch"            #  2
  :Drn => "dragon"			 #  4
  :Exp => "expensive"		 #  5
  :Gra => "grain"  			 #  7
  :Hlt => "health"			 #  8
  :Frg => "forage"			 #  6
  :Irn => "irony"			 #  9
  :Jak => "jackal"			 #  10

Overlaps are handled such that same key => value pairs are not added again. This behavior is inherent to Dicts. If you want to change this to keep the last occurrence, use the keyword argument keep = :last (or keep = "last").

dxs = dd(2, 4:5, [7,8], 5:length(dd), [8,5]; keep = :last)

OrderedDict{Symbol, String} with 8 entries:
  :Brc => "birch"		     #  2
  :Drn => "dragon"			 #  4
  :Frg => "forage"			 #  6
  :Gra => "grain"			 #  7
  :Irn => "irony"			 #  9
  :Jak => "jackal"			 #  10
  :Hlt => "health"			 #  8
  :Exp => "expensive"		 #  5
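
To see where the two orderings above come from, here is a minimal sketch of the deduplication step using only Base functions. It is an illustration of the behavior described, not the package's actual implementation; the variable names are made up for this example.

```julia
# Positions collected from dd(2, 4:5, [7,8], 5:length(dd), [8,5]) when length(dd) == 10
idxs = vcat(2, 4:5, [7, 8], 5:10, [8, 5])

# Default behavior: keep the first occurrence of each position
keep_first = unique(idxs)

# keep = :last behavior: keep the last occurrence of each position
keep_last = reverse(unique(reverse(idxs)))

println(keep_first)  # [2, 4, 5, 7, 8, 6, 9, 10]
println(keep_last)   # [2, 4, 6, 7, 9, 10, 8, 5]
```

The two printed position lists line up with the `#` comments in the two outputs above.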

If you want to do more complicated things like filtering the collected indices you can use the filter keyword with any filtering function.

dd(2, 4:5, [7,8], 5:length(dd), [8,5]; filter = x->in(x,5:6))

OrderedDict{Symbol, String} with 2 entries:
  :Exp => "expensive"		 #  5
  :Frg => "forage"			 #  6

dd(2, 4:5, [7,8], 5:length(dd), [8,5]; keep = :last, filter = x->in(x,5:6))

OrderedDict{Symbol, String} with 2 entries:
  :Frg => "forage"			 #  6
  :Exp => "expensive"		 #  5

Ordinarily keep = :last occurs after filtering, but if for some reason you want it to happen before the filter use keep = :lastbefore. Likewise, if you want it to keep the first instance before filtering, do keep = :firstbefore. This is possibly desirable if your filter function involves the number of occurrences.

What’s the piracy, and why?

It just means that existing types are being used. Instead of making new types or new objects, I am “pirating” the AbstractDict supertype (usually OrderedDict) by making it work in another way, which for big types could clash with some existing system if done haphazardly. That is why you have to summon it with using DictionaryIndexing.

I know what piracy is, I wondered what the piracy was. But okay, you’re defining (d::AbstractDict)(indices).

This would probably be better as getindex(::OrderedDict, ::Integer). It makes no sense for AbstractDicts in general, because they have no order.

And if you want to get items by index, why not use a Vector?

Then I would have to convert my Dict to a Vector.
I usually use vectors but have more recently been exploring Dictionaries, and it seemed handy.
Also, it doesn’t necessarily matter if the Dict is ordered. It can be used to easily construct arbitrary dictionaries from other dictionaries. I am testing it for plot recipes. It is a little shorter and cleaner to write

dd(2:length(dd)) than 
getindex(dd, 2:length(dd)) or 
dd(2:2:18, 3:3:27; keep=:last, filter=x->!iseven(div(x,9))) compared to (?)

And if your keys happen to be a little annoying like “4N60l1ll” or contain some symbol character that you can’t figure out how to make or are too lazy to find it, here ya go. It is at least simple. Convenience is the real point of it.

If there are bugs or reasons to change it to just OrderedDicts, it can be changed then.

But if the keys are of type Integer, wouldn’t d[1] be ambiguous?

And if you have a callable CallableDict <: AbstractDict then d(1) will be ambiguous.

I don’t know if getindex(::Dict{Int, Any}, ::Int) has priority over (is considered more specific than) getindex(::Dict{Any, Any}, ::Int). But I think that implementing this indexing operation as a direct function call is a syntactical lie. There’s a reason Julia chose to have x[3] != x(3).

IterTools.jl has nth related to this:

Return the nth element of xs. This is mostly useful for non-indexable collections.
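
For reference, the same lookup can be sketched with Base alone, assuming OrderedCollections is installed. The helper name nth_pair is hypothetical and only used for this illustration; it is not part of IterTools.jl or any package mentioned here.

```julia
using OrderedCollections  # assumed to be installed

d = OrderedDict(:apple => 1, :banana => 2)

# Hypothetical helper: skip the first n-1 pairs, then take the next one.
# Iterating an OrderedDict yields key => value pairs in insertion order.
nth_pair(d, n) = first(Iterators.drop(d, n - 1))

println(nth_pair(d, 2))  # :banana => 2
```

Like nth, this returns a single Pair rather than a dictionary.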

Experimenting a bit, it would appear that defining getindex(::OrderedDict{<:Any, <:Any}, ::Int) would still allow you to index into integer-keyed dictionaries by key as before, meaning the only thing you could not do is linear-index into an integer-keyed dictionary:

OrderedDict(:apple=>1,:banana=>2)[2] == 2
OrderedDict(2=>1, 1=>2)[2] == 1

which I think would be the most natural solution of this ambiguity anyway.

OrderedDict(2=>1, 1=>2)(2) == 2 seems a lot more confusing than helpful.

It doesn’t do this (no wonder it seems confusing), it returns another OrderedDict. You are slicing. (so actually DictionarySlicing might be a more accurate name :thinking:)

OrderedDict(2=>1, 1=>2)(2) -> OrderedDict(1=>2)

I think you are missing the point of the package TBH.
Also, what would be the proper way of doing this:

dd(2:2:18, 3:3:27; keep=:last, filter=x->!iseven(div(x,9)))

nth(OrderedDict(:apple => 1, :banana => 2), 2)   ->   :banana => 2
which is a Pair{Symbol, Int64} not a Dict

Well, yeah.

Whether you are slicing or indexing, the direct function call is the wrong syntax. Two non-pirating options:

  • Define slice(::OrderedDict, args...)
  • Define struct LinearIndexableDict <: AbstractDict
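
A minimal sketch of the first option, assuming OrderedCollections is installed. The function name slice and the keep keyword are illustrative choices here, not an existing API, and the filter keyword from the package is left out to keep it short.

```julia
using OrderedCollections  # assumed to be installed

# Hypothetical non-pirating helper: `slice` is a new function owned by this code,
# so no method is added to a function or callable type defined elsewhere.
function slice(d::OrderedDict{K,V}, positions...; keep::Symbol = :first) where {K,V}
    # Flatten integers, ranges and vectors into one list of positions
    idxs = collect(Int, Iterators.flatten(positions))
    if keep === :first
        idxs = unique(idxs)                    # first occurrence wins
    elseif keep === :last
        idxs = reverse(unique(reverse(idxs)))  # last occurrence wins
    else
        throw(ArgumentError("keep must be :first or :last"))
    end
    ordered_pairs = collect(d)                 # pairs in insertion order
    return OrderedDict{K,V}(ordered_pairs[i] for i in idxs)
end

dd = OrderedDict(:Apl => "apple", :Brc => "birch", :Cnd => "candle",
                 :Drn => "dragon", :Exp => "expensive")

slice(dd, 2, 4:5)                     # keys :Brc, :Drn, :Exp
slice(dd, 1:3, [2, 1]; keep = :last)  # keys :Cnd, :Brc, :Apl
```

Because slice is an ordinary function, d[k] and d(k) keep whatever meaning they already had for every dictionary type.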

I’m not sure I agree that’s an operation that needs to be simple. You’ve objectively chosen the wrong data type. But something along the lines of

k = collect(keys(dd))[vcat(2:2:18, 3:3:27)]
filter(x -> first(x) in k && !iseven(div(last(x),9)), dd)

(for the subtleties of order, the Iterators.filter function can be used on Generators, so

k = unique(collect(keys(dd))[vcat(2:2:18, 3:3:27)])
v = Iterators.filter(x -> !iseven(div(x,9)),  dd[kk] for kk in k)
OrderedDict(k .=> v)

)

The simplicity of it for the user is the point. It is easier. It is shorter, and harder to screw up. It was also easy to make. One import and one function.
You are getting me thinking though. I changed it to work only with OrderedDicts since I could just convert any other dict to an ordered one for it. Defining slice(::OrderedDict, args...) looks good. That might be what I am looking for and is almost as short. If there is an existing package that the slice function would belong in, LMK. I’m not married to DictionaryIndexing.

dd(2:2:18, 3:3:27; keep=:last, filter=x->!iseven(div(x,9)))

You don’t agree it should be simple? Now to me, that doesn’t make sense. If something is simpler it is better unless it creates problems. Objectively chosen the wrong data type? I am already using the dictionaries. You seem to subjectively dislike working with dictionaries.
The second part there didn’t work when I just tested it. I get a MethodError.

3 lines and 3 new variables instead of 1? This is objectively much more code.

The verbosity is a feature. If it was easy to do, people would actually do it.

Now you’re getting it.

What is a syntactical lie?

It is making new syntax, so it is correct by definition.