Indexing AxisArrays by integer values (which don't start at 1...)

I use AxisArrays for my modelling, but I’ve stumbled onto an annoying problem:

using AxisArrays

Y = [2019,2020] # years
x = [1, 2]

A = AxisArray(x, Y)

A[2019]
# ERROR: BoundsError: attempt to access 2-element Array{Int64,1} at index [2019]

A[atvalue(2019)]
# 1

I don’t particularly want to rewrite a bunch of my code to put atvalue everywhere, so I was wondering if there’s a cheeky workaround, e.g. make Y an Array of type something similar to Int but not Int, so that I can still write stuff like Y[2] - Y[1] == 1.

Suggestions?

I found a (partial) solution:

using AxisArrays
using Dates
Y = Year.([2019,2020]) # years

This sort of gives me the behaviour I want, though I eventually need the difference between years to calculate a discount rate, which wouldn’t work without some slight modification:

(Y[2] - Y[1])^0.95
# MethodError: no method matching ^(::Year, ::Float64)
(Y[2].value - Y[1].value)^0.95
# 1.00
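For what it's worth, a minimal sketch of how the Year values could be turned back into plain numbers through the exported accessor `Dates.value` instead of the `.value` field. The helper name `years_between` is made up for illustration and is not part of Dates:

```julia
using Dates

Y = Year.([2019, 2020])

# Hypothetical helper (not part of Dates): whole years between two Year values.
years_between(a::Year, b::Year) = Dates.value(b - a)

Δ = years_between(Y[1], Y[2])   # 1, a plain Int
Δ^0.95                          # 1.0
0.95^Δ                          # 0.95, e.g. a one-year discount factor
```

Keeping the conversion in one small function means the rest of the code never touches the internals of `Year`.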

Just introducing a wrapper type to deal with the AxisArray API is imho not very pretty.

It is not clear to me why you need the AxisArray type.
Have you considered something like OrderedDict( Y.=>x) from the DataStructures package, or even OffsetArrays if you’re just dealing with increment-by-one years?
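A rough sketch of both suggestions, assuming the OffsetArrays and DataStructures packages are installed and reusing the `Y` and `x` from the first post:

```julia
using OffsetArrays      # third-party package
using DataStructures    # third-party package, provides OrderedDict

Y = [2019, 2020]
x = [1, 2]

# OffsetArray: the array's own indices become 2019:2020,
# so A[2019] is ordinary indexing and nothing has to be guessed.
A = OffsetArray(x, 2019:2020)
A[2019]          # 1
sum(A)           # 3

# OrderedDict: lookup by key, insertion order preserved.
D = OrderedDict(Y .=> x)
D[2019]          # 1
sum(values(D))   # 3
```

The OffsetArray route only works for consecutive integer labels such as years; the dictionary accepts any key type but is not an AbstractArray.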


Those are good suggestions.

One way for the package’s API to avoid such ambiguities would be to always use A(2019) for lookup and A[1] for indexing, instead of guessing based on the type as AxisArrays does. The downside is that while A[1] = v becomes a single setindex! call, A(2019) = v is read as a function definition, so it won’t work. Out of curiosity, do you need to write into the array like this in what you are doing, like A[Year(2019)] = v?
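To make the idea concrete, here is a minimal sketch in plain Julia. The type `Labelled` and its fields are hypothetical and exist only for this illustration; this is not how AxisArrays is implemented:

```julia
# Hypothetical type, for illustration only: positions via [], axis values via ().
struct Labelled{T,K} <: AbstractVector{T}
    data::Vector{T}
    labels::Vector{K}
end

Base.size(A::Labelled) = size(A.data)
Base.IndexStyle(::Type{<:Labelled}) = IndexLinear()
Base.getindex(A::Labelled, i::Int) = A.data[i]
Base.setindex!(A::Labelled, v, i::Int) = (A.data[i] = v)

# Calling the array looks a value up in `labels` and returns the matching element.
function (A::Labelled)(k)
    i = findfirst(isequal(k), A.labels)
    i === nothing && throw(KeyError(k))
    return A.data[i]
end

A = Labelled([1, 2], [2019, 2020])
A(2019)     # 1, lookup by value
A[2]        # 2, ordinary positional indexing
A[1] = 10   # fine: this is a setindex! call
A(2019)     # 10
```

Reading by value is unambiguous here, but writing by value would need a separate function, since there is no call-syntax counterpart to setindex!.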


Have you considered something like OrderedDict( Y.=>x) from the DataStructures package, or even OffsetArrays if you’re just dealing with increment-by-one years?

I had actually started out with OrderedDicts, but I was starting to get annoyed with having to write comprehensions whenever I wanted to sum over a particular axis. I’m not particularly keen on going back to them at this point.

One way for the package’s API to avoid such ambiguities would be to always use A(2019) for lookup and A[1] for indexing

That would be confusing though, no? Since () is really for function calls, so you would start to wonder whether I had a function called A. I do generally just fetch indices of the AxisArrays though, so something along those lines would be useful.

Short of any other suggestions, I might just define some super simple types to index my AxisArrays. Though I have a feeling that might come back to bite me in the arse at some point…

Take a look at DimensionalData.jl. It’s perhaps closer to what you’re looking for?

Well, isn’t the root confusion here that A[y] can mean either indexing (y ∈ eachindex(A)) or atvalue lookup? You know which one you want, but the computer guesses wrong.

OffsetArrays.jl doesn’t have this problem, there is only indexing. It wouldn’t be hard to make a similar array-like type which allows non-contiguous indices, but IIRC lots of things assume that axes of AbstractArrays are UnitRanges, i.e. consecutive integers. Dictionaries don’t have this problem as they don’t try to also be AbstractArrays. Perhaps Dictionaries.jl is what you want?

d1 = Dict([2018, 2020] .=> [1, 2π])
sum(d1) # error

using Dictionaries
d2 = Dictionary([2018, 2020], [1, 2π])
sum(d2)
d2 ./ 10

As far as I know, DimensionalData.jl (mentioned above) doesn’t guess, and requires you to mark lookup by At():

using DimensionalData
da = DimArray([1, 2π], (Dim{:year}([2018, 2020]),)) # named DimensionalArray in older releases
da[At(2018)] == da[1]

ah indeed. Thanks!

OffsetArrays isn’t what I want, since I would also like to be able to have indices which are strings. As I said before, I moved away from OrderedDicts and Dicts in general simply because I wanted to be able to write stuff like sum(A[:,"PV",1]) instead of sum(A[y,"PV",1] for y=Y). This is of course a minor issue.

What I really want is for AxisArrays to always do A[atvalue.(y)]. So I could just redefine getindex(AxisArray) to always do this within my module?
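One caveat: adding getindex methods for AxisArray with integer arguments from your own module would be type piracy, since neither the function nor the type belongs to you, and it would change behaviour for every other package loaded in the same session. A less invasive option is a thin wrapper that you own. Below is a minimal sketch in plain Julia; `ByValue` and `lookup` are made-up names, and the sketch stores its own label vectors rather than wrapping a real AxisArray:

```julia
# Hypothetical wrapper: every index is treated as an axis value, never a position.
struct ByValue{A<:AbstractArray,L<:Tuple}
    data::A
    labels::L   # one vector of labels per dimension
end

# Translate a label, a vector of labels, or a Colon into positions.
lookup(labels, ::Colon) = Colon()
lookup(labels, vs::AbstractVector) = [lookup(labels, v) for v in vs]
function lookup(labels, v)
    i = findfirst(isequal(v), labels)
    i === nothing && throw(KeyError(v))
    return i
end

Base.getindex(B::ByValue, I...) = B.data[map(lookup, B.labels, I)...]

Y = [2019, 2020]
T = ["PV", "Wind"]
B = ByValue([1 2; 3 4], (Y, T))   # rows are years, columns are technologies

B[2019, "PV"]             # 1
B[:, "PV"]                # [1, 3]
sum(B[:, "PV"])           # 4
B[[2019, 2020], "Wind"]   # [2, 4]
```

With this, integers such as 2019 are never mistaken for positions, at the cost of giving up positional indexing on the wrapper itself.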