Lift and wrap array with custom indexes

question

#1

I have a function coded for “normal” 1:n axes that

  1. takes a matrix,
  2. does some manipulation, including slices, LU, etc,
  3. gets a vector.

I want to generalize this to work with arrays of custom indexes (eg OffsetArray), with the following in mind: it should only work for matrices which have the same axes along both dimensions, and then use this to “wrap” the resulting vector. I just don’t know how to generalize this — the trick that I am missing is that in general I don’t know the constructor for the general array type.

Perhaps I should use a modifying version of op with similar? But then I would have to calculate result types.

M(W)E:

using LinearAlgebra, OffsetArrays

op_internal(A::AbstractMatrix) = normalize(vcat(A[end, 1:(end-1)], one(eltype(A))), 1)

function op(A)
    if Base.has_offset_axes(A)
        axes(A, 1) == axes(A, 2) || error("matrix not square")
        ## FIXME --- how to extract A as a matrix for op_internal, then wrap the result?
    else
        op_internal(A)
    end
end

O = ones(3, 3) .* 2
op(O)                        # works

# below: missing code, want OffsetArray([0.4, 0.4, 0.2], 2:4)
op(OffsetArray(O, 2:4, 2:4))

#2

The “hard” point about this example is the concatenation: we don’t really have a good API for concatenating arrays with offset axes. What should the output axes be? The most useful (but undocumented) approach I know of is in CatIndices, which uses a wrapper called PinIndices to say “the axes of this object should ‘win’, and everything else stretches out to either side.”

This implementation is close to what you want:

using LinearAlgebra, OffsetArrays, CatIndices, EndpointRanges

function op(A::AbstractMatrix)
    axes(A, 1) == axes(A, 2) || error("matrix not square")
    normalize(vcat(PinIndices(A[iend, ibegin:iend-1]), [one(eltype(A))]), 1)
end

O = ones(3, 3) .* 2
display(op(O))

OO = OffsetArray(O, 2:4, 2:4)
display(op(OO))

But there’s a couple of missing elements to make it perfect:

  • the vcat should be generalized (here you’d like to to take numbers as well as AbstractVectors and PinIndices)
  • the axes of the output are not what you were hoping. That’s because in A[iend, ibegin:iend-1], the range object ibegin:iend-1 implicitly has axes that range from 1:2, and the axes of the result are the axes of the indices. (As they should be!) You’d kind of like to do something like A[iend, Base.IdentityUnitRange(ibegin:iend-1)] but currently that doesn’t work. Or perhaps better, one could define iibegin and iiend in EnpointRanges to mean identity-index and then use iibegin:iiend-1 as an identity-range.

#3

Thank you for your help. I may have been unclear in wording the question: I want/need to code op_internal to work for arrays with regular indexes (from 1), mostly because the library functions it uses don’t handle generalized indexing (the above code is just an MWE, it is more complex). I hope that more and more functions will work with generalized indexes, but in the meantime, I am looking for a way to drop back into regular indexing, and then just put indices on the end result.

I ended up reworking the code to use a pre-allocated buffer, and using reshape for a “view” with non-custom indexes, but I am not sure this is the right way. MWE modified:

using LinearAlgebra, OffsetArrays

function op_blackbox!(v::AbstractVector, A::AbstractMatrix)
    v[1:(end-1)] = A[end, 1:(end-1)]
    v[end] = one(eltype(v))
    normalize!(v, 1)
    v
end

calc_type(::Type{T}) where T = typeof(√one(T))

function op(A)
    S = calc_type(eltype(A))
    ax1 = axes(A, 1)
    ax1 == axes(A, 2) || error("matrix not square")
    if Base.has_offset_axes(A)
        n = length(ax1)             # for reshape
        B = reshape(A, n, n)        # a "view" with non-offset indexes
        w = similar(A, S, ax1)      # return array
        v = reshape(w, n)           # a "view" buffer with non-offset indexes
    else
        B = A                   # work in the same array
        w = v = similar(A, S, ax1) # likewise
    end
    op_blackbox!(v, B)
    w
end

O = ones(3, 3) .* 2
op(O)
op(OffsetArray(O, 2:4, 2:4))

#4

If one wanted to “do this right” then let me then add a third deficiency to my list above:

  • in vcat, if the first entry is a PinIndices with Base.OneTo axes, then use Base.OneTo axes for the output.

The point here is that ideally, you shouldn’t need to special-case on Base.has_offset_axes: it’s much better to have generic code that can deal with general axes.

I guess this is a case where you should ask yourself, “how many functions like this am I going to write?” If it’s just one or a few, the approach you’ve taken is certainly the most time-efficient. In contrast, if you’re going to be doing a lot of this, I’d urge you to submit a couple of PRs to CatIndices so that all your later development is easier (contrast 3 LOC vs 21).


#5

As I said (but perhaps underemphasized — that’s the problem with MWEs) in the original post, the problem that motivated this uses LinearAlgebra, especially qr and lu, and I would have to update these too. I will consider PRs, but my understanding is that in the end it all boils down to an unwrap/wrap pattern. Also, I would really need to think hard about pivoting and generalized indexes.


#6

Understood. But it’s better not to think of it as a wrap/unwrap problem: if you look at code that handles indexing, in general it’s a bit rare to have explicit unwrapping. The much better way to handle this is through generic interfaces (e.g., getindex, similar, and so-on).

cat operations are a tricky case because the fundamental operation is basically conditioned on an “arrays-are-just-lists” viewpoint rather than “arrays are Dicts with keys arranged on a spatial grid” viewpoint.


#7

Specifically, how would you handle the following for generalized indexes?

using LinearAlgebra

function stationary_distribution(matrix::AbstractArray)
    LU = lu(matrix, Val(false); check = false) # not pivoted, singular
    L = LU.L
    x = vcat(L[1:(end-1), 1:(end-1)]' \ -L[end, 1:(end-1)], 1)
    normalize!(x, 1)
    x
end

#8

In an ideal world,

  • generalize LU decomposition so it handles arbitrary square axes (there’s nothing about the underlying algorithm that requires 1-based indexing)
  • wrap the first argument to vcat with PinIndices

Your code would thus be almost identical to what it is now.


#9

I am very enthusiastic about generalized indexing, and I am grateful for all the work that you and others have invested in it.

However, the ideal state of affairs you describe will require much more work. I hope to contribute to this, but I think in the meantime an escape mechanism like above would be helpful, even to the extent of describing it in the manual (which also recognizes that not all code is ready, and mentions how to check for it).

In particular, I am curious if reshape is the best way to get a standard-indexed “view” into an array with custom indexing. That would solve 90% of my problems (the rest being type calculation).


#10

It’s not as general as you might hope, because reshape can end up checking has_offset_axes:

julia> using CatIndices

julia> v = BidirectionalVector(1:3)
BidirectionalVector{Int64} with indices CatIndices.URange(1,3):
 1
 2
 3

julia> pushfirst!(v, 0)
BidirectionalVector{Int64} with indices CatIndices.URange(0,3):
 0
 1
 2
 3

julia> reshape(v, 4)
ERROR: AssertionError: !(has_offset_axes(v))
Stacktrace:
 [1] _reshape(::BidirectionalVector{Int64}, ::Tuple{Int64}) at ./reshapedarray.jl:167
 [2] reshape at ./reshapedarray.jl:112 [inlined]
 [3] reshape(::BidirectionalVector{Int64}, ::Int64) at ./reshapedarray.jl:115
 [4] top-level scope at none:0

#11

So there is no general solution? I was thinking of something like this (MWE without frills and optimization, setindex! not defined, etc):

struct PlainView{T, N, P <: AbstractArray{T,N}, I} <: AbstractArray{T,N}
    parent::P
    indices::I
end

Base.size(P::PlainView) = length.(axes(P.parent))

Base.getindex(P::PlainView, I::Vararg{Int,N}) where N =
    P.parent[Base.reindex(P, P.indices, I)...]

function plain_view(A::AbstractArray{T,N}) where {T,N}
    if Base.has_offset_axes(A)
        PlainView(A, ntuple(i -> firstindex(A, i):lastindex(A, i), N))
    else
        A
    end
end

#12

You can always wrap with another layer, e.g., OffsetArray. (Your PlainView basically is OffsetArray.)


#13

I made a PR to that effect:

Also, thinking about pivoting indices (for LU & friends): I came to the conclusion that for generalized indexing, it should just remain the same, with the understanding that pivot indices should be interpreted within the relevant axes(A, j). What do you think?


#14

Thanks for the PR, I’ll check it in a bit. Re pivoting, that sounds pretty reasonable.