Supporting offset arrays without explicit dependency

In my package (NCDatasets) I am calling a C function which expect an array as input.
I use two functions so that ranges, bit arrays, transposed arrays,… get converted
to plain arrays. However this does not work for offset arrays:

using OffsetArrays

function C_function_expecting_array(A::Array)
    println("Yes, got an array. Thank you!")
end


function C_function_expecting_array(A::AbstractArray)
    println("hold on, converting to array")
    C_function_expecting_array(Array(A))
end

Example session:

julia> C_function_expecting_array([1,2,3])
Yes, got an array. Thank you!

julia> C_function_expecting_array(1:3)
hold on, converting to array
Yes, got an array. Thank you!

julia> C_function_expecting_array(BitArray([true,false]))
hold on, converting to array
Yes, got an array. Thank you!

julia> C_function_expecting_array(CUDA.cu([1,2,3])) # also works with CUDA arrays
hold on, converting to array
Yes, got an array. Thank you!

julia> C_function_expecting_array(OffsetArray([1,2,3],-1))
hold on, converting to array
ERROR: DimensionMismatch("axes must agree, got (Base.OneTo(3),) and (OffsetArrays.IdOffsetRange(values=0:2, indices=0:2),)")
Stacktrace:
 [1] (::Base.var"#checkaxs#125")(axd::Tuple{Base.OneTo{Int64}}, axs::Tuple{OffsetArrays.IdOffsetRange{Int64, Base.OneTo{Int64}}})
   @ Base ./abstractarray.jl:1101
 [2] copyto_axcheck!(dest::Vector{Int64}, src::OffsetVector{Int64, Vector{Int64}})
   @ Base ./abstractarray.jl:1103
 [3] Array
   @ ./array.jl:563 [inlined]
 [4] Array
   @ ./boot.jl:481 [inlined]
 [5] C_function_expecting_array(A::OffsetVector{Int64, Vector{Int64}})
   @ Main ./REPL[71]:3
 [6] top-level scope
   @ REPL[76]:1
 [7] top-level scope
   @ ~/.julia/packages/CUDA/fAEDi/src/initialization.jl:52

I believe I could call the first function directly with the offset array as we have:

a = [1,2,3]
pointer(a) == pointer(OffsetArray(a,-1))
# true

But I do not what to change the function signature to C_function_expecting_array(A::Union{Array,OffsetArray}) as it would introduce an explicit dependency to offset arrays.

1 Like

All you need to do is just never for for i in 1:n and instead just do things like eachindex or use start and ending indices for loops, or just use broadcast. If you do that, then it doesn’t matter what the indexing of the array is, it will be supported. And if you stick to broadcast, then you won’t even need to have indexing defined and you can support it (i.e. CuArrays).

This is mentioned in the SciML Style Guide with some examples of how to do this correctly and notes about things like similar.

2 Likes

It is a deep dive into the julia source code.


First I found (in boot.jl)

Array(A::AbstractArray{T,N})    where {T,N}   = Array{T,N}(A)

Then in array.jl

Array{T,N}(x::AbstractArray{S,N})         where {T,N,S} = copyto_axcheck!(Array{T,N}(undef, size(x)), x)

Then in abstractarray.jl

function copyto_axcheck!(dest, src)
    @noinline checkaxs(axd, axs) = axd == axs || throw(DimensionMismatch("axes must agree, got $axd and $axs"))

    checkaxs(axes(dest), axes(src))
    copyto!(dest, src)
end

It seems that the problem happens in function axes.
In abstractarray.jl you can find it

function axes(A)
    @_inline_meta
    map(oneto, size(A))
end

From the error message we know that

DimensionMismatch("axes must agree, got (Base.OneTo(3),) and (OffsetArrays.IdOffsetRange(values=0:2, indices=0:2),)")

It seems that the function axes from OffsetsArray has a higher priority.


Now we can do some really bad things (definitely not recommanded)

julia> function axes(A)
           map(Base.OneTo, size(A))
       end
axes (generic function with 1 method)

julia> function copyto_axcheck!(dest, src)
           @noinline checkaxs(axd, axs) = axd == axs || throw(DimensionMismatch("axes must agree, got $axd and $axs"))

           checkaxs(axes(dest), axes(src))
           copyto!(dest, src)
       end
copyto_axcheck! (generic function with 1 method)

julia> Array{T,N}(x::AbstractArray{S,N})         where {T,N,S} = copyto_axcheck!(Array{T,N}(undef, size(x)), x)

julia> Array(A::AbstractArray{T,N})    where {T,N}   = Array{T,N}(A)
Array

julia> import OffsetArrays

julia> Array(OffsetArrays.OffsetArray([1,2,3],-1))
3-element Vector{Int64}:
 1
 2
 3

Just define all these functions by yourself to get a higher priority when they are working.

I don’t know too much about multiple dispatch.

1 Like

One way to strip the offsets from an OffsetArray is by creating a view:

julia> A = OffsetArray([1,2,3], 1)
3-element OffsetArray(::Vector{Int64}, 2:4) with eltype Int64 with indices 2:4:
 1
 2
 3

julia> axes(A)
(OffsetArrays.IdOffsetRange(values=2:4, indices=2:4),)

julia> B = view(A, UnitRange.(axes(A))...)
3-element view(OffsetArray(::Vector{Int64}, 2:4), 2:4) with eltype Int64:
 1
 2
 3

julia> axes(B)
(Base.OneTo(3),)

This doesn’t require a hard dependency of OffsetArrays, and the view shares memory with the original array. You may create a copy, of course, using the Array constructor:

julia> Array(B)
3-element Vector{Int64}:
 1
 2
 3

So your function may look like this:

julia> function C_function_expecting_array(A::AbstractArray)
           println("hold on, converting to array")
           C_function_expecting_array(Array(view(A, UnitRange.(axes(A))...)))
       end
C_function_expecting_array (generic function with 2 methods)

julia> C_function_expecting_array(OffsetArray([1,2,3],-1))
hold on, converting to array
Yes, got an array. Thank you!
1 Like

Do you know why some constructors “work” (i.e. are happy to return something with different axes) and others don’t?

julia> o5 = OffsetArray(rand(5),-3);

julia> UnitRange(axes(o5,1))
-2:2

julia> Array(axes(o5,1))
5-element Vector{Int64}:
 -2
 -1
  0
  1
  2

julia> Array(o5)
ERROR: DimensionMismatch: axes must agree, got (Base.OneTo(5),) and (OffsetArrays.IdOffsetRange(values=-2:2, indices=-2:2),)

That said, if your C function wants a pointer, it’s a bit of a shame to copy the whole thing just to get this. pointer and strides will work on OffsetArrays (but not on ranges, etc.)

I think the fact that Array(::AbstractRange) works for offset ranges is a bit of an accident, and it’s because the indices of ranges aren’t handled consistently (ie. ranges are often treated as iterators). As a consequence, we have behavior like:

julia> r = axes(o5,1)
OffsetArrays.IdOffsetRange(values=-2:2, indices=-2:2)

julia> r == UnitRange(r)
true

julia> OffsetArray(r) == OffsetArray(UnitRange(r))
false

In Array(::AbstractRange), this is the method that gets called:

Array{T,1}(r::AbstractRange{T}) where {T} = vcat(r)

This is technically incorrect behavior I guess, and Tim Holy had tried to fix some of it in Check that axes start with 1 for AbstractRange operations by timholy · Pull Request #30950 · JuliaLang/julia · GitHub I think.

A more “correct” version of UnitRange.(axes(A)) is UnitRange.(first.(axes(A)), last.(axes(A))), which avoids the indices altogether

1 Like

folks…
https://juliaarrays.github.io/OffsetArrays.jl/stable/reference/#OffsetArrays.no_offset_view

particularly, simply

parent(A)

is enough most of the time and it’s from Base julia

2 Likes

In my case, I actually need to interface with a C function which expects a pointer.
Unfortunately, the C function is unaware of eachindex. Here is a more complete example:

Thank you all for your good suggestion.
@jling, Are you proposing something like this?

if A isa OffsetArray
  A = parent(A)
end

Then, I guess, I would need to have an explicit dependence to OffsetArrays which I would like to avoid. I cannot call parent unconditionally, since e.g. for transposed arrays parent returns the original source array (and similar for views).

parent(A) is not guaranteed to return a 1-indexed array, eg. for nested OffsetArrays.
OffsetArrays.no_offset_view will work, but it’ll require you to explicitly depend on OffsetArrays, which OP seems to want to avoid.

honestly, don’t support OffsetArrays. calling C is low-level enough that you should handle it downstream, in a different package

Thanks a lot, this is a quite clever way! I am still wondering if the OffsetArrays package should not be more cooperative here and provide a conversion function here like Array(o::OffsetArrays) (even if it means a new allocation).

I think collect(::OffsetArray) should give you an Array

If your main objective is to obtain a pointer, why not just see if the array supports being converted to a pointer?

julia> using OffsetArrays

julia> OA = OffsetArray(collect(5:10), -1)
6-element OffsetArray(::Vector{Int64}, 0:5) with eltype Int64 with indices 0:5:
  5
  6
  7
  8
  9
 10

julia> applicable(pointer, OA)
true

julia> pointer(OA)
Ptr{Int64} @0x000000007497d770

julia> unsafe_load(pointer(OA))
5

julia> applicable(Base.cconvert, Ptr, OA)
true

In fact, it is better to just let @ccall handle this for you. Otherwise, we would need to use GC.@preserve to make sure the object is not garbage collected.

julia> cfunc_ptr = @cfunction(unsafe_load, Int, (Ptr{Int},))
Ptr{Nothing} @0x000000005f5eedd0

julia> @ccall $cfunc_ptr(OA::Ptr{Int})::Int
5

This also supports SubArrays.

julia> sOA = @view(OA[1:3])
3-element view(OffsetArray(::Vector{Int64}, 0:5), 1:3) with eltype Int64:
 6
 7
 8

julia> @ccall $cfunc_ptr(sOA::Ptr{Int})::Int
6

Overall, I think you are trying to do too much within your package. If you really need an Array, then just implement the method for Array. Let the users of your package figure out how to obtain an Array.

What you really wanted here is to call a C function. In this case, I suggest letting ccall throw an error and perhaps trying to catch it to give a better description or recover.