Multidimensional arrays with 0-based indices

question

#1

Hi,

I’m trying to create an array view (for data from C++) that uses 0-based indices. I got most of it working, except for the use of end as an index into a multidimensional error. The following code mimics what I want to do in a standalone example, the final test fails:

using Compat
using Base.Test

using CustomUnitRanges: filename_for_zerorange
include(filename_for_zerorange)

immutable PtrView{ScalarT,N,LayoutT} <: AbstractArray{ScalarT,N}
  ptr::Ptr{ScalarT}
  size::NTuple{N,Int}
end

immutable LayoutLeft
end

PtrView{ScalarT,N,LayoutT}(A::Array{ScalarT,N},::Type{LayoutT}) = PtrView{ScalarT,N,LayoutT}(pointer(A), size(A))

@compat Base.IndexStyle{ScalarT,N,LayoutT}(::Type{PtrView{ScalarT,N,LayoutT}}) = IndexLinear()
Base.indices{ScalarT,N}(v::PtrView{ScalarT,N,LayoutLeft}) = map(ZeroRange, v.size)
Base.endof{ScalarT,N}(v::PtrView{ScalarT,N,LayoutLeft}) = prod(v.size)-1
Base.getindex{ScalarT,N}(v::PtrView{ScalarT,N,LayoutLeft}, i::Integer) = unsafe_load(v.ptr,i+1)
Base.setindex!{ScalarT,N}(v::PtrView{ScalarT,N,LayoutLeft}, value, i::Integer) = unsafe_store!(v.ptr,value,i+1)

v_arr = [1,2,3]
A_arr = [1 2 3;4 5 6]

v = PtrView(v_arr, LayoutLeft)
A = PtrView(A_arr, LayoutLeft)

@show @test v[0] == 1
@show @test v[end] == 3
@show @test A[0] == 1
@show @test A[end] == 6
@show @test A[0,end] == 3

The error:

Error During Test
  Test threw an exception of type MethodError
  Expression: A[0,end] == 3
  MethodError: no method matching size(::PtrView{Int64,2,LayoutLeft})
  Closest candidates are:
    size{T,N}(::AbstractArray{T,N}, !Matched::Any) at abstractarray.jl:47
    size{N}(::Any, !Matched::Integer, !Matched::Integer, !Matched::Integer...) at abstractarray.jl:48
    size(!Matched::BitArray{1}) at bitarray.jl:39
    ...
   in size at ./abstractarray.jl:47 [inlined]
   in trailingsize(::PtrView{Int64,2,LayoutLeft}, ::Int64) at ./abstractarray.jl:201
ERROR: LoadError: There was an error during testing
 in record(::Base.Test.FallbackTestSet, ::Base.Test.Error) at ./test.jl:397
 in do_test(::Base.Test.Threw, ::Expr) at ./test.jl:281

Do I get around this by overriding Base.trailingsize? This seems related but unconcluded:


New blog post on arrays
#2

With unconventional indices, you’re much better off using ibegin and iend, see https://github.com/JuliaArrays/EndpointRanges.jl.

Presumably you’re aware that this is a “dangerous” view in that you need to be sure that ptr doesn’t go away. If it’s owned by C code and won’t be GCed, you should be safe, but some care is required.


#3

Thanks, EndpointRanges looks very useful. I looked a bit at how end is implemented, and at a first glance adding a function like this:

Base.endof(A::AbstractArray, d) = last(indices(A,d))

and then making end call that instead of size or trailingsize would fix this. Probably it’s not that simple?

Also, I noticed linearindices is defined as:

linearindices(A)                 = OneTo(_length(A))

This should also be overridden for multidimensional arrays with unconventional indices, right?

The memory is indeed managed by C++ in the real use case.


#4

I’d forgotten we had the d version of endof, so that seems good. OTOH, any code that does A[..., 1:end, ...] is probably not what you want (it leaves out the first element), so there’s a sense in which not defining that endof method might be good defensive practice, at least in the early days.


#5

I don’t think here is a d version of endof, unless I missed it. I was contemplating a PR to add it, but I guess ranges complicate things. Is it worth attempting this anyway, or is something like EndpointRanges.jl going to replace this functionality in the “end”? Also, is there a test suite that checks all the range corner cases even with custom indices?

Regarding linearindices: should it be added to the docs that this needs to be overridden for multidimensional arrays using custom indices?


#6

Sorry, I wasn’t reading carefully enough. I guess the reason I forgot about the d version is we don’t have it (yet) :slight_smile:.

Is it worth attempting this anyway, or is something like EndpointRanges.jl going to replace this functionality in the “end”?

Perhaps the best approach would be to put it in the parser. That’s the main place where end is implemented now. We’d want to include begin too. My Scheme is very weak so I haven’t tried to tackle this myself yet.

Also, is there a test suite that checks all the range corner cases even with custom indices?

Perhaps the best would be to copy the OffsetArrays tests?

Regarding linearindices: should it be added to the docs that this needs to be overridden for multidimensional arrays using custom indices?

I don’t think it needs specialzation. The reason is that there are two methods:

linearindices(A)                 = (@_inline_meta; OneTo(_length(A)))
linearindices(A::AbstractVector) = (@_inline_meta; indices1(A))

The second one “does the right thing” for vectors. (indices1 is a short-circuiting equivalent of indices(A, 1).) For multidimensional arrays, linear indexing always starts with 1 no matter what indices the array has, so that first definition seems to cover the bases.

That said, if you discover reason to need to specialize it, I’d be interested in hearing why.


#7

OK, I didn’t realise linear indices were always 1-based for ND-arrays. What is actually needed is to define get/setindex differently for the 1D and ND case, then:

Base.getindex{ScalarT}(v::PtrView{ScalarT,1,LayoutLeft}, i::Integer) = unsafe_load(v.ptr,i+1)
Base.setindex!{ScalarT}(v::PtrView{ScalarT,1,LayoutLeft}, value, i::Integer) = unsafe_store!(v.ptr,value,i+1)
Base.getindex{ScalarT,N}(v::PtrView{ScalarT,N,LayoutLeft}, i::Integer) = unsafe_load(v.ptr,i)
Base.setindex!{ScalarT,N}(v::PtrView{ScalarT,N,LayoutLeft}, value, i::Integer) = unsafe_store!(v.ptr,value,i)

This is necessary since e.g. A[0,0] is converted to A[1]. I wanted to specialize linearindices because I thought A[0] refers to the first matrix element, so indeed there is no need to specialize it.


#8

I had some more fun with this, new code for a complete example with benchmarks and omitting the pointers for clarity:

using Compat
using Base.Test

using CustomUnitRanges: filename_for_zerorange
include(filename_for_zerorange)

immutable ZeroView{ScalarT,N} <: AbstractArray{ScalarT,N}
  a::Array{ScalarT,N}
end

@compat Base.IndexStyle{ScalarT,N}(::Type{ZeroView{ScalarT,N}}) = IndexLinear()
Base.indices{ScalarT,N}(v::ZeroView{ScalarT,N}) = map(ZeroRange, size(v.a))
# needed to solve type instability:
Base.indices{ScalarT,N}(v::ZeroView{ScalarT,N}, d) = ZeroRange(size(v.a,d))
Base.getindex{ScalarT}(v::ZeroView{ScalarT,1}, i::Integer) = v.a[i+1]
Base.setindex!{ScalarT}(v::ZeroView{ScalarT,1}, value, i::Integer) = (v.a[i+1] = value)
Base.getindex{ScalarT,N}(v::ZeroView{ScalarT,N}, i::Integer) = v.a[i] # linearindices are always 1-based for ND arrays
Base.setindex!{ScalarT,N}(v::ZeroView{ScalarT,N}, value, i::Integer) = (v.a[i] = value)

v = ZeroView([1,2,3])
A = ZeroView([1 2 3;4 5 6])

println("v:\n", v)
println("A:\n", A)

@test v[0] == 1
@test v[2] == 3
@test A[1] == 1
@test A[6] == 6
@test A[0,0] == 1
@test A[0,2] == 3
@test A[1,0] == 4
@test A[1,2] == 6

function benchmark_fill_vec(A)
  for i in linearindices(A)
    A[i] = i
  end
end

function benchmark_fill_mat(A)
  for i in indices(A,1)
    for j in indices(A,2)
      A[i,j] = i*(j+1)
    end
  end
end

const bench_size = 1000000
const ncols = 3
v_bench = ZeroView(zeros(bench_size))
A_bench = ZeroView(zeros(bench_size,ncols))

println("vector timings:")
@time benchmark_fill_vec(v_bench)
@time benchmark_fill_vec(v_bench)
@time benchmark_fill_vec(v_bench)

@test v_bench[0] == 0
@test v_bench[bench_size-1] == bench_size-1

println("matrix timings, linear indexing:")
@time benchmark_fill_vec(A_bench)
@time benchmark_fill_vec(A_bench)
@time benchmark_fill_vec(A_bench)

@test A_bench[0,0] == 1
@test A_bench[bench_size-1,2] == bench_size*ncols

println("matrix timings, [i,j] indexing:")
@time benchmark_fill_mat(A_bench)
@time benchmark_fill_mat(A_bench)
@time benchmark_fill_mat(A_bench)

@test A_bench[0,0] == 0
@test A_bench[bench_size-1,2] == (bench_size-1)*ncols

There is a problem with the indices(A,d) variant, since this introduces a type instability. Adding the specialization:

Base.indices{ScalarT,N}(v::ZeroView{ScalarT,N}, d) = ZeroRange(size(v.a,d))

fixes that, but may break other functionality such as ranges, since the definition in Base is:

function indices{T,N}(A::AbstractArray{T,N}, d)
    @_inline_meta
    d <= N ? indices(A)[d] : OneTo(1)
end

I believe the branch on N is the cause of the type instability.

Running with bounds checking off and the specialized indices d version I get these timings (on 0.6):

v:
[1,2,3]
A:
[1 2 3; 4 5 6]
vector timings:
  0.008465 seconds (6.13 k allocations: 293.714 KB)
  0.000626 seconds (4 allocations: 160 bytes)
  0.000668 seconds (4 allocations: 160 bytes)
matrix timings, linear indexing:
  0.010662 seconds (4.06 k allocations: 192.623 KB)
  0.002969 seconds (4 allocations: 160 bytes)
  0.003170 seconds (4 allocations: 160 bytes)
matrix timings, [i,j] indexing:
  0.024965 seconds (10.71 k allocations: 424.688 KB)
  0.012448 seconds (4 allocations: 160 bytes)
  0.014230 seconds (4 allocations: 160 bytes)

The difference in timing between the [i,j] and linear indexing surprises me.

BTW, would it be useful to add this example to the docs (once everything is correct, of course)?