Offset array package with offsets in the type?

I’m a bit disappointed to realize that in OffsetArrays the offsets are a struct field instead of a type parameter. Do any alternative packages exist?

The offsets are a field instead of in the type to reduce the number of methods that need to be compiled. While I’m not aware of other offset array packages, is there anything specific that you’re trying to achieve by having the offsets in the type? In that case, it might not be too difficult to define a wrapper, and forward certain methods to the parent OffsetArray.

4 Likes

I just think it would be nice to be able to offset arrays without worrying about the array member access run time cost. But I’m aware that there’s a tradeoff between compilation time and run time.

In my program, there would be just a single offset array type, so I don’t have to worry about method explosion.

just define your own array with a constant offset then…

Maybe I’ll do that, but it doesn’t seem straightforward, I’ll have to think about various interfaces that have to be respected.

array is not tuple, the offset calculation time is really minimal, do you have benchmark showing it being a significant overhead compared to if you encode the offset in types?

1 Like

I don’t have benchmarks (I don’t exactly have anything to compare with yet), but it seems like reading the offset values for each access would cause extra data cache misses, while encoding the offsets as part of the type seems like it could enable nice optimizations. I’m just speculating, of course.

I don’t think so, whatever you do, in the end you will call

getindex(XX, M) -> getindex(OO, N)

XX is your wrapper type and user did XX[M], and OO is the final inner Julia built-in array.

The only question is how to get N. Regardless of where the offset is stored, you still need to calculate N = M + offset, I really don’t think it will be significant

The offset addition operation could be encoded in the executable code (not data) or even optimized away, if the offset is a compile time known constant.
I think the main improvement is eliminating the extra memory access.

constant folding happens even when it’s not encoded as type:

julia> struct A
           a::Int
       end

julia> function f(x)
           A1 = A(1)
           return A1.a + x
       end
f (generic function with 1 method)

julia> @code_llvm f(3)
;  @ REPL[2]:1 within `f`
define i64 @julia_f_808(i64 signext %0) #0 {
top:
;  @ REPL[2]:3 within `f`
; ┌ @ int.jl:87 within `+`
   %1 = add i64 %0, 1
; └
  ret i64 %1
}

For a single getindex/setindex! const offset might help.
But for most cases, we access the elements many times within a function.
LLVM might not know the offset value, but it knows it won’t change.
So the overhead could be ignored for most usecase.

4 Likes

A fun fact that I don’t know why: OffsetArray of Array is sometimes faster than normal Array :man_shrugging:

The issue below checks a simple sum loop over the array

function arr_sum(X)
    val = zero(eltype(X))
    R = CartesianIndices(X)
    for i in R
        @inbounds val += X[i]
    end
    val
end

and you can observe that OffsetArray is basically as fast as Array.

https://github.com/JuliaLang/julia/issues/38073

1 Like

Perhaps this has something to do with various alignments within the compiled code? Some of that would vary between Julia restarts, so it might make sense to repeat benchmarks like these, restarting Julia in between.
There are many other possible benchmarking noise sources, of course, and trying to account for as many as possible can be fun.

1 Like