OffsetArrays, inbounds, and confusion

oh, so
A[index]
is “guaranteed” to exist?
is there a fancy way to do it just with a function enumerate() or similar (i.e. to get the counter, index, and value of A in one function?)

for (counter, index, a) in Xenumerate(A)

and then if that was the preferred idiom for people to use, offset arrays would “just work” more often then now?

thanks for clarifying

pairs(collection)

Return an iterator over key => value pairs for any collection that maps a set of keys to a set of values. This includes arrays, where the keys are the array indices.

Also see this old answer of mine.

To my knowledge there is no function that combines pairs and enumerate, but you can just pass the return of pairs to enumerate.

julia> using OffsetArrays
[ Info: Precompiling OffsetArrays [6fe1bfb0-de20-5000-8ca7-80f57d26f881]

julia> oa = OffsetArray(["a", "b", "c", "d", "e"], -2:2)
5-element OffsetArray(::Array{String,1}, -2:2) with eltype String with indices -2:2:
 "a"
 "b"
 "c"
 "d"
 "e"

julia> for (count, (index, value)) in enumerate(pairs(oa))
           println("$count $index $value")
       end
1 -2 a
2 -1 b
3 0 c
4 1 d
5 2 e
2 Likes

thanks. continuing my edification, how come this works to replace traditional for i=1:length(A) loop
for a in A
but not
for (i,a) in A
or even
for (count,i,a) in A

It seams to me that Julia should have enough info to make this work?

(please be nice, I come from a numerical/Matlab/Fortran background, so all these post 1970s data structures are still slightly opaque).

1 Like

You are correct it could act the way you propose, it is a conscious design decision to not go that way. for a in A will always iterates just the values of A. You could design/change the language to do the equivalent to for (i, a) in pairs(A) when you write for (i, a) in A, and do the equivalent of for (count, (i, a)) when you write for (count, i, a) in enumerate(pairs(A)). However, this would disallow you to de-structure tuples in the general case, because the syntax would be ambiguous. For example, the following use case would be disallowed (or have its meaning changed):

julia> A = [('a', 10), ('b', 20), ('c', 30)]
3-element Array{Tuple{Char,Int64},1}:
 ('a', 10)
 ('b', 20)
 ('c', 30)

julia> for (a, b) in A; println("$a $b"); end
a 10
b 20
c 30

I do agree with the language design decision here, as I find it to be the more elegant one. Now, any of these iterators that you may want (including the count, i, a) can be implemented by enumerators instead.

julia> enumpairs(xs) = zip(1:length(xs), eachindex(xs), values(xs))
enumpairs (generic function with 1 method)

julia> using OffsetArrays

julia> oa = OffsetArray(["a", "b", "c", "d", "e"], -2:2)
5-element OffsetArray(::Array{String,1}, -2:2) with eltype String with indices -2:2:
 "a"
 "b"
 "c"
 "d"
 "e"

julia> for (count, i, a) in enumpairs(oa)
           println("$count $i $a")
       end
1 -2 a
2 -1 b
3 0 c
4 1 d
5 2 e
6 Likes

UPDATED. ORIGINAL SOLUTION WAS WRONG (won’t delete it for context in the discussion that follows), but I updated this post so that the correct answer is here also (solution proposed by @mbauman).

This was the first thing I thought about when reading this thread. I think I found a decent solution:

for i in 5:length(a)-4
  #...
end

becomes (wrong answer)

for i in eachindex(a)[5:end-4]
    # ....
end

Correct solution

for i in eachindex(a)[begin+4:end-4]
    # ....
end

Paulo

That doesn’t work because the indices of an offset array are themselves offset. So indexing eachindex(A)[5:end-4] doesn’t necessarily start from 1 like you’d hope. Instead, you could use eachindex(A)[begin+4:end-4], which has the additional virtue of being symmetrical.

3 Likes

For generic code I’d refrain from using [begin+4:end-4],
as it would fail on “strided” arrays (e.g. arrays stored in compact form, but with only even indices).

Edit: leaving the part below, which shows how indexing on eachindex surprised me.

I read Paulo’s post as “iterate from the 5th element to the 4th before last”,
which seems a valid use case.

To do that, one can’t index on the eachindex(oa) iterator itself

oa = OffsetArray(["a", "b", "c", "d", "e"], -2:2)
eachindex(oa)[3:5]
BoundsError: attempt to access 5-element OffsetArrays.IdOffsetRange{Int64, Base.OneTo{Int64}} with indices -2:2 at index [3:5]

I currently do not understand what it is trying to achieve.

But on the collected indexes, it works nicely:

oa = OffsetArray(["a", "b", "c", "d", "e"], -2:2)

# loop on the 3rd to 5th element
for idx in collect(eachindex(oa))[3:5]
	println("$idx: $(oa[idx])")
end

0: c
1: d
2: e

I’d expect Base.iterators.take(iterator, 3:5) to return a nice iterator for this purpose,
but it’s not the case yet (as of julia-1.6.5).

Julia is already awesome, and progressing quickly in the right direction. Keep on the good work !

This is false. All AbstractArray subtypes must have AbstractUnitRange axes — even those with nontraditional indexing. Interfaces · The Julia Language

Your memory can be strided, but the indices in Julia must be contiguous.

8 Likes

All AbstractArray subtypes must have AbstractUnitRange axes

Thanks. I see now that my hypothetic “strided” array could not be an AbstractArray.
This simplifies things.

1 Like

Other than OffsetArrays, there are many containers that support only a part of array interface, like Queue, Stacks and so on.

So, If I have a function that only pushes an element to its argument, I just wat to be sure to dispatch any argument type that is push-able. But there is no such trait-based dispatch.

On the other side, looks like AbstractArray is a poor abstraction in terms of its “compactness” / “surface area” (brings too many assumptions and methods)? In other words, a single abstract type bounds too many traits together.

So, maybe an ideal solution (for Julia 2.0, who knows?) would be to have any abstract type declared as a set of small traits? Then, type dispatch over types can be expanded to dispatch over custom sets of traits, so something like this would be a valid code for any subtype, that supports all declared traits:

# trait set for A is a subset of AbstractArray, but not OffsetArray:
function my_sum(A::{Indexable, IndexFromOne, HasLength, FiniteLength, Eltype{T}, etc.}) where T
    r = zero(eltype(A))
    for i in 1:length(A)
        @inbounds r +=A[i]
    end
    return r
end
1 Like

Thanks! I didn’t know about zip().

Putting my numerics hat on and to clarify… If i was to use enumpairs() in “matlab” style Julia (i.e vectors and matrix of Float64’), Julia is smart enough to make this as fast as if I wrote it in the “Fortran” style

count  = 1
for i=1:N
  a = A[i]
  ...
  count = count + 1
end

i.e. there is no numerical performance hit by using “fancy” functions like zip() and enumerate() rather than “raw” for loops.

If the version with enumerate/pairs are slower, it’s a bug.

3 Likes

I tried it out quickly and it appeared to work… Must have typed something wrong. Sorry

Whether a function is exported or not does not by itself indicate whether it is public API or not. The main indicator if it is public API is whether it had a docstring and is included in the manual. If it is documented and the documentation does not say otherwise, it is public API.

There have been some efforts to explicitly declare components to be public API. One example is this issue:

And this package:
https://docs.juliahub.com/PublicAPI/Jfs06/0.1.0/

1 Like