AbstractVector vs AbstractRange: length vs size -- which is fundamental?

It appears that Julia master has changed behavior relative to previous versions: size for an AbstractRange is now defined as (length(r),), whereas previously length was defined as prod(size(r)). This makes length a more fundamental function than size, which, fair enough, seems to be the right thing for 1D arrays. However, this puts AbstractRanges at odds with other AbstractVectors, which still rely on the previous behavior. As an example:

julia> struct CustomArray{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
       parent :: A
       end

julia> Base.size(a::CustomArray) = size(a.parent)

julia> Base.axes(a::CustomArray) = axes(a.parent)

julia> Base.getindex(a::CustomArray{<:Any,N}, i::Vararg{Int,N}) where {N} = getindex(a.parent, i...)

julia> CustomArray(3:4)
2-element CustomArray{Int64, 1, UnitRange{Int64}}:
 3
 4

julia> CustomArray(3:4) |> length
2

julia> CustomArray(3:4) isa AbstractVector
true

This works correctly on nightly. Now consider the analogous wrapper for ranges:

julia> struct CustomRange{T,A<:AbstractRange{T}} <: AbstractRange{T}
       parent :: A
       end

julia> Base.size(a::CustomRange) = size(a.parent)

julia> Base.axes(a::CustomRange) = axes(a.parent)

julia> Base.getindex(a::CustomRange, i::Int) = getindex(a.parent, i)

julia> Base.step(a::CustomRange) = step(a.parent)

julia> CustomRange(3:4)
3:1:4

julia> CustomRange(3:4) |> length
ERROR: length implementation missing
[...]

julia> CustomRange(3:4) isa AbstractVector
true

It’s a bit nonintuitive that for AbstractRanges one needs to define length whereas for AbstractVectors one needs to define size. Although breakages caused by this change may be explained as “missing methods”, perhaps this behavior should be documented better if this is the intent? Should one always define both methods anyway, even though this might be redundant?
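For what it's worth, here is a minimal sketch of the "define both" approach for a range wrapper. WrappedRange is a hypothetical name, chosen so it doesn't clash with CustomRange above, and the extra first/last methods are only my assumption about what a well-behaved range wrapper would forward. With the new behavior the size method is redundant, but it costs nothing and keeps the type working whichever of the two functions Base treats as fundamental.

```julia
# Hypothetical wrapper type (not from Base); forwards everything to the parent range.
struct WrappedRange{T,A<:AbstractRange{T}} <: AbstractRange{T}
    parent::A
end

# Define length explicitly, since the AbstractRange fallback throws on master.
Base.length(r::WrappedRange) = length(r.parent)
# Redundant under the new behavior (size falls back to (length(r),)), but harmless.
Base.size(r::WrappedRange) = size(r.parent)

Base.first(r::WrappedRange) = first(r.parent)
Base.last(r::WrappedRange) = last(r.parent)
Base.step(r::WrappedRange) = step(r.parent)
Base.getindex(r::WrappedRange, i::Int) = getindex(r.parent, i)

r = WrappedRange(3:4)
length(r)   # forwarded to the parent range
size(r)     # forwarded to the parent range
```

With both methods defined, length(r) no longer hits the "length implementation missing" error.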

This came up as a suggested topic and had gone unanswered.

According to the Base interfaces documentation, length is the more fundamental function for general iterators, since your options are:

  1. Do not implement either length or size if Base.IteratorSize(IterType) in (Base.IsInfinite(), Base.SizeUnknown()).
  2. Implement just length if Base.IteratorSize(IterType) == Base.HasLength().
  3. Implement length and size if Base.IteratorSize(IterType) isa Base.HasShape.

https://docs.julialang.org/en/v1/manual/interfaces/
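To illustrate option 2, here is a small sketch along the lines of the example in the interfaces manual. Squares is just an illustrative type name. Since Base.IteratorSize defaults to Base.HasLength() for a new type, defining length alone is enough for collect to work, and size is never needed.

```julia
# Illustrative iterator yielding the first n square numbers.
struct Squares
    n::Int
end

# Iteration protocol: return (value, next_state), or nothing when done.
Base.iterate(s::Squares, i::Int = 1) = i > s.n ? nothing : (i^2, i + 1)

# Option 2: only length is implemented; IteratorSize is left at its default.
Base.length(s::Squares) = s.n
Base.eltype(::Type{Squares}) = Int

squares = collect(Squares(3))
```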

AbstractArrays such as ranges definitely fall under option 3, for which presumably we need to define both length and size. The only issue that remains is that the docstring for length(A::AbstractArray) states that it defaults to prod(size(A)), which seems to indicate that length comes for free with size. Maybe the docstring doesn’t need to specify this, especially if the expectation is that length needs to be implemented by the developer.
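As a quick check of what the docstring describes, the sketch below defines only size and getindex on a generic array wrapper and relies on the prod(size(A)) fallback for length. WrappedArray is a hypothetical name mirroring CustomArray from the first post.

```julia
# Hypothetical wrapper: only size and getindex are defined, length is not.
struct WrappedArray{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
    parent::A
end

Base.size(a::WrappedArray) = size(a.parent)
Base.getindex(a::WrappedArray{<:Any,N}, i::Vararg{Int,N}) where {N} =
    getindex(a.parent, i...)

# A 2x3 wrapped array: length comes from the AbstractArray fallback.
a = WrappedArray(reshape(1:6, 2, 3))
length(a) == prod(size(a))
```

So for a plain AbstractArray subtype, length really does come for free with size. It is only the AbstractRange-specific method that changes this.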

The fallback for length(::AbstractArray) is useful. What’s problematic here is that there’s a specific method for AbstractRanges which throws an error, which is inconsistent with other AbstractArrays.

Maybe this should be discussed on the PR which introduced this:
https://github.com/JuliaLang/julia/pull/40382
CC: @jameson
