Should view(array, 1:length(array)) return a SubArray?

Hi All,

julia> arr = zeros(Int, 3)
3-element Array{Int64,1}:
 0
 0
 0

julia> b = view(arr, 1:length(arr))
3-element SubArray{Int64,1,Array{Int64,1},Tuple{UnitRange{Int64}},true}:
 0
 0
 0

Now that b will be treated like a SubArray() there may be additional overheads in computations. Are SubArray methods in general identify these cases internally and provide the same level of performance?

regards,

Sambit

It is important for performance that a function has return types only dependent on the input types and not values (search for “type stable”, also look at @code_warntype). Thus this is correct.

1 Like

I’d guess the answer to the performance question should be at least interesting. Notice the last type parameter is the value true.

Without checking this myself I am guessing that this information is used somewhere to provide a speed bonus

Just did a benchmark will share the result. I see a 20% performance drop.

julia> @benchmark f_hsB_SubArray_view()
BenchmarkTools.Trial: 
  memory estimate:  1.06 KiB
  allocs estimate:  36
  --------------
  minimum time:     58.720 ms (0.00% GC)
  median time:      59.373 ms (0.00% GC)
  mean time:        59.477 ms (0.00% GC)
  maximum time:     63.780 ms (0.00% GC)
  --------------
  samples:          85
  evals/sample:     1

julia> @benchmark f_hsB_SubArray_noview()
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     43.614 ms (0.00% GC)
  median time:      44.429 ms (0.00% GC)
  mean time:        44.783 ms (0.00% GC)
  maximum time:     54.849 ms (0.00% GC)
  --------------
  samples:          112
  evals/sample:     1

Data input:

arr=UInt8[rand(['0','1','2','3','4','5','6','7','8','9','0','a','b','c','d','e','f'])
 for x = 1:mb_10]
arr1=zeros(UInt8, (10<<19))

f_hsB_SubArray_view() = hex2bytes!(view(arr1, 1:length(arr1)),view(arr, 1:length(arr)))
f_hsB_SubArray_noview() = hex2bytes!(arr1,arr)

The hex2bytes! is a new method we are introducing as part of PR:

https://github.com/JuliaLang/julia/pull/23267

Essentially, the same code executed with view and without view has different performance outcomes. It may be desired but good to know or document.

Aren’t you using global variables here? That might negatively affect the benchmark results.

@traktofon the comparison is relative here so ideally the effect should be equally seen in both cases. In any case, here is the result with local variables:

julia> f_hsB_SubArray_view()
BenchmarkTools.Trial: 
  memory estimate:  1.06 KiB
  allocs estimate:  36
  --------------
  minimum time:     58.579 ms (0.00% GC)
  median time:      59.293 ms (0.00% GC)
  mean time:        59.487 ms (0.00% GC)
  maximum time:     65.059 ms (0.00% GC)
  --------------
  samples:          85
  evals/sample:     1

julia> f_hsB_SubArray_noview()
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     43.517 ms (0.00% GC)
  median time:      44.126 ms (0.00% GC)
  mean time:        44.310 ms (0.00% GC)
  maximum time:     49.490 ms (0.00% GC)
  --------------
  samples:          113
  evals/sample:     1

The concern I had with local variables was optimization may optimize away computations and replace with the final result as it does not take any inputs or process it.

for example:

int factorial_5() may be short circuited to a fixed value of 120 after first JIT compilation.

Since, the arrays I use are random number sequences it’ll be dependent on the system state. Also note the @benchmark is applied on the computation only and not allocations.

I understand the concern raised on datatype instability of global variables. Just theoretically that to my understanding should be concern for assignment of previously declared or unassigned variables. Not the case here where the initialization of the variable is happening with pre-specified concrete datatype. But I am sure there may be practical constraints where such rules may not hold good.

Code snippet used.

f_hsB_SubArray_view() = begin
    const mb_10 = (10 << 20)

    arr=UInt8[rand(['0','1','2','3','4','5','6','7','8','9','0','a','b','c','d','e','f'])
                for x = 1:mb_10]
    str = String(arr)
    b_str = from_hexstringS(str)
    arr1=zeros(UInt8, (10<<19))

    @benchmark hex2bytes!(view(arr1, 1:length(arr1)),view(arr, 1:length(arr)))
end

f_hsB_SubArray_noview() = begin
    const mb_10 = (10 << 20)

    arr=UInt8[rand(['0','1','2','3','4','5','6','7','8','9','0','a','b','c','d','e','f'])
                for x = 1:mb_10]
    str = String(arr)
    b_str = from_hexstringS(str)
    arr1=zeros(UInt8, (10<<19))

    @benchmark hex2bytes!(arr1,arr)
end

The methods that are being reviewed.

@inline number_from_hex(c::UInt) = begin
    DIGIT_ZERO     = UInt('0')
    DIGIT_NINE     = UInt('9')
    LATIN_UPPER_A  = UInt('A')
    LATIN_UPPER_F  = UInt('F')
    LATIN_A        = UInt('a')
    LATIN_F        = UInt('f')

    return (DIGIT_ZERO <= c <= DIGIT_NINE) ? c - DIGIT_ZERO :
        (LATIN_UPPER_A <= c <= LATIN_UPPER_F) ? c - LATIN_UPPER_A + 10 :
        (LATIN_A <= c <= LATIN_F) ? c - LATIN_A + 10 :
        throw(ArgumentError("Not a hexadecimal number"))
end

function hex2bytes!(d::AbstractVector{UInt8}, s::AbstractVector{UInt8})
    i, j = start(s), 0
    # This line is important as this ensures computation happens in word boundary and not
    # byte boundary. Boundary computation can be almost 10 times slower
    n::UInt = 0
    c1::UInt = 0
    c2::UInt = 0
    while !done(s, i)
        n = 0
        c1, i = next(s, i)
        done(s, i) && throw(ArgumentError("source vector length must be even"))
        c2, i = next(s, i)
        n = number_from_hex(c1)
        n <<= 4
        n += number_from_hex(c2)
        d[j+=1] = (n & 0xFF)
    end
    return d
end