Speed of writing vs. reading elements in an array

I’m writing a performance-critical module. Even with @inbounds, reading elements (getindex) is about 50x faster than writing elements (setindex!) for two arrays of identical type and size. Both arrays are accessed with the same degree of irregularity.

Is there anything I can do to speed up writing to an array?

Stores are usually slower than loads, though usually not by this much. It’s almost impossible to give advice without more information about the code.

Here is a reduced version of my code:

function f!{T<:Real}(A::MyImmutableType, x::Vector{T}, y::Vector{T}, z::Vector{T}, nchunks::Integer)
    for chunk = 1:nchunks
        @inbounds for i = 1:128
            row_ind, col_ind = ind2sub(A.dims, A.ints[chunk][i])
            zval = z[A.ints[chunk][i]]
            xval = x[col_ind]
            yval = y[row_ind]
            y[row_ind] = yval + xval*zval
        end
    end
end

function f{T<:Real}(A::MyImmutableType, x::Vector{T}, y::Vector{T}, z::Vector{T}, nchunks::Integer)
    y_scalar = 0.0
    for chunk = 1:nchunks
        @inbounds for i = 1:128
            row_ind, col_ind = ind2sub(A.dims, A.ints[chunk][i])
            zval = z[A.ints[chunk][i]]
            xval = x[col_ind]
            yval = y[row_ind]
            y_scalar += yval + xval*zval
        end
    end
    return y_scalar
end

The first function takes ~1.5 seconds to run and the second ~0.03 seconds. Any thoughts?

You’ll see some slowdown because of aliasing issues, but it shouldn’t be this large (I’d expect maybe 4x).
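To illustrate the kind of aliasing cost meant here, a minimal sketch follows. It is not the code from the question: the function names `sum_into_store!` and `sum_into_local!` are made up for illustration, and the syntax assumes Julia 1.x. In the first version every iteration writes to `y`, and since `x` and `y` could in principle be the same array, the compiler may have to keep reloading and storing. The second version accumulates in a local variable and stores once.

```julia
# Hypothetical illustration: both functions add all elements of x into y[1].

# Writes to y on every iteration. Because x and y might overlap in memory,
# the compiler may be unable to keep the running sum in a register.
function sum_into_store!(y::Vector{Float64}, x::Vector{Float64})
    @inbounds for i in eachindex(x)
        y[1] += x[i]
    end
    return y
end

# Accumulates in a local variable and writes to y only once at the end.
# This gives the same result only when x and y do not alias.
function sum_into_local!(y::Vector{Float64}, x::Vector{Float64})
    s = y[1]
    @inbounds for i in eachindex(x)
        s += x[i]
    end
    y[1] = s
    return y
end

x = collect(1.0:100.0)
println(sum_into_store!([0.0], x)[1])
println(sum_into_local!([0.0], x)[1])
```

Whether the two versions differ measurably depends on the Julia and LLVM versions in use, so it is worth timing both rather than assuming a particular ratio.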

Also, you’ll probably get a better response if you post a complete runnable example (including the definition of MyImmutableType and the @time calls).
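As a rough sketch of what such a self-contained example might look like: the definition of `MyImmutableType` below is a guess (the real one was not posted), the problem sizes and random data are arbitrary, and the code is written for Julia 1.x, where `CartesianIndices` takes the place of `ind2sub`.

```julia
using Random

# Hypothetical stand-in for the type in the question; the real definition is unknown.
struct MyImmutableType
    dims::Tuple{Int,Int}        # (rows, cols) of the implied matrix
    ints::Vector{Vector{Int}}   # per-chunk linear indices into z
end

# Version that writes to y.
function f!(A::MyImmutableType, x::Vector{T}, y::Vector{T}, z::Vector{T},
            nchunks::Integer) where {T<:Real}
    cart = CartesianIndices(A.dims)
    for chunk in 1:nchunks
        idx = A.ints[chunk]
        @inbounds for i in 1:128
            k = idx[i]
            row_ind, col_ind = Tuple(cart[k])   # linear index -> (row, col)
            y[row_ind] += x[col_ind] * z[k]
        end
    end
    return y
end

# Version that only reads y and accumulates into a scalar.
function f(A::MyImmutableType, x::Vector{T}, y::Vector{T}, z::Vector{T},
           nchunks::Integer) where {T<:Real}
    cart = CartesianIndices(A.dims)
    y_scalar = zero(T)
    for chunk in 1:nchunks
        idx = A.ints[chunk]
        @inbounds for i in 1:128
            k = idx[i]
            row_ind, col_ind = Tuple(cart[k])
            y_scalar += y[row_ind] + x[col_ind] * z[k]
        end
    end
    return y_scalar
end

# Arbitrary test problem (sizes are assumptions, not taken from the question).
Random.seed!(1)
nrows, ncols, nchunks = 1000, 1000, 100
A = MyImmutableType((nrows, ncols), [rand(1:nrows*ncols, 128) for _ in 1:nchunks])
x = rand(ncols); y = zeros(nrows); z = rand(nrows * ncols)

# Call each function once first so that compilation time is not measured.
f!(A, x, y, z, nchunks); f(A, x, y, z, nchunks)
@time f!(A, x, y, z, nchunks)
@time f(A, x, y, z, nchunks)
```

With something like this, others can run both versions on their own machine and check whether they see the same gap.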
