Skipping over loop index

Consider the following code, where I want to take a difference across columns of a matrix.

    for j=1:(J-1)
        Z[:,j] .= X[:,j].-X[:,J]
    end

However, I want to allow the last term to be a column potentially different from J. So instead I write

    base = 3
    jdx  = setdiff(1:J,base)
    for j in jdx
        k = findall(jdx.==j)[1]
        Z[:,k] .= X[:,j].-X[:,base]
    end

My sense is that this is a stupid thing to do, but I am not sure how else to “re-index” the columns of Z to range from 1:(J-1), rather than skipping the base value.

Not surprisingly, the version of the code with findall() performs much worse, especially as J gets larger. See full code below.

Does anyone know of a smarter way to “re-index” in the way I’ve written? Thank you.

using Random
Random.seed!(1234)

function Zdiff(N::Int64=50,J::Int64=10,base::Int64=J)
    base = 3
    jdx  = setdiff(1:J,base)

    X = rand(N,J)
    Z = Array{Float64,2}(undef, N, J-1)

    for j in jdx
        k = findall(jdx.==j)[1]
        Z[:,k] .= X[:,j].-X[:,base]
    end
    return nothing
end

function Zdiff1(N::Int64=50,J::Int64=10)
    jdx  = 1:(J-1)

    X = rand(N,J)
    Z = Array{Float64,2}(undef, N, J-1)

    for j in jdx
        Z[:,j] .= X[:,j].-X[:,J]
    end
    return nothing
end

# don't time first call
Zdiff(50,10,3)
Zdiff1()

println("J = 10")
@time Zdiff(50,10,3)
@time Zdiff1()

println("J = 20")
@time Zdiff(50,20,3)
@time Zdiff1(50,20)

#J = 10
#  0.000143 seconds (76 allocations: 57.281 KiB)
#  0.000025 seconds (24 allocations: 16.625 KiB)
#J = 20
#  0.015504 seconds (139 allocations: 119.195 KiB)
#  0.000069 seconds (45 allocations: 34.141 KiB)

Yeah, using findall definitely seems unnecessary. Perhaps I’m missing something, but why not just increment the column of Z by one each for each index of j which is not equal to base. For example:

julia> function Zdiff2(N = 50, J = 10, base = J)
         X = randn(N, J)
         Z = Array{Float64, 2}(undef, N, J - 1)
         z_col = 1
         for j in 1:J
           if j != base
             @views Z[:, z_col] .= X[:, j] .- X[:, base]
             z_col += 1
           end
         end
         Z
       end
julia> using BenchmarkTools

julia> @btime Zdiff2($50, $20);
  7.227 μs (59 allocations: 18.17 KiB)

Note also that I’m using @views in my code, since otherwise X[:, j] creates a copy of that slice of the matrix (slicing is always a copy for Julia Arrays). With @views, it instead creates a view into the same matrix, avoiding a bunch of unnecessary copying.

2 Likes

That did it! Thank you.

I’m surprised that it still takes more resources to skip the column index. On my machine I got

  7.182 μs (40 allocations: 33.91 KiB)
  13.679 μs (59 allocations: 18.17 KiB)

for @btime respectively using my Zdiff1($50, $20) and your Zdiff2($50, $20).