Consider the following code, where I want to take a difference across columns of a matrix.
for j=1:(J-1)
Z[:,j] .= X[:,j].-X[:,J]
end
However, I want to allow the last term to be a column potentially different from J
. So instead I write
base = 3
jdx = setdiff(1:J,base)
for j in jdx
k = findall(jdx.==j)[1]
Z[:,k] .= X[:,j].-X[:,base]
end
My sense is that this is a stupid thing to do, but I am not sure how else to “re-index” the columns of Z
to range from 1:(J-1)
, rather than skipping the base
value.
Not surprisingly, the version of the code with findall()
performs much worse, especially as J
gets larger. See full code below.
Does anyone know of a smarter way to “re-index” in the way I’ve written? Thank you.
using Random
Random.seed!(1234)
function Zdiff(N::Int64=50,J::Int64=10,base::Int64=J)
base = 3
jdx = setdiff(1:J,base)
X = rand(N,J)
Z = Array{Float64,2}(undef, N, J-1)
for j in jdx
k = findall(jdx.==j)[1]
Z[:,k] .= X[:,j].-X[:,base]
end
return nothing
end
function Zdiff1(N::Int64=50,J::Int64=10)
jdx = 1:(J-1)
X = rand(N,J)
Z = Array{Float64,2}(undef, N, J-1)
for j in jdx
Z[:,j] .= X[:,j].-X[:,J]
end
return nothing
end
# don't time first call
Zdiff(50,10,3)
Zdiff1()
println("J = 10")
@time Zdiff(50,10,3)
@time Zdiff1()
println("J = 20")
@time Zdiff(50,20,3)
@time Zdiff1(50,20)
#J = 10
# 0.000143 seconds (76 allocations: 57.281 KiB)
# 0.000025 seconds (24 allocations: 16.625 KiB)
#J = 20
# 0.015504 seconds (139 allocations: 119.195 KiB)
# 0.000069 seconds (45 allocations: 34.141 KiB)