I have a matrix (cost_mat[~600,~600]) with only 1 & 0 as its elements. You can think that each element of the matrix is individually put as 1 (with probability p) or 0. I need to find out (cost_mat^N) where N(~600) is a large number. I don’t need the actual value of the elements in (cost_mat^N). I just need to know whether they are positive or not.(1 if +ve else 0,as all elements are anyway positive here so need not consider any negative values).
I didn’t use the Integer matrix as BLAS doesn’t use multicores then. Besides, the values tend to get quite high. So,I had to do the matrix power operation in step (with overflow check in each step).
Here is a sample of my code,
size=553
cost_mat=zeros(Float64, size,size);
for j in 1:1:size
for i in 1:1:size
cost_mat[i,j] = rand()< 0.7 ? 1 : 0;
cost_mat[j,i]=cost_mat[i,j];
end
end
cost_mat = cost_mat+LinearAlgebra.I;
println("cost matrix calculated")
c = copy(cost_mat);
N=48
step=24
for i in 1:1:N
println(i,"/",N," multiplication done ")
#After doing each power operation,I convert the values to 1 or 0
c = !=(0).(c*(cost_mat^(step)));
end
This operation takes around 10 seconds on a 32 core machine(BLAS seems to use only 20 cores tough 30 cores allotted). I have to repeat this calculation ~1000 times. So, this seems a quite difficult bottleneck.
Besides to assign 1(if element != 0) or 0 (if element is 0), I have checked equality. I think checking (>0.5 or <0.5) instead of equality would make more sense as we have floats here. (Anyway, the elements should have an integer value, so checking greater or less than 1 should also work),
Surprisingly this operation is significantly faster in Numpy. (though it uses all 32 cores)
Taking the power (cost_mat)^(N*step) is faster (is it because of the avoidance of fore loop & offloaded to BLAS?) but I need to take care of the overflow. So, doing it in steps seem safer to me.
I can’t think of any way to speed up this operation. I don’t have much experience with Julia. Any suggestion would help me a lot.
Thank you in advance.Have a nice day.