I have created a function that computes the eigenvalues of large sparse matrices. I have implemented it on both CPU and GPU, using CUDA.jl for the portions of the code that were computationally expensive. Now I want to compare the two implementations, but I am not sure how to time the GPU version correctly.
I have created the following function in order to benchmark my operation:
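using MAT, SparseArrays, CUDA, TimerOutputs

function bench()
    file = matopen("path_to_file")
    Problem = read(file, "Problem")
    close(file)  # release the file handle once the data is loaded
    A::SparseMatrixCSC{FLOAT} = Problem["A"]
    @timeit to "RBL_gpu" CUDA.@time d, _ = RBL_gpu(A, 25, 10)
end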
I call it this way:
# Optional warm-up on a small matrix (uncomment to enable):
# d, _ = RBL_gpu(sprandn(FLOAT, 50, 50, 0.5), 1, 10);
to = TimerOutput();
bench()
I have noticed that if I first call my function with a small matrix, just to warm things up, the execution time of the real call improves. For example:
with the warm up: 12 seconds
without the warm up: 25 seconds
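To illustrate what I mean by a warm-up, here is a minimal CPU-only sketch. The function `work` is a hypothetical stand-in, not my actual `RBL_gpu`, and I am assuming the gap comes from Julia compiling the function on its first call:

```julia
# Hypothetical stand-in for RBL_gpu; defined only for this illustration.
work(n) = sum(abs2, rand(n, n))

t_first  = @elapsed work(50)   # first call: timing includes compilation
t_second = @elapsed work(50)   # second call: the compiled code is reused

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

On my understanding, the first timing mixes compilation and execution, while the second one measures execution only.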
So, my question is: which of the two timings is more accurate? Am I cheating with the "warm up", or is it good practice?
I am thinking that a user would need to call the function only once. Maybe I could create an interface and place the warm-up call inside it.
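The interface I have in mind could look roughly like the sketch below. All names here (`solve_with_warmup`, `toy_solver`) are hypothetical; in my case `solver` would be `RBL_gpu` and `small_input` a tiny matrix such as `sprandn(FLOAT, 50, 50, 0.5)`:

```julia
# Hypothetical interface: runs a throwaway call on a small input first,
# then the real call. `solver` stands in for RBL_gpu.
function solve_with_warmup(solver, small_input, A, args...)
    solver(small_input, args...)   # warm-up call; result is discarded
    return solver(A, args...)      # the call the user actually cares about
end

# Usage with a toy solver that just scales the sum of the entries:
toy_solver(M, scale) = scale * sum(M)
result = solve_with_warmup(toy_solver, ones(2, 2), ones(4, 4), 3)
println(result)
```

The user would still wait for the warm-up call, so I am not sure whether this only hides the cost rather than removing it.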
Sorry for the long text and thank you in advance.