Why is the sum of the measured times of each function not equal to the total run time?

Here is the code. I’m just providing the code so you can execute it and observe the issue that I am talking about.

using SparseArrays, LinearAlgebra

A = sparse([0 1 0 1; 1 0 1 0; 0 1 0 0; 1 0 0 0])

sm_vec = rand(4, 5)

k = 3

function Filter(sm_vec, k, A)
    @time begin
        N = size(sm_vec, 2)
        sz = size(A, 1)
        V = zeros(size(A, 1), N)

        val = sum(A, dims = 1) .^ (-0.5)
        I2 = 1:sz
        D = sparse(I2, I2, sparsevec(val))
        A = D * A * D
        on = ones(Int, size(A, 1))

        for iter in 1:N
            sm = view(sm_vec, :, iter)
            for loop in 1:k
                sm = A * sm
            end
            sm_ot = sm - ((dot(sm, on) / dot(on, on)) * on)
            sm_norm = sm_ot ./ norm(sm_ot)
            V[:, iter] = sm_norm
        end
    end # @time
    return V
end # end of function

@time result = Filter(sm_vec, k, A)

0.000012 seconds (37 allocations: 3.531 KiB)
0.000824 seconds (149 allocations: 8.547 KiB)

Why are the measured times not the same? The purpose of this question is that I am trying to speed up the code, but I need to figure out which part takes the most time. I put @time begin…end in several places throughout the code, but surprisingly I noticed that the sum of those measured times inside the function is not equal to the measured time of calling the function.

Are you measuring compilation time?
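A quick way to check this, sketched with a hypothetical toy function `f` (not from the original code): the first call to a Julia function triggers JIT compilation, which shows up in an outer measurement, while a timer started inside the already-running function only sees the run time.

```julia
# Hypothetical toy function, just to demonstrate compilation overhead.
f(x) = sum(x .^ 2)

v = rand(1000)
t1 = @elapsed f(v)   # first call: includes JIT compilation
t2 = @elapsed f(v)   # second call: already compiled, much faster
println("first call:  $t1 s")
println("second call: $t2 s")
```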


Can you show how you are measuring the timings? Can you include the benchmarking code?


I just edited the post to show how I measure the time. I know that tic-toc is not the best way, but I’m just trying to find out which part of my code is slow so I can work on speeding it up.

I use tic-toc to measure time. I guess it includes the compilation time as well.

tick and tock are not part of base Julia. Where are you getting them from?

In any case, if they work anything like tic and toc in Matlab, then tock gives the time since the last tick call. When your final call to tock happens outside of the loop, when was the last tick call? Hint: It’s not the tick before the loop.

Don’t use tick/tock to measure timing. It’s not a good idea in Matlab because of exactly this problem, and there are better tools for this in Julia like @elapsed.
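For instance, a minimal sketch using a stand-in `work()` function (the real code would go in its place): `@elapsed` returns the elapsed seconds as a `Float64` instead of printing them, and `time_ns()` gives direct access to a monotonic nanosecond clock.

```julia
# Stand-in workload (hypothetical).
work() = sum(sqrt.(1:10^6))

work()                # warm-up call so compilation is excluded
t = @elapsed work()   # seconds, returned as a Float64

s = time_ns()         # or use the monotonic nanosecond counter directly
work()
dt = time_ns() - s
println("elapsed: $t s ($dt ns)")
```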

function f1()
    sleep(1)
    return 1
end

function f2()
    sleep(2)
    return 2
end

function f3()
    sleep(3)
    return 3
end

function test(N)
    local f1Total = 0
    local f2Total = 0
    local f3Total = 0

    local started = time_ns()
    for loop = 1:N
        local s = time_ns()
        a = f1()
        local t = time_ns() - s
        f1Total += t
        # println("F1: $t ns")

        local s = time_ns()
        b = f2()
        local t = time_ns() - s
        f2Total += t
        # println("F2: $t ns")

        local s = time_ns()
        c = f3()
        local t = time_ns() - s
        f3Total += t
        # println("F3: $t ns")
    end
    local total = time_ns() - started
    local overhead = total - f1Total - f2Total - f3Total

    println("F1 Total: ($N): $f1Total ns")
    println("F2 Total: ($N): $f2Total ns")
    println("F3 Total: ($N): $f3Total ns")
    println("Total:    ($N): $total ns")
    println("Overhead: ($N): $overhead ns")
end

On my machine I’m seeing around 3000 ns to 4000 ns of overhead, which would be the calls to time_ns() along with the math (addition/subtraction) to track the times. BTW, println is fairly “expensive”, so if you are printing the time inside the loop that will add more overhead. On my machine, uncommenting the println(s) inside the loop boosts the overhead to a bit over 2 million nanoseconds.

Here are my results without println in the loops:

F1 Total: (10): 10021025897 ns
F2 Total: (10): 20017349021 ns
F3 Total: (10): 30031619654 ns
Total:    (10): 60069998184 ns
Overhead: (10): 3612 ns

To the others on this thread: there appears to be a package,
https://github.com/schmrlng/CPUTime.jl
that has the tic()/tock() functions…

Just to check, you get those values in the second run, right?

I think that the problem is not the use of tick() and tock(), since I get the same results using other methods like @time. I still see a big difference between the sum of the timings of the functions (or chunks of code) and the total run time of the function. Basically, when I measure the elapsed time of the code inside the function, it is a lot less (>2x) than the total run time.

Even if I don’t add anything extra, I still get a higher total run time compared to just measuring each chunk of code.

Yes.

It seems that you still didn’t answer anything about the overhead. What’s your N?

N is small (e.g. 5). In each iteration I multiply a sparse matrix by a vector three times (updating the vector each time). How could this be related to my question, though? What exactly do you mean by overhead?

It is already described in detail above.

There’s nothing magical there. The only thing that @time does is get the current time. You can just print out the 32 time points and see where the missing time is. If you want others to help you figure out the exact cause, you will have to give much more detail about your code so that others can actually run it.
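A sketch of that approach, with stand-in chunks (the real code would replace the `rand` and multiplication lines): record a checkpoint after each chunk, then print the deltas so any “missing” time between measured regions becomes visible.

```julia
# Checkpoint each chunk with time_ns() and print the deltas.
function instrumented()
    t0 = time_ns()
    a = rand(500, 500)          # chunk 1: stand-in for the setup code
    t1 = time_ns()
    b = a * a                   # chunk 2: stand-in for the main loop
    t2 = time_ns()
    println("chunk 1: $(t1 - t0) ns")
    println("chunk 2: $(t2 - t1) ns")
    println("total:   $(t2 - t0) ns")
    return t2 - t0
end

total = instrumented()
```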


Maybe because the functions tick() and tock() themselves also take some time.

I just provided the code, which can be executed to see the problem.

Errrr, you have a whole loop after the inner time measurement. Why do you expect it to cost no time?

The top post was edited, so this doesn’t make sense anymore. The original code I was replying to was:


function Filter(sm_vec, k, A)
    @time begin
        N = size(sm_vec, 2)
        sz = size(A, 1)
        V = zeros(size(A, 1), N)

        val = sum(A, dims = 1) .^ (-0.5)
        I2 = 1:sz
        D = sparse(I2, I2, sparsevec(val))
        A = D * A * D
        on = ones(Int, size(A, 1))
    end # @time

    for iter in 1:N
        sm = view(sm_vec, :, iter)
        for loop in 1:k
            sm = A * sm
        end
        sm_ot = sm - ((dot(sm, on) / dot(on, on)) * on)
        sm_norm = sm_ot ./ norm(sm_ot)
        V[:, iter] = sm_norm
    end

    return V
end # end of function

@time result = Filter(sm_vec, k, A)

Which has the inner time measurement not enclosing the whole function body.
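A minimal reproduction of that shape (the workload is hypothetical; the point is only that the loop sits outside the inner timer):

```julia
# The inner timer stops before the loop, so the loop's cost shows up
# only in the outer measurement, just like in the original post.
function partial_timing()
    inner = @elapsed begin      # stands in for the inner @time block
        x = rand(10_000)        # only this setup is measured
    end
    s = 0.0
    for i in 1:10^6             # this loop is NOT covered by the inner timer
        s += x[i % 10_000 + 1]
    end
    return inner
end

outer = @elapsed (inner = partial_timing())
println("inner: $inner s, outer: $outer s")   # outer includes the untimed loop
```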


Also, the time here is nowhere close to 1-2 seconds for about 5 loops, so the overhead explained above is going to be important. If the 1-2 second time measurement does not correspond to the 5 loops, please provide more of the relevant details for the same setup, rather than answering different questions based on different experiments without explicitly mentioning the difference. There’s no way for others to figure it out otherwise…

If you are not just complaining about 0.000824 being very different from 0.000012, please explain in full detail what you are actually talking about instead.

Also, as an aside, it would be better if you posted updates in a comment, or at least kept the old version, instead of just editing the top post like this… now it’s impossible for people reading this thread to make sense of a lot of the previous posts from other people without looking at your edit history…

I edited the post. The results are for the current version.

Sure, I will post updates in the future instead of editing the post. This was actually my first post. My exact question is: why is there a difference between 0.000824 and 0.000012?