How much is it normal that @time differs in time?


#1

I have a question regarding @time in julia, I hope this might make me understand @time better.

How much is it usual that @time differs when running the code over and over? I thought it should be around 0.1-0.2 sec difference in maximum, but when I run:

@time nbinclude("mynotebook.ipynb") ``` multiple times I get everything from:
```0.010195 seconds (27.85 k allocations: 1.591 MB)```
to:
```0.969294 seconds (27.84 k allocations: 1.590 MB)```

I want to know if this is normal behaviour before I put a lot of time trying to "fix" my code. I just thought it seems very strange that it can differ this much. How much is it normal for the time to differ when you're using @time?

#2

Is the 0.969 seconds the first time you ran the code?
Code has to compile the first time you run it.

Have you tried:

using BenchmarkTools
@benchmark nbinclude("mynotebook.ipynb")

That will run your code some large number of times and give you the minimum, median, mean, and max.
Should give you an idea of what typical runs look like.
Normally the minimum, median, and mean are all really close together. If they are, that means it is almost always running at close to the minimum speed (the mean is sensitive to outliers).
For example,

julia> @benchmark randn(50)
BenchmarkTools.Trial: 
  memory estimate:  544 bytes
  allocs estimate:  1
  --------------
  minimum time:     722.586 ns (0.00% GC)
  median time:      747.432 ns (0.00% GC)
  mean time:        792.537 ns (1.95% GC)
  maximum time:     6.679 μs (76.11% GC)
  --------------
  samples:          10000
  evals/sample:     140

The maximum time is over 8 times longer than the mean, yet still nothing to worry about. The average time is barely longer than the median, so the distribution isn’t very skewed, and both central measures are close to the minimum.

Finally: if you really like @time, BenchmarkTools offers @btime.


#3

Unfortunally no, I run the code many times and it differed everytime (between those times), it went up and down randomly. Thank you, I’ll try @benchmark and see if it helps!


#4

I’ll step away and let others who know a lot more than me answer. Without more info, I have almost no idea, and the best I can offer is a couple stabs in the dark:

  1. Do you have a lot of iterative algorithms? Like numerical minimizations from random inits?
  2. A lot of generated functions? The docs warn that they can sometimes recompile.

#5

This can have so many reasons: improper benchmarking, a property of your algorithm, or a bug in a library that you are using, … and countless other reasons. Impossible to say without a MWE. If you expect help, please invest effort in making one.


#6

I will, I just thought I should ask first if it was something strange before investing more in the problem. But it seems like it shouldn’t differ that much then?


#7

If you want an answer to that question, you will have to post a minimal working example. For all we know, your code might contain something like sleep(rand()).


#8

Honestly, how do you expect us to answer this? We have no idea what’s in mynotebook.ipynb.


#9

I have another question kind of on the same topic (with code). I have the code:

center=zeros(9)
@time @inbounds for m=1:800
    r1=11
    r2=12
    r3=13
    r12=3
    r23=4
    r13=5
    C1=r1+(1/3)*r12
    C2=r1+(2/3)*r12
    C3=r2+(1/3)*r23
    C4=r2+(2/3)*r23
    C5=r1+(1/3)*r13
    C6=r1+(2/3)*r13
    a1=1/3*(C1+C5+r1)
    a2=1/3*(C1+C2)
    a3=1/3*(C2+C3)
    a4=1/3*(C2+C3)
    a5=1/3*(C3+C4)
    a6=1/3*(C1+C5)
    a7=1/3*(C5+C6)
    a8=1/3*(C4+C6)
    a9=1/3*(C4+C6+r3)
    center2=[a1; a2; a3; a4; a5; a6; a7; a8; a9] 
end
end```

And when I run: ```@time testTime()``` I get the following times:
``` 0.009107 seconds (8.00 k allocations: 237.500 KB)
  0.046527 seconds (18.48 k allocations: 723.088 KB)```

I have also tried to do the same in Matlab and then the times are exactly the same. Have I misunderstood how to time a for-loop using @time or is it something else that makes the time for the entire function longer than for the loop (the function consists of the for-loop so I thought it would be pretty much the same)? Does it take that long to create a function or why isn't the times around the same?

#10

Did you run it twice? The first time it compiles the function.


#11

Yes, I’ve tried running it multiple times and get the same behaviour. But now it seems I get even bigger difference:
` 0.000135 seconds (8.00 k allocations: 237.500 KB)
0.020507 seconds (18.48 k allocations: 723.041 KB)
``
So, the loop really doesn’t seem to take that much time so I don’t understand what takes time?


#12

What happens if you put the loop in a separate function?


#13

I’m not quite sure what you mean since the entire function is basically just the loop, I thought I had kind of done that? But I tried to extract the loop and put it in a separate function and called that function inside testTime() function and the result was the same. But maybe that wasn’t what you meant?


#14

Yet another possibility is that you’re operating on uninitialized data. Subnormal numbers can randomly occur in the calculations in this case, and operations on those are much slower. But yes, post you code and we can help you better.


#15

It is vastly easier to use BenchmarkTools for your use-cases.
Consider the following:

julia> using BenchmarkTools

julia> function testTime()
    center=zeros(9)
    @inbounds for m=1:800
        r1=11
        r2=12
        r3=13
        r12=3
        r23=4
        r13=5
        C1=r1+(1/3)*r12
        C2=r1+(2/3)*r12
        C3=r2+(1/3)*r23
        C4=r2+(2/3)*r23
        C5=r1+(1/3)*r13
        C6=r1+(2/3)*r13
        a1=1/3*(C1+C5+r1)
        a2=1/3*(C1+C2)
        a3=1/3*(C2+C3)
        a4=1/3*(C2+C3)
        a5=1/3*(C3+C4)
        a6=1/3*(C1+C5)
        a7=1/3*(C5+C6)
        a8=1/3*(C4+C6)
        a9=1/3*(C4+C6+r3)
        center2=[a1; a2; a3; a4; a5; a6; a7; a8; a9]
    end
end

julia> @benchmark testTime()
BenchmarkTools.Trial:
  memory estimate:  237.66 KiB
  allocs estimate:  8001
  --------------
  minimum time:     112.423 μs (0.00% GC)
  median time:      193.016 μs (0.00% GC)
  mean time:        208.401 μs (7.94% GC)
  maximum time:     5.176 ms (81.85% GC)
  --------------
  samples:          10000
  evals/sample:     1

As you can see, this code does not vary much, each time it is executed (compare median and mean).

Your code would be much easier to read if you indented it properly.
Also, please put your whole code in triple backticks, not only single lines. This renders the code as a block which scrolls if its too long.


#16

For comparison, here’s what happens when I try the above on my system. The testTime() function was copy-pasted from into the REPL just before running the below tests. (Julia 0.6.0)

julia> @time testTime()
  0.008214 seconds (11.44 k allocations: 421.239 KiB)
  0.214811 seconds (230.86 k allocations: 12.083 MiB)

julia> @time testTime()
  0.000160 seconds (8.00 k allocations: 237.500 KiB)
  0.000277 seconds (8.06 k allocations: 239.125 KiB)

The first run includes compilation so it’s the second run that’s of interest. There seems to be about 0.0001 seconds of overhead for the outer call to `@time’. This difference stays constant if the number of iterations in the inner loop is changed.

However, if I run @time testTime a large number of times (about 20 or so), then at one point both the inner and outer timings are much slower. Maybe the JIT-compiler decides it’s worth the effort to try some optimizations at this point (?) because afterwards the timings are faster. (Increasing the number of iterations in the loop makes this happen sooner.)

julia> @time testTime()
  0.000198 seconds (8.00 k allocations: 237.500 KiB)
  0.000319 seconds (8.06 k allocations: 239.094 KiB)

julia> @time testTime()
  0.055488 seconds (8.00 k allocations: 237.500 KiB, 99.58% gc time)
  0.055641 seconds (8.06 k allocations: 239.125 KiB, 99.30% gc time)

julia> @time testTime()
  0.000088 seconds (8.00 k allocations: 237.500 KiB)
  0.000214 seconds (8.06 k allocations: 239.125 KiB)

EDIT: Reading the printout that I posted myself, I see that it’s the garbage collector that slowed down the 20:th or so run. Still interesting that the runtime got faster after that.


#17

I’ve looked into the same code as @shorty66 posted, but profiled it without GC:

gc_enable(false)
times = [@elapsed(testTime() for _ in 1:100000])
using PyPlot
plt[:hist](log.(times), 20)

(x-axis is log time)

That’s a more significant spread than I expected. Perhaps attributable to CPU-cache access-time varation? Other takeaways:

  • [a1; a2; a3; ...] vs. [a1, a2, a3, ...] gives the same result, but has no impact on running time (good)
  • All numbers in this example are known and computable at compile time, but Julia doesn’t identify this. Replacing the loop’s content with just center2=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0] cuts the running time in half. Maybe this should be an issue.
  • Taking out the center and center2, Julia/LLVM correctly identifies that nothing happens in this function, and just turns it into return nothing

EDIT: had mistaken gc(false) -> gc_enable(false). Updated figure.


#18

Another figure plotting the running times; sorry if this is getting a bit OT:

Since these are all without GC, I’m assuming that this shows each level of the memory hierarchy getting filled up. I wish someone more knowledgeable than me could explain the horizontal streaks!


#19

I’m sorry but with this code I didn’t wonder why it differed from one running time to another but why the two times you got from one run differed. I included @time in the for-loop to compare what happened with the time for only the for-loop and for the entire code, and since the function only consists of the for-loop I thought these should be the same but strangely they differed.


#20

When I run it the first time I get:

  1.289053 seconds (553.28 k allocations: 21.203 MB, 0.66% gc time)```
Which is very much different in time, but still maybe not that interesting since it's compiling.

But every run I get something like what I wrote before, now I got this for example:
  ```0.000173 seconds (8.00 k allocations: 237.500 KB)
  0.008580 seconds (8.31 k allocations: 251.469 KB)```

I've run it very many times but I don't seem to reach your result. It still seems you get more similar times for the function and the loop, I use Julia 0.5, could that be the reason?