what, when did I say that? stop making strawman argument
- we know non-const globals cause problems
- some weird stuff is going on in the OP related to a non-const global
- solution: don't use non-const globals, because they're a bad idea in all cases anyway
Just pretend the first post says
“I wrote fun2, which is a good function, then I rewrote fun1, which passes the argument in a non-const global variable… I thought it should be much slower, but in fact it’s 3x faster. Why?”
Please tone it down here. No need to yell (type in all caps) or get short with anyone. Let’s focus on @dlakelan’s rephrased question here.
Love and peace
It’s not “wrong”, just not the best.
Thanks @StefanKarpinski and @jling. I feel like this is simply a miscommunication. Hopefully the rephrased question helps.
Can anyone else confirm the results with the same Julia version as the OP? Also, OP: what kind of computer are you running on (processor, OS, etc.)?
Are you still able to consistently reproduce these timings?
Easy heuristic: if a benchmark shows an elementary operation completing in far less than 1 ns, you are probably not measuring anything meaningful.
Thought experiment: If your CPU can make 10000 additions in 1.8 ns with serial code, what kind of frequency does it need to run at? (Assume one operation per clock cycle unless you happen to know a better approximation.)
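Working through the arithmetic of that thought experiment (10000 additions and 1.8 ns are the figures from the benchmark under discussion):

```julia
ops  = 10_000     # additions supposedly performed
t    = 1.8e-9     # measured time, in seconds
freq = ops / t    # clock rate required at one addition per cycle
# ≈ 5.6e12 Hz, i.e. roughly 5.6 THz, about a thousand times faster
# than the few GHz a real CPU runs at. So the loop body can't
# actually have executed; it must have been optimized away.
```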
Because after I did what they suggested, calling fun2($v) or declaring const v = rand(10000), there was no difference between them, just like @lmiq’s results. So it shouldn’t be a problem of version or platform.
I guess fun2 does something extra compared with fun1 when the non-constant v is passed to it. I don’t know what that is; could someone explain?
If I make both functions return s at the end, then I get the same timing:

@btime f1()
@btime f2($v)

Both take about 12 µs on my laptop.
So, I think the answer is probably that because the original functions return nothing, the loop is optimized away, and the difference you saw was just the difference between calling a one-argument function and calling a zero-argument function. That’s my guess.
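To make that concrete, here is a minimal sketch. The function bodies are an assumption, reconstructing the likely shape of the original fun1/fun2 (sum over a 10000-element vector); the only deliberate change is that each now returns s, so the work cannot be optimized away:

```julia
using BenchmarkTools

v = rand(10000)        # non-const global, as in the original post (assumed)

function fun1()        # reads the non-const global v
    s = 0.0
    for x in v
        s += x
    end
    return s           # returning s keeps the loop from being eliminated
end

function fun2(v)       # takes the vector as an argument
    s = 0.0
    for x in v
        s += x
    end
    return s
end

@btime fun1()
@btime fun2($v)        # $ interpolates v so @btime doesn't benchmark global access
```

With the return in place, both benchmarks measure the actual summation and should report comparable times.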
I think the difference is more a question of how effectively the two functions happen to be optimized down to doing nothing. You can see from the LLVM code in the original post that there are still remnants of doing something, which apparently takes a small amount of time, but that it clearly involves neither looping, nor summing anything.
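For anyone who wants to repeat that check, the optimized LLVM IR can be inspected directly (fun1 here stands for whichever zero-argument function was benchmarked in the original post):

```julia
using InteractiveUtils  # exports @code_llvm; loaded automatically in the REPL

# If the loop was eliminated, the printed IR will contain no loop
# structure, only the small remnants described above.
@code_llvm fun1()
```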
Yes, that seems to be it. As soon as you make them do useful work, by returning s at the end, they both do real stuff and take about the same time.