julia> f() = addone(3)
f (generic function with 1 method)
julia> @code_llvm f()
; Function f
; Location: REPL[21]:1
define i64 @julia_f_241591136() {
top:
ret i64 4
}
What you did was equivalent to writing
code_llvm(addone, Tuple{Int})
which doesn’t give the compiler any hint about the value of the argument.
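A minimal sketch of that distinction (assuming Julia ≥ 1.0, where `code_llvm` lives in `InteractiveUtils`, and an `addone` defined as below):

```julia
using InteractiveUtils

addone(x) = x + 1

f() = addone(3)                  # the literal 3 is part of f's body

# Value unknown: the IR must actually compute x + 1.
code_llvm(addone, Tuple{Int})

# Value known: constant propagation can fold addone(3) down to `ret i64 4`.
@code_llvm f()
```

The macro form sees the call `f()` and therefore compiles `f`, where the constant is visible; the function form only ever sees the argument *types*.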
it calls rem, but if I wrap it in some main function, it becomes constant again:
function main()
    x = 3
    addone(x)
end
Does this affect the @btime addone($x) and @time addone(x) macros?
So, should I put all my code inside some main() or f() function with no input arguments, and NOT benchmark any functions with global input arguments?
The Tuple{..., ...} is a list of the argument types to the function.
If it is not const, the type is not known because you can change it at any point (e.g. by doing x = 3.0). If you declare it const then it works fine:
julia> const x = 3
3
julia> f() = addone(x)
f (generic function with 1 method)
julia> @code_llvm f()
; Function f
; Location: REPL[3]:1
define i64 @julia_f_312611515() {
top:
ret i64 4
}
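For contrast, a sketch of the non-const case (the names `g_val`, `c_val`, `g`, `h` are illustrative, not from the thread): with a plain global, the compiler must assume both the value and the type can change at any time.

```julia
g_val = 3                  # non-const global: value and type may change later
g() = g_val + 1            # compiles to a dynamic lookup, no folding

const c_val = 3            # const global: the binding's type is fixed
h() = c_val + 1            # the compiler can fold this to `ret i64 4`

g(), h()                   # both return 4, but via very different code
```

`@code_llvm g()` versus `@code_llvm h()` makes the difference visible: the former goes through generic, boxed code, the latter is a single constant return.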
Which version are you using? We are not very good at recognizing and optimizing out pure functions in general, but this case shouldn’t have that problem, and on 1.0 no rem is called.
What the compiler can’t do, though, is remove the error check. y % x throws an error when x is 0, and that cannot be optimized out.
That’s unrelated.
That’s just how the code_native function should be used.
No, and none of your examples, above or below, uses global variables anyway…
Sure, I know you are talking about benchmarks, but none of the code above uses global variables… What decides whether the compiler does constant propagation is whether the value arrives as a function input or is locally known inside the function…
There are really multiple problems you are seeing, and all I’m complaining about, @kristoffer.carlsson, is that you bring up many issues that unnecessarily complicate the discussion…
For a start, code_native (either the function or the macro) does not show you the code that corresponds to the expression you give it; it shows the function you’ll call with it. So whether the input to the macro is a constant or not will not change the result.
Secondly, because of this, in no part of any of the code above do global variables play any role. You aren’t seeing any issue caused by type instability, so bringing it up is just complicating things… It is a thing to be careful about when doing benchmarks, but @code_native isn’t a benchmark and you aren’t running the code.
It should go without saying that putting it in a function (it already is) does not change the result.
And now the real question. This shouldn’t be the case. Are you just referring to the rem in the debug info, or are you actually seeing a call to the rem function? If it’s the former, it’s just the error issue I mentioned above; if it’s the latter, what’s your version and what’s the code you see?
And finally, something unrelated to the main issue but worth bringing up: none of the local keywords or the ::Int / ::Int64 annotations on your local variables are doing anything, and you should get rid of them. There are cases where using them is good, but in this case they just make the code harder to read (if you come from C and like explicit variable declarations that’s fine, but it won’t help the compiler).
And as for why code_native for addone and f() = addone(3) differ with respect to the rem, but not in C: I believe % does not have side effects in C (integer division by 0 is UB in C). However, that is not the case in Julia (integer division by 0 always throws an error). This means that in C a % whose result is not used is a no-op and can be removed, but in Julia that is impossible unless you can prove that there is no error (edit: the error check can’t be removed, but the actual divide will be). This is why, when you check for that explicitly, there will be no code left for the rem even if the input is not known.
julia> function addone(x::Int)
           local y::Int = 10
           local z::Int = 3
           if x != 0
               y = y % x
           end
           x += 1
           x
       end
addone (generic function with 1 method)
julia> @code_llvm addone(3)
; Function addone
; Location: REPL[1]:2
define i64 @julia_addone_36031(i64) {
top:
; Location: REPL[1]:7
; Function +; {
; Location: int.jl:53
%1 = add i64 %0, 1
;}
; Location: REPL[1]:8
ret i64 %1
}
I also don’t think you are actually seeing a rem call in any case, since:
Unless you disabled inlining, the Julia function rem should be inlined.
There isn’t a C function rem that you’d be calling for this. On all supported platforms it should be implemented as (inlined) assembly by the LLVM backend, and you shouldn’t see any call.
This also suggests that, since you are probably not super familiar with assembly, you should probably use code_llvm for this. (This is a case where inference, i.e. code_warntype, won’t give you enough info, but code_llvm is more than enough.)
Seems like I took the wrong operations to test on local variables…
By the way, I’m running Julia 0.7, so it’s not about the version; it’s just the rem exception handling.
And here, there’s no rem function call (read: no divide is calculated) in your code.
The “Function rem” here is just debug info telling you that code was inlined from the rem function. So, depending on what you mean by “rem is still there”:
There’s no rem call.
There’s no division.
The part of the rem function that has a side effect that doesn’t exist in C (the error check) is still there.
So the compiler is equally good at removing no-ops, but the definition of a no-op is different, and the safer semantics prevented the removal of a branch here (which should be pretty cheap anyway…)
Yes, pretty much… And you should be able to see the difference by just calling addone(0)…
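Concretely, a sketch of that difference (a simplified version of the guarded addone from above, plus a hypothetical unguarded variant named addone_unguarded):

```julia
function addone(x::Int)            # guarded: never divides by zero
    y = 10
    if x != 0
        y = y % x
    end
    x + 1
end

function addone_unguarded(x::Int)  # hypothetical variant without the check
    y = 10 % x                     # result unused, but the error check must stay
    x + 1
end

addone(0)                          # returns 1, no error
# addone_unguarded(0)              # would throw DivideError
```

The guard is exactly the proof the compiler needs to drop the rem entirely; without it, the DivideError path is observable behavior and has to survive optimization.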
rand(Int) cannot be optimized out; it changes the state of the global RNG. There isn’t anything defined about this, and there won’t be. Things that call external functions will generally have a hard time being optimized out.
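A sketch of why the discarded value doesn’t matter (using the default global RNG; the function name g is illustrative):

```julia
using Random

function g()
    rand(Int)        # value discarded, but the global RNG state advances
    return 1
end

Random.seed!(42)
a = rand(Int)        # first draw after seeding

Random.seed!(42)
g()                  # silently consumes one draw as a side effect
b = rand(Int)        # now the *second* draw, so it differs from `a`

a != b               # g() visibly changed program state
```

Since that state change is observable by later code, the compiler can’t delete the call even though its return value is never used.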
No. And again, the fact that you are not seeing a function call to it means that it’s not that complex.
And no, just don’t do that…
As I said, it seems that you don’t actually read assembly all that well, so please use code_llvm.
It is never, ever a good idea to just count how many lines there are in basically any code to judge what optimization is done. You fell into this trap once and now again…
You are just seeing the bounds check, which for one reason or another isn’t being optimized out, since LLVM’s range analysis isn’t super happy about the mix of signed and unsigned comparisons…
And again, the “longer” code you see (or at least what I see from your code) is just debug info. When you have two setindex! calls, the first one throws the bounds error instead of the second, so you end up with the debug info of two functions instead of one. Again, stop counting lines and actually read the code. The debug info shows perfectly where you can find the bounds check and the store.
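If it’s really the bounds-check branch you want gone, a hedged sketch (the function name set2! is illustrative): @inbounds promises the compiler the indices are in range, which removes the check but makes out-of-range access undefined behavior, so use it only where you can guarantee the bounds yourself.

```julia
function set2!(a::Vector{Int})
    @inbounds begin
        a[1] = 1     # no bounds-check branch emitted for these stores
        a[2] = 2
    end
    return a
end

set2!(zeros(Int, 2))   # → [1, 2]
```

Comparing `@code_llvm` for this against the plain version shows the bounds-error branches disappear, which is a far more reliable signal than counting lines of output.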