I was just comparing the performance of a hand-coded lifted version of, say, addition for Nullables with the version that is now in Base using the dot notation. My benchmark code looks like this:
using BenchmarkTools

a = Nullable(2)
b = Nullable(4)
@benchmark $a .+ $b

function my_addition{T<:Number}(a::Nullable{T}, b::Nullable{T})
    if isnull(a) || isnull(b)
        return Nullable{T}()
    else
        return Nullable(get(a) + get(b))
    end
end
@benchmark my_addition($a, $b)
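For reference, the scalar dot call lowers to a broadcast call, so the benchmark really compares Base's Nullable broadcast method against the direct hand-written lift:

    a .+ b   # lowers to broadcast(+, a, b)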
Using the dot notation leads to performance that is about an order of magnitude worse than the hand-coded version. Here are my benchmark results, first using the dot notation:
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 65.146 ns (0.00% GC)
median time: 70.344 ns (0.00% GC)
mean time: 75.946 ns (0.00% GC)
maximum time: 311.660 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 987
time tolerance: 5.00%
memory tolerance: 1.00%
And for my hand-coded version:
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 7.546 ns (0.00% GC)
median time: 7.849 ns (0.00% GC)
mean time: 8.439 ns (0.00% GC)
maximum time: 34.414 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
time tolerance: 5.00%
memory tolerance: 1.00%
Is there any hope that the dot version might become as fast as a hand-coded version? Or is this a known limitation that can’t really be solved?
I was hoping that I could use the dot notation in Query.jl to deal with Nullables, but that won’t really work with this kind of performance characteristic…
PS: And yes, I’m aware of https://github.com/JuliaLang/Juleps/pull/21, but I couldn’t really get an answer to my (narrow) question here about the performance of the dot operation on Nullables.
PPS: CC @nalimilan, who probably just knows the answer.
When https://github.com/JuliaLang/julia/pull/16961 was merged, the dot syntax was as performant as the hand-written version (actually more so, since it avoids a branch). Something must have happened in the meantime. Can you file an issue?
I looked into it, and I believe a codegen regression/change caused this problem, though I can’t identify which one. I have a quick patch that fixes the regression by changing the Nullable broadcast code.
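For context, the branch-free lifting mentioned above computes the result unconditionally and carries the combined null flag along, instead of branching on isnull. A rough sketch using the 0.6-era Nullable API (unsafe_get and the two-argument Nullable constructor); this is not the actual Base or patch code:

    # Sketch only: no branch on isnull; the two-argument constructor
    # records whether the unconditionally computed value is valid.
    function lift_branchfree{T<:Number}(a::Nullable{T}, b::Nullable{T})
        # unsafe_get reads the value slot without checking the null flag;
        # this is only safe for isbits element types such as Int.
        Nullable(unsafe_get(a) + unsafe_get(b), !isnull(a) & !isnull(b))
    end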
After the patch, I get:
julia> @benchmark Nullable(1) .+ Nullable(2)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 1.338 ns (0.00% GC)
median time: 1.357 ns (0.00% GC)
mean time: 1.452 ns (0.00% GC)
maximum time: 14.030 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
time tolerance: 5.00%
memory tolerance: 1.00%
which I hope you’ll find more pleasant.
The PR is taking slightly longer to finish than I’d like due to some stray segmentation faults and other concerns. I should have it up for review by the end of the day.
That looks fantastic!
Are there any caveats about using this for lifting in something like Query that I should know about? To me this looks as if it more or less solves the whole lifting debate we had, but I’m a bit worried that I’m overlooking something (because why did we even have https://github.com/JuliaLang/Juleps/pull/21 otherwise?).
In terms of this particular syntax and performance, I’d say there aren’t really any caveats. The unfortunate thing is that we don’t have a nice way to broadcast dot calls, i.e. it’s not easy to do x ..+ y if we have an Array{<:Nullable}.
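For now the workaround is to spell out the nested broadcast yourself: broadcast a function over the arrays that performs the scalar Nullable lift internally. A small example (xs and ys are just illustrative values):

    # Lift elementwise over arrays of Nullables by broadcasting a
    # function that itself does the scalar Nullable lift.
    xs = [Nullable(1), Nullable{Int}(), Nullable(3)]
    ys = [Nullable(4), Nullable(5), Nullable{Int}()]
    broadcast((x, y) -> x .+ y, xs, ys)   # -> Vector{Nullable{Int}}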
The Julep proposes a nicer and more flexible lower-level representation of Nullable than the existing one, and the currently recommended option (Union{Some{T}, Null}) is not anticipated to really change the existing broadcast methods.
The other options, which would change the semantics of dot calls, are to use Union{T, Null} or Union{T, Null{T}} to represent Nullable{T}; these are currently being actively discussed. However, both of those are significantly more difficult changes than Union{Some{T}, Null}, so I believe that if the faster union types get into 1.0, we will start with that, and the broadcast methods will remain unchanged.
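To make the Union{Some{T}, Null} option concrete, here is a minimal sketch of what that representation and a lifted addition could look like (hypothetical definitions mirroring the Julep’s names, not a finalized design):

    # Hypothetical sketch of the Union{Some{T}, Null} representation.
    struct Null end            # singleton type marking a missing value
    struct Some{T}             # wrapper marking a present value
        value::T
    end
    const null = Null()

    # A lifted addition under this representation: null propagates.
    lift_add(::Null, ::Null)   = null
    lift_add(::Null, ::Some)   = null
    lift_add(::Some, ::Null)   = null
    lift_add(a::Some, b::Some) = Some(a.value + b.value)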
Thanks, that clarifies things a lot!
I should update that, thanks to Jeff’s PR https://github.com/JuliaLang/julia/pull/21310, this performance regression should be gone now. Thanks again for the report!