Could be, especially if you average weighted by runtime.
More flexible than any other language for object-oriented programming…
mutable struct MyStruct
    id::Int64
    name::String
end

# Shadows Base.display in a fresh session; fine for this demo.
function display(this::MyStruct)
    println("=> id:", this.id, " name:", this.name)
end

function merge(this::MyStruct, other::MyStruct)
    this.id += other.id
    this.name *= other.name   # `*` concatenates strings in Julia
    return this
end

# Make `m.f(args...)` behave like `f(m, args...)`: look the name up as a
# function in Main, falling back to field access when it is undefined.
function Base.getproperty(this::MyStruct, name::Symbol)
    try
        f = eval(name)
        return (args...) -> f(this, args...)
    catch e
        e isa UndefVarError || rethrow()   # only swallow undefined-name errors
        return Base.getfield(this, name)
    end
end
m1 = MyStruct(123, "hello_")
m2 = MyStruct(456, "world")
m1.merge(m2)
m1.display()
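Running the four lines above, the final m1.display() call prints (123 + 456 = 579, and "hello_" * "world" concatenates):

=> id:579 name:hello_world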
julia> using BenchmarkTools

julia> @btime sum(1:2:10)
0.027 ns (0 allocations: 0 bytes)
25
julia> @btime sum(1:2:1000000000)
0.028 ns (0 allocations: 0 bytes)
250000000000000000
I was really impressed when I first saw this. Even Haskell doesn’t do this in O(1) out of the box.
That’s not O(1), that’s O(0). The entire computation was hoisted out of the benchmark loop and evaluated at compile time, unless you travelled to the future and got a THz processor.
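One way to see the folding directly (a sketch; the exact behaviour varies between Julia versions):

julia> g() = sum(1:2:10)
g (generic function with 1 method)

julia> @code_typed g()
# on recent versions the typed body is just `return 25` –
# the whole sum was evaluated at compile time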
Here’s an actual measurement of the runtime speed (which is still very impressive):
julia> @btime sum($(Ref(1:2:10))[])
7.379 ns (0 allocations: 0 bytes)
25
julia> @btime sum($(Ref(1:2:1000000000))[])
8.061 ns (0 allocations: 0 bytes)
250000000000000000
Interpolating the Ref(1:2:10) makes it so that Julia’s compiler won’t infer the contents as a constant (they could be changed by another thread, for instance), so it waits until runtime to actually do the calculation.
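A minimal sketch of that mutability:

r = Ref(1:2:10)   # a heap-allocated, mutable box
sum(r[])          # 25, computed at runtime
r[] = 1:2:20      # the contents can be replaced at any time
sum(r[])          # 100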
Good to know, thanks. My point is that it doesn’t loop but rather applies the n*(n+1)/2 formula on the first run. I was not aware that it is optimized even further in a static context. Awesome.
Lots can be statically optimized away. My understanding is that there are basically three things that will stop the compiler from hoisting a calculation (see the sketch after this list):

- Mutable, heap-allocated containers such as Arrays, Refs, or mutable structs.
- Code that is allowed to throw an error depending on a runtime value. Julia is getting smarter about this one, though.
- The compiler giving up because a block of code is too big.
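A minimal sketch of the first point, using a hypothetical Bound struct: reading from a mutable container keeps the calculation at runtime even when everything else is constant.

mutable struct Bound
    n::Int
end

const B = Bound(1_000_000_000)

h() = sum(1:2:B.n)   # B.n may be mutated elsewhere, so this can't be folded away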
It’s not that surprising: there is a specific method for sum(::UnitRange) that uses the formula for the sum of an arithmetic progression; see julia - What is the difference between `UnitRange` and `Array`? - Stack Overflow.
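For intuition, that method boils down to the arithmetic-progression formula. A rough sketch (not Base’s actual implementation, which also takes care of overflow):

# sum of an arithmetic progression: length * (first + last) / 2
range_sum(r::AbstractRange{<:Integer}) = length(r) * (first(r) + last(r)) ÷ 2

range_sum(1:2:10)          # 25
range_sum(1:2:1000000000)  # 250000000000000000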
It’s not necessarily relying on the presence of a sum function call. See the behavior below, first with an integer argument and then with a float.
function f(x)
    s = zero(x)    # accumulator of the same type as x
    for i ∈ 1:x
        s += i
    end
    s
end
julia> @time f(1_000_000)
0.007364 seconds (16.19 k allocations: 880.975 KiB)
500000500000
julia> @time f(1_000_000)
0.000001 seconds (5 allocations: 176 bytes)
500000500000
julia> @time f(1_000_000.0)
0.096444 seconds (304.15 k allocations: 15.480 MiB)
5.000005e11
julia> @time f(1_000_000.0)
0.001901 seconds (5 allocations: 176 bytes)
5.000005e11
julia> @time f(1_000_000.0)
0.001942 seconds (5 allocations: 176 bytes)
5.000005e11
julia> @time f(1_000.0)
0.000004 seconds (5 allocations: 176 bytes)
500500.0
For someone who has coded Java or C#, it is. There is no way to express things like that there without writing tons of if (typeof(x) == UnitRange) checks or using so-called “design patterns” that make me sick (e.g. Visitor for multiple dispatch).
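To sketch that contrast (with hypothetical Shape types, not from this thread): what the Visitor pattern exists to emulate is just multi-argument method definitions in Julia.

abstract type Shape end
struct Circle <: Shape; r::Float64 end
struct Rect <: Shape; w::Float64; h::Float64 end

area(c::Circle) = π * c.r^2
area(r::Rect) = r.w * r.h

# double dispatch, without any accept/visit indirection:
interact(a::Circle, b::Rect) = "circle meets rect"
interact(a::Rect, b::Circle) = interact(b, a)
interact(a::Shape, b::Shape) = "generic case"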
Great answer, I cannot upvote due to my low SO reputation.
Have you read the answer I posted on StackOverflow? I mean, it’s not surprising after you check with @less or related macros what’s happening under the hood.
I did just now - but I might be missing something. Did you explain how my function f above gets the O(0) treatment (only for integers) even though it doesn’t call sum? It was my understanding from when someone showed this previously that LLVM does the magic in this case.
Looks like it’s true: @code_native shows no loops. This magic happens for ints and is not related to Julia’s sum magic.
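For reference, the check looks like this (exact assembly is machine-dependent; the point is the absence of a loop):

julia> @code_native debuginfo=:none f(1_000_000)
# straight-line arithmetic implementing n*(n+1)/2 – no loop back-edge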
@btime f($100000000.0)
320.451 ms (0 allocations: 0 bytes)
5.00000005e15
@btime f($100000000)
1.848 ns (0 allocations: 0 bytes)
5000000050000000
Lack of the $ causes the computation to happen at compile time, and that time is not included in the @btime result - another kind of magic.
My mind is blown.
Simple. I’ve always had a development-time vs. execution-time trade-off to fight. Julia shifts that curve, dramatically…
@tro3 yes! I am so much more efficient, and my code actually stands up to the wear of battle unless I do stupid things.
Inspired by Julia’s documentation on interfaces (I had to play with the code a little to get the optimizations to actually trigger):
julia> using BenchmarkTools
julia> struct Squares
count::Int
end
julia> Base.eltype(::Type{Squares}) = Int
julia> Base.length(S::Squares) = S.count
julia> @inline function Base.iterate(S::Squares, state::Int=0)
state == S.count ? nothing : ((state+1)*(state+1), state+1)
end
julia> [s for s in Squares(5)]'
1×5 LinearAlgebra.Adjoint{Int64,Array{Int64,1}}:
1 4 9 16 25
julia> function mymean(x)
s = zero(eltype(x))
@fastmath for a ∈ x
s += a
end
s / length(x)
end
mymean (generic function with 1 method)
julia> @btime mymean(Squares($(Ref(10))[]))
7.698 ns (0 allocations: 0 bytes)
38.5
julia> @btime mymean(Squares($(Ref(10^2))[]))
7.698 ns (0 allocations: 0 bytes)
3383.5
julia> @btime mymean(Squares($(Ref(10^3))[]))
7.962 ns (0 allocations: 0 bytes)
333833.5
julia> @btime mymean(Squares($(Ref(10^4))[]))
7.961 ns (0 allocations: 0 bytes)
3.33383335e7
julia> @btime mymean(Squares($(Ref(10^5))[]))
7.961 ns (0 allocations: 0 bytes)
3.3333833335e9
julia> @btime mymean(Squares($(Ref(10^6))[]))
7.961 ns (0 allocations: 0 bytes)
3.3333383333350006e11
(The @fastmath is there so that the code is fast for Float64 as well; it isn’t needed for this example. @simd also makes floats fast, but prevents this optimization.)
In my post I had used @time, not @btime, and you could see the effects of compilation in the first runs for both ints and floats, but not in the second ones.
Beautiful, Rainer (and hello!)
Not sure if it is the fewest possible lines, but it’s coming fresh out of the oven. I needed a function that returns all neighbours around a given index in an arbitrary-dimensional array.
When I started with this, I only needed 2D matrices, so I wrote the first function based on a suggestion for Python that was accepted as most efficient on SO.
But now I need it for higher dimensions, so I started over with the second function. It seems more verbose, but it’s arguably easier to understand, works for any number of dimensions, and is actually slightly faster.
function getneighbours(M, cartIdx, radius = 1)
    # 2D only: collect the in-bounds entries in the square around cartIdx,
    # skipping the centre point itself
    return [M[i, j] for i in cartIdx[1]-radius:cartIdx[1]+radius, j in cartIdx[2]-radius:cartIdx[2]+radius
            if !((i == cartIdx[1]) && (j == cartIdx[2])) && (1 <= i <= size(M, 1)) && (1 <= j <= size(M, 2))]
end

function getnbs(M, startIdx::CartesianIndex; radius = 1)
    # all indices in the (2radius+1)^N hypercube around startIdx
    idx = CartesianIndices(ntuple(i -> -radius:radius, ndims(M))) .+ startIdx
    idx = idx[:]
    deleteat!(idx, ceil(Int, length(idx) / 2))    # drop the centre point
    filter!(id -> checkbounds(Bool, M, id), idx)  # keep only in-bounds indices
    return M[idx]
end
using BenchmarkTools

M = LinearIndices((300, 300))
M = Int.(M)                        # materialise as a plain Matrix{Int}
testIdx = CartesianIndex((30, 30))

display(@benchmark getneighbours(M, testIdx))
display(@benchmark getnbs(M, testIdx))
BenchmarkTools.Trial:
  memory estimate:  416 bytes
  allocs estimate:  8
  --------------
  minimum time:     206.596 ns (0.00% GC)
  median time:      245.063 ns (0.00% GC)
  mean time:        356.022 ns (23.41% GC)
  maximum time:     43.481 μs (98.77% GC)
  --------------
  samples:          10000
  evals/sample:     446
BenchmarkTools.Trial:
  memory estimate:  400 bytes
  allocs estimate:  4
  --------------
  minimum time:     176.339 ns (0.00% GC)
  median time:      215.448 ns (0.00% GC)
  mean time:        319.323 ns (16.98% GC)
  maximum time:     25.761 μs (97.86% GC)
  --------------
  samples:          10000
  evals/sample:     575