Make EWMA as fast as pandas

That’s not a type conversion, rather it’s a type assertion. So it’s just there for safety.

1 Like

What do you mean? Are you referring to fma? It doesn’t just make the implementation faster, it also makes it more accurate, so I don’t understand your viewpoint.

1 Like

fma (muladd) and Base.OneTo, basically

Whether or not to use fma has been the subject of long discussions here, and for sensible reasons it is not the default. I don’t want to revive that discussion here.

But my point of view is that these are strategies that fall outside the scope of essential Julia syntax, so I’m curious about how much they really matter for performance, because they may give the wrong impression that they are needed for writing performant code.

People here in the forum have expressed the view that writing performance-oriented Julia code is like learning a new language, and from my experience I disagree with that.

2 Likes

In this case, when I tried it, it was a factor of 2 speedup, so yes, it definitely matters.

3 Likes

You misunderstand the situation. The discussions are about whether the compiler should be free to compile expressions like a*b + c into fma(a, b, c). So you’re complaining about the wrong thing.
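To make the distinction concrete, here is a minimal sketch (the function names `plain`, `fused` and `either` are made up for illustration). Julia never rewrites `a*b + c` on its own; `fma` always rounds once, and `muladd` leaves the choice to the compiler. The inputs are picked so that the low-order bit of the product is lost in the unfused version:

```julia
# Hypothetical helper names, chosen for illustration only.
plain(a, b, c)  = a * b + c         # two roundings: after the product and after the sum
fused(a, b, c)  = fma(a, b, c)      # one rounding of the exact a*b + c
either(a, b, c) = muladd(a, b, c)   # the compiler may pick whichever form is faster

a = 1.0 + 2.0^-27                   # a*a == 1 + 2^-26 + 2^-54 exactly
c = -(1.0 + 2.0^-26)

plain(a, a, c)    # 0.0: the 2^-54 term is rounded away before the addition
fused(a, a, c)    # 2.0^-54: the exact product survives until the single final rounding
either(a, a, c)   # one of the two values above, depending on the hardware
```

So writing `fma` or `muladd` by hand is an explicit, local choice, which is a different question from letting the compiler fuse every `a*b + c` it sees.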

Yeah, Base.OneTo is probably not necessary. My stylistic choice is to use it, though.

4 Likes

I’m not complaining about anything. :smile:

What I think is important is to keep the perspective that, here for example, there seems to be a 500x speedup by improving the algorithm vs. a 2x speedup by using some lower level tricks.
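As a rough illustration of what an algorithmic improvement can look like here (the slower implementations from earlier in the thread are not shown in this excerpt, so both functions below are hypothetical stand-ins), compare recomputing the weighted sums from scratch for every element with carrying them forward:

```julia
# Hypothetical O(n^2) reference: recompute both weighted sums for every i.
function ewma_naive(x::AbstractVector{T}, c::T) where {T}
    res = similar(x)
    for i in eachindex(x)
        num = zero(T)
        den = zero(T)
        for k in firstindex(x):i
            w = c^(i - k)          # weight of sample k as seen from position i
            num += w * x[k]
            den += w
        end
        res[i] = num / den
    end
    return res
end

# O(n) version: carry the two sums forward, one multiply and one add each per step.
function ewma_recurrence(x::AbstractVector{T}, c::T) where {T}
    res = similar(x)
    num = zero(T)
    den = zero(T)
    for i in eachindex(x)
        num = num * c + x[i]
        den = den * c + one(T)
        res[i] = num / den
    end
    return res
end

x = rand(200)
ewma_naive(x, 0.9) ≈ ewma_recurrence(x, 0.9)   # same result up to rounding
```

The second version uses nothing beyond plain loops and arithmetic, and going from quadratic to linear work is where the large factor comes from.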

Just for the record, this version with @fastmath performs similarly to the fastest one for me:

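julia> function ewma_4_sym2(x::AbstractArray{T}, c::T) where {T}
         res = zeros(T, size(x))
         num = zero(T)   # running weighted sum of the samples
         den = zero(T)   # running sum of the weights
         # @fastmath lets the compiler reassociate and fuse the arithmetic below
         @fastmath for i ∈ eachindex(x)
           j = i - 1     # zero-based offset; assumes x has 1-based indices
           num = num * c + x[begin+j]
           den = den * c + 1
           res[begin + j] = num/den
         end
         return res
       end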
ewma_4_sym2 (generic function with 1 method)

julia> @btime ewma_4_sym2($x,$c);
  120.550 μs (2 allocations: 781.30 KiB)

julia> @btime ewma_4_sym($x,$c);
  119.972 μs (2 allocations: 781.30 KiB)

julia> ewma_4_sym2(x,c) ≈ ewma_4_sym(x,c)
true

@fastmath allows the compiler to use fma, afaik.
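For comparison, a sketch of the same loop with the fusion requested explicitly through `muladd` rather than through `@fastmath`. The name `ewma_muladd` is hypothetical and this is not necessarily how `ewma_4_sym` is written; it only shows where the two multiply-adds sit:

```julia
# Hypothetical variant: opt in to fused multiply-add only where it is wanted.
function ewma_muladd(x::AbstractVector{T}, c::T) where {T}
    res = similar(x)
    num = zero(T)
    den = zero(T)
    for i in eachindex(x, res)
        num = muladd(num, c, x[i])      # num*c + x[i], fused if the hardware supports it
        den = muladd(den, c, one(T))    # den*c + 1
        res[i] = num / den
    end
    return res
end

ewma_muladd([1.0, 2.0, 3.0], 0.5)   # ≈ [1.0, 2.5/1.5, 4.25/1.75]
```

Unlike `@fastmath`, this leaves every other floating-point operation in the loop with its usual IEEE semantics.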

11 Likes

FYI, Base.OneTo is part of Julia’s public API. There’s nothing wrong with using it where appropriate.

Previously I would have been skeptical of anything where you need to qualify with Base., but I checked, and at least by now OneTo is marked public with the new public keyword, so it’s not internal non-API.
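For readers who haven’t met it, a small sketch of what `Base.OneTo` is: a range whose first element is 1 by construction, which is also what ordinary arrays return for their axes.

```julia
r = Base.OneTo(5)

r == 1:5                           # true: same elements as the UnitRange
axes(zeros(3), 1) isa Base.OneTo   # true: ordinary arrays already use it for their axes
first(r)                           # 1, known from the type rather than stored in a field
```

Since the start is encoded in the type, the compiler does not have to consider any other starting index when looping over it.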

Yes, fma is not the default for a*b + c, as in recent C/C++ compilers (clang). That doesn’t mean you shouldn’t use it. It means that it’s always (a bit) faster, and once you’ve done the analysis showing that it’s safe, you can (and should?!) use it.

I didn’t do the analysis for fma for this code. Note that fma is usually more accurate; what people worry about are the rare exceptions. Nor did I really look into

help?> Base.OneTo
2 Likes

I was not saying anything was wrong there. It just makes the code less readable for the regular user. Also, the optimizations that can be done by hand with fma can be handled here by the @fastmath macro, which IMO also improves readability.

2 Likes

This is backwards, IMO, as @fastmath is unsafe. In this case it has no ill effect, but newbies shouldn’t be encouraged to use the unsafe features of the language.

2 Likes

I agree. My advice for new users is to not use these macros for performance at all. In fact, even in my packages I end up not using any of these, as the benefits are very minor relative to what can be gained algorithmically.

What I think is that the code resulting from benchmark competitions scares new users, because the problems are usually very simple and thus highly influenced by these micro-optimizations, which ends up giving the impression that writing performant code always diverges this much from the simpler, more straightforward syntax.

6 Likes