Prompted by this Dot function thread, in which I learned about SugarBLAS, InplaceOps and Yeppp… It often seems that the first implementation of a function is very simple, and the fast version is then quite a bit more complicated.
Does anyone else try to keep both of them around, so that when you’ve forgotten what you were trying to do, you can look at the easy one first? And if you do, are there good ways to automate this, to make sure that the simple one is still giving the same result as the fast one?
I’m wondering if there ought to be some kind of @slowversion marker, which would cause the tests to be run over both versions. (Ideally over all 2^n combinations, when n functions each have a slow version…) Perhaps this already exists?
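To make the question concrete, here is a minimal sketch of the manual check I have in mind, with no macro involved. The names step_slow, step_fast! and agrees are made up for illustration; only Test and LinearAlgebra.mul! are real standard-library pieces.

```julia
using Test
using LinearAlgebra  # provides mul!, the in-place matrix-vector product

# Hypothetical simple version: easy to read, allocates its result.
step_slow(A, x, b) = A * x .+ b

# Hypothetical fast version: writes into a preallocated buffer `out`.
function step_fast!(out, A, x, b)
    mul!(out, A, x)   # out = A * x without allocating a new vector
    out .+= b
    return out
end

# Run both versions on the same random inputs and compare the results.
function agrees(n; trials = 20)
    for _ in 1:trials
        A, x, b = rand(n, n), rand(n), rand(n)
        step_fast!(similar(x), A, x, b) ≈ step_slow(A, x, b) || return false
    end
    return true
end

@testset "fast version matches slow version" begin
    @test agrees(5)
    @test agrees(50)
end
```

The comparison uses ≈ rather than ==, since the two versions may round slightly differently. A marker like @slowversion would presumably generate something along these lines automatically.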
If you are thinking further into the future, then not only could the @slowerversion turn out to be the faster one, but the packages used to improve performance could also be abandoned. (So the possibility of returning to a @reference_implementation could be useful.)
A reference implementation could also be helpful for documentation and learning.
DiffEq has two implementations of a lot of algorithms. The “slow versions” are out-of-place and are actually faster for things like static arrays, which means that they have a real purpose on smaller problems. The mutating versions are careful about memory access and about the GPU kernels they build, so they are faster on mutable arrays and for parallelism. Dispatch keeps it all organized.
But I think you’ll find that with some practice it’s fairly obvious how to go from one to the other: it becomes pretty clear what needs to be cached, which operations need to mutate, and what can be re-used. It’s really not a big deal. I keep the two since they are optimized for different domains, not for display purposes.
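As a rough illustration of the pattern (a toy sketch, not the actual DiffEq code), here is a single Euler step written both ways, with the method chosen by dispatch on a cache argument. NoCache, EulerCache, euler_step, decay and decay! are all hypothetical names invented for this example.

```julia
# Hypothetical cache types: the one you pass selects the method.
struct NoCache end            # out-of-place: nothing to reuse
struct EulerCache{T}          # in-place: preallocated buffer for the derivative
    du::T
end

# Out-of-place: f(u, t) returns a new value, so u can be immutable
# (a number, a tuple, a static array).
euler_step(f, u, t, dt, ::NoCache) = u + dt * f(u, t)

# In-place: f!(du, u, t) fills the cached buffer, then u is updated
# without allocating a new array.
function euler_step(f!, u, t, dt, cache::EulerCache)
    f!(cache.du, u, t)
    u .+= dt .* cache.du
    return u
end

# The same toy problem u' = -u, written in both styles.
decay(u, t) = -u
decay!(du, u, t) = (du .= .-u; nothing)

u1 = euler_step(decay, 1.0, 0.0, 0.1, NoCache())
u2 = euler_step(decay!, [1.0, 2.0], 0.0, 0.1, EulerCache(zeros(2)))
```

Going from the first method to the second is mostly a matter of spotting the temporary (the derivative) and giving it a home in the cache.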