The way you’re describing it is unfortunately not possible to get efficiency. At least, it would be a large mathematical achievement to make it possible. Let me explain a little bit.
You’re talking about building a Math Library. Currently the most advanced math library in Julia is @musm’s SLEEF.jl
Another project which is doing this is Libm.jl, which is @pkofod’s GSoC project:
Having spent a good amount of timing following and helping benchmark the previous Libm.jl turned Amal.jl turned Sleef.jl, I can say that there are a lot of details to get things right. Essentially, for every function you are looking to create an approximation which gets to the right amount of accuracy to cover the range of values you want to calculate. So if you’re working with
Float64, you want to come up with discretizations that, for any
Float64 input, will have accuracy to
Float64. This limits both the range of values you can expect, and the accuracy you need. To get this fast, you can usually cover large ranges of values using semi-generic methods (a Taylor expansion with an appropriate degree), but then you always have to work out the details. You need to specialize on the values close to
k*2pi to get good accuracy near zeros of
cos. You need to specialize on the subnormal numbers, for example:
You need to do a lot of by-hand tuning if you want this to be fast and give the accuracy the user would expect. This is the complete opposite of generic programming. But it makes sense: generic algorithms are written using multiple dispatch, and they are performant because the functions at the very bottom were tuned by hand. However, these are the functions at the bottom which need to be tuned by hand.
Since these functions are all very small and when tuned say by using a table lookup (a precomputed table of values), you’re in a domain where you have to avoid any extra overhead. So while “using optimization to find the right amount of Taylor coefficients” or “optimize to find a good polynomial approximation” sound like great ideas, you have to notice that you cannot do this “on the fly” because that optimization problem would be the vast majority of the calculation time. So you cannot hope to make arbitrary precision “fast” here: it’s just the nature of the problem.
However, Julia is well-suited for tuning via adding dispatches. You can write a generic implementation on
AbstractFloat, and then use your tools to find the right Taylor coefficients / polynomial approximation for
Float128, … and have each number type have a specialization that is fast for it, each hand-tuned and tested. Julia is perfectly suited to creating such a library, but of course, that is a lot of work.