Why do BigFloats allocate so much? Could a native rewrite solve the problem?

Hi.

Everything is in the title: I don’t think I understand why BigFloats allocate so much, and why nothing can be done about it. Is there some source, article or doc that explains this?

As @JeffreySarnoff showed in this post and in the discussion that followed, it is possible to design non-allocating multiplication, addition, fma, … for BigFloats by calling directly into the corresponding C functions. Why aren’t those the defaults, or used more?
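To make the question concrete, here is a minimal sketch of what such an in-place multiplication could look like. It relies on the internal, non-exported names `libmpfr`, `MPFRRoundingMode` and `MPFRRoundNearest` from `Base.MPFR`, which are an assumption here and may change between Julia versions; `mul_inplace!` is a hypothetical name, not something taken from Base or from the linked post.

```julia
# Internal (non-exported) names from Base.MPFR; an assumption of this sketch.
using Base.MPFR: libmpfr, MPFRRoundingMode, MPFRRoundNearest

# Hypothetical helper: store x * y into the already existing BigFloat z.
function mul_inplace!(z::BigFloat, x::BigFloat, y::BigFloat)
    ccall((:mpfr_mul, libmpfr), Int32,
          (Ref{BigFloat}, Ref{BigFloat}, Ref{BigFloat}, MPFRRoundingMode),
          z, x, y, MPFRRoundNearest)
    return z
end

x, y = big"1.5", big"2.25"
z = BigFloat()           # allocated once, then reused as the output
mul_inplace!(z, x, y)    # z now holds x * y, rounded to nearest
```

The caller has to be sure that `z` is not referenced anywhere else, since the mutation is visible through every reference to it.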

Furthermore, I was wondering whether this is inherent to the representation itself: would a proper rewrite of the same representation in native Julia have the same problems? At dispatch time, the size of a BigFloat would be known (something like BigFloat{prec}, where prec is an integer type parameter).

Possibilities:

  • Port GMP/MPFR directly: might be very hard as there is a lot of code.
  • Port Arb: maybe less code, but needs an equivalent of FLINT.
  • Enhance MultiFloats.jl to match BigFloat requirements.
  • Others?

Thanks for any link and/or discussion on the matter :wink:

I believe https://github.com/Nemocas/Nemo.jl wraps Arb

I would always prefer a port to a wrapper. Wrapping things leads to allocation problems, vectorization problems, etc… the same issues we already have with BigFloats.

Since Julia will recompile code for each value of prec, this is not a good idea unless perhaps you restrict prec to a sparse sequence, say 53, 106, 212, … or 64, 128, 256, …
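A rough sketch of the "sparse sequence" idea, assuming the 64, 128, 256, … variant: the requested precision is rounded up to a bucket, so only a handful of concrete types ever get compiled. Both `bucket_precision` and `InlineFloat` are hypothetical names made up for illustration, and no arithmetic is implemented.

```julia
# Hypothetical: round a requested precision (in bits) up to 64, 128, 256, 512, ...
bucket_precision(prec::Integer) = 64 * nextpow(2, cld(prec, 64))

# Hypothetical fixed-size float storing N 64-bit limbs inline (no heap allocation).
struct InlineFloat{N}
    sign::Bool
    exponent::Int64
    limbs::NTuple{N,UInt64}
end

# Zero value whose limb count is derived from the bucketed precision.
function InlineFloat(prec::Integer)
    n = bucket_precision(prec) ÷ 64
    return InlineFloat{n}(false, 0, ntuple(_ -> zero(UInt64), n))
end

a = InlineFloat(200)   # 200 bits is rounded up to 256 bits, i.e. 4 limbs
b = InlineFloat(256)   # same concrete type as a, so no extra specialization
```

Because the limb count is part of the type, each bucket is a plain bits type, and code is specialized once per bucket rather than once per requested precision.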

Some alternatives:

  • Have Julia recycle BigFloat objects without having to clear/initialize them.
  • Change the BigFloat struct to inline a few limbs and allocate heap memory only when the precision is too large. This adds a few cycles of overhead to call MPFR functions (you need to mock up mpfr_t objects on the stack), but it will be faster if you write custom code for the low-precision arithmetic. This is precisely what Arb does, though only up to two limbs (128 bits); I guess you’d want to do this with four limbs to optimize for the current BigFloat default prec of 256 bits.
  • Use metaprogramming to rewrite code using BigFloat to do in-place operations with fewer temporary allocations. This is potentially interesting even if you’ve already solved the allocation problem, because large time savings are possible if you start to fuse composite operations (e.g. dot products - [1901.04289] Faster arbitrary-precision dot product and matrix multiplication).
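As a small illustration of what recycling and fusing could buy, here is a hand-written dot product that reuses a single accumulator and performs each multiply-add as one MPFR `mpfr_fma` call. It is only a sketch: it is not the algorithm of the paper cited above, `dot_inplace` is a hypothetical name, and the `Base.MPFR` names it imports are internal and may change between Julia versions.

```julia
# Internal (non-exported) names from Base.MPFR; an assumption of this sketch.
using Base.MPFR: libmpfr, MPFRRoundingMode, MPFRRoundNearest

# Hypothetical helper: dot product that reuses one accumulator.
function dot_inplace(x::Vector{BigFloat}, y::Vector{BigFloat})
    length(x) == length(y) || throw(DimensionMismatch("vectors must have equal length"))
    acc = BigFloat(0)    # the only BigFloat created by this function
    for i in eachindex(x, y)
        # acc = x[i] * y[i] + acc with a single rounding, written back into acc
        ccall((:mpfr_fma, libmpfr), Int32,
              (Ref{BigFloat}, Ref{BigFloat}, Ref{BigFloat}, Ref{BigFloat}, MPFRRoundingMode),
              acc, x[i], y[i], acc, MPFRRoundNearest)
    end
    return acc
end

x = BigFloat[1, 2, 3]
y = BigFloat[4, 5, 6]
d = dot_inplace(x, y)
```

A macro-based rewrite would aim to generate this kind of loop automatically from the ordinary `sum(x .* y)` style of code.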

I don’t think there is an existing solution that meshes well with Julia and could just be adapted in a simple way. Generally Julia handles numbers as constants, not containers, so the mutating approach you link will not work in generic code that assumes numbers.

Maybe the existing machinery (like broadcast) could be hijacked for expressions like

z .= z * a + b # z::BigFloat

but I am not aware of an implementation.

That said, if you are merely looking for more precision, try something like

It will be orders of magnitude faster than BigFloats. I try to use the latter only when I need very high precision and don’t care about speed (e.g. testing).
