Dot-macro, in-place ops for immutables

foobar_lv2 · April 10, 2018, 10:23pm

I just got a ~17x speed-up on some BigInt code by using in-place operations; see https://discourse.julialang.org/t/a-plea-for-int-overflow-checking-as-the-default/3338/75?u=foobar_lv2.

Consider the line n=3*n+1, appearing in an inner loop. If n::BigInt, then this line is terrible: each iteration allocates fresh BigInts for the intermediate and the result, and one must ccall into libgmp in order to do the operation in place and avoid the allocation.
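For illustration, here is a minimal sketch of the in-place version using the `Base.GMP.MPZ` wrappers instead of a raw ccall. `MPZ` is an internal, undocumented module, so treat the calls as an assumption about current Base rather than a stable API; the function name `collatz_step!` is made up for this example.

```julia
using Base.GMP: MPZ

# Overwrites `n` with 3n + 1 without allocating a new BigInt.
# Caution: BigInts are normally treated as immutable values, so only
# mutate an object that nothing else holds a reference to.
function collatz_step!(n::BigInt)
    MPZ.mul_ui!(n, n, 3)   # n = 3 * n  (wraps mpz_mul_ui)
    MPZ.add_ui!(n, n, 1)   # n = n + 1  (wraps mpz_add_ui)
    return n
end

n = big(7)
collatz_step!(n)
println(n)
```

Note that this only accepts BigInt; a plain Int needs a separate code path, which is the genericity problem raised here.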

The best way of writing this would be @. n=3*n+1 and have the macro figure out what to do. One would then want to update the macro/lowering so that it also works for plain integers.

Is this doable? That is, (1) make @. a NOP for immutables, so that people can write generic code that avoids allocations both for StaticArray and for Array, and (2) teach @. to work on BigInt.

Especially (1) might need breaking changes, if we also want X .= Y; to be equivalent to X = convert(typeof(X),Y); for immutable X (or alternatively be equivalent to X=Y).

mbauman · April 10, 2018, 10:27pm

See https://github.com/JuliaLang/julia/issues/19992 for some prior discussions here.

foobar_lv2 · April 10, 2018, 10:44pm

Hah, I missed that discussion, thanks for linking. For what it’s worth, I really like https://github.com/JuliaLang/julia/issues/26612.

stevengj · April 10, 2018, 11:50pm

Even if .= were changed to assign in-place like this, it wouldn’t be sufficient to eliminate bigint temporaries from n .= 3*n+1.

There has been a fair amount of discussion of how to get faster bignum support by eliding allocations.
See e.g. https://github.com/JuliaLang/julia/issues/4176, "WIP: Pooling BigInts and BigFloats" by skariel (JuliaLang/julia pull request #10084), and "[WIP] implement BigInt (and BigFloat?) with native Array" by rfourquet (JuliaLang/julia pull request #17015) on in-place BigInt and BigFloat operations.

foobar_lv2 · April 11, 2018, 12:28am

I know that n .= 3*n+1 will never work; it doesn’t even work for arrays today. However, @. n = 3*n +1 has access to the entire expression. If it had access to the types as well, possibly by placing a @generated somewhere (how?), I think something could be done using the libgmp API: walk the expression tree, figure out whether we can go without temporaries; if we do need temporaries, can we hook into the inference/optimization (before llvm touches the code)? Then we could allocate the slot for the temporary outside the loop.

But more realistically, we would expect the user to write @. n= n*3; @. n += 1;, that is, write the computation in a way that avoids as many temporaries as possible, and possibly hoist necessary temps out of the loop by hand. With the added bonus that the code will now fail for plain Int, for no good reason, forcing people to either duplicate code (bad) or having all their code spit out by macros that generate versions for plain and big ints (worse).
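A rough sketch of what hoisting a temporary by hand looks like today, again assuming the internal `Base.GMP.MPZ` wrappers (`mul!`, `add!`). The function `sum_of_squares` is a made-up example, not something from the linked threads.

```julia
using Base.GMP: MPZ

# Sums x^2 over a vector of BigInts with one accumulator and one
# scratch BigInt, both allocated once outside the loop.
function sum_of_squares(xs::Vector{BigInt})
    acc = BigInt(0)
    tmp = BigInt()                 # hoisted temporary, reused every iteration
    for x in xs
        MPZ.mul!(tmp, x, x)        # tmp = x * x, written into existing storage
        MPZ.add!(acc, acc, tmp)    # acc = acc + tmp, in place
    end
    return acc
end

println(sum_of_squares(BigInt[1, 2, 3]))
```

The signature is restricted to `Vector{BigInt}`, so the same loop over `Vector{Int}` has to be written separately, which is exactly the duplication complained about above.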

Having to remember the GMP API is just cruel (and even worse: you need to read the source of the MPZ julia-wrapper and the libgmp manual side-by-side).

Of course compiler optimizations that eventually make the dots unnecessary are awesome. In the meantime, an explicit way that is less pain-inducing would be pretty cool.

stevengj · April 11, 2018, 3:40pm

@. doesn’t have access to the types. Macros never have access to types.

In fact, the whole point of the dot syntax in Julia is that it is purely syntactic sugar that doesn’t need to have access to the types. This is what allows it to be generic to arbitrary functions and array-like containers.
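A small sketch that makes this concrete. The macro `@argkind` is hypothetical and exists only to show what a macro receives; `@macroexpand` and `@.` are the standard ones.

```julia
# A macro receives syntax, not values: its argument here is the Symbol `n`,
# regardless of what `n` is bound to at run time.
macro argkind(x)
    return string(typeof(x))   # computed at macro-expansion time
end

n = big(7)
kind = @argkind n              # the expansion is just a string literal
println(kind)

# `@.` only rewrites the expression tree; `m` does not even need to be defined.
ex = @macroexpand @. m = 3 * m + 1
println(ex)
```

The expansion is the same dotted expression whether the variable ends up holding an Array, an Int, or a BigInt, so any type-specific behavior has to come from method dispatch after lowering.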

Eliminating temporaries in bignum operations requires something that happens much later in compilation than a macro, and probably requires compiler support or something like Cassette.jl.

foobar_lv2 · April 11, 2018, 7:23pm

If m .= n.*3 could lower to m=broadcast!(*,m,n,3 ) then I could implement broadcast!(::typeof(*), dest::Integer, a::Integer, b::Integer)=a*b and broadcast!(::typeof(*), dest::BigInt, a::BigInt, b::Int64)=Base.GMP.MPZ.mul_si!(dest, a,b) for significantly nicer generic code that uses @. m= n*3 (obviously one would need a couple of others as well).
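To make the proposed dispatch pattern concrete without adding methods to `Base.broadcast!` (which would be type piracy, and which the current lowering would not necessarily call), here is a sketch with a hypothetical stand-in function `mul_into!`. `MPZ.mul_si!` is from the internal `Base.GMP.MPZ` module.

```julia
using Base.GMP: MPZ

# Hypothetical stand-in for the proposed broadcast! methods (not part of Base).
# Immutable integers: ignore `dest` and return the product.
mul_into!(dest::Integer, a::Integer, b::Integer) = a * b
# BigInt: write the product into `dest` without allocating.
# `Clong` is Int64 on 64-bit Linux/macOS; elsewhere an Int64 `b` simply
# falls back to the generic (allocating) method above.
mul_into!(dest::BigInt, a::BigInt, b::Clong) = MPZ.mul_si!(dest, a, b)

# Generic caller: always rebinds the result, so both cases work.
triple!(dest, n) = mul_into!(dest, n, 3)

m = triple!(0, 5)               # plain Int: dest is ignored
b = triple!(BigInt(), big(5))   # BigInt: result is stored in the given dest
println((m, b))
```

The key design point is that the caller writes `m = mul_into!(m, n, 3)`, rebinding the result, so the same line is correct for immutables and allocation-free for BigInt.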

That’s still not the whole way, but better than calling MPZ by hand. As for eliminating the temporary from @. n=3*n+1 without requiring the user to split it by hand into @. n=3*n; @. n = n+1;, as I said, I am not sure how. If @. could somehow force a call into a @generated version of broadcast! then we could maybe go the rest of the way.

edit: Maybe @. could do this already; I’d have to look at what it actually does and whether a modification of the macro would be enough to support more convenient non-allocating MPZ calls.

stevengj · April 11, 2018, 10:06pm

For this to work you’d need a way to disable broadcast fusion for certain types, which may happen but doesn’t exist yet (see "better internal interface for extending `broadcast`", issue #22060 in JuliaLang/julia).
