Built in exponentiation for `Any` type

A really cool thing about the definition of ^ is that for any custom type MyType<:Number, for x::MyType and p::Integer > 0, x^p will work as long as multiplication Base.:*(a::MyType, b::MyType) is defined.

Why is this functionality not extended to types outside Number?

In LinearAlgebra, there is

(^)(A::AbstractMatrix, p::Integer) = p < 0 ? power_by_squaring(inv(A), -p) : power_by_squaring(A, p)

which is (as far as I could see) essentially identical to the definition for Numbers. Why not just have

(^)(x::Any, p::Integer) = p < 0 ? power_by_squaring(inv(x), -p) : power_by_squaring(x, p)

?
Then when you try to calculate an integer exponent of an instance of some arbitrary new type, it throws an error and tells you to implement *…or one or inv, depending on the exponent value.

Is it just not that useful in practice, or is there some deeper reason for avoiding this?

(Related to a recent thread in which I posted here.)

1 Like

I am not sure what you mean here, since you then proceed to link an example where it is defined for AbstractMatrix. Base also defines ^ for strings, various packages implement methods for their own types.

Defining it for all types would be problematic: that x may not even be a type for which * is meaningful. Generally, it best to organize methods into an interface (explicitly or implicitly specified), and only define them for types which are supposed to implement it.

1 Like

I think Julia is a bit undecided between formal type hierarchies (Int <: Integer <: Real <: Number etc) and less formal but more flexible traits (e.g. Base.IteratorSize). It’s usually a matter of taste and/or archeology which approach is taken in which domain.

When I started a library with a Polynomial type, I also needed to choose whether to make it <: Number or not. I decided against it because it gave many method ambiguities and also it doesn’t feel right overall, but it also meant I needed to re-implement many methods including :^ that I would otherwise have got for free.

It would be nice if the whole Number could be deconstructed in traits like HasAddition, HasMultiplication, DivisionType, HasZero, IsScalar (for broadcasting and getindex(..., 1, 1, 1)) and whatever else we can think of. But that does give a lot of additional complexity. Moreover, it will never be abstract enough to satisfy everyone: see this post that explains why the author never get anything done in Haskell.

10 Likes

Here is a generic exp implementation I wrote: ExponentialUtilities.jl/exp.jl at master · SciML/ExponentialUtilities.jl · GitHub which should work for any x for which the functions LinearAlgebra.opnorm, +, *, ^, and / (including addition with
UniformScaling objects) are defined.

It takes advantage of the fact that exp(x) == exp(x/2)^2 to make the norm of quantity to be exponentiated small, and then uses a Pade series unrolled at compile time to evaluate a polynomial expansion for exp. If you’re using a number type, I would just define LinearAlgebra.opnorm(x::MyNumberType) = abs(x).

If you’re interested in seeing something like this in Base julia, perhaps this implementation could be the basis of a PR, but it would need some work first to meet the Julia repo’s stringent standards.

7 Likes

How is the speed of this compared to the builtin float methods? What about BigFloat?

julia> using ExponentialUtilities

julia> x = Ref(rand()); xb = Ref(rand(BigFloat));

julia> @btime exp($x[])
  6.667 ns (0 allocations: 0 bytes)
1.660201975759545

julia> @btime exp_generic($x[])
  22.466 ns (0 allocations: 0 bytes)
1.6602019757595448

julia> @btime exp($xb[])
  2.122 μs (5 allocations: 200 bytes)
2.052547247206527370497127499783700253674772663390460826453252047220169692131062

julia> @btime exp_generic($xb[])
  7.711 μs (94 allocations: 5.14 KiB)
2.052547247206527370497127499568724448582867134486854464533969415581103539818611

julia> X = randn(100, 100);

julia> @btime exp($X);
  982.430 μs (43 allocations: 1.53 MiB)

julia> @btime exp_generic($X);
  2.768 ms (108 allocations: 3.97 MiB)

It’s definitely slower. There’s probably things one can do to speed this up though.

We actually used to have a ^(::Any, ::Integer) method defined in 0.6.

https://github.com/JuliaLang/julia/pull/23332

1 Like

Having (^)(x, p::Number) wouldn’t cause the same issues (at least as often) though, right?

But that’s even harder to define in the abstract. It’s one thing to define the simple repeated multiplication:

function ^(x, p::Integer)
    y = x
    for i in 2:p
        y *= x
    end
    return y
end

But even that has problems; what does it return in the case of p == 0? I guess we need some knowledge of a zero(x). What about p==1 — there it ends up returning x itself, which may not be appropriate for mutable objects. To do anything more complicated requires knowing more about x. And that’s just for integer powers. Good luck trying to define sensible general behaviors for fractional and complex exponentiation!

It just seems easier to allow folks to manually opt-in to ^ than it is to define a whole required interface for its usage.

5 Likes

Are these really problems, though?

(I assume you mean one(x), since normally x^0 == one(x).)
The function can use one(x) in its definition, and if one is not defined for some type, then it will simply throw an error that says one is not defined, so then you can either define a one, or if the type there is no sensible one, you shouldn’t be calling x^0 anyway.

The same can be said for p < 0 and inv(x).

for p==1, we can just return copy(x), no? This is what is in the current definition of power_by_squaring anyway.


I think this is the real issue. Maybe for ^ this modular approach of

  1. Defining only base methods like +,-,*,inv,zero,one etc. for types
  2. General advanced functions like exp, sin, ^, abs etc. use combinations of base and advanced functions.
  3. Which advanced functions can be used with which arguments depend on which base methods were defined.

works fine, but for types where the internal structure completely changes how one accesses data inside x for multiple ‘advanced’ functions, this approach doesn’t really make sense.

I just learnt about this from the posts by @Tamas_Papp and @tkluck, but apparently the way to deal with this is either through strict type hierarchies or interfaces, though I have trouble coming up with examples myself…maybe something to do with an Iterator that IsInfinite()?

Interfaces aside, I am still convinced that having the current ^(x::Number, p::Integer) be for x::All instead would make sense, but it was decided by @iamed2 (and everyone who let the PR through) that regardless of the questionable generality, there is an issue with method ambiguity. I will accede that the usability gained by removing the method ambiguity is better than the generality lost.


:diamond_shape_with_a_dot_inside: Thanks to everyone who contributed to the discussion so far :diamond_shape_with_a_dot_inside:
I now have a better framework for thinking and learning about these issues—and a better idea of how Julia developers think about them. Hopefully others also found some enlightenment here.

They’re not really problems, no. But we tend to err on the side of conservativism when it comes to defining what something means — especially if it’s going to apply to literally Anything.

The key point is that it’s easier to define ^ yourself than it is to document and understand everything that you need to implement in order to use the general ^ method. The ambiguities, in my view, were a symptom of this problem and not wholly unrelated.

Part of the beauty of Julia is that things don’t need to be built in for you to write them how you want (and get the performance you want).

2 Likes

If anyone is interested, I have a generic implementation in AbstractTensors.jl and Grassmann.jl

With exp function defined in Grassmann.jl/composite.jl at master · chakravala/Grassmann.jl · GitHub

Intended for ^(::Number,::TensorAlgebra) and ^(::TensorAlgebra,::Integer) inputs.

2 Likes