Why doesn't `^` promote its arguments?

julia> @which x^y
# https://github.com/JuliaLang/julia/tree/5e7b6dd090791ca545897d9625a82cae34c116b1/base/intfuncs.jl#L299
^(x::Number, p::Integer) in Base at intfuncs.jl:299
1 Like

I believe promotion is explicitly defined here?

Ahh, I see. Indeed, if I extend Base.^ by saying ^(x::Number, p::Integer) = ^(promote(x,p)...), then ^(Int8(127), Int64(2)) yields 16129 of type Int64. But why is ^ different from other operators in this respect? I’ve read some comments in the code about letting the compiler optimise the operation according to the type of the exponent… Can this be the reason? Still, it’s a little puzzling to me that ^ should be so special.

Mhh, thanks! but why is ^ treated differently from the other operators in this snippet?

that’s for literal? from an implementation perspective, power by squaring simply does this:

while p>1
  x *= x
  p -= 1
end

which is why there’s no promotion because x only multiplies with its own type.

We can change this behavior if we promote it before the loop. Idk what’s the ramification of that or if that’s even math/IEEE sound strategy

2 Likes

I don’t know, but I could imagine the reason being that the adequate target type of the promotion is dependent of the (value of the) exponent.

I would imagine it would be unacceptable from a performance point of view if x^2 is calculated as pow(x, 2.0) instead of as x*x. Promoting the exponent to a float would be pretty bad.

5 Likes

That’s exactly what happens, though: if we specify either the base or the exponent as a float, promotion occurs. E.g. 3^2.0 == 9.0 or 3.0^2 == 9.0. This is also what I would expect after reading the documentation. But the problem is just for integers of different sizes (no need to involve floats or other data types). Int8(3)^Int64(2) returns 9 of type Int8, so no promotion occurs… whereas for all other operations, and for exponentiations between integers and floats, promotion occurs. E.g. Int8(3) + Int64(2) returns 5 of type Int64. I agree with @jling that the proximal reason is that exponentiation between integers is implemented as x * x repeated p times, so the type of the exponent is not considered. Isn’t this behaviour inconsistent? It could be easily fixed by adjusting the power_by_squaring function. And since we are limited to exponentiations between integers (albeit of different sizes) I don’t think performance plays an important role here.

No, promotion does not happen for 3.0^2, it just turns into 3.0*3.0, as you actually note later in your post.

What sort of changes and behaviour would you like to see?

1 Like

Pardon my poor choice of words, what I mean is the following.

  • In the case of 3.0^2, the result is of type Float, the ‘common type’ between Float and Int.
  • Similarly, in the case of Int8(3) + Int64(2), the result is of type Int64, the ‘common type’ between Int8 and Int64.

From my point of view, the expected behaviour of Int8(3)^Int64(2) should be to return 9 of type Int64, betcause Int64 is the common type between the two arguments. The actual behaviour, however, is that this operation returns a number of type Int8, because of how exponentiation is implemented in power_by_squaring.

What I would do is, at the beginning of the power_by_squaring(x_, p::Integer) function, convert x_ (the base) to the common type between the base and the exponent, and leave the rest of the algorithm unchanged. It would still result in exponentiation between integers, but the behaviour would be consistent with the other operations.

Now

julia> Int8(16)^2
0

After the change

julia> Int8(16)^2
256

Probably some users had liked to have this from the beginning. Nevertheless, now it would be a breaking change.

3 Likes

Related, with full discusion of why Julia does as it does:

No, - , * will always give integers for integer input, while / will always give Float64, and when b in a^b is negative (and since it is special-cased for the sign, I’m guessing it was chosen to ignore type of b when positive, unless when its BigInt). In both cases Rational is arguable more correct, but slower, so tat was decided.

See my thread (/ is the most problematic operator, and simply promoting to type of either a or b is not enough, nor is it enough for any other operator, just most likely to be enough):

See very long discussion:

I think they’re describing the apparent behavior of:


julia> for op in (+, -, *, /)
           a = 1
           b = 1.0
           @assert op(a,b) isa Float64
           @assert op(b,a) isa Float64
       end

which is not true for ^, but the above behavior is just an accident for /, and the reason ^ doesn’t promote is also an accident

You are right, / will always give Float64.

Regarding the discussions about integer exponents and integer overflow, note that I don’t have a problem with unchecked integer overflow. 2^63 equals -9223372036854775808, which is totally acceptable and unsurprising, I’m just saying that Int8(2)^63 should give the same result, because one of the arguments of ^, namely 63, is of type Int64 (at least on my system). On the other hand, Int8(2)^Int8(63) should give 0.

I also think that my discussion does not involve Floats at all. It seems that powers with float arguments are handled by a different function:

julia> @which 3.0^2
^(x::Float64, y::Integer) in Base.Math at math.jl:922

whereas

julia> @which 3^2
^(x::T, p::T) where T<:Integer in Base at intfuncs.jl:290

So, by changing the power_by_squaring() function, there wouldn’t be a performance decrease stemming from promoting integer exponents to floats.

Anyway, as @stephancb said, changing power_by_squaring() would be breaking, at this point. I guess I’ll just note down this behaviour and be careful with my code :slight_smile:

maybe we can look at it this way: when the exponent is an integer, you’re just multiplying base to itself that many times, you’re not really doing (basic) arithmetic with two numbers, one of them is for counting purposes, so what type of integer the exponent is doesn’t matter at all.

10 Likes

But that’s just a bad argument. Conceptually it’s the same as Int8(2) * Int8(2) * … * Int8(2), 63 times and if you spell it out it would evaluate to Int8. Neither the former or latter type is ideal, so I disagree with “should”. And:

Since you’re ok with modular arithmetic (many would want a BigInt result), it’s not too bad to get result only consistent with the first argument.

I had this in mind:

julia> typeof(Int8(2)^-63)
Float64

so simply promoting to the latter type will not do, in fact neither. That works because of some magic in Julia, I thought it actually equivalent to:

julia> typeof(Int8(2)^Int64(-63))
ERROR: DomainError with -63:
Cannot raise an integer x to a negative power -63.
Make x or -63 a float by adding a zero decimal (e.g., 2.0^-63 or 2^-63.0 instead of 2^-63), or write 1/x^63, float(x)^-63, x^float(-63) or (x//1)^-63
Stacktrace:
 [1] throw_domerr_powbysq(#unused#::Int8, p::Int64)
   @ Base ./intfuncs.jl:228
 [2] power_by_squaring(x_::Int8, p::Int64)
   @ Base ./intfuncs.jl:249
 [3] ^(x::Int8, p::Int64)
   @ Base ./intfuncs.jl:274
 [4] top-level scope
   @ REPL[46]:1

A primary motivator for promotion seems to be to “ensure” the output type is capable of representing “most” results. However, this isn’t really feasible for integers since even Int128(2)^typemax(Int8) will overflow. Since even a base of 2 can overflow the largest native type with the smallest integer type exponent, there’s really no correspondence between the exponent type and the safe width of the output.

Personally, I am happy with the behavior of inheriting the type from the base. And while I haven’t given it any thought before, needing to write literal powers X^UInt8(3) to avoid promotion seems annoying.

13 Likes

I mean, the fact that 2^63 is the same as 2*2*...*2 is just because of the current implementation in Julia, not necessarily because of the concept of exponentiation. Yes, one way to mathematically define 2^63 is 2*2*...*2, but there are many other ways. It’s the same with addition: you can define x+p as x+1+1+...+1 with p ones. A direct implementation of this definition would just increment the binary value of x (p times), not caring at all about the type of p. With such implementation, Int8(127)+1 would overflow; but with the current implementation of + in Julia, Int8(127)+1 doesn’t overflow.

For sure there is a reason why addition is not implemented as repeated incrementation, and this reason may well be Speed. The same reason might underlie the choice of implementing 2^63 as 2*2*...*2. However, I would rather have a small set of rules that are independent of the algorithmic implementation, than having to remember how each operation is implemented. In this case, after reading the documentation for the first time I thought I could apply the following rule: “When Julia does a mathematical operation, the type of the result is the common type between the arguments (except when division is involved, in which case it’s more complicated)”. So I was surprised when I found that this rule didn’t apply to powers between positive integers. I really don’t care about overflow, what I was after is just consistency.


Ah, I see… So yes, you guys made good points that this behaviour is reasonable, and there are a lot of other corner cases like this that make the matter more complicated.

4 Likes

I don’t think ^ and * are comparable. * is usually ^ a group operation where both operands come from the same group, while ^ with internet exponents is better thought of as an r-module.

1 Like