In Julia, if I write Int8(127) + 1 I get 128 as an answer (and not an overflow), because the two numbers are promoted to Int before performing the addition. The same promotion occurs for other operations like -, *, and /, but not for ^. I’ve read in promotion.jl that ^(x::Number, y::Number) = ^(promote(x,y)...), so I expected that the result of ^(Int8(127), 2) would be 16129 of type Int, but instead the result is 1 of type Int8. In fact, the result of ^ between integers seems to be always of the same type of the first argument:
x = Int64(3)
y = Int8(5)
x^y
# result: 243 of type Int64
What is really weird for me is that if I call promote() explicitly it works: ^(promote(Int8(127), 2)...) returns 16129 of type Int64 as I expected. But then what is the purpose of that line of code in promotion.jl? It doesn’t seem to be applied. What am I missing here?
julia> @which x^y
# https://github.com/JuliaLang/julia/tree/5e7b6dd090791ca545897d9625a82cae34c116b1/base/intfuncs.jl#L299
^(x::Number, p::Integer) in Base at intfuncs.jl:299
Ahh, I see. Indeed, if I extend Base.^ by saying ^(x::Number, p::Integer) = ^(promote(x,p)...), then ^(Int8(127), Int64(2)) yields 16129 of type Int64. But why is ^ different from other operators in this respect? I’ve read some comments in the code about letting the compiler optimise the operation according to the type of the exponent… Can this be the reason? Still, it’s a little puzzling to me that ^ should be so special.
I would imagine it would be unacceptable from a performance point of view if x^2 is calculated as pow(x, 2.0) instead of as x*x. Promoting the exponent to a float would be pretty bad.
That’s exactly what happens, though: if we specify either the base or the exponent as a float, promotion occurs. E.g. 3^2.0 == 9.0 or 3.0^2 == 9.0. This is also what I would expect after reading the documentation. But the problem is just for integers of different sizes (no need to involve floats or other data types). Int8(3)^Int64(2) returns 9 of type Int8, so no promotion occurs… whereas for all other operations, and for exponentiations between integers and floats, promotion occurs. E.g. Int8(3) + Int64(2) returns 5 of type Int64. I agree with @jling that the proximal reason is that exponentiation between integers is implemented as x * x repeated p times, so the type of the exponent is not considered. Isn’t this behaviour inconsistent? It could be easily fixed by adjusting the power_by_squaring function. And since we are limited to exponentiations between integers (albeit of different sizes) I don’t think performance plays an important role here.
Pardon my poor choice of words, what I mean is the following.
In the case of 3.0^2, the result is of type Float, the ‘common type’ between Float and Int.
Similarly, in the case of Int8(3) + Int64(2), the result is of type Int64, the ‘common type’ between Int8 and Int64.
From my point of view, the expected behaviour of Int8(3)^Int64(2) should be to return 9 of type Int64, betcause Int64 is the common type between the two arguments. The actual behaviour, however, is that this operation returns a number of type Int8, because of how exponentiation is implemented in power_by_squaring.
What I would do is, at the beginning of the power_by_squaring(x_, p::Integer) function, convert x_ (the base) to the common type between the base and the exponent, and leave the rest of the algorithm unchanged. It would still result in exponentiation between integers, but the behaviour would be consistent with the other operations.
No, - , * will always give integers for integer input, while / will always give Float64, and when b in a^b is negative (and since it is special-cased for the sign, I’m guessing it was chosen to ignore type of b when positive, unless when its BigInt). In both cases Rational is arguable more correct, but slower, so tat was decided.
See my thread (/ is the most problematic operator, and simply promoting to type of either a or b is not enough, nor is it enough for any other operator, just most likely to be enough):
Regarding the discussions about integer exponents and integer overflow, note that I don’t have a problem with unchecked integer overflow. 2^63 equals -9223372036854775808, which is totally acceptable and unsurprising, I’m just saying that Int8(2)^63 should give the same result, because one of the arguments of ^, namely 63, is of type Int64 (at least on my system). On the other hand, Int8(2)^Int8(63) should give 0.
I also think that my discussion does not involve Floats at all. It seems that powers with float arguments are handled by a different function:
julia> @which 3.0^2
^(x::Float64, y::Integer) in Base.Math at math.jl:922
whereas
julia> @which 3^2
^(x::T, p::T) where T<:Integer in Base at intfuncs.jl:290
So, by changing the power_by_squaring() function, there wouldn’t be a performance decrease stemming from promoting integer exponents to floats.
Anyway, as @stephancb said, changing power_by_squaring() would be breaking, at this point. I guess I’ll just note down this behaviour and be careful with my code
maybe we can look at it this way: when the exponent is an integer, you’re just multiplying base to itself that many times, you’re not really doing (basic) arithmetic with two numbers, one of them is for counting purposes, so what type of integer the exponent is doesn’t matter at all.
But that’s just a bad argument. Conceptually it’s the same as Int8(2) * Int8(2) * … * Int8(2), 63 times and if you spell it out it would evaluate to Int8. Neither the former or latter type is ideal, so I disagree with “should”. And:
Since you’re ok with modular arithmetic (many would want a BigInt result), it’s not too bad to get result only consistent with the first argument.
I had this in mind:
julia> typeof(Int8(2)^-63)
Float64
so simply promoting to the latter type will not do, in fact neither. That works because of some magic in Julia, I thought it actually equivalent to:
julia> typeof(Int8(2)^Int64(-63))
ERROR: DomainError with -63:
Cannot raise an integer x to a negative power -63.
Make x or -63 a float by adding a zero decimal (e.g., 2.0^-63 or 2^-63.0 instead of 2^-63), or write 1/x^63, float(x)^-63, x^float(-63) or (x//1)^-63
Stacktrace:
[1] throw_domerr_powbysq(#unused#::Int8, p::Int64)
@ Base ./intfuncs.jl:228
[2] power_by_squaring(x_::Int8, p::Int64)
@ Base ./intfuncs.jl:249
[3] ^(x::Int8, p::Int64)
@ Base ./intfuncs.jl:274
[4] top-level scope
@ REPL[46]:1
A primary motivator for promotion seems to be to “ensure” the output type is capable of representing “most” results. However, this isn’t really feasible for integers since even Int128(2)^typemax(Int8) will overflow. Since even a base of 2 can overflow the largest native type with the smallest integer type exponent, there’s really no correspondence between the exponent type and the safe width of the output.
Personally, I am happy with the behavior of inheriting the type from the base. And while I haven’t given it any thought before, needing to write literal powers X^UInt8(3) to avoid promotion seems annoying.
I mean, the fact that 2^63 is the same as 2*2*...*2 is just because of the current implementation in Julia, not necessarily because of the concept of exponentiation. Yes, one way to mathematically define 2^63 is 2*2*...*2, but there are many other ways. It’s the same with addition: you can define x+p as x+1+1+...+1 with p ones. A direct implementation of this definition would just increment the binary value of x (p times), not caring at all about the type of p. With such implementation, Int8(127)+1 would overflow; but with the current implementation of + in Julia, Int8(127)+1 doesn’t overflow.
For sure there is a reason why addition is not implemented as repeated incrementation, and this reason may well be Speed. The same reason might underlie the choice of implementing 2^63 as 2*2*...*2. However, I would rather have a small set of rules that are independent of the algorithmic implementation, than having to remember how each operation is implemented. In this case, after reading the documentation for the first time I thought I could apply the following rule: “When Julia does a mathematical operation, the type of the result is the common type between the arguments (except when division is involved, in which case it’s more complicated)”. So I was surprised when I found that this rule didn’t apply to powers between positive integers. I really don’t care about overflow, what I was after is just consistency.
Ah, I see… So yes, you guys made good points that this behaviour is reasonable, and there are a lot of other corner cases like this that make the matter more complicated.