Typeof(n*n) in factorial

I can be wrong but to me it seems like a performance improvement. If the bindings in both sides of *= are typeasserted to the same type, then no conversions or type instability happens.

However, it seems like it can affect the correctness of the algorithm. If * can return a type different of both operands (and it can, as this is the whole reason for using typeof(n*n) instead of typeof(n)), then there is no guarantee that f * i will return typeof(n*n), there can be that typeof((n*n)*(n*n)) is a different type, in this case declaring a type for the variable make the algorithm try to convert the new type back to the old type, and if the new type does not fit the old type, what is very probably the reason because it is different, then an InexactError will probably be raised.

I think the reason for n*n might be related to Int64 always returned:

julia> factorial(Int32(10))
3628800

julia> typeof(ans)
Int64

julia> typeof(ans)  # unlike for Int32(10)*Int32(10); @code_native UInt32(10)*UInt32(10) gives different instruction than for 10*10, and on some CPUs/platforms faster (only 32-bit ones?)

Int64

I was tuning this code (successfully, but didnā€™t get merged), and itā€™s a bit ugly as intentionally it returns Int64 (also on 32-bit platforms, still type-stable on both).

I do not recall, and didnā€™t have 32-bit Julia to test, but I know 32-bit assembly multiply is faster than for 64-bit (Iā€™m not sure itā€™s much faster but I was going to exploit this in another context), so thatā€™s one reason for n*n to return a wider type 64-bit, while if I would again multiply it with 32-bit or 64-bit I wouldnā€™t go wider still, as then thereā€™s no fast multiply (you can stil lwork with Int128, itā€™s just slower).

[32-bit seems like a happy medium for me in a lot of contexts, such as array indexes, not base pointers, and then if I widen to 64-bit on multiply, I get space saving without overflow of multiply, getting best of both 32- and 64-bit.]

With my optimization, I was changing that, because lookup-tables are surprisingly slow (unlike in the old days, when a useful optimization trick). I.e. I was changing for Int (but also BigInt, I was less concerned with speeding that up, as Iā€™m not sure how much used in practice, but a fallback to my Int code, and maybe a lookup for low values might help).

But thatā€™s an explicit conversion and has nothing to do with the method above.

The only thing I could find that calls that method is factorial(true).

I think that the bottom line is that while Base has a lot of high-quality code, it also has stuff thatā€™s there for no apparent reason (or maybe there was a good reason originally, but things moved on).

This particular code will probably be changed when someone refactors it for some other reason unrelated to this, since for most numerical types,

typeof(n * n) === typeof(n)

so it is innocuous.

Incidentally, I wish people wrote more comments with code.

2 Likes

One could specify that user-defined types should provide a method for Base.to_power_type unless the fallback already works. Then implement factorial using that. Something like:

function factorial(n::Integer)
    n < 0 && throw(DomainError(n, "`n` must be nonnegative."))
    i = to_power_type(n)
    f = one(i)
    while i > one(i)
        f *= i
        i -= one(i)
    end
    return f
end

Still, Iā€™m quite happy with the current implementation. It works out of the box with StaticNumbers, and that would not be the case if it had used typeof(n) as the output type.

2 Likes

I disagree with this ā€” unless someone demonstrates a use case/exception, types that we think of as ā€œnumbersā€ should just be closed under *.

As mentioned above, there are types for which this does not hold, but for those factorial makes little sense anyway.

If we really want to formalize it, I would go with a trait that tells whether a type is closed under *. But that may be overkill.

1 Like

have a look at Base.power_by_squaring it already uses

to_power_type(x) = convert(Base._return_type(*, Tuple{typeof(x), typeof(x)}), x)

@Tamas_Papp: An example that comes to my mind is with symbolic tools: an expression can be written efficiently (in a particular representation), but itā€™s square (or nth power) canā€™t and then you may wish to change into generic implementation.

By that standard, Base.Irrational doesnā€™t qualifyā€¦

I think StaticNumbers is a good example of Number types that are useful but not closed under multiplication. Although they are mostly intended for things like passing the sizes of NTuples and StaticArrays in a type-stable way, they are also meant to work with general math, and usually lead to efficient constant-folding. For example:

julia> @code_llvm factorial(static(20))

;  @ intfuncs.jl:889 within `factorial'
define i64 @julia_factorial_7662() {
top:
;  @ intfuncs.jl:895 within `factorial'
  ret i64 2432902008176640000
}
1 Like

Precisely ā€” thatā€™s why I used ā€œnumbersā€ instead of <:Number.

It would be great to be able to have traits that determine if a type is

  1. closed under +, -, and * (more or less a ring)
  2. closed under that and / (a field),
  3. closed under transcendental functions etc

with alternative functions that just give a type that can contain the result for each of the above, eg

transcendental_result_type(::Type{Irrational{T}}) where T = Float64

Much more fine-grained descriptions would be possible, and also a complete set of shadow operators over types (eg Int/Int === Float64, what is currently done with return types), but overcomplicating this may be too much at this point, the caller should just promote and arrive quickly at a concrete type that can contain the result, or fall back to Any.

I tried to do part of this in

in a more complicated way, but that does not scale.

3 Likes

That package looks really useful! All that is needed is than anyone who creates a number type for which the default result_ring or result_field donā€™t give the right result add their own methods.

(It might also be nice to have of_ring_type(x) as a shorthand for result_ring(typeof(x))(x).)

Thanks, but I am now considering revamping this. TL;DR:

result_field(T, S)

is probably not needed as often as

result_field(promote_type(T, S))

so simple functions that map from types to types should be sufficient for 99% of the cases and would allow much simpler code.

1 Like