Type domain (static) integers

nsajko · August 24, 2024, 4:52pm

I’m trying to get type domain integers into Base:

github.com/JuliaLang/julia

`Integer`s in the type domain

JuliaLang:master ← nsajko:TypeDomainIntegers

opened 11:04AM - 23 Aug 24 UTC

nsajko

+1348 -40

Introduce `TypeDomainInteger`, a subtype of `Integer` in the type domain. That i…s, each value is of singleton type. These type domain integers are then applied to many places in `Base` (yet other applications are also possible). To be clear, this should go into `Base`, as opposed to a user package, because: * the invalidations when defining a new `Integer` subtype are horrid * it should be utilized from within `Base`, e.g., it fixes the bug below for irrationals Fixes #37977 Updates #34003 Closes #44538

Type domain integers are not a new idea, there was previously an effort by @Zach_Christensen to get a StaticInt type into Base, and two relevant packages exist, StaticNumbers.jl and Static.jl. That said, my design is more elegant and the types are more expressive than previous efforts (the previous effort are basically just Val design-wise).

Looking at some previous discussion on this forum and on Github, e.g., Static.jl vs StaticNumbers.jl, I see I’m not alone in considering type domain integers as a fundamental feature, and a feature that would be useful in user packages. Even though the two mentioned packages kind of implement the feature, it’s actually not possible to do this properly outside of Base, because having it outside of Base means it’s not available for improving Base, not without type piracy.

Despite this, the opposition to including type domain integers in Base is strong, to say the least. I’d like to know more about the reasons people are opposed to getting this into Base, so I could better prepare for a possible triage discussion.

nsajko · August 24, 2024, 4:56pm

Responding to a historic criticism:

Disagree, integers are much more fundamental. Floating-point numbers, in particular, may be constructed from the integers (indeed, that’s how they’re defined, and that’s how they’re implemented), so a “static float” or another “static real” type is less essential for inclusion into Base.

That said, I think adding more such types into Base could be a good thing. It’s true that Base is too fat (e.g., it depends on GMP and even MPFR), but that’s no argument for excluding these more fundamental features from Base.

adienes · August 24, 2024, 8:50pm

You have used the words “useful,” “fundamental,” “expressive,” “essential,” several times, but you have not made the case as to why a type domain integer that subtypes Integer and behaves like an integer arithmetically etc. is so important.

Personally, I don’t really see the reason. Do you have any benchmarks for a common set of use cases showing relevant performance improvements over plain value-domain integers?

invalidate a huge chunk of Base code, as shipped in the default sysimage. This means, for example, that just loading a package defining a trivial Integer subtype from a fresh REPL session will cause lag on the order of a second when trying to do almost any action, including just pressing a key, while the invalidated code is getting recompiled. Thus depending on a package that defines a new Integer subtype isn’t acceptable for almost any package in the ecosystem

This is unfortunate, but not really a reason to put something in Base if Base does not itself depend on the functionality.

aplavin · August 24, 2024, 8:58pm

Do you see that with StaticNumbers.jl? I’ve been using it whenever I need static integers (not too often, but sometimes useful), didn’t notice any major issues.

nsajko · August 25, 2024, 9:26pm

The large majority of invalidations in Base happened to be caused by a recent regression on master. I hope almost all of these will eventually be fixed or worked around, so I edited out the mention of invalidation in the OP. See:

github.com/JuliaLang/julia

fix a huge part of invalidation on defining a new `Integer` subtype

JuliaLang:master ← nsajko:getindex_String_UnitRange

opened 06:40PM - 25 Aug 24 UTC

nsajko

+1 -1

I have no idea why this helps, it really seems like Julia should be able to infe…r the type of `n` as `Int` on its own. Perhaps something to do with the `print`-related callers of `getindex` taking `Vararg` args? MWE: ```julia using SnoopCompileCore length(@snoop_invalidations begin struct I <: Integer end Csize_t Csize_t(::I) = Csize_t(0) end) ``` It returns `5806` before this change and `46` after the change.

Palli · August 26, 2024, 9:54am

It’s not obvious from your PR or text here if/how this relates to floating point. I want to understand this more, and I see it fixes an issue related to irrationals, i.e. for “one(pi)”. I’m just thinking would an analogous TypeDomainFloat make sense?

jakobnissen · August 26, 2024, 12:40pm

I must admit I don’t understand why you’d want this in Julia at all, much less Base. Though I gather, from the wide use of Static.jl and similar packages, that there must be a use case.

From my (currently uninformed) perspective, I’m against it. It seems to me that the major motivation for static integers is to move computation from runtime to compile time. However, I think this is a poor solution for several reasons:

Conceptually, it’s conflating the type domain and compile time computation. That’s somewhat understandable, as we kind of have to do this in Julia to get compile time computation, as it is now. But I’m not wild about leaning further into this conflation. I would much rather see a solution where constant propagation was beefed up. A discussion about how constant propagation for constant struct fields could be enabled would be more fruitful.
I would think that what one would want most of the time is constant evaluation rather than specialization, and only specialization some of the time. Type-level integers force specialization all the time.
I don’t really understand the advantage over using Val. Sure, you can do StaticInt(1) + 1, but what is the advantage of this over unwrapping the Val to an integer, then doing your operations? This should work and be properly constant folded already. For example, foo(x::Val{N}) where N = Val(div(3N + 1, 7)) when called with Val{200}() already correctly infers that it returns Val{85}().
It raises the uncomfortable question about other static values. What about floats or Chars? The fact that the underlying problems (specialization and compile-time computation) have nothing to do with integers but the proposed solution only solves it for integers seem like a design problem.

nsajko · August 26, 2024, 12:40pm

It doesn’t.

nsajko · August 26, 2024, 12:55pm

On my phone so I’ll just respond to this for now.

Stronger constant propagation would certainly be nice, but it can’t make “values in the type domain” (in general) obsolete, because constant propagation is a compiler optimization, i.e., it’s a best-effort kind of thing. Apart from that, stronger constant propagation would actually make my design more useful, because it could improve the abstract type inference results.

aplavin · August 26, 2024, 1:16pm

I also don’t see a reason to put into into Base, but regarding usecases in general:

Having static integers that are actually <: Integer allows the user to plug them into whatever code works with integers – and potentially get more stability/speedup. A simple example:

julia> using StaticNumbers

julia> f = Base.Fix2(getindex, 3)
(::Base.Fix2{typeof(getindex), Int64}) (generic function with 1 method)

julia> g = Base.Fix2(getindex, static(3))
(::Base.Fix2{typeof(getindex), StaticInteger{3}}) (generic function with 1 method)

julia> xs = [(1, 2., "3", '4') for _ in 1:10^4];

julia> @btime map($f, $xs);
  58.541 μs (10011 allocations: 547.22 KiB)

julia> @btime map($g, $xs);
  6.458 μs (2 allocations: 78.17 KiB)

while Val can only be passed to functions that explicitly support it.

adienes · August 26, 2024, 1:24pm

I appreciate that you provided a concrete example where the existence of a static integer yields a performance improvement.

While I understand that your point was not necessarily about this example in particular, rather I understood your point to be more general about getting potential “accidental” speedups from various dispatches / internals / etc. by just switching Int to StaticInteger, I do think it should be noted that @btime map(x->x[3], $xs); is exactly as performant as the static version here and avoids the extra dependency.

aplavin · August 26, 2024, 1:28pm

Yes, sure, you can even simplify this example to fill("3", 10^4) (: Even faster and shorter. That’s not the point of the example though.

nsajko · August 26, 2024, 2:02pm

Type domain integers are the motivation. Types are great:

they can be dispatched on
you get type safety automatically, instead of having to implement various run time checks
they allow elegant compile time computation

Really don’t see where you’re coming from. Julia’s powerful type system is one of the best things about Julia.

Already addressed this in the above reply, improvements to constant propagation are independent of and complementary to type domain integers.

Not sure what you mean.

There’s @nospecialize and @nospecializeinfer when necessary.

This is like asking, “why write Julia when there’s assembly language”. It’s ugly and you’re giving up on generic code. It also AFAIK puts more pressure on Julia’s compiler than my design, and results in failures sooner (for less recursion depth). And the inference is worse.
As noted above, my design is more expressive than that. E.g., there are types such as NegativeInteger, NonnegativeInteger, PositiveInteger, etc. These all subtype TypeDomainInteger and it’s possible to express other types, such as GreaterThanOne, etc. Positives also subtype nonnegatives, of course.

This is addressed above. TypeDomainFloats may perhaps a good idea, but they’re less fundamental than type domain integers. TBH I don’t think I’d ever use type domain floats, but type domain rationals would certainly be nice. Will probably make a package for type domain rationals. In any case this tangent isn’t very relevant.

Don’t understand this. Underlying problems of what?

Yeah, and I’m quite proud of the type safety.

mbauman · August 26, 2024, 3:00pm

Type inference is also a compiler optimization, though. In cases where it fails (or is off), the consequences are so much worse than constant propagation failing. I know that personally, I no longer reach for Val. Lots of work has improved constant prop significantly in the past few releases, and it’s great.

This is a really cool bit of code, and includes some impressive and clever magic. But it’s also so magical and clever that I can’t meaningfully provide input on it, except to say that the recursive construction makes compile time pretty inscrutable and highly exponential just to construct these things. Using the package:

julia> using TypeDomainNaturalNumbers, Dates

julia> t = now()
       NonnegativeInteger(1000)
       now() - t
6222 milliseconds

julia> t = now()
       NonnegativeInteger(1000)
       now() - t
890 milliseconds

julia> t = now()
       NonnegativeInteger(2000)
       now() - t
24574 milliseconds

julia> t = now()
       NonnegativeInteger(2000)
       now() - t
4221 milliseconds

julia> NonnegativeInteger(10000); # many minutes

I know, I know, you’ll tell me, don’t do that. But then what does this gain us over the simpler Zero() and One() types? Defining zero() seriously · Issue #34003 · JuliaLang/julia · GitHub, where it’s not been clear that even those two limited types would be advantageous.

CameronBieganek · August 26, 2024, 3:54pm

I wouldn’t say that type safety exists in Julia, because Julia is a dynamic language. All errors are runtime errors.

Janis_Erdmanis · August 26, 2024, 6:18pm

Julia will end up being a static language for programs whose call graph can be extracted from the entry point. Hence, runtime or compile-time dispatch errors become just as different between compiling binary or running C++ in REPL mode. Type safety offers better assurances than runtime assertions, where the latter needs to be checked after each function call.

Julia really shines with its rich type system and its ability to put values in a type domain. The other few languages that can do similar things are Haskell via type classes and OCaml. One particularly good use for them is when one wants to model cryptographic groups, where group parameters can be put at the type domain and particular point coordinates within instances. More of that can be seen in the CryptoGroups.jl library, which is similar to the Haskell elliptic-curve library where type classes are used.

nsajko · August 27, 2024, 9:26am

There are two things going on here:

My design of natural numbers is such that representing a larger natural number has greater cost for the compiler than representing a smaller one, because the type is larger/more deeply nested. This is unlike the Val-like design in which the type is parameterized by an isbits value. ~~IMO this is usually fine~~, although now I realize I should have been upfront about this. ~~The reason it should be fine is that, even with the Val-like design, a large integer causes a greater cost for the compiler when used in operations (e.g., arithmetic).~~
In general, putting values into the type domain is expensive. Except when constant-folded, which is not the case in your example, where 2000 is not a compile-time constant.

EDIT: the third thing going on here is that, for a call like NonnegativeInteger(n), the number of run time dispatches is O(n) as currently implemented. It’d be simple to “fix” this in my code by moving n into the type domain, and then back, after a function barrier; proceeding with conversion afterwards. Then the conversion itself would be constant folded, and there would be only a single run time dispatch in total. That said, I’d say it’s better to leave this forceful specialization to user code, because n would realistically be a compile time constant most of the time anyway.

caleb-allen · August 27, 2024, 12:25pm

Can you expand on this? I’m curious what this looks like in terms of usage—from a naive perspective I can only envision it as a Val-like design. Why is your design better?

caleb-allen · August 27, 2024, 12:33pm

In the PR you mention:

the core design was copied from a package of mine.

Can you make this available?

nsajko · August 27, 2024, 2:11pm

I withdraw this statement . I was thinking specifically about the case of recursive algorithms where the depth of the recursion depends on the value of the integer, a case my design should be especially well-suited for. So the Val-like design has the benefit of being able to represent larger values than my design in practice.

In conclusion, there’s a trade off here between the type domain integer designs, so perhaps it’d be better to put the old Val-like design in Base, rather than my new recursive design. I’d still like to see type domain integers in Base, though.

Topic		Replies	Views
Static.jl vs StaticNumbers.jl Tooling	24	1248	September 14, 2022
Default Real (Float) kind? Internals & Design float	50	875	June 7, 2025
Julia messes up integer exponents? Performance integer-overflow	106	6462	February 21, 2019
Tonight I hated julia, REPL and overflows New to Julia integer-overflow	64	6306	March 14, 2020
Proposal for ∈ isa type syntax Internals & Design proposal , syntax	31	2900	June 4, 2019

Type domain (static) integers

Related topics