HN comment on Julia vs Swift type checking and auto diff

#1

Anyone know if this is accurate?

https://news.ycombinator.com/item?id=19897585

“I don’t think static type checking is a matter of tooling, getting Julia to perform well amounts to either a lot of trust in the compiler (is static type inference for the Julia type system even possible?) or manual type annotations as far as I can tell. And static types are not there in a large program to make them necesarily faster, but to automatically enforce contracts. For this to work they have to be mandatory, not added on as an after thought. I don’t want to be able to run a program only to discover it crashes because a method was not overloaded with this particular combination of parameter types.”

Also some comments on how hacking autodiff into the swift compiler in c++ means it will be more perfomant than messing with Julia IR. Doesn’t sound right to me

1 Like
#2

I recall that you initiated a similar discussion,

in which you did not participate after people replied. There are some relevant points there.

#3

I appreciated the discussion, but I had nothing to add. Also the other discussion was focused on general benefits and tradeoffs of static vs dynamic, and this one on specific claims and counterclaims about extending Julia’s tooling to do a bit more of what swift does at compile time (er JIT time) , to ameliorate a potential drawback.

2 Likes
#4

“All” that it takes to add static type checking to Julia is to designate a checkable subset of the language with rules such that if those rules are followed, the subset is guaranteed to be type safe. Then people can activate a “static” mode for portions of their code such that they are only allowed to use this subset of the language and will get a guaranteed compile time error if they stray from it. Then there cannot be any type errors in their code if it compiles. This is not impossible but hasn’t been a high priority, but we’ve started to talk about it in compiler discussions because this kind of thing would be very helpful in targeting GPUs and TPUs.

An interesting observation is that some newer static languages like Crystal and Nim—and to some extent Rust and possibly Swift as well—are forgoing the traditional static approach of having well-defined formal type rules that have proofs of completeness and soundness. Instead, they are allowing programs to compile so long as the current compiler happens to be able figure out types. This is, in practice, very similar to what Julia does, if Julia just threw an error whenever the compiler couldn’t infer precise types. There are some major differences though:

  1. Julia doesn’t care if you write code that can’t be inferred, it lets you run that code, it just might be slower;
  2. Julia has been designed so that whether the compiler can infer types or not is an implementation detail that (mostly) only affects performance.

This StackOverflow answer has more details about the difference. By comparison, in these recent static languages, it is very much exposed to the user what can and cannot be inferred since you cannot compile a programs unless type inference succeeds. This allows additional flexibility in usage, which is great in the short term, but in the long term, I suspect that it completely locks these languages into their current compiler implementations. Users get very upset if programs that used to work stop working, so the compiler is effectively never allowed to get worse at figuring out precise types. But any major compiler change tends to make some inference more precise and some less precise. As time goes by and the precision of inference in these languages ratchets monotonically upward and there’s more user code in the world that they’re not allowed to break, I suspect that these languages will be increasingly locked into how their compiler currently works and will be unable to change it in any way. This is not too dissimilar from how CPython is locked into its C API and cannot make implementation changes that would improve performance because it would break too many Python packages.

By contrast, Julia 1.0 came with a complete compiler overhaul that gave major performance improvements and while there were lots of breaking changes in the release, they were due to explicit API changes, not the compiler rewrite. There have also been pretty significant compiler changes in 1.1 and 1.2, which have been non-breaking. Ironically, I suspect this situation will prevent these static languages from doing the kinds of innovative compiler work that has made technology like Cassette and Zygote and compiling to GPUs and TPUs possible in Julia. Since Julia isn’t committed to precise static semantics, we have a huge amount of flexibility when it comes to evolving and improving the compiler design. Traditional static languages have given themselves wiggle room by picking formal type semantics up front—any implementation of those semantics is valid, which allows them to experiment and change the compiler. This new generation of static languages has given users more flexibility by giving up on the abstraction barrier of formal static semantics but at the cost of sacrificing future flexibility in compiler implementation.

I could be wrong about all of this, but my prediction is that innovation in compiler tech in the recent crop of static languages may grind to a halt as they are used more widely and become less willing to break user code.

26 Likes
#5

This reminds me of other similar choices that I learned about on this forum, eg that the compiler is free to decide whether to copy immutables. “Freedom to the compiler” could be an official Julia design principle :sunglasses:

5 Likes
#6

Warning: rant below.

Yeah, but performance is the #1 reason most of us came to Julia in the first place, and it has a major impact on how we structure our code and types. If someone’s code took 1s in Julia 1.0, but 100s in 1.1, then it would get filed as a performance regression, and it would be promptly fixed in a patch release AFAICT. Do you really have that much compiler freedom? Have you used it? When has there been a major performance regression on a commonly used idiom?

I’ve seen Jeff’s threat of inferring a lot of stuff as Any, to reduce compiler latency. As someone managing rather complex, performance sensitive code with missings, units, and parametric types, I worry. If Julia 1.2 is suddenly 10X worse because of some failed inference, it might lock us on Julia 1.1 for a while, until I get the courage to dive into the code and satisfy the compiler’s new, unstated demands.

Furthermore, the “compiler freedom” idea justifies/explains the lack of information. There’s no user-facing docs for constant propagation (that I know of) because “it’s an implementation detail subject to change at any time”. So if I want to write high-performance, generic code, the only solution is to hang out on discourse, github and slack to soak in the occasional bubbles of wisdom that percolate from the dev team.

I once had a function that was

  • type-stable on vectors of floats and missings,
  • stable on vectors of unitful quantities
  • unstable on vectors of unitful quantities and missings

IIRC it required a significant code refactoring to solve the issue. How was I to know in advance?

I’m just ranting here. I don’t have a solution. An optimizing compiler is an amazingly complex juggling act, and Julia is one of the best systems I’ve ever used. I get why the dev team is enthusiastic about compiler freedom. I can see the upside. However, as a performance-seeking user, it is frequently frustrating.

A lot of the pain comes from even identifying where the bad performance comes from. Tooling (like Traceur?) helps. @code_warntype was great in 0.6, but it’s a lot more cryptic in 1.0, and I don’t know if I can trust its output (constants aren’t propagated, are they?). Also: @type_stable blocks.

EDIT: testing for perf. regression is also messy, but it just occurred to me that if there was a @count_dynamic_calls foo(...) macro, it could be a perfectly deterministic way of quantifying performance for testing. And it sounds very doable. Dynamic calls are already slowish, increasing a counter should not be an issue.

8 Likes
#7

That sounds scary. Where is that found? I’m having a hard time imagining this.

My understanding was that constant propagation is not now, nor was it ever mature enough that one should rely on it in performance critical applications. As such, I’ve always treated it as a bonus, so I’ve never found my code to be suddenly slower because constant propagation is not working as expected. I certainly agree that it should be more well documented.

Seems pretty clear to me, at least for the purposes of identifying type instability. Granted, I usually start on my more basic functions and work my way up rather than trying to disect the @code_warntype on a really elaborate function call. Even in that case, however, it seems to indicate pretty unambiguously whether the code is type stable in the end.

That sounds to me a lot like the type union was just too big. I usually assume that if I can’t write the relevant type union simply (e.g. Union{Int,String,Missing}) then Union types are not going to do me any good type stability wise.

I don’t want to come off as fully drinking-the-cool-aid here: type instability arising from IO and generic container types can be frustrating, and I can’t help but feel that the language still lacks some expressiveness to elegantly describe things like NamedTuple. Furthermore, there seem to be a whole bunch of optimizations due for StaticArrays such as a special case handling of NTuple{N,T} (i.e. so that the compiler understands that all of the Tuple arguments are the same).

That said, I certainly feel that there is always some subset of the language I can work in to make things type stable. Usually once my containers are all properly typed, everything works as expected. Because of this, @StefanKarpinski’s implication that it would be straight-forward to designate a static subset of the language doesn’t seem far-fetched to me, in fact I often think about this.

If anyone has a collection of particularly pathological examples where type instability is either very difficult to determine, impossible, or impractical to resolve, I’d definitely be interested to see them. I assume those who work on the compiler would be interested as well.

2 Likes
#8

It has already been happening, you just didn’t notice :wink:. Seriously, I think that a lot of the latency improvements in 1.2 wouldn’t have been possible if we were unable to occasionally infer less precise types. Some of the big issues that have been solved have been of the “type inference wild goose chase” variety, where type inference gets caught up in some combinatorial explosion and takes minutes to infer something. One may think “oh, of course you should fix that” but what if we weren’t allowed to fix that because there was code out there that would stop compiling if we didn’t go on that wild goose chase? You can also say, oh, but surely if the choice is between code that compiles but takes an absurdly long time to do so and breaking that code, then breaking that code is a viable option. But that’s not necessarily the choice from one user’s perspective—it may not be their code that takes minutes to compile, it may just be that changing the heuristics to avoid the wild goose chase in some cases also causes other code where there old heuristic worked perfectly fine to stop being inferred. Those people are gonna be pissed when their code that used to work and compile in reasonable time stops compiling entirely.

Of course, we don’t want code to regress, but keeping good performance is much more forgiving than never being allowed to reduce type inference precision. We could, for example, do less aggressive type inference but then guess what we think the type of something might be and emit code that’s efficient for that case, while handling other cases with slower, more generic code (and potentially even recompiling if we’re wrong too often). Without the aforementioned freedom, that wouldn’t be an option.

That sounds like a difficult situation, but that kind of thing has gotten much better in later Julia releases precisely because of better heuristics and without latitude to change heuristics these things might have been much harder to improve.

I don’t think this is entirely accidental and it has to do with the experimental and iterative approach that Julia takes to these things—which is a bit unusual for compilers, but I think the results speak for themselves. It is definitely frustrating when you’re living on the edge of what the compiler can handle, but there’s still a pretty large “safe zone” where your code is certainly not going to regress no matter how the compiler changes.

13 Likes
#9

All good points.

If there was a documented safe zone,

That would help tremendously my ability to plan ahead. I’m glad that it’s on the table!

4 Likes
#10

Perhaps a more explicit definition of what constitutes this “safe zone”, what kind of code constructs are guaranteed to not regress, would help.

#11

Sure, that would be great to have. Which do you want the most?

  1. The new multithreading system;
  2. A solution to the time-to-first-plot problem;
  3. A precise safe zone definition.
6 Likes
#12
julia> @time using Gaston; plot(1:10)
  1.601037 seconds (1.13 M allocations: 55.433 MiB, 1.00% gc time)

julia>

:slight_smile:

#13

One great thing about Julia’s release process is that, IIUC, the tests of the packages registered in General are run before the actual release. So, if package authors want certain language constructs to be in the “safe zone”, they can test it with @inferred etc. in their test suite. Maybe having SafeZone.jl for a central collection for such test suite can be a good idea?

Of course, one pitfall could be that core devs might ignore failures from such tests because they (need to) use functions explicitly marked as implementation details.

#14

Just for curiosity, would this new multithreading system implement this: Multiple Julia instances in a C++ software ?

#15

That is what Jeff is talking about there.

1 Like
#16

Yes :smiley:

5 Likes
#17

I once had a function that was

  • type-stable on vectors of floats and missings,
  • stable on vectors of unitful quantities
  • unstable on vectors of unitful quantities and missings

IIRC it required a significant code refactoring to solve the issue. How was I to know in advance?

As long as there are constants like these that affect inference, this kind of unfortunate outcome is unavoidable. (And at least currently, those constants appear to be pretty important.)

I would guess that the only resolution is to have @why_did_inference_fail foo(x, y, ...) so that the problem can be readily diagnosed.

4 Likes
#18

It is worth pointing out here that all compilers are complex and opaque with respect to performance. Having a static type system doesn’t help as much when what you really care about are inlining decisions, escape analysis, constant propagation, etc. So I don’t think Julia asks you to “trust” the compiler that much more than usual. Is it helpful to have visibility into all this? Absolutely, and I hope that doing a good enough job of that means you don’t really need a specified static type system to get performance.

14 Likes
#19

Could the user have control over this?
Namely have some “Volume Dial” to set the balance between reduce latency to performance?

#20

I think this observation is very important — I am fine with inference having various limitations (given the benefits, as explained in this topic), provided that

  1. they can be detected easily, mostly in an automated way, so I can test for them,
  2. it is easy to identify, diagnose, reason about, and ideally correct the problem.

(1) is mostly solved by Test.@inferred (though a non-erroring @isinferred would be nice, too), but (2) is currently a large barrier to entry for new Julia users; as you explain above, these skills are learned by doing, and not really formalized.

4 Likes