Hashing Method Cache IDs

mrufsvold · February 7, 2023, 11:22pm

Disclosure at the top: I’m asking a question that is out of my depth, but I keep thinking about it and would really appreciate the thoughts of smart folks.

A while back, I watched this Strange Loop talk on Unison, a distributed programming language. Unison’s “thing” is that it identifies functions not by their name but by the hash of their AST representation. So if you define two functions with the exact same args and body, it will only compile one, and functions that call either of them will actually point to the same function after compilation. It seems to work really well for the problem Unison is trying to solve, which is fragmentation across distributed systems.

It got me thinking: could Julia leverage something like this to help with the (pre/re)compilation time problem? My best understanding of why recompilation takes so long after updates is that the potential for invalidations, plus the complexity of many packages adding methods, means compilation has to run pretty naively, since you don’t know whether the body of any given function is the same after an update.

What if the Expr that represents the lowered, inferred version of a specific method call were hashed, and the native code were then cached with that hash as its key? As long as the body of that function does not change, future recompilations could skip the native codegen step.
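To make the idea concrete, here is a minimal sketch of a cache keyed by the hash of a function's arguments and body. It is written in Python with the standard `ast` module purely as an analogy, not as Julia's or Unison's actual mechanism. The names `code_cache`, `compiles`, `structural_key`, and `define` are all hypothetical, and a compiled Python function stands in for native code.

```python
import ast
import hashlib

code_cache = {}  # hypothetical cache: structural hash -> compiled function
compiles = []    # names of functions that actually had to be compiled


def structural_key(func_def: ast.FunctionDef) -> str:
    """Hash the arguments and body only, ignoring the function's name."""
    # ast.dump omits line and column numbers by default, so formatting
    # and comments do not change the key.
    parts = [ast.dump(func_def.args)]
    parts += [ast.dump(stmt) for stmt in func_def.body]
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


def define(source: str):
    """Compile a single function definition unless an identical one is cached."""
    tree = ast.parse(source)
    func_def = tree.body[0]
    key = structural_key(func_def)
    if key not in code_cache:
        # Cache miss: this is the "expensive" step that a hit skips.
        namespace = {}
        exec(compile(tree, "<define>", "exec"), namespace)
        code_cache[key] = namespace[func_def.name]
        compiles.append(func_def.name)
    return code_cache[key]


add_one = define("def add_one(x):\n    return x + 1\n")
increment = define("def increment(x):  # same body, new name\n    return x + 1\n")
add_two = define("def add_two(x):\n    return x + 2\n")
```

In this sketch `add_one` and `increment` end up as the same object, and only `add_one` and `add_two` are ever compiled. One thing the sketch leaves out: the meaning of `x + 1` depends on what `+` resolves to, so a real key would presumably also have to cover the definitions the body calls and the argument types, which is where invalidations come back in.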

You could go a step further and replace all variable names with v1, v2, … vN before hashing, so renaming variables wouldn’t cause cache misses.
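The renaming step could look something like the following, again as a Python analogy rather than anything Julia does today. Arguments and locally assigned variables are renamed to `v1`, `v2`, … in order of first appearance before hashing, while globals such as called functions keep their names. `CanonicalNames`, `local_names`, and `normalized_key` are hypothetical helpers, and nested functions and comprehensions are ignored to keep the sketch short.

```python
import ast
import hashlib


def local_names(func_def: ast.FunctionDef) -> set:
    """Collect argument names and every name assigned inside the function."""
    names = {n.arg for n in ast.walk(func_def.args) if isinstance(n, ast.arg)}
    names |= {
        n.id
        for n in ast.walk(func_def)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }
    return names


class CanonicalNames(ast.NodeTransformer):
    """Rename local variables to v1, v2, ... in order of first appearance."""

    def __init__(self, locals_):
        self.locals_ = locals_
        self.mapping = {}

    def _canon(self, name: str) -> str:
        if name not in self.locals_:
            return name  # globals (e.g. called functions) are left alone
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping) + 1}"
        return self.mapping[name]

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self._canon(node.arg)
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self._canon(node.id)
        return node


def normalized_key(source: str) -> str:
    """Hash a single function definition after canonicalizing local names."""
    func_def = ast.parse(source).body[0]
    CanonicalNames(local_names(func_def)).visit(func_def)
    parts = [ast.dump(func_def.args)]
    parts += [ast.dump(stmt) for stmt in func_def.body]
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


original = "def f(x):\n    y = x * 2\n    return y + 1\n"
renamed = "def g(n):\n    doubled = n * 2\n    return doubled + 1\n"
changed = "def h(x):\n    y = x * 3\n    return y + 1\n"
```

Here `normalized_key(original)` and `normalized_key(renamed)` agree, while `normalized_key(changed)` differs because the body itself changed.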

I’m totally expecting to learn that this is a bad idea for very good and obvious reasons, so thank you in advance for your patience.

uniment · February 8, 2023, 4:35am

As mentioned in this exchange, caching of this nature is somewhere on the horizon, but it’s not on the schedule. Seeing how long and contentious that thread is, though, maybe its priority will be bumped up.

mrufsvold · February 8, 2023, 11:47am

That thread is exactly what prompted me to get off my butt and post this idea! Glad to know it’s already in the conversation. Thanks!

uniment · February 8, 2023, 9:41pm

Well, it’s in the conversation about making simple anonymous functions compile only once, not the conversation about improving TTFX. Plus, there aren’t even concrete plans for it, so I don’t know that I’d consider it solved.

Having no expertise on the matter, to me it looks like a great idea that should be launched to the top of the priority list.

mrufsvold · February 8, 2023, 9:48pm

Fair. If a core dev weighs in on it more precisely, I’ll move the “Solution” mark, but this might be as good as I’ll get for now.
