When a compiled function in Julia is no longer used, it doesn’t go away. Instead, it stays in memory. Normally this is not a problem, because you don’t generate many functions in the first place and each one doesn’t take much memory. However, in my (private) symbolic regression use case, I generated a lot of functions and learned the painful way that unused generated functions stay in memory. This poses a serious issue in semi-dynamic programs where code is generated and compiled (because it is executed enough times, or for long enough, to warrant compilation) but is also generated continuously over time, which causes a memory leak once the older functions are no longer used.
I was running my code for hours before this issue surfaced. This is certainly NOT a common issue one faces.
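A minimal sketch of the pattern described above, assuming the functions are generated with `eval` (the names `make_candidate` and `results` are hypothetical and not taken from the original code). Each generated function gets its own type and its own compiled code, and dropping the last reference to the function does not release that code:

```julia
# Hypothetical generator: builds a fresh anonymous function from an expression.
make_candidate(i) = eval(:(x -> x + $i))

world_before = Base.get_world_counter()

# Generate a few functions, call each once (which compiles it), then drop it.
results = Int[]
for i in 1:3
    f = make_candidate(i)
    # `invokelatest` is needed because `f` was defined after the loop started running.
    push!(results, Base.invokelatest(f, 10))
end

world_after = Base.get_world_counter()

# Each definition advanced the global world counter. The compiled code for
# every `f` is still held by the runtime even though `f` is now unreachable.
@show results
@show world_after > world_before
```

Run for hours with thousands of candidates, this is the kind of loop where the retained code would add up.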
There has essentially been no progress on that for a long time. It is a low priority item.
At some point I made a sketch for how we could get there, but it comes with a semantic limitation.
Functions and methods are partitioned by world ages. So if we could prove that a world age has become unreachable, we could prune methods (and everything else is just figuring out the mechanics of unloading code).
But world ages are just UInt right now… So people can store them and then use invoke_in_world to jump across time…
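A small sketch of what "jumping across time" looks like, using `Base.get_world_counter` and `Base.invoke_in_world`. The function name `answer` is hypothetical. Because the stored world is an ordinary integer, nothing tells the runtime that the old method can no longer be reached:

```julia
# Hypothetical function; each @eval is a separate definition in a newer world.
@eval answer() = 1
old_world = Base.get_world_counter()   # a plain UInt that can be stored anywhere

@eval answer() = 2                     # replaces the method in a newer world

new_result = Base.invokelatest(answer)                 # 2
old_result = Base.invoke_in_world(old_world, answer)   # 1, the replaced method still runs

@show typeof(old_world) new_result old_result
```

As long as any code might hold a number like `old_world`, the replaced method has to stay around.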
One avenue might be to turn world age into an opaque GC tracked token.
Right now it doesn’t seem worthwhile in terms of complexity and costs.
Turning world age into an opaque type is the right way to go. It’s only valid to use a world age that you got from a function that returns world ages.
Yeah, but it’s not free and will make comparison more expensive. So Jameson and I were thinking of ways of getting more features out of making them opaque that would be worth the performance loss.
How is it not free? Wouldn’t it simply be something like this?
    struct WorldAge
        age::UInt
    end
Or are you talking about the GC-tracked part? In that case, I guess it might be:
    mutable struct WorldAge
        const age::UInt
    end
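To make the difference between the two variants concrete, here is a hedged sketch. The names `InlineWorldAge` and `TrackedWorldAge` are hypothetical, chosen only so that both definitions can coexist in one session:

```julia
# Immutable wrapper: plain bits, stored inline, not a separate heap object.
struct InlineWorldAge
    age::UInt
end

# Mutable wrapper: heap-allocated, has identity, and is reached through a pointer.
mutable struct TrackedWorldAge
    const age::UInt   # `const` fields need Julia 1.8 or newer
end

@show isbitstype(InlineWorldAge)      # true
@show isbitstype(TrackedWorldAge)     # false
@show ismutabletype(TrackedWorldAge)  # true

a = TrackedWorldAge(1)
b = TrackedWorldAge(1)
@show a === b                                  # false: two distinct objects
@show InlineWorldAge(1) === InlineWorldAge(1)  # true: identical bits
```

Only the mutable variant is an object the GC can follow on its own, which is what the "GC-tracked" part would rely on.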
We have a lot of places that are essentially:
    world = current_task->world
    if code_instance->min_world <= world &&
       code_instance->max_world > world
If you put it in a mutable struct, then it’s three additional memory dereferences on a very hot part of the code. (We also need to allocate 16 bytes per world, but that’s fine since they will need to be globally uniqued.)
Not insurmountable and if we really need/want code GC we can do it, but also not something that is completely free.
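One reading of the "three additional dereferences", sketched in Julia rather than in the runtime's C. All type and function names here (`ToyCodeInstance`, `BoxedWorld`, `ToyBoxedCodeInstance`, `valid_in`) are hypothetical stand-ins, and the comparison simply mirrors the snippet above:

```julia
# Stand-in for a CodeInstance; only the world bounds are modeled.
struct ToyCodeInstance
    min_world::UInt
    max_world::UInt
end

# Today: two integer comparisons on values that are already at hand.
valid_in(ci::ToyCodeInstance, world::UInt) =
    ci.min_world <= world && ci.max_world > world

# Boxed world token, as in the mutable-struct proposal.
mutable struct BoxedWorld
    const age::UInt
end

struct ToyBoxedCodeInstance
    min_world::BoxedWorld
    max_world::BoxedWorld
end

# With boxed worlds: the current world and both bounds must each be
# loaded through a pointer before the same comparison can happen.
valid_in(ci::ToyBoxedCodeInstance, world::BoxedWorld) =
    ci.min_world.age <= world.age && ci.max_world.age > world.age

plain = valid_in(ToyCodeInstance(5, 10), UInt(7))
boxed = valid_in(ToyBoxedCodeInstance(BoxedWorld(5), BoxedWorld(10)), BoxedWorld(7))
@show plain boxed
```

Both checks give the same answer; the second just pays for three extra loads each time it runs.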
Also the GC tracking is weirdly inverted. Right now we have
CodeInstance -> World
But what we really want is:
World -> CodeInstance
so that when World becomes disconnected in the object graph, all the information that is only valid in this world becomes disconnected as well. Add to that, we normally use WorldRange, which is a set of worlds… That makes the tracing part even weirder. Maybe we would need to ensure that Worlds are allocated sequentially in memory, so that we could use the pointer directly to perform range queries.
Anyway pure speculation at this point.
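Purely as an illustration of the World -> CodeInstance direction, here is a toy sketch in which the world token owns the code that is only valid in it. `WorldToken`, `register!`, and the symbols standing in for compiled code are all hypothetical, and nothing here reflects how the runtime is actually laid out:

```julia
# Hypothetical token: the world points at its code, not the other way around.
mutable struct WorldToken
    const age::UInt
    const code::Vector{Any}   # stand-ins for CodeInstances valid only in this world
end

WorldToken(age::Integer) = WorldToken(UInt(age), Any[])

# Attach a piece of "compiled code" to the world it is valid in.
register!(w::WorldToken, ci) = (push!(w.code, ci); w)

function demo()
    w = WorldToken(1)
    register!(w, :compiled_f)
    register!(w, :compiled_g)
    # While anything (a task, a stored token) references `w`, its code stays alive.
    # Once `w` becomes unreachable, everything in `w.code` is unreachable with it.
    return length(w.code)
end

n = demo()
@show n
```

The sketch ignores the WorldRange problem entirely: code that is valid across several worlds would need to be reachable from each of them.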