How's the progress in garbage-collecting unused functions?

When a compiled function in Julia is no longer used, it doesn't go away. Instead, it stays in memory. Normally, this is not a problem because you don't generate that many functions in the first place, and each function doesn't take much memory. However, in my (private) symbolic regression use case, I generated a lot of functions and learned the painful way that unused generated functions stay in memory. This poses a serious issue in semi-dynamic programs where code is generated and compiled (because it is executed enough times, or for long enough, to warrant compilation), but new code also keeps being generated over time, which causes a memory leak once the older functions are no longer used.

I was running my code for hours before this issue surfaced. This is certainly NOT a common issue one faces.
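
For readers who have not run into this, the pattern described above looks roughly like the sketch below. It is only an illustration, assuming candidates are built as Expr objects and compiled with eval. The name compile_candidate and the toy expressions are made up here and are not taken from the poster's code.

```julia
# Hypothetical helper: turn an expression body into a fresh anonymous function.
# Every call creates a new function type and a new method.
function compile_candidate(body::Expr)
    return eval(:(x -> $body))
end

# Toy stand-ins for generated candidate expressions: x + 1, x + 2, ...
candidates = [:(x + $i) for i in 1:100]

results = Float64[]
for body in candidates
    f = compile_candidate(body)
    # f was defined after this loop started running, so call it in the latest world.
    push!(results, Base.invokelatest(f, 1.0))
end
# After the loop no candidate function is reachable any more, but, as described
# above, the methods and their compiled code are not freed.
```

In a long-running search, a loop like this runs for hours with far more candidates, which is when the retained code starts to add up.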


There has essentially been no progress on that for a long time. It is a low-priority item.

At some point I made a sketch for how we could get there, but it comes with a semantic limitation.

Functions and methods are partitioned by world ages. So if we could prove that a world age has become unreachable, we could prune the methods that are only valid in it (everything else is just figuring out the mechanics of unloading code).

But world ages are just UInt right now, so people can store them and then use Base.invoke_in_world to jump across time.

One avenue might be to turn world age into an opaque GC tracked token.

Right now it doesn't seem worthwhile in terms of complexity and costs.
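
To make the point about stored world ages concrete, here is a small sketch of "jumping across time". It assumes it is run at top level (for example in the REPL or as a script), and greet is a made-up function name. Base.get_world_counter and Base.invoke_in_world are the existing Base functions.

```julia
greet() = "old"

# The current world age is an ordinary UInt that anyone can keep around.
saved_world = Base.get_world_counter()

# Redefining the method advances the world counter.
greet() = "new"

current = greet()                                 # uses the newest definition
past = Base.invoke_in_world(saved_world, greet)   # runs greet as of saved_world
```

Because saved_world is just a number, nothing tells the runtime that the old definition of greet might still be called, which is why proving a world unreachable is hard today.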


Turning world age into an opaque type is the right way to go. It’s only valid to use a world age that you got from a function that returns world ages.


Yeah, but it’s not free and will make comparison more expensive. So Jameson and I were thinking of ways of getting more features out of making them opaque that would be worth the performance loss.


How is it not free? Wouldn’t it simply be something like this?

struct WorldAge
    age::UInt
end

Or are you talking about the GC-tracked part? In that case, I guess it might be:

mutable struct WorldAge
    const age::UInt
end

We have a lot of places that are essentially:

world = current_task->world
if code_instance->min_world <= world &&
   code_instance->max_world > world

If you put it in a mutable struct, then it's three additional memory dereferences on a very hot part of the code. (We also need to allocate 16 bytes per world, but that's fine since they will need to be globally uniqued.)

Not insurmountable, and if we really need or want code GC we can do it, but it is also not something that is completely free.
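
The difference between the two proposed wrappers can be checked from Julia itself. The sketch below is only an illustration: PlainWorld, TrackedWorld and in_range are made-up names, and the real check lives in the C runtime, not in Julia code. The const field in a mutable struct needs Julia 1.8 or later.

```julia
# Immutable wrapper: a plain bits type, stored inline like the UInt it wraps.
struct PlainWorld
    age::UInt
end

# Mutable wrapper: a heap-allocated object with identity that the GC can track,
# but reading age means following a pointer first.
mutable struct TrackedWorld
    const age::UInt
end

plain_is_bits = isbitstype(PlainWorld)      # true
tracked_is_bits = isbitstype(TrackedWorld)  # false

# The hot-path range check from the post, written against either wrapper.
in_range(min_world::UInt, max_world::UInt, w) = min_world <= w.age && max_world > w.age

ok_plain = in_range(UInt(1), UInt(10), PlainWorld(5))
ok_tracked = in_range(UInt(1), UInt(10), TrackedWorld(5))
```

So the immutable wrapper costs nothing but cannot be traced by the GC, while the mutable one can be traced but adds the indirection discussed above.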

Also, the GC tracking is weirdly inverted. Right now we have:

CodeInstance -> World

But what we really want is:

World -> CodeInstance

so that when a World becomes disconnected in the object graph, all the information that is only valid in that world becomes disconnected as well. On top of that, we normally use WorldRange, which is a set of worlds, and that makes the tracing part even weirder. Maybe we would need to ensure that Worlds are allocated sequentially in memory, so that we could use the pointer directly to perform range queries.

Anyway, this is pure speculation at this point.
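
As a toy picture of the World -> CodeInstance direction, the sketch below lets each world object own the code that is valid in it. Everything here (ToyWorld, ToyCodeInstance, register!) is hypothetical and says nothing about how the runtime would actually do it. It also assumes that code valid over a range of worlds is simply referenced from every world in that range, which is one reading of why ranges make tracing awkward.

```julia
struct ToyCodeInstance
    name::Symbol
end

# A world that owns references to the code valid in it (Julia 1.8+ for const fields).
mutable struct ToyWorld
    const age::UInt
    const code::Vector{ToyCodeInstance}
end
ToyWorld(age::Integer) = ToyWorld(UInt(age), ToyCodeInstance[])

# Attach a code instance to every live world inside its validity range.
function register!(worlds::Vector{ToyWorld}, ci::ToyCodeInstance, valid::UnitRange{Int})
    for w in worlds
        if w.age in valid
            push!(w.code, ci)
        end
    end
    return ci
end

worlds = [ToyWorld(a) for a in 1:3]
register!(worlds, ToyCodeInstance(:f_v1), 1:2)   # valid in worlds 1 and 2
register!(worlds, ToyCodeInstance(:f_v2), 3:3)   # valid in world 3 only

# Dropping world 1 does not free f_v1, because world 2 still references it.
popfirst!(worlds)
```

Only once every world in a code instance's range is gone does that instance become unreachable, so the collector would have to reason about whole ranges rather than single worlds.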
