Compile Time Dictionary

Suppose I have a function that looks somewhat like the following:

foo(x::T) where {T} = begin
    v1 = lookup(T) # look up some value specified by type T
    # perform work with v1
    # return the result
end

My question is: can the lookup be compiled away? A given type T always maps to a specific value, so when specializing on T the compiler could just hardcode that value. If so, what is the proper way to do this?

I don’t think you can get significantly better than an IdDict.

If you know that you’ll only ever put in types or other compile-time constants, you can dispatch on them to return your specific constants. If they only depend on your types, though, you could just as well precompute them.

Do you have a specific usecase in mind?

If the lookup returns always the same object for the same type this will happen naturally. You just need to be aware of when Julia avoids specializing.

In my use case the type T is also a parametric type, which may itself be parameterized by other parametric types. Writing a dispatch method for all of them would be too tedious.

Yeah this is what my use case would demand. Is there a way to verify the specialization is happening?

Note that a lookup in some global dictionary or other global constant will in general not compile away: if the object is mutable (as a Dict is), there’s no guarantee that its elements don’t change. Using dispatch instead avoids that issue.
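To illustrate the difference, here is a minimal sketch (the `lookup`/`foo` names and the constant values are made up for the example):

```julia
# Dict-based lookup: LOOKUP is mutable, so the compiler cannot assume
# its entries are fixed and must perform a runtime hash lookup.
const LOOKUP = IdDict{Type,Int}(Int => 1, Float64 => 2)
lookup_dict(::Type{T}) where {T} = LOOKUP[T]

# Dispatch-based lookup: each method returns a literal constant, so a
# call like lookup(Int) can constant-fold away at compile time.
lookup(::Type{Int})     = 1
lookup(::Type{Float64}) = 2

foo(x::T) where {T} = x + lookup(T)

foo(41)   # == 42; lookup(Int) folds to the literal 1
foo(1.5)  # == 3.5
```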

Is something like this an option?

for T in (types...,)
    @eval lookup(::Type{$T}) = $(some_computation(T))
end

This will do the computation once for the T you’re interested in and define the lookup methods for you. For more details, I think I’d need more information about what your types look like.
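Concretely, a runnable sketch of that loop (the types and `some_computation` here are placeholder assumptions):

```julia
# `some_computation` stands in for whatever expensive per-type work
# you need; the integer types are just placeholders.
some_computation(T) = sizeof(T) * 10

for T in (Int8, Int16, Int32)
    # $T splices the concrete type into the definition, and
    # $(some_computation(T)) runs the computation once, now, baking
    # the result into the method body as a literal.
    @eval lookup(::Type{$T}) = $(some_computation(T))
end

lookup(Int16)  # == 20, computed at definition time, not at call time
```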


Hmm, that actually could be an option. So define the methods I need for dispatch dynamically? Coming from any other language that sounds like bad practice, but is this something more standard in Julia?

Metaprogramming should always be approached with care, because it is often harder to debug and to maintain. However, if the alternative is writing a lot of boilerplate code, metaprogramming can sometimes end up being the more maintainable option.

I think @code_native should be able to show you whether the value is inlined or not? Sorry, I do not go down to this level often.
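For example, `@code_typed` (or `@code_llvm`) from the InteractiveUtils stdlib shows the optimized IR. A hedged sketch, with illustrative `lookup`/`foo` definitions: if the lookup constant-folds, no call to `lookup` remains in the body.

```julia
using InteractiveUtils  # provides @code_typed, @code_llvm, @code_native

# Illustrative definitions: a constant-returning lookup and a caller.
lookup(::Type{Int}) = 1
foo(x::T) where {T} = x + lookup(T)

# If the lookup constant-folds, the typed IR is effectively `x + 1`
# with no remaining call to `lookup`.
code = @code_typed foo(41)
println(last(code))    # the inferred return type (Int here)
# @code_llvm foo(41)   # or inspect the LLVM IR directly
```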

Yes, in a sense. I don’t usually recommend doing this (avoiding eval and all that), except in cases like this where it will save you a lot of boilerplate. As long as you’ve defined those methods ahead of actually using them, it should be fine.

You may have to play a little with it to interpolate T into the expression as well, but other than that this should work just fine. Still, some more info about what the context surrounding your question is would be nice - if possible I’d like to avoid an XY-Problem.

Either way, there is currently no mechanism to guarantee that anything will be inlined or completely eliminated by the compiler at compile time. Usually, though, a single call is rather inexpensive, and if the called function is small it will probably be inlined anyway.

The use case is creating a DAG workflow. The DAG consists of nodes, an abstract type extended by many concrete mutable structs. The type of each node in the DAG is unique (the behavior of a node is entirely encoded by its type and its type parameters).

Each node contains a val, and the type of val is a type parameter for many of the nodes (it could also be fixed, in which case it is not). The nodes are mutable because in each iteration of an event loop the nodes’ val fields are updated according to a calc function that is dispatched based on the node’s type.

There are some nodes which are meta nodes; that is, their val is a description of the val of another node. For example, you may want to keep track of the max value ever taken by another node. This Max node is thus parameterized by the type of the node whose max it tracks (recall that the type of every node in the DAG is unique).

The calc method for the Max node will need to lookup the val of the node that the Max node is computed over. Rather than perform a lookup in a dictionary (I am very interested in extracting as much performance as possible, hence the heavy use of type parameterizations) I am looking to intern the memory location of the relevant dependency val right in a calc function that is specialized for this particular Max node.

I want to write only one calc method that handles all Max nodes (there may be more than one Max node, since you might want to track the max of multiple nodes) and that specializes on the exact parameterization of the Max type; that is, each Max node in the DAG should get its own specialized method without having to resort to dispatch. It should have direct access to its dependency without going through an intermediary.

In general, once the DAG is created and all nodes have been clocked by the event loop at least once, each node should have a fully specialized code path dedicated to its exact type that is entered when calc is called on it. There will be an Engine, parameterized by a tuple type consisting of the types of all nodes, that constructs the DAG. The Engine is what is actually responsible for calling calc on each node.

Based on your suggestion, I am thinking I can insert the creation of the lookup methods into the constructor of the Engine, or into the run method that takes in an Engine.

Is this how specialization is implemented in Julia? When I call a polymorphic function with a specific type does it just generate a new method that will be called by dispatch in the future when provided with the same type?

Methods and specializations are two different things. Each function body with a distinct signature (i.e., number/type of parameters) but the same name is a method. However, a method can be generic (if you did not specify each parameter as having a concrete type), and when it is called with some specific set of objects as parameters it will generate a specialization of that method.

The programmer in general does not interact with specializations directly; they are created on the fly when methods are called. So the hierarchy is function → method → specialization: each function can have multiple methods, and each method multiple specializations. If you have a function with a single method and that method has every parameter annotated with a concrete type, then there is only one function, one method, and one specialization.
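As a small illustration (the function `g` is hypothetical):

```julia
# One function, two methods; specializations are created on call.
g(x) = x            # generic method: g(x::Any)
g(x::Int) = x + 1   # second method, specific to Int

length(methods(g))  # == 2: two methods under the one function `g`

g(1.0)   # calling the generic method with a Float64 compiles a
         # specialization of g(x::Any) for Float64
g("hi")  # ...and another specialization of that same method for String
g(1)     # dispatches to the g(x::Int) method; == 2
```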


I understand that (perhaps I did not demonstrate it as concisely as you did)

calc is a function. The function will have a polymorphic method for each node type (ignoring the parameterization of the type). And then each method will have a specialization for each parameterization of said type.

The problem at hand is that I would like to intern the memory location of the val dependencies directly into the specialization which should be possible because the dependencies of a node are given in its type parameterization. As opposed to use a dictionary that all nodes share for lookup of their specific dependencies (lookup is not specialized in this case)

Note that in my use case calc will be called on each node on the scale of ten million times and there could be a 1000 nodes, most with 1-2 direct dependencies. Hence the desire to specialize as much as possible.

I really do not understand what you mean by interning the memory location of the val dependencies directly into the specialization. You mean to inline the memory location of a val? (And what exactly is this val? Do you mean an object of type Val, or can it be any value?)

Every node in the DAG has a field val. A node may have dependencies it needs to look up in order to perform its calc. However, the calc method could be specialized to skip that lookup for the dependencies’ val, since its dependencies are specified in its type parameterization, and the concrete type of each node in the DAG is unique.

For example I could have two max nodes m1 and m2. And two other nodes n1 and n2. The val of m1 should be the max value that the val of n1 ever took. Similarly for m2 and n2. m1 and m2 will have different types. The type of m1 will be parameterized by the type of n1, and the type of m2 will be parameterized by the type of n2 (although both are still Max nodes).

There is a calc method that operates on Max nodes. That is both m1 and m2 would be dispatched to this method. However it could be specialized since m1 and m2 track the max of different nodes which are in turn of different concrete types.
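If I understand the description correctly, a minimal sketch might look like this (all names here, Source, Max, calc!, are hypothetical stand-ins). Note that simply storing the dependency node in a field already gives each calc! specialization direct access to the dependency’s val, with no dictionary lookup:

```julia
abstract type Node end

mutable struct Source{V} <: Node
    val::V
end

# Max{V,N} is parameterized by the *type* of the node it tracks,
# so m1 and m2 below have distinct concrete types.
mutable struct Max{V,N<:Node} <: Node
    val::V
    dep::N   # direct reference to the tracked node: no lookup needed
end

# One calc! method covers every Max; Julia specializes it per concrete
# Max{V,N}, and m.dep.val is a direct field access.
calc!(m::Max) = (m.val = max(m.val, m.dep.val); m.val)

n1 = Source(3);   m1 = Max(typemin(Int), n1)
n2 = Source(2.5); m2 = Max(-Inf, n2)
calc!(m1)   # == 3
n1.val = 7
calc!(m1)   # == 7
calc!(m2)   # == 2.5
```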

Let’s take a step back. It sounds like you’re trying to lift all computation over your DAG into the type domain, which is probably not a good idea, if the number of unique types you have is as large as you say. The compiler/type machinery is not designed for general computation.

Based on this, I think you’ve misunderstood what a “method” and a “specialization” in julia is (and I think that’s why @Henrique_Becker wrote their explanation). In julia, a function is just a name. A method is a concrete implementation of that function for some (possibly parametrized) typed arguments. A specialization is a concrete compiled version of a specific method for some concrete types.

Never in this chain does anything have access to fields of any runtime objects - only when the code is actually run later on does the memory backing a given object exist in the first place.

No, that’s not what I was thinking of when I made that suggestion. In fact, I’d suggest not relying on type domain computations for your problem at all - they will become a bottleneck, due to having to compile a new method for each individual object. This will greatly increase compile time and may not even eliminate all dispatch like you hope to.

Like I said above, at the point where types exist, values don’t. The core difficulty of your problem is the DAG reduction (which sounds like a symbolic execution engine to me?) – you can’t compile that away.

In general, if you find yourself wanting to create (tens of) thousands of specializations in quick succession, you will very quickly run into the reality of having to compile those specializations. At that point, keeping your code a bit more generic and taking another approach is probably a good idea (though admittedly I’d have to think more about what approach would be fastest here and what sort of problems your exact code has - as a first guess, eating the dispatch cost and not parametrizing Max on its value will probably be beneficial).


The computation for the val of each node is kicked off by receiving a message. These messages are spaced out in time over the whole day; the critical aspect is to respond as quickly as possible. Biting the bullet on one compilation (even if it took minutes) would be fine for me, because I could just warm up the engine prior to receiving the actual messages.

Hence I think I would rather eat the compilation cost than the cost of not specializing everything that can be specialized, keeping the structure of the DAG in the type domain rather than as a runtime object.

Does that message potentially result in a new parametrized Max? If so, you will have to eat the specialization, i.e., new compilation cost. If not, why bother with types in the first place? You could just as well have a proper precomputed, constant lookup table, which will certainly be faster than dispatching.

The process of parsing your message and constructing the object to then pass to lookup sounds inherently dynamic and will be type unstable - resulting in dynamic dispatch yet again, no matter whether you’ve conditioned your engine to compile the methods ahead of time. You could have saved the dispatch time by using the message you get as a key in a precomputed Dict or similar directly, which is just a regular memoization technique.
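For instance, a hedged sketch of that memoization approach (the message keys and the `expensive`/`handle` names are made up):

```julia
# Precomputed lookup table keyed directly on the incoming message.
const RESPONSES = Dict{String,Int}()

expensive(msg) = length(msg) * 100   # placeholder for the real work

for msg in ("tick", "quote")
    RESPONSES[msg] = expensive(msg)  # pay the cost once, up front
end

# At message time: one O(1) hash lookup, no dynamic dispatch.
handle(msg) = RESPONSES[msg]

handle("tick")   # == 400
```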

Maybe I’m misunderstanding though.