But isn’t union splitting only possible when a specific small set of possible types is known a priori? I don’t see how that helps with larger projects that are supposed to be extensible. For me, the raytracer problem shows that we’re somehow missing functionality to deal with such scenarios. How do other languages deal with this problem? In Swift I’d have defined an enum with all necessary types, but of course that’s not dynamically extensible either.
In this other thread an alternative in Java was tested, but the conclusion (unexpected, according to the author) was that it was slower than any of the Julia alternatives.
In that thread about the ray tracer someone suggested that C++ has a better way to deal with this, but we never saw it actually implemented. I think this is a problem that deserves some attention, at least so that a definitive answer and some advice become available.
I think this is one of the cases where Julia's complexity is hidden behind a very intuitive usage design. Take what follows with a grain of salt, since I have never written a compiler in my life and may be very far from the real reason why your idea can't be implemented, but we can try to reason about it.
Assume that you have a function like
function f(x1, x2)
y1 = g1(x1, x2)
y2 = g2(x1, x2)
z = h(y1, y2)
return z
end
and you want to constrain its output with an annotation like f(x1, x2)::Float64, and you want the compiler to take this information into account and avoid unnecessary boxing.
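For reference, Julia does accept such an annotation directly on the function signature, but it acts as a convert (or assertion) on the returned value rather than as information that inference propagates backwards. A minimal sketch (the g1, g2, h bodies here are stand-ins I made up, not from the original post):

```julia
# Stand-in definitions just to make the sketch runnable.
g1(x1, x2) = x1 + x2
g2(x1, x2) = x1 * x2
h(y1, y2) = y1 / y2

# The `::Float64` converts whatever `h` returns to Float64 (or
# throws); it does not drive inference backwards into g1/g2.
function f(x1, x2)::Float64
    y1 = g1(x1, x2)
    y2 = g2(x1, x2)
    z = h(y1, y2)
    return z
end

f(1, 2)  # returns 1.5::Float64 even for Int inputs
```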
Now, since Julia supports multiple dispatch, you may have multiple versions of g1 and g2 depending on the types of their arguments, so you have to infer the argument types in order to choose the appropriate methods. When you move top-down (from the types of the arguments x1 and x2) this task is more or less straightforward: you recursively descend into g1 and g2 until you hit some basic functions whose return types you can infer, and by combining this information you can infer the return types of those functions. As a result you get the types of y1 and y2, so you know which method of h to choose, and finally you can emit the LLVM code (of course the procedure is far more complicated than that, but I think that is the main idea).
Now, what happens if you try to take into account the fact that the return type of f is Float64? You can infer that the type of z is Float64, because it is the return value, but what are the types of y1 and y2? You can't just go inside the function and infer types from the basic functions, because the cardinality of possible input types explodes very quickly. For example, what if h(y1, y2) = y1 + y2? Then y1 and y2 can have any types whose sum gives you a Float64. It could be Float64 and Float64, or Float64 and Float32, or Float64 and Int, or Float64 and MyAwesomeType if you define the + operation for those types.
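The method table of + gives a feel for that cardinality; a bare Julia session already carries a couple of hundred methods of +, before any packages add their own:

```julia
# Every one of these methods is a candidate when reasoning
# backwards from `y1 + y2 :: Float64`.
n = length(methods(+))
println(n)  # well over 100 in base Julia
```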
At this point you have a huge number of possible types for y1 and y2, and for each combination the compiler would have to infer the possible types of x1 and x2 in order to choose the appropriate methods of g1 and g2. This procedure would then have to be repeated inside g1 and g2, and you can see how the task quickly explodes out of control. Very soon you would just give up, say that everything is Any, and box all variables, because there is nothing you can say about their types, so you have to choose the most general one.
So I think backward type propagation breaks down very fast, and as such it has little value. If you can't infer types more than one or two steps backwards, what's the point of implementing this machinery at all? It's better to have forward type propagation and let the user annotate the type of the result, which is what happens in Julia now.
Maybe there are ways to solve this problem (other than annotating everything, which would return us to the land of static compilation), but it doesn't look like it would be simple, or at least it would be far more complicated than the current implementation of the compiler with its limitations.
You can use the trick here to make it faster: Snooping on and fixing invalidations: @snoopr · SnoopCompile. Gives you about a factor of 2. The key is to remember that a.x is itself a function call.
Could you please explain that a bit?
I am unable to see how that is possible without stating that get_value(c::ConcreteC) also returns Float64, which may be false.
You just need to add a method
function Base.getproperty(a::Abstract, field::Symbol)
field === :x && return getfield(a, :x)::Float64
error("unknown field ", field)
end
and then you don’t need the type-assert on get_value. getfield is a builtin, so there’s only one such method.
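Spelled out on a toy hierarchy (the type and field names here are illustrative, not from the original code), the trick looks like this:

```julia
abstract type Abstract end

struct ConcreteA <: Abstract
    x::Float64
end

struct ConcreteB <: Abstract
    x::Float64
end

# One getproperty method for the whole hierarchy; the type-assert
# tells the compiler that `a.x` is always a Float64, whatever the
# concrete subtype turns out to be.
function Base.getproperty(a::Abstract, field::Symbol)
    field === :x && return getfield(a, :x)::Float64
    error("unknown field ", field)
end

vals = Abstract[ConcreteA(1.0), ConcreteB(2.0)]
sum(v.x for v in vals)  # 3.0, with `v.x` inferred as Float64
```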
@jules, I know you said you don’t want to consider union-splitting, but do keep in mind you can accelerate your code for some known types and then have a slower path for any extensions. For example,
function test_fast(cs)
x = 0.0
for c in cs
val = isa(c, ConcreteA) ? c.x :
isa(c, ConcreteB) ? c.x :
isa(c, ConcreteC) ? c.x :
isa(c, ConcreteD) ? c.x :
isa(c, ConcreteE) ? c.x : c.x
x += val
end
x
end
doesn’t “lose” anything because that final c.x will handle any type outside the five listed. It is also by far the fastest option yet: if I increase the size of cs by another 100x, I get
julia> @time test(cs)
0.129554 seconds (5.00 M allocations: 76.294 MiB, 7.95% gc time)
5.0e6
julia> @time test_fast(cs)
0.017303 seconds (1 allocation: 16 bytes)
5.0e6
While that isa block might look useless (all options just call c.x), what you’ve done is provide multiple call sites, and the compiler can optimize each one of them differently.
Interesting, so maybe a feasible path to achieve this while keeping code extensible would be to use a macro as in Macro to write function with many conditionals - #8 by Skoffer, and then add a mechanism to reload all functions that depend on the subtypes output whenever a new subtype is added, effectively just overwriting the old version. I think that would give you optimal speed and low allocations, and would allow others to add new subtypes after the fact.
There’s https://github.com/jlapeyre/ManualDispatch.jl though I confess I haven’t played with it myself. I worry about the fact that the @unionsplit macro doesn’t seem to allow you to pass the name of the variable that you want to do all the isa checks on. The solution by @Skoffer you linked looks better.
But I’d strongly urge you to think twice (or three times) before doing that “redefine your methods” trick automatically. You’ll get nasty world age issues, etc. If you can just pick a set of types and live with it, that would be far better.
Not automatically; I thought about just calling some function like reinitialize after adding a new type. This is just me playing around with ray tracing and trying to squeeze out performance as a learning exercise. Sometimes one can hit quite complex issues that way which reveal interesting properties of the language and come in handy later.
If you trigger redefinitions manually, that’s a bit better, because that puts the user in control of the updates. By returning to the REPL after the update and before running any code, you also give the world-age a chance to update, so you shouldn’t get mysterious errors.
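A minimal sketch of that manual reinitialize idea (all names hypothetical): regenerate a union-split getter from the current subtypes list via @eval, then call it from a fresh top-level expression so the world age has updated.

```julia
using InteractiveUtils: subtypes

abstract type AbstractMaterial end
struct Matte  <: AbstractMaterial; x::Float64; end
struct Glossy <: AbstractMaterial; x::Float64; end

# Rebuild `get_x` as one chain of isa branches over whatever
# subtypes of AbstractMaterial currently exist.
function reinitialize()
    branches = :(error("unknown material type"))
    for T in subtypes(AbstractMaterial)
        branches = :(m isa $T ? getfield(m, :x) : $branches)
    end
    @eval get_x(m::AbstractMaterial) = $branches
end

reinitialize()
# Called from a later top-level expression, so no world-age error.
get_x(Matte(1.5))  # 1.5
```

After defining a new subtype, calling reinitialize() again overwrites get_x with a chain that includes it.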
Thank you @tim.holy !
The problem was that in the OP different fields were used in the different get_value methods, and in that case I still cannot see how replacing get_value with getproperty can help.
Not sure if relevant, but when splitting manually, like Tim did, it is possible to inject the splitting rules from the call site:
function sum_concretes_withgetter(cs; getfn = get_value)
x = 0.0
for c in cs
val = getfn(c)
x += val
end
x
end
function concrete_getter(c)
if c isa ConcreteA
return get_value(c)
elseif c isa ConcreteB
return get_value(c)
end
error("Got unknown type in concrete_getter")
end
julia> @btime sum_concretes_withgetter(concretes_ab; getfn = concrete_getter)
34.400 μs (1 allocation: 16 bytes)
This way it may also be possible to dynamically generate the getter just before the call, if the types are not known statically at the call site.
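One way to sketch that dynamic generation (names hypothetical, not from the thread): fold a tuple of known types into nested isa branches, producing a getter closure just before the call.

```julia
abstract type AbstractVal end
struct ValA <: AbstractVal; v::Float64; end
struct ValB <: AbstractVal; v::Float64; end

get_value(c::AbstractVal) = getfield(c, :v)

# Build a union-splitting getter for a tuple of known types by
# folding them into nested `isa` branches; the empty tuple is the
# "unknown type" fallback.
make_getter(::Tuple{}) = c -> error("unknown type")
make_getter(ts::Tuple) =
    let T = ts[1], rest = make_getter(Base.tail(ts))
        c -> c isa T ? get_value(c) : rest(c)
    end

function sum_with(cs, getfn)
    x = 0.0
    for c in cs
        x += getfn(c)
    end
    x
end

getter = make_getter((ValA, ValB))
sum_with(AbstractVal[ValA(1.0), ValB(2.0)], getter)  # 3.0
```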
Right, it wouldn’t; I was responding to the updated example, where the field names are consistent. You could make getproperty access different fields for different structs, but at a certain point that gets more confusing than helpful.
The SnoopCompile docs I linked to address a fairly common situation in which subtypes of an abstract type are expected to have some fields in common so that they can provide a common interface. In Julia 1.6 we’ve improved the inferrability of code quite a bit by using that trick for several IO, LibGit2, and other abstract types.
With the @unionsplit macro from the other thread I could eliminate all the allocations stemming from passing different Materials and Shapes to my raytracing functions. Pretty nice.