Letting a package use external user-created functions

The performance problem comes when a loop like

for p1 in v1
  for p2 in v2
     compute(p1,p2)
  end
end

Has to call different methods of compute at each iteration. That will happen if the type of p1 or the type of p2 change from iteration to iteration. That should be avoided.

I know @lmiq answered, but I find a few details are missing.

I agree with @lmiq. Yes this is the way to go.

Simpler function signature

If you do this, you don’t need to explicitly tell Julia that the vectors might have different element types in your function signature. Julia will automatically generate specialized versions of compute_interaction for those vectors, and will probably inline your functions! If not, well, you end up paying the cost of a function call - but that’s not incredibly expensive:

function compute_interaction(particles_i::Vector, particles_j::Vector)
    for p_i in particles_i, p_j in particles_j
        p_i.force += compute_force(p_i, p_j)
    end
end

Though I question the practicality of having a force accumulator in the Particle object itself because it might get messy to track.

EDIT: Maybe I should withdraw this previous statement. There is something somewhat natural about this representation. It might actually work out, if done well.

I will point out that I highly suggest you at least specify that particles_i::Vector & particles_i::Vector are Vector-s - unlike what @lmiq suggested:

This definition would try to trap all two-argument calls to compute_interaction (but your implementation clearly expects Vector-s).

More reading: Function Barriers

I think this is what Julia developers call function barriers. I have read discussions about them, but I can’t find any of them at this time.

There is a mention of function barriers here:
→ Performance Tips · The Julia Language

But they are examples of explicit barriers, as opposed to being implicitly created through function calls.

If someone reading this knows where a proper discussion of this design pattern is found, please include it in this thread.

I will almost agree here, except that you probably want AbstractVector, because then you accept views. (edit: and static arrays)

1 Like

Not exactly. What matters is whether or not Julia can tell what’s in each cell of those vectors. This is when I try to pretend I am the Julia compiler (or whatever we call the component that deals with this logic):

When Julia can’t anticipate

If particles_i is a Vector{AbstractParticle}, and particles_j is a Vector{ConcreteParticle1} then, for each loop iteration:

  • Juila doesn’t know what type of particle will be in particles_i[i].
  • Even if Juila does know that particles_j[j] is of type ConcreteParticle1.

…So for each iteration, Julia has to do something like:

if isa(particles_i[i], ConcreteParticle1)
    compute_force(particles_i[i]::ConcreteParticle1, particles_j[j]::ConcreteParticle1)
elseif isa(particles_i[i], ConcreteParticle2)
    compute_force(particles_i[i]::ConcreteParticle2, particles_j[j]::ConcreteParticle1)
elseif isa(particles_i[i], ConcreteParticle3)
    ...
end

And this is where you loose performance. Compute pipelines are broken when the compiler can’t predict what data it will encounter (or calls it will need to perform).

When Julia CAN anticipate

On the other hand, if Julia KNOWS that particles_i is a Vector{ConcreteParticle2}, and particles_j is a Vector{ConcreteParticle1}, then it knows exactly which version of the code to execute:

  • Juila does know that particles_i[i] is of type ConcreteParticle2.
  • AND Juila does know that particles_j[j] is of type ConcreteParticle1.

so it just needs to always run the following:

compute_force(particles_i[i]::ConcreteParticle2, particles_j[j]::ConcreteParticle1)

(for this particular call to compute_interaction with known, concrete vector types)

No more if statements! No more broken pipelines!

1 Like

Very true. I have a mental block with using AbstractVector, partly because I avoid using it. I think the reason I avoid AbstractVector so much is that it keeps me from writing:

DoSomething(::Vector) = ... #Default behaviour
DoSomething(::Vector{ConcreteType1}) = ...
DoSomething(::Vector{ConcreteType2}) = ...

And forces me to ALWAYS use the more verbose where notation:

DoSomething(::AbstractVector{T}) where T<: AbstractType = ... #Default behaviour
DoSomething(::AbstractVector{T}) where T<: ConcreteType1} = ...
DoSomething(::AbstractVector{T}) where T<: ConcreteType1} = ...

Which I find makes the method definitions harder to read.

…But that doesn’t mean I am right in doing so.

1 Like

I am totally with you. I don’t like that either, and in functions that I had to accept views I finally ended many times removing completely the annotations.

Just out of curiosity @Michael_Wang, did you know about NBodySimulator.jl?:
https://github.com/SciML/NBodySimulator.jl

I just stumbled onto this recently. I don’t fully appreciate why this is under the “machine learning” umbrella, but this looks somewhat similar to what you are describing.