Static vector of functions as a field of a composite type

I’m trying to comprehend the details of the type system and its effect on performance, and would appreciate if you can help me understand how it applies to my current case, thank you!

Question:
Would it be a bad idea to use a SVector of functions in the field of a composite type, e.g:

struct DifferentiableProblem{N, F <: Function}
    target_function::F
    derivatives::SVector{N, F}
end

I have two confusions:

  1. Since each function is its own concrete type with associated methods, derivatives field must obtain a concrete type as defined above. But I’m guessing a concrete function object doesn’t have a fixed size, so having a SVector of them is probably not a good idea?
  2. I know that high-order functions are performant in Julia, and it’s usually not a good idea to store functions in the structs. But in my case these functions are not “methods” to be applied to the data, but they are the part of the data. So that’s why I thought of storing them in a struct, but perhaps there are better approaches. Please see below for details.

Details of the case:
I’m developing a package where the user defines a problem by entering a (non-linear) function f \colon \mathbb{R}^{n+k} \to \mathbb{R}^n, where \mathbb{R}^k can be thought of as the parameter space. Then for each point p in (a grid of) a bounded subset of the parameter space, the program makes some computations, which uses the function f_p \colon \mathbb{R}^n \to \mathbb{R}^n given by f_p(x) = f(x,p), and the main diagonal of its Jacobian. Since for each problem the form of the function f and the relevant entries of its Jacobian is fixed, I thought it might make sense computationally to generate and store the relevant derivative functions once at the beginning when the user defines the problem.

Addendum:
For constructing the derivatives, I am currently doing symbolic calculations using build_function of Symbolics.jl for 2 reasons: (1) JuliaInterval methods are utilized in the computations and I had hard time to mix them well with AD packages, since I guess both use propagation, (2) I thought it would be computationally cheaper to generate the derivative functions symbolically instead of calculating them via AD each time. But maybe I am also confused a bit here.

An SVector containing different types (every function has its own unique type) will either have an element type that is a union (eg Union{typeof(sin),typeof(cos)}) or abstract (eg Function). Neither of these is going to be great for performance.

For this purpose, I would instead use a plain Tuple, since a Tuple can hold a different type in every position. In other words, I would suggest something more like

struct DifferentiableProblem{F,T<:Tuple}
    target_function::F
    derivatives::T
end

If this particular part of the code isn’t super performance sensitive, you might do fine with dynamic dispatch. In that case you can just leave these fields weakly typed or untyped:

struct DifferentiableProblem
    target_function::Any # non-concrete type
    derivatives::Tuple # non-concrete type
end

Depending on what you’re doing, there may or may not be a big performance cost to this. Concrete typing is important in performance-critical code segments, but for higher-level pieces sometimes unstable types can work just fine (and save a lot of compiler effort). Function barriers can do great work.

3 Likes

Allow me to add that a tuple is interesting for low-dimensional collections, if you have more than a handful of functions I don’t think it’s gonna save you.

There were several Discourse topics on storing functions in arrays recently, but I can’t seem to find them. It would definitely be worth checking out.

As for the computational cost of symbolic vs algorithmic vs numerical differentiation, it depends very much on the problem at hand. There are simple cases where a symbolic formula for the derivative will have exponentially many terms, whereas algorithmic differentiation will avoid duplicate computation by reinterpreting a simple piece of code (not unlike the recursive version of Fibonacci vs a for loop). Can you tell us more about your use case?

Thank you @mikmoore and @gdalle for your great answers!

Regarding storing functions: The collections are indeed low-dimensional, e.g. <10. The reason I explored using SVectors is to be able to access the number derivative functions for compiler to reason about when iterating over, as discussed in this post, but I guess one can simply use bare Tuples as suggested in this post directly without the extra Static Array structure on top. I guess another way would be to use FunctionWrappers.jl for the target_function, and use the type of the output to inform compiler about the length of the derivatives Tuple (since if the codomain of the target_function is \mathbb{R}^n, then the number of derivatives, i.e. the length of

Regarding Symbolic D vs AD: I think the package would benefit from AD, since the target_functions are very nonlinear, and in most cases would lead to “expression swell” as you mentioned. As an example of a target_function (for some fixed parameters) h \colon \mathbb{R}^n \to \mathbb{R}^n:

h(y)=(h_1(y),...,h_n(y)), such that
       h_i(y) = \int_X f_i(x,y)g(x) dx, where
            f_i(x,y)=\frac{e^{1 - \lVert x- y_i \rVert}}{1+\sum_j{e^{1 - \lVert x- y_j \rVert}}} and g(x) is some pdf.
             and the derivatives of interest are \frac{\partial^k h_i}{\partial y_i^k} for k \in {1,2}.

The use case: The practice is a multi-objective optimization: finding local Nash equilibria in a game theoretic setting. So the aim is to find the roots of \tilde{h}(y)=(\frac{\partial h_1}{\partial y_1},...,\frac{\partial h_n}{\partial y_n}) such that \frac{\partial^2 h_i}{\partial y_i^2}<0 \, \forall i (ignoring other cases for now). One approach that I have been trying is to use interval arithmetics (à la JuliaIntervals by @dpsanders et al.), especially using their IntervalConstraintProgramming.jl(*), however the current version seems to be not compatible with Dual type propagation needed to couple it with something like ForwardDiff.jl, since it internally converts constraints to ModelingToolkit.Operation types for which the Dual is not defined, and I couldn’t figure out how to make it work yet. (I guess I should open a new topic for this since it’s a different issue)
Long story short, that’s why I reverted to using symbolic differentiation using Symbolics.jl.

(*) The reason that I try to implement IntervalConstraintProgramming is that each y_i might be allowed to be vector-valued, e.g. points in 2D space, and thus the function for which the roots to be found becomes \tilde{h} \colon \mathbb{R}^{2n} \to \mathbb{R}^n, disallowing methods like Newton due to non-invertable Jacobian (as far as I understood).

1 Like

Have you tried reverse-mode automatic differentiation?