Unitful and automatic differentiation?

Automatic differentiation (at least with ForwardDiff) does not support Unitful quantities:

julia> using Unitful, ForwardDiff

julia> f(x) = x^2
f (generic function with 1 method)

julia> x = 2.0u"nm"
2.0 nm

julia> ForwardDiff.derivative(f,x)
ERROR: MethodError: no method matching derivative(::typeof(f), ::Quantity{Float64, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}})

I found this implementation: UnitfulDual.jl - Combining dual numbers with physical units (@AleMorales - thanks). But it is not registered.

Does any other solution exist, or, @AleMorales, is that package ready for prime time :slight_smile: ?

Related to this question, I was somewhat surprised that automatic differentiation does work with Measurements, which I thought shared a similar type-propagation logic. The ForwardDiff documentation lists this limitation:


  • The target function must be written generically enough to accept numbers of type T<:Real as input (or arrays of these numbers). The function doesn’t require a specific type signature, as long as the type signature is generic enough to avoid breaking this rule. This also means that any storage assigned within the function must be generic as well (see this comment for an example).

But Unitful quantities are a subtype of Number, not Real:
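This is easy to check directly (`Unitful` is the real package; just a quick sketch):

```julia
using Unitful

x = 2.0u"nm"
x isa Number   # true
x isa Real     # false — so ForwardDiff's T<:Real constraint excludes it
```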


Uhm… simply changing that subtyping solves the problem:

julia> f(x) = x^2
f (generic function with 1 method)

julia> ForwardDiff.derivative(f,2.0)
4.0
julia> ForwardDiff.derivative(f,2.0u"nm")
4.0 nm

Of course, units can be attached to integers, etc., while derivatives cannot, so the subtyping is not quite right either. But it seems like something that deserves some thought?

Maybe ForwardDiff should be able to work on more general types and error somewhere else down the line?

I’ve always had problems with ForwardDiff and custom types, Measurements more specifically. It works with simple functions, like your f above:

julia> f(x) = x ^ 2
f (generic function with 1 method)

julia> ForwardDiff.derivative(f, 5 ± 0.2)
10.0 ± 0.4

but as soon as you do something more complicated, it gets lost

julia> g(x) = x ^ (2 ± 0.1) # the function itself involves a `Measurement`
g (generic function with 1 method)

julia> ForwardDiff.derivative(g, 5 ± 0.2)
ERROR: MethodError: ^(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(g), Measurement{Float64}}, Measurement{Float64}, 1}, ::Measurement{Float64}) is ambiguous. Candidates:
  ^(a::Real, b::Measurement) in Measurements at /home/mose/.julia/packages/Measurements/4OZKq/src/math.jl:303
  ^(x::ForwardDiff.Dual{Tx, V, N} where {V, N}, y::AbstractFloat) where Tx in ForwardDiff at /home/mose/.julia/packages/ForwardDiff/aBsxl/src/dual.jl:144
  ^(x::ForwardDiff.Dual{Tx, V, N} where {V, N}, y::Real) where Tx in ForwardDiff at /home/mose/.julia/packages/ForwardDiff/aBsxl/src/dual.jl:144
Possible fix, define
  ^(::ForwardDiff.Dual{Tx, V, N} where {V, N}, ::Measurement) where Tx
 [1] g(x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(g), Measurement{Float64}}, Measurement{Float64}, 1})
   @ Main ./REPL[23]:1
 [2] derivative(f::typeof(g), x::Measurement{Float64})
   @ ForwardDiff ~/.julia/packages/ForwardDiff/aBsxl/src/derivative.jl:14
 [3] top-level scope
   @ REPL[24]:1

You mean specifically with ForwardDiff, or is this general for automatic differentiation packages?

ForwardDiff specifically, because of its limitations. I had some more luck with AutoGrad.jl, but I tried that a few years ago now.


A related thing. When working to make my code type-generic, one issue is that the user can provide as input a vector of vectors (a Vector{SVector{3,Float64}}, for example - these are 3D particles), or a (3,N) matrix.

To accept both types of inputs, internally I reinterpret the matrices:

julia> using StaticArrays

julia> x = rand(3,10);

julia> x_re = reinterpret(reshape, SVector{3,eltype(x)}, x)
10-element reinterpret(reshape, SVector{3, Float64}, ::Matrix{Float64}) with eltype SVector{3, Float64}:
 [0.5742818048521419, 0.32234266246581833, 0.6093222017438875]
 [0.866691366605002, 0.1318015340128984, 0.16367154831685848]
 [0.6015181769893767, 0.5517370492686737, 0.678539023285754]
 [0.7542841101508817, 0.3016408997455371, 0.6186488361258082]
 [0.6023197313041428, 0.40427222266515384, 0.9545413271708354]
 [0.5863109575920729, 0.09176860424885502, 0.12026428196223216]
 [0.7147765247840423, 0.49433535648543603, 0.3328376337636578]
 [0.7681148704397425, 0.32334972226471614, 0.22885420042279425]
 [0.6759855817145615, 0.8790234427839108, 0.11503876383911571]
 [0.9814601419098894, 0.47802148780118725, 0.4606405182254045]

This works for eltype(x) <: Real, thus it works for ForwardDiff. It also works as-is for Unitful quantities.
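Checking that claim (real Unitful/StaticArrays API): a Unitful quantity wraps a plain float, with the units carried in the type, so it is a bits type and the same reinterpret works unchanged:

```julia
using StaticArrays, Unitful

xq = rand(3, 10) .* u"nm"     # Matrix of concrete Quantity{Float64,…}
isbitstype(eltype(xq))        # true — units live in the type, not the value
x_re = reinterpret(reshape, SVector{3,eltype(xq)}, xq)  # zero-copy view
```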

But it does not work for measurements:

julia> using Measurements

julia> x = [ measurement(rand(),rand()) for i in 1:3, j in 1:10 ]
3×10 Matrix{Measurement{Float64}}:
   0.7±0.99   0.541±0.054   0.9±0.18  0.089±0.6   …  0.83±0.98  0.065±0.2    0.15±0.2    0.007±0.74
 0.623±0.078   0.21±0.44   0.55±0.64   0.74±0.46     0.76±0.69   0.78±0.87   0.96±0.66   0.089±0.094
   0.9±0.7     0.98±0.89   0.92±1.0    0.76±0.71     0.13±0.78   0.65±0.82  0.023±0.067    0.6±0.29

julia> x_re = reinterpret(reshape, SVector{3,eltype(x)}, x)
ERROR: ArgumentError: cannot reinterpret `Measurement{Float64}` as `SVector{3, Measurement{Float64}}`, type `SVector{3, Measurement{Float64}}` is not a bits type

Is there any good way to support all these types? Or maybe I have to check if the type is bits, and if not just copy the data to a vector of non-static vectors, and allow the code to (slowly) operate on these?
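One possible workaround along the lines of that last option (a sketch, not from any package; `as_vectors` is a hypothetical helper name): take the fast zero-copy reinterpret branch when the static vector is a bits type, and fall back to copying otherwise. Note the fallback can still produce SVectors — they just live behind references, so only `reinterpret` requires isbits:

```julia
using StaticArrays

# Wrap a 3×N matrix as a length-N vector of 3-vectors.
function as_vectors(x::AbstractMatrix{T}) where {T}
    if isbitstype(SVector{3,T})
        # Zero-copy reinterpreted view (works for Float64, Dual, Unitful…)
        return reinterpret(reshape, SVector{3,T}, x)
    else
        # Non-bits element types (e.g. Measurement): copy column by column
        return [SVector{3,T}(col) for col in eachcol(x)]
    end
end
```

For Measurement{Float64} the isbitstype check fails (each Measurement carries references to the variables it depends on), so the copying branch is taken, at the cost of an allocation per call.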

This seems to be exactly the problem promote solves. Could both packages do that instead?

@less ^(1//2, big"0.75")  # ^(x::Number, y::Number) = ^(promote(x,y)...)

Maybe they don’t for reasons of efficiency: multiplying Dual(4.0, (1,2,3)) by 5.0 is 4 operations, but if you first promote to Dual(5.0, (0,0,0)) then it’s 10, I think. Perhaps something like Dual(5.0, ()) could work?

Do what exactly?

I mean don’t define any methods like ^(x::Dual, y::Real) and ^(x::Real, y::Dual), define only ^(x::Dual, y::Dual). Instead add methods to promote_rule.

I think this is how Base deals with its variety of number types. For N different real number types, you’d need N^2 methods of ^ to unambiguously handle any pair. But instead you can have N methods, only for matching types, and promote enlarges the arguments until they match.
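The pattern being described might look like this, with a toy MyDual type (hypothetical, not ForwardDiff's actual implementation): define arithmetic only for matching types, and let Base's fallback `*(x::Number, y::Number) = *(promote(x, y)...)` lift mixed arguments:

```julia
# Toy dual number: val + partial·ε
struct MyDual{T<:Number} <: Number
    val::T
    partial::T
end

# Promotion: a plain real number becomes a dual with zero partial.
Base.promote_rule(::Type{MyDual{T}}, ::Type{S}) where {T<:Number,S<:Real} =
    MyDual{promote_type(T, S)}
Base.convert(::Type{MyDual{T}}, x::Real) where {T} = MyDual{T}(x, zero(T))
Base.convert(::Type{MyDual{T}}, d::MyDual) where {T} =
    MyDual{T}(d.val, d.partial)

# A single multiplication method, only for matching types (product rule).
Base.:*(a::MyDual{T}, b::MyDual{T}) where {T} =
    MyDual{T}(a.val * b.val, a.val * b.partial + a.partial * b.val)

d = MyDual(4.0, 1.0) * 5   # 5 promoted to MyDual(5.0, 0.0) by Base's fallback
# d.val == 20.0, d.partial == 5.0
```

With this design there is no `*(::MyDual, ::Int)` method at all, so no N^2 ambiguity surface; but as noted below, promotion still has to decide which type ends up on the outside.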

You still need to somehow decide which one will be the outer type, i.e. will the result be a Dual(2 ± 0.2) or a Dual(2) ± Dual(0.2).

The fundamental issue that I found with ForwardDiff (and why I created UnitfulDual) is that in a scientific model each input may have different units of measure (or different physical dimensions), and Unitful defines a different type for each unit (to avoid solving the problem at “runtime”). Hence, the partials in the dual number would have to be a heterogeneous tuple, and this is not allowed by ForwardDiff (this is the real problem IMO, rather than subtyping from Real, which could probably be solved by redefining the types in Unitful).

The reason why I never registered UnitfulDual is that I am not sure the approach in UnitfulDual is the right approach. Having different types for different inputs sounds great when you want to automate the unit analysis without any impact on simulation performance or taking your time to go through all the equations. However the moment that you need to compute a Jacobian matrix, a gradient or just run an optimization or ODE solver on your model, then it becomes a problem, because all these algorithms assume the inputs and outputs are arrays or array-like structures, which means all elements need to have the same type. So at that point you will have to extract the numeric values from the quantities, make sure that they are all in the same scale, etc., so you end up doing all the work that you wanted to avoid in the first place.
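That manual unit-stripping step can be sketched like this, assuming everything should end up in SI base units (`ustrip` and `upreferred` are real Unitful functions):

```julia
using Unitful

x0 = 1.5u"km"      # position
v0 = 3.0u"m/s"     # velocity

# Convert to preferred (SI base) units, then strip the units, so a
# solver or optimizer can work on a homogeneous Vector{Float64}:
u0 = [ustrip(upreferred(x0)), ustrip(upreferred(v0))]   # [1500.0, 3.0]
```

So the unit bookkeeping that the type system was supposed to automate reappears at the boundary with every array-based tool.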

So I think we need to rethink whether it is really useful to have physical dimensions and units of measure resolved via type inference, considering all the fancy things we want to do with our models (at least in my case it turned out not to be so useful). I have some ideas on how to do it based on what I have been doing “manually” in the past, but I never found the time to implement them (I would only have time to do this on the job, and my projects have been focusing on more data-related topics lately; I shall come back to scientific modelling soon-ish).

In the meantime, feel free to use UnitfulDual; as far as I have tested it, it works OK, and I am just reusing the code that Unitful and ForwardDiff rely on anyway. But be aware that I am not, and probably will not be, actively working on that package.

If people are interested I could open a thread and write down my ideas for an alternative approach to unit analysis in models, and we can discuss there whether it makes sense and the best way forward (I am just not used to open-source development, so I feel a bit shy/insecure about the best way to do this…).


This really is not my area but I am curious. It seems like the problem is that the algorithms assume their arguments are in arrays of one type. Why not generalize the algorithms to support more complex data structures instead of reducing the data structures to fit the algorithms?


One of the underlying issues is that things like matrix multiplications on homogeneous arrays are highly optimized (BLAS, LAPACK). With heterogeneous arrays one ends up with a generic fallback - which is going to be comparatively slow.

For reference, I am linking to two related ForwardDiff.jl issues/PRs:
Relax requirement that all partials have to be of the same type · Issue #546 · JuliaDiff/ForwardDiff.jl (github.com)

[Breaking] Allow value and partials to have distinct types by timholy · Pull Request #463 · JuliaDiff/ForwardDiff.jl (github.com)

(I was typing this in parallel to @vettert, so I edited it a bit after his answer popped up)

There is a fundamental contradiction between tools for scientific computing (based on array-like structures where all elements have the same type, usually a floating-point number type) and the idea of Unitful (and similar packages in other languages, like Boost units in C++) of encoding units in the type information. So this goes beyond automatic differentiation; it affects the whole stack of tools you may want to use.

The question is how to still have dimensional analysis and automated unit conversion in our models without paying a price in simulation performance, while still benefiting from all the tools for scientific computing.


Yes. And coupled with the fact that even a super simple physical model (for example: dx/dt = v, dv/dt = a, with x position, v velocity and a acceleration) will involve different units in it, it’s a real conundrum and would probably require a fundamentally different approach than a straightforward combination of Unitful.jl and ForwardDiff.jl.
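To make that concrete (real Unitful types; just a minimal sketch): even in this two-equation model the state cannot be a concretely typed array, because x and v have different units and therefore different types:

```julia
using Unitful

x = 1.0u"m"        # position
v = 2.0u"m/s"      # velocity

typeof(x) == typeof(v)      # false: different units ⇒ different types
state = [x, v]              # collecting them promotes to an abstract
isconcretetype(eltype(state))  # false — eltype, defeating fast array code
```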

I know that @ChrisRackauckas has previously mentioned on discourse (can’t find the post now) that he also had some ideas/a perspective on this.

Maybe it’s worthwhile spelling it out and having a community discussion on it. After all, physical models are pretty widespread and getting rid of worrying about units while still staying performant would be really desirable in a lot of applications.


I thought Julia’s matrix multiplication with @Elrod’s LoopVectorization.jl-based tools like @Mason’s Gaius.jl was competitive with those optimized libraries. Could heterogeneous matrix operations also be fast (perhaps by representing them as homogeneous blocks under the hood), and thereby enable fast unitful algorithms and solvers, by staying in Julia rather than diving into C/C++/Fortran? Or is it too much of a lift to replace all of LAPACK in type-generic Julia?


Yep, dimensional analysis only needs to be done once per model, so ModelingToolkit could do it (and indeed I also remember seeing a reference to this, maybe the last JuliaCon…). If you then assume the inputs will be in base units (m instead of km or mm) and in the same system (so all SI), then I think you do not need to do anything at runtime. My only issue would be that some models need units (e.g. agent-based models in ecology, complex simulation engines) but (I think) they cannot be built with ModelingToolkit, so an alternative would be needed.
