FlexUnits.jl v0.3.0 is a major new release that bridges the gap between Unitful.jl and DynamicQuantities.jl. As many are aware, Unitful.jl exhibits superior performance when units can be inferred at compile time, but struggles when the compiler cannot infer units inside a function, forcing it to perform expensive run-time dispatch. Another package, DynamicQuantities.jl, takes an alternative approach in which all dimensions are represented by a single type. Dimension operations must then be performed at run time, but this is much cheaper than run-time dispatch, making it a safer option overall.
FlexUnits.jl bridges the gap between these two packages. With an API that closely resembles Unitful.jl, FlexUnits.jl makes units and dimensions static by default, encoding the dimensions as a type parameter. However, promotion rules convert quantities with static dimensions to ones with dynamic dimensions whenever multiple values must be represented by a single type (for example, when collecting quantities with different units into a vector or dictionary). This retains the high-performance behaviour of Unitful.jl when units are known at compile time, while falling back to the performance of DynamicQuantities.jl when they can't be inferred. In the first set of benchmarks, we see that FlexUnits.jl and DynamicQuantities.jl vastly outperform Unitful.jl (by more than 100x) when units cannot be inferred.
using FlexUnits
using .UnitRegistry
import DynamicQuantities
import Unitful
using BenchmarkTools
v1uni = [1.0*Unitful.u"m/s", 1.0*Unitful.u"J/kg", 1.0*Unitful.u"A/V"]
v1dyn = [1.0*DynamicQuantities.u"m/s", 1.0*DynamicQuantities.u"J/kg", 1.0*DynamicQuantities.u"A/V"]
v1flex = [1.0u"m/s", 1.0u"J/kg", 1.0u"A/V"]
@btime sum(x->x^0.0, $v1uni)
7.575 μs (86 allocations: 3.92 KiB)
@btime sum(x->x^0.0, $v1dyn)
41.667 ns (0 allocations: 0 bytes)
@btime sum(x->x^0.0, $v1flex)
27.209 ns (0 allocations: 0 bytes)
In the second example, we see that FlexUnits.jl and Unitful.jl outperform DynamicQuantities.jl when units can be inferred by the compiler.
t1uni = [1.0*Unitful.u"m/s", 1.0*Unitful.u"m/s", 1.0*Unitful.u"m/s"]
t1dyn = [1.0*DynamicQuantities.u"m/s", 1.0*DynamicQuantities.u"m/s", 1.0*DynamicQuantities.u"m/s"]
t1flex = [1.0u"m/s", 1.0u"m/s", 1.0u"m/s"]
@btime sum(x->x*x, $t1uni)
2.900 ns (0 allocations: 0 bytes)
@btime sum(x->x*x, $t1dyn)
7.800 ns (0 allocations: 0 bytes)
@btime sum(x->x*x, $t1flex)
2.900 ns (0 allocations: 0 bytes)
While this performance boost over DynamicQuantities.jl isn't as dramatic as the previous boost over Unitful.jl, it is still significant. However, the benefits don't stop here.
One major difference from Unitful.jl is that FlexUnits.jl always converts to base units before performing a calculation. This prevents over-specialization, because entities like pressure or mass flow will only ever have one unit associated with them, which in turn lets more units be resolved statically. As an example, consider an iterative approach to solving the Peng-Robinson equation of state for volume. Pressure can be solved for explicitly, but solving for volume requires solving a cubic equation. This can be done directly, but for the sake of illustration we solve it iteratively in the functions below:
function pressure(state)
    (T, V) = (state.T, state.V)
    R = state.R
    (a, b) = (state.a, state.b)
    α = f_alpha(state)
    r2 = sqrt(2)
    Δ = (-1+r2, -1-r2)
    P = R*T/(V-b) - (α*a)/((V-Δ[1]*b)*(V-Δ[2]*b))
    return P
end
function volume(state)
    (P, T) = (state.P, state.T)
    R = state.R
    V = R*T/P #Ideal gas initial guess
    #Use the residual error of the ideal gas law to predict V and iterate
    for ii in 1:N_ITER[]
        Ph = pressure((; state..., V)) #Pressure at the current V estimate (state is a NamedTuple)
        Zr = Ph/P #Residual compressibility factor
        V = V*Zr #Use compressibility to predict volume at P
    end
    return V
end
The full code can be found in the test folder of this repo, but the main issue here is that if the state variable contains Unitful quantities, the compiler struggles to statically resolve the type of V when it is reassigned. If we run the benchmarks with 10 iterations, using plain Float64 values along with quantities from each of the three packages, we see the following results:
No Units (Baseline) 45.369 ns (0 allocations: 0 bytes)
Static Unitful.jl 919.048 ns (21 allocations: 336 bytes)
Static DynamicQ.jl 273.622 ns (0 allocations: 0 bytes)
Static FlexUnits.jl 45.579 ns (0 allocations: 0 bytes)
Here, Unitful.jl is worse than DynamicQuantities.jl even though units should be statically inferable, while FlexUnits.jl is the only package that runs with no overhead relative to the baseline Float64 implementation. Examining the results from the Unitful.jl implementation offers further explanation.
julia> @btime volume($uf_t)
653.125 ns (21 allocations: 336 bytes)
3.4362314782993218e-16 J^11 m^-30 mol^-1 Pa^-11
The units of V evolve into an ever-growing monstrosity that still resolves to a molar volume. If we rewrote the volume function to store the type of V and convert to that type on every iteration, we would get much more reasonable results and zero overhead.
julia> @btime volume($uf_t)
48.081 ns (0 allocations: 0 bytes)
3.4362314782993218e-16 J mol^-1 Pa^-1
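One way to sketch that rewrite with Unitful.jl is to record the units of the initial guess and uconvert back to them after each update. The volume_stable name and the NamedTuple-splat state update below are illustrative, not the repo's actual code:

```julia
import Unitful
using Unitful: uconvert, unit

function volume_stable(state)
    (P, T) = (state.P, state.T)
    R = state.R
    V  = R*T/P   # ideal-gas initial guess
    Vu = unit(V) # remember the units of the guess
    for ii in 1:N_ITER[]
        Ph = pressure((; state..., V)) # pressure at the current estimate
        # Converting back to Vu keeps the type of V fixed across iterations,
        # so the compiler can infer it and the unit expression cannot grow.
        V = uconvert(Vu, V*Ph/P)
    end
    return V
end
```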
However, this required modifying existing code to fit Unitful.jl, which may not be feasible when applying units to other people's code. One last trick that FlexUnits.jl offers is the ability to apply function barriers to dynamic quantities. If the states argument is a vector, forcing a single element type, we see that DynamicQuantities.jl and FlexUnits.jl outperform Unitful.jl.
Dynamic Unitful 5.880 μs (212 allocations: 3.31 KiB)
Dynamic DynamicQ 281.102 ns (0 allocations: 0 bytes)
Dynamic FlexUnits 232.012 ns (0 allocations: 0 bytes)
However, one can apply a function barrier in FlexUnits.jl that explicitly converts the dynamic units into static ones before running the calculations:
julia> @btime volume_function_barrier($fl_v)
41.513 ns (0 allocations: 0 bytes)
3.4362314782993218e-16 m³/mol
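The barrier itself can be sketched roughly as below. The exact FlexUnits.jl conversion calls may differ, so treat the use of ustrip on the state fields and the reattached output units as assumptions rather than the package's actual API:

```julia
# Function-barrier pattern: strip the dynamically-dimensioned state down to
# plain Float64 base-unit values once, run the hot loop on concrete types,
# then reattach the known output units. Schematic, not the repo's code.
function volume_function_barrier(state)
    raw = map(ustrip, state)  # NamedTuple of Float64s in base units
    V   = volume(raw)         # inner call now compiles on concrete types
    return V * u"m^3/mol"     # molar volume in base units
end
```

Because the dynamic-to-static conversion happens once at the barrier, only the inner call pays for specialization.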
While this isn't exactly zero overhead, it is very close. Moreover, only the internals of this function are specialized, avoiding excessive specialization in other areas of the code; this would likely improve compile-time and even run-time performance.
This update was the largest enhancement on the roadmap, so the package should be much more stable from here on out. Now that it's at a point where it is arguably useful and fairly stable, I will be working on the documentation shortly.