Numeric stability of Base.exp of static matrix of Float32

Guys,

I want to use Base.exp in my model, which uses Float32 with StaticArrays (SMatrix and SVector). However, I found a numerical instability in Base.exp with a 3×3 matrix; see the example below:

using StaticArrays

julia> deriv_mat = Float32[-50.75532 69.86392 0.0; 50.75532 -139.72784 54.483025; 0.0 69.86392 -108.96605]
3×3 Matrix{Float32}:
 -50.7553    69.8639     0.0
  50.7553  -139.728     54.483
   0.0       69.8639  -108.966

julia> exp(deriv_mat)
3×3 Matrix{Float32}:
 8.57472f-6  6.68494f-6  3.72485f-6
 4.85653f-6  3.78621f-6  2.10967f-6
 3.46999f-6  2.70524f-6  1.50736f-6

julia> deriv_mat_s = SMatrix{3, 3, Float32}(Float32[-50.75532 69.86392 0.0; 50.75532 -139.72784 54.483025; 0.0 69.86392 -108.96605])
3×3 SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 -50.7553    69.8639     0.0
  50.7553  -139.728     54.483
   0.0       69.8639  -108.966

julia> exp(deriv_mat_s)
3×3 SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 NaN  NaN  NaN
 NaN  NaN  NaN
 NaN  NaN  NaN

I need to stick strictly to Float32 with static matrices, so is there any way to overcome this numerical instability? If I convert to Float64, it creates extra allocations…

I am thinking of using something like @btime SMatrix{3, 3, Float32}(exp(SMatrix{3, 3, Float64}($deriv_mat_s_64))), but it is very ugly…

A3 = rand(3, 3)
As3 = SMatrix{3, 3}(A3)

julia> @btime exp($As3)
  93.400 ns (0 allocations: 0 bytes)
3×3 SMatrix{3, 3, Float64, 9} with indices SOneTo(3)×SOneTo(3):
 1.52765   0.53686  0.282278
 0.792425  1.74997  0.504976
 1.28887   1.30304  1.93264

julia> @btime SMatrix{3, 3, Float32}(exp(SMatrix{3, 3, Float64}($deriv_mat_s_64)))
  113.689 ns (0 allocations: 0 bytes)
3×3 SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 8.5747f-6   6.68493f-6  3.72484f-6
 4.85652f-6  3.7862f-6   2.10967f-6
 3.46999f-6  2.70524f-6  1.50736f-6

I don’t know about the numerical instability, but this is less ugly:

Float32.(exp(Float64.(x)))
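For instance, wrapping the round-trip in a small helper keeps call sites tidy (a sketch; the name `exp32` is my own, not an API). Since broadcasting preserves the static type, it should stay allocation-free:

```julia
using StaticArrays

# Hypothetical helper (name `exp32` is my own): widen to Float64, take the
# matrix exponential, and narrow back. Broadcasting over an SMatrix returns
# an SMatrix, so no heap allocation occurs.
exp32(x::AbstractMatrix{Float32}) = Float32.(exp(Float64.(x)))

# The matrix from the original post (column-major constructor order):
deriv_mat_s = SMatrix{3,3,Float32}(-50.75532, 50.75532, 0.0,
                                   69.86392, -139.72784, 69.86392,
                                   0.0, 54.483025, -108.96605)
exp32(deriv_mat_s)   # finite Float32 entries instead of NaNs
```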

Possibly related to [BUG] exp of SMatrix fails numerically · Issue #1295 · JuliaArrays/StaticArrays.jl, though the 2×2 example there works now.

Just checking: do you intend exponentiation of each element, or the generalization to square matrices?

Not the element-wise exponential, but the matrix exponential.

See issues #785 (which first reports the issue) and #1285 (which correctly diagnoses the problem).

This is exacerbated by the large coefficients used in the Pade approximation in StaticArrays’ version of exp. One improvement would be to scale all the coefficients down so that the largest (rather than the smallest) ones are on the order of 1; in fact, that would probably resolve most overflow issues in exp. It wouldn’t be a very difficult pull request: just find each set of coefficients and scale them by 2^{-n} for a useful choice of n.
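The invariance behind this suggestion can be sketched with made-up coefficients (not the real Pade table): scaling both polynomials of a rational approximant by the same power of two leaves the ratio unchanged while shrinking every intermediate value.

```julia
# Toy illustration with made-up coefficients. Scaling numerator and
# denominator polynomials by the same power of two is exact in binary
# floating point and cancels in the ratio.
p = (6.5f16, 120.0f0, 1.0f0)    # made-up "large" coefficients (ascending order)
q = (6.5f16, -120.0f0, 1.0f0)
c = 2.0f0^-55                   # power of two, so scaling incurs no roundoff
x = 2.5f0
r  = evalpoly(x, p) / evalpoly(x, q)
rs = evalpoly(x, c .* p) / evalpoly(x, c .* q)
r ≈ rs   # same ratio, much smaller intermediates
```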

Not strictly in Base, as you know by now. I filed an issue just now, but I see it’s a known problem, and seemingly the third duplicate, so I closed mine. But at least see the links to the code from my issue; I didn’t move them to the other issues.

I’m curious, what does Base do for Arrays, and is there a way to just repurpose that for StaticArrays so there’s only one real algorithm to maintain?

As I recall, for 3×3 and up StaticArrays uses nearly the same algorithm (with a very similar implementation) as Base, except without the gebal. This issue really is just the large coefficients used in the Pade approximation (though they are the same as used in Base) combined with a specialized 3×3 \ that is really vulnerable to overflow. Shrinking the coefficients or using lu would fix this issue.
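For a back-of-the-envelope sense of why Float32 is so exposed here (my own illustration, not the library internals): the largest Pade coefficient, 64764752532480000, already sits only about 21 orders of magnitude below floatmax(Float32), so the norm-dependent terms in the polynomial evaluations and the solve run out of headroom quickly.

```julia
# Back-of-the-envelope headroom check (illustration only): the largest
# Pade coefficient versus the Float32 overflow threshold.
c = Float32(64764752532480000)   # ≈ 6.47648f16
floatmax(Float32)                # 3.4028235f38
floatmax(Float32) / c            # ≈ 5.25f21 of headroom for norm-dependent terms
```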

I haven’t looked into modifying the Base version to be more widely applicable. It’s extremely limited right now (see #51008). But as I recall, Julia v1.12 is decent at avoiding allocations for non-escaping MArray. With a not-small reworking of the Base version, it might be possible to use it entirely (without allocations) by passing a MArray. I think the main factor would be working around the gebal in some way, either skipping it (the algorithm “works” without it) or making a suitable Julia implementation.

Thanks for your reply! But what do you mean by “Shrinking the coefficients or using lu would fix this issue”?

Besides, I need to use ForwardDiff.jacobian later to calculate the Jacobian of my model.

Thanks, but this is a dead end: I need to use ForwardDiff.jacobian later to calculate the Jacobian of my model, and simply converting to Float64 and then back to Float32 yields an error in ForwardDiff.jacobian:

ERROR: LoadError: TaskFailedException

    nested task error: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#34#36", Float32}, Float32, 11})
    The type `Float64` exists, but no method is defined for this combination of argument types when trying to construct it.
    
    Closest candidates are:
      (::Type{T})(::Real, !Matched::RoundingMode) where T<:AbstractFloat
       @ Base rounding.jl:265
      (::Type{T})(::T) where T<:Number
       @ Core boot.jl:900
      Float64(!Matched::Float32)
       @ Base float.jl:341
      ...
    
    Stacktrace:
      [1] convert(::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#34#36", Float32}, Float32, 11})
        @ Base ./number.jl:7
      [2] macro expansion
        @ ~/.julia_sindbad_clean/packages/StaticArraysCore/MoJEf/src/StaticArraysCore.jl:95 [inlined]
      [3] convert_ntuple
        @ ~/.julia_sindbad_clean/packages/StaticArraysCore/MoJEf/src/StaticArraysCore.jl:91 [inlined]
      [4] SMatrix{3, 3, Float64, 9}(x::NT)
        @ StaticArraysCore ~/.julia_sindbad_clean/packages/StaticArraysCore/MoJEf/src/StaticArraysCore.jl:127
      [5] SArray
        @ ~/.julia_sindbad_clean/packages/StaticArraysCore/MoJEf/src/StaticArraysCore.jl:131 [inlined]
      [6] StaticArray
        @ ~/.julia_sindbad_clean/packages/StaticArrays/DsPgf/src/convert.jl:180 [inlined]
      [7] solve_odes(new_vegW::SVector{3, Float32}, deriv_mat::SMatrix{3, 3, ForwardDiff.Dual{ForwardDiff.Tag{var"#34#36", Float32}, Float32, 11}, 9}, deriv_const::SVector{3, ForwardDiff.Dual{ForwardDiff.Tag{var"#34#36", Float32}, Float32, 11}}, init_cond::SVector{3, Float32}, deltaT::Float32, z_zero::Float32, helpers::NT)
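One way around this (a sketch; `stable_exp` is my own name) is to widen only when the element type really is Float32, so Dual-valued matrices from ForwardDiff fall through to the generic path instead of hitting a `Float64(::Dual)` conversion:

```julia
using StaticArrays

# Hypothetical helper: widen to Float64 only for plain Float32 matrices.
# For Dual element types (as ForwardDiff.jacobian produces), the generic
# method applies, so no Float64(::Dual) conversion is attempted.
# Note: the Dual path may still suffer the underlying Float32 instability.
stable_exp(A::SMatrix{N,N,Float32}) where {N} =
    SMatrix{N,N,Float32}(exp(SMatrix{N,N,Float64}(A)))
stable_exp(A::SMatrix) = exp(A)   # generic fallback (Dual, Float64, ...)
```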

Thanks for your reply @mikmoore, but how can we do this 2^{-n} scaling?

The source is here. Every branch has two polynomial evaluations (most of them using @evalpoly, which should probably be changed to evalpoly, except for one which writes them out long-form to exploit the Paterson-Stockmeyer method). The coefficients of these polynomials are the S(<integer>) terms. In every polynomial pair, the smallest term is 1. The largest term depends on the branch. If each term in a polynomial pair were scaled by the same constant (ideally a power of 2, so that there is no roundoff error), the result would be mathematically the same (because \frac{a}{b} = \frac{ac}{bc}). So, for instance, the largest polynomial pair has largest coefficient 64764752532480000. Scaling every term in that polynomial pair by 2^{-55} would change this coefficient to \approx 1.8. The numbers in the calculation would be smaller and would not overflow the calculation expA = (V - U) \ (V + U).
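A quick numerical check of that cancellation (toy matrices, not the real U and V from the exp code):

```julia
using StaticArrays

# Scaling both polynomials of the rational approximant by the same
# constant c cancels in the solve: (cV - cU) \ (cV + cU) agrees with
# (V - U) \ (V + U) up to roundoff.
U = SMatrix{2,2}(0.1, 0.2, 0.3, 0.4)
V = SMatrix{2,2}(1.0, 0.0, 0.0, 1.0)   # identity, so V - U is well-conditioned
c = 2.0^-55
(V - U) \ (V + U) ≈ (c*V - c*U) \ (c*V + c*U)   # true
```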

See issue #1285 (linked above) for discussion of X \ Y versus lu(X) \ Y. Solving with lu is less prone to overflow due to a not-very-careful implementation of X \ Y and would typically resolve this issue, although it is somewhat slower.
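The difference in spelling is small (a sketch with a made-up well-conditioned system, not the library's actual code path):

```julia
using StaticArrays, LinearAlgebra

# Direct solve vs. explicit LU factorization: the same answer here, but
# the factorized route avoids the overflow-prone intermediate arithmetic
# of the specialized small-matrix `\`.
A = SMatrix{3,3}(4.0, 1.0, 0.0, 1.0, 4.0, 1.0, 0.0, 1.0, 4.0)  # well-conditioned
b = SVector(1.0, 2.0, 3.0)
A \ b ≈ lu(A) \ b   # true, up to roundoff
```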

Personally, I think changing the coefficients should be done in any case. The lu would then rarely be necessary.

The same coefficient change could be made to Base, though Base always uses lu (even if it isn’t written explicitly, because that’s always how it does square \), so the issue never showed up there.

Sure. The main point was to show that it’s better to write

Float64.(x)

than

SMatrix{3, 3, Float64}(x) 

I dislike needless verbosity.
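Both spellings give identical results for an SMatrix input; broadcasting preserves the static size, so the shorter form is purely a savings in typing:

```julia
using StaticArrays

x = @SMatrix Float32[1 2 3; 4 5 6; 7 8 9]

# Broadcasting preserves the static type, so the two conversions agree:
Float64.(x) == SMatrix{3,3,Float64}(x)   # true
typeof(Float64.(x))                      # SMatrix{3, 3, Float64, 9}
```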