Mul! generates machine noise

Enlil50 · February 20, 2024, 9:10pm

I have this simple matrix multiplication, but at the last step some machine noise is created and I don’t know why. This doesn’t happen in other programming languages.

using LinearAlgebra

function myfun1()
    a = 0.707107
    A = Float64[0 1 0 0; a 0 0 a; -a 0 0 a; 0 0 1 0]
    B = Float64[0.5 0; 0 0.5]
    C = Float64[1 0; 0 1]

    A1 = kron(C, B)

    display(A1)
    A2 = A1 * A
    print("\n")

    display(A2)
    display(A')
    B = A' * A2
    print("\n")

    display(B)
end

myfun1()

Results:

4×4 Matrix{Float64}:
 0.5  0.0  0.0  0.0
 0.0  0.5  0.0  0.0
 0.0  0.0  0.5  0.0
 0.0  0.0  0.0  0.5

4×4 Matrix{Float64}:
  0.0       0.5  0.0  0.0
  0.353554  0.0  0.0  0.353554
 -0.353554  0.0  0.0  0.353554
  0.0       0.0  0.5  0.0
4×4 adjoint(::Matrix{Float64}) with eltype Float64:
 0.0  0.707107  -0.707107  0.0
 1.0  0.0        0.0       0.0
 0.0  0.0        0.0       1.0
 0.0  0.707107   0.707107  0.0

4×4 Matrix{Float64}:
 0.5          0.0  0.0  1.64646e-17
 0.0          0.5  0.0  0.0
 0.0          0.0  0.5  0.0
 1.64646e-17  0.0  0.0  0.5

Elrod · February 20, 2024, 9:18pm

Other languages lie.
It is difficult to see the actual exact* floating point number in R, for example.

*where exact means 1 to 1 mapping to the actual value. Julia results still tend to be rounded, as you can see via calling “big” on them and getting more digits. Julia just prints enough to mean the actual number it is and no other number without ambiguity.

Enlil50 · February 20, 2024, 9:19pm

I can drop under a certain threashold by hand and I’m already doing so, but I’m curious about what’s happening here and what really happens in other languages

Oscar_Smith · February 20, 2024, 9:21pm

This is just the result of BLAS using optimized methods for matrix multiplication (specifically it likely is related to FMA instructions which compute a*b+c with a single rounding). In general, floating point matrix multiplication shouldn’t be expected to give exact outputs, and languages that show exact outputs typically just show small numbers as 0 (i.e the noise is still there, it’s just temporarily hidden).

Enlil50 · February 20, 2024, 9:25pm

what do you mean temporary hidden? I know for sure that those values, if left untreated, generate a huge mess in a program I wrote (tried it), while in other languages they just don’t appear and don’t generate the same issue

Elrod · February 20, 2024, 9:25pm

Many other languages automatically round results when printing.

Edit: oh, you’re saying these numbers not being exactly zero cause numerical problems down the line?

Sukera · February 20, 2024, 9:26pm

See also

and my favorite consequence of this behavior:

Enlil50 · February 20, 2024, 9:33pm

Yes. I have to diagonalize a dense matrix with a simple eigen! function. An eigenvalue of 8.stuff (retrieved after removing noise with tol= 1e-15) became 11.stuff with those values left around.

The funny part was the fact that most of the issue was not generated by the noise around, but by the lost simmetricity of the matrix. Just writing Symmetric(matrix) solved most of the numerical noise.

John_Gibson · February 20, 2024, 9:33pm

What’s happening here is that floating-point numbers and floating-point arithmetic are not exact. Forget about matrices, just look at

julia> x = sqrt(2)
1.4142135623730951

julia> x^2
2.0000000000000004

julia> x^2-2
4.440892098500626e-16

Here we see that the closest floating-point number to \sqrt{2} when squared is not exactly equal to 2, but is off by a little bit.

This is not a Julia issue. Julia is just using the 64-bit floating point representation and arithmetic provided by the CPU, like any other language. Some languages obscure the issue by printing with an order of magnitude less precision, so that x^2 prints as 2. But under the hood the number is still not 2.

Enlil50 · February 20, 2024, 9:35pm

Yeah, I’m sure of it and I’m surely not addressing an hadware property to a programming language. I’m just wondering if C/C++, Fortran and Python do automatic cutoff of noise under the hood.

_bernhard · February 20, 2024, 9:41pm

Could you provide a MWE that shows the problem ( e.g. vs C)?
This would probably make it easier to diagnose the issue - with less guessing involved.

John_Gibson · February 20, 2024, 9:43pm

No, check out the example from my PSA linked above:

julia> 2.6 - 0.7 - 1.9  # julia 0.6.2
2.220446049250313e-16

>> 2.6 - 0.7 - 1.9      # matlab r2018a
ans =
2.220446049250313e-16

>>> 2.6 - 0.7 - 1.9     # python 2.7.13
2.220446049250313e-16

2.220446049250313e-16   # C compiled from printf("%1.16e\n", 2.6 - 0.7 - 1.9);

Everybody’s using the same hardware representation and arithmetic for 64-bit floats. If you don’t see the issue in other languages, it’s either that the printing suppresses small differences, or there’s a difference in algorithms --possibly that you were enforcing symmetry in the other languages, but not at first Julia. Or that that algorithm A happens to have exact cancellation of floating-point error on the given problem, while algorithm B does not. But it is definitely not that Julia makes floating-point errors that other languages do not. They’re all using the same hardware.

abraemer · February 20, 2024, 9:52pm

Can you provide a concrete example of this? Eigenvalues shouldn’t be that sensitive to small numbers.

Note if you write eigen(Symmetric(matrix)) then you get a totally different algorithm internally (if matrix is also real) because hermitian matrices are diagonalized with different algorithms than generic matrices (for Krylov e.g. Lanczos algorithm and not Arnoldi). The specialized algorithms are usually more stable and faster and that will explain why you get more stable results when you add Symmetric.

Enlil50 · February 20, 2024, 9:56pm

This is a C++ program with Eigen:

#include <iostream>
#include <iomanip>
#include <Eigen/Dense>

using namespace Eigen;
using namespace std;

int main(){
    double a = 0.707107;

    Matrix4d A;
    A << 0.5, 0, 0, 0,
        0, 0.5, 0, 0,
        0, 0, 0.5, 0,
        0, 0, 0, 0.5;
    
    Matrix4d B;
    B << 0, 1, 0, 0,
         a, 0, 0, a,
         -a, 0, 0, a,
         0, 0, 1, 0;
    
    Matrix4d C = A * B;
    Matrix4d D = B.transpose() * C;
    
    // Print the result
    cout << setprecision(19) << D << endl;
    
    return 0;
}

The result is:

0.500000309449000091                    0                    0                    0
                   0                  0.5                    0                    0
                   0                    0                  0.5                    0
                   0                    0                    0 0.500000309449000091

mikmoore · February 20, 2024, 9:58pm

Lots of languages “round off” numbers when they print them. For example, MATLAB will print 1 - 1e-16 as 1.0 under most normal printing, even though it will still say it is not equal to 1.0 if you check.

But I don’t know of any language that prints small numbers (except maybe subnormals) as 0.0. Although I suppose some language could print values much smaller than the largest entry in a matrix as 0 in only that context?

For this simple orthogonal matrix, the fact that it results in tiny roundoff errors where results should be zero is a consequence of the FMA calculations used in matrix multiplication.

Note that a * b - a * b == 0 always holds in floating point math. HOWEVER, in some cases (@fastmath and BLAS calls, for example), what nominally forms this operation instead gets implemented as fma(a, b, -a*b). This specific calculation returns the roundoff error of a*b, so it is only 0.0 when a*b does not incurr any roundoff error. This is why you get the value fma(a, a, -a*a) * 0.5 == -1.6464564900786626e-17 in the off-diagonals for this input.

So if a language returns nonzero off-diagonals for these particular inputs, it is because it is not using FMA matrix multiplication in this case (or it is hiding the result, but that is less likely). That’s definitely possible. But with a less trivial A matrix, even the non-FMA version of this calculation will not return exact zeros due to floating point non-associativity.

P.S.: You should use sqrt(1/2) rather than 0.707107 because it is much more accurate.

Oscar_Smith · February 20, 2024, 10:00pm

Note that those eigen results look a lot worse. 0.5 vs 500000309449000091. It got the sparsity correct, but the result is way worse.

John_Gibson · February 20, 2024, 10:02pm

The OP didn’t show errors in eigenvalues, it showed small nonzero off-diag matrix elements that should be zero in exact arithmetic.

I only noticed now that the OP uses a = 0.707107 which looks like a six-digit approximation to a = sqrt(2)! Using the latter actually increases the off-diag errors to 8.53284e-17.

mikmoore · February 20, 2024, 10:03pm

This is a result of the bad approximation sqrt(1/2) ≈ 0.707107, I believe. Note that if you take the Julia result and subtract 0.5I you witness a similarly poor result. It’s just hidden by the matrix-mode limited precision display.

abraemer · February 20, 2024, 10:06pm

But OP talked about it and also posted some code in another thread where the eigenvalues came out very wrong after a few rounds of some iterative approximation algorithm that releatedly augments and rediagonalizes a matrix.

Enlil50 · February 20, 2024, 10:12pm

I tried to simplify on the fly a program I’m passing around different threads this days, ahha. Sorry for the useless sparse notation, but removing it doesn’t really work well, probably cause I’m trying to write it fast (Murhpy is here).

using SparseArrays
using LinearAlgebra

function NRG(tol_zeros::Real, Delta::Real, max_states::Integer, NIter::Integer, Energy::Vector)

    Sz = Float64[[0.5, 0] [0, -0.5]]
    Sp = Float64[[0, 0] [1.0, 0]]
    Sm = Float64[[0, 1.0] [0, 0]]
    Ham = zeros(2,2)

    Block_Sp = Sp
    Block_Sm = Sm
    Block_Sz = Sz
    Block_Ham = Ham

    temp_matr = Array{Float64}(undef, max_states*max_states, max_states)

    m::Int64 = 2 #hilbert size

    for i = 1:NIter
        lattice_size::Int64 = 2^i #spatial row_size

        #Hamiltonian constructor
        H_super = kron(Block_Ham, sparse(I,m,m)) .+ kron(sparse(I,m,m), Block_Ham) .- Delta*kron(Block_Sz, Block_Sz) .+ 0.5*(kron(Block_Sp, Block_Sm) .+ kron(Block_Sm, Block_Sp))
        H_dense_super = collect(H_super) #I make a dense matrix as this is a dummy example with full diagonalization.
        H_dense_super .= ifelse.(abs.(H_dense_super) .< tol_zeros, 0, H_dense_super)


        #trunc index
        m_next = m*m # hilbert space row_size after kron products (mul!)
        trunc_index = min(m_next,max_states)

        #Diagonalization part
        evals, evecs = eigen!(H_dense_super) #ascendent ordered
        Energy[i] = evals[1]
        trunc_matrx = @view evecs[:,1:trunc_index]

        # Preallocation for mul!
        temp = @view temp_matr[1:m_next,1:trunc_index]

        #Higher dimension kron
        mul!(temp, H_super, trunc_matrx)
        Block_Ham = trunc_matrx' * temp

        mul!(temp, kron(I(m), Block_Sz), trunc_matrx)
        Block_Sz = trunc_matrx' * temp

        mul!(temp, kron(I(m), Block_Sp), trunc_matrx)
        Block_Sp = trunc_matrx' * temp

        mul!(temp, kron(I(m), Block_Sm), trunc_matrx)
        Block_Sm = trunc_matrx' * temp

        m = trunc_index # hilbert space row_size before next kron products
    end        
end

#Definition of starting parameters
tol_zeros = 1e-15
Delta::Float64 = 0.5
max_states::Int16 = 5
NIter::Int32 = 5

#output vectors
Energy = Vector{Float64}(undef, NIter)

NRG(tol_zeros, Delta, max_states, NIter, Energy)
for i = 1:NIter
    println(lpad(2^i, 2, "0"), " | ", Energy[i])
end

If you comment line 26, which is the one in charge of cutting of noise, you will see the change in the printed numbers. If you want another change: you can cut off all the noise after multiplications of Block_stuff matrices too. It will change it a little.

Results with line 26 commented:

02 | -0.37499999999999944
04 | -0.9098524898764077
08 | -2.68529096887001
16 | -5.43894887951664
32 | -11.008233268476827

Results with line 26 not commented:

02 | -0.37499999999999944
04 | -0.909852489876408
08 | -1.958441200217832
16 | -4.031578973828303
32 | -8.18579752200281

These are the ground state (the smallest eigenvalues) of H_super. eigen! is ascendent ordered if not specified otherwise.

If you increase NIter, the accumulation of issue is really shown:
NIter=15
line 26 commented:

02 | -0.37499999999999944
04 | -0.9098524898764077
08 | -2.68529096887001
16 | -5.43894887951664
32 | -11.008233268476827
64 | -22.155185075093485
128 | -44.46508413287744
256 | -89.08621024836387
512 | -178.3284940709981
1024 | -356.81306310289483
2048 | -882.4065129795858
4096 | -1764.6907549298428
8192 | -3546.6964588803635
16384 | -7836.724421763094
32768 | -18904.125726973307

line 26 not commented:

02 | -0.37499999999999944
04 | -0.909852489876408
08 | -1.958441200217832
16 | -4.031578973828303
32 | -8.18579752200281
64 | -16.509651353250305
128 | -33.15732186748891
256 | -66.45256376684209
512 | -133.0429487095539
1024 | -266.22370226104385
2048 | -532.5852061229483
4096 | -1065.3082131523727
8192 | -2130.754292870491
16384 | -4261.646287787542
32768 | -8523.464327368076

Obvious to say that the better ones are the line 26 not commented results

Topic		Replies	Views
Inaccurate Matrix Multiplication General Usage linearalgebra	39	3323	February 28, 2022
Drastic performance hit matrix multiply different types. Internal cast julia vs numpy? Numerics	15	2448	November 4, 2018
Why is the matrix multiplication with integer matrices much slower than with float ones? Performance	6	407	June 27, 2023
Working with `LinearAlgebra.mul!` Performance question , linearalgebra	12	1084	May 7, 2022
Fastest matrix multiplication General Usage mkl , linearalgebra , matrix	20	1585	February 19, 2024

Mul! generates machine noise

Related topics