Multiplication times for complex matrices: julia vs Matlab

I’ve been looking at different filter approaches for analyzing ECG signals using Matlab and have recently implemented them in julia in order to get a speed improvement. I have not been seeing the expected improvements and have tracked the issue down to the time for multiplying complex matrices. I made simple examples in both languages to measure the times. The julia example is rough, as I am new to julia.

#!/usr/local/bin/julia

module TmMlt

ENV["CMPLXROOT"] = "/home/martin/Dropbox/Matlab/Complex/KM_ComplexFilterToolbox"
using MAT, LinearAlgebra, Printf, BenchmarkTools

tms = zeros(Float64,2,1)
f(a,b) = a*b
for n = [64 128 256 512 1024]
    a = ones(Complex, n, n);
    b = ones(Complex, n, n);
    tm1 = time();
    f(a,b);
    tm2 = time();
    @printf("n: %u, time: %f\n",n,tm2 - tm1)
end
end
n = 1024
a = ones(Complex, n, n);
b = ones(Complex, n, n);
@btime f(a,b)

which gave me:

julia> include("TmMlt.jl")
WARNING: replacing module TmMlt.
n: 64, time: 0.018154
n: 128, time: 0.238965
n: 256, time: 1.620080
n: 512, time: 13.005956
n: 1024, time: 126.109016
  127.565 s (2152726530 allocations: 64.16 GiB)

The matlab script I ran was:

function tmMltplys()
    for n=[64 128 256 512 1024 2048 4096]
        fprintf('For n=%d ',n);
        a = rand(n,n) + j*rand(n,n);
        b = rand(n,n) + j*rand(n,n);
        c = mlt(a,b);
    end


    function c = mlt(a, b)
    % multiply two matrices
        tic
        c = a*b;
        toc
    end
end

which gave:

>> tmMltplys
For n=64 Elapsed time is 0.000240 seconds.
For n=128 Elapsed time is 0.000493 seconds.
For n=256 Elapsed time is 0.001726 seconds.
For n=512 Elapsed time is 0.013201 seconds.
For n=1024 Elapsed time is 0.150515 seconds.
For n=2048 Elapsed time is 0.741310 seconds.
For n=4096 Elapsed time is 5.980977 seconds.
>> 

I couldn’t go over n=1024 in julia as it took too long. I must be doing something drastically wrong, as the julia times for multiplying two complex matrices are over 1000 times longer than the Matlab times. I also coded a julia multiply using for loops; those times were very similar to (slightly faster than) just using the simple function shown above. If anyone can help identify what I am doing wrong, it would be much appreciated. Thanks.
-Ken

2 Likes

The main problem here is that Complex is an abstract type; you probably wanted Complex{Float64} or similar. This means you were getting an extremely generic, extremely slow fallback version of *:

julia> a, b = ones(ComplexF64, 100, 100), ones(ComplexF64, 100, 100);

julia> @btime $a * $b;
  130.298 μs (2 allocations: 156.33 KiB)

julia> a, b = ones(Complex, 100, 100), ones(Complex, 100, 100);

julia> @btime $a * $b;
  65.006 ms (2030002 allocations: 62.03 MiB)

julia> 130298 / 65
2004.5846153846153

Once this is fixed, you are measuring only the speed of the BLAS library which actually does such multiplication, not Julia or Matlab. The defaults vary; you may wish to install MKL.jl if this is the bottleneck.
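For reference, a minimal sketch of what switching backends looks like, assuming MKL.jl is installed and a Julia version (1.7+) where loading it swaps the default BLAS via libblastrampoline:

```julia
# Minimal sketch, assuming MKL.jl is installed (Julia 1.7+).
using MKL            # loading this swaps the default BLAS backend to MKL
using LinearAlgebra

BLAS.get_config()    # inspect which BLAS libraries are currently active

a = ones(ComplexF64, 1024, 1024)
b = ones(ComplexF64, 1024, 1024)
@time a * b;         # now dispatched to MKL's complex GEMM
```

Note that `using MKL` should come before any timing; the swap only affects calls made after it.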

5 Likes

The equivalent julia code would be:

function tmMltplys()
   for n = [64 128 256 512 1024 2048 4096]
       println("For n=$n")
       a = rand(n,n) + im*rand(n,n)
       b = rand(n,n) + im*rand(n,n)
       @time a*b
   end
end
3 Likes

Did you mean

julia> 65e-3/130e-6
500.00000000000006

?

1 Like

Yes that’s better!

improbable22, thank you. For the case of n=4096, which I couldn’t measure previously, the time for multiplying two square ComplexF64 matrices is now 15 s. Matlab’s time (6.0 s) is still 2.5 times faster. Using ComplexF64 instead of Complex for n=1024 gave a 615-times speedup; I will make it a point to never use abstract types. Since Matlab still seems faster by a factor of 2-2.5, I will look into MKL.jl. As things stand, recoding into julia does not look like it will improve speed for filters time-constrained by complex matrix multiplies, so I will also spend time reading on how to code for high speed in julia. I wonder if anyone has ever considered a language like julia, but with static typing?

MATLAB uses MKL for matrix multiplication, while Julia uses OpenBLAS. So if MATLAB is faster, using MKL.jl should give Julia the same speed.

For code whose run time is dominated by calling library X, it won’t really matter much which language you’re using to call library X.

Static typing does have its advantages, but performance isn’t one of them as long as type inference works. You can use @code_warntype or try Cthulhu to identify and fix type inference problems.
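As a rough sketch of what that looks like on the example from this thread (the exact REPL output will vary by Julia version):

```julia
# Sketch: spotting an abstract element type; @code_warntype output varies by version.
f(a, b) = a * b

a_bad  = ones(Complex, 4, 4)      # Matrix{Complex}: abstract element type
a_good = ones(ComplexF64, 4, 4)   # Matrix{ComplexF64}: concrete element type

# In the REPL:
#   @code_warntype f(a_bad, a_bad)    # return type Matrix{Complex} flagged (abstract)
#   @code_warntype f(a_good, a_good)  # return type Matrix{ComplexF64} (concrete)
@assert !isconcretetype(eltype(a_bad))
@assert isconcretetype(eltype(a_good))
```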

3 Likes

To follow up: I installed MKL, and now the times are effectively the same as those of Matlab. I should mention that after installing, I had to quit and restart julia’s REPL before the times changed.

#!/usr/local/bin/julia

module TmMlt

using MAT, LinearAlgebra, Printf, BenchmarkTools

f(a,b) = a*b
function tmMltplys()
   for n = [64 128 256 512 1024 2048 4096]
       a = rand(n,n) + im*rand(n,n)
       b = rand(n,n) + im*rand(n,n)
       println("For n=$(n), $(@belapsed f($a,$b))")
   end
end
tmMltplys()
end

Times:

julia> include("TmMlt.jl")
WARNING: replacing module TmMlt.
For n=64, 3.0043e-5
For n=128, 0.0001909
For n=256, 0.001420553
For n=512, 0.012165892
For n=1024, 0.090852981
For n=2048, 0.699499082
For n=4096, 5.723938093

Summary: for filters time-constrained by complex matrix multiplies, julia can be the same speed as Matlab. And julia runs on a Raspberry Pi 4 without a license. This is enough reason for me to continue learning the language. Thanks improbable22 and Elrod; your replies have really helped me; without them, I would have mistakenly given up on the language.

8 Likes

Don’t misunderstand. You can use abstract types without sacrificing performance, for example in function signatures. You should probably avoid them as element types of containers, or in fields of struct definitions. But there you can use parametric types instead.

If you always type your code with concrete types, it won’t be generic, and generic coding is one of the greatest strengths of Julia.
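A minimal sketch of that pattern (the `Filter` type here is hypothetical, just for illustration):

```julia
# Hypothetical example: a parametric struct keeps its field concrete
# for each instance, while staying generic over the float type.
struct Filter{T<:AbstractFloat}
    coeffs::Vector{Complex{T}}   # concrete once T is fixed
end

f64 = Filter([1.0 + 2.0im])       # Filter{Float64}
f32 = Filter([1.0f0 + 2.0f0im])   # Filter{Float32}
```

Each instance gets a fully concrete field type, yet the same struct works for Float32, Float64, or any other float.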

6 Likes

Thanks DNF. If I understand correctly, you are saying that variables and fields in structs cannot be abstract, but that function argument definitions can be? I have struggled quite a bit with understanding how and where to specify and convert types, especially for complex vectors, arrays, and matrices. For example, consider:

Y[i,:] = Xfbi[i,:]*A;

where Y is an n×p array, Xfbi is an n×m array, and A is an m×p array. This doesn’t work, as Xfbi[i,:] is converted into an m×1 column vector. The only way I could get this to work was to use

 rwXfbi = reshape(Xfbi[i,:],1,m);
 Y[i,:] = rwXfbi*A;

I’m sure this is not the right way, but I can’t figure out how to stop Julia from turning a row slice, which I would have thought would be a 1×m “array”, into an incompatible column “vector”. Indeed, I cannot understand why there are vectors at all. I also have difficulty understanding what

julia> function myfunc(c::MySimpleContainer{<:AbstractArray{<:AbstractFloat}})
           return c.a[1]+2
       end

means (from “Performance Tips”). I think I’m getting this slowly if I can paraphrase:
It means MySimpleContainer is a struct (or container? - I don’t understand the difference) that must be a non-abstract known array type; but that type can be any specific type that is under the “type tree?” of an array of a particular specific type of float? I’m really having a difficult time getting my head around this. Although, to be honest, I also still have a hard time with Interfaces and Reflection in Go, as well, so it’s probably because I haven’t spent enough time on it. The start-up on the language and understanding the semantics of the typing is proving to be challenging for me; I’m hoping I will eventually figure this out. Thanks again for your help.

So the basic idea is that you want the compiler to be able to figure out the types of all of the things you are working with. The compiler doesn’t need you to specify the types of function arguments; it will always know those by itself, so putting types on function arguments serves to restrict what sorts of things can be passed to the function rather than to ensure performance.

A struct has fields, and usually you want to use the values in those fields to do something; that’s the whole point of a struct. You should specify the types of those fields, because then the compiler can know their types as long as it knows what type of struct it is dealing with. Parametric types can have a type parameter, and the types of the fields can depend on the type parameter.

A container is an umbrella term for an object that holds other objects (arrays, dicts, etc.). You want to specify the type of thing in the container so that the compiler knows what it will get when you access things in it.

Where you had gone wrong originally is that Complex is a parametric type: you can have a Complex{Float64}, a Complex{Float32}, etc. Your array didn’t specify what type of complex numbers was in it, so the compiler can’t know what type of number it will get. That will basically always be bad for performance. In this case it was particularly bad because matrix multiplication of Complex{Float64}, which is what you wanted to do, gets passed off to BLAS, which is highly optimized, and that couldn’t happen for your abstractly typed arrays.
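To make the distinction concrete (a small sketch you can paste into the REPL):

```julia
# Complex by itself is abstract; it needs a concrete parameter.
@assert !isconcretetype(Complex)
@assert isconcretetype(Complex{Float64})
@assert ComplexF64 === Complex{Float64}   # ComplexF64 is just an alias

bad  = ones(Complex, 2, 2)       # Matrix{Complex}: elements are boxed
good = ones(ComplexF64, 2, 2)    # Matrix{ComplexF64}: flat storage, BLAS-compatible
@assert eltype(bad) === Complex
@assert eltype(good) === ComplexF64
```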

2 Likes

bcmichael, thank you; this helps a lot.

Yeah, no problem. Honestly I’m actually pretty surprised to learn that ones(Complex,n,n) actually works and gives an abstractly typed array. I can’t imagine almost any scenario where you would actually want to do that.

You can get a 1×m row slice using Xfbi[i:i,:], Xfbi[[i],:], or transpose(Xfbi[i,:]). If Xfbi is real-valued, Xfbi[i,:]' also works.
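For example, a quick sketch with a small matrix showing that each option produces a row that multiplies cleanly:

```julia
# Sketch of the row-slice options above.
X = [1 2 3; 4 5 6]

r1 = X[1:1, :]            # 1×3 Matrix, via a range index
r2 = X[[1], :]            # 1×3 Matrix, via a vector index
r3 = transpose(X[1, :])   # 1×3 lazy row (transpose of the vector slice)

A = ones(Int, 3, 2)
@assert size(r1 * A) == (1, 2)
@assert size(r2 * A) == (1, 2)
@assert size(r3 * A) == (1, 2)
```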

Vectors are just 1-d arrays, and it would be oddly inconsistent to allow arrays of any dimensionality except 1. Matlab does that anyway, because it can be convenient when you’re mostly doing matrix stuff. But when your arrays are really just arrays rather than matrices, it can get in the way, especially if your code is supposed to be generic with respect to the number of dimensions.

1 Like

Y[i,:] .= Xfbi[i,:]*A;

will probably work - note the single dot in front of the =.

1 Like

Indeed, that’s remarkable!

Sukera, I gave your suggestion a try, but it doesn’t seem to work?

julia> a = [1 2 3; 4 5 6; 7 8 9; 10 11 12]
4×3 Array{Int64,2}:
  1   2   3
  4   5   6
  7   8   9
 10  11  12

julia> b = [0.1 0.2; 0.1 0.2; 0.1 0.2]
3×2 Array{Float64,2}:
 0.1  0.2
 0.1  0.2
 0.1  0.2

julia> c .= a[1,:]*b
ERROR: DimensionMismatch("matrix A has dimensions (3,1), matrix B has dimensions (3,2)")

Since you didn’t specify the error, I assumed it was because the slice in Y was a different shape than the result from the multiplication. My apologies.

What you’re seeing is expected - you’re trying to multiply a 1x3 vector with a 3x2 matrix. Those dimensions don’t match and hence you can’t multiply them, as is expected for vector-matrix-multiplication. Using the suggestion by @yha you can transform the vector into a 3x1 matrix and the multiplication will work as expected.

1 Like

Those dimensions would match, but this is a size (3,) vector (and not 3x1 or 1x3) times a size (3,2) matrix, and that doesn’t work.
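For completeness, here is the failing example above, fixed by making the slice two-dimensional (a sketch):

```julia
# The slice a[1,:] is a length-3 vector, size (3,), so a[1,:]*b fails.
a = [1 2 3; 4 5 6; 7 8 9; 10 11 12]
b = [0.1 0.2; 0.1 0.2; 0.1 0.2]

@assert size(a[1, :]) == (3,)     # plain slice: a vector, not a 1×3 row
c = a[1:1, :] * b                 # 1×3 matrix times 3×2 matrix
@assert size(c) == (1, 2)
c2 = transpose(a[1, :]) * b       # a lazy 1×3 row also works
@assert size(c2) == (1, 2)
```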

1 Like