SymRCM doesn't seem to help linear solver performance

Dear all,

Yesterday I came across the SymRCM.jl package, which seems to work really neatly and is easy to use, but it doesn’t seem to bring me any performance benefits. Allow me to explain. I wrote a little computational fluid dynamics program in Julia, and this is how the system matrix looks after discretization:

julia> A
26419×26419 SparseArrays.SparseMatrixCSC{Float64, Int64} with 176659 stored entries:
⣿⣿⢮⣷⣴⣵⣰⣷⣖⣎⣮⡨⣶⣰⣁⣭⣗⣖⡾⣁⣄⡐⡀⣤⣶⣀⣀⣄⢄⣈⡀⡂⡐⠂⢀⠐⣁⡀⢀⣦⢐⣁⠀⠀⠀⡄⣰⣌⢰⣠⣹⣿⣷⣧⡀
⢮⣷⣿⢟⣿⣿⣿⣷⣿⣿⣿⣿⢿⣿⣿⣿⣿⣻⣿⣟⣿⣿⣿⣿⣻⣷⣿⣻⣿⡿⡿⣛⢍⣷⣯⣿⣮⢿⠿⢿⣷⢫⡳⢏⣻⣿⡷⣣⣽⣿⣿⣿⣿⣿⡇
⢔⣿⣿⣿⣿⣿⣿⣿⣿⡿⠭⣿⣷⢿⣽⣽⢽⡿⢻⣎⣿⡻⣟⣽⣷⡿⣯⡟⢿⣷⡯⡯⣻⣯⠿⢟⡏⣠⡿⢵⡑⣼⡻⣽⣗⣮⣥⣳⣸⣾⡿⢷⣽⣟⠇
⢴⣾⢿⣿⣿⣿⣿⣿⣟⣻⢷⣿⡟⣿⡾⣝⢍⣿⣟⣙⣺⢿⣟⢯⠹⣵⣴⣟⢯⠍⡚⢻⠼⢶⣗⣏⣻⣺⡳⢴⢧⣏⣐⡥⢿⢃⢿⡯⣹⡷⡻⣶⣺⢻⠇
⡸⢽⣿⣿⣿⡿⣿⣹⢿⣷⣻⣿⡍⡞⣰⠳⣫⡳⣟⣔⠥⡽⠭⣏⣿⡚⢖⣿⣯⠪⡪⣮⣵⢽⡭⢟⠹⣟⠷⠂⡦⣲⠇⢾⣣⣳⢆⡂⠼⢭⣵⡘⣿⣿⡂
⡊⡻⣿⣿⣧⣧⣽⣷⣿⣾⣿⣿⡶⢼⣭⣧⡠⠹⠴⣇⣵⡭⢯⢞⣥⣟⠶⡗⣬⡳⣦⠫⣴⠵⠜⠏⢣⣚⢰⣄⢿⢣⢦⡵⣱⡜⠹⢳⢌⡙⡟⢟⢻⣏⠁
⢘⣻⣿⣷⣽⣟⣿⣭⣣⠭⣘⣏⣿⣿⡛⣬⠵⠼⠓⡏⣺⠜⡌⠳⠥⢻⡒⢹⠺⣴⢹⣾⢡⠶⢯⠠⢔⢿⢲⠣⠟⡨⢶⡽⡢⣠⠇⢬⡛⢳⢜⣵⣎⣸⡅
⡅⣼⣿⣿⣗⣿⣞⢯⢴⡚⠧⣿⡛⣬⢿⣷⡽⠓⢫⣙⣵⢸⠎⢵⠘⢽⠏⠺⢳⣢⡞⠣⣳⠶⣂⣞⡵⡢⣼⣃⣈⠉⠔⢒⠱⢆⡮⢊⡗⢾⢇⡫⢫⣽⡆
⢹⢽⣿⣻⣷⡷⣧⣵⢯⡺⣄⡊⣑⡇⢷⠋⡿⣯⡕⠞⡵⣴⠢⠉⢁⡑⢔⢒⣊⡿⣮⡠⠻⢴⢚⢹⡠⡅⢁⡆⢘⣇⢒⣮⣆⢛⢼⡐⣫⡋⣽⡏⠭⢿⡇
⠞⢫⣿⢿⡻⢶⣟⢹⢛⢽⠴⢧⡽⠤⣏⢲⣱⠍⣿⣿⡈⡧⠖⢑⡎⠠⢉⢷⡝⠀⢒⣘⡟⣓⣚⢦⢓⢋⡂⣭⠾⠒⢕⢀⠚⢌⣑⠱⠒⣬⣝⡽⡷⢽⡆
⢀⠹⣿⣿⣿⡻⣾⣞⣅⡧⡕⡿⣚⠞⣑⣛⢑⣯⠦⡬⢿⣷⣒⠌⢛⢓⠰⠀⠓⡺⢙⠅⢐⠘⡉⠖⣋⠙⠔⡇⡛⡂⠳⣆⢒⡾⠪⢌⠧⡹⠧⣎⢹⣕⠀
⠀⣬⣿⣿⣟⣽⡿⣝⡧⢧⣫⢗⢦⡉⢎⣅⡌⠂⢜⢁⡘⠜⢿⣷⡢⠽⢬⡙⠁⢝⡑⢠⣍⢎⠉⠇⡜⣐⢑⢖⡙⢩⣐⢕⢠⠭⡉⠄⠦⣶⣆⡇⠡⠤⠄
⠘⢻⢿⣾⣽⡿⢗⣦⣻⠻⣥⢿⣥⣃⣖⣄⢅⠰⠊⡉⢿⢐⣌⡎⢿⣷⡖⢸⠧⡠⠐⢲⡅⢌⠠⠉⠠⠾⠈⢴⢲⠁⡈⡠⠰⢠⣎⠇⠘⠻⠆⣒⠺⣿⠃
⠀⢼⣿⣻⣯⠿⣴⢿⣼⣵⢼⠧⣜⣈⣫⡁⢰⢑⢧⣔⠐⠂⣆⠳⣘⣉⠻⣦⡘⣁⡀⠈⠸⡱⠸⣡⠓⡕⠘⠙⠗⣴⣀⡒⢚⡮⠶⢦⠁⠶⡥⢇⡊⡀⠀
⡀⢱⣿⡿⢿⣷⡏⠗⡫⡛⢦⡻⢚⣦⠹⣲⣮⡼⠓⠉⣹⡠⣅⢄⠉⡣⠖⢨⣻⣾⡕⡋⠨⡎⠈⣐⡘⣾⠰⢓⠠⠊⡉⡨⠃⡡⢮⠁⠚⢓⠃⣀⣨⣨⡅
⠠⠨⣿⢫⡯⡯⣾⣈⡪⣮⡬⡛⣳⣶⠾⡉⠊⡻⣘⢰⠗⠔⠑⣈⢰⣀⡀⠈⡵⠩⣿⣿⣀⠙⢈⢡⠁⠛⠀⠠⠆⡄⡠⠠⣬⡆⡁⢡⢀⠓⠦⡬⠄⢵⠆
⠰⠈⢧⣵⡿⣾⢲⣇⣕⣟⢔⡟⢡⡖⢹⡞⢛⣆⢿⢩⣐⠐⡣⢝⡁⢍⢖⡢⡢⠦⣄⠘⡻⣮⣢⠦⠉⠰⠱⡴⡂⢡⣄⡘⠓⠃⠐⣒⡶⠨⡗⡖⠪⠉⠁
⢀⠐⣯⣿⣿⢇⡽⢽⣧⢏⡶⠅⠋⡓⣨⢼⣞⣐⠺⣜⢣⠌⠧⠄⡄⠂⠖⣢⢂⢠⠆⣐⠨⡞⡻⣮⡨⠝⡇⠅⡪⡈⠾⠒⣠⣙⠐⠃⣌⡇⠈⢺⠇⠀⡁
⠁⠸⣮⣟⠋⣩⣻⣺⣷⢦⣩⢲⣴⣕⠱⡫⠄⠮⡽⢐⣏⠘⢒⢩⣠⡆⢝⠤⣲⣬⣥⠀⢃⡀⣆⠎⠻⣦⡧⠐⡣⠤⡀⣆⠏⢄⡀⢈⡈⡗⠘⢊⡩⢝⠂
⠠⣴⣿⣇⢟⣏⢙⣎⠹⠃⠐⢶⠼⡒⠶⢻⠡⠴⡌⣬⠴⠥⢱⢔⢂⣄⣖⠀⢴⢂⠀⡀⢑⡦⠍⠍⢉⠋⣻⣾⡲⠰⠃⠐⠈⠡⠂⣲⡤⢅⠈⢯⡵⠄⠀
⠔⢰⡽⣛⣑⣬⡭⢷⢨⣫⠿⣓⡛⡡⡆⠘⠶⢴⢺⠃⠻⠨⡗⣈⠜⠒⢙⣥⡠⠂⠈⠥⠌⣈⡊⠪⠉⡎⢘⡊⢻⣶⡊⠊⣀⢀⠐⠒⠁⠀⣼⡓⠦⠶⠂
⠀⠀⡽⢎⣟⣮⠔⡼⣩⣅⢌⡷⣜⡷⢰⢁⡸⣴⠑⢑⠹⢦⢔⢜⠂⡨⢠⠸⡃⡨⠀⡊⣀⠹⢺⠃⠠⢬⢉⠀⡪⠈⡿⣯⡉⠀⢉⣉⡇⠂⠒⠅⢡⠤⠀
⠀⠤⣿⣾⡹⣽⠿⢓⢭⣺⣑⠾⠈⣪⠱⢆⣬⢙⡚⢄⣸⡴⡄⡖⠐⣂⡺⡴⠍⡠⠢⠿⠽⠀⣄⢺⠋⢅⠆⡀⠀⢘⠃⠈⠻⣦⣀⠊⠂⠤⡂⡀⠩⢵⡂
⡐⢾⠽⣫⢥⣻⡿⡷⠨⠱⢷⣂⡉⣅⡪⢋⢒⠳⢕⡘⡊⢆⠃⠌⠮⠝⠸⣇⠎⠓⠅⣈⢰⢠⠴⠀⡀⢈⢨⣠⢰⠀⡇⢰⡠⠘⠻⣦⡶⠦⢰⢂⡁⡚⠃
⠐⣲⣷⣿⣲⣾⢷⡾⡖⣇⣆⠱⢿⣈⣹⣍⡯⠺⡘⣤⣍⡣⢨⣧⣶⡀⢡⡄⢾⢀⢤⠐⡘⡋⠦⠽⢦⠬⠄⢏⠁⠀⠩⠉⠈⡄⠸⡏⠟⢅⣐⣒⣒⣂⠀
⣷⣾⣿⣿⢿⣏⢻⣮⣑⠻⣿⢍⢖⣵⡭⡱⡷⠿⣗⡽⡩⢧⠬⠽⢨⢡⠥⢏⠉⢠⡈⡧⢹⠭⣢⣀⡲⢀⡦⣄⢶⠻⠜⠄⠈⠨⠰⢒⢰⢸⠑⢄⠇⡆⠀
⠽⣿⣿⣿⣷⢿⣾⣚⣿⣿⡿⢶⣊⣹⣏⣶⣧⣇⣝⣏⢗⢶⠁⡆⣾⣦⠊⠨⡂⣺⢄⣅⡎⠂⠉⠁⣇⢎⠑⠏⢨⡇⠁⡖⢇⣆⣡⠨⠸⢸⠩⠥⠑⢄⠀
⠀⠈⠉⠉⠉⠁⠉⠁⠈⠈⠁⠀⠁⠉⠈⠉⠉⠉⠈⠉⠀⠀⠀⠁⠉⠀⠀⠀⠁⠉⠈⠁⠁⠀⠁⠈⠈⠀⠀⠀⠈⠀⠀⠀⠈⠈⠉⠀⠀⠀⠀⠀⠀⠀⠁

I made no attempt to renumber the unknowns, and the structure is … well … ugly. But here comes SymRCM, which I use in the following way:

julia> p=symrcm(A);
julia> B=A[p,p]
26419×26419 SparseArrays.SparseMatrixCSC{Float64, Int64} with 176659 stored entries:
⠻⣦⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠙⢿⣷⣄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠹⣿⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠈⠻⣿⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠈⠻⣿⣿⣶⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠘⢿⡿⣯⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣷⣤⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⣿⣿⣿⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣿⣿⣿⡿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣯⢿⣷⡿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣯⣿⣿⣝⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣽⢿⣷⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣿⣿⣿⣿⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣟⣿⣿⣿⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠛⢿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣧⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠻⣿⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣿⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣿⣿⣦⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠛⢿⣷⣄⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⡀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣦⡀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠁
julia> b=a[p];

The banded matrix looks really neat; SymRCM did a good job. However, if I run a few numerical tests for the solution of both linear systems, this is what I get:

julia> @time A\a;
  0.214344 seconds (76 allocations: 87.890 MiB, 1.04% gc time)

julia> @time A\a;
  0.212042 seconds (76 allocations: 87.890 MiB, 1.05% gc time)

julia> @time B\b;
  0.222544 seconds (76 allocations: 85.415 MiB, 1.02% gc time)

julia> @time B\b;
  0.226223 seconds (76 allocations: 85.415 MiB, 1.00% gc time)

I tried with a bigger system too (a 499000×499000 matrix) and observed a similar thing:

julia> @time A\a;
 10.209412 seconds (74 allocations: 2.419 GiB, 0.29% gc time)

julia> @time A\a;
 10.375933 seconds (78 allocations: 2.419 GiB, 0.87% gc time)

julia> @time A\a;
 10.203141 seconds (74 allocations: 2.419 GiB, 0.03% gc time)

julia> @time B\b;
 11.448977 seconds (78 allocations: 2.585 GiB, 0.75% gc time)

julia> @time B\b;
 11.374048 seconds (74 allocations: 2.585 GiB, 0.02% gc time)

julia> @time B\b;
 11.415554 seconds (76 allocations: 2.585 GiB, 0.24% gc time)

It seems that the banded system didn’t offer any advantage in terms of CPU time; if anything, it is fractionally slower. Do you have an idea what is going on here? Could it be that the linear solver invoked by A\a already applies its own bandwidth- or fill-reducing reordering?

P.S. The matrices are stored in SparseMatrixCSC format, and I use LinearAlgebra and SparseArrays in addition to the SymRCM mentioned above. Quite frankly, I don’t know which solver is invoked by the \ operator; I would assume one from the Krylov subspace family. In that case, the bandwidth might not play such a big role, but I would still expect fewer cache misses during the solution and better performance.
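
I suppose I could check what \ dispatches to with something like the sketch below (F.p and F.q are the permutation fields I believe the UMFPACK factorization object exposes):

using LinearAlgebra, SparseArrays

# Which method does backslash dispatch to? (run in the REPL, where @which is available)
@which A \ a

# If it ends up in a sparse LU (UMFPACK), the factorization already carries
# its own permutations from a fill-reducing ordering, which would make an
# external RCM renumbering largely redundant:
F = lu(A)
F.p   # row permutation
F.q   # column permutation (fill-reducing ordering)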

Cheers

I don’t know what the sparse matrix package does internally, but you could try the BandedMatrices package: https://github.com/JuliaMatrices/BandedMatrices.jl

What if you try one of the direct solvers? cholesky or ldlt?
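
Something along these lines, for instance (just a sketch: it assumes your matrix is symmetric positive definite, otherwise ldlt is the one to try, and the perm keyword is there because the CHOLMOD wrappers apply their own fill-reducing ordering by default):

using LinearAlgebra, SparseArrays

# Compare direct factorizations on the original and RCM-reordered systems,
# with B = A[p, p] and b = a[p] as in your post.
@time F_A = cholesky(A)      # CHOLMOD with its default fill-reducing ordering
@time F_B = cholesky(B)
@time F_A \ a
@time F_B \ b

# To isolate the effect of the RCM ordering itself, pin the internal
# permutation to the identity:
@time cholesky(A; perm = collect(1:size(A, 1)))
@time cholesky(B; perm = collect(1:size(B, 1)))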

I think it would be great if the documentation included an example of using ldlt and cholesky.

Thanks for the answer, Petr. It is quite conceivable that the difference for direct solvers would be significant, since the bands are much narrower after the call to symrcm and simply less memory is allocated and addressed. But this is not a route I would like to take; I would rather stay with iterative solvers, since they have a much smaller memory footprint and my long-term goal is to run with many unknowns across a number of GPUs.

OK, but then there is little merit in renumbering. Iterative solvers don’t care.

In that case, you probably want to turn your matrix into a BandedMatrix.

Thanks, but that would just increase the storage. I must stick with sparse solvers.

BandedMatrix is a sparse format. It will likely take less space than a generic sparse structure.
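
As a rough check, you can read the half-bandwidths straight off the CSC structure and compare the storage (a sketch: the halfbandwidths helper is mine, and BandedMatrix(B, (l, u)) is the converting constructor as I understand it from the BandedMatrices.jl docs):

using SparseArrays
using BandedMatrices

# Half-bandwidths of the reordered matrix, read from its stored entries.
function halfbandwidths(B::SparseMatrixCSC)
    l = u = 0
    rows = rowvals(B)
    for j in 1:size(B, 2), k in nzrange(B, j)
        i = rows[k]
        l = max(l, i - j)   # furthest stored entry below the diagonal
        u = max(u, j - i)   # furthest stored entry above the diagonal
    end
    return l, u
end

l, u = halfbandwidths(B)
n = size(B, 1)

# Banded storage keeps (l + u + 1) * n values; SparseMatrixCSC keeps nnz
# values plus nnz row indices plus n + 1 column pointers.
@show (l + u + 1) * n
@show nnz(B)

Bband = BandedMatrix(B, (l, u))   # conversion (constructor assumed from the docs)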

I was expecting that renumbering the unknowns would lead to fewer cache misses even in iterative solvers, but it seems I was wrong.

Hi guys,

I have some additional details. Renumbering with SymRCM doesn’t help the default solver from SparseArrays.jl (which is UMFPACK, to my understanding), but it does help a lot if I use a solver from IterativeSolvers.jl combined with an incomplete LU preconditioner from IncompleteLU.jl. Depending on the linear system size, the renumbering reduces the time spent in the linear solver by anywhere from 30% to 60% for the systems I’ve been testing, which range from 20_000 to 2_000_000 unknowns.
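
For reference, the combination I’m using looks roughly like this (a sketch from memory; the drop tolerance and the choice of Krylov solver are of course problem-dependent):

using SparseArrays
using IncompleteLU       # ilu
using IterativeSolvers   # bicgstabl, gmres, ...

# Incomplete-LU preconditioned iterative solve on the RCM-reordered system,
# with B = A[p, p] and b = a[p].
P = ilu(B, τ = 0.001)         # drop tolerance is problem-dependent
x = bicgstabl(B, b; Pl = P)   # left-preconditioned BiCGStab(l)

# Map the solution back to the original numbering.
x_orig = similar(x)
x_orig[p] = x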

Cheers

Note that this is consistent with our reasoning above: an LU factorization needs to operate on every matrix entry that could become non-zero (the fill-in). The reordering reduces the total number of such operations because the non-zeros are tightly clustered around the diagonal.
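
One way to see this directly is to compare how many entries the incomplete factors end up keeping with and without the reordering (a sketch; I’m assuming the IncompleteLU factorization exposes its L and U factors as sparse matrices):

using SparseArrays, IncompleteLU

F_orig = ilu(A, τ = 0.001)   # original numbering
F_rcm  = ilu(B, τ = 0.001)   # RCM-reordered

# Total number of entries retained in the incomplete factors.
@show nnz(F_orig.L) + nnz(F_orig.U)
@show nnz(F_rcm.L)  + nnz(F_rcm.U)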

Yes, I agree, pretty consistent.