Redistribute workload from in-homogeneous local workers to BLAS threads

VinceNeede · April 29, 2025, 8:56am

I have some expensive function that I’m executing on local workers, something like:

using Distributed

@everywhere begin
    using LinearAlgebra
    BLAS.set_num_threads(1)  # one BLAS thread per worker
    function expensive_fun()
        # some inhomogeneous task
    end
end
pmap(_ -> expensive_fun(), 1:N)

Since the task durations are not homogeneous, if I start with 10 workers I might have only 2 still busy after 10 minutes, and those 2 might take another 60 minutes. Is there a way to hand the cores freed by the finished workers to the 2 remaining tasks as BLAS threads?
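To make the goal concrete: with 10 cores and 2 tasks still running, each remaining worker could in principle use 10 ÷ 2 = 5 BLAS threads. Below is a minimal sketch of that allocation rule only. The helper name `blas_threads_for` is hypothetical, and how a worker would learn the number of still-active workers is exactly the open question.

```julia
using LinearAlgebra

# Hypothetical helper (not part of any library): split `total_cores`
# evenly among the workers that are still busy, never going below 1.
blas_threads_for(total_cores::Integer, active_workers::Integer) =
    max(1, total_cores ÷ max(active_workers, 1))

# With 10 cores: 10 busy workers get 1 thread each, 2 busy workers get 5 each.
for active in (10, 5, 2, 1)
    n = blas_threads_for(10, active)
    println(active, " active workers -> ", n, " BLAS threads each")
end

# A remaining worker would then apply its share before its next BLAS-heavy call.
BLAS.set_num_threads(blas_threads_for(10, 2))
```

The integer division deliberately rounds down, so the total number of BLAS threads never exceeds the number of cores.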

Minimal working example:

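using Distributed
addprocs(10)  # 10 local workers, as in the scenario described above

@everywhere begin
    using LinearAlgebra
    using ITensors, ITensorMPS
    using Random
    Random.seed!(1234)       # same seed on every process
    BLAS.set_num_threads(1)  # one BLAS thread per worker
    const sites = siteinds("S=1/2", 100)
    const mpo = random_mpo(sites)
    function random_evolve()
        # 70% cheap tasks (linkdim 256), 30% expensive tasks (linkdim 2048)
        linkdim = rand([fill(256, 7)..., fill(2048, 3)...])
        @info "evolving with" linkdim
        mps = random_mps(sites; linkdims=linkdim)
        apply(mpo, mps; cutoff=1e-13)
    end
end

pmap(_ -> random_evolve(), 1:10)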
