Loading packages in a parallel thread

heliosdrm · August 27, 2021, 8:00am

Sometimes there are packages that are not needed in the first steps of a script, so performance can be improved by running those first steps and loading the “secondary” packages in parallel. For instance:

t = Threads.@spawn firststeps()
using Plots # not needed at all for `firststeps`
result = fetch(t)
plot(result)

However, I’m used to loading packages first, before running anything, so I feel as if there were something wrong with this. (It’s not possible to do it the other way round: using must be done at top level, so I cannot @spawn it.)

So my question is: is there really something wrong with this approach, or is it ok?
Is it a good idea to load packages in parallel with other computations, as long as those packages are not yet needed?

Sukera · August 27, 2021, 9:41am

It should be fine. As far as I know, a spawned task runs in the world age it was spawned in. A using on the main thread only increases the world age, so it shouldn’t interfere with code that is already running.
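A minimal sketch of the positive case, assuming firststeps is a stand-in defined here purely for illustration and Statistics plays the role of the “secondary” package. The task only calls functions that already existed when it was spawned, so the later using should not affect it:

```julia
# Hypothetical stand-in for the real first steps; it only calls
# functions that already exist when the task is spawned.
firststeps() = sum(sqrt(i) for i in 1:10^6)

t = Threads.@spawn firststeps()   # can run on another thread if Julia was started with several
using Statistics                  # top-level load that happens while the task is scheduled
result = fetch(t)                 # wait for the task and collect its return value

# Back at top level the newest world age applies, so Statistics is usable here.
println(mean([result, 2 * result]))
```

With a single thread (the default unless Julia is started with -t or JULIA_NUM_THREADS), this still works, but the task and the package loading cannot actually overlap.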

See this for a negative example (trying to access something that is only available in the future from the POV of the spawned task):

julia> f() = begin sleep(10); g() end

julia> t = Threads.@spawn f()
Task (runnable) @0x00007f6de219ce70

julia> g() = "hello"
g (generic function with 1 method)

julia> fetch(t)
ERROR: TaskFailedException
Stacktrace:
 [1] wait
   @ ./task.jl:322 [inlined]
 [2] fetch(t::Task)
   @ Base ./task.jl:337
 [3] top-level scope
   @ REPL[9]:1

    nested task error: MethodError: no method matching g()
    The applicable method may be too new: running in world age 31218, while current world is 31219.
    Closest candidates are:
      g() at REPL[8]:1 (method too new to be called from this world context.)
    Stacktrace:
     [1] f()
       @ Main ./REPL[4]:1
     [2] (::var"#1#2")()
       @ Main ./threadingconstructs.jl:178

Without defining g() at all, you get a regular MethodError from the spawned task, since no such method exists in any world, not even a later one. “World age” is one of the reasons why Julia can be dynamic with eval and still be compiled instead of interpreted.
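If a task really does need to call a method that is defined after it was spawned, Base.invokelatest runs the call in the newest world age instead of the task’s own. A rough sketch, where the names g_late and f_latest are made up for this example and the function name is declared up front (with no methods) so that only the method, not the name, is “too new”:

```julia
# Declare the function name with no methods yet, so the name itself exists
# before the task is spawned (an assumption made for this sketch).
function g_late end

# Base.invokelatest performs the call in the newest world age,
# rather than in the world age the task is running in.
f_latest() = begin sleep(1); Base.invokelatest(g_late) end

t = Threads.@spawn f_latest()
g_late() = "hello"      # method added after the task was spawned
println(fetch(t))       # the call goes through invokelatest, so the fetch should succeed
```

This comes at the cost of a dynamic dispatch on every such call, so it is better suited to occasional calls than to hot loops.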
