Sometimes there are packages that are not needed in the first steps of a script, so performance can be improved by running those first steps and loading the “secondary” packages in parallel. For instance:
t = Threads.@spawn firststeps()
using Plots # not needed at all for `firststeps`
result = fetch(t)
However, I’m used to `using` packages first, before running anything, so I feel as if there is something wrong with this. (It’s not possible to do it the other way round: `using` must be done at top level, and I cannot […])
So my question is: is there really something wrong with this approach, or is it ok?
Is it a good idea to load packages with `using` in parallel with other computations, as long as those packages are not yet needed?
It should be fine. As far as I know, spawned tasks run in the world age they’re spawned in, so `using` on the main thread, which increases the world age, shouldn’t interfere with already running code.
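For reference, here is a self-contained sketch of the pattern from the question. The body of `firststeps` is a made-up placeholder, and the standard library `Statistics` stands in for a heavy package such as `Plots`, so both are assumptions for illustration only. The two steps can only really overlap if Julia was started with more than one thread (for example `julia -t 2`).

```julia
# Hypothetical stand-in for the question's `firststeps`: any work that
# does not need the package loaded below.
firststeps() = sum(sqrt(i) for i in 1:1_000_000)

# Start the first steps on another task.
t = Threads.@spawn firststeps()

# Meanwhile, load a package at top level. `Statistics` is only a
# lightweight placeholder for something like `Plots`.
using Statistics

# Block until the spawned task is done, then use the freshly loaded package.
result = fetch(t)
println(mean([result, 2 * result]))
```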
See this for a negative example (trying to access something that is only available in the future from the POV of the spawned task):
julia> f() = begin sleep(10); g() end
julia> t = Threads.@spawn f()
Task (runnable) @0x00007f6de219ce70
julia> g() = "hello"
g (generic function with 1 method)
@ ./task.jl:322 [inlined]
@ Base ./task.jl:337
 top-level scope
nested task error: MethodError: no method matching g()
The applicable method may be too new: running in world age 31218, while current world is 31219.
Closest candidates are:
g() at REPL:1 (method too new to be called from this world context.)
@ Main ./REPL:1
@ Main ./threadingconstructs.jl:178
If `g()` is never defined, you get a regular `MethodError` from the spawned task, since there isn’t even one in the future. “World age” is one of the reasons why Julia can be dynamic with `eval` and still be compiled instead of interpreted.
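The same effect can be seen without any tasks. In this minimal sketch the names `late_g` and `call_too_early` are invented for the example. A running function keeps the world age it was called in, so a method added by `@eval` in the middle of the call is "too new" for a direct call, while `Base.invokelatest` explicitly dispatches in the newest world.

```julia
# Declare the function up front with no methods, so that only the method
# (not the name) is new when it gets defined later.
function late_g end

function call_too_early()
    # Adds a method at run time, which increases the global world age.
    @eval late_g() = "hello"

    # This call still dispatches in the older world that
    # `call_too_early` was called in, so the new method is not visible.
    direct = try
        late_g()
    catch err
        err
    end

    # `invokelatest` dispatches in the latest world and finds the method.
    latest = Base.invokelatest(late_g)
    return direct, latest
end

direct, latest = call_too_early()
println(direct isa MethodError)  # true
println(latest)                  # hello
```

Calling `late_g()` again from top level afterwards works normally, because each top-level statement runs in the newest world.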