# start a new julia REPL
julia> function work(b)
           sleep(3)
           println(b)
       end;
julia> function main()
           t = Task(() -> work(c))
           schedule(t)
           c = 9
           wait(t)
       end;
julia> main()
9
As you can see, c is not yet defined when schedule(t) is called. I would expect an ERROR at schedule(t), am I right?
This comparative test is even more puzzling:
julia> function work(b)
           sleep(3)
           println(b)
       end;
julia> function main()
           t = Task(() -> work(c))
           schedule(t)
           println("I merely insert one line here")
           c = 9
           wait(t)
       end;
julia> main()
I merely insert one line here
ERROR: TaskFailedException
Stacktrace:
[1] wait(t::Task)
@ Base ./task.jl:370
[2] main()
@ Main ./REPL[2]:6
[3] top-level scope
@ REPL[3]:1
nested task error: UndefVarError: `c` not defined in local scope
Suggestion: check for an assignment to a local variable that shadows a global of the same name.
Stacktrace:
[1] (::var"#1#2")()
@ Main ./REPL[2]:2
julia> Threads.nthreads(:default), Threads.nthreads(:interactive)
(255, 1)
julia> versioninfo()
Julia Version 1.11.6
There are no promises about when () -> work(c) will be called. Doing schedule(t) only schedules the task but it might not actually start for a while, so whether or not it crashes depends on whether it runs before or after c gets defined.
In your first case, c is defined immediately after schedule(t), so will usually be defined before the task actually starts, so no error occurs.
In the second case, doing some IO usually ends up switching between tasks because IO is slow, so the scheduler can do other work while the IO is happening. This means that the task actually got started before c was defined, hence the error.
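To make the timing dependence reproducible, here is a minimal sketch that replaces the println with an explicit yield(). The names run_with and yield_first are made up for illustration, and the sketch assumes the task is created with Task(...) and scheduled from the same thread, so that it waits in the queue until the current task yields.

```julia
# Hypothetical helper: `c` is a local that the closure captures before it is assigned.
function run_with(yield_first::Bool)
    t = Task(() -> c + 1)     # captures the (still unassigned) local `c`
    schedule(t)               # only enqueues `t`; it has not run yet
    yield_first && yield()    # lets the scheduler run `t` before `c` exists
    c = 9
    try
        fetch(t)              # returns the task's result, or rethrows its failure
    catch err
        err
    end
end

no_yield   = run_with(false)  # the task first runs inside fetch, after `c = 9`
with_yield = run_with(true)   # the task first runs inside yield(), before `c = 9`
```

Under these assumptions, no_yield holds the task's result and with_yield holds a TaskFailedException, which mirrors the two REPL sessions in the question.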
schedule(t) only does a push!(workqueue, t) and returns, where workqueue is an internal queue of tasks belonging to the scheduler. The next time the scheduler runs, it finds the task and starts it. This typically happens when the current task enters some wait state, such as I/O, sleep, yield, wait, lock or similar. (It is really not specified when this can happen; I think the docs state that it can happen at any time.)
Anyway, when a task throws an error, the exception is just stored in its task struct before it terminates, and rethrown when a fetch or wait is performed. So you will not see it until a wait or fetch.
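A small sketch of that behaviour, with a made-up function name failing_task_demo: the task is guaranteed to fail, yet schedule returns normally and the failure only shows up at wait.

```julia
# Hypothetical example: the failure is recorded in the task, not thrown at schedule().
function failing_task_demo()
    t = Task(() -> error("boom"))
    schedule(t)                       # returns normally even though `t` will fail
    started_early = istaskstarted(t)  # expected to be false: `t` is queued, not running
    caught = try
        wait(t)                       # the stored exception surfaces here
        nothing
    catch err
        err
    end
    return started_early, caught, istaskfailed(t)
end

started_early, caught, failed = failing_task_demo()
```

Here caught is the TaskFailedException wrapper that also appears in the stack trace in the question; the original error is nested inside it.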
What I meant was that the argument to the work function should be fixed at the moment the task t is scheduled. Otherwise it is super weird: I have no idea what c really is, i.e. I can no longer tell what the code I write will end up doing.
For instance
task_vec = []
for i = [10, 20, 30]
    for j = i+1:i+5
        t = Task(() -> work(j))
        schedule(t)
        push!(task_vec, t)
        println("i = $i, j = $j")
    end
end
If j is cut-and-dried at schedule(t), then I really know what my code is doing.
Otherwise, if the task defined at i = 10 only starts when i = 20, then the previous j is lost, no?
There is also the additional point that you’re (sometimes) allowed to reference variables before (in terms of location in code) they are declared, which is not related to multi-threading. E.g.
julia> function main()
           inc_c() = c += 1 # What is c at this point?
           c = 1
           inc_c()
           println(c)
       end;
julia> main()
2
c is a local variable in main. The closure () -> work(c) captures c. In Julia, capturing a variable does not make an implicit copy. The time at which a schedule(task) call happens is not relevant.
This is a different situation. The task contains the closure () -> work(j) where j is assigned anew for every iteration. This closure is really a callable immutable struct with one field j which is set when the closure is created. Like
struct uniquename
    j::Int
end
(s::uniquename)() = work(s.j)
...
t = Task(uniquename(j))
In the former case (the c in main), c becomes an untyped “boxed” variable, which is initially undefined but can be assigned later (which it is). If it happens to be defined before it is used in the task, everything will be fine.
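The difference between the two cases can be sketched without any tasks at all. The function names per_iteration and shared_local are made up for this example; the first captures the loop variable, the second captures a local that is reassigned.

```julia
# Hypothetical helpers contrasting a per-iteration loop variable with a shared local.
function per_iteration()
    fs = Function[]
    for j in 1:3
        push!(fs, () -> j)    # `j` is a fresh binding in every iteration
    end
    return [f() for f in fs]
end

function shared_local()
    fs = Function[]
    k = 0
    for j in 1:3
        k = j                 # `k` is one variable, reassigned each time (boxed)
        push!(fs, () -> k)
    end
    return [f() for f in fs]
end

fresh  = per_iteration()
shared = shared_local()
```

Each closure in per_iteration keeps the j it was created with, while all closures in shared_local see the last value assigned to k.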
Great, now who’s going to explain why boxing happens
Basically, Julia’s frontend boxes all local variables that get captured by a closure and may get reassigned. In other words, to be able to support reassignment of local variables that get captured, Julia’s frontend changes the type of the local variable to a mutable type with a single mutable field, so, thereafter, Julia just sees mutation, not reassignment.
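One way to see this is to look at the field types of the closure objects. This is only a sketch: Core.Box is an internal implementation detail that may change between Julia versions, and make_counter and make_adder are made-up names.

```julia
function make_counter()
    c = 0
    inc() = (c += 1)   # `c` is captured and reassigned, so it gets boxed
    return inc
end

function make_adder(n)
    return x -> x + n  # `n` is captured but never reassigned, so no box is needed
end

inc  = make_counter()
add2 = make_adder(2)

# Closure fields are named after the variables they capture.
counter_field = fieldtype(typeof(inc), :c)
adder_field   = fieldtype(typeof(add2), :n)
```

On current Julia versions counter_field is Core.Box, while adder_field is the concrete type of n.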
I can somewhat understand what your code is doing, but I cannot get your overall meaning. (Since I have not written any packages, I have almost zero experience with struct, module, etc., because as a user I do not need to define them myself.)
The built-in syntax for functions in Julia is just syntax sugar for:
defining a new type, if it is a new function;
adding a method to the type, making it callable.
In Julia, any type can be callable: just add a method.
In Julia, new types can be struct or mutable struct (or abstract type, but obviously those do not have instances, so they are not relevant here).
TBH I’m not completely sure what @sgaure’s point was either. I think they’re saying that some closures get lowered to a mutable struct, but I’m not sure if that’s true.
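For readers who have never defined a struct, the points above can be sketched as follows. The type Scale and the function square are made up for this example.

```julia
# Hypothetical type: any struct becomes callable once a method is attached to it.
struct Scale
    factor::Int
end
(s::Scale)(x) = s.factor * x

double = Scale(2)
doubled = double(21)          # calls the method defined above

# An ordinary function also has its own type, just without any fields.
square(x) = x^2
square_type = typeof(square)
```

A closure sits between the two: it is a function whose type has one field per captured variable.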
I thought for and let were somewhat alike, since they both introduce a local scope.
But now I believe let is simpler and thus superior, because it does not have the Union that for has:
julia> function test_local_scope()
           let k = 1
           end
       end;
julia> @code_warntype test_local_scope()
Locals
k::Int64
julia> function test_local_scope()
           for k = 1
           end
       end;
julia> @code_warntype test_local_scope()
Locals
@_2::Union{Nothing, Tuple{Int64, Nothing}}
As you can see, there is a Union (highlighted in yellow) associated with for, which is not the case with let.
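The Union in the for version comes from the iteration protocol: a for loop is lowered to calls to iterate, which returns either nothing or an (item, state) tuple. A rough hand-written equivalent, with the made-up name loop_by_hand, could look like this:

```julia
function loop_by_hand(itr)
    seen = Any[]
    next = iterate(itr)              # either `nothing` or an (item, state) tuple
    while next !== nothing
        (k, state) = next
        push!(seen, k)
        next = iterate(itr, state)
    end
    return seen
end

first_step  = iterate(1)             # a number iterates over itself exactly once
second_step = iterate(1, nothing)
```

So the hidden variable holding next is what shows up as Union{Nothing, Tuple{Int64, Nothing}}; let has no such state because it does not iterate.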
The intent of for is for repeated execution of code. The intent of let is to create a scope. Make your code more readable and maintainable by using the language constructs for their intended purpose, rather than for their side effects. Seeing a for loop abused as you do above will likely make most readers wonder at the point of it.
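As an illustration of let used for its intended purpose, here is a small sketch in which let creates a new binding that a closure can capture as a fixed value. The function name snapshot_with_let is made up for this example.

```julia
function snapshot_with_let()
    c = 1
    f = let c = c          # new inner `c`, initialized from the outer one
        () -> c            # captures the inner binding, which is never reassigned
    end
    c = 2                  # rebinding the outer `c` does not affect `f`
    return f(), c
end

snapshot, current = snapshot_with_let()
```

The closure keeps the value that c had when the let block was entered, which is the "decided at creation time" behaviour asked for earlier in the thread.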
function work(b)
    sleep(2)
    println(b)
end
for i = [10, 20, 30]
    for j = i+1:i+4
        t = Task(() -> work(j))
        schedule(t)
    end
    sleep(1)
end
I “spawned” 3 * 4 tasks here.
The first 4 tasks are scheduled when i = 10, surely.
But the first 4 tasks might be executed at a time when i == 20 or i == 30, when j no longer has its old values, e.g. j is now 21, 22, 23, 24.
Will the first 4 tasks still use the old j values, i.e. 11, 12, 13, 14? Why is this guaranteed?
Imagine it is expanded as
i = 10
for j = [11, 12, 13, 14]
    body1(j)
end
i = 20
for j = [21, 22, 23, 24]
    body2(j)
end
i = 30
for j = [31, 32, 33, 34]
    body3(j)
end
Is the j passed to body1 the same variable as the j passed to body2?
I’m really confused.
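One way to check this empirically is to let every task report the j it sees after the outer loop has moved on. The sketch below uses shorter sleeps than the example above and a Channel to collect the values; the function name collect_js is made up.

```julia
function collect_js()
    results = Channel{Int}(12)          # buffered, so put! does not block here
    tasks = Task[]
    for i in (10, 20, 30)
        for j in i+1:i+4
            t = Task() do
                sleep(0.2)              # finish only after the outer loop has moved on
                put!(results, j)        # report the `j` this closure captured
            end
            schedule(t)
            push!(tasks, t)
        end
        sleep(0.05)
    end
    foreach(wait, tasks)
    close(results)
    return sort!(collect(results))
end

seen_js = collect_js()
```

Every task reports the j of the iteration that created it, because the loop variable is a new binding in each iteration and the closure captures that binding.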
I have a proposition, based on the discussion above:
the following line 1 and line 2 are completely equivalent operations.
... some background
v = rand(8)
i = 7
for j = view(v, 1:i) Threads.@spawn work(j) end # line 1
for ȷ = i:i Threads.@spawn work($(v[ȷ])) end # line 2
... some background
If there is no disagreement, I think this topic is coming to an end.
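For reference, the effect of $ inside Threads.@spawn can be sketched with a local that is reassigned after the tasks are created. The function name interp_demo is made up, and the sketch relies on the 0.1 second sleep being long enough for the reassignment to happen first, which is an assumption about timing rather than a guarantee.

```julia
function interp_demo()
    x = 1
    t_ref = Threads.@spawn begin
        sleep(0.1)
        x                  # reads the captured variable when the task gets here
    end
    t_val = Threads.@spawn begin
        sleep(0.1)
        $x                 # value of `x` copied when the task was created
    end
    x = 2                  # reassigned while both tasks are still sleeping
    return fetch(t_ref), fetch(t_val)
end

by_reference, by_value = interp_demo()
```

The task without $ sees the reassigned value, while the task with $ keeps the value from the moment it was created.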
I’ve gained some deeper understanding (I hope so):
Taking this as an example
julia> function main()
           for j = RANGE Threads.@spawn work( $(f(g(j))) ) end
       end;
julia> @code_lowered main()
...
2 ┄
│ %10 = Main.f
│ %11 = Main.g
│ %12 = j
│ %13 = (%11)(%12)
│ %14 = (%10)(%13) # [COMMENT] THIS IS a _SNAPSHOT_ OF `f(g(j))`
│ 230 = %14
│ %16 = Base.Threads.Task
│ %17 = Main.:(var"#3#4") # [COMMENT] THIS IS `work` !
│ %18 = 230
│ %19 = Core.typeof(%18)
│ %20 = Core.apply_type(%17, %19)
│ %21 = 230
│ #3 = %new(%20, %21)
│ %23 = #3
│ task = (%16)(%23) # [COMMENT] TASK CREATED
....
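I can see that the consequence of using $ is the cut-and-dried static value %14. It is stored in the temporary 230, read back as %18 and %21, and passed to %new, so it ends up as a field of the closure object #3 (%23). Therefore, the snapshot value of f(g(j)) is already decided when the respective task is created.
Note that %17 is not work itself, despite my comment in the listing: var"#3#4" is the type of the generated closure, which Core.apply_type parameterizes with the type of the captured value before %new instantiates it. work is not captured at all; it is a global name that is looked up only when the closure body runs. Whether an already created task would see a later redefinition of work is a separate question that this listing does not answer.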