Type stability and tasks

I’m looking into using tasks for some parallelization speedup, but I’m worried about type stability in the returned objects. For example, I’m calling a type-stable function over a range of Ints; the returned objects should all have the same type. All the discussion I’ve seen about this sort of thing is pretty old. Has Julia gotten better at inference? Is StableTasks.jl the thing to use?

You can use StableTasks.jl, or wrap the threaded part in a function and assert the return type; that may work. It shouldn’t really matter though.

I would just use OhMyThreads.jl, which does that in a safe way and gives you more control. It’s also the recommended way now.

Julia’s type inference mechanism isn’t actually the bottleneck here. If a non-parametric Task type wraps any (0-argument) function call with any return type, then functions like fetch that map a Task to return values are inherently type-unstable: fetch(::Task)::Any. You could always help the compiler afterward with type assertions like:

foo() = fetch(Threads.@spawn 1+1)::Int # either returns Int or errors

…but there is still that bit of fetch(::Task)::Any requiring a runtime type check Core.typeassert(::Any, Int). You can verify with @code_warntype and @code_llvm.
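A minimal sketch of that check, using only Base. The names unstable and asserted are made up for illustration; Base.return_types reports what inference concludes for each call:

```julia
using Base.Threads: @spawn

# Hypothetical helper names, for illustration only.
unstable() = fetch(@spawn 1 + 1)        # fetch(::Task) is inferred as Any
asserted() = fetch(@spawn 1 + 1)::Int   # the assertion narrows the result to Int

# Ask inference what each zero-argument call returns.
unstable_type = only(Base.return_types(unstable, Tuple{}))
asserted_type = only(Base.return_types(asserted, Tuple{}))

println("unstable: ", unstable_type)
println("asserted: ", asserted_type)
```

The assertion does not remove the runtime check; it only tells the compiler what to assume about everything that follows it.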

That said, you probably knew that already, and what you really meant to ask is whether Julia has introduced parametric tasks like StableTasks into the tasks API. It hasn’t, and since Task is documented as part of the existing API, a parametric version would need new signatures or calls, as in StableTasks. That leads to another implied question: practical Julia is usually very insistent on type stability, so how have people been using non-parametric Tasks this whole time? The answer is that fetch(::Task) isn’t the go-to approach. In the simplest case, a call that fetches from a single task idles the caller (or its task) just to schedule another task, which isn’t a throughput improvement over calling the underlying function directly (though dividing work into more Tasks generally improves concurrency). Instead, tasks share memory via Channels, locks, and atomics, which can be type-stable via explicitly computed parameters. StableTasks’ internal metaprogramming actually makes the StableTask{T} and its internal nothing-returning Task share an @atomic x::T field, which fetch(::StableTask)::T retrieves. The abstraction of StableTasks inferring T for you, like a normal function call, can be easier in some cases and is open to internal improvements without API changes.
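As a rough illustration of sharing results through a typed Channel instead of fetch, here is a small sketch. The function sum_squares is a made-up example, and the buffer size is chosen so that put! never blocks:

```julia
# Hypothetical example: n tasks each push one Int into a shared, typed channel.
function sum_squares(n::Int)
    results = Channel{Int}(n)      # buffered with room for every result
    @sync for i in 1:n
        Threads.@spawn put!(results, i^2)
    end
    total = 0
    for _ in 1:n
        total += take!(results)    # take! on a Channel{Int} yields an Int
    end
    return total
end

println(sum_squares(4))
```

Because the element type is a parameter of the Channel, the consumer never has to go through an Any-typed result the way fetch(::Task) does.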

I guess it’s important to mention that most implementations of a Task-like thing would force boxing the value, and something like foo() = fetch(Threads.@spawn 1+1)::Int has little runtime cost compared to all the other overheads of spawning, scheduling, and waiting on a task. (A type assert like that is usually 2 CPU instructions.)

Note that this is exactly what StableTasks.jl does for you automatically. The point of doing this is not that it makes the fetch part faster (it doesn’t); the point is to stop the type instability from propagating outwards towards the rest of your program.
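To make the idea concrete, here is a hand-rolled sketch of a task wrapper that carries its result type as a parameter. This is not StableTasks.jl’s implementation: TypedTask and typed_spawn are hypothetical names, and unlike StableTasks the caller has to supply T by hand rather than having it inferred:

```julia
# Hypothetical wrapper: remembers the expected result type T of the wrapped Task.
struct TypedTask{T}
    task::Task
end

# The caller states T explicitly (an assumption of this sketch).
typed_spawn(f, ::Type{T}) where {T} = TypedTask{T}(Threads.@spawn f())

# fetch still reads an Any-typed result from the Task, but the assertion
# means callers of fetch(::TypedTask{T}) see a value of type T.
Base.fetch(t::TypedTask{T}) where {T} = fetch(t.task)::T

t = typed_spawn(() -> 1 + 1, Int)
println(fetch(t))
```

The runtime check inside fetch is still there; what changes is that code downstream of fetch can be compiled for a concrete T.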

For example, in OhMyThreads.jl we use this to make reductions type stable for outside users:

julia> using OhMyThreads

julia> code_typed(Tuple{Float64, Vector{Float64}}) do x, v
           x + @noinline tmapreduce(sin, +, v)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = dynamic invoke Main.tmapreduce(Main.sin::Function, Main.:+::Function, v::Vector{Float64})::Float64
│   %2 = intrinsic Base.add_float(x, %1)::Float64
└──      return %2
) => Float64

This means that if you have code that calls tmapreduce, your code can still specialize on the return type of the reduction. Even if there’s something unstable happening inside, the instability doesn’t propagate outwards and infect the rest of your code. On the other hand, earlier packages such as ThreadsX will make any code calling ThreadsX.mapreduce unstable:

julia> using ThreadsX

julia> code_typed(Tuple{Float64, Vector{Float64}}) do x, v
           x + @noinline ThreadsX.mapreduce(sin, +, v)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 =    invoke ThreadsX.mapreduce(Main.sin::Function, Main.:+::Function, v::Vector{Float64})::Any
│   %2 =   dynamic (x + %1)::Any
└──      return %2
) => Any
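If you are stuck with a reduction whose result is inferred as Any, a caller-side assertion gives a similar containment. The sketch below uses only Base; chunked_mapreduce and stable_sum_sin are hypothetical names, and the input is assumed to be non-empty:

```julia
# Hypothetical threaded reduction built on plain Tasks; assumes v is non-empty.
function chunked_mapreduce(f, op, v::Vector{Float64}; nchunks = Threads.nthreads())
    chunk_len = cld(length(v), nchunks)
    chunks = Iterators.partition(eachindex(v), chunk_len)
    tasks = [Threads.@spawn mapreduce(f, op, view(v, idx)) for idx in chunks]
    return mapreduce(fetch, op, tasks)   # goes through fetch(::Task), so not concretely inferred
end

# The assertion acts as a barrier: callers of stable_sum_sin see Float64.
stable_sum_sin(v::Vector{Float64}) = chunked_mapreduce(sin, +, v)::Float64

v = collect(range(0.0, 1.0; length = 1000))
println(stable_sum_sin(v))
```

This is the same trick as the one-line foo() example earlier in the thread, just applied at the boundary of a whole reduction.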

I know it’s not the question, but I hope it will be trimmable some day:

using OhMyThreads

function @main(ARGS::Vector{String})::Cint
    N = parse(Int,ARGS[1])
    A = rand(N)
    s::Float64 = @noinline tmapreduce(cos,+,A)
    println(Core.stdout,s)
    return 0
end

For now:

$ juliac --project="." --jl-option=threads=auto --output-exe app  main.jl --trim

◓ Compiling...Verifier error #1: unresolved invoke from statement Main.tmapreduce(Main.cos, Main.:+, %new()::Vector{Float64})::Float64
Stacktrace:
 [1] main(ARGS::Vector{String})
   @ Main ~/Bureau/test_app/main.jl:6
 [2] _main(argc::Int32, argv::Ptr{Ptr{Int8}})
   @ Main ~/.julia/packages/JuliaC/StMkx/src/scripts/juliac-buildscript.jl:77

Verifier error #2: unresolved invoke from statement Main.tmapreduce(Main.cos, Main.:+, %new()::Vector{Float64})::Float64
Stacktrace:
 [1] main(ARGS::Vector{String})
   @ Main ~/Bureau/test_app/main.jl:6
 [2] _main(argc::Int32, argv::Ptr{Ptr{Int8}})
   @ Main ~/.julia/packages/JuliaC/StMkx/src/scripts/juliac-buildscript.jl:77

Trim verify finished with 2 errors, 0 warnings.

I tried specifying the output type manually, along with other arguments, but for now it’s not possible.
Even in unsafe mode:

$ juliac --project="." --jl-option=threads=auto --output-exe app  main.jl --trim=unsafe
✓ Compiling...
$ ./app 10000
Internal error: during type inference of
tmapreduce(Core.Function, Core.Function, Core.Array{Core.Float64, 1})
Encountered unexpected error in runtime:
Core.MethodError(f=Base.Compiler.var"#typeinf_ext_toplevel"(), args=(tmapreduce(Core.Function, Core.Function, Core.Array{Core.Float64, 1}) from tmapreduce(Any, Any, Any...), 0x0000000000009793, 0x01, 0x00), world=0x0000000000002c84)
jl_method_error_bare at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:3043
jl_method_error at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:3061
jl_lookup_generic_ at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:4184 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:4210
jl_apply at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/julia.h:2391 [inlined]
jl_type_infer at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:466
jl_compile_method_internal at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:3516
_jl_invoke at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:4006 [inlined]
ijl_invoke at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:4021
main at /home/yolhan/Bureau/test_app/main.jl:6
_main at /home/yolhan/.julia/packages/JuliaC/StMkx/src/scripts/juliac-buildscript.jl:77
julia__main_3309_gfthunk at ./app (unknown line)
main at ./app (unknown line)
unknown function (ip: 0x768c89e2a1c9) at /lib/x86_64-linux-gnu/libc.so.6
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at ./app (unknown line)
fatal: error thrown and no exception handler available.
Core.MissingCodeError(mi=tmapreduce(Core.Function, Core.Function, Core.Array{Core.Float64, 1}) from tmapreduce(Any, Any, Any...))
jl_compile_method_internal at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:3550
_jl_invoke at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:4006 [inlined]
ijl_invoke at /cache/build/builder-amdci5-4/julialang/julia-release-1-dot-12/src/gf.c:4021
main at /home/yolhan/Bureau/test_app/main.jl:6
_main at /home/yolhan/.julia/packages/JuliaC/StMkx/src/scripts/juliac-buildscript.jl:77
julia__main_3309_gfthunk at ./app (unknown line)
main at ./app (unknown line)
unknown function (ip: 0x768c89e2a1c9) at /lib/x86_64-linux-gnu/libc.so.6
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at ./app (unknown line)

If I recall correctly you can do it if you first trace the execution with PrecompileTools.jl to collect all the necessary precompile statements, and then run JuliaC and inform it not to remove the specified statements you hit.

I forget the right way to do it though, and I’m not at my computer currently.