Implementing option to profile a script as environment variable or command line argument

This is more of a stylistic “what’s the most elegant way to do this?” question than a functionality question. Basically, let’s say we have a script that runs some computation and saves some outputs for us, and we’re currently working on developing the script. Here’s my example:

using Base.Threads: @threads

# block of computation
res = Vector{Float32}(undef, 1_000)
@threads for i in 1:1_000
    sleep(0.01)  # something in here takes a while to run
    res[i] = 42
end

# output the results of the computation (maybe something like save to file in practice)
println("The results are $(join(res[1:10], ','))...")

Now, to aid in optimizing the script, I want to profile the parallelized loop in the middle and look at a flame graph. To do this, I could edit the script to remove threading and enclose the key loop in @profile(...) like this:

import ProfileSVG
using Profile: @profile, Profile
# using Base.Threads: @threads

Profile.clear()
res = Vector{Float32}(undef, 1_000)
@profile(
    # block of computation
    for i in 1:1_000
        sleep(0.01)  # something in here takes a while to run
        res[i] = 42
    end
)
ProfileSVG.save("flamegraph.svg")

# output the results of the computation (maybe something like save to file in practice)
println("The results are $(join(res[1:10], ','))...")

But then I’d have to change the script to toggle profiling. I am looking for a better way, something like having the script look for an environment variable and profile conditionally on that. I thought I should reach out to see if folks in the community have tips for a better workflow they could share. Maybe there is some syntactic magic for conditionally enabling/disabling the @profile and @threads macros? I’m thinking the ideal approach would look something like this:

import ProfileSVG
using Profile: @profile, Profile
using Base.Threads: @threads

# ENV values are strings, so parse the flag into a Bool
const IS_PROFILED = parse(Bool, get(ENV, "PROFILE_THE_SCRIPT", "false"))

Profile.clear()
res = Vector{Float32}(undef, 1_000)
@profile(;disable=!IS_PROFILED)(
    # block of computation
    @threads(;disable=IS_PROFILED) for i in 1:1_000
        sleep(0.01)  # something in here takes a while to run
        res[i] = 42
    end
)
if IS_PROFILED
    ProfileSVG.save("flamegraph.svg")
end

# output the results of the computation (maybe something like save to file in practice)
println("The results are $(join(res[1:10], ','))...")

But maybe there’s also just a much better workflow for occasionally profiling a script you’re developing to keep a sense of what parts are slow and catch performance-killing mistakes? All suggestions are greatly appreciated!

All of your script should be inside a main function anyway. You can just have this bit at the end:

# ENV values are strings, so parse the flag into a Bool before using it in `if`
const IS_PROFILED = parse(Bool, get(ENV, "PROFILE_THE_SCRIPT", "false"))
if IS_PROFILED
    Profile.clear()
    @profile main(ARGS)
    ProfileSVG.save("flamegraph.svg")
else
    main(ARGS)
end

Or even define a function to do it:

function conditional_profile(f, args...; kwargs...)
    # ENV values are strings, so parse the flag into a Bool before using it in `if`
    should_profile = parse(Bool, get(ENV, "PROFILE_THE_SCRIPT", "false"))
    if should_profile
        Profile.clear()
        @profile f(args...; kwargs...)
        ProfileSVG.save("flamegraph.svg")
    else
        f(args...; kwargs...)
    end
end

and then call

conditional_profile(main, ARGS)
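
Since the thread title also mentions a command-line argument, here is a minimal sketch of a switch that accepts either source. The helper name `should_profile` and the `--profile` flag are assumptions made up for illustration; only the `PROFILE_THE_SCRIPT` variable comes from the posts above.

```julia
# Hypothetical helper: decide whether to profile from either a command-line
# flag or an environment variable. The flag name `--profile` is an assumption.
function should_profile(args::AbstractVector{<:AbstractString}, env=ENV)
    # An explicit command-line flag takes precedence.
    "--profile" in args && return true
    # Otherwise fall back to the environment variable. ENV values are strings,
    # so they are parsed; anything that is not a valid Bool counts as false.
    value = get(env, "PROFILE_THE_SCRIPT", "false")
    parsed = tryparse(Bool, value)
    return parsed === nothing ? false : parsed
end
```

With this in place, the script could call `should_profile(ARGS)` once and branch on the result, so that both `julia script.jl --profile` and `PROFILE_THE_SCRIPT=true julia script.jl` enable profiling (the script name is a placeholder).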

Ah, you’re so correct that it’s cleaner to define a main() function to separate script config from script functionality. I think it looks and works quite nicely:

import ProfileSVG
using Profile: @profile, Profile
using Base.Threads: @threads


const IS_PROFILED = parse(Bool, (get(ENV, "PROFILE_THE_SCRIPT", "false")))

function main()
    res = Vector{Float32}(undef, 1_000)
    # block of computation
    @threads for i in 1:1_000
        sleep(0.01)  # something in here takes a while to run
        res[i] = 42
    end
    # output the results of the computation (maybe something like save to file in practice)
    println("The results are $(join(res[1:10], ','))...")
end

if IS_PROFILED
    Profile.clear()
    @profile main()
    ProfileSVG.save("flamegraph.svg")
else
    main()
end

The one thing this approach skips over is disabling parallelization. Upon actually running my demo script, I found that it seems the profiling actually does handle the threading just fine, so there seems no reason to disable it in this example. However, I would appreciate confirmation from someone more knowledgeable than I that wrapping multi-threaded code in @profile shouldn’t cause any unwelcome side effects. I’m also just curious if there is a better way to disable macros like @threads than the following:


import ProfileSVG
using Profile: @profile, Profile
using Base.Threads: @threads


const IS_PROFILED = parse(Bool, (get(ENV, "PROFILE_THE_SCRIPT", "false")))

function heavy_work(input)
    sleep(0.01)
    return 42
end

function main(;multithreading=true)
    res = Vector{Float32}(undef, 1_000)
    # block of computation
    if multithreading
        @threads for i in 1:1_000
            res[i] = heavy_work(i)   # something in here takes a while to run
        end
    else
        for i in 1:1_000
            res[i] = heavy_work(i)   # something in here takes a while to run
        end
    end 
    # output the results of the computation (maybe something like save to file in practice)
    println("The results are $(join(res[1:10], ','))...")
end

if IS_PROFILED
    Profile.clear()
    @profile main(;multithreading=false)
    ProfileSVG.save("flamegraph.svg")
else
    main()
end
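
One way to avoid repeating the loop body, sketched here under the assumption that the body can be wrapped in a function, is to write the if/else once in a small helper. The name `foreach_index` is hypothetical and not part of Base; underneath, it is still the same conditional as above.

```julia
using Base.Threads: @threads

# Hypothetical helper (not part of Base): run `f(i)` for each index,
# threaded or serial depending on a run-time keyword.
function foreach_index(f, indices; threaded::Bool=true)
    if threaded
        @threads for i in indices
            f(i)
        end
    else
        for i in indices
            f(i)
        end
    end
    return nothing
end

res = Vector{Float32}(undef, 100)
foreach_index(1:100; threaded=false) do i
    res[i] = 42   # stand-in for the slow per-item work
end
```

The `do` block keeps the call site looking like a loop, and the `threaded` keyword can be driven by a flag such as `!IS_PROFILED`.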

Update! It seems that to make the @threads macro (or any other macro, for that matter) conditional, we need to either write the clunky conditional above or turn to more macros to rewrite the conditional for us automatically. There doesn’t seem to be another option, since macros rewrite the code at parse time, before a run-time flag like IS_PROFILED has a value. Here’s what I came up with.

Contents of example.jl:

using Base.Threads: @threads

threading_arg = get(ARGS, 1, "")
println("Threading arg: $threading_arg")

macro conditional_on(condition, macroed_expr)
    macro_symbol, _, unmacroed_expr = macroed_expr.args
    macro_warning = "Macro `$(string(macro_symbol))` disabled since `$(string(condition))` == false"
    return quote
        if $condition
           $macroed_expr
        else
            @debug $macro_warning
            $unmacroed_expr
        end
    end
end


@time @conditional_on threading_arg == "threading" @threads for i in 1:10
    sleep(0.1)
end
println("Done")

Running example.jl:

$ JULIA_DEBUG=Main julia --threads=auto example.jl
Threading arg: 
┌ Debug: Macro `@threads` disabled since `threading_arg == "threading"` == false
└ @ Main example.jl:13
  1.158918 seconds (302.58 k allocations: 14.439 MiB)
Done
$ JULIA_DEBUG=Main julia --threads=auto example.jl threading
Threading arg: threading
  0.252536 seconds (49.44 k allocations: 2.756 MiB)
Done
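
To illustrate why a macro cannot simply read a run-time flag, here is a tiny sketch. The macro `@describe_arg` is made up for this example; it reports the type of what it is handed, and that report is computed when the macro is expanded.

```julia
# Hypothetical macro for illustration: it returns a string describing the
# argument it received. The string is computed at macro-expansion time.
macro describe_arg(x)
    return string(typeof(x))
end

flag = true
seen_variable = @describe_arg flag          # the macro receives the Symbol :flag
seen_call = @describe_arg flag == true      # the macro receives an Expr
seen_literal = @describe_arg true           # only literals arrive as values
```

The macro sees the name `flag`, not the value `true`, which is why `@conditional_on` has to emit an `if` that is evaluated later instead of branching itself.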

I only read the macro documentation today, and I can’t say I understand all the implications of hygiene yet. I did realize that I need to esc() my expression in the macro, and I worry that by doing so I potentially create clashes with variables named macro_symbol and unmacroed_expr. Still, I think I got the @conditional_on macro working alright for my use-case at least.
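
On the clash worry: as far as I understand macro hygiene, `macro_symbol` and `unmacroed_expr` are ordinary local variables of the macro body, used while the macro is expanded, and their names never appear in the generated code. What `esc` exposes to the caller's scope are names written inside the returned expression. The sketch below is a small check of that, using `@inbounds` in place of `@threads` and a hypothetical `demo` function that has its own variable called `macro_symbol`; the `@debug` line is left out to keep it short.

```julia
macro conditional_on(condition, macroed_expr)
    # These locals exist only while the macro is being expanded.
    macro_symbol, _, unmacroed_expr = macroed_expr.args
    return esc(quote
        if $condition
            $macroed_expr
        else
            $unmacroed_expr
        end
    end)
end

# Hypothetical caller that happens to use the same variable name as the macro.
function demo(flag)
    macro_symbol = "caller's own variable"
    total = 0
    @conditional_on flag @inbounds for i in 1:3
        total += i
    end
    return macro_symbol, total
end
```

Both `demo(true)` and `demo(false)` leave the caller's `macro_symbol` untouched and update the caller's `total`, which is the behavior `esc` is there to provide.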

Combining this macro with the refactoring advice of @Henrique_Becker, I can now give a full answer to my initial question:

import ProfileSVG
using Profile: @profile, Profile
using Base.Threads: @threads


const IS_PROFILED = parse(Bool, (get(ENV, "PROFILE_THE_SCRIPT", "false")))

macro conditional_on(condition, macroed_expr)
    macro_symbol, _, unmacroed_expr = macroed_expr.args
    macro_warning = "Macro `$(string(macro_symbol))` disabled since `$(string(condition))` == false"
    return esc(quote
        if $condition
           $macroed_expr
        else
            @debug $macro_warning
            $unmacroed_expr
        end
    end)
end

function main()
    res = Vector{Float32}(undef, 1_000)
    # block of computation
    @conditional_on !IS_PROFILED @threads for i in 1:1_000
        sleep(0.01)  # something in here takes a while to run
        res[i] = 42
    end
    # output the results of the computation (maybe something like save to file in practice)
    println("The results are $(join(res[1:10], ','))...")
end

if IS_PROFILED
    Profile.clear()
    @profile main()
    ProfileSVG.save("flamegraph.svg")
else
    main()
end

Update 2! It actually seems that the flamegraphs are weird when I run Julia with --threads=auto, even if I disable the @threads macro, but that the flamegraph looks the same with and without the @threads macro if I launch using --threads=1. So in my particular use-case of disabling threading for profiling, it seems the best course of action is to intervene with the --threads command-line option and not to worry if the @threads macro is used.
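
If the thread count is what matters, one possible safeguard is to have the script check it when profiling is requested. This is only a sketch: the function name `warn_if_threaded_profile` is hypothetical, and it relies on nothing beyond `Threads.nthreads()`.

```julia
# Hypothetical guard: returns true (and warns) when profiling was requested
# while Julia is running with more than one thread.
function warn_if_threaded_profile(is_profiled::Bool,
                                  nthreads::Integer=Threads.nthreads())
    if is_profiled && nthreads > 1
        @warn "Profiling with $nthreads threads; consider relaunching with --threads=1"
        return true
    end
    return false
end
```

A profiling run would then be launched with something like `PROFILE_THE_SCRIPT=true julia --threads=1 script.jl`, where `script.jl` is a placeholder name.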