Reduce GC time

fdekerme · September 11, 2023, 10:03am

Hello,
I tried to optimize my code to reduce GC time. Among other things, I tried to implement “in place” operations, in particular for a GC-heavy function, “is_intersect!”, which checks whether a given interval intersects any interval in a list. However, the following code structure doesn’t change my GC time at all compared to a “naive” implementation where I create a new variable at each step. Do you see an implementation error in the following code?

A structure I use:

using IntervalSets
Base.@kwdef struct S
    id::Int16
    number_of_interval::Int16
    intervals::Vector{ClosedInterval{Float64}}
    ...
end

The is_intersect! function:

function is_intersect!(interval_1::Interval, structure_1::S, is_intersect :: Vector{Bool})
    is_intersect[1] = false
    for interval in structure_1.intervals
        if !isempty(intersect(interval_1, interval))
            is_intersect[1] = true
        end
    end
end

The main function:

function main(structure_1)
    compartment_id = 0
    t = 0.0
    is_intersect = Vector{Bool}(undef, 1)
    time_in_compartment = Vector{Float64}(undef, 1)
    while t < duration
        time_in_compartment[1] = rand(1)[1]

        if compartment_id == 0
            is_intersect!(t .. (t + time_in_compartment[1]), structure_1, is_intersect)

      ...

Thanks!
fdekerm

Oscar_Smith · September 11, 2023, 12:23pm

One easy improvement is to use rand() rather than rand(1)[1].
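
To illustrate why: rand(1) builds a one-element Vector{Float64} just so its first entry can be read, while rand() returns a plain Float64. Here is a minimal sketch comparing the two. The helper names draw_vector and draw_scalar are made up for this example, and the exact byte counts depend on the Julia version.

```julia
# Hypothetical helpers, only for comparing the two ways of drawing one number.
draw_vector() = rand(1)[1]   # builds a 1-element Vector{Float64}, then indexes it
draw_scalar() = rand()       # returns a Float64 directly, no array involved

# Call each once so compilation is not counted in the measurement.
draw_vector()
draw_scalar()

bytes_vector = @allocated draw_vector()
bytes_scalar = @allocated draw_scalar()

println("rand(1)[1] allocated ", bytes_vector, " bytes")
println("rand()     allocated ", bytes_scalar, " bytes")
```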

fdekerme · September 11, 2023, 12:48pm

Yes, thank you for pointing that out, it was a relic of an earlier version!

kristoffer.carlsson · September 11, 2023, 2:15pm

Please post enough code so that it can actually be run.

fdekerme · September 12, 2023, 8:33am

The code is part of a bigger project, so I’ve tried to make a small version that I can share here. The GC time is obviously less pronounced than in the real version of the code, but do you see areas for improvement or errors in my implementation? For example, does using a mutable struct (struct cell here) and changing the values of its fields during execution put significant pressure on the GC?

using IntervalSets
using StatsBase: sample, Weights
using ProgressMeter

function is_intersect!(interval::Interval, fractions_list, is_intersect :: Vector{Bool})
    is_intersect[1] = false
    for fraction in fractions_list
        if !isempty(intersect(interval, fraction))
            is_intersect[1] = true
        end
    end
end

Base.@kwdef mutable struct cell
    const id::Int
    total::Float64 = 0.0 
    actual_compartment::Int16
end


function walk(cell_id, compartment_ids, fractions_list, duration)
    l = cell(id=cell_id,
        actual_compartment=0)

    is_intersect = Vector{Bool}(undef, 1)
    time_in_compartment = Vector{Float64}(undef, 1)
    t = 0.0
    while t < duration
        time_in_compartment[1] = 10 * rand()

        if l.actual_compartment == 0.0
            is_intersect!(t .. (t + time_in_compartment[1]), fractions_list, is_intersect)
            if is_intersect[1] 
                l.total += rand()
            end
        end

        t += time_in_compartment[1]

        l.actual_compartment = sample(compartment_ids)


    end
    return l.total
    
end

number_of_cells = 100_000
compartment_ids = [0,1,2,3,4,5]
fractions_list = [10..12, 20..22, 30..32, 40..42, 50..52, 60..62, 70..72, 80..82, 90..92, 100..102]
duration = 100

function main(number_of_cells, compartment_ids, fractions_list, duration)
    t = Vector{Float64}(undef, number_of_cells)

    p = Progress(number_of_cells) #Progression bar

    @time Threads.@threads for cell_id in 1:number_of_cells

        tot = walk(cell_id, compartment_ids, fractions_list, duration)

        t[cell_id] = tot
        

        next!(p)

    end
    return (t)
end

kristoffer.carlsson · September 12, 2023, 11:44am

It seems you allocate twice per cell, which should be the two vectors allocated inside walk:

    is_intersect = Vector{Bool}(undef, 1)
    time_in_compartment = Vector{Float64}(undef, 1)

fdekerme · September 14, 2023, 9:49am

And do you see any way to improve this?

abraemer · September 14, 2023, 10:28am

In your current code, I can’t see a reason for using a Vector with a single element. Do you extend this vector in your real code? If not, just use a regular variable, or, if you need the mutability (because you want to change the value elsewhere, though that could also be restructured), use a Ref instead. But to me nothing there looks like it allocates way too much.
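
A minimal sketch of that suggestion, assuming the flag is not needed anywhere outside the loop: the check returns a Bool directly instead of writing into a one-element Vector, and the step length becomes a plain local variable. The names Span, overlaps, intersects_any and walk_total are hypothetical, and (lo, hi) tuples stand in for the closed intervals so the sketch runs without IntervalSets.

```julia
# Hypothetical stand-in: a closed interval is a (lo, hi) tuple, so no package is needed.
const Span = Tuple{Float64, Float64}

# Two closed intervals overlap when the larger lower bound
# does not exceed the smaller upper bound.
overlaps(a::Span, b::Span) = max(a[1], b[1]) <= min(a[2], b[2])

# Returns a Bool instead of mutating a one-element Vector{Bool};
# any() stops at the first overlapping interval.
intersects_any(interval::Span, spans::Vector{Span}) =
    any(s -> overlaps(interval, s), spans)

# Simplified walk: the step length and the running total are plain locals.
function walk_total(spans::Vector{Span}, duration::Float64)
    t = 0.0
    total = 0.0
    while t < duration
        dt = 10 * rand()
        if intersects_any((t, t + dt), spans)
            total += rand()
        end
        t += dt
    end
    return total
end

spans = Span[(10.0, 12.0), (20.0, 22.0), (30.0, 32.0)]
walk_total(spans, 100.0)                      # warm-up call (compilation)
bytes = @allocated walk_total(spans, 100.0)   # measure a second call
println("bytes allocated by one walk: ", bytes)
```

If the value really has to be mutated from another function, a Ref{Bool}(false) read and written with flag[] is the one-slot alternative to a one-element Vector.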

Let’s take a step back though: how did you check GC time and allocations? If you used @time, it is likely that you included compilation time in your results, which also inflates the allocations and GC time, since the compiler allocates a lot. The first step in optimization is measuring, and then optimizing where the bulk of the allocations (or of the time in general) occurs. If you use VS Code, you can simply call main once to compile and then run @profview main(...) to get a flame graph showing where time was spent during execution. Usually that is sufficient for optimization purposes, but you can also track allocations line by line as explained here. This will hopefully give you a better understanding of where to optimize, and if not, you can come back with the data and we will help you decipher it.
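
As a sketch of the measuring advice: call the function once so it gets compiled, then measure a second call so that compilation does not inflate the numbers. The function work below is a hypothetical placeholder for main. @timed is part of Base and reports elapsed time, bytes allocated and GC time.

```julia
# Hypothetical workload that allocates on purpose, so there is something to measure.
function work(n)
    acc = 0.0
    for _ in 1:n
        v = rand(10)      # a fresh 10-element vector on every iteration
        acc += sum(v)
    end
    return acc
end

work(10)                       # first call: triggers compilation
stats = @timed work(100_000)   # second call: measures execution only

println("time (s):    ", stats.time)
println("bytes:       ", stats.bytes)
println("GC time (s): ", stats.gctime)
```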
