Maintainability: Best way to have two versions of the same function with minor differences

Tetrakai · November 9, 2024, 5:12pm

Say I have two use cases for a function. One logs the result of each iteration and is parallelized using Polyester.@batch; the other only needs to return the final result and uses Threads.@threads. Performance is the top priority in the second case, but “only” very important in the first case (with the logs).

Also, the choice is known at “compile time”, i.e. when the package is loaded.

NB: The real code has a bunch of logging events, so the code becomes less readable in that case:

using BenchmarkTools, Distributions, Polyester, Base.Threads

function example!(dat, dlog)
    @batch for i in 1:50
        flag = rand(Bernoulli(0.25))
        if flag
            idx = rand(1:10)
            dat[idx] += 1
            dlog[i]   = idx
        end
    end
    return dat, dlog
end

function example!(dat)
    @threads for i in 1:50
        flag = rand(Bernoulli(0.25))
        if flag
            idx = rand(1:10)
            dat[idx] += 1
        end
    end
    return dat
end

dat = fill(0, 10); dlog = fill(0, 50);
example!(dat, dlog)

dat = fill(0, 10); dlog = fill(0, 50);
example!(dat)

Is there a way to do this without needing to maintain two separate versions, or adding a bunch of if-statements?

Maybe if a global const is set, @batch is chosen then dlog gets updated each time dat is modified?

sgaure · November 9, 2024, 9:48pm

Perhaps something like this:

                                                                               
@inline function _example2!(dat, i, ::Val{dolog}, dlog=nothing) where dolog
    flag = rand(Bernoulli(0.25))
    if flag
        idx = rand(1:10)
        dat[idx] += 1
        dolog && (dlog[i] = idx)   # compiled away when dolog == false
    end
    return dolog ? (dat, dlog) : dat
end

# Fast path: no log, parallelized with Threads.@threads
function example2!(dat)
    @threads for i in 1:50
        _example2!(dat, i, Val(false))
    end
    return dat
end

# Logging path: fills dlog, parallelized with Polyester.@batch
function example2!(dat, dlog)
    @batch for i in 1:50
        _example2!(dat, i, Val(true), dlog)
    end
    return dat, dlog
end

abraemer · November 9, 2024, 9:50pm

I think this could be a good application for Preferences.jl. Using this mechanism the user can configure (statically) what the library does.
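A minimal sketch of the idea, assuming a hypothetical preference key "dolog" (not from the original code). Outside a package, Preferences.jl has no package UUID to attach the setting to, so the flag here is a plain `const` standing in for the loaded preference. The loop is kept serial, and `rand() < 0.25` replaces `rand(Bernoulli(0.25))`, so that the snippet runs with Base alone.

```julia
# Hypothetical stand-in for a preference read once at package load time.
# Inside a package this line could instead be something like:
#     using Preferences
#     const DOLOG = @load_preference("dolog", false)
const DOLOG = true

function example_pref!(dat, dlog)
    for i in eachindex(dlog)
        if rand() < 0.25
            idx = rand(eachindex(dat))
            dat[idx] += 1
            # DOLOG is a const Bool, so the compiler can resolve this branch
            # when the method is compiled and drop it if DOLOG is false.
            DOLOG && (dlog[i] = idx)
        end
    end
    return dat, dlog
end

dat = fill(0, 10); dlog = fill(0, 50)
example_pref!(dat, dlog)
```

Because the flag is fixed when the package loads, changing it means restarting Julia (and, with Preferences.jl, recompiling the package), which matches the "known at load time" requirement.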

Tetrakai · November 10, 2024, 3:30pm

This helps with choosing the parallel library, but it's still cluttering up the code with lines about the log. Is a performance benefit expected from doing it this way vs. wrapping the log lines in if-statements?

Tetrakai · November 10, 2024, 3:31pm

I keep coming across this package for a number of reasons, so will definitely be checking it out.

sgaure · November 10, 2024, 3:36pm

dolog is a compile-time type parameter. So if _example2! is called with Val(false), the dolog && ... lines are optimized away, whereas if you call it with Val(true), the test is optimized away, so only an unconditional dlog[i] = idx remains. In tight loops a runtime test can destroy performance by preventing vectorization, stalling the pipeline and so on, so it can be beneficial to avoid the tests altogether in this way.
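To keep the log lines from cluttering the loop body, the same compile-time selection could be pushed into a small helper. This is only a sketch: `record!` and `kernel!` are hypothetical names, the loop is serial, and `rand() < 0.25` stands in for `rand(Bernoulli(0.25))` so that it runs without extra packages.

```julia
# Hypothetical helper: one method per Val, chosen at compile time.
# The Val{false} method does nothing, so after inlining no log code remains.
@inline record!(dlog, i, idx, ::Val{true})  = (dlog[i] = idx; nothing)
@inline record!(dlog, i, idx, ::Val{false}) = nothing

function kernel!(dat, dlog, dolog::Val)
    for i in 1:50
        if rand() < 0.25
            idx = rand(eachindex(dat))
            dat[idx] += 1
            record!(dlog, i, idx, dolog)   # single call site, no visible branch
        end
    end
    return dat
end

# Logging variant: dlog is filled in.
dat_logged = fill(0, 10); dlog = fill(0, 50)
kernel!(dat_logged, dlog, Val(true))

# Fast variant: no log array is needed, so `nothing` is passed instead.
dat_fast = fill(0, 10)
kernel!(dat_fast, nothing, Val(false))
```

Each logging event in the real code would then be a single `record!`-style call, and the decision of whether it does anything lives in one place.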
