Best Practices to Use Global States in Julia?

Hi everyone!

I’m trying to define some global state inside my module, but I’m also concerned about performance.
I have done some research on this, and some sources suggest packing the state into a struct. I’ve tried to make the illustration as simple as possible:

mutable struct GlobalState
    x::Real
    A::Array
end

const myState = GlobalState(1.0, ones(2, 2))

and I defined some test functions for benchmarking:

function test1(myState::GlobalState) # use global state

    B = fill(0.0, size((myState.A)))
  
    @views for _ in 1:1000
        B .+= myState.x .* myState.A
    end
end

function test2(myState::GlobalState) # unpacking??

    B = fill(0.0, size((myState.A)))
    x = myState.x
    A = copy(myState.A)

    @views for _ in 1:1000
        B .+= x .* A
    end
end

function ref() # reference

    B = zeros(2, 2)
    x = 1.0
    A = ones(2, 2)

    @views for _ in 1:1000
        B .+= x .* A
    end
end

The benchmark results are:

  601.600 μs (5002 allocations: 203.25 KiB)
  600.600 μs (5003 allocations: 203.34 KiB) 
  6.925 μs (2 allocations: 192 bytes)

In test1 and test2, the number of allocations grows with the number of iterations. Does this imply a potentially longer GC time? (Please correct me if I’m wrong about this :pray:) I thought test2 would behave differently from test1 since I did some “unpacking” :joy:

The ref function does the same task with no global state involved. My question is: how can I get the same performance as the ref function while still being able to “communicate” with the outside world? Thanks!!

(UPDATE: I have also tried removing mutable from the struct definition, but the results are the same)

The fields of your struct do not have concrete types. Try:

mutable struct GlobalState
    x::Float64
    A::Matrix{Float64}
end

Does that fix your problem?
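
For anyone who wants to check their own structs, here is a minimal sketch of one way to test whether the declared field types are concrete. The struct names LooseState and TightState are made up for this illustration; they only mirror the two definitions in this thread.

```julia
# Hypothetical names, used only for this illustration.
mutable struct LooseState
    x::Real    # abstract: any subtype of Real may be stored here
    A::Array   # not concrete: element type and dimensionality are unspecified
end

mutable struct TightState
    x::Float64
    A::Matrix{Float64}
end

# fieldtypes returns the declared field types as a tuple;
# isconcretetype tells whether a type can have direct instances.
loose_ok = all(isconcretetype, fieldtypes(LooseState))
tight_ok = all(isconcretetype, fieldtypes(TightState))

println("LooseState fields all concrete? ", loose_ok)
println("TightState fields all concrete? ", tight_ok)
```

When a field type is not concrete, the compiler generally cannot know what s.x or s.A will be until runtime, which is the likely source of the extra allocations in the first benchmark.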

Yes, this does help!!

The results are now:

  6.780 μs (1 allocation: 96 bytes)
  7.733 μs (2 allocations: 192 bytes)
  6.880 μs (2 allocations: 192 bytes)

I think I just missed the point! Thanks for the solution!

@view and @views are for when you do slicing. For example, with larger arrays (x is a scalar, so only the arrays are sliced):

@views for _ in 1:1000
    B[2:5, :] .+= myState.x .* myState.A[2:5, :]
end

In your code there is no slicing, so @views has no effect and can be removed.
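
To make the difference concrete, here is a small sketch (the 4×4 array is made up for illustration and is not from the code above). A slice on the right-hand side copies the data, while @view gives a SubArray that shares memory with the parent:

```julia
# A 4x4 matrix filled column by column with 1.0, 2.0, ..., 16.0
A = reshape(collect(1.0:16.0), 4, 4)

copied = A[2:3, :]         # plain slicing allocates a new array (a copy)
viewed = @view A[2:3, :]   # a SubArray that refers to A's memory

# Change the parent array after taking the slice and the view
A[2, 1] = -1.0

println(copied[1, 1])      # the copy keeps the old value
println(viewed[1, 1])      # the view sees the update
println(viewed isa SubArray)
```

An expression such as B .+= x .* A contains no indexing at all, so wrapping it in @views leaves it unchanged.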

Just a comment on the “unpacking” idea in test2. It’s almost correct, but in this case you must use a function barrier, not just “unpacking”. The reason is the abstract field types of GlobalState. Because of them, the types of myState.A and myState.x are not fully known at compile time, and so the types of x and A are not known in the for loop in test2 either. As a result, some work must be done at runtime on every iteration. If you use a function barrier as follows, the problem is alleviated: dowork is compiled for the concrete types of the arguments it actually receives. But generally, it’s wise to have concrete field types in all structs involved in performance-critical steps.

function dowork(B, x, A)                                                                       
    for _ in 1:1000                                                                          
        B .+= x .* A                                                                         
    end                                                                                      
end                                                                                          
                                                                                             
function test3(myState::GlobalState)                                                         
    dowork(zeros(size(myState.A)), myState.x, myState.A)                                     
end                                                                                          
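
If you want to see the effect of the barrier without benchmarking, one option is the @inferred macro from the Test standard library, which throws when the inferred return type differs from the actual one. This is only a sketch: AbstractFields, get_x and scale are hypothetical names introduced for the illustration.

```julia
using Test  # standard library; provides @inferred

# Hypothetical struct with the same abstract field types as the original GlobalState
struct AbstractFields
    x::Real
    A::Array
end

get_x(s) = s.x        # reads a field whose declared type is abstract
scale(x, A) = x .* A  # the "kernel" that sits behind the function barrier

s = AbstractFields(1.0, ones(2, 2))

# Inference only knows that get_x returns some Real, while the actual value
# is a Float64, so @inferred is expected to throw here.
unstable = try
    @inferred get_x(s)
    false
catch
    true
end

# The arguments are evaluated first, so scale is checked for the concrete
# types Float64 and Matrix{Float64}, and inference succeeds.
result = @inferred scale(s.x, s.A)

println("field access type-unstable? ", unstable)
println("kernel result: ", result)
```

The call into the kernel still involves one dynamic dispatch, but that cost is paid once per call rather than once per loop iteration.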

Thanks for the tips!

Wow, didn’t know this trick!
Thanks for the advice!