Non-const globals in a macro and the performance of the function calling the macro

I am trying to implement some “fixed stochasticity” in individual functions of my model (called once). I have a global seed, and the RNG is re-seeded in each function that needs it, based on the name of the function and the global seed, so that different functions are uncorrelated but the same function produces the same outcomes on every call, without passing RNGs around.

Anyhow, my issue is that to achieve this behaviour I need to use a non-const global (at module level) in the macro, and I wonder whether this affects the performance of the function calling the macro, or whether the macro acts as a barrier.

This is a snippet of what I am trying to achieve:

cd(@__DIR__)
module Foo

import Random
export @random_seed!

global_random_seed = 123 

macro random_seed!()
    return quote
        st = stacktrace(backtrace())
        myf = ""
        for frm in st
            funcname = frm.func
            if frm.func != :backtrace && frm.func!= Symbol("macro expansion")
                myf = frm.func
                break
            end
        end
        m1 = $("$(__module__)")
        s = m1 * "$(myf)"
        Random.seed!(hash("$s",UInt64(global_random_seed)))
        @info "Random seeded with hash of \"$s\" and $(global_random_seed)"
    end
end

function init()
    Foo.global_random_seed = parse(Int64,readline("test_seed.txt"))  #just 125 in test_seed.txt
end

module FooFoo
import ..Foo
import ..Foo:@random_seed!, global_random_seed


function foo()
    @random_seed!()
    println(rand())
    println(rand())
end

function goo()
    @random_seed!()
    println(rand())
    println(rand())
end

end

end

Foo.init()
Foo.FooFoo.foo() # Julia v1.11: 0.008282138701719233 0.9042476148934772
Foo.FooFoo.foo() # Julia v1.11: 0.008282138701719233 0.9042476148934772
Foo.FooFoo.goo() # Julia v1.11: 0.4430332457748505   0.7426054431450675
Foo.FooFoo.goo() # Julia v1.11: 0.4430332457748505   0.7426054431450675
Foo.FooFoo.foo() # Julia v1.11: 0.008282138701719233 0.9042476148934772

The macro has no arguments or computation besides instantiating the return expression, so it’s just pasting that with generated local variables into macro call sites. global_random_seed is just a symbol in the expression, so it’s going to work like any global variable in a macro-less method. The bad type inference isn’t saved by the UInt64 call alone because the language doesn’t mandate type constructors return their own type; it’s saved by the only hash(::String, ...) method returning UInt.

Could do global_random_seed::Int for inherent type stability (or why not ::UInt64 if that’s all you’ll use it for), though non-const global variables require assignment checks and are thus implemented as a reference to a reference, the last time I checked. Indexing and mutating const global_random_seed = Ref(123) would halve the work, though there is the risk of reading garbage values “assigned” to an uninitialized Ref{Int}(). The performance gain was actually dubious back then; I’m not sure how CPUs handled things.
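
To make the three options concrete, here is a minimal sketch. The module name SeedGlobals and the accessor names are made up for illustration, typed globals need Julia 1.8 or later, and Base.return_types is only used to peek at what inference concludes for each accessor.

```julia
module SeedGlobals

# Option 1: untyped non-const global; reads cannot be inferred concretely.
untyped_seed = 123

# Option 2: typed non-const global (Julia 1.8+); reads are known to be Int.
typed_seed::Int = 123

# Option 3: const binding to a mutable Ref; the binding is fixed, only the content changes.
const ref_seed = Ref(123)

read_untyped() = untyped_seed
read_typed() = typed_seed
read_ref() = ref_seed[]

# Hypothetical setter that updates all three, as an init() function would.
function set_all!(v::Int)
    global untyped_seed = v
    global typed_seed = v
    ref_seed[] = v
    return nothing
end

end # module

SeedGlobals.set_all!(125)

# Inferred return type of each accessor, in the order untyped, typed, Ref.
inferred = (only(Base.return_types(SeedGlobals.read_untyped, Tuple{})),
            only(Base.return_types(SeedGlobals.read_typed, Tuple{})),
            only(Base.return_types(SeedGlobals.read_ref, Tuple{})))
```

Only the untyped variant should come back as Any; the other two give the compiler a concrete Int to work with.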

Thank you. I am not concerned about the performance of the macro, but about that of the function where the macro is called.

These are themselves high-level functions that are called only once, but may contain for loops and run for days. My understanding is that compile time is then negligible, right?

I don’t know what you mean by “compiling time” if you’re concerned about runtime performance. Annotating the global variable with a type shouldn’t stress the compiler any more than typical type stability practices, and whether it significantly affects runtime depends on how much of the runtime handles global_random_seed::Any. If it’s just the one UInt64 call that restores type stability in a days-long run, it’s negligible. If it’s making many short calls in your hot loop type-unstable, then it’s worth doing global_random_seed::Int.
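
If annotating the global is not an option, a function barrier is another common way to keep an untyped global out of a hot loop. This is a minimal sketch with made-up names (hot_offset, kernel, run_model), not code from the model in question: the global is read once, its type is asserted, and the loop lives in a separate function that only sees a concrete Int.

```julia
hot_offset = 3   # hypothetical untyped non-const global

# Inner kernel: type-stable because `offset` arrives as a concrete Int argument.
function kernel(offset::Int, n::Int)
    acc = 0
    for i in 1:n
        acc += i + offset
    end
    return acc
end

# Outer function: reads the global once, asserts its type, then calls the kernel.
function run_model(n::Int)
    offset = hot_offset::Int
    return kernel(offset, n)
end

result = run_model(10)   # sum of 1:10 plus 10 times the offset
```

The single dynamic read in run_model costs next to nothing compared with a loop that runs for days.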

As a side note, you can avoid searching the stack trace by instead creating a unique symbol in the macro, like this:

macro random_seed!()
    h = hash(gensym("random_seed"))
    quote
        Random.seed!(hash(global_random_seed, $h))
    end
end

But the seed may change if you make changes to the program.

To the program as a whole or to the function calling the macro?

To the program as a whole. gensym uses a global counter, and it is typically called in macros to generate unique symbols, so macro expansions anywhere, including in any imported package, advance that counter. But the seeds will be fixed within any Julia session once the module Foo has been loaded.
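
The counter behaviour is easy to observe directly. A minimal sketch, reusing the "random_seed" tag from the macro above (the variable names are only for illustration):

```julia
a = gensym("random_seed")
b = gensym("random_seed")

# Same tag, but the global counter advanced between the two calls,
# so the symbols differ and so do their hashes.
different_symbols = a != b
different_hashes = hash(a) != hash(b)
```

Any macro expanded between two definitions shifts the counter in the same way, which is why the resulting seed depends on everything that was loaded before.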

It all depends on how stable you want your seeds to be. In Julia 1.13, the hashes will change (Julia v1.13 Release Notes · The Julia Language), so the seeds will change anyway. Some versions ago, the default RNG changed from Mersenne Twister to Xoshiro. And if you for some reason rename your function with the current setup, the seed will change.

It’s possible to replace the gensym with your own counter, but you may still get changes if you e.g. change the order of the function definitions in your module:

const seed_count = Ref(0)   # not named `count`, to avoid clashing with Base.count
counter() = (seed_count[] += 1)
macro ...
    h = hash(counter())
    quote
        ...
    end
end

I think I would have made something like this, to make it reasonably fast and stable:

import SHA
import Random
const global_random_seed = Ref("")

function setseed(s)
    global_random_seed[] = string(s)
    nothing
end

macro random_seed!(id)
    quote
        hash = SHA.sha1(string($id, global_random_seed[]))
        seed = reinterpret(UInt32, hash)
        Random.seed!(seed)
    end
end

function foo()
    @random_seed!("foo")
    println(rand())
    println(rand())
end

function goo()
    @random_seed!("goo")
    println(rand())
    println(rand())
end

SHA-1 is standardized, so the result will be stable. I’ve used an argument to the @random_seed! macro, a literal string, to avoid the stack trace and save some time. SHA isn’t very fast compared to hash, so it’s a trade-off. It’s possible to do the SHA hashing somewhat faster by setting up a SHA.SHA1_CTX and calling SHA.update! and SHA.digest! manually, instead of the prepackaged SHA.sha1, which requires a single string.
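
A rough sketch of the manual route, to show the shape of the API rather than a tuned implementation. The helper name sha_seed_words is made up, and copying the code units into a Vector{UInt8} is a conservative choice on my part that probably gives back some of the speed gain:

```julia
import SHA

# Hypothetical helper: hash an id and a seed string without concatenating them first.
function sha_seed_words(id::AbstractString, seed::AbstractString)
    ctx = SHA.SHA1_CTX()
    SHA.update!(ctx, Vector{UInt8}(codeunits(id)))     # feed the id bytes
    SHA.update!(ctx, Vector{UInt8}(codeunits(seed)))   # then the seed bytes
    digest = SHA.digest!(ctx)                          # 20 bytes for SHA-1
    return reinterpret(UInt32, digest)                 # viewed as five UInt32 words
end

words = sha_seed_words("foo", "125")

# Feeding the pieces separately gives the same digest as hashing the concatenation.
same_as_oneshot = words == reinterpret(UInt32, SHA.sha1("foo125"))
```

Because the context is fed incrementally, the string(...) concatenation in the macro is no longer needed.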

You can also insert a @__MODULE__ in the string() inside the quote, to distinguish functions with the same id in different modules.
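
One way to sketch the module-qualified id is shown below. Note the substitutions: it uses the __module__ variable available inside every macro body (the caller's module, evaluated at expansion time) rather than @__MODULE__ inside the quote, and Base's hash instead of SHA to stay short, so the seeds are not stable across Julia versions. The module names SeedDemo, A and B are made up.

```julia
module SeedDemo

import Random

const global_random_seed = Ref("125")

macro random_seed!(id)
    # Runs at expansion time: __module__ is the module containing the call site.
    key = string(__module__, ".", id)   # `id` must be a literal string here
    quote
        Random.seed!(hash(string($key, global_random_seed[])))
    end
end

module A
import ..SeedDemo: @random_seed!
function draw()
    @random_seed!("foo")
    return rand()
end
end

module B
import ..SeedDemo: @random_seed!
function draw()
    @random_seed!("foo")
    return rand()
end
end

end # module

a1 = SeedDemo.A.draw()
a2 = SeedDemo.A.draw()   # same module and id: same seed as a1
b1 = SeedDemo.B.draw()   # same id, different module: different seed
```

Both modules pass the same literal "foo", yet they end up with different seeds because the key carries the module path.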

It’s also possible to get rid of the hashing altogether, since a slightly different seed will give a very different random sequence. Then you avoid allocations and hashing, apart from those inside Random.seed!:

const global_random_seed = Ref(UInt8[])

function setseed(s)
    global_random_seed[] = codeunits(string(s))
    nothing
end

macro random_seed!(id)
    bytes = Vector(codeunits(id))
    idlen = length(bytes)
    quote
        seedcodes = global_random_seed[]
        totlen = $idlen + length(seedcodes)
        pad = (4 - totlen % 4) % 4   # pad to reach multiple of 4 bytes
        resize!($bytes, totlen+pad)
        $bytes[$idlen+1:totlen] .= seedcodes
        $bytes[end-pad+1:end] .= 0xaa  # padding
        seed = reinterpret(UInt32, $bytes)
        Random.seed!(seed)
    end
end
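
The padding arithmetic is easier to check in isolation. Below is a small sketch with made-up helper names (pad_to_word, padded_seed_words) that builds the same byte layout as the macro above, but into a fresh vector instead of mutating the one captured at expansion time:

```julia
# Number of padding bytes needed to reach a multiple of 4.
pad_to_word(totlen::Integer) = (4 - totlen % 4) % 4

# Hypothetical non-mutating variant of the byte assembly done in the macro.
function padded_seed_words(id::AbstractString, seed)
    bytes = Vector{UInt8}(codeunits(id))
    append!(bytes, codeunits(string(seed)))                  # id bytes followed by seed bytes
    append!(bytes, fill(0xaa, pad_to_word(length(bytes))))   # pad with 0xaa to a multiple of 4
    return reinterpret(UInt32, bytes)
end

pads = [pad_to_word(n) for n in (5, 6, 7, 8)]
words = padded_seed_words("foo", 125)   # "foo125" is 6 bytes, padded to 8
```

The macro version avoids the allocation by reusing its captured vector, at the price of shared mutable state between calls.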

If you really need fast setting of the seed (a couple of nanoseconds instead of a few hundred), you can use the internal Random.getstate and Random.setstate!. They are not public, and may disappear or change in minor versions:

const global_random_seed = Ref(UInt8[])

const seedgeneration = Ref(0)
function setseed(s)
    global_random_seed[] = codeunits(string(s))
    seedgeneration[] += 1
    nothing
end

mutable struct RandomState{T}
    state::T
    generation::Int
end

macro random_seed!(id)
    state = RandomState(Random.getstate(Random.default_rng()), -1)
    bytes = Vector(codeunits(id))
    idlen = length(bytes)
    quote
        if seedgeneration[] ≠ $state.generation
            seedcodes = global_random_seed[]
            totlen = $idlen + length(seedcodes)
            pad = (4 - totlen % 4) % 4   # pad to reach multiple of 4 bytes
            resize!($bytes, totlen+pad)
            $bytes[$idlen+1:totlen] .= seedcodes
            $bytes[end-pad+1:end] .= 0xaa
            seed = reinterpret(UInt32, $bytes)
            Random.seed!(seed)
            $state.generation = seedgeneration[]
            $state.state = Random.getstate(Random.default_rng())
        else
            Random.setstate!(Random.default_rng(), $state.state)
        end
    end
end
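
If relying on internals is a concern, the same save-and-restore idea can be sketched with public functions only, assuming Julia 1.7 or later where the default RNG is Xoshiro-based: copy takes a snapshot of the task-local RNG and copy! writes it back. This is a different technique from the getstate/setstate! calls above and I have not benchmarked it against them.

```julia
import Random

Random.seed!(125)
snapshot = copy(Random.default_rng())    # independent Xoshiro holding the current state

first_draw = rand()
copy!(Random.default_rng(), snapshot)    # restore the task-local RNG from the snapshot
second_draw = rand()                     # repeats first_draw
```

A macro could cache such a snapshot per call site in the same way the RandomState struct caches the raw state.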