Question about functors and closures (I think šŸ˜…), with an example, but really a general Q

ā€¦while it is certainly enough to execute without error, please note, that it is semantically different from my original formulation, as it applies two masks and not just the one, being passed to next, which I donā€™t want.

Actually, I would go event further and write everything as follows, to avoid needing to write overly specific typesā€¦

MaskedRNG(seed::Integer, nbits) = MaskedRNG(RNG(seed), ~UInt64(0) >> ifelse(0 < abs(nbits) <= 64, sign(nbits)*64 - nbits, 64))

ā€¦I should probably give some context, at this point, of how I am actually using these rngs in my chess-engine, at all. I have datastructures like this, a lotā€¦

mutable struct TTEntry
    d0::UInt64  # use this, for now (1.)
    d1::UInt16  # use this, for now (10 bytes, total)
    # later...
    # ( 4)  checkkey            ::  Uint32
    # ( 6)  bestmove            ::  UInt16
    # ( 8)  score               ::   Int16
    # (10)  details / flags     ::  UInt16
                # score-type    :   1..2    {LOWER,EXACT,UPPER}={CUT,PV,ALL}
                # age           :   ...     (replacement-policy)
                # quality       :   ...     {#nodes used for score}
                # ...
    # (10)  Bytes, total
end

ā€¦it might look a bit cryptic, but it is just a simple declaration of the structure of a single hashtable-entry, of what is called a transposition-table in chess. It needs to pack ā€œdataā€ as compactly as possible, as the hashtable will be rather large (typically some singleā€¦double-digit - percentage of total ram available on a computer, so on my laptop with 16GB ram, total, something in the range of 512MBā€¦8GB, eventually), such that memory-layout of single entries becomes quite an issue.

As (semantic units of) data is packed into a few bits, only (a flag might even be packed into a single bit), I will (eventually) have several constant masks defined, for being able to have functions, accessing data according to this simple patternā€¦

function age(TTEntry::entry)
    return (entry.someData & MASK_TTENTRY_AGE) >> SHIFT_TTENTRY_AGE
end
function quality(TTEntry::entry)
    return (entry.someData & MASK_TTENTRY_QUALITY) >> SHIFT_TTENTRY_QUALITY 
end

So when using the rng, I only ever want to use the lowest n bits (for adressing indexed data, which is arranged in hashtable-structures of some size=2^k) [use-case #1], for example when performance-testing one of the hashtables. Or I want to conveniently be able to randomly set (possibly a bunch of) specific flags / detailed data, packed in some (unsigned integer-) primitive type similar to this: primitiveType āŠ»= next(random, MASK_AGE | MASK_QUALITY | MASK_OBSCUREFLAG) [use-case #2]. Reason for even coding my own, is (a) to have the ability to test against a reference-implementation, in java, with the same random-data-stream and (b) having a very specialized rng, optimized for performance-testing, within my chess-engine.

For any other use-case, Iā€™d likely just use one of the existing rngs. So Iā€™m perfectly happy, having the rng be specific to my use-cases, only, while maintaining simplicity.
While not adding to any body of libraries, restricting ones codes to meet specific requirements, often yields remarkable conciseness and quality-of-code / patterns. :wink:

2 Likes

Youā€™re right. Weird case when you pass an RNG, it does not have rng property, but state, so it will error (coding is hard). :sweat_smile:

Neat use case! I got too driven by the fact that I could fit a nice pattern into the masking, but really my only observation should have been that it errors in the case of m = 64. Maybe irrelevant for your use case.

I hope success with your test project. Maybe you could show it here in discourse later once finished.

On the other hand, having derailed the topic a bit (a lot?), about your original questions, you could have written your original code as follows

mutable struct RNG
    state::UInt64
end

RNG() = RNG(time_ns())

mutable struct MaskedRNG
    rng::RNG
    mask::UInt64
end

# we'll ignore the case of nbits = 64
MaskedRNG(seed::Integer, nbits) = MaskedRNG(RNG(seed), (1 << nbits) - 1)
MaskedRNG(nbits) = MaskedRNG(RNG(), (1 << nbits) - 1)

function (rng::RNG)() # thiss calls without arguments
    rng.state = _rng(rng.state)
end
(rng::RNG)(mask) = rng() & mask # masked case

(mrng::MaskedRNG)() = mrng.rng() & mrng.mask
(mrng::MaskedRNG)(mask) = mrng.rng() & mask # does not double mask

function _rng(x::UInt64)
    x āŠ»= (x << 21)
    x āŠ»= (x >>> 35)
    x āŠ»= (x << 4)
    return x % UInt64
end

And with an example

julia> r = MaskedRNG(1234, 8)
MaskedRNG(RNG(0x00000000000004d2), 0x00000000000000ff)

julia> r()
0x00000000000000f2

julia> [r() for _ in 1:5] # masked with 2 bytes
5-element Vector{UInt64}:
 0x00000000000000d2
 0x00000000000000ac
 0x0000000000000049
 0x000000000000003c
 0x00000000000000c9

julia> [r(0xffff) for _ in 1:5] # masked with 4 bytes
5-element Vector{UInt64}:
 0x000000000000c5f2
 0x000000000000daa2
 0x0000000000007909
 0x000000000000d018
 0x000000000000a698

As others have pointed, these are called functors or callable structs. Functors are different from closures, but under the hood they are more or less the ā€œsameā€, a struct (state) with a method defined to be called with the operator () (choose a better wording).

Thx for the code - that looks pretty neat!

One thing, I donā€™t quite grasp, though (I think):

Is there any reason, you are using this notation, without a function nameā€¦?

function (rng::RNG)()
    rng.state = _rng(rng.state)
end

ā€¦or maybe, my question is rather: Does it have any (other) application, other than saving to type an explicit function-name? Tbh, to me, that looks so foreign, that I donā€™t really see a place for yet another way to write (or rather omit) things, unless it has some other use-case, which I suspect, but I cannot see?

Regarding the case of 64(+) bits, hereā€¦

# we'll ignore the case of nbits = 64
MaskedRNG(seed::Integer, nbits) = MaskedRNG(RNG(seed), (1 << nbits) - 1)

I havenā€™t looked at anything, where it would be formally specified, but I suspect, that julia implements (like most languages) a modulo-arithmetic for over & underflows in primitive (unsigned) integers. At least thatā€™s what it looks like, what C & java do, and I thus wouldnā€™t worry about those cases - they should be handled automatically, in a way, that actually harmonizes with the implementation, hereā€¦

julia> bitstring((1 << 63) - 1)
"0111111111111111111111111111111111111111111111111111111111111111"

julia> bitstring((1 << 64) - 1)
"1111111111111111111111111111111111111111111111111111111111111111"

julia> bitstring((1 << 1000) - 1)
"1111111111111111111111111111111111111111111111111111111111111111"

Callable structs are useful when you want a struct to look like a function. If you donā€™t care about that, you donā€™t need to use it.
Technically I believe itā€™s undefined behavior to shift by more than 64 bits (it is in C), but it is relatively likely to do what you want.

Yes, yes and yes. I actually use it to define some objects which are linear operators. Being operators, they are applied to some other objects, and the notation of callable struct is quite natural in that case. :slight_smile:

This is correct, Julia implement modulo arithmetic. The problem is not that the bit pattern of an Int is the correct one, but that Julia will happily throw when you try to set -1 to an UInt64 (this happens when you initialize RNG).

julia> (1 << 64) - 1
-1

julia> UInt64((1 << 64) - 1)
ERROR: InexactError: check_top_bit(UInt64, -1)
Stacktrace:
 [1] throw_inexacterror(f::Symbol, #unused#::Type{UInt64}, val::Int64)
   @ Core ./boot.jl:614
 [2] check_top_bit
   @ ./boot.jl:628 [inlined]
 [3] toUInt64
   @ ./boot.jl:739 [inlined]
 [4] UInt64(x::Int64)
   @ Core ./boot.jl:769
 [5] top-level scope
   @ REPL[2]:1

In C, as pointed out by Oscar, overflow is undefined behaviour for signed integer operands but not for unsigned operands.

Callable structs are useful when you want a struct to look like a function. If you donā€™t care about that, you donā€™t need to use it.
:+1:

Technically I believe itā€™s undefined behavior to shift by more than 64 bits (it is in C), but it is relatively likely to do what you want.

In C, any unsigned-int-overflow has actually been specifically specified as non-existent, due to modulo-arithemetics, at least since C99.

As of some time in 2021, there exists some change-request / proposal, due to some unclear wordings and conflicts between some ISO-Norm, C++ and C-specifications, with this ā€œnewā€ wordingā€¦

The range of representable values for the unsigned type is 0 to 2N
-1 (inclusive). A computation involving unsigned operands can never produce an overflow, because
arithmetic for the unsigned type is performed modulo 2N.

But that just clarifies and solidifies the current state of affairs and I donā€™t think this part will likely change, in my lifetime - if it does, I believe that would cost billions, in total of software-rewrites, world-wide. :rofl:

I think this table is more useful, than tedious formal proposalsā€¦ INT30-C. Ensure that unsigned integer operations do not wrap - SEI CERT C Coding Standard - Confluence

I see, what you mean, and my code didnā€™t cover this case, meaning, I will have to be careful with that.
Though your example really only fails, because 1 in that statement is assumed to be a signed (!) Int (and I think the type is just being propagated from the innermost / first type-inference, onwards)ā€¦
ā€¦otherwise this wouldnā€™t be happening:

julia> ((1%UInt) << 64) - 1
0xffffffffffffffff

ā€¦I guess, I just have to annotate with types (even) more. :sweat_smile: