Iterator over all random elements without materialization

Hi All,

I have a specific question. I am working on abductive explanation of neural networks, f. Abductive explanation is a subset (let’s denote indices ii) of the input x guaranteeing that the model (neural network) provides the same output, i.e. (∀ z)(∀ i ∈) (x[i] == z[i])(f(x) = f(z)) (there does not exist adversarial sample). The search requires to repetitively show that there does not exist adversarial sample, for which I use MILP (now, I use HiGHS, but once i am back in office, I will switch to Gurobi). I have found, that frequently one finds adversarial sample by a sheer luck and random sampling is very effective. But I would like to have something like random sampling with certification, i.e. for small degrees of freedom, I would like to certifiably exhaust the input space. For that, I would like to iterate over all values in range 0:2^n -1), but ideally in random order. So my question is, if someone does not know the algorithm which would allow this without materializing all values in memory, i.e. I am not interested in solution `randperm(2^n), because I want to iterate in few batches.

Thanks in advance for answer (negative is fine, because I would not need to think about it).

If you want to do something randomly in batches, you can create a random number generator and save its state. For example:

julia> using Random

julia> rng = MersenneTwister(1234)
MersenneTwister(1234)

julia> x1 = rand(rng, 2)
2-element Vector{Float64}:
 0.5908446386657102
 0.7667970365022592

julia> rng
MersenneTwister(1234, (0, 1002, 0, 2))

julia> x2 = rand(rng, 2)
2-element Vector{Float64}:
 0.5662374165061859
 0.4600853424625171

In a new REPL:

julia> using Random

julia> rng2 = MersenneTwister(1234, (0, 1002, 0, 2))
MersenneTwister(1234, (0, 1002, 0, 2))

julia> x2 = rand(rng2, 2)
2-element Vector{Float64}:
 0.5662374165061859
 0.4600853424625171

You can tweak the last argument of MersenneTwister to set the state.

Hi,

this is something I had in mind. But I need to be certain that I iterate over each value just once and I provably iterate over all values in 0:2^N range for different N

You can use sample from StatsBase:

using StatsBase
using Random

julia> rng = MersenneTwister(1234)
MersenneTwister(1234)

julia> batch_size = 10
10

julia> indices_batch = sample(rng, 1:(2^10), batch_size; replace=false)
10-element Vector{Int64}:
 489
 618
 956
 122
 443
 934
 149
 152
 741
 409

Depending on how you want to divide it into batches, you can save the state of the rng to use in the next batch, or just put it into a for loop if you’re iterating all at once.

1 Like

An ideal block cypher with N bits is indistinguishable from a random permutation on 1:2^N. In other words, you can get your random permutation permuted(key, n) = encrypt(key, n).

So you need a block cypher with N as its block size, for non-standard N. You get that with e.g. Luby-Rackoff / Feistel, where you simply plug in any reasonable thing as round function (e.g. aes if you use hardware acceleration, otherwise chacha8 or something like that).

This sure is not optimal (you’re wrapping another layer around aes which already has its 12 rounds), but should still be reasonably fast.

PS. I am assuming that you need the 30 < N < 70 regime? For smaller N, just do the usual randperm, for larger N just use random outputs since you’re not gonna exhaust the stream anyways.

2 Likes

This sounds like a nice solution.
Do I have a guarantee that I will iterate over all values?

My N is currently upper bounded by 24. With that, I can linearly try all 2^24 in about 30s, which is OK. Since I am searching for a first adversarial sample (existence is sufficient), I was thinking that I might benefit from random search rather then from linear search. But the overhead has to be small, since testing one value takes around 2μs.

1 Like

Yes, a block cipher will go through all values exactly once, because it’s equivalent to a random permutation. So will sampling without replacement.

1 Like

The cryptographic construction has tiny overhead.

But it looks like it isn’t worth your time to optimize?

A permutation with N=24 is just 64 megabyte. So just use randperm or bite the bullet on linear iteration?

The “no materialization” really only becomes important if it takes significant amounts of memory.

2 Likes

If you need a permutation of 2^N, modular arithmetic is your friend. Multiply by an odd number and add another number and take modulus 2^24, and you have permuted 0…2^24-1. This isn’t cryptographically random, but random-ish (what is the adversary here?).

This doesn’t need to materialize anything.

2 Likes

This is what I have been looking for. I had a hunch there will be something like this.

Thanks a lot.