"Zipped" loops in test sets: a bad design pattern?

I’m (ab)using Julia’s testing interface to verify mathematical propositions for many randomly generated samples. I find myself using the following pattern a lot:

@testset "definitions" begin
	for x in rand(N)
		...
		@test cos(x) ≈ (exp(x*im) + exp(-x*im))/2
	end
end
@testset "basic identities" begin
	for x in rand(N)
		...
		@test tan(x) ≈ sin(x)/cos(x)
	end
end
@testset "complex identities" begin
	for x in rand(N)
		...
		@test sec(x)^2 ≈ tan(x)^2 + 1
	end
end

This involves a lot of code repetition when there is a lot of set-up code (indicated by the ...).
The set-up code doesn’t lend itself well to being packaged into a function, because each ... declares many local variables that are used throughout the tests, and each successive ... is different (but usually a superset of the last).

I find myself wanting to structure my code like this:

for x in rand(N)
	...
	@testset "definitions" begin
		@test cos(x) ≈ (exp(x*im) + exp(-x*im))/2
	end
	...
	@testset "basic identities" begin
		@test tan(x) ≈ sin(x)/cos(x)
	end
	...
	@testset "higher identities" begin
		@test sec(x)^2 ≈ tan(x)^2 + 1
	end
end

where each ..., for example, introduces new local variables used in the next set of tests.

This of course generates three new test sets for each sample, whereas I’d like to preserve the nicely collated test set output as in the original snippet.

Is there a way I could achieve the best of both worlds? E.g., by “merging” all test sets which are instantiated with the same label?


I’m a bit new to the testing side of things, but are your local variables in the setup overriding one another or can they all be done ahead of one loop?

@Ian_Slagle Essentially, no; they could all be done at once. Risking bad question etiquette, here’s a wall of my actual code with @tests omitted.

@testset "bipolar" begin
	for σ in samples(N)
		ρ, φ = bipolar(σ)
		...
	end
end
@testset "bisplitunits" begin
	for σ in samples(N)
		β, σ̂ = bisplitunits(σ)
		...
	end
end
@testset "bisplit" begin
	for σ in samples(N)
		σ₊, σ₋ = bisplit(σ)
		...
	end
end
@testset "principle_generator" begin
	for σ in samples(N)
		σ′ = principle_generator(σ)
		...
	end
end
@testset "trivial identities" begin
	for σ₀ in samples(N)
		σ = principle_generator(σ₀)
		R = exp(σ)
		α = scalar_part(R)
		γ = pseudoscalar_part(R)
		...
	end
end
@testset "R = α + β₊u₊ + β₋u₋ + γI" begin
	for σ₀ in samples(N)
		σ = principle_generator(σ₀)
		R = exp(σ)
		α = scalar_part(R)
		γ = pseudoscalar_part(R)
		b = bivector_part(R)
		ρ, φ = bipolar(b)
		β, σ̂ = bisplitunits(b)
		δ, σ̂′ = bisplitunits(σ)
		...
	end
end

This looks fine to me; there isn’t a lot of actual repetition (unless I am missing something). The only thing you are repeating is for σ in samples(N).

While reusing code is generally advisable, I usually strive to keep tests simple and readable.


The only thing that can be done here (though I am unsure whether it makes things better or worse) is to wrap the reusable variable definitions in a function. Your last two test sets could then look like this:

function trivial_identities_variables(σ₀)
    σ = principle_generator(σ₀)
    R = exp(σ)
    α = scalar_part(R)
    γ = pseudoscalar_part(R)
    
    return (; σ, R, α, γ)
end

function alphabetagamma_variables(σ₀)
    σ, R, α, γ = trivial_identities_variables(σ₀)
    b = bivector_part(R)
    ρ, φ = bipolar(b)
    β, σ̂ = bisplitunits(b)
    δ, σ̂′ = bisplitunits(σ)

    return (; σ, R, α, γ, b, ρ, φ, β, σ̂, δ, σ̂′)
end

...

@testset "trivial identities" begin
	for σ₀ in samples(N)
		σ, R, α, γ = trivial_identities_variables(σ₀)
		...
	end
end
@testset "R = α + β₊u₊ + β₋u₋ + γI" begin
	for σ₀ in samples(N)
		σ, R, α, γ, b, ρ, φ, β, σ̂, δ, σ̂′ = alphabetagamma_variables(σ₀)
		...
	end
end

And you can use UnPack.jl to extract only the needed subset of variables.
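
On Julia 1.7 or later, the same kind of subset extraction should also work without a package, using property destructuring. Here is a minimal sketch; `setup_variables` and its contents are hypothetical stand-ins for the set-up functions above, not part of the original code.

```julia
# Hypothetical stand-in for a set-up function that returns its locals
# as a NamedTuple (note the leading semicolon).
function setup_variables()
    a = 1
    b = 2
    c = a + b
    return (; a, b, c)
end

# Property destructuring (Julia 1.7+): fields are matched by name,
# so any subset can be taken, in any order.
(; c, a) = setup_variables()

println((a, c))  # prints (1, 3)
```

Because the fields are looked up by name, adding a new variable to the returned NamedTuple does not affect existing call sites.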

Am I the only one who likes for loops to create test sets? Makes it easier to spot what iteration caused the failure:

julia> using Test

julia> @testset "Main test" begin
       @testset "Test for n=$n" for n in 1:5
           @testset "sum" begin
               @test sum(1:n) < 100
           end
           @testset "prod" begin
               @test prod(1:n) < 100
           end
       end
       end
prod: Test Failed at REPL[4]:7
  Expression: prod(1:n) < 100
   Evaluated: 120 < 100
Stacktrace:
    .....
Test Summary:  | Pass  Fail  Total
Main test      |    9     1     10
  Test for n=1 |    2            2
  Test for n=2 |    2            2
  Test for n=3 |    2            2
  Test for n=4 |    2            2
  Test for n=5 |    1     1      2
    sum        |    1            1
    prod       |          1      1
ERROR: Some tests did not pass: 9 passed, 1 failed, 0 errored, 0 broken.

@DrChainsaw This is true.
However, looped test sets are less appropriate when iterating through a very large number of (random) samples, as in my case here. Nor does it matter much which sample is responsible for a failure: what matters is whether the tests pass for all samples or fail for some.

@Skoffer True, but it’s perhaps harder to maintain: there is now an ordering for the variables to be aware of, and their definitions are hidden…

Like the original snippet, it still suffers from code repetition (in this example, recalculating the common variables σ, R, α, γ) for each test set. That’s what is tempting about the linear structure:

for σ₀ in samples(N) # arbitrarily large number of trials
	a, b, c, d = ...
	@testset "first set" begin
		@test F(a, b, c, d)
	end
	e, f, g, h = ...
	@testset "more complex set" begin
		# tests which assume the previous test sets have passed
		@test G(a, b, c, d, e, f, g, h)
	end
end
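
One possible way to keep this linear structure while still getting a single collated entry per label is to record the outcome of each check during the loop and only create the test sets afterwards. The following is a rough sketch using the trigonometric identities from the first post; `results` and `record!` are hypothetical names introduced for illustration, and the trade-off is that a failure no longer reports which sample or expression caused it.

```julia
using Test

# Hypothetical bookkeeping: outcomes of every check, grouped by label.
results = Dict{String,Vector{Bool}}()
record!(label, ok) = push!(get!(() -> Bool[], results, label), ok)

N = 100  # number of random samples (arbitrary choice)

for x in rand(N)
    # set-up code for the first group of checks would go here
    record!("definitions", cos(x) ≈ (exp(x*im) + exp(-x*im)) / 2)
    # further set-up code, free to reuse the variables defined above
    record!("basic identities", tan(x) ≈ sin(x) / cos(x))
    record!("higher identities", sec(x)^2 ≈ tan(x)^2 + 1)
end

# One test set per label, created once rather than once per sample.
@testset "$label" for label in sort!(collect(keys(results)))
    @test all(results[label])
end
```

Each label then appears once in the test summary, with a single pass or fail covering all samples.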

Not quite true: the definitions are hidden, that’s correct, but the ordering is unimportant. The semicolon at the start of the tuple definition means that this is a NamedTuple, and UnPack matches the fields of a NamedTuple by name, so the order in which you list them does not matter:

using UnPack

function f()
    a = 1
    b = 2
    (; a, b)
end

@unpack b, a = f()
julia> a
1

julia> b
2

I actually do not see code repetition in my example: the variables are different because they depend on σ₀. So either you put everything in one huge for block and reuse variables across different tests, or you calculate everything inside each loop.

I think the main problem is functions that accept a huge list of variables. It’s hard to find a simple syntax for them anyway.
