and I am not arriving at the same conclusion. Here’s the quick and dirty code I came up with to test it:
using StatsBase
function row_wise()
    boxes = zeros(3, 5)
    # place two prizes at random (two independent draws)
    for _ in 1:2
        boxes[rand(1:3), rand(1:5)] = 1.0
    end
    # search row by row and return the step at which the first prize turns up
    counter = 0
    for i in 1:size(boxes, 1), j in 1:size(boxes, 2)
        counter += 1
        if boxes[i, j] == 1.0
            return counter
        end
    end
end
row_results = summarystats([row_wise() for _ in 1:5000])
function col_wise()
    boxes = zeros(3, 5)
    # place two prizes at random (two independent draws)
    for _ in 1:2
        boxes[rand(1:3), rand(1:5)] = 1.0
    end
    # same search, but column by column
    counter = 0
    for j in 1:size(boxes, 2), i in 1:size(boxes, 1)
        counter += 1
        if boxes[i, j] == 1.0
            return counter
        end
    end
end
col_results = summarystats([col_wise() for _ in 1:5000])
I get the exact same distribution for both the row-wise search and the column-wise search. It takes 5.5 steps on average to find the first prize. What am I missing?
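One caveat worth checking with a minimal sketch: the two independent draws above can land on the same box, leaving a single prize. The chance of that is (1/3)·(1/5) = 1/15 ≈ 0.067:

using Statistics

# estimate how often the two independent prize draws collide
function collides()
    first_prize = (rand(1:3), rand(1:5))
    second_prize = (rand(1:3), rand(1:5))
    return first_prize == second_prize
end

mean(collides() for _ in 1:100_000)  # ≈ 0.067

A single prize takes longer to find on average, which is consistent with the drop from 5.5 to about 5.3 once two distinct prizes are guaranteed, as in the follow-up below.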
function generate_prizes()
    prize_positions = Set()
    while length(prize_positions) < 2
        push!(prize_positions, (rand(1:3), rand(1:5)))
    end
    return prize_positions
end
function row_wise()
    prize_positions = generate_prizes()
    counter = 0
    for i in 1:3, j in 1:5
        counter += 1
        if (i, j) in prize_positions
            return counter
        end
    end
end
row_results = summarystats([row_wise() for _ in 1:10000])
function col_wise()
    prize_positions = generate_prizes()
    counter = 0
    for j in 1:5, i in 1:3
        counter += 1
        if (i, j) in prize_positions
            return counter
        end
    end
end
col_results = summarystats([col_wise() for _ in 1:10000])
I still get the same distribution for both search methods, but now the average number of steps has decreased slightly (as expected) to about 5.3.
Interesting problem …
Your program seems to just calculate the average number of steps each search needs individually (and independently). Instead, the two search orders need to be compared on each (fixed) prize placement in order to determine who would win head-to-head.
My code gets the opposite result, though:
function newbox()
    b = fill(false, 5, 3)
    # Drawing two indices in order to ensure that there are exactly two boxes
    # (your code might end up with one box)
    b[rand(eachindex(b), 2)] .= true
    b
end

# number of steps until the first prize in linear (column-major) order
numsteps = Base.Fix1(findfirst, identity) ∘ vec

# Julia is column-major, so vec(b') is row-wise
compete(b) = (; andrew = numsteps(b'), barbara = numsteps(b))
julia> tournament = compete.(newbox() for _ in 1:100_000);
# Probability of Andrew winning
julia> sum(res -> res.andrew < res.barbara, tournament) / length(tournament)
0.37276
# Probability of tie
julia> sum(res -> res.andrew == res.barbara, tournament) / length(tournament)
0.21727
# Probability of Barbara winning
julia> sum(res -> res.andrew > res.barbara, tournament) / length(tournament)
0.40997
I should really just find a computer, but doesn’t rand(eachindex(b), 2) still have the same problem? I would have thought you’d need sample from StatsBase to draw without replacement.
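It does: rand(v, n) draws with replacement, so rand(eachindex(b), 2) can return the same index twice, leaving a single prize. A minimal sketch of the difference, contrasting it with sample from StatsBase:

using StatsBase  # provides `sample` and re-exports `mean`

# rand draws with replacement: the two indices occasionally coincide
mean(allunique(rand(1:15, 2)) for _ in 1:100_000)                   # ≈ 14/15 ≈ 0.933

# sample with replace=false always yields two distinct indices
mean(allunique(sample(1:15, 2; replace=false)) for _ in 1:100_000)  # exactly 1.0

So roughly 1 in 15 of the boards produced by newbox above contains only one prize.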
I’m similarly seeing the slight edge to row-wise, encoding 0 as a row-wise win, 1 as a column-wise win, and 0.5 as a tie (so a mean below 0.5 favors row-wise). It’s fascinating to see how changing the dimensions changes these results.
julia> using Statistics, Random, StatsBase
julia> function firstwinner(boxes)
           bycol = CartesianIndices(boxes)
           byrow = permutedims(CartesianIndices(boxes))
           for i in 1:length(boxes)
               boxes[bycol[i]] && boxes[byrow[i]] && return 0.5 # tie
               boxes[bycol[i]] && return 1.0 # column-wise
               boxes[byrow[i]] && return 0.0 # row-wise
           end
       end
firstwinner (generic function with 1 method)

julia> function meanwinner(n)
           boxes = falses(3, 5)
           mean(begin
               boxes .= false
               boxes[sample(begin:end, 2, replace=false)] .= true
               firstwinner(boxes)
           end for i in 1:n)
       end
meanwinner (generic function with 1 method)
julia> meanwinner(100000)
0.48151
julia> meanwinner(1000000)
0.480625
julia> meanwinner(10000000)
0.48096015
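Since firstwinner above already works for any array shape, a hypothetical three-argument variant of meanwinner (a sketch, not part of the post above) makes the dimension effect easy to explore:

# Sketch: the same experiment with the grid dimensions as parameters
function meanwinner(nrows, ncols, n)
    boxes = falses(nrows, ncols)
    mean(begin
        boxes .= false
        boxes[sample(1:length(boxes), 2, replace=false)] .= true
        firstwinner(boxes)
    end for _ in 1:n)
end

meanwinner(3, 5, 1_000_000)  # should reproduce the ≈ 0.48 above
meanwinner(4, 5, 1_000_000)  # the edge can move when the shape changes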
I tried to use sample to generate the prize positions and my results now align with those of @bertschi:
function generate_prizes()
    all_positions = [(i, j) for i in 1:3, j in 1:5]
    prize_positions = sample(all_positions, 2; replace=false)
    return prize_positions
end
function competition()
    prize_positions = generate_prizes()
    andrew_count = 0
    barbara_count = 0
    # Andrew searches row by row
    found_prize = false
    for i in 1:3
        for j in 1:5
            andrew_count += 1
            if (i, j) in prize_positions
                found_prize = true
                break
            end
        end
        if found_prize
            break
        end
    end
    # Barbara searches column by column on the same placement
    found_prize = false
    for j in 1:5
        for i in 1:3
            barbara_count += 1
            if (i, j) in prize_positions
                found_prize = true
                break
            end
        end
        if found_prize
            break
        end
    end
    if andrew_count == barbara_count
        return "tie"
    elseif andrew_count < barbara_count
        return "andrew"
    else
        return "barbara"
    end
end
results = [competition() for _ in 1:100_000]
countmap(results)
Dict{String, Int64} with 3 entries:
  "barbara" => 37118
  "andrew"  => 41036
  "tie"     => 21846
That’s bizarre. I think I see it now, though. If you look at the picture in the article: wouldn’t it be true that, after the first box, every 3rd box that Barbara gets to is one that Andrew has already opened, while for Andrew only every 5th box is one that Barbara has already opened? So Andrew reaches more boxes first and therefore has a higher probability of being first to find a prize…?
I wonder what the analytical solution to this looks like…
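For a grid this small you don’t have to guess: with 15 boxes there are only binomial(15, 2) = 105 equally likely prize placements, so the head-to-head probabilities can be enumerated exactly. A minimal sketch (exact_probs is a hypothetical helper, not from any post above):

# Enumerate all binomial(nrows*ncols, 2) placements and compare the two
# search orders head-to-head on each one.
function exact_probs(nrows = 3, ncols = 5)
    andrew_pos(i, j) = (i - 1) * ncols + j    # Andrew's step for box (i, j), row-wise
    barbara_pos(i, j) = (j - 1) * nrows + i   # Barbara's step, column-wise
    boxes = [(i, j) for i in 1:nrows for j in 1:ncols]
    wins = Dict("andrew" => 0, "barbara" => 0, "tie" => 0)
    for p in 1:length(boxes), q in (p + 1):length(boxes)
        a = min(andrew_pos(boxes[p]...), andrew_pos(boxes[q]...))
        b = min(barbara_pos(boxes[p]...), barbara_pos(boxes[q]...))
        wins[a < b ? "andrew" : a > b ? "barbara" : "tie"] += 1
    end
    total = binomial(nrows * ncols, 2)
    return Dict(k => v / total for (k, v) in wins)
end

exact_probs()      # should land right on the simulated 0.41 / 0.37 / 0.22 above
exact_probs(4, 5)  # and settles the 4×5 case mentioned below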
That seems compelling, but it can’t be the whole story: if there’s only one prize there’s no edge anymore, and if the grid is 4×5 the edge goes to column-wise. It’s got to be about the asymmetry of the linearization of the upper/lower triangular regions, as the Guardian discusses.
well, the flippant answer is that you’re missing that these two distributions are not independent (even though the expected hitting time is the same)
this feels to me like the same category of counterintuitive result as the intransitive dice
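To make the dependence concrete, here is a minimal sketch (hitting_times is a hypothetical helper, not from any post above) that measures both search orders on the same random placement: the marginal means agree, but the two hitting times are correlated, so head-to-head comparisons don’t follow from the marginals alone.

using Statistics, StatsBase

# both searchers' hitting times on one shared prize placement
function hitting_times()
    boxes = [(i, j) for i in 1:3 for j in 1:5]
    p, q = sample(boxes, 2; replace=false)
    andrew(i, j) = 5 * (i - 1) + j    # row-wise step count for box (i, j)
    barbara(i, j) = 3 * (j - 1) + i   # column-wise step count
    return (min(andrew(p...), andrew(q...)),
            min(barbara(p...), barbara(q...)))
end

pairs = [hitting_times() for _ in 1:100_000]
mean(first.(pairs)), mean(last.(pairs))  # same marginal mean (≈ 5.3, as above)
cor(first.(pairs), last.(pairs))         # nonzero, so the times are dependent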