The algorithm can be simplified considerably if we use the fact that the constraints generated by a guess and a target word, that is, the pattern of colors on the tiles, can be represented as a 5-digit base-3 number: 0 for grey, 1 for yellow and 2 for green, with the first tile as the least significant digit. For example,
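function constraints(guess, actual)
    val = 0
    mult = 1    # place value of the current tile: 1, 3, 9, 27, 81
    for (g, a) in zip(guess, actual)
        # 2 = right letter in the right place, 1 = letter occurs in the target, 0 = neither
        val += (g == a ? 2 : (g ∈ actual ? 1 : 0)) * mult
        mult *= 3
    end
    return val
end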
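To make the encoding concrete, here is a minimal sketch of going the other way, from a pattern number back to the tiles. The names tilecodes and tiles are hypothetical helpers introduced only for illustration; they rely on Base's digits function, which returns the least significant digit first and therefore matches the order in which constraints accumulates the tiles.

```julia
# Hypothetical helpers (not part of the original code): decode a pattern
# number in 0:242 into per-tile codes, first tile first.
tilecodes(val) = digits(val; base=3, pad=5)

# Render the codes as characters: '.' = grey (0), 'y' = yellow (1), 'G' = green (2)
tiles(val) = join(('.', 'y', 'G')[d + 1] for d in tilecodes(val))

tilecodes(1)    # [1, 0, 0, 0, 0]: a yellow tile followed by four grey tiles
tiles(242)      # "GGGGG": every tile green, so the guess is the target
```

Because the first tile is the least significant digit, pattern number 1 is a single yellow tile in the first position, and 242 = 2 * (1 + 3 + 9 + 27 + 81) is the all-green pattern.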
Then we just need to keep track of the number of words that generate each of these numbers between 0 and 242.
For a given initial guess we can update an integer vector counts of length 243 as
function ccounts!(counts, guess, words)
    fill!(counts, 0)
    for actual in words
        # patterns run from 0 to 242; add 1 for Julia's 1-based indexing
        counts[constraints(guess, actual) + 1] += 1
    end
    return counts
end
So, for example,
julia> show(ccounts!(counts, SVector{5,UInt8}("raise"...), words))
[168, 103, 10, 92, 78, 4, 91, 26, 9, 107, 23, 4, 34, 18, 1, 14, 2, 6, 51, 28, 1, 12, 4, 0, 6, 4, 1, 80, 24, 1, 43, 21, 0, 20, 2, 1, 21, 4, 1, 7, 1, 0, 5, 0, 0, 29, 5, 0, 0, 0, 0, 1, 0, 0, 17, 13, 1, 22, 8, 1, 7, 2, 0, 6, 1, 0, 1, 0, 0, 0, 0, 0, 9, 5, 0, 1, 0, 0, 2, 0, 0, 121, 102, 20, 69, 34, 13, 20, 28, 4, 35, 26, 8, 3, 1, 0, 0, 0, 0, 15, 12, 1, 1, 0, 0, 0, 0, 0, 41, 18, 2, 12, 4, 0, 1, 2, 0, 4, 4, 3, 1, 0, 0, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 9, 7, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 61, 40, 6, 41, 26, 0, 26, 4, 1, 25, 3, 2, 2, 1, 0, 0, 0, 0, 23, 17, 0, 5, 1, 0, 3, 0, 0, 17, 10, 0, 20, 5, 0, 9, 0, 0, 5, 0, 0, 1, 0, 0, 0, 0, 0, 15, 2, 0, 1, 0, 0, 0, 0, 0, 20, 8, 2, 8, 2, 0, 5, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 1]
This means that there are 168 words that would generate all grey tiles if your initial guess is “raise”, 103 that would give you a yellow tile followed by 4 grey tiles, etc. Now here is the subtle point: for a given initial guess, a word matches a constraint pattern if and only if it generates that pattern. That is, each entry of the counts vector is also the size of the pool that remains after the first guess when that pattern is observed. If every word is equally likely to be the target, a pattern with count c occurs with probability c/sum(counts) and leaves a pool of size c, so the average size of the pool after the first guess of “raise” is
julia> sum(abs2, counts)/sum(counts)
61.00086393088553
which corresponds to the earlier result.
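The averaging step can be checked by hand on a small, made-up counts vector. The sketch below assumes every word is equally likely to be the target; the name expectedpoolsize and the toy vector are hypothetical and are not taken from the word list used above.

```julia
# Hypothetical helper: average pool size after a guess, given the number of
# words that generate each pattern (assumes a uniformly random target).
expectedpoolsize(counts) = sum(abs2, counts) / sum(counts)

# Toy example: 6 words split into pools of sizes 3, 1, 0 and 2.
toy = [3, 1, 0, 2]

# The same quantity written as an explicit weighted average: a pattern with
# count c occurs with probability c/n and leaves a pool of size c.
n = sum(toy)
weighted = sum(c / n * c for c in toy)

expectedpoolsize(toy)   # (9 + 1 + 0 + 4) / 6 = 14/6, about 2.33
```

Large pools are counted twice over: they are more likely to contain the target and they leave more words to sort through, which is why the squared counts appear in the numerator.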