This post is on Hacker News right now – it seems worth digging in to understand why Rust does so much better here given that the code seem like a fairly naive port without much optimization effort:
Isn’t the difference that the Julia version uses collect(::String)
to get a list of Char
while Rust just uses a view?
Yeah. Changing match_constraint
to search over the string makes it 5x faster.
function match_constraint(word::String, constraint::Constraint)::Bool
if constraint.type == has_no
constraint.letter ∉ word
elseif constraint.type == has_but_not_at
(constraint.letter ∈ word) && (word[constraint.index] != constraint.letter)
elseif constraint.type == has_at
word[constraint.index] == constraint.letter
end
end
How are you timing it ? Waiting for the first report of minutes left ?
Yeah. It’s not exact, but it’s good enough for a simple comparison.
- 5x faster is still 10x slower than the Rust code, so there’s probably something else going on.
- Is there a way to reduce the chance people unnecessarily call
collect
?
It’d be great if someone could check independently as I only spent 5 minutes and didn’t understand the code deeply enough to check whether my changes didn’t screw anything up, but on my machine I got:
Ranking guesses... 3/12972 words (8575 min left) # Original code
Ranking guesses... 10/12972 words (2124 min left) # Removing collect
Ranking guesses... 69/23629 words (160 min left) # Using String7
I made the following changes:
words_str7 = CSV.File("solutions"; header = false).Column1
non_solution_words_str7 = CSV.File("non-solution-guesses"; header = false).Column1
Then change all String
to AbstractString
and Vector{String}
to Vector{<:AbstractString}
, and finally in main()
change the first four lines to:
global words_str7; global allowed_guesses_str7
remaining_words = copy(words_str7) # List of words that currently fit all known constraints.
while length(remaining_words) > 1
best_guesses = rank_guesses(allowed_guesses_str7, remaining_words)[1:10]
to use the String7
versions of the word lists.
I might be missing something but isn’t this a game which is constrained to exactly five ascii letters? The first thing I would do if I cared about performance is to get rid of all the strings and work with static vectors of length 5.
Another thing I might be missing due to ignorance, but…
words = readlines("solutions")
non_solution_words = readlines("non-solution-guesses")
allowed_guesses = append!(words, non_solution_words)
…are words
and allowed_guesses
really meant to be the same array?
The rust code does
let non_solution_guesses_content = fs::read_to_string("non-solution-guesses")
.expect("Failed to read solutions file");
let non_solution_guesses: Vec<&str> = non_solution_guesses_content.lines().collect();
let allowed_guesses = [&solutions[..], &non_solution_guesses[..]].concat();
which I imagine would be equivalent to changing the Julia code to
allowed_guesses = vcat(words, non_solution_words)
This change alone makes the code 30 times faster.
This seems to be too good to be true. I tried to reproduce this using InlineStrings
and failed. Here is my current best, thanks to @Oscar_Smith, @GunnarFarneback.
function _readline(path)::Vector{Char}
Vector{Char}(readline(path))
end
function _readlines(path)::Vector{Vector{Char}}
words = Vector{Vector{Char}}()
for word in readlines(path)
push!(words, Vector{Char}(word))
end
println(length(words))
words
end
const words = _readlines("solutions.txt")
const non_solution_words = _readlines("non-solution-guesses.txt")
const allowed_guesses = append!(words, non_solution_words)
@enum ConstraintType has_no has_but_not_at has_at
struct Constraint
type::ConstraintType
letter::Char
index::Int
end
function isequal(c0::ConstraintType, c1::ConstraintType)::Bool
c0.type == c1.type && c0.letter == c1.letter && (c0.type == has_no || c0.index == c1.index)
end
function hash(c::ConstraintType, h::UInt = 0)::UInt
hash(c.type, h) ⊻ hash(c.letter, h ⊻ 1) ⊻ hash(c.index, h ⊻ 2)
end
function constraints(guess::Vector{Char}, actual::Vector{Char})::Vector{Constraint}
c = Vector{Constraint}()
actual_letters = collect(actual)
for (i, g, a) in zip(1:5, guess, actual)
if g == a
push!(c, Constraint(has_at, g, i))
elseif g ∈ actual_letters
push!(c, Constraint(has_but_not_at, g, i))
else
push!(c, Constraint(has_no, g, i))
end
end
c
end
# Code: . = has_no, x = has_but_not_at, o = has_at
function parse_constraints(template::Vector{Char}, guess::Vector{Char})::Vector{Constraint}
constraints = Vector{Constraint}()
for (i, c) in zip(1:5, template)
if c == 'x'
push!(constraints, Constraint(has_but_not_at, guess[i], i))
elseif c == 'o'
push!(constraints, Constraint(has_at, guess[i], i))
else
push!(constraints, Constraint(has_no, guess[i], i))
end
end
constraints
end
function match_constraint(word::Vector{Char}, constraint::Constraint)::Bool
if constraint.type == has_no
constraint.letter ∉ word
elseif constraint.type == has_but_not_at
(word[constraint.index] != constraint.letter) && (constraint.letter ∈ word)
elseif constraint.type == has_at
word[constraint.index] == constraint.letter
else
@assert false
end
end
function match_constraints(word::Vector{Char}, constraints::Vector{Constraint})::Bool
for c in constraints
if !match_constraint(word, c)
return false
end
end
return true
#all(map(c -> match_constraint(word, c), constraints))
end
memoize_remaining_for_constraint = Dict{Constraint,Vector{Vector{Char}}}()
function remaining_for_constraint(words::Vector{Vector{Char}}, constraint::Constraint)::Vector{Vector{Char}}
global memoize_remaining_for_constraint
if !haskey(memoize_remaining_for_constraint, constraint)
memoize_remaining_for_constraint[constraint] = Vector(filter(w -> match_constraint(w, constraint), words))
end
return ∩(memoize_remaining_for_constraint[constraint], words)
end
# Number of possible words left after performing a given guess against an actual
# word of the day, and receiving the corresponding constraint information.
function remaining(guess::Vector{Char}, actual::Vector{Char}, words::Vector{Vector{Char}})::Float64
cs = constraints(guess, actual)
rem = 0
for w in words
rem += match_constraints(w, cs) ? 1 : 0
end
rem
#return foldl(function(a::Float64, w::String)::Float64
# return a + (match_constraints(w, cs) ? 1 : 0)
#end, words, init = 0)
#length(reduce(∩, map(constraint -> remaining_for_constraint(words, constraint), cs)))
end
# Average number of possible words left after performing a given guess,
# and receiving the corresponding constraint information.
function avg_remaining(guess::Vector{Char}, words::Vector{Vector{Char}})::Float64
rem = 0
for w in words
rem += remaining(guess, w, words)
end
rem / length(words)
end
struct Choice
word::Vector{Char}
avg_remaining::Float64
end
function rank_guesses(guesses::Vector{Vector{Char}}, words::Vector{Vector{Char}})::Vector{Choice}
wt = length(guesses)
avg_time = 0
choices = Vector{Choice}(undef, wt)
for (wi, g) in zip(0:wt-1, guesses)
print("\x1b[1K\x1b[GRanking guesses... ", wi, "/", wt, " words (", Int(ceil(avg_time*(wt-wi)/60)), " min left)")
avg_time = (wi*avg_time + @elapsed choices[wi+1] = Choice(g, avg_remaining(g, words))) / (wi+1)
end
#time = @elapsed choices = map(function(g::Vector{Char})
# print("\x1b[1K\x1b[GRanking guesses... ", wi, "/", wt, " words (", Int(ceil(time*(wt-wi)/60)), " min left)")
# wi += 1
# Choice(g, avg_remaining(g, words))
#end, guesses)
print("\x1b[1K\x1b[G")
sort!(choices, by = c -> c.avg_remaining)
end
function main()
global words; global allowed_guesses
remaining_words = copy(words) # List of words that currently fit all known constraints.
while length(remaining_words) > 1
best_guesses = rank_guesses(allowed_guesses, remaining_words)[1:10]
for (i, g) in zip(1:10, best_guesses)
println(i, ". ", g.word, " (keeps ", g.avg_remaining, " words on average)")
end
println("Insert your guess: ")
guess = _readline(stdin)
println("Insert the results (o = letter at right spot, x = wrong spot, . = not in the word): ")
constraint_template = _readline(stdin)
remaining_words = filter(w -> match_constraints(w, parse_constraints(constraint_template, guess)), remaining_words)
end
println("Solution: ", remaining_words[1], ".")
end
showing something like
julia> main()
Ranking guesses... 3/12972 words (531 min left)
I needed to use JET.jl
to eliminate some dynamic dispatch and used Juno profiler to understand the bottlenecks. AFAIU we should be in the ballpark of Rust with this.
Regarding the conclusion of the blog post
But more importantly, this is not about the language itself. It is about the quality of the compiler’s optimizers.
I have to disagree, but on the other hand optimizing Julia codes is still quite an adventure for me.
But what is really nice about Julia: you can do something like
# Average number of possible words left after performing a given guess,
# and receiving the corresponding constraint information.
function avg_remaining(guess::Vector{Char}, words::Vector{Vector{Char}})::Float64
rem = ThreadsX.sum(words) do wo
cs = constraints(guess, wo)
sum(words) do wi
all(cs) do c
match_constraint(wi, c)
end ? 1 : 0
end
end
rem / length(words)
end
yielding some speedup:
julia> main()
Ranking guesses... 12/12972 words (111 min left)
I haven’t looked too closely at the code, but doesn’t the usual admonition against non-constant globals apply here to words
and allowed_guesses
?
Your code with @GunnarFarneback’s vcat
is actually 25% faster than Rust’s code:
GRanking guesses... 100/12972 words (16 min left) # Julia
GRanking guesses... 100/12972 words (20 min left) # Rust
I hope someone replies to that post and clears such claims about Julia by seemingly inexperienced programmers. That said, I still second the suggestion by @GunnarFarneback to use an SVector
and redesign the whole code if one desires speed.
This is crazy, why are you using vectors of Char
to work with strings?
That makes no sense. It is possible to write slow code in any language and no compiler can save you. In fact, most of the compiler is shared between Julia and Rust (LLVM).
I’ve implemented the following optimizations, ranked by the improvement they make:
- (This is a bugfix, rather) Concatenation instead of destructive extension of
words
:allowed_guesses = [words; non_solution_words]
(4700 min → 150 min) - Using
NTuple{5, UInt8}
instead ofString
to store and rank words (thanks to @GunnarFarneback for the idea) (150 min → 20 min) - Using
Int8
forindex
in theConstraint
struct (20 min → 15 min) - Removing non-
const
globals (no noticeable effect)
Github gist:
https://gist.github.com/vvpisarev/9da868c876ae01ed801ce97d7440218f
The offered choices seem to match those on the author’s website, so it looks like concatenation is what the author did have in mind writing the append!
.
In this case, it might make sense because instead of working with arbitrary strings you always have exactly 5 characters and you might want to be able to mutate the current answer to get a new trial, instead of allocating a bunch of different strings.
That part makes sense, but Vector{UInt8}
is 4x smaller.
I think the real morale of the story is to extend the title of the blog post,
Sometimes, rewriting in another language works
with
…because you might not make the same bug in both implementations.
I’m not so sure about all the optimizations proposed for the String
s. I don’t know Rust, so I looked up what Rust’s strings are like:
There are two types of strings in Rust:
String
and&str
.
AString
is stored as a vector of bytes (Vec<u8>
), but guaranteed to always be a > valid UTF-8 sequence.String
is heap allocated, growable and not null terminated.
&str
is a slice (&[u8]
) that always points to a valid UTF-8 sequence, and can be used to view into aString
, just like&[T]
is a view intoVec<T>
.
The Rust version seems to mainly work with &str
but also makes String
s in the guessing game and its rank_guesses
call (Choice
contains a String
). I’m not exactly sure what the Julia equivalents would be (the views could be SubString
or StringViews.jl
?), but it wouldn’t seem fair to only use stack-allocated or non-UTF8 structures when comparing to the Rust version. The original post fully admits its algorithm is suboptimal, so there is a lot of room for optimization in both languages’ versions that will have little to do with the comparison.
Of course, easy switches to simpler and faster data structures isn’t something to normally overlook and could be showed off in tandem to a faithful comparison. Also, Julia’s parametric types and multiple dispatch could allow reusing code for words of different lengths, which could be a Wordle option in the future.