# Is there an idiomatic way to iterate over all binary strings of a given length?

I would like to iterate over all binary strings of length n, for some smallish integer n. Is there an idiomatic Julia way to do that?

1 Like
``````for i = 0:2^(n+1)-1
s = bitstring(i)
println(s[length(s)-n:end])
end
``````

There might be a nicer way to get the last `n` characters of a string, but this was quick and dirty 4 Likes

You don’t need to interpolate here. Just do

``````println(s[end-n:end])
``````
3 Likes

Don’t you need to start from 0?

``````function bb(n)
for i = 0:2^n-1
s = bitstring(i)
println(s[end-n+1:end])
end
end
``````

and then

``````julia> bb(3)
000
001
010
011
100
101
110
111
``````
2 Likes

`last(somestring, n)`

4 Likes

Thank you for this. I wonder if there is a lighter weight solution if you need to iterate over a large range (say n = 30), quickly.

You may try to implement iterator which goes through Gray code, I think it’s as fast and lightweight as it can be. There is no dedicated package, but here is snippet from rosetta code: https://github.com/acmeism/RosettaCodeData/blob/master/Task/Gray-code/Julia/gray-code.julia

1 Like

That would iterate over integers not strings, right? Or were you suggesting I write a new version which iterates over strings?

Sorry, my comment can be misleading. Regarding your question, it actually depends on what type of object you are working with. If it’s `String` than I do not think that it is possible to do anything faster than solutions shown above (increase some integer with the following conversion to String). If on the other hand it’s possible to use `BitArray` or `Vector{UInt}` than you can write your own implementation of the Gray Code and mutate corresponding vector inplace.

Why do you need to use strings? What are you trying to do?

It’s never going to be super fast if you use `String`, because you need to allocate a new string on each iteration.

1 Like

I guess if I update the strings in Gray code order then I only need to change one entry at a time which could be very fast.

I need strings as I am computing various distances between them. For example the Levenshtein distance.

`String` is immutable in Julia, so it can’t be updated in-place. Of course, you could define your own string type, but a `Vector{Bool}` would be a much more natural choice here than a string.

It would be easy (and probably more efficient) to implement these distance metrics using `Vector{Bool}`.

As a side note, you still may use package like `StringDistances` with your own type of String, due to the multiple dispatch, for example

``````using StringDistances

struct MyString <: AbstractString
v::Vector{Bool}
end

Base.:ncodeunits(s::MyString) = length(s.v)
Base.:isvalid(s::MyString, i::Int) = true
Base.:iterate(s::MyString, i::Int) = iterate(s.v, i)

s1 = MyString([true, true, false])
s2 = MyString([false, false, false])
evaluate(Levenshtein(), s1, s2) # 2
``````
3 Likes

Is this right? I’m not familiar with this functionality, but according to the docs:

If `isvalid(s, i)` is true then `s[i]` will return the character whose encoding starts at that index.

1 Like

It depends on what we should consider character in this case. If character is `Bool`, than this definition is correct. If character is group of 8 `Bool` values, than things will be more complicated of course. I suppose then instead of `Vector{Bool}` one should use `Vector{UInt8}`.

I don’t get it. Why isn’t it

``````Base.isvalid(s::MyString, i::Int) = 1 <= i <= length(s.v)
``````

for example?

Ah, yes, that’s better of course.
My definition works, because `length` check basically happens inside `iterate` (and this function is mainly used in `StringDistances` algorithms), but in other cases it will fail.