How do I get readdlm to return a Matrix of strings?

I am reading a list of English words into my program. I then want to make sure the words are lowercase, so I process the list with lowercase() one word at a time and push each result into another list. I seem to run into problems when the original list contains a word that Julia treats specially, such as “false”.

Could you show us a minimal working example as well as the error that you are encountering? Calling lowercase("false") seems to work just fine.

julia> lowercase("false")
"false"

Expanding to an array of strings seems to work fine as well:

julia> keywords = ["Struct", "MUTABLE", "False", "TRUE", "1", "Symbol"]
6-element Vector{String}:
 "Struct"
 "MUTABLE"
 "False"
 "TRUE"
 "1"
 "Symbol"

julia> lowercase.(keywords)
6-element Vector{String}:
 "struct"
 "mutable"
 "false"
 "true"
 "1"
 "symbol"

It may be the way I am reading the words in:

julia> using DelimitedFiles

julia> words = readdlm("test.txt")
3×1 Matrix{Any}:
      "Hello"
 false
      "Goodbye"

julia> for word in words
           push!(data2, lowercase(word))
       end
ERROR: MethodError: no method matching lowercase(::Bool)
Closest candidates are:
  lowercase(::T) where T<:AbstractChar at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/strings/unicode.jl:249
  lowercase(::AbstractString) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/strings/unicode.jl:540
Stacktrace:
 [1] top-level scope
   @ ./REPL[11]:2

Please quote code with backticks (```) so it’s easier to read.

The function readdlm allows you to pass a type, so I would try this:

words = readdlm("test.txt", String)

mtelm85’s answer is correct.

julia> open("test.txt", "w") do io
           println(io, "Hello")
           println(io, "false")
           println(io, "Goodbye")
       end

julia> readdlm("test.txt")
3×1 Matrix{Any}:
      "Hello"
 false
      "Goodbye"

julia> readdlm("test.txt", String)
3×1 Matrix{String}:
 "Hello"
 "false"
 "Goodbye"

julia> lowercase.(ans)
3×1 Matrix{String}:
 "hello"
 "false"
 "goodbye"

Mark mtelm85’s answer as the solution… I’m just illustrating it.

I’m new to Julia, so if I may ask: for my purpose, is readdlm the appropriate function just to read in an array of strings? Also, can I do the lowercase operation in place on the original array?

I would have probably done the following given that your words are on separate lines:

julia> function get_lowercase_lines(io = open("test.txt"))
           lowercase_lines = String[]
           for line in eachline(io)
               push!(lowercase_lines, lowercase(line))
           end
           return lowercase_lines
       end
get_lowercase_lines (generic function with 2 methods)

julia> get_lowercase_lines()
3-element Vector{String}:
 "hello"
 "false"
 "goodbye"

eachline is a lazy iterator: it yields one line at a time rather than first reading the whole file into a String array.
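On the in-place part of the question, here is a minimal sketch, assuming the words already sit in a Vector{String}. The sample data and the variable names are made up for illustration. Strings in Julia are immutable, so "in place" here means the existing array is reused while each element is replaced by a new lowercase String.

```julia
# Hypothetical sample data standing in for the word list read from the file.
words = ["Hello", "FALSE", "Goodbye"]

# Option 1: broadcast assignment writes the results back into `words`.
words .= lowercase.(words)

# Option 2: map! with the same array as both destination and source.
more_words = ["Struct", "MUTABLE"]
map!(lowercase, more_words, more_words)
```

Either form avoids allocating a second array for the results.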


If you wanted a fully lazy version, consider this version that does not create a String array at all:

julia> function get_lowercase_lines_lazy(io = open("test.txt"))
           return (lowercase(line) for line in eachline(io))
       end
get_lowercase_lines_lazy (generic function with 2 methods)

julia> generator = get_lowercase_lines_lazy()
Base.Generator{Base.EachLine{IOStream}, var"#7#8"}(var"#7#8"(), Base.EachLine{IOStream}(IOStream(<file test.txt>), Base.var"#385#388"(), false))

julia> println.(generator);
hello
false
goodbye
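A possible variant, sketched under the assumption that you have a file path rather than an already-open stream: when eachline is given a filename, it opens the file itself and closes it once iteration finishes, so nothing is left open. The helper name lowercase_lines and the temporary file are only for illustration.

```julia
# Hypothetical helper: takes a path instead of an already-open stream.
# eachline(path) opens the file and closes it when iteration is done.
lowercase_lines(path::AbstractString) = (lowercase(line) for line in eachline(path))

# Demo on a temporary file so the sketch is self-contained.
path, io = mktemp()
write(io, "Hello\nfalse\nGoodbye\n")
close(io)

first_pass = collect(lowercase_lines(path))
second_pass = collect(lowercase_lines(path))  # each call builds a fresh generator

rm(path)  # clean up the temporary file
```

Because every call to the helper builds a new generator over a newly opened file, both passes see all three lines.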

Could you please enlighten us as to why we can do this:

julia> collect(get_lowercase_lines_lazy())
3-element Vector{String}:
 "hello"
 "false"
 "goodbye"

but not collect(generator) ?

Thanks.

collect(generator) should work. You cannot call it twice though.

julia> generator = get_lowercase_lines_lazy()
Base.Generator{Base.EachLine{IOStream}, var"#1#2"}(var"#1#2"(), Base.EachLine{IOStream}(IOStream(<file test.txt>), Base.var"#385#388"(), false))

julia> collect(generator)
3-element Vector{String}:
 "hello"
 "false"
 "goodbye"

julia> collect(generator)
String[]

Sorry if this is a silly question: why is this a one-time use generator when the first form can be collected as many times as we like?

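collect(get_lowercase_lines_lazy()) creates a new generator, with a freshly opened file, every single time. A stored generator wraps one open stream: once eachline has read that stream to the end, a second pass finds nothing left to read.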

If you want to use it repeatedly, try this:

julia> function get_lowercase_lines_lazy(filename::String)
           s = open(filename)
           return (lowercase(line) for line in Base.EachLine(s, ondone=()->seekstart(s)))
       end
get_lowercase_lines_lazy (generic function with 3 methods)

julia> generator = get_lowercase_lines_lazy("test.txt")
Base.Generator{Base.EachLine{IOStream}, var"#3#5"}(var"#3#5"(), Base.EachLine{IOStream}(IOStream(<file test.txt>), var"#4#6"{IOStream}(IOStream(<file test.txt>)), false))

julia> collect(generator)
3-element Vector{String}:
 "hello"
 "false"
 "goodbye"

julia> collect(generator)
3-element Vector{String}:
 "hello"
 "false"
 "goodbye"
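The one-shot behaviour can also be reproduced without a file. This is a minimal sketch that uses an in-memory IOBuffer as a stand-in for test.txt (an assumption made only to keep the example self-contained): the generator holds on to a single stream, so it is the stream position that decides what a pass can read.

```julia
# In-memory stand-in for the file, so the sketch needs no test.txt.
io = IOBuffer("Hello\nfalse\nGoodbye\n")

generator = (lowercase(line) for line in eachline(io))

first_pass = collect(generator)    # reads the stream through to the end
second_pass = collect(generator)   # the stream is already at end-of-file

seekstart(io)                      # rewind the underlying stream
third_pass = collect(generator)    # the same generator yields the lines again
```

Rewinding by hand with seekstart has the same effect as the ondone callback in the version that takes a filename.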