I want to replace some occurrences of a pattern in a file and save the result to a new file. The following does what I need, but it reads the entire file into memory before making the substitution.
Is there a better way to do this?
    some_variable = "Oranges"
    open("some_file", "r") do file
        global data = read(file, String)
    end
    data = replace(data, r"some text (?i)" => "$some_variable")
    open("another_file", "w") do io
        write(io, data)
    end
I’m using Julia 1.6.5 under Windows 7.
I think you should be able to nest open calls:
    open("in.txt", "r") do io_in
        open("out.txt", "w") do io_out
            # copy one line from the input to the output
            println(io_out, readline(io_in))
        end
    end
You can also use open without a do block, but then you have to close the streams yourself:
    io_in = open("in.txt", "r")
    io_out = open("out.txt", "w")
    # do stuff
    close(io_in)
    close(io_out)
The implementation when using a function as a first argument or a do block looks like this:
    function open(f::Function, args...; kwargs...)
        io = open(args...; kwargs...)
        try
            f(io)
        finally
            close(io)
        end
    end
However, I am not sure if there is an implementation of replace that acts on data streams.
Assuming that you can do your replacements on a line by line basis, something like this should work:
    open("another_file", "w") do out
        for line in eachline("some_file")
            println(out, replace(line, r"some text (?i)" => "$some_variable"))
        end
    end
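One detail worth checking with the line-by-line approach: eachline strips the line terminator by default and println always writes "\n", so Windows line endings and a missing final newline are not preserved. A minimal sketch of a variant that keeps the original terminators, assuming a literal pattern; the function name replace_lines is hypothetical and not from the thread:

```julia
# Hypothetical helper: copy `input` to `output` line by line, applying one
# replacement per line and keeping the original line terminators.
function replace_lines(input::IO, output::IO, old_new::Pair)
    # keep = true leaves "\n" or "\r\n" attached to each line
    for line in eachline(input; keep = true)
        # print (not println), since the terminator is already part of `line`
        print(output, replace(line, old_new))
    end
    return nothing
end

# Usage with in-memory streams; open files work the same way.
input = IOBuffer("some text one\r\nsome text two")
output = IOBuffer()
replace_lines(input, output, "some text" => "Oranges")
lines_result = String(take!(output))
```

With real files, the two IOBuffer objects would be replaced by the streams from nested open calls.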
If you have very large files it’s probably better to read larger blocks than a line at a time, but you need to arrange it so the things you replace are not split between blocks.
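One possible way to arrange that is to hold back the last few bytes of each block until the next block arrives. The sketch below assumes a literal, non-empty pattern (not a regex), works on raw bytes, and relies on the byte-vector method of findnext, which needs Julia 1.6 or later. The name replace_stream and its keyword are hypothetical:

```julia
# Hypothetical helper: replace every occurrence of the literal `pat` with `rep`,
# reading `input` in blocks of `block_size` bytes.
function replace_stream(input::IO, output::IO, pat::AbstractString,
                        rep::AbstractString; block_size::Integer = 2^20)
    p = Vector{UInt8}(pat)
    r = Vector{UInt8}(rep)
    n = length(p)
    n > 0 || throw(ArgumentError("pattern must not be empty"))
    carry = UInt8[]                      # bytes held back from the previous block
    while !eof(input)
        buf = vcat(carry, read(input, block_size))
        pos = 1                          # first byte not yet written
        while pos <= length(buf)
            m = findnext(p, buf, pos)    # byte range of the next match, or nothing
            m === nothing && break
            write(output, view(buf, pos:first(m)-1))
            write(output, r)
            pos = last(m) + 1
        end
        # The last n-1 bytes could be the start of a match that continues in
        # the next block, so keep them instead of writing them now.
        keep_from = max(pos, length(buf) - n + 2)
        write(output, view(buf, pos:keep_from-1))
        carry = buf[keep_from:end]
    end
    write(output, carry)                 # nothing can follow, so flush the rest
    return nothing
end

# Usage with a tiny block size so that matches are split between blocks.
input = IOBuffer("some text and some text again, some te")
output = IOBuffer()
replace_stream(input, output, "some text", "Oranges"; block_size = 4)
block_result = String(take!(output))
```

Because the search runs on bytes, the block boundaries never need to line up with UTF-8 character boundaries. A regex pattern would need a different strategy, since the maximum match length is not known in advance.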
Possibly of help, this is some production code of mine which solves the easier task of determining whether two files are identical.
    block_size = 2 ^ 20
    open(file.reference, "r") do reference
        open(file.filename, "r") do file
            while true
                # compare the two files one block at a time
                reference_block = read(reference, block_size)
                file_block = read(file, block_size)
                reference_block == file_block || return false
                # a short block means both files ended without a difference
                length(reference_block) < block_size && return true
            end
        end
    end
I also found BufferedStreams.jl. It might provide some useful functionality, such as anchoring a stream.
Also, if the content you want to replace starts with a constant sequence, readuntil might be useful.
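As a rough illustration of the readuntil idea, the sketch below treats the whole literal pattern as the delimiter, so each read returns everything up to and including the next match. It assumes a literal pattern, and the name replace_with_readuntil is hypothetical:

```julia
# Hypothetical helper: stream `input` to `output`, replacing the literal `pat`
# with `rep` by reading up to each occurrence of `pat`.
function replace_with_readuntil(input::IO, output::IO, pat::AbstractString,
                                rep::AbstractString)
    while !eof(input)
        # keep = true includes the delimiter in the result when it was found
        chunk = readuntil(input, pat; keep = true)
        if endswith(chunk, pat)
            # drop the matched pattern from the end and write the replacement
            write(output, chop(chunk; tail = length(pat)), rep)
        else
            # end of stream reached without another match
            write(output, chunk)
        end
    end
    return nothing
end

# Usage with in-memory streams.
input = IOBuffer("a some text b some text c")
output = IOBuffer()
replace_with_readuntil(input, output, "some text", "Oranges")
readuntil_result = String(take!(output))
```

Memory use here depends on the distance between matches rather than on a fixed block size, so a file with no matches at all would still be read in one piece.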
Thank you all for your answers!
So, if I need a line-by-line replacement, something like this would be good?
    some_variable = "Oranges"
    open("some_output_file", "w") do file_out
        open("some_input_file", "r") do file_in
            while !eof(file_in)
                readline(file_in) |> data -> replace(data, r"some text (?i)" => "$some_variable") |> data -> do_other_stuff(data) |> data -> println(file_out, data)
            end
        end
    end
And I can optimize by using either readuntil(data, "prefix") for text data, or by specifying a bigger block size for binary data and passing it to read?
@feanor12 what is “anchoring a stream”?
And one last question: when I try to put the pipe operator at the start of a new line, I get the following error:
    readline(file_in)
    |> data -> replace(data, r"some text (?i)" => "$some_variable")
    |> data -> do_other_stuff(data)
    |> data -> println(file_out, data)
    ERROR: syntax: "|>" is not a unary operator
Is it possible to put the pipe operator on a new line? With several operations or a lot of parameters, that would improve code readability.
A description of anchors can be found here: Input Streams · BufferedStreams
Basically it marks a specific part of the stream that is kept in the buffer even when it is refilled. This makes it easier to deal with partial matches.
For nicer pipelines there are packages such as Chain.jl, Pipe.jl or Lazy.jl, but if you put |> at the end of each line instead of at the start, it should work, because the parser then knows the expression continues on the next line.
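A small sketch of the trailing-operator layout, using only Base functions. The input string and the uppercase step are placeholders for illustration, and the anonymous function is parenthesized so that each stage is clearly separate:

```julia
# Each line ends with |>, so the parser keeps reading the same expression.
piped_result = "some text here" |>
    (data -> replace(data, "some text" => "Oranges")) |>
    uppercase
```

Wrapping the whole chain in parentheses is another option if you prefer the operator at the start of each line.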