Hello,
I would like to load a .txt file and split it into several DataFrames.
The file has the following general structure:
#Name:
#Id:
#Dosename:
#RoiName:brain
#Roi volume
#Unit: Gy
0.000 100.000
0.290 89.0
0.580 67.8
0.870 55.0
1.161 43.1
1.451 21.3
#RoiName:neck
#River volume
#Unit: Gy
0.000 100.000
0.081 89.1
0.162 68.3
0.243 56.9
The idea would be to split the file at each “#RoiName:” and make several DataFrames:
brain_df =
x y
0.000 100.000
0.290 89.0
0.580 67.8
0.870 55.0
1.161 43.1
1.451 21.3
neck_df =
x y
0.000 100.000
0.081 89.1
0.162 68.3
0.243 56.9
I tried to load my .txt file with CSV.jl and then convert it with DataFrames.jl, but I have no idea how to “split” it as described above.
Thanks in advance!
using DataFrames
using CSV
input = """
#Name:
#Id:
#Dosename:
#RoiName:brain
#Roi volume
#Unit: Gy
0.000 100.000
0.290 89.0
0.580 67.8
0.870 55.0
1.161 43.1
1.451 21.3
#RoiName:neck
#River volume
#Unit: Gy
0.000 100.000
0.081 89.1
0.162 68.3
0.243 56.9
"""
io = IOBuffer(input)
dfs = DataFrame[]
buffer = String[]

# Parse the buffered data lines into a DataFrame, then clear the buffer.
function flush_block!(dfs, buffer)
    isempty(buffer) && return
    push!(dfs, CSV.read(IOBuffer(join(buffer, "\n")), DataFrame; header=["x", "y"]))
    empty!(buffer)
end

for line in eachline(io)
    if startswith(line, "#RoiName:") || isempty(line)
        # A new ROI header (or a blank line) closes the current block.
        flush_block!(dfs, buffer)
    elseif !startswith(line, "#")
        push!(buffer, line)
    end
end
flush_block!(dfs, buffer)  # the last block has no header after it
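If the ROI names should be kept alongside the data (as in the brain_df / neck_df layout requested above), the same line-by-line idea can be sketched with a Dict keyed by name. This is only a minimal sketch: `split_by_roi` and `sample` are hypothetical names, the rows are parsed by hand instead of through CSV.jl, and it assumes every data line holds exactly two whitespace-separated numbers.

```julia
using DataFrames

# Hypothetical helper: map each "#RoiName:" label to a DataFrame of its data rows.
function split_by_roi(io::IO)
    dfs = Dict{String,DataFrame}()
    name = nothing                          # no ROI header seen yet
    for line in eachline(io)
        if startswith(line, "#RoiName:")
            # Everything after the first ':' is taken as the ROI name.
            name = String(strip(last(split(line, ':'; limit=2))))
            dfs[name] = DataFrame(x=Float64[], y=Float64[])
        elseif name !== nothing && !startswith(line, "#") && !isempty(strip(line))
            # Assumption: a data line is two whitespace-separated numbers.
            x, y = parse.(Float64, split(line))
            push!(dfs[name], (x, y))
        end
    end
    return dfs
end

# Shortened sample in the same layout as the file above.
sample = """
#Name:
#RoiName:brain
#Unit: Gy
0.000 100.000
0.290 89.0
#RoiName:neck
#Unit: Gy
0.000 100.000
0.081 89.1
0.162 68.3
"""

dfs = split_by_roi(IOBuffer(sample))
brain_df = dfs["brain"]
neck_df = dfs["neck"]
```

For a file on disk, `open(split_by_roi, "sample.txt")` should work the same way, since the helper only needs an IO object.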
Alternatively, using readuntil:
julia> open("sample.txt") do f
           df = Dict{String,DataFrame}()
           readuntil(f, "#RoiName:")
           while !eof(f)
               df[readline(f)] = CSV.read(IOBuffer(readuntil(f, "#RoiName:")), DataFrame;
                                          comment="#", header=["x", "y"])
           end
           df
       end
Dict{String, DataFrame} with 2 entries:
"brain" => 6×2 DataFrame…
"neck" => 4×2 DataFrame…
@stillyslalom, to run your nice code in Julia 1.8.5, I need to split the inner loop assignment as follows:
str = readline(f)
df[str] = ...
Is this a new feature in Julia 1.9?
Nope, just an erroneous simplification on my part - the LHS gets evaluated after the RHS. This works:
julia> open("sample.txt") do f
           df = Dict{String,DataFrame}()
           readuntil(f, "#RoiName:")
           while !eof(f)
               name, rest = readline(f), IOBuffer(readuntil(f, "#RoiName:"))
               df[name] = CSV.read(rest, DataFrame; comment="#", header=["x", "y"])
           end
           df
       end
Dict{String, DataFrame} with 2 entries:
"brain" => 6×2 DataFrame…
"neck" => 4×2 DataFrame…
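The evaluation order mentioned above (the right-hand side running before the index expression on the left) can be checked with a small sketch. `key` and `value` are hypothetical helpers that only record when they are called; on the Julia versions discussed in this thread, "rhs" should be recorded before "index".

```julia
order = String[]

# Hypothetical helpers: each one logs its call, then returns a key or a value.
key()   = (push!(order, "index"); "a")
value() = (push!(order, "rhs"); 1)

d = Dict{String,Int}()
d[key()] = value()   # indexed assignment with side effects on both sides

println(order)
```

This is why reading the name with readline(f) into a variable first, before calling readuntil, gives the intended pairing of names and data.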
Thank you very much for this solution!