Reading in multiple matrices of different dimensions from .txt file


Hi all,

I have a .txt file containing information for 10 matrices of different dimensions.

Each row of the .txt file corresponds to a row of a matrix, with whitespace separating the elements. The matrices are separated by a single blank line.

An example of what a 2x2 matrix and 3x3 matrix would look like within this .txt file is:

1 0
0 1

1 0 0 
0 1 0
0 0 1

Does anyone know of a function that could read such a file into separate matrices?

Thank you.

Welcome, @conorhassan!
You could do the following:

using DelimitedFiles
readdlm("your_file.txt")

This will create a single matrix in which the shorter rows are padded with blanks, like

5x3 Matrix{Any}
1 0 " " 
0 1 " " 
1 0 0
0 1 0
0 0 1

You can recover your matrices by taking slices from that matrix. This surely isn’t the cleanest solution, but it could be sufficient for your use case.
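As a minimal sketch of that slicing, here is a hand-built stand-in for the `readdlm` result on the 2x2 and 3x3 example. The padding value `""` is an assumption; only the numeric cells are used.

```julia
# Stand-in for the result of `readdlm("your_file.txt")` on the example file.
# The padding value "" is an assumption; the slices below never touch it.
padded = Any[1 0 ""; 0 1 ""; 1 0 0; 0 1 0; 0 0 1]

# Rows 1:2 hold the 2x2 matrix, rows 3:5 hold the 3x3 matrix.
A = Int.(padded[1:2, 1:2])
B = Int.(padded[3:5, 1:3])
```

Broadcasting `Int` over each slice turns the `Matrix{Any}` slices into `Matrix{Int}`.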

Hi Vincius,

Thank you for your answer. I agree that slicing does seem plausible; however, the example in my question isn’t representative of the dimensions of the intended use case. The dimensions of the 10 matrices vary from a 50x50 matrix to a 2147x2147 matrix, hence reading in the matrices via readdlm() returns a 6148x2147 matrix with a lot of whitespace. Do you still think slicing is the way to go?

Hi, below is the function that I used to do what I needed. I have two particular things I would be interested in improving in order to make the function more general:

  • Is storing the matrices in an Array which holds elements of type Any an efficient way to do this?
  • I got around having to deal with whitespaces by hard coding the number of rows of each individual matrix and the number of matrices within the file.
function parse_whitespace_matrices(matrices)
    seperate_matrix_ind = [533;428;517;165;237;95;67;105;2147;2147]
    seperate_matrices = [] # array of type `Any`
    N = size(matrices) # dimensions of the .txt file which has been read in
    running_ind = 1
    for i in 1:10
        current_matrix = matrices[running_ind:(running_ind+seperate_matrix_ind[i]-1),1:seperate_matrix_ind[i]]
        push!(seperate_matrices,current_matrix)
        running_ind = running_ind + seperate_matrix_ind[i]
    end
    return seperate_matrices
end

You can use a vector of matrices with a concrete element type:

julia> x = Matrix{Float64}[]
Matrix{Float64}[]

julia> push!(x,rand(2,2))
1-element Vector{Matrix{Float64}}:
 [0.9344162398039686 0.7225797429869223; 0.36474398931320495 0.42920584775757464]

julia> push!(x,rand(3,3))
2-element Vector{Matrix{Float64}}:
 [0.9344162398039686 0.7225797429869223; 0.36474398931320495 0.42920584775757464]
 [0.7303420392586335 0.6562899345943587 0.2762288640158339; 0.8499765956876557 0.35664759034332527 0.20261229928884306; 0.36356739051722986 0.5809874893713416 0.9075228333567769]
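Combining this with the slicing approach, one possible way to drop the hard-coded sizes is sketched below. It assumes that every matrix is square (as the slicing in `parse_whitespace_matrices` implies) and that the padding cells produced by `readdlm` are not numbers, so the count of numeric entries in a block's first row gives the block size. The name `split_square_blocks` is made up for this illustration.

```julia
# Hypothetical helper: split a padded matrix (such as the one `readdlm` returns)
# into its square blocks, without hard-coding the block sizes.
# Assumptions: every block is square, and padding cells are not numbers.
function split_square_blocks(padded::AbstractMatrix)
    blocks = Matrix{Float64}[]          # concrete element type
    nrows = size(padded, 1)
    row = 1
    while row <= nrows
        # Number of numeric entries in the block's first row = block size n.
        n = count(x -> x isa Real, view(padded, row, :))
        n > 0 || error("row $row contains no numeric entries")
        row + n - 1 <= nrows || error("block starting at row $row is truncated")
        push!(blocks, Float64.(padded[row:row+n-1, 1:n]))
        row += n
    end
    return blocks
end

# Stand-in for the `readdlm` result on the 2x2 / 3x3 example.
blocks = split_square_blocks(Any[1 0 ""; 0 1 ""; 1 0 0; 0 1 0; 0 0 1])
```

Because the result is a `Vector{Matrix{Float64}}`, the compiler knows the type of every element, which is not the case for an array of `Any`.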


Another solution:

using DelimitedFiles

matrices = Matrix{Int}[]

open("file.txt", "r") do io
    while !eof(io)
        str = readuntil(io, "\n\n")
        push!(matrices, readdlm(IOBuffer(str)))
    end
end
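One caveat with splitting on `"\n\n"`: a file with Windows line endings (`\r\n`), or with several blank lines between matrices, may not be split as intended. Below is a rough sketch that tolerates both. It parses with `parse` instead of `readdlm` to stay dependency-free, and the function name `read_matrix_blocks` is hypothetical.

```julia
# Hypothetical variant: read whitespace-separated matrices that are delimited
# by one or more blank lines, tolerating "\r\n" line endings.
function read_matrix_blocks(path::AbstractString)
    # Normalize Windows line endings so that blank lines are easy to detect.
    text = replace(read(path, String), "\r\n" => "\n")
    matrices = Matrix{Float64}[]
    # A run of one or more blank (or whitespace-only) lines separates two matrices.
    for chunk in split(text, r"\n\s*\n"; keepempty=false)
        rows = [parse.(Float64, split(line))
                for line in split(chunk, '\n') if !isempty(strip(line))]
        isempty(rows) && continue
        # Each parsed row becomes a column under hcat, so transpose back.
        push!(matrices, permutedims(reduce(hcat, rows)))
    end
    return matrices
end
```

Rows of unequal length within one block raise an error rather than being padded.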

You can try:

using DelimitedFiles

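function read_nmatrices(file::String)
   M = Matrix{Float64}[]    # completed matrices
   mi = Vector{Float64}[]   # rows of the matrix currently being read
   open(file, "r") do io
      while !eof(io)
         r = split(readline(io))
         if isempty(r)   # blank line: the current matrix is complete
            isempty(mi) || push!(M, hcat(mi...)')
            mi = Vector{Float64}[]
         else
            push!(mi, parse.(Float64, r))
         end
      end
   end
   # store the last matrix if the file does not end with a blank line
   !isempty(mi) && push!(M, hcat(mi...)')
   return M
end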

file = raw"C:\..\Nmatrices.txt"
read_nmatrices(file)

4-element Vector{Matrix{Float64}}:
 [1.0 2.0 2.0; 2.0 3.0 4.0; -1.0 2.0 4.0]
 [1.0 2.3; 2.3 1.0]
 [1.0;;]
 [1.0 0.0 0.0 0.0; 0.0 1.0 0.0 0.0; 0.0 0.0 1.0 0.0; 0.0 0.0 0.0 1.0]

Tested with Julia 1.7 on a file `Nmatrices.txt` containing:
1 2 2
2 3 4
-1 2 4

1  2.3
2.3 1

1

1.  0  0  0
0   1. 0  0
0   0  1. 0
0   0  0  1.