I want to use Julia to read data from text files in parallel and store it in one big matrix.
However, the matrix seems to be distributed to each CPU, so each CPU would need a tremendous amount of memory. How can I prevent the matrix from being distributed?
This is the code:
using DelimitedFiles
using LinearAlgebra
using Statistics
using JLD
N = 2120;
ins_force = zeros(N,3,5000);
Force = zeros(N,3,5000,3);
Threads.@threads for i = 1:3
    start = time()
    file = readdlm(string("Force",5000*(i-1),".txt"))
    index = 1;
    idx = 1;
    while idx < size(file,1)
        ins_force[:,:,index] = file[idx+1:idx+N,:];
        index = index+1;
        idx = idx+N+1;
    end
    time_end = time()
    Force[:,:,:,i] = ins_force;
    display(time_end-start)
end
In MATLAB I can do the same thing with a cell array and cell2mat, using this code:
clc
clear
%% compute ins force
N = 2120;
list = dir(['./Force*'])
n = size(list,1)
index = 1;
Force = cell(1,1,200);
parfor i = 1:200
    tic
    A = textread(['./Force',num2str(5000*(i-1)),'.txt']);
    index = 1
    Force_temp = zeros(N,3,5000);
    idx = 1;
    while idx < size(A,1)
        Force_temp(:,:,index) = A(idx+1:idx+N,:);
        index = index+1;
        idx = idx+N+1;
    end
    Force{1,1,i} = Force_temp;
    toc
end
Force = cell2mat(Force);
Your question is a little unclear to me. Do you want your code to be multithreaded or not multithreaded? Regular Julia matrices are not distributed and all memory is visible to all CPUs when using threads.
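As a small sketch of what that means (the 4×3 array is made up for illustration; only the 2120×3×5000 block size comes from your code): the array below is allocated once, and every thread writes into that same memory. The last lines estimate the footprint of one of your blocks, which comes to roughly 254 MB of Float64 data.

```julia
using Base.Threads

# One ordinary array, allocated once. All threads see the same memory.
A = zeros(4, 3)

@threads for j in 1:size(A, 2)
    # Each iteration writes only its own column, so iterations never
    # touch the same element.
    A[:, j] .= j
end

# Footprint of one N×3×5000 block of Float64 values from the question.
bytes_per_block = sizeof(Float64) * 2120 * 3 * 5000
println(bytes_per_block / 1e6, " MB per block")
```

Nothing here is copied per thread; the number of threads does not change the size of `A`.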
I want the code to be multithreaded. The point is that when I run equivalent code (the Julia and MATLAB versions do almost the same thing), Julia runs out of memory. It seemed to me that the matrix is distributed, but maybe I have misunderstood something. Are there other reasons why the Julia code would run out of memory, even with one or two processors?
This code is not thread-safe: ins_force is defined outside the multithreaded loop but written to inside it, so all iterations share the same buffer regardless of the loop index i.
Note that the MATLAB version does not have this issue: there, Force_temp is defined inside the parfor loop.
Furthermore, this code should be wrapped in a function to avoid global variables, which are type-unstable.
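A minimal sketch of the thread-safe pattern, on synthetic data rather than your files. The names `fill_block` and `collect_blocks` are hypothetical, and filling the buffer with the index stands in for the real per-file work. The point is that the buffer is created inside the function each iteration calls, so no two iterations share it.

```julia
using Base.Threads

# Hypothetical stand-in for the per-file work: fill a buffer with the index i.
function fill_block(n, i)
    buf = zeros(n, 3)   # allocated per call, never shared between iterations
    buf .= i
    return buf
end

function collect_blocks(n, nfiles)
    out = zeros(n, 3, nfiles)
    @threads for i in 1:nfiles
        # Each iteration writes only to its own slice out[:, :, i].
        out[:, :, i] = fill_block(n, i)
    end
    return out
end

out = collect_blocks(4, 3)
```

Because everything lives inside functions, there are no global variables in the hot loop either.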
Thank you for your useful suggestions!
Does this look thread-safe now?
I am new to parallel programming, and I'm not sure how to make it safe in Julia.
using DelimitedFiles
using LinearAlgebra
using Statistics
using JLD
N = 2120;
Force = zeros(N,3,5000,3);
function readtxt(i)
    ins_force = zeros(N,3,5000);
    file = readdlm(string("Force",5000*(i-1),".txt"))
    index = 1;
    idx = 1;
    start = time()
    while idx < size(file,1)
        ins_force[:,:,index] = file[idx+1:idx+N,:];
        index = index+1;
        idx = idx+N+1;
    end
    time_end = time()
    display(time_end-start)
    return ins_force
end
Threads.@threads for i = 1:3
    Force[:,:,:,i] = readtxt(i);
end
Yes. In addition, you should define N either as a const or as a function parameter, so that the inner loop is type-stable as well.
Maybe this already solves your memory issues?
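For reference, a sketch of the "N as a function parameter" variant. The function names `read_frames!` and `read_all`, the temporary file names, and the file layout (one header line followed by N rows of 3 numbers per frame, inferred from your indexing) are assumptions on my side. Each task writes directly into its own slice of the result through a view, so no second full-size buffer is needed per file.

```julia
using DelimitedFiles
using Base.Threads

# Fill `dest` (N×3×nframes) from one file. Assumed layout: every frame is
# one header line followed by N rows of 3 numbers.
function read_frames!(dest, path, N)
    data = readdlm(path)
    frame = 1
    row = 1   # row index of the current frame's header line
    while row < size(data, 1) && frame <= size(dest, 3)
        dest[:, :, frame] = data[row+1:row+N, :]
        frame += 1
        row += N + 1
    end
    return dest
end

function read_all(paths, N, nframes)
    force = zeros(N, 3, nframes, length(paths))
    @threads for i in eachindex(paths)
        # The view lets each task write straight into its own slice of `force`.
        read_frames!(view(force, :, :, :, i), paths[i], N)
    end
    return force
end

# Tiny synthetic demo: 2 files, 2 frames each, N = 2 (file names are made up).
dir = mktempdir()
paths = [joinpath(dir, "Force$(k).txt") for k in 0:1]
for (k, p) in enumerate(paths)
    open(p, "w") do io
        for frame in 1:2
            println(io, "0 0 0")                    # assumed header line
            for r in 1:2
                println(io, k, " ", frame, " ", r)  # encodes file, frame, row
            end
        end
    end
end
force = read_all(paths, 2, 2)
```

For your data, the call would be something like `read_all(paths, 2120, 5000)` with your own list of file paths.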