Very slow readdlm()


#1

In Julia 0.6 on Windows, readcsv() reads a 500 MB file (digits only, Linux format) in about 150 s. Julia 0.7 has only readdlm(), and reading the same file took over 2 h! What should I do? Is there a package to convert a file from Linux to Windows format? This is a big step backwards for Windows users.
Paul
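
For reference, since Julia 0.7 readdlm() lives in the DelimitedFiles standard library, and a comma delimiter gives the behaviour of the old readcsv(). The sketch below assumes a tiny made-up sample file in place of the real 500 MB one, and it shows one possible way to rewrite Linux line endings as Windows ones with Base functions only, in case that conversion is really wanted. The names `path` and `winpath` are placeholders for illustration.

```julia
using DelimitedFiles  # readdlm/writedlm live in this standard library since Julia 0.7

# Hypothetical small sample standing in for the real 500 MB file.
path = joinpath(mktempdir(), "sample.txt")
write(path, "1,2,3\n4,5,6\n")        # Linux-style "\n" line endings

# Replacement for the old readcsv(path): comma-delimited, parsed as Float64.
d = readdlm(path, ',', Float64)

# Only if Windows line endings are really required: rewrite "\n" as "\r\n".
# This loads the whole file into a String, so it suits files that fit in memory.
winpath = path * ".crlf"
write(winpath, replace(read(path, String), r"\r?\n" => "\r\n"))
```

Passing the element type (Float64 here) is optional; it is included on the assumption that the file really contains only numbers.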


#2

Fix will be in 1.0.1.


#3

Nice, thanks! Paul


#4

Have you tried the CSV package?


#5

Thanks. It needs one more step, converting the DataFrame to an Array, but now it is OK :slight_smile:
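
A minimal sketch of that route, assuming the third-party CSV.jl and DataFrames.jl packages are installed in reasonably recent versions (the `CSV.read(path, DataFrame)` form is not available in very old releases). The sample file is made up for illustration.

```julia
using CSV, DataFrames   # third-party packages, assumed to be installed

# Hypothetical small sample standing in for plik.txt.
path = joinpath(mktempdir(), "sample.csv")
write(path, "1,2,3\n4,5,6\n")

# header = false because the file holds only numbers, no column names.
df = CSV.read(path, DataFrame; header = false)

# The extra step: turn the DataFrame into a plain matrix.
A = Matrix(df)
```

The element type of `A` follows the parsed columns, so an all-integer file gives an integer matrix rather than the Float64 matrix readdlm() returns by default.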


#6

No better on 1.0.1 :confused:
Julia 0.6.2
julia> @time d=readdlm("plik.txt",',')
8.264069 seconds (9.32 M allocations: 347.220 MiB …

Julia 1.0.1
julia> @time d=readdlm("plik.txt",',')
54.991627 seconds (11.14 M allocations: 436.138 MiB

size of plik.txt = 56 456 KB


#7

It would be good if you could provide a file which shows the slowdown.


#8

https://drive.google.com/open?id=1gom3HKeHZYEWQwz4nQXu-xO44cOQAsqE


#9

Are you sure you upgraded to 1.0.1?

I get:

1.0.1

julia> @time d=readdlm("plik.txt",',');
  1.997139 seconds (9.00 M allocations: 330.061 MiB, 11.14% gc time)

julia> versioninfo()
Julia Version 1.0.1

0.6

julia> @time d=readdlm("plik.txt",',');
  2.057267 seconds (9.01 M allocations: 330.109 MiB, 5.33% gc time)

julia> versioninfo()
Julia Version 0.6.5-pre.0

#10

Thanks,
Yes, I am sure, but now it is OK, about 7 s. Probably another task was launched on my machine, sorry :slight_smile:
Paul
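
One way to make such comparisons less sensitive to background tasks is to repeat the measurement and keep the best time. The sketch below is only an illustration: `best_time` is a hypothetical helper, not part of Julia, and the sample file is generated on the spot rather than being plik.txt.

```julia
using DelimitedFiles

# Hypothetical sample file; point `path` at the real file to measure it.
path = joinpath(mktempdir(), "sample.txt")
writedlm(path, rand(1:9, 1000, 50), ',')

# Hypothetical helper: run `f` several times and keep the best time.
# The first call also pays for JIT compilation, and other processes add
# noise, so the minimum is usually the most representative number.
function best_time(f; runs = 5)
    minimum(@elapsed(f()) for _ in 1:runs)
end

t = best_time(() -> readdlm(path, ','))
println("best of 5 runs: ", round(t; digits = 4), " s")
```

Running the same script under each Julia version gives numbers that are easier to compare than a single @time call.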


#11

@kristoffer.carlsson: I don’t understand one thing. The file size is about 56 MB but the memory allocation is 330 MB. I'm curious to know the reason, if possible.


#12

The memory allocation is the total amount allocated during the execution of the function, not the memory usage when the function returns.


#13

So does it mean that 330 MB was allocated for reading the file? Or is it the maximum memory allocated to the entire Julia session while the file was being read? Sorry if this is a basic question.


#14

It means that if you sum all allocations that happened during the reading of the file, you get 330 MB.
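
The difference can be seen with a small experiment. The sketch below uses a generated sample file (an assumption, not the original plik.txt) and compares the cumulative allocation reported by `@allocated` with the size of the matrix that is still alive afterwards, measured with `Base.summarysize`.

```julia
using DelimitedFiles

# Hypothetical sample file: 1000 x 50 single-digit numbers.
path = joinpath(mktempdir(), "sample.txt")
writedlm(path, rand(1:9, 1000, 50), ',')

d = readdlm(path, ',')          # also serves as a warm-up, so compilation is not counted

total = @allocated readdlm(path, ',')   # every byte allocated during the call, summed up
kept  = Base.summarysize(d)             # bytes still held by the returned matrix

println("file on disk:       ", filesize(path), " bytes")
println("allocated in total: ", total, " bytes")
println("kept in the result: ", kept, " bytes")
```

The total includes temporary buffers that the garbage collector can free again, so it is normally larger than what the result occupies. The result itself can also be larger than the file, because each number is stored as an 8-byte Float64 even when it takes only a couple of characters on disk.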


#15

Thanks