Fastest Approach to reading Binary Files

#1

Hey guys!

I am using Julia 1.1 to read binary VTK files in which some parameters and their corresponding types are known. My brother has written a script that uses `readuntil` to locate an attribute, for example velocity (position “x”). Because the size of the input is known, the script then reads and stores each line in an array until the end position (position “y”) is reached.
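To make the description above concrete, here is a minimal sketch of that kind of reader. It is not our actual script: the function name `read_block`, the `marker` string, and the layout (a text header line followed by `n` big-endian `Float32` values, as in the legacy VTK format) are assumptions for illustration. It reads the whole block with a single `read!` call rather than value by value.

```julia
# Hypothetical helper, not the original script.
# Assumes: the data block starts on the line after `marker`,
# and holds `n` big-endian Float32 values (legacy VTK convention).
function read_block(path::AbstractString, marker::AbstractString, n::Integer)
    open(path, "r") do io
        readuntil(io, marker)             # skip everything up to and including the marker
        readline(io)                      # skip the rest of the header line
        data = Vector{Float32}(undef, n)  # preallocate, since the size is known
        read!(io, data)                   # one bulk read of n values
        data .= ntoh.(data)               # convert big-endian to host byte order
        return data
    end
end
```

If the files are little-endian, the `ntoh` line should be dropped.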

We have also added the `Threads.@threads` macro, so we are able to read one parameter from 1001 binary VTK files, averaging 12 MB each, in about 260 seconds on an i7 CPU.
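Roughly, the threaded loop looks like the sketch below. The name `read_all` and the `reader` argument (any function that takes a file path and returns a `Vector{Float32}`) are placeholders for illustration, not our real code. Each iteration writes to its own slot of a preallocated array, so the threads do not share any mutable state.

```julia
# Hypothetical sketch: `reader` is any function path -> Vector{Float32}.
# Julia must be started with several threads (e.g. JULIA_NUM_THREADS=4)
# for the loop to actually run in parallel.
function read_all(reader, paths::AbstractVector{<:AbstractString})
    results = Vector{Vector{Float32}}(undef, length(paths))
    Threads.@threads for i in 1:length(paths)
        results[i] = reader(paths[i])   # each iteration writes only to its own slot
    end
    return results
end
```

Whether more threads help beyond a certain point presumably depends on whether the disk or the CPU is the bottleneck.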

So the question for you guys is: is this performance “good”? Do you know of any implementations which might help us improve it, or do you maybe have a third comment? I’ve tried looking into what @sdanisch did here https://hackernoon.com/drawing-2-7-billion-points-in-10s-ecc8c85ca8fa using memory mapping, so is this the way to go?
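For reference, this is how I understand memory mapping would look with the `Mmap` standard library. It is only a sketch under the same assumptions as before (a `marker` string, then `n` big-endian `Float32` values on the following line), and the name `mmap_block` is made up. The block is mapped as raw bytes and reinterpreted afterwards, because the data offset is not guaranteed to be a multiple of 4 bytes. I have not measured whether this is faster than a plain `read!`.

```julia
using Mmap

# Hypothetical sketch: map the data block instead of reading it.
# Assumes `n` big-endian Float32 values start on the line after `marker`.
function mmap_block(path::AbstractString, marker::AbstractString, n::Integer)
    open(path, "r") do io
        readuntil(io, marker)                  # locate the attribute
        readline(io)                           # skip the rest of the header line
        offset = position(io)                  # byte offset where the data starts
        bytes = Mmap.mmap(io, Vector{UInt8}, 4n, offset)  # map the raw bytes only
        # Reinterpret the bytes as Float32 and convert to host byte order;
        # the broadcast copies the values into an ordinary Vector{Float32}.
        return ntoh.(reinterpret(Float32, bytes))
    end
end
```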

I know the question is a bit “fluffy”, but any comments or pointers in the right direction would be greatly appreciated.

Kind regards

#2

Slightly unrelated but have you seen https://github.com/jipolanco/WriteVTK.jl?

#3

No, I had not, but thanks for making me aware of it! It might be handy in the future. For now the issue is still optimizing the reading of data from binary VTK files into Julia. I am currently able to process files at about 46 MB/s, so approximately four files per second, but I hope I can push it a bit further still. Thanks for taking an interest in the question though.

Kind regards