How fast are the binary reading capabilities in Julia compared with other languages?

@tkluck

A “vtk” file is a “Visualization Toolkit” file, which lets users visualize simulation data (or any other kind of data), usually with ParaView. In my case I am trying to extract data from these files directly, for two purposes:

  1. Avoiding slow reading from CSV files
  2. Reducing the storage needed for simulations

Opened in ParaView, one of these files would show something like:

These vtk files store only simulation data and nothing else. I’ve also included a minimal working example in the Dropbox folder. When I benchmark the version that uses readbytes!, I get:

@benchmark readVtkArray("parts_")
BenchmarkTools.Trial:
  memory estimate:  986.72 KiB
  allocs estimate:  564
  --------------
  minimum time:     2.731 ms (0.00% GC)
  median time:      3.428 ms (0.00% GC)
  mean time:        3.822 ms (0.00% GC)
  maximum time:     6.814 ms (0.00% GC)
  --------------
  samples:          1307
  evals/sample:     1

So the median is about 3.4 ms. With my own approach using read, where I read one Int32 at a time, I get 1.6 ms on the files I’ve put in the Dropbox link.
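To make the two reading styles concrete, here is a minimal sketch of reading one Int32 per call versus filling a preallocated vector with a single read!. The helper names read_one_by_one and read_bulk are hypothetical and are not the readVtkArray from my example; the temporary file just stands in for the data block of a vtk file. It also assumes the bytes are in the machine's native byte order; if the file is big-endian, the values would still need ntoh applied.

```julia
# Hypothetical helpers for illustration only; not the readVtkArray from the post.

# Read n Int32 values with one read call per element.
function read_one_by_one(io::IO, n::Integer)
    out = Vector{Int32}(undef, n)
    for i in 1:n
        out[i] = read(io, Int32)
    end
    return out
end

# Read n Int32 values with a single bulk read! into a preallocated vector.
function read_bulk(io::IO, n::Integer)
    out = Vector{Int32}(undef, n)
    read!(io, out)
    return out
end

# Write some sample data to a temporary file so the sketch is self-contained.
path, io = mktemp()
data = Int32.(1:1000)
write(io, data)
close(io)

a = open(f -> read_one_by_one(f, length(data)), path)
b = open(f -> read_bulk(f, length(data)), path)
```

Both calls should return the same vector, so the two functions can be timed against each other with @benchmark on a real file.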

The command you have to use is:

using BenchmarkTools
@benchmark readVtkArray("parts_")

If you can get it under 1.6 ms, I would be very happy. Note that I use Threads.@threads with 4 threads on an i7.

@Tamas_Papp if you want to try memory mapping, I can tell you that “Idp” has element type Int32 and is always nRow long with a width of 1, i.e. it is an Array{Int32,1}.
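For reference, a memory-mapped read of such a block could look roughly like the sketch below, using the Mmap standard library. The values of nRow and offset here are made up for illustration: in a real vtk file the offset would be wherever the “Idp” block starts after the header, which this sketch does not try to locate, and it again assumes native byte order.

```julia
using Mmap

nRow = 1000    # assumed number of rows, for illustration
offset = 0     # assumed byte offset of the block; a real file has header text before it

# Write sample data to a temporary file that stands in for the vtk file.
path, io = mktemp()
write(io, Int32.(1:nRow))
close(io)

# Map nRow Int32 values starting at `offset` as a Vector{Int32} without copying them up front.
idp = open(path) do f
    Mmap.mmap(f, Vector{Int32}, (nRow,), offset)
end
```

The result behaves like an ordinary Array{Int32,1}, and pages are only read from disk when they are accessed.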

If you need anything else, let me know. I tried to be as clear as possible.

Kind regards