Readbytes! is bugging me

Hey guys

I’ve been trying to read a binary file that I know contains n = 3186 Int32 values. This is how I’ve done it:

raw_data = zeros(UInt8,n*4)
readbytes!(filestream, raw_data,n*4)

By doing this I get:

raw_data
12744-element Array{UInt8,1}:
 0x00
 0x00
 0x00
 0x1b
 0x00
 0x00
 0x00
 0x1c
 0x00
 0x00
 0x00
 0x1d
    ⋮

Which is great! The problem arises when I try to reinterpret these numbers. I know that the part of the array I’ve shown should give 27, 28 and 29, but something is going wrong with the endianness or with how I use reinterpret, because I get:

reinterpret(Int32,raw_data)
3-element reinterpret(Int32, ::Array{UInt8,1}):
 452984832
 469762048
 486539264

I can get the right numbers if I use ‘reverse’, but I have to use it twice: first to get the right numbers, then again to get them in the right order. I feel this is the wrong approach:

reverse(reinterpret(Int32,reverse(raw_data)))
3-element Array{Int32,1}:
 27
 28
 29

There must be an easier way, and I assume that reversing twice causes a big performance hit? Currently my code that reads line by line is 200 ms faster than readbytes!, which seems a bit wrong. Hope someone can help.

Kind regards

You’re looking for the ntoh function:

map(ntoh, reinterpret(Int32, raw_data))
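
In case it helps to see why this works: the bytes in the file appear to be big-endian (most significant byte first), while reinterpret reads them in the host's byte order, which is little-endian on most machines. So 0x00 0x00 0x00 0x1b is read as 0x1b000000 = 27 * 2^24 = 452984832. ntoh converts from network (big-endian) order to host order. A small sketch using only the twelve bytes shown in the question (the variable names are just for illustration):

```julia
# The first twelve bytes from the question: 27, 28, 29 stored big-endian.
raw = UInt8[0x00, 0x00, 0x00, 0x1b,
            0x00, 0x00, 0x00, 0x1c,
            0x00, 0x00, 0x00, 0x1d]

# View the same memory as Int32 in host byte order.
# On a little-endian host this gives 452984832, 469762048, 486539264.
as_host = reinterpret(Int32, raw)

# ntoh swaps bytes on little-endian hosts and is a no-op on big-endian ones,
# so the result is the same on either kind of machine.
fixed = map(ntoh, as_host)   # Int32[27, 28, 29]
```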

Thanks, you came in clutch! :)

Now I am only 130 ms slower, so will just have to fix my implementation now

Kind regards

May want to try in-place as well, using broadcasting:

data = reinterpret(Int32, raw_data)
data .= ntoh.(data)

And just in case: don’t forget to benchmark your code by wrapping it in a function; don’t benchmark at top level in the REPL. Non-constant global variables can change type at any time, so the compiler can’t optimize code that uses them very well.
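
A rough sketch of what that could look like, using only Base so it runs without extra packages. swap_be is a hypothetical helper name and the random buffer only stands in for the file contents:

```julia
# Hypothetical helper (not from the thread): converts a buffer of
# big-endian Int32 values to host order and returns a new Vector{Int32}.
function swap_be(raw::Vector{UInt8})
    return ntoh.(reinterpret(Int32, raw))
end

# Stand-in data: 3186 Int32 values' worth of random bytes.
raw = rand(UInt8, 4 * 3186)

swap_be(raw)               # warm-up call so compilation is not included in the timing
t = @elapsed swap_be(raw)  # the work happens inside a function with a typed argument
```

With BenchmarkTools, the analogous call would be `@btime swap_be($raw)`, where the `$` interpolates the global so it is not treated as an untyped global during the measurement.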

I’ve put it into a function and am benchmarking now using your in-line tip. I’ve gotten it down to 330 ms, which is still about 70 ms slower than my reading line by line.

    #Preallocate an array depending on datatype and of chosen size
    arrayVal::Array{UInt8,1} = zeros(UInt8,size[1]*4)
    readbytes!(fd, arrayVal,sizeof(arrayVal))
    #Close the open file
    close(fd)
    data = reinterpret(Int32,arrayVal)
    return ntoh.(data)
end

Maybe reading line by line is just superior in Julia instead of reading a whole array of bytes and making operations on it?

Below, using readbytes!:

@benchmark k = readVtkArray("PartAll",Idp)
BenchmarkTools.Trial:
  memory estimate:  477.45 MiB
  allocs estimate:  28701
  --------------
  minimum time:     328.899 ms (0.00% GC)
  median time:      342.392 ms (0.00% GC)
  mean time:        431.651 ms (20.01% GC)
  maximum time:     1.405 s (73.79% GC)
  --------------
  samples:          12
  evals/sample:     1

Below, reading line by line:

@benchmark k = readVtkArray("PartAll",Idp)
BenchmarkTools.Trial:
  memory estimate:  261.69 MiB
  allocs estimate:  28691
  --------------
  minimum time:     276.426 ms (0.00% GC)
  median time:      289.531 ms (0.00% GC)
  mean time:        379.285 ms (21.72% GC)
  maximum time:     1.744 s (80.30% GC)
  --------------
  samples:          17
  evals/sample:     1

Kind regards

Two more things to try:

  1. Skip initializing to zero:
arrayVal = Vector{UInt8}(undef, size[1]*4)
  2. Replace the copy operation return ntoh.(data) by changing data in place:
data .= ntoh.(data)
return data

On top of that, it might be helpful to paste a working example for both of your benchmarks.

I’ve done your suggestion 1 and can see some marginal improvement, so will keep it, but in this case your 2nd suggestion is slowing things down dramatically. I’ve made it as such:

    #Preallocate an array depending on datatype and of chosen size
    arrayVal::Array{UInt8,1} = Vector{UInt8}(undef, size[1]*4)
    #readbytes!(fd, arrayVal,sizeof(arrayVal))
    read!(fd,arrayVal)
    #Close the open file
    close(fd)
    data = reinterpret(Int32,arrayVal)
    data .= ntoh.(data)
    return data

And now the results are:

@benchmark  k = readVtkArray("PartAll",Idp)
BenchmarkTools.Trial:
  memory estimate:  477.90 MiB
  allocs estimate:  28706
  --------------
  minimum time:     534.921 ms (0.00% GC)
  median time:      556.051 ms (0.00% GC)
  mean time:        719.914 ms (22.26% GC)
  maximum time:     2.025 s (71.23% GC)
  --------------
  samples:          9
  evals/sample:     1

Which is a major slowdown (if I’ve done it correctly). I am trying to make a minimal working example available in my other post (How fast is binary reading capabilities in Julia compared with other languages? - #5 by Ahmed_Salih), I will try to post it as soon as possible.

Kind regards

Ah, yes, I’m seeing the same:

julia> function f()
       a = Vector{UInt8}(undef, 100_000)
       data = reinterpret(Int32, a)
       data .= ntoh.(data)
       data
       end
f (generic function with 1 method)

julia> using BenchmarkTools

julia> @btime f();
  284.997 μs (3 allocations: 97.80 KiB)

julia> function g()
       a = Vector{UInt8}(undef, 100_000)
       data = reinterpret(Int32, a)
       return ntoh.(data)
       end
g (generic function with 1 method)

julia> @btime g();
  14.242 μs (5 allocations: 195.56 KiB)

It seems related to the reinterpret call. I wonder if it’s actually necessary; it seems this might work too:

julia> function h()
       data = Vector{Int32}(undef, 100_000)
       data .= ntoh.(data)
       return data
       end
h (generic function with 1 method)

julia> @btime h();
  19.140 μs (2 allocations: 390.70 KiB)

… Okay, that’s still a bit slower. Then I see no reason to take my second suggestion.
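
One caveat on the comparison above: h() allocates 100_000 Int32 values (about 390 KiB), four times the data that f() and g() process from a 100_000-byte buffer, so its timing is not directly comparable. If the reinterpret step is to be skipped altogether, read! can fill a Vector{Int32} straight from the stream. A minimal sketch, assuming the stream holds big-endian Int32 values; read_be_int32 is a hypothetical name and the IOBuffer only stands in for the file:

```julia
# Hypothetical helper: read n big-endian Int32 values directly into a
# Vector{Int32}, then fix the byte order in place (no reinterpret needed).
function read_be_int32(io::IO, n::Integer)
    data = Vector{Int32}(undef, n)  # uninitialized; read! overwrites every element
    read!(io, data)                 # fills the array with raw bytes from the stream
    data .= ntoh.(data)             # in-place byte swap on a plain Vector
    return data
end

# Stand-in for the file: an in-memory buffer holding 27, 28, 29 in big-endian order.
io = IOBuffer()
write(io, hton.(Int32[27, 28, 29]))
seekstart(io)

vals = read_be_int32(io, 3)         # Int32[27, 28, 29]
```

Whether this is faster than the other variants for the real file would need to be benchmarked; it is only meant to show the shape of the approach.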

I’ve also arrived at the conclusion that for bigger files, it is much faster to just read every byte/element in a for loop and handle the conversion immediately, instead of reading the whole array and then reinterpreting. I still can’t work out why this is the case, but I’ve tried a lot of different approaches and have not been able to beat line by line.
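
For reference, the element-by-element approach described above might look roughly like this. It is a sketch rather than the actual code from my reader: read_be_loop is a hypothetical name, and an IOBuffer stands in for the file:

```julia
# Hypothetical sketch: read one Int32 at a time and convert it from
# big-endian to host order immediately.
function read_be_loop(io::IO, n::Integer)
    data = Vector{Int32}(undef, n)
    for i in 1:n
        data[i] = ntoh(read(io, Int32))  # read 4 bytes, then byte-swap if needed
    end
    return data
end

# Stand-in for the file: 27, 28, 29 written in big-endian order.
io = IOBuffer()
write(io, hton.(Int32[27, 28, 29]))
seekstart(io)

vals = read_be_loop(io, 3)               # Int32[27, 28, 29]
```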

Thanks for your time.

Kind regards