Reading binary data from raw PCM files

srikar_sastry · January 31, 2019, 2:05pm

Hello,

I’m new to Julia and trying to figure out a way to read raw PCM samples of audio. When encoded in WAV, I can use the wavread function from the WAV package. But, when the file is simply raw PCM without encoder/decoder metadata, it is just binary data that needs to be read as int16 in little endian format (typically).

In python, I do this:

import numpy as np
f = open(infile, mode='rb')  # binary mode for raw PCM data
y = np.fromfile(f,d)
if e == 'BIG':
    y.byteswap(True)
f.close()

https://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html

Any suggestion on how to read binary data to specified data format would be helpful.

Thanks

StefanKarpinski · January 31, 2019, 2:34pm

Assuming that in your code d is the number of 16-bit values in the file and e is a string indicating endianness, this code will do it:

f = open(infile)
y = Vector{UInt16}(undef, d)
read!(f, y)
if e == "BIG"
    for (i, x) in enumerate(y)  # enumerate yields (index, value) pairs
        y[i] = bswap(x)
    end
end
close(f)

However, one thing about your Python code is unclear to me: how does it know to read int16 code units?

dawbarton · January 31, 2019, 3:30pm

From the numpy docs it appears that d is actually the data type. (count is the third argument and the default is to read all the data.) As such, you just need to change one line in @StefanKarpinski’s code:

y = Vector{UInt16}(undef, stat(f).size ÷ sizeof(UInt16))

Here UInt16 plays the role of d.

srikar_sastry · January 31, 2019, 4:19pm

Thanks Stefan and dawbarton. Appreciate the quick help.

Yes, d is the data type which I can pass in as ‘int16’ (or whatever the type may be).

The code seems to work for a couple of vectors I tried. From the code, am I right to assume that the read!(f,y) function will do the type conversion and convert the binary data into the type of y?

Thanks

StefanKarpinski · January 31, 2019, 5:02pm

There’s no conversion really, it just reads data from the file into the array’s memory.
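
A minimal sketch of what that means, using an in-memory IOBuffer in place of a file (the byte values are made up for illustration). read! copies the raw bytes into the array's memory, and the element type only decides how those bytes are interpreted:

```julia
# Four raw bytes standing in for the contents of a PCM file (illustrative values).
bytes = UInt8[0x01, 0x00, 0xff, 0xff]
io = IOBuffer(bytes)

# Two Int16 slots hold 4 bytes; read! fills the array's memory directly.
y = Vector{Int16}(undef, 2)
read!(io, y)

# The data is little endian: bytes 01 00 mean 1 and bytes ff ff mean -1.
# ltoh is a no-op on a little-endian host and byte-swaps on a big-endian one,
# so `samples` comes out the same on either kind of machine.
samples = ltoh.(y)
```

No per-element parsing happens here; the only step that touches the values is the optional byte swap.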

srikar_sastry · February 1, 2019, 2:49am

Sorry, conversion was a poor descriptor.
What I was trying to ask was that, the read function reads the binary data one byte at a time, but when placing it in the allocated memory, it smartly places two bytes in one slot since we have defined it to be Int16. Slightly less intuitive than python (MATLAB does it the same as python too), where we specify how to interpret the binary data in the reading operation, but not a real issue. I guess I have to just get used to the Julia way.

Thanks for the quick suggestions and explanations!!

c42f · February 1, 2019, 4:33am

The read(io, T) function reads sizeof(T) bytes from the stream in “the canonical binary representation” of type T, which is the same as fromfile but for a single value of type T rather than an array. (For integers the “canonical binary representation” is the platform dependent binary format of the type in memory. For UInt16 just a pair of bytes, little endian on the typical PC but can be big endian for embedded devices.)
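
As a small sketch of the single-value form (again with an IOBuffer and made-up bytes instead of a real file), ltoh and ntoh from Base can be used to state the byte order of the data explicitly rather than relying on the host:

```julia
io = IOBuffer(UInt8[0x34, 0x12, 0x78, 0x56])

# Each call consumes sizeof(UInt16) == 2 bytes from the stream.
a = ltoh(read(io, UInt16))   # bytes 34 12 treated as little-endian data
b = ntoh(read(io, UInt16))   # bytes 78 56 treated as big-endian data
```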

Julia’s standard read doesn’t have a “read all” option equivalent of numpy’s count=-1. If you’re after pure convenience, you can read the whole file as bytes, reinterpret those (pairwise) as UInt16 and swap the endianness all in one line:

y = bswap.(reinterpret(UInt16, read(filename)))

The solution using read! is more efficient when you know the size of the data but don’t know whether to do an endian swap (for type stability of the returned array at least). To combine the solutions above into a function:

function read_pcm(file_name, T; swap_endian=false)
    y = Vector{T}(undef, filesize(file_name) ÷ sizeof(T))
    read!(file_name, y)
    if swap_endian
        y .= bswap.(y)  # In place broadcast of bswap over `y` is probably the neatest way to write this.
    end
    return y
end

Though I’m not sure whether the T makes sense here, or whether all PCM files are UInt16 in practice?

Anyway, usage would be:

read_pcm("myfile.pcm", UInt16)
read_pcm("myfile.pcm", UInt8; swap_endian=true) # 8 bit audio ?!
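
A quick round-trip check of read_pcm, as a sketch only: the function is repeated so the snippet runs on its own, and the sample values and the temporary file are invented for the test.

```julia
# Copy of read_pcm from above so that this snippet is self-contained.
function read_pcm(file_name, T; swap_endian=false)
    y = Vector{T}(undef, filesize(file_name) ÷ sizeof(T))
    read!(file_name, y)
    if swap_endian
        y .= bswap.(y)
    end
    return y
end

# Write three signed 16-bit samples to a temporary file in the host's byte order.
samples = Int16[0, 1000, -1000]
path = tempname()
write(path, samples)

same_order = read_pcm(path, Int16)                  # read back in the order written
swapped = read_pcm(path, Int16; swap_endian=true)   # every sample byte-swapped

rm(path)
```

Reading back without the swap should reproduce the original samples, and swapping twice should as well.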

srikar_sastry · February 1, 2019, 6:56pm

@c42f

Thanks for the detailed explanation. The one liner does seem to create an unfamiliar type:

Base.ReinterpretArray{Int16,1,UInt8,Array{UInt8,1}}
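
For reference, a minimal sketch of where that type comes from and how to get an ordinary array out of it (the bytes are illustrative, not read from a file):

```julia
bytes = UInt8[0x01, 0x00, 0x02, 0x00]   # illustrative data

# reinterpret returns a lazy view over the same bytes; nothing is copied.
view16 = reinterpret(Int16, bytes)

# collect copies the view into an ordinary Vector{Int16}.
y = collect(view16)

# Broadcasting over the view also produces a plain Vector, so applying
# bswap. or ltoh. to the reinterpreted data yields a regular array.
z = ltoh.(view16)
```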

I’ll stick to the initial solution with the read! function and the “for loop” for the byte swap. The more I read up, the more my preferences adapt. I recently came across some Julia training that talked about for loops being more efficient than vector operations (as opposed to MATLAB). If that is true, then in the interest of speed and efficiency, the initial for loop approach by Stefan is probably the better way (which you already mentioned is more efficient).

Btw, the endian swap is a corner case. Most audio is in little endian format and we don’t need to do the swap. And 16-bit PCM audio is normally signed integers - just adding the comment for completeness, all the above solutions are valid replacing UInt16 with Int16. The focus for me was the packing of multiple bytes and copying to one memory location.

Thanks

StefanKarpinski · February 1, 2019, 8:16pm

The dotted version should be equivalent and is certainly slicker and more concise.

c42f · February 1, 2019, 8:47pm

Yes, it’s the type which is the real reason for the comment about efficiency, not the “vectorized” broadcast notation. The compiler generates less efficient code when you have a structure like

x = Type1()
if some_runtime_condition()
    x = Type2()
end
# ... do something with `x`

People call this “type stability”.
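
Applied to the PCM case, a rough sketch of the difference might look like this. The function names load_unstable and load_stable are made up for illustration, and Base.return_types is an introspection helper rather than something you would call in production code (in the REPL, @code_warntype shows the same information):

```julia
# Type-unstable: `y` is a lazy ReinterpretArray on one path and a Vector on the other.
function load_unstable(bytes::Vector{UInt8}, swap::Bool)
    y = reinterpret(UInt16, bytes)
    if swap
        y = bswap.(y)
    end
    return y
end

# Type-stable: `y` is a Vector{UInt16} on every path.
function load_stable(bytes::Vector{UInt8}, swap::Bool)
    y = collect(reinterpret(UInt16, bytes))
    if swap
        y .= bswap.(y)
    end
    return y
end

# Ask the compiler what it infers for each function's return type.
unstable_type = Base.return_types(load_unstable, (Vector{UInt8}, Bool))[1]
stable_type = Base.return_types(load_stable, (Vector{UInt8}, Bool))[1]
```

The stable version should infer a single concrete array type, while the unstable one should infer a Union of two array types.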

srikar_sastry · February 2, 2019, 5:44am

Got it. I quickly checked and the bswap doesn’t cause an ambiguous type. The reinterpret function does.

Thanks!

srikar_sastry · February 2, 2019, 5:45am

Got it! Thank you so much for the explanation!

srikar_sastry · February 2, 2019, 5:46am

Thank you everyone!
You guys answered my questions and also helped me understand new concepts! Appreciate the help!