I’m new to Julia and trying to figure out a way to read raw PCM samples of audio. When encoded in WAV, I can use the wavread function from the WAV package. But, when the file is simply raw PCM without encoder/decoder metadata, it is just binary data that needs to be read as int16 in little endian format (typically).
In python, I do this:
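import numpy as np
f = open(infile, mode='rb')  # binary mode: raw PCM has no header to skip
y = np.fromfile(f, d)        # d is the dtype, e.g. 'int16'
if e == 'BIG':
    y.byteswap(True)         # swap the bytes of every sample in place
f.close()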
From the numpy docs it appears that d is actually the data type. (count is the third argument, and the default is to read all the data.) As such, you just need to change one line in @StefanKarpinski’s code:
y = Vector{UInt16}(undef, stat(f).size ÷ sizeof(UInt16))
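For readers who do not have the earlier post in front of them, here is a minimal sketch of how that allocation line could fit into a complete read. It is not Stefan's original code: the function name `read_raw_uint16` is hypothetical, and `filesize(io)` is used here in place of `stat(f).size`.

```julia
# Hypothetical helper: read a headerless file as UInt16 samples in host byte order.
function read_raw_uint16(path::AbstractString)
    open(path, "r") do io
        # One slot per 2-byte sample; a trailing odd byte, if any, is ignored.
        y = Vector{UInt16}(undef, filesize(io) ÷ sizeof(UInt16))
        read!(io, y)   # fills y's memory directly from the file
        return y
    end
end
```

The `do` block closes the file even if `read!` throws, which is the main reason to prefer it over a manual `open`/`close` pair.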
Thanks Stefan and dawbarton. Appreciate the quick help.
Yes, d is the data type which I can pass in as ‘int16’ (or whatever the type may be).
The code seems to work for a couple of vectors I tried. From the code, am I right to assume that the read!(f,y) function will do the type conversion and convert the binary data into the type of y?
Sorry, conversion was a poor descriptor.
What I was trying to ask is this: the read function reads the binary data one byte at a time, but when placing it in the allocated memory it smartly puts two bytes in each slot, since we have defined the element type to be Int16. That is slightly less intuitive than python (MATLAB does it the same way as python), where we specify how to interpret the binary data in the read operation itself, but it is not a real issue. I guess I just have to get used to the Julia way.
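To make the "two bytes per slot" picture concrete, here is a small sketch that uses an in-memory `IOBuffer` instead of a file. The byte values are made up for illustration. The point is that `read!` copies raw bytes into the array's memory, so the element type of `y` alone decides how they are grouped.

```julia
# Four raw bytes, i.e. two little-endian 16-bit samples: 0x0001 and 0xffff.
bytes = UInt8[0x01, 0x00, 0xff, 0xff]

y = Vector{Int16}(undef, length(bytes) ÷ sizeof(Int16))
read!(IOBuffer(bytes), y)   # copies the 4 bytes straight into y's memory

# read! uses the host byte order; ltoh converts little-endian data to host
# order (it does nothing on little-endian machines such as x86).
samples = ltoh.(y)
```

With these bytes, `samples` holds the values 1 and -1: the same bits that would be 0x0001 and 0xffff as unsigned numbers.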
Thanks for the quick suggestions and explanations!!
The read(io, T) function reads sizeof(T) bytes from the stream in “the canonical binary representation” of type T. This is the same as fromfile, but for a single value of type T rather than an array. (For integers, the “canonical binary representation” is the platform-dependent binary format of the type in memory. For UInt16 that is just a pair of bytes: little endian on the typical PC, but possibly big endian on some embedded devices.)
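As a small illustration of the single-value form, the sketch below reads two UInt16 values from an in-memory buffer. The bytes are invented for the example, and `ltoh` is applied so the result does not depend on the host's byte order.

```julia
io = IOBuffer(UInt8[0x34, 0x12, 0xff, 0x7f])

a = ltoh(read(io, UInt16))   # consumes bytes 0x34 0x12
b = ltoh(read(io, UInt16))   # consumes bytes 0xff 0x7f
done = eof(io)               # all four bytes have now been consumed
```

Each call advances the stream by `sizeof(UInt16)` bytes, so reading a whole file this way means looping until `eof(io)`.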
Julia’s standard read doesn’t have a typed “read all” option equivalent to numpy’s count=-1. If you’re after pure convenience, you can read the whole file as bytes, reinterpret those (pairwise) as UInt16 and swap the endianness all in one line:
y = bswap.(reinterpret(UInt16, read(filename)))
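Here is the same idea on a few in-memory bytes, so it can be tried without a file. One substitution to note: this sketch uses `ntoh` (big-endian to host order) instead of `bswap`, which is my choice so that the result is the same on any machine. On a little-endian host the two do the same thing.

```julia
# Two big-endian 16-bit samples, 0x1234 and 0x5678, as they would sit in a file.
bytes = UInt8[0x12, 0x34, 0x56, 0x78]

raw = reinterpret(UInt16, bytes)   # a view over `bytes`, no copy is made
y = ntoh.(raw)                     # big-endian ("network") order to host order
```

The broadcast allocates a fresh `Vector{UInt16}`, whereas `raw` on its own is only a reinterpreted view of the byte array.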
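The solution using read! is more efficient when you know the size of the data but don’t know ahead of time whether to do an endian swap, because it returns the same array type either way (the one-liner gives a plain Vector with the swap but a reinterpreted view without it, which hurts type stability). To combine the solutions above into a function: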
function read_pcm(file_name, T; swap_endian=false)
    y = Vector{T}(undef, filesize(file_name) ÷ sizeof(T))
    read!(file_name, y)
    if swap_endian
        y .= bswap.(y) # In place broadcast of bswap over `y` is probably the neatest way to write this.
    end
    return y
end
Though I’m not sure whether the T makes sense here, or whether all PCM files are UInt16 in practice?
Anyway, usage would be:
read_pcm("myfile.pcm", UInt16)
read_pcm("myfile.pcm", UInt8; swap_endian=true) # 8 bit audio ?!
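To try this without a real recording, the sketch below repeats the function and round-trips a few made-up Int16 samples through a temporary file. The sample values and the use of `mktemp` are only for illustration.

```julia
function read_pcm(file_name, T; swap_endian=false)
    y = Vector{T}(undef, filesize(file_name) ÷ sizeof(T))
    read!(file_name, y)
    if swap_endian
        y .= bswap.(y)
    end
    return y
end

# Write a small test file with made-up sample values in host byte order.
samples = Int16[0, 1000, -1000, 32767]
path, io = mktemp()
write(io, samples)
close(io)

same = read_pcm(path, Int16)                        # read back unchanged
swapped = read_pcm(path, Int16; swap_endian=true)   # every sample byte-swapped
```

Swapping `swapped` once more with `bswap.` gives back the original samples, which is a quick sanity check that the flag does what it says.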
I’ll stick to the initial solution with the read! function and the “for loop” for the byte swap. The more I read up, the more my preferences adapt. I recently came across some Julia training that said for loops are more efficient than vector operations in Julia (the opposite of MATLAB). If that is true, then in the interest of speed and efficiency, Stefan’s initial for loop approach is probably the better way (and you already mentioned it is more efficient).
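For comparison, here is a sketch of the two ways of writing the in-place swap side by side. The helper name `bswap_loop!` is hypothetical. As far as I know, an in-place broadcast is lowered to a comparable loop, so any speed difference is something to measure rather than assume.

```julia
# Swap the byte order of every element in place with an explicit loop.
function bswap_loop!(y::AbstractVector{<:Integer})
    for i in eachindex(y)
        @inbounds y[i] = bswap(y[i])
    end
    return y
end

a = UInt16[0x1234, 0xabcd]
b = copy(a)

bswap_loop!(a)    # explicit loop
b .= bswap.(b)    # in-place broadcast, writes into the existing array
```

Both versions overwrite the existing array instead of allocating a new one, so the choice between them is mostly a matter of taste.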
Btw, the endian swap is a corner case. Most audio is in little endian format and we don’t need to do the swap. And 16-bit PCM audio is normally signed integers (8-bit WAV data being the usual unsigned exception). I’m just adding this comment for completeness: all the above solutions remain valid when UInt16 is replaced with Int16. The focus for me was the packing of multiple bytes and copying to one memory location.
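A short sketch of why the element type matters even though the bytes are identical: the same four 16-bit patterns are shown once as unsigned and once as signed values. The bit patterns are chosen only to hit the boundaries of the Int16 range.

```julia
u = UInt16[0x0001, 0x7fff, 0x8000, 0xffff]
s = reinterpret(Int16, u)   # same bits, read as signed two's-complement values
```

Reading signed audio as UInt16 would therefore turn small negative samples into values near 65535, which is why passing Int16 as the element type is the safer default here.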
Yes, the real reason for my comment about efficiency is the type, not the “vectorized” broadcast notation. The compiler generates less efficient code when a variable can change type at runtime, as in a structure like:
x = Type1()
if some_runtime_condition()
    x = Type2()
end
# ... do something with `x`
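Applied to the PCM case, the pattern above might look like the following sketch. Both function names are hypothetical. The first returns either a reinterpreted view or a plain Vector depending on a runtime flag; the second always returns a `Vector{UInt16}`.

```julia
# Unstable: the type of `y` depends on the runtime value of `swap`.
function load_unstable(bytes::Vector{UInt8}, swap::Bool)
    y = reinterpret(UInt16, bytes)   # a ReinterpretArray view
    if swap
        y = bswap.(y)                # now a Vector{UInt16}
    end
    return y
end

# Stable: always materialise a Vector{UInt16}, then swap in place if asked.
function load_stable(bytes::Vector{UInt8}, swap::Bool)
    y = collect(reinterpret(UInt16, bytes))
    if swap
        y .= bswap.(y)
    end
    return y
end

bytes = UInt8[0x12, 0x34, 0x56, 0x78]
```

Comparing `typeof(load_unstable(bytes, false))` with `typeof(load_unstable(bytes, true))` shows the two different return types; in the REPL, `@code_warntype` on the unstable version should flag the same thing.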