New job, new problems!
I’m currently trying to figure out the best way to read a folder of very large WAV sound files and process them. Reading an entire file at once makes me run out of memory. Thankfully, the WAV package lets me read smaller chunks at a time, so I can get around that. I was wondering, though: is there a well-established way of doing this?
I need to:
- Read each file in a folder
- Process each file (calculate spectrograms)
- Somehow downsample
- Save results
For now, I assume that each file can be processed independently, but it would be nice to have an approach that would allow for treating all files as one distributed file.
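To make the steps above concrete, here is a minimal sketch of the chunked pattern. It is written in Python purely for illustration, uses only the standard library, and assumes 16-bit PCM files. The names `iter_chunks`, `chunk_rms`, and `process_folder` are hypothetical, and the per-chunk RMS is only a stand-in for the real spectrogram and downsampling step.

```python
import math
import sys
import wave
from array import array
from pathlib import Path


def iter_chunks(path, frames_per_chunk=65536):
    """Yield blocks of 16-bit samples so the whole file is never in memory."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        while True:
            raw = wf.readframes(frames_per_chunk)
            if not raw:
                break  # end of file
            samples = array("h")
            samples.frombytes(raw)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV data is little-endian
            yield samples  # channels are interleaved if the file is not mono


def chunk_rms(samples):
    """Placeholder reduction: one number per chunk (assumption, not a spectrogram)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def process_folder(folder, frames_per_chunk=65536):
    """Process every .wav file independently, one chunk at a time."""
    results = {}
    for path in sorted(Path(folder).glob("*.wav")):
        results[path.name] = [
            chunk_rms(chunk) for chunk in iter_chunks(path, frames_per_chunk)
        ]
    return results
```

Only one chunk per file is held in memory at a time, and each file reduces to a short list of values. Saving the results is left out; since the files are handled independently, the outer loop could also be spread across worker processes.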
So far I have been considering mmap, but it seems to work only if I have already gotten all the data into one file? Is there perhaps something like a distributed