I wasn’t able to find any package for speech-based mood/emotion recognition to use in a project I’m working on, so I need some help setting something up myself.
I know that LSTMs are used for this, I’ve seen WaveNet adapted for almost every other audio problem at this point, and I have even seen some of the slightly outdated Spatio-Temporal Box Filters.
I guess what I really want to ask is: what would be a good place to start? Not necessarily the best, most accurate, or even fastest approach, but the simplest to implement.
Thanks in advance!
There’s a recent article that may be a good start: https://www.assemblyai.com/blog/end-to-end-speech-recognition-pytorch
There is a port of PyTorch in Julia:
PyTorch support for LSTM:
There is this GSoC project:
And some useful packages here:
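
Since the question was about the simplest thing to implement, here is a rough sketch of the LSTM route in PyTorch (Python, because that is what the linked article uses). It is only an illustration, not a tested recipe. It assumes you have already turned each clip into a sequence of per-frame features (MFCCs, for example). The class name `EmotionLSTM`, the 40 features per frame, the 4 emotion classes and the random tensors standing in for real data are all made up for the example.

```python
import torch
import torch.nn as nn


class EmotionLSTM(nn.Module):
    """Toy utterance-level classifier over per-frame audio features (hypothetical)."""

    def __init__(self, n_features=40, hidden_size=64, n_emotions=4):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, time, n_features)
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, n_emotions)

    def forward(self, frames):
        # nn.LSTM returns (output, (h_n, c_n)); h_n is (num_layers, batch, hidden_size)
        _, (last_hidden, _) = self.lstm(frames)
        # Summarize the whole clip with the final hidden state of the top layer
        return self.classifier(last_hidden[-1])


torch.manual_seed(0)
model = EmotionLSTM()

# Random stand-ins for real data: 8 clips, 100 frames each, 40 features per frame
dummy_batch = torch.randn(8, 100, 40)
dummy_labels = torch.randint(0, 4, (8,))

logits = model(dummy_batch)                      # one score per emotion class per clip
loss = nn.functional.cross_entropy(logits, dummy_labels)
loss.backward()                                  # gradients for a single training step
```

To actually train it you would wrap the last three lines in a loop with an optimizer (`torch.optim.Adam`, for instance) and feed real labelled clips instead of random tensors. Taking only the final hidden state is the simplest way to get one prediction per clip; averaging over all time steps is a common alternative.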