Automatic Speech Recognition in Julia

Palli · April 5, 2022, 5:15pm

I can’t find anything in Julia either. Why do you need it implemented (fully) in Julia? Isn’t it good enough to call, say, PyTorch from Julia and use whatever is available (Avalon.jl might then be helpful)? I’m not up to speed on moving models from Python to Julia, i.e. just the parameters (weights and biases). Shouldn’t that be possible, and wasn’t there even a standard for it, ONNX? It might only work for certain types of networks, e.g. I believe it’s an older standard than Transformers, so those might be excluded?

What I did find, however, is this brand-new paper from 31 March 2022:

Comprehensive experiments on the LibriSpeech corpus show that the proposed Speech2C can relatively reduce the word error rate (WER) by 19.2% over the method without decoder pre-training, and also outperforms significantly the state-of-the-art wav2vec 2.0 and HuBERT on finetuning subsets of 10h and 100h
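For anyone unsure what a "relative" WER reduction means, here is a minimal Python sketch, not taken from the paper. It assumes the usual definition of WER as word-level edit distance divided by the number of reference words. The function names `word_error_rate` and `relative_reduction` are my own, and the numbers in the usage lines are made up for illustration; they are not the paper's results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only (hypothetical, not from the paper):
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
gain = relative_reduction(10.0, 8.08)
```

So a relative reduction of 19.2% would take a hypothetical baseline WER of 10.0% down to about 8.08%, not down to zero or by 19.2 absolute points.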

“Wav2Vec 2.0” was state-of-the-art in 2020, according to its 2020 paper. Is it still so, as this other paper from Feb 2022 states (or is that just an evaluation/survey paper, and those tend to repeat claims?):

If someone DOES want to reimplement something in Julia, I at least would want them to find the state-of-the-art and use that…

Might be a helpful thread:

SincNet was also intriguing when I noticed it (it might be outdated, or not; I hadn’t heard of SpeechBrain):

SincNet is implemented in the SpeechBrain (https://speechbrain.github.io/) project as well.

sinc (and sin) looked intriguing for periodic functions, but may actually be outdated. SIREN is, if I recall, newer and better, and there is something even more recent that is better still (the applications I saw, however, were for computer vision).

I hadn’t heard of conformers (thanks for the tip), only of transformers, which they are a variant of, but they might also be too old:
