How to make log-Mel spectrogram?

jling · April 26, 2023, 4:08am

The eventual goal is a pure-Julia Whisper inference implementation, but I got stuck before the first step.

Most ingredients of the spectrogram are simply a combination of a Hann window and an STFT:

openai/whisper/blob/c09a7ae299c4c34c5839a76380ae407e7d785914/whisper/audio.py#L137-L141


      
    window = torch.hann_window(N_FFT).to(audio.device)
    stft = torch.stft(audio, N_FFT, HOP_LENGTH, window=window, return_complex=True)
    magnitudes = stft[..., :-1].abs() ** 2

    filters = mel_filters(audio.device, n_mels)

In Julia, the window and STFT are available in DSP.jl (Periodograms - periodogram estimation · DSP.jl).

But where can I find the mel filterbank matrix, which is an audio-specific frequency-scaling model?
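For reference, the window and power-spectrum step in the quoted lines can be sketched for a single frame in plain Python, which should be straightforward to port to Julia. This is only an illustration under assumptions: `hann_periodic` and `power_spectrum` are hypothetical helper names, the DFT is a naive O(N^2) loop rather than an FFT, and the framing and padding that `torch.stft` performs are not reproduced.

```python
import cmath
import math


def hann_periodic(n_fft):
    """Periodic Hann window: w[n] = 0.5 - 0.5 * cos(2*pi*n / n_fft)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_fft) for n in range(n_fft)]


def power_spectrum(frame, window):
    """Squared magnitude of the DFT of one windowed frame.

    Only the n_fft // 2 + 1 non-negative frequency bins are returned.
    """
    n_fft = len(frame)
    x = [s * w for s, w in zip(frame, window)]
    bins = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_fft) for n in range(n_fft))
        bins.append(abs(acc) ** 2)
    return bins


# Toy frame: a cosine that sits exactly on DFT bin 4 of a 16-point frame.
n_fft = 16
window = hann_periodic(n_fft)
frame = [math.cos(2.0 * math.pi * 4 * n / n_fft) for n in range(n_fft)]
spectrum = power_spectrum(frame, window)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

Stacking such frames, one every `HOP_LENGTH` samples, gives the power spectrogram that the mel filterbank is then applied to.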

rafael.guerra · April 26, 2023, 7:11am

To make a log-Mel spectrogram from a DSP spectrogram, I believe you need to transform the frequency axis according to the Mel-scale formula and display the amplitudes in dB.
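A minimal sketch of that idea in plain Python (easy to port to Julia) is given here. It makes several assumptions: it uses the common HTK-style formula mel = 2595 · log10(1 + f/700), triangular filters with unit peak, and a 10 · log10 dB conversion. The function names are hypothetical. As far as I can tell, Whisper loads a precomputed matrix generated with librosa, which defaults to a different (Slaney-style) mel variant and normalization, so this will not reproduce Whisper's numbers exactly.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style mel scale (an assumption; other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sample_rate, n_fft, n_mels):
    """Return n_mels rows of triangular weights over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 band edges, equally spaced on the mel axis from 0 Hz to Nyquist.
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        # Rising slope up to the centre frequency, falling slope after it.
        row = [max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid))) for f in bin_hz]
        bank.append(row)
    return bank


def log_mel(power_frame, bank, floor=1e-10):
    """Project one power-spectrum frame onto the mel bands and convert to dB."""
    return [
        10.0 * math.log10(max(floor, sum(w * p for w, p in zip(row, power_frame))))
        for row in bank
    ]


# Parameters matching a 16 kHz signal, 400-point FFT and 80 mel bands.
bank = mel_filterbank(sample_rate=16000, n_fft=400, n_mels=80)
flat_frame = [1.0] * (400 // 2 + 1)  # toy input: a flat power spectrum
features = log_mel(flat_frame, bank)
```

Applying `log_mel` to every frame of a power spectrogram (for example, the output of a DSP.jl spectrogram, transposed as needed) gives the log-Mel spectrogram; in matrix form this is just the filterbank matrix times the power spectrogram, followed by an element-wise log.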

Linking also this “related” thread.

pywugate · June 19, 2024, 4:09pm

You can also take a look at MFCC.jl, MusicProcessing.jl, and SpeechFeatures.jl.
