First, has anyone reimplemented Sinc neural networks in Julia? Those are for speaker recognition, state-of-the-art I believe, and should similarly help for speech recognition, and other audio work.
So far, I’m just looking at the sinc function itself, and I’m not sure the Julia implementation is as fast as it could be.
julia> function my_sinc(x) # this works (EDIT: for small domain), but now naive version
sum([((-1)^n*x^(2n))/factorial(2n + 1) for n=0:3])
julia> function my_sinc2(x) # I' looking into evalpoly, can I do something like this
evalpoly(x, (((-1)^n*x^(2n))/factorial(2n + 1) for n=0:3))
Anyone know if Padé approximant or some other way is best? I tried to google a bit, and didn’t see, but found:
Explicitly calling factorial and computing powers of xseparately for each term in a Taylor series is not a good approach. Not only is this very slow (you have a lot of redundant computations!), but it can also overflow to 0*Inf if you compute enough terms.
See, for example, the new implementation of the cosc function in Base for an example of how to compute a Taylor series iteratively (to arbitrary precision), with specialized cases (evalpoly with hardcoded coefficients to a fixed order) for single and double precision.
Many software projects, particularly research projects, require two languages […]
The vector and matrix manipulation engine is quite powerful and efficient, making Lush ideal for heavy numerical applications and applications such as signal and image processing . […]
Lush offers an extensive library for gradient-based machine learning
The key difference between your version and the Base version is that your version only works for small x values. For example, my_sinc(10) is about -130 instead of the correct answer of 0. Making sinc fast and correct for the whole range pretty much requires calling sinpi which is going to be slow.
Yes, I know that. My question was really about exactly getting evalpoly to work (and if anyone actually working on Sinc [networks] already), and if Padé approximant appropriate. With the limit going to zero, a polynominal will never work without division, but it might for a large enough region, given a longer polynominal.
What are the options actually? I’m up to speed on the math of Taylor series (and sinc has one without division, if you’re ok with infinite…), missing some details on implementing with evalpoly. I’m reading up on minimax, and Padé etc. but what are the options I should be looking into? I think this (function approximation in general) will involve only polynominals, i.e. one or two (then division), of sufficient length.
Is this helpful to find the coefficients:
There I see e.g.:
linear_king_fit(E, U) , find coefficients a and b for E[i]^2 = a + b*U^0.5
linear_rational_fit(x, y, p, q) finds the coefficients for rational polynomials: y[i] = (a + a*x[i] + ... + a[p+1]*x[i]^p) / (1 + a[p+1+1]*x[i] + ... + a[p+1+q]*x[i]^q)
I.e. the latter seems helpful, not sure with the former. You do have sqrt in processors, not sure e.g. it used for function approximations, I think that and other options in the package are for other applications.
The best library for this is called Remez.jl it finds minimax polynomials or rational functions. That said, I’m not sure you’ll be able to find something accurate over the entire range that isn’t really slow.
You mean the from -Inf to Inf as the “entire range”? I was thinking maybe -5 to 5 (just a guess), then fallback. But I’m actually not sure what the range needs to be, what’s needed for signal processing.
Sometimes zero-error point constraints are included in a Modified Remez Exchange Algorithm.
I’m not sure this should be used, for few [closest to zero] extremes and/or zero-crossings. Nor how inaccurate the approximation is tolerable. For signal processing, I guess you want the frequency preserved, i.e. zero crossings, but e.g. the extremes could be off, assuming it would only be in the vertical, not horizontal direction.
Convolution with a sinc function is an entirely different beast — the limiting factor for convolution seems unlikely to be evaluation of sinc itself (if it is evaluated at all — e.g. an FFT-based method won’t evaluate sinc, whereas an approximate overlap-add method using a truncated sinc would precompute the filter coefficients anyway).
I see it used in the code, I’m not sure, are you saying there’s a better way, or this ok, just sinc will not be the speed critical part in the network (I know sinc is just used in the first layer):
for i in range(self.N_filt):
low_pass1 = 2*filt_beg_freq[i].float()*sinc(filt_beg_freq[i].float()*self.freq_scale,t_right)
low_pass2 = 2*filt_end_freq[i].float()*sinc(filt_end_freq[i].float()*self.freq_scale,t_right)
From the paper (see top post):
The first layer of a standard CNN performs a set of time-
domain convolutions between the input waveform and some
Finite Impulse Response (FIR) filters […]
In standard CNNs,
all the L elements (taps) of each filter are learned from data.
Conversely, the proposed SincNet (depicted in Fig. 1) per-
forms the convolution with a predefined function g that de-
pends on few learnable parameters θ only
In general I do trust Julia to have the best math approximations, but it seems sinc may be an exception (I saw talk of it shouldn’t be in Base), maybe not considered too important; or at least original sinc implementation implemented before this paper I found from last October: https://arxiv.org/abs/1812.10884
A rational approximation of the sinc function based on sampling and the Fourier transforms
Figure 6 shows the absolute difference between the original sinc function sinc (πν) and its rational approximation. Despite that the sinc function
is not easy to approximate [12, 16], only 2^(6−1) = 32 terms in the proposed
formula (16) provide absolute difference smaller than 3.2 × 10^−3 within the
range ν ∈ [−2π, 2π]. A MATLAB code validating the results based on the
rational approximation (16) of the sinc function sinc (πν) is provided in Ap-
Maybe I shouldn’t look at the code, with it tainted (so I couldn’t contribute to Julia)? But ok to look at formulas?
As we can see now, the truncated form of this limit can be implemented for
sampling in accordance with equation (1). Specifically, the sinc function can
be approximated as a cosine series expansion such that
FYI: the paper also seems to be on different things:
Fig. 2. The rectangular function approximations f(t) = 1/[(2t)^70 + 1] and
[I’ve never seen raised to 70th power, this high before, not sure if fast, a better way seems for a rectangular function…]
Voigt function , can be calculated by residues in a similar
way that we performed in our earlier work  to approximate the integral
(18). Due to rapid performance and high accuracy our algorithm  has
been implemented in current version of the atmospheric model bytran [6, 15]
Three digits of accuracy is fine for specialized applications, but the Base version needs to be accurate to close to machine precision. If your application only needs a crude sinc over a limited range, and the sinc is performance-critical, then by all means use a simple approximation.
In the code you quoted almost certainly the time is mostly spent in the conv1d function. The time spent calling sinc while precomputing filter coefficients should be negligible, especially in Julia.