Landau distribution or something similar

Good day everyone!

I need a tiny toss in the right direction since I am stuck with some statistics problems.

I am investigating signals which are similar to a Landau distribution but I don’t have a good physical model to describe those (and also don’t really have to time to develop one, but it might be a nice master thesis candidate ;) ). Anyways, I am trying to characterise the distribution by its “FWHM” and peak location/height (just like a simple Gaussian) and then investigate data where I have a mix of these signals, with different widths/heights and peaks – usually two or three of them.

Here is an example with a single peak:

Screenshot 2021-08-09 at 09.16.04

here with two peaks:

Screenshot 2021-08-09 at 09.17.05

and here with 3-4 peaks:

Screenshot 2021-08-09 at 09.45.27

Just for reference: in the ROOT (CERN) framework, there is a Landau function (see ROOT: Probability Density Functions (PDF) and TMath) which is implemented here: TMath - source file and calls the G110 denlan from CERNLIB (G110 Landau Distribution). I found the Fortran source code of the DENLAN function in the tarball (Index of /download/2006_source/tar).

Since I could not find any implementation of the Landau function in Julia packages (Distributions.jl etc.) I will probably just go ahead and reimplement it (incl. a PR to Distributions.jl) and then try to fit it with a mixed model and least-squares fit with JuMP.jl or so…

…but before I go ahead, I was wondering if there is a better approach or maybe even an existing implementation I overlooked. My first attempts with crude mixed Gaussian model fittings went horribly wrong, due to the asymmetry ;)
My current best attempt is using Peaks.jl and then fiddling around with cuts on the prominence and width of the peaks, but it’s quite messy and not stable enough. Especially when the peaks are not well separated, of course (see last example)

Sorry for this very broad question…

GSL.jl is a wrapper of the Gnu Scientific Library, which has the Landau Distribution.

1 Like

Ah, I have not thought of that, thanks, I’ll check it out :)

OK, I am a bit closer but I had to reimplement the Landau function (with all the helpers) since the one in GSL.jl is not parametrised, it’s the one with fixed parameters. I was confused since in the ROOT source it’s stated that the implementation is taken from CERNLIB and also used in GSL, however in GSL only the simplified one is implemented: ROOT: math/mathcore/src/PdfFuncMathCore.cxx Source File

Anyways I’ll wrap it up and aim for a PR in Distributions.jl, maybe someone finds it useful.

Meanwhile, I still struggle to fit my data, so currently I am doing a mix of the Landau and a Gaussian, but at least I made progress. Thanks again :)

Screenshot 2021-08-09 at 21.06.01

2 Likes

Try looking through the list, & add if I missed anything useful

3 Likes

Hey,

if you know (can code) the function that you want to fit with, you can quickly construct the model with two, three, or several PDF using AlgebraPDF

2 Likes

BTW, the signals look like SiPM noise :slight_smile:

1 Like

for posterity and discussion regarding license

1 Like

That’s the problem, unfortunately. I don’t have a good physical model and the Landau distribution is fairly complicated… The usable approximations use a lot of hardcoded arrays :see_no_evil:

Ha, well, it’s a PMT signal, but not primarily noise. It’s the time residual distribution of hits with respect to muon track reconstructions.

Btw. I just had a quick chat with @jling and he already made a PR to Distributions.jl but the license is problematic. I literally did basically the same implementation :see_no_evil:

Now it works really nicely, so I’d like to pour it into a package but not sure how – w.r.t. licensing.

Screenshot 2021-08-09 at 23.17.13

1 Like

What’s the concern about licensing?
If you use the ROOT code, you’ll have to use GPL. I think the easiest is to just put this in its own package. Does it need to live in Distributions.jl?

I would be curious to see how to fit mixture fractions with Distributions. When I was playing with it time ago, I did not figure out.

It might well be that you find the description (that is close to implementation) is some physics book, or Wikipedia page :laughing: so you can well reference them

Yes, that’s the plan! Since @jling already got a bit further by adding the Distributions.jl interface, he will probably go ahead and file a new package which will depend on Distributions.jl but decoupled license-wise. :)

Ah, that’s just a crude chi2-approach and minimised using Optim.jl. I can sum it up in a blog post if you like! But it’s not groundbreaking :) I’ll have to jump to JuMP.jl since I’ll probably need a bit more control over constraints…

Yes in fact we followed the source code of ROOT, CERLIB and GSL, as there are no easy-to-implement formulas here. Everything is heavily baked-in, see https://root.cern/doc/master/QuantFuncMathCore_8cxx_source.html#l00189 ;D

1 Like

ROOT seems to use LGPL, which doesn’t have the viral property of GPL. Distributions.jl could say “this file is under LGPL” and that’s it, that doesn’t affect the rest of the package

1 Like

Ah OK, interesting. I am always confused about such peculiarities of the licensing word. To be honest, I still don’t fully understand which part of the license permits the inclusion into an MIT licensed product :see_no_evil:

I found this, which kind of says the opposite if I understand correctly: ruby - How does using a LGPL gem affect my MIT licensed application? - Software Engineering Stack Exchange

There it’s stated:

The significant difference with GPL is, that it doesn’t impose the license on software using the library. Only if you’d modify the library or directly include parts of the code in your software, then your code would have to be LGPL.

In our case, we take parts of the original code (from CERNLIB/ROOT/GSL), transpile it to Julia and then include it in the source code.
Doesn’t that require that the target project is also LGPL?

Not in its entirety. I understand if Distributions.jl don’t want complexity and would prefer a codebase which has a single license, but LGPL wouldn’t make the whole project, it’d be “mostly MIT with a single file under LGPL”.

Also Julia itself is effectively multi-license at the moment, with most of the codebase under MIT, and some bits using different licenses

1 Like

indeed it can live in a separate package with a difference license and people have done it in Python.

And here’s ours:

4 Likes

The parameters of the Landau distribution, mu and c, are related to the shift and stretching, right?
If so, you can use grid interpolation for the shape.

BTW, the expression for p(x;mu,c) in wikipedia does not look like (x-mu)/c can be used as an argument

a typo (t/c)? Could you check, please?

Thanks Jerry! Is there anything missing for the package registration? :)

In the original code the parameters are called xi and x0. Here is a quick overview how they shape the distribution:

using PGFPlotsX
using LaTeXStrings
using Colors

push!(PGFPlotsX.CUSTOM_PREAMBLE, raw"\pgfplotsset{scale=2.0}")

xs = range(-10, 40; length=500)
@pgf axis = Axis({no_marks, ultra_thick})
xis = 1:2:10
x0s = 1:2:10
for (xi, color) ∈ zip(xis, distinguishable_colors(length(xis), [RGB(1,0,0), RGB(0,0,0)]))
    for (x0, alpha) ∈ zip(x0s, range(1, 0.2; length=length(x0s)))
        push!(axis, @pgf Plot({color = color, opacity = alpha}, Coordinates(xs, [landau_pdf(x, xi, x0) for x ∈ xs])))
        push!(axis, LegendEntry("xi = $xi, x0 = $x0"))
    end
end
axis

Screenshot 2021-08-12 at 12.28.24

Ah, @misha_mikhasenko maybe the problem you encountered was that for x0<0 the function is completely flat… In my fit, I simply introduced an additional shift to x. Btw. I also fit an extra scale factor, since I am using the function outside of its context.