I need a tiny toss in the right direction since I am stuck with some statistics problems.
I am investigating signals which are similar to a Landau distribution but I don’t have a good physical model to describe those (and also don’t really have to time to develop one, but it might be a nice master thesis candidate ;) ). Anyways, I am trying to characterise the distribution by its “FWHM” and peak location/height (just like a simple Gaussian) and then investigate data where I have a mix of these signals, with different widths/heights and peaks – usually two or three of them.
Since I could not find any implementation of the Landau function in Julia packages (Distributions.jl etc.) I will probably just go ahead and reimplement it (incl. a PR to Distributions.jl) and then try to fit it with a mixed model and least-squares fit with JuMP.jl or so…
…but before I go ahead, I was wondering if there is a better approach or maybe even an existing implementation I overlooked. My first attempts with crude mixed Gaussian model fittings went horribly wrong, due to the asymmetry ;)
My current best attempt is using Peaks.jl and then fiddling around with cuts on the prominence and width of the peaks, but it’s quite messy and not stable enough. Especially when the peaks are not well separated, of course (see last example)
OK, I am a bit closer but I had to reimplement the Landau function (with all the helpers) since the one in GSL.jl is not parametrised, it’s the one with fixed parameters. I was confused since in the ROOT source it’s stated that the implementation is taken from CERNLIB and also used in GSL, however in GSL only the simplified one is implemented: ROOT: math/mathcore/src/PdfFuncMathCore.cxx Source File
Anyways I’ll wrap it up and aim for a PR in Distributions.jl, maybe someone finds it useful.
Meanwhile, I still struggle to fit my data, so currently I am doing a mix of the Landau and a Gaussian, but at least I made progress. Thanks again :)
That’s the problem, unfortunately. I don’t have a good physical model and the Landau distribution is fairly complicated… The usable approximations use a lot of hardcoded arrays
Ha, well, it’s a PMT signal, but not primarily noise. It’s the time residual distribution of hits with respect to muon track reconstructions.
Btw. I just had a quick chat with @jling and he already made a PR to Distributions.jl but the license is problematic. I literally did basically the same implementation
Now it works really nicely, so I’d like to pour it into a package but not sure how – w.r.t. licensing.
What’s the concern about licensing?
If you use the ROOT code, you’ll have to use GPL. I think the easiest is to just put this in its own package. Does it need to live in Distributions.jl?
I would be curious to see how to fit mixture fractions with Distributions. When I was playing with it time ago, I did not figure out.
It might well be that you find the description (that is close to implementation) is some physics book, or Wikipedia page so you can well reference them
Yes, that’s the plan! Since @jling already got a bit further by adding the Distributions.jl interface, he will probably go ahead and file a new package which will depend on Distributions.jl but decoupled license-wise. :)
Ah, that’s just a crude chi2-approach and minimised using Optim.jl. I can sum it up in a blog post if you like! But it’s not groundbreaking :) I’ll have to jump to JuMP.jl since I’ll probably need a bit more control over constraints…
ROOT seems to use LGPL, which doesn’t have the viral property of GPL. Distributions.jl could say “this file is under LGPL” and that’s it, that doesn’t affect the rest of the package
Ah OK, interesting. I am always confused about such peculiarities of the licensing word. To be honest, I still don’t fully understand which part of the license permits the inclusion into an MIT licensed product
The significant difference with GPL is, that it doesn’t impose the license on software using the library. Only if you’d modify the library or directly include parts of the code in your software, then your code would have to be LGPL.
In our case, we take parts of the original code (from CERNLIB/ROOT/GSL), transpile it to Julia and then include it in the source code.
Doesn’t that require that the target project is also LGPL?
Not in its entirety. I understand if Distributions.jl don’t want complexity and would prefer a codebase which has a single license, but LGPL wouldn’t make the whole project, it’d be “mostly MIT with a single file under LGPL”.
Also Julia itself is effectively multi-license at the moment, with most of the codebase under MIT, and some bits using different licenses
The parameters of the Landau distribution, mu and c, are related to the shift and stretching, right?
If so, you can use grid interpolation for the shape.
BTW, the expression for p(x;mu,c) in wikipedia does not look like (x-mu)/c can be used as an argument
Ah, @misha_mikhasenko maybe the problem you encountered was that for x0<0 the function is completely flat… In my fit, I simply introduced an additional shift to x. Btw. I also fit an extra scale factor, since I am using the function outside of its context.