Peak finding from a distribution

newtothis · October 27, 2021, 2:48pm

I have a distribution as a vector in Julia. I can plot a histogram and/or a kde of for this data, but I want to find the peak values. The naive way in which I am doing it is to plot the histogram and then just retrieve the mode of the data in the vector used to plot the histogram. However, this output is sometimes not close to the peak I can see from the histogram. What do I do?

halleysfifthinc · October 27, 2021, 3:46pm

If there is only a single peak of interest for the distribution vector, then maximum/findmax (in Base) will find it for you. For multiple peaks (local maxima), the Peaks.jl package has relevant functions (e.g. findmaxima).

gustaphe · October 27, 2021, 4:07pm

Is it a vector of samples, or a probability density function/estimate?

If the former, you are looking for “Distribution Fitting” (Distribution Fitting · Distributions.jl). If the latter possibly GitHub - JuliaNLSolvers/LsqFit.jl: Simple curve fitting in Julia

rafael.guerra · October 27, 2021, 5:24pm

If you can fit a distribution using Distributions, then compute the mode.

dlakelan · October 27, 2021, 7:57pm

There’s a bug in your code if when you choose the middle of the histogram bin that has the highest count, it isn’t at the location that has the highest count on your histogram plot. Of course sometimes histograms are noisy, so the location with the highest count isn’t near the “peak” of the “smoothed histogram” you have in your brain. If that’s the issue, try using a kernel density estimate, and choose the sample with the highest estimated density.

using KernelDensity, Distributions,Random,StatsPlots

Random.seed!(1)
data = rand(Normal(3.0,3.0),200)

den = kde(data)
mval = findmax(den.density)
plot(den,xlim=(2,4))
println("Peak density occurs at $(den.x[mval[2]])")

newtothis · October 28, 2021, 1:12pm

It’s a vector of samples. Essentially, I run a function that counts how many steps some process takes. I repeat the process over many realisations to find some sort of a sample distribution. I want to find what the most likely number of steps is, for this process to happen.

I’m sorry if this is badly phrased or if it’s a known problem. I’m quite new to both stats and julia.

newtothis · October 28, 2021, 1:13pm

The kde from this code is just the line y = 0.

lmiq · October 28, 2021, 1:28pm

Maybe something like this is what you want:

julia> nsteps() = rand(1:50) # the "number of steps"
nsteps (generic function with 1 method)

julia> counter = zeros(Int,50); # 50 is the maximum number of steps possible

julia> for i in 1:1000 # number of samples
           n = nsteps() # check the number of steps
           counter[n] +=1 # add to counter
       end

julia> findmax(counter) # find index and number of the maximum counter
(28, 7)

julia> counter[7]
28

dlakelan · October 28, 2021, 1:51pm

No it’s not, but the code does zoom in to the region where the peak is and the curve is quite flat when zoomed.

Note, I was using Julia 1.7 so perhaps the RNG is different from your Julia version.

Note also, I edited the post to include using Random,StatsPlots because I had those already loaded in my julia session and didn’t notice you need to do that.

When I run the code above it finds a peak density at x = 2.85156… and graphs a fairly flat parabola like shape with peak value about 0.13589 at that x location.

newtothis · October 28, 2021, 6:35pm

Thank you. It works now.

Topic		Replies	Views
Find max value in density plot Statistics	7	2304	November 19, 2021
Estimate most probable value from data Statistics question	7	2972	June 19, 2019
Resources for learning Julia coming from Python Signal and Image Processing question	3	831	September 2, 2021
Density and histograms New to Julia plotting , plots , histogram	5	400	June 10, 2024
Landau distribution or something similar Statistics	23	3386	August 12, 2021

Peak finding from a distribution

Related topics