Peak finding from a distribution

I have a distribution as a vector in Julia. I can plot a histogram and/or a kde of for this data, but I want to find the peak values. The naive way in which I am doing it is to plot the histogram and then just retrieve the mode of the data in the vector used to plot the histogram. However, this output is sometimes not close to the peak I can see from the histogram. What do I do?

If there is only a single peak of interest for the distribution vector, then maximum/findmax (in Base) will find it for you. For multiple peaks (local maxima), the Peaks.jl package has relevant functions (e.g. findmaxima).

1 Like

Is it a vector of samples, or a probability density function/estimate?

If the former, you are looking for “Distribution Fitting” (Distribution Fitting · Distributions.jl). If the latter possibly GitHub - JuliaNLSolvers/LsqFit.jl: Simple curve fitting in Julia

If you can fit a distribution using Distributions, then compute the mode.

1 Like

There’s a bug in your code if when you choose the middle of the histogram bin that has the highest count, it isn’t at the location that has the highest count on your histogram plot. Of course sometimes histograms are noisy, so the location with the highest count isn’t near the “peak” of the “smoothed histogram” you have in your brain. If that’s the issue, try using a kernel density estimate, and choose the sample with the highest estimated density.

using KernelDensity, Distributions,Random,StatsPlots

data = rand(Normal(3.0,3.0),200)

den = kde(data)
mval = findmax(den.density)
println("Peak density occurs at $(den.x[mval[2]])")
1 Like

It’s a vector of samples. Essentially, I run a function that counts how many steps some process takes. I repeat the process over many realisations to find some sort of a sample distribution. I want to find what the most likely number of steps is, for this process to happen.

I’m sorry if this is badly phrased or if it’s a known problem. I’m quite new to both stats and julia.

The kde from this code is just the line y = 0.

Maybe something like this is what you want:

julia> nsteps() = rand(1:50) # the "number of steps"
nsteps (generic function with 1 method)

julia> counter = zeros(Int,50); # 50 is the maximum number of steps possible

julia> for i in 1:1000 # number of samples
           n = nsteps() # check the number of steps
           counter[n] +=1 # add to counter

julia> findmax(counter) # find index and number of the maximum counter
(28, 7)

julia> counter[7]

No it’s not, but the code does zoom in to the region where the peak is and the curve is quite flat when zoomed.

Note, I was using Julia 1.7 so perhaps the RNG is different from your Julia version.

Note also, I edited the post to include using Random,StatsPlots because I had those already loaded in my julia session and didn’t notice you need to do that.

When I run the code above it finds a peak density at x = 2.85156… and graphs a fairly flat parabola like shape with peak value about 0.13589 at that x location.

Thank you. It works now.