I have some empirical values for the cdf of a distribution that should be close to normal, evaluated at x = 1, 2, …, 13. They are as follows:
[5.2909066017292616e-17, 2.998035847356917e-15, 2.893916126231429e-12, 2.893916126231429e-12, 5.984024927027287e-11, 9.828060393355409e-10, 1.3065783612939083e-8, 1.4263214765731293e-7, 1.2913225520247604e-6, 9.765050285646565e-6, 6.197926915196307e-5, 0.00033122740938565793, 0.001493228789321792]
I would like to find the normal distribution that fits these values as closely as possible. To do this I first defined a least-squares cost function over the mean and standard deviation, and then tried to minimize it with optimize from Optim. I suspect I am doing something wrong, however.
using Distributions
using Optim
empiricalcdf = [5.2909066017292616e-17, 2.998035847356917e-15, 2.893916126231429e-12, 2.893916126231429e-12, 5.984024927027287e-11, 9.828060393355409e-10, 1.3065783612939083e-8, 1.4263214765731293e-7, 1.2913225520247604e-6, 9.765050285646565e-6, 6.197926915196307e-5, 0.00033122740938565793, 0.001493228789321792]
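# Sum of squared differences between the empirical cdf and the cdf of
# Normal(x[1], x[2]) evaluated at 1, 2, ..., 13.
function cdfcost(x)
    return sum((empiricalcdf .- cdf.(Normal(x[1], x[2]), 1:13)).^2)
end
print(optimize(cdfcost, [10.0, 2.0], BFGS()))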
This gives me:
* Status: success
* Candidate solution
Minimizer: [1.00e+01, 2.00e+00]
Minimum: 2.428780e+00
* Found with
Algorithm: BFGS
Initial Point: [1.00e+01, 2.00e+00]
* Convergence measures
|x - x'| = 0.00e+00 ≤ 0.0e+00
|x - x'|/|x'| = 0.00e+00 ≤ 0.0e+00
|f(x) - f(x')| = 0.00e+00 ≤ 0.0e+00
|f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
|g(x)| = 9.24e-01 ≰ 1.0e-08
* Work counters
Seconds run: 0 (vs limit Inf)
Iterations: 1
f(x) calls: 52
∇f(x) calls: 52
The minimizer is identical to the initial point and the gradient norm is still 0.924, so it seems that no optimization has occurred. What am I doing wrong, and is there a simpler way to achieve the same goal?
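
One possibly simpler route, given here as a sketch rather than a definitive fix: if the cdf is F(x) = Φ((x - μ)/σ), then applying the standard normal quantile function to each value gives Φ⁻¹(F(x)) = (x - μ)/σ, which is linear in x. An ordinary least-squares line through the transformed values then gives σ = 1/slope and μ = -intercept/slope, with no optimizer involved. The names xs, z, A, a, b, mu_fit and sigma_fit are illustrative and do not come from the code above. This minimizes squared error on the quantile scale, which is a different criterion from cdfcost, and it assumes every value lies strictly between 0 and 1.

```julia
using Distributions

empiricalcdf = [5.2909066017292616e-17, 2.998035847356917e-15, 2.893916126231429e-12, 2.893916126231429e-12, 5.984024927027287e-11, 9.828060393355409e-10, 1.3065783612939083e-8, 1.4263214765731293e-7, 1.2913225520247604e-6, 9.765050285646565e-6, 6.197926915196307e-5, 0.00033122740938565793, 0.001493228789321792]
xs = collect(1.0:13.0)

# Map each cdf value to its standard normal quantile (z-score).
z = quantile.(Normal(), empiricalcdf)

# Least-squares fit of z ≈ a + b * x using a two-column design matrix.
A = [ones(length(xs)) xs]
a, b = A \ z

# Since z = (x - μ)/σ, the slope is 1/σ and the intercept is -μ/σ.
sigma_fit = 1 / b
mu_fit = -a / b

println("mean ≈ ", mu_fit, ", standard deviation ≈ ", sigma_fit)
```

Because the values span about 14 orders of magnitude, a squared error on the raw cdf is dominated by the few largest values, whereas the quantile scale weights all 13 points comparably. The result could also serve as a starting point for optimize if a fit on the cdf scale is what is actually wanted.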