Good $(period_of_day) to all,

Let me preface this question by saying that there is a good chance (p > 0.8) that I don’t know what I’m talking about, so feel free to redirect me. I’ll explain what I have and what I need to do, and you can tell me what keywords I should search on.

### What I have:

I have a `DataFrame` (although it could easily be an `Array{Int64, 2}`) that holds a data distribution. Column 1 is called `:bucket` and contains the upper edges of 200 fixed-width buckets. Column 2 is called `:count` and contains the population of each bucket.

In most cases, the distribution is close to, but not exactly, a log-normal distribution. There are times when it is double-humped, but I’ve narrowed this down to it being a combination of two log-normal-like distributions (in reality it is almost always a combination of multiple distributions, but typically one of them way outnumbers the others).

I can also determine the geometric mean and geometric standard deviation of this overall distribution.
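For reference, the geometric mean and geometric standard deviation can be estimated from binned data roughly as follows. This is only a sketch: it assumes each bucket can be represented by its midpoint, and `bucket` and `counts` are made-up example vectors standing in for the `:bucket` and `:count` columns.

```julia
# Hypothetical example data (not the real DataFrame):
bucket = collect(10.0:10.0:50.0)   # upper edges of fixed-width buckets
counts = [5, 20, 40, 25, 10]       # population of each bucket

width = bucket[2] - bucket[1]      # buckets are fixed width
mids  = bucket .- width / 2        # assumption: a bucket is represented by its midpoint

n        = sum(counts)
mean_log = sum(counts .* log.(mids)) / n                       # weighted mean of log(x)
var_log  = sum(counts .* (log.(mids) .- mean_log) .^ 2) / n    # weighted population variance of log(x)

geometric_mean   = exp(mean_log)
geometric_stddev = exp(sqrt(var_log))
```

With 200 narrow buckets the midpoint approximation should matter little, but it is still an approximation.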

### What I’m trying to do:

I’m trying to identify the components of the curve, i.e., the most significant log-normal distributions, via their geometric means and geometric standard deviations.

### What I’ve tried:

I’ve tried generating a random Log-normal distribution using something like this:

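```
using DataFrames, StatsBase

# df is the DataFrame from above; geometric_mean and geometric_stddev are computed beforehand
z = randn(sum(df.count))                                   # one standard normal draw per observation
nos_n = log(geometric_mean) .+ log(geometric_stddev) .* z  # scale first, then shift (in log space)
nos_ln = exp.(nos_n)                                       # back to the original scale
width = df.bucket[2] - df.bucket[1]                        # buckets are fixed width
edges = [df.bucket[1] - width; df.bucket]                  # :bucket holds upper edges, so prepend the lowest edge
dist_ln = fit(Histogram, nos_ln, edges)
```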

This does give me a log-normal distribution, but it doesn’t match the distribution I have in the DataFrame. So I basically keep repeating this with smaller sample sizes until I get a distribution that fits inside the original distribution, then I subtract that from the original and try again with the leftover.
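For concreteness, one round of that generate-and-subtract procedure might look like the sketch below. Everything in it is hypothetical and chosen for illustration: the `observed` counts, the bucket `edges`, and the trial values of `n`, `gm`, and `gsd`.

```julia
using Random
Random.seed!(1)                       # fixed seed so the run is repeatable

# Hypothetical data: 10 fixed-width buckets and their observed populations.
edges    = collect(0.0:10.0:100.0)    # lowest edge plus the 10 upper edges
observed = [40, 220, 260, 180, 120, 80, 50, 30, 15, 5]

# Trial component: sample size, geometric mean, geometric standard deviation.
n, gm, gsd = 500, 25.0, 1.6
samples = exp.(log(gm) .+ log(gsd) .* randn(n))

# Count samples per bucket; each bucket is (lower, upper], matching "upper edges".
trial = [count(x -> lo < x <= hi, samples)
         for (lo, hi) in zip(edges[1:end-1], edges[2:end])]

fits_inside = all(trial .<= observed) # does the trial fit under the original?
leftover    = observed .- trial       # what remains for the next round
```

If `fits_inside` is false, the next step in the procedure described above would be to retry with a smaller `n`.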

### My questions:

- Is there a better way to do this in Julia?
- Is there a standard name for what I’m doing?

Thanks for reading this far.

Philip