Hello all,

I’ve been thinking some more about a refactoring of Distributions.jl, and I’d like to ask for feedback on a proof of concept.

First, the abstract types:

```
abstract type Sampleable{P,X} end
abstract type Distribution{P,X} <: Sampleable{P,X} end
```

Here `P`

is the parameter space, and `X`

is the type of the sample space. Critically, there are no constraints on either of these at the type level. If you want `X`

to be something other than `<: Real`

or `<: Array`

, go for it. No problem.

`P`

can be anything, but `NamedTuple`

s are especially convenient, because they make it easy to use different parameterizations. For example, a `Gamma`

could look like this:

```
struct Gamma{P,X} <: Distribution{P,X}
par :: P
Gamma(nt::NamedTuple) = Gamma{typeof(nt), typeof(nt[1])}(nt)
end
```

Again, **the types are very flexible**. A `Gamma`

over symbolic values, for example, is easy.

Passing a `NamedTuple`

is a bit awkward, but that’s easily fixed with

```
Gamma(;kwargs...) = Gamma((;kwargs...))
```

We may also want a default parameterization without requiring names:

```
Gamma(k :: Real, θ :: Real) = Gamma(k=k, θ=θ)
```

`logpdf`

then looks like this:

```
function logpdf(d::Gamma{P,X}, x::X) where {P <: NamedTuple{(:k,:θ)}, X <: Real}
k = d.par.k; θ = d.par.θ
(k - 1) * log(x) - x/θ - k * log(θ) - loggamma(k)
end
```

This is for a shape `k`

and scale `θ`

. Maybe we’d rather parameterize by a shape and rate? Ok:

```
function logpdf(d::Gamma{P,X}, x::X) where {P <: NamedTuple{(:α,:β)}, X <: Real}
α = d.par.α; β = d.par.β
(α - 1) * log(x) - β*x + α * log(β) - loggamma(α)
end
```

Or by a shape and a mean?

```
function logpdf(d::Gamma{P,X}, x::X) where {P <: NamedTuple{(:k,:μ)}, X <: Real}
μ = d.par.μ; k = d.par.k
α = k; β = k/μ
(α - 1) * log(x) - β*x + α * log(β) - loggamma(α)
end
```

This is already much faster than Distributions (at least in part due to lack of inlining in the latter):

```
julia> @btime logpdf(Gamma(2.0,1.0),4.0)
13.964 ns (0 allocations: 0 bytes)
-2.613705638880109
```

vs

```
julia> using Distributions
julia> @btime logpdf(Gamma(2.0,1.0),4.0)
36.097 ns (0 allocations: 0 bytes)
-2.613705638880109
```

Some things to consider:

- Lots of packages depend on
`Distributions.jl`

, and these breaking changes could take a while to be fully supported - Defining separate methods for each parameterization is a bit verbose, but there’s always the option to fall back on one canonical parameterization
- Some aspects of this would probably need to be implemented using traits. But let’s not rush to that until it’s clearly the best approach
- This approach specifies a parameterization in terms of the names for a
`NamedTuple`

. But there’s likely to be some contention exactly what those names should be, for example should we replace`k`

and`θ`

with`shape`

and`scale`

? - Fields in
`NamedTuple`

s are ordered, which is a little troubling for this use case. There may be a clever way to dispatch according to the*set*of names, rather than the tuple itself