# What is the idiomatic way to write a model in Turing?

What is the idiomatic way to write a model in Turing? For example, in the Turing documentation for a binary coinflip:

``````using Turing
using Distributions
using Random

p_true = 0.5
n = 1000
Random.seed!(12)
data = rand(Bernoulli(p_true), n)

# From the Turing example in the documentation
@model function coinflip1(y)
    p ~ Beta(1., 1.)
    for i in 1:length(y)
        y[i] ~ Bernoulli(p)
    end
end

# I thought the following coinflip2 code would not work because it is
# not documented, but it works
@model function coinflip2(y)
    p ~ Beta(1., 1.)
    y ~ Bernoulli(p)
end

# This model is from the paper "DynamicPPL: Stan-like Speed for
# Dynamic Probabilistic Models"
@model function coinflip3(y)
    p ~ Beta(1., 1.)
    y .~ Bernoulli(p)
end
``````

coinflip2 and coinflip3 are faster than coinflip1, even though they are not shown in the docs. Using 2000 iterations and NUTS(0.65), here is a simple benchmark using the @time macro:

``````#coinflip1
9.260588 seconds (110.87 M allocations: 3.331 GiB, 6.34% gc time)
#coinflip2
0.161384 seconds (938.79 k allocations: 55.367 MiB, 8.81% gc time)
#coinflip3
0.172289 seconds (968.04 k allocations: 57.130 MiB, 6.77% gc time)
``````

coinflip1 is more natural and clear; coinflip2 and coinflip3 look more like PyMC3 models.

Thanks


I suspect that the for loop in `coinflip1` performs a separate evaluation on each iteration, with overhead each time. Perhaps there is some way to improve performance on the Turing side. As a workaround, you can create a new distribution type and place the for loop inside the `logpdf` function. I use this in cases where (1) I am defining a custom distribution, (2) I am using a non-standard data type, or (3) a for loop is a more natural way to define the model. Here is an example for comparison:

``````using Turing
using Distributions
using Random

import Distributions: logpdf, loglikelihood

struct MyDist{T} <: ContinuousUnivariateDistribution
    θ::T
end

# Evaluate the whole likelihood in one call, looping over the data here
function logpdf(d::MyDist, data::Array{Bool,1})
    LL = 0.0
    for i in 1:length(data)
        LL += logpdf(Bernoulli(d.θ), data[i])
    end
    return LL
end

loglikelihood(d::MyDist, data::Array{Bool,1}) = logpdf(d, data)

p_true = 0.5
n = 1000
Random.seed!(12)
data = rand(Bernoulli(p_true), n)

@model function coinflip4(y)
    p ~ Beta(1., 1.)
    y ~ MyDist(p)
end

n_samples = 2000
delta = .65
config = NUTS(delta)

@time chain = sample(coinflip4(data), config, n_samples)
``````

The run time for `coinflip4` is nearly identical to `coinflip3`.

`0.205202 seconds (1.19 M allocations: 74.529 MiB, 5.45% gc time)`

Actually, when I first started using Turing, my complaint was that it was slow on this binary coinflip problem compared to PyMC3. I wondered whether a vectorized version was allowed or documented. I could not find it in the docs or examples, but I did find it in the arXiv paper I mentioned above. Thank you for the workaround tip, I learned something new.

Try ReverseDiff as the AD backend and use

``````y ~ filldist(Bernoulli(p), length(y))
``````

or

``````y .~ Bernoulli(p)
``````

How should I think about the difference between the different ways of writing things? Are there docs on this somewhere?

There are performance tips in the docs: https://turing.ml/dev/docs/using-turing/performancetips


Thank you so much. The

``````y .~ Bernoulli(p)
``````

is a nice way, especially for someone coming from Python, like me. I hope there will be some documentation about when to use this method.

So all these ways of writing the model are actually the exact same thing, and the syntax is just a matter of performance?

yes
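To make the equivalence concrete, here is a small sketch (not from the thread; it uses only Distributions.jl, with `p` and `y` as made-up toy values) showing that the per-element loop, the single `loglikelihood` call, and `filldist` all compute the same log likelihood; only how Turing evaluates and differentiates them differs:

``````using Distributions
using Turing  # for filldist

p = 0.3
y = [true, false, true, true, false]

# Loop form: one logpdf call per observation (what coinflip1 does)
ll_loop = sum(logpdf(Bernoulli(p), yi) for yi in y)

# Single-call form: one loglikelihood over the whole vector (coinflip2)
ll_vec = loglikelihood(Bernoulli(p), y)

# Product-distribution form via filldist (the filldist suggestion above)
ll_fill = logpdf(filldist(Bernoulli(p), length(y)), y)

ll_loop ≈ ll_vec ≈ ll_fill  # true: identical log likelihoods
``````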
