What is the idiomatic way write model in Turing

What is the idiomatic way to write model in Turing. For example in Turing documentation for binary coinflip:

using Turing
using Distributions
using Random

p_true = 0.5
n = 1000
Random.seed!(12)
data = rand(Bernoulli(p_true),n)
# From Turing example in the documentation
@model function coinflip1(y)
    p ~ Beta(1.,1.)
    for i in 1:length(y)
        y[i] ~ Bernoulli(p)
    end
end

# I though the t following coinflip2 code is not working because no documentation, but its work
@model function coinflip2(y)
    p ~ Beta(1.,1.)
    y ~ Bernoulli(p)
end

# This model I read from paper "DynamicPPL: Stan-like Speed for Dynamic Probabilistic Models" 
@model function coinflip3(y)
    p ~ Beta(1.,1.)
    y .~ Bernoulli(p)
end

The coinflip2 and coinflip3 is faster than coinflip1 even have not written in doc. Using 2000 iterations and NUTS(0.65), here is simple benchmark using macro @time:

#coinflip1
  9.260588 seconds (110.87 M allocations: 3.331 GiB, 6.34% gc time)
#coinflip2
  0.161384 seconds (938.79 k allocations: 55.367 MiB, 8.81% gc time)
#coinflip3
  0.172289 seconds (968.04 k allocations: 57.130 MiB, 6.77% gc time)

The coinflip1 is more natural and clear. The coinflip2 and coinflip3 like pymc3 model.

Thanks

I suspect that the for loop in coinflip1 performs a separate evaluation on each iteration with overhead. Perhaps there is someway to improve performance on the Turing side. As a workaround, you can create a new distribution type and place the for loop inside the logpdf function. I use this in cases where (1) I am defining a custom distribution, (2) I am using a non-standard data type, (3) a for loop is a more natural way to define the model. Here is an example for comparison:

using Turing
using Distributions
using Random

import Distributions: logpdf, loglikelihood

struct MyDist{T} <: ContinuousUnivariateDistribution
    θ::T
end

function logpdf(d::MyDist, data::Array{Bool,1})
    LL = 0.0
    for i in 1:length(data)
        LL += logpdf(Bernoulli(d.θ), data[i])
    end
    return LL
end

loglikelihood(d::MyDist, data::Array{Bool,1}) = logpdf(d, data)

p_true = 0.5
n = 1000
Random.seed!(12)
data = rand(Bernoulli(p_true), n)

@model function coinflip4(y)
    p ~ Beta(1.,1.)
    y ~ MyDist(p)
end

n_samples = 2000
delta = .65
n_adapt = 1000
config = NUTS(n_adapt, delta)

@time chain = sample(coinflip4(data), config, n_samples)

The run time for coinflip4 is nearly identical to coinflip3.

0.205202 seconds (1.19 M allocations: 74.529 MiB, 5.45% gc time)

Actually when I first using Turing, my complain is about it is slow in this binary coinflip problem compared to pymc3. I just wonder whether there are vectorized version were allowed or documented. However I can not find it in the doc or example but find it in the Arxiv paper that I mentioned before. Thank you for the workaround tips, I learned new thing about it.

try ReverseDiff as an AD backend and use

y ~ filldist(Bernoulli(p), length(y))

or

y .~ Bernoulli(p)

How should I think about the difference between the different ways of writing things? Are there docs on this somewhere?

there are performance tips in the docs Performance Tips

Thank you so much. The

y .~ Bernoulli(p)

is nice way especially for some one from python, like me. I hope there is some doc about when to use this method.

So all these ways of writing the model are actually the exact same thing, and the syntax is just a matter of performance?

yes