I’m a beginner with Turing.jl, probabilistic programming, and Bayesian statistics, and I’m stuck on a key piece of the introductory coin-flip example in the docs. I’m hoping someone can explain it to me.

I understand the first part, where they start with a prior distribution and then update it based on some data that is collected (without using Turing.jl):

```
Ns = 0:100

for (i, N) in enumerate(Ns)
    # Count the number of heads and tails.
    heads = sum(data[1:i-1])
    tails = N - heads

    # Update our prior belief in closed form (this is possible because we use a conjugate prior).
    updated_belief = Beta(prior_belief.α + heads, prior_belief.β + tails)

    # Plotting
    plot(updated_belief,
        size = (500, 250),
        title = "Updated belief after $N observations",
        xlabel = "probability of heads",
        ylabel = "",
        legend = nothing,
        xlim = (0, 1),
        fill = 0, α = 0.3, w = 3)
    vline!([p_true])
end
```
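For what it’s worth, I can reproduce the closed-form update by hand without any of the plotting (my own minimal sketch, not from the docs — the data here is simulated and `Distributions` is the only dependency):

```
using Distributions, Random

Random.seed!(12)
p_true = 0.5
data = rand(Bernoulli(p_true), 100)   # simulated coin flips (true = heads)

prior_belief = Beta(1, 1)
heads = sum(data)
tails = length(data) - heads

# Conjugacy: Beta prior + Bernoulli likelihood gives a Beta posterior.
posterior = Beta(prior_belief.α + heads, prior_belief.β + tails)

println(mean(posterior))   # posterior mean = (heads + 1) / (N + 2)
```

So the non-Turing half of the example makes sense to me as pure arithmetic on the Beta parameters.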

This idea of starting with a prior distribution `Beta(1, 1)` and then updating it as you observe data makes perfect sense to me. However, when we get to the same example implemented with Turing.jl, I don’t understand what is going on inside the for loop, or where the observed data comes into play.

```
using Turing

@model coinflip(y) = begin
    # Our prior belief about the probability of heads in a coin.
    p ~ Beta(1, 1)

    # The number of observations.
    N = length(y)
    for n in 1:N
        # Heads or tails of a coin are drawn from a Bernoulli distribution.
        y[n] ~ Bernoulli(p)
    end
end
```
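For completeness, here is roughly how I’m then conditioning on the data and sampling (the sampler settings are my guess from memory, not the part I’m asking about):

```
using Turing, Random

@model coinflip(y) = begin
    p ~ Beta(1, 1)
    for n in 1:length(y)
        y[n] ~ Bernoulli(p)
    end
end

Random.seed!(12)
data = rand(Bernoulli(0.5), 100)

# The observed flips are passed into the model as `y`,
# and the sampler explores values of `p`.
chain = sample(coinflip(data), HMC(0.05, 10), 1000)
```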

As I understand this, we are defining a model `coinflip` that takes some observed data as an argument. Inside this model we start with the same prior as before, where we believe that the parameter `p`, the probability of getting heads on any given flip of the coin, is distributed as `Beta(1, 1)`. Then, for every observation in our data, we are drawing from the Bernoulli distribution (??) and then… ??

I think what I need to know is: what exactly is `y[n] ~ Bernoulli(p)` doing? Is there an equivalent in “plain vanilla” Julia code? Are we replacing each value in `y` with a randomly drawn value from the Bernoulli distribution? If so, why would we even need the data in the first place?
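To make the question concrete, here is my best guess at what the observe statement might reduce to in plain Julia — accumulating log-likelihood for the fixed observed values, rather than drawing new ones. This is my own sketch (the function name is made up), and it may be exactly the misunderstanding I’m asking about:

```
using Distributions

# Hypothetical expansion of the model body for a *given* value of p:
# each `y[n] ~ Bernoulli(p)` adds logpdf(Bernoulli(p), y[n]) to a running
# total, instead of overwriting y[n] with a random draw.
function coinflip_logjoint(y, p)
    logp = logpdf(Beta(1, 1), p)        # prior term from `p ~ Beta(1, 1)`
    for n in 1:length(y)
        logp += logpdf(Bernoulli(p), y[n])
    end
    return logp
end

y = [1, 0, 1, 1, 0]
println(coinflip_logjoint(y, 0.6))   # ≈ -3.365
```

Is this anywhere close to what `~` means for the observed `y`, with the sampler then using this log-density to explore values of `p`?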