Yes, the notation y \sim \text{normal}(\mu, \sigma) is identical to y = \mu + \varepsilon with \varepsilon \sim \text{normal}(0, \sigma).
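As a quick sanity check of this equivalence (my own toy simulation, not from the thread): drawing y directly from normal(μ, σ) and drawing y as μ plus normal(0, σ) noise gives samples with matching moments.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 2.0, 1.5, 1_000_000

# y ~ normal(mu, sigma), drawn directly
y_direct = rng.normal(mu, sigma, n)

# y = mu + eps with eps ~ normal(0, sigma)
y_shifted = mu + rng.normal(0.0, sigma, n)

# Both samples agree in mean and sd up to Monte Carlo error
print(np.mean(y_direct), np.mean(y_shifted))
print(np.std(y_direct), np.std(y_shifted))
```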
Keep in mind that a variance on the scale of about 100 only corresponds to a standard deviation of about 10, though I agree that this prior is a bit wide by today’s standards (assuming x and y are on a roughly standard normal scale).
Priors like this were popular about 10 years ago; the dangers of such vague priors have only been appreciated relatively recently. See [1708.07487] The prior can generally only be understood in the context of the likelihood for a related paper.
The idea was often to “let the data speak for itself” and to “be conservative” (in the sense of not letting the prior influence the posterior too much), but we now know that wide priors can sometimes influence the model in unexpected ways.
A common example showing that high variance does not necessarily mean fewer assumptions is logistic regression: a model like y \sim \text{bernoulli}(\text{inv\_logit}(\theta)) with \theta = \beta_0 + \beta x can allocate a lot of probability mass on values of \text{inv\_logit}(\theta) near 0 and 1 if the prior for \beta is wide.
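To see this concretely, here is a small prior-predictive simulation (the prior scales are my own illustrative choices): with \beta_0, \beta \sim \text{normal}(0, 100) and x standard normal, \theta is huge in magnitude for almost all draws, so the implied success probability piles up at the extremes.

```python
import numpy as np

def inv_logit(t):
    # numerically stable logistic function (plain 1/(1+exp(-t)) overflows
    # for the very large |t| values a wide prior produces)
    out = np.empty_like(t)
    pos = t >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-t[pos]))
    e = np.exp(t[~pos])
    out[~pos] = e / (1.0 + e)
    return out

rng = np.random.default_rng(1)
n = 100_000
beta0 = rng.normal(0, 100, n)   # wide prior on the intercept
beta = rng.normal(0, 100, n)    # wide prior on the slope
x = rng.normal(0, 1, n)         # covariate on a standard normal scale

theta = beta0 + beta * x
p = inv_logit(theta)

# Fraction of prior-predictive mass in the extreme bins vs the middle
extreme = np.mean((p < 0.01) | (p > 0.99))
middle = np.mean((p > 0.4) & (p < 0.6))
print(f"P(p < 0.01 or p > 0.99) ~ {extreme:.2f}")
print(f"P(0.4 < p < 0.6)        ~ {middle:.3f}")
```

Most of the mass ends up on near-deterministic outcomes, which is a strong assumption, not a conservative one.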
Whether this is desirable or not depends on the context of course, but usually the goal the modeler had in mind was not to encode strong prior assumptions into the model.
Though for a model like this, the justification for choosing a prior like \sigma \sim \text{normal}(0, 100) is often simply that it does not matter: the error scale is usually well identified in linear models with normal likelihoods, so it really makes no difference whether you use 10, 100, or 1e6 as the prior standard deviation.
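A rough numerical illustration of that last point (my own toy check, using a grid approximation and a known-mean normal model for simplicity): the posterior mean of \sigma barely moves when the half-normal prior scale goes from 10 to 1e6.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(0.0, 3.0, 200)  # data with true error scale 3, known mean 0

# Grid approximation to the posterior of sigma
sigma_grid = np.linspace(0.1, 20.0, 2000)
log_lik = np.array([np.sum(-np.log(s) - 0.5 * (y / s) ** 2) for s in sigma_grid])

def posterior_mean(prior_sd):
    # half-normal(0, prior_sd) prior on sigma, up to a constant
    log_prior = -0.5 * (sigma_grid / prior_sd) ** 2
    log_post = log_lik + log_prior
    w = np.exp(log_post - log_post.max())
    return np.sum(sigma_grid * w) / np.sum(w)

for sd in (10.0, 100.0, 1e6):
    print(sd, posterior_mean(sd))
```

With even a modest amount of data the likelihood dominates, and all three prior scales give essentially the same posterior.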