US elections model implementation in Turing

Hello guys, hope you are doing well. I am trying to implement the US election forecasting model from the paper by Linzer (https://www.ime.usp.br/~abe/lista/pdfpWRrt4xFLt.pdf), and it would be great if someone could give me a little help.
Here is my implementation:

@model function linzer_model(state_polls_dict, hist_state_forecast, poll_dates, states)
	n_states = length(hist_state_forecast)
	n_dates = length(poll_dates)
	σ_δ ~ Uniform(0, 10)
	σ_β ~ Uniform(0, 10)
	δ = Vector{Real}(undef, n_dates)
	β = Array{Real, 2}(undef, n_dates, n_states)
	# Random walks of the parameters β and δ
	for i in 1:(n_dates)
		if i == 1
			δ[i] = 0
			for j in 1:(n_states)
				β[i,j] = logit(hist_state_forecast[j])
			end
			continue
		end
		δ[i] ~ Normal(δ[i-1], σ_δ)
		for j in 1:(n_states)
			β[i,j] ~ Normal(β[i-1,j], σ_β)
		end
	end
	π = Array{Real,2}(undef, n_dates, n_states)
	for i in 1:n_dates
		π[i,:] = logistic.(β[i,:] .+ δ[i])
	end
	# rows -> days
	# columns -> states
	for i in 1:n_states
		for j in 1:n_dates
			# Skip dates on which this state has no polls
			if !haskey(state_polls_dict[states[i]], poll_dates[j])
				continue
			end
			poll_table = state_polls_dict[states[i]][poll_dates[j]]
			polls = poll_table[1,:] # Hillary polls
			sample_sizes = poll_table[3,:]
			n_polls = length(polls)
			for k in 1:n_polls
				polls[k] ~ Binomial(sample_sizes[k], π[j,i])
			end
		end
	end
end

The model basically goes like this:

  • Pre-election polls are generated by a Binomial distribution. We have polls for every state on various days before the election.
  • The probability π_ij for a poll in state i on day j is given by π_ij = logit⁻¹(β_ij + δ_j).
  • β and δ follow random walks, i.e., β_ij ∼ N(β_{i,j+1}, σ_β²) and δ_j ∼ N(δ_{j+1}, σ_δ²).
  • σ_β and σ_δ have uniform priors.
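
To make the generative story above concrete, here is a minimal simulation sketch in plain Julia. It rests on assumptions that are not in the post: the walks are anchored on the last day (δ = 0 and β = logit(h) there) and run backwards in time, there is exactly one poll per state and day, and the values of σ_β, σ_δ and the sample size are made up. The function name simulate_polls is hypothetical. Note that the Turing code above instead anchors the walks on the first date and runs them forwards.

```julia
using Random

logistic(x) = 1 / (1 + exp(-x))

# Hypothetical helper: draws vote probabilities and poll counts from the model.
# h holds one historical forecast per state, each strictly between 0 and 1.
function simulate_polls(rng, h, n_dates; σ_β=0.05, σ_δ=0.05, sample_size=800)
    n_states = length(h)
    β = zeros(n_dates, n_states)
    δ = zeros(n_dates)
    # Assumed anchor on the last day: δ stays 0 and β equals logit(h).
    β[end, :] = log.(h ./ (1 .- h))
    # Reverse random walk: day j is centred on day j + 1.
    for j in (n_dates - 1):-1:1
        δ[j] = δ[j + 1] + σ_δ * randn(rng)
        for i in 1:n_states
            β[j, i] = β[j + 1, i] + σ_β * randn(rng)
        end
    end
    p = logistic.(β .+ δ)  # rows are days, columns are states
    # Binomial draw: count respondents (out of sample_size) backing the candidate.
    y = [count(_ -> rand(rng) < p[j, i], 1:sample_size)
         for j in 1:n_dates, i in 1:n_states]
    return p, y
end

p, y = simulate_polls(MersenneTwister(1), [0.45, 0.55, 0.60], 30)
```

Fitting the model to data simulated this way is a cheap check that an implementation recovers known parameters before running it on real polls.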

The arguments of the model are:

  • state_polls_dict: A dictionary of dictionaries, containing the polls for every state on every date.
  • hist_state_forecast: An array with a historical forecast for each state (the model takes its logit as the starting value of β)
  • poll_dates: An array with all the days on which a poll was conducted
  • states: An array with all the states
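
For illustration, this is the layout that the indexing in the model implies for state_polls_dict: each date maps to a matrix whose first row holds the Clinton counts and whose third row holds the sample sizes, with one column per poll. The numbers, the use of strings as date keys, and the zeros in the second row (which the model never reads and whose meaning is not stated here) are all assumptions.

```julia
# Hypothetical example data; only the row layout is taken from the model code.
state_polls_dict = Dict(
    "Ohio" => Dict(
        # Two polls on this date: columns are polls.
        # Row 1: respondents for Clinton, row 2: unused placeholder, row 3: sample size.
        "2016-10-01" => [412 388; 0 0; 900 850],
    ),
)

poll_table = state_polls_dict["Ohio"]["2016-10-01"]
polls = poll_table[1, :]         # counts, as Binomial needs integer successes
sample_sizes = poll_table[3, :]  # number of respondents per poll
```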

The model is quite complex, and it would be great if someone could tell me whether you see any errors in the implementation or whether there is a simpler way to write it. Performance tips would also be appreciated, as my model is taking too long to run (I am using HMC to sample).
Thanks a lot in advance for your time.

I haven’t had a chance to look through your model, but I think it would be cool if you put your implementation on GitHub. I’d love to look into it a bit more when I have more time. Where do you get the polling data that goes into the model, by the way?

For data, I would recommend https://data.fivethirtyeight.com/.

For a Stan implementation to compare against: https://github.com/TheEconomist/us-potus-model

Hello guys, thank you very much for your answers. Here you can see my implementation in a Jupyter notebook.
I’m using polling data for the 2016 US election from FiveThirtyEight.

I think the biggest problem here is the implementation of the random walks in my model. I can’t find a way to optimize this part of the code (for example with filldist() or arraydist()). Any suggestions would be truly appreciated.
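
One way the random walks could be vectorized is a non-centered parameterization: draw all standard-normal innovations at once with filldist, then build the walks with cumsum. The sketch below is only that, a sketch under assumptions. The name linzer_vectorized and the flat data layout (y, n, day_idx, state_idx, one entry per poll) are hypothetical, and the data values are made up. Passing the counts y as a model argument lets Turing treat them as observed data.

```julia
using Turing
using StatsFuns: logit, logistic

# y[k]: respondents for the candidate in poll k, n[k]: its sample size,
# day_idx[k] / state_idx[k]: which date and state poll k belongs to,
# h: historical forecast per state, n_dates: number of dates in the walk.
@model function linzer_vectorized(y, n, day_idx, state_idx, h, n_dates)
    n_states = length(h)
    σ_δ ~ Uniform(0, 10)
    σ_β ~ Uniform(0, 10)
    # Standard-normal innovations, one per step of each walk.
    z_δ ~ filldist(Normal(), n_dates - 1)
    z_β ~ filldist(Normal(), n_dates - 1, n_states)
    # Cumulative sums turn the innovations into random walks that start at
    # δ = 0 and β = logit(h) on the first date, as in the posted model.
    δ = vcat(0, σ_δ .* cumsum(z_δ))
    β = vcat(logit.(h)', logit.(h)' .+ σ_β .* cumsum(z_β; dims=1))
    for k in eachindex(y)
        y[k] ~ Binomial(n[k], logistic(β[day_idx[k], state_idx[k]] + δ[day_idx[k]]))
    end
end

# Made-up data: three polls over three dates and two states.
y = [412, 388, 455]
n = [900, 850, 1000]
day_idx = [1, 1, 2]
state_idx = [1, 2, 1]
h = [0.48, 0.52]
model = linzer_vectorized(y, n, day_idx, state_idx, h, 3)
```

A call such as sample(model, NUTS(), 500) could then be used to draw from the posterior. This version also avoids containers typed as Real, which tend to slow sampling down because their element type is abstract.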