US elections model implementation in Turing

entr0pidelic · November 9, 2020, 7:45pm

Hello guys, hope you are doing well. I am trying to implement the US election forecasting model from the paper by Linzer https://www.ime.usp.br/~abe/lista/pdfpWRrt4xFLt.pdf, and it would be great if someone could give me a little help.
Here is my implementation:

using Turing
using StatsFuns: logit, logistic

@model function linzer_model(state_polls_dict, hist_state_forecast, poll_dates, states, ::Type{T}=Float64) where {T}
	n_states = length(hist_state_forecast)
	n_dates = length(poll_dates)
	σ_δ ~ Uniform(0, 10)
	σ_β ~ Uniform(0, 10)
	# Concretely typed containers; T lets Turing substitute AD number types
	δ = Vector{T}(undef, n_dates)
	β = Matrix{T}(undef, n_dates, n_states)
	# The first date is fixed: δ starts at 0, β at the logit of the historical forecast
	δ[1] = 0
	for j in 1:n_states
		β[1,j] = logit(hist_state_forecast[j])
	end
	# Random walks of the parameters β and δ
	for i in 2:n_dates
		δ[i] ~ Normal(δ[i-1], σ_δ)
		for j in 1:n_states
			β[i,j] ~ Normal(β[i-1,j], σ_β)
		end
	end
	# rows -> days
	# columns -> states
	π = logistic.(β .+ δ) # δ is broadcast across the state columns
	for i in 1:n_states
		state_polls = state_polls_dict[states[i]]
		for j in 1:n_dates
			haskey(state_polls, poll_dates[j]) || continue
			day_polls = state_polls[poll_dates[j]]
			polls = day_polls[1,:] # Hillary polls
			sample_sizes = day_polls[3,:]
			for k in eachindex(polls)
				# `polls` is a local variable, not a model argument, so `polls[k] ~ ...`
				# would be sampled as a parameter instead of observed; add the
				# log-likelihood of the observation explicitly
				Turing.@addlogprob! logpdf(Binomial(sample_sizes[k], π[j,i]), polls[k])
			end
		end
	end
end

The model basically goes like this:

Pre-election polls are modeled as draws from a Binomial distribution. We have polls for every state on days before the election.
The probability π for each poll in state i on day j is given by π_ij = logit⁻¹(β_ij + δ_j).
β and δ follow random walks, i.e., β_ij ∼ N(β_i,j+1, σ²_β) and δ_j ∼ N(δ_j+1, σ²_δ).
σ_β and σ_δ have uniform priors.
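
Written out in full, the description above reads roughly as follows, with i indexing states and j indexing days. The symbols y_k (respondents favoring the candidate in poll k), n_k (its sample size), and i[k], j[k] (the state and day of poll k) are notation introduced here for illustration, and the Uniform(0, 10) bounds are taken from the code rather than from the description. Note that these formulas condition each day on the following one (j+1), whereas the code above walks forward from the first date and stores β as [date, state].

```latex
\begin{align*}
y_k &\sim \mathrm{Binomial}\left(n_k,\ \pi_{i[k]\,j[k]}\right) \\
\pi_{ij} &= \mathrm{logit}^{-1}\left(\beta_{ij} + \delta_j\right) \\
\beta_{ij} \mid \beta_{i,j+1} &\sim \mathcal{N}\left(\beta_{i,j+1},\ \sigma_\beta^2\right) \\
\delta_j \mid \delta_{j+1} &\sim \mathcal{N}\left(\delta_{j+1},\ \sigma_\delta^2\right) \\
\sigma_\beta,\ \sigma_\delta &\sim \mathrm{Uniform}(0,\ 10)
\end{align*}
```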

The arguments of the model are:

state_polls_dict: A dictionary of dictionaries containing the polls for every state and every date.
hist_state_forecast: An array with a historical forecast for each state.
poll_dates: An array with all the days on which a poll was conducted.
states: An array with all the states.

The model is quite complex, and it would be great if someone could tell me whether you see any errors in the implementation or a simpler way to write it. Performance tips would also be appreciated, as my model is taking too long to run (I am using HMC to sample).
Thanks a lot in advance for your time.

grero · November 10, 2020, 5:15am

I haven’t had a chance to look through your model, but I think it would be cool if you put your implementation on GitHub. I’d love to look into it a bit more when I have more time. Where do you get the polling data that goes into the model, by the way?

Emmanuel-R8 · November 10, 2020, 7:40am

For data, I would recommend https://data.fivethirtyeight.com/.

For a Stan implementation to compare against: https://github.com/TheEconomist/us-potus-model

entr0pidelic · November 10, 2020, 3:47pm

Hello guys, thank you very much for your answers. Here you can see my implementation in a Jupyter notebook.
I’m using polling data of the 2016 US election from FiveThirtyEight.

I think the biggest problem here is the implementation of the random walks in my model. I can’t find a way to optimize this part of the code (for example with filldist() or arraydist()). Any suggestions will be truly appreciated.
