I am looking to estimate a model by the simulated method of moments (SMM). The most similar thread on SMM is from 2019 and does not settle the issues: Global optimization: Simulated Method of Moments. Julia seems like the perfect language for running SMM. It would be great to find or develop a package that allows shorter, more readable code.
The ModelingToolkit package has some really nice syntax (defining parameters and variables, creating arrays of equations that can be operated on), but does not appear to be geared towards these sorts of problems (maybe it can be trivially extended but I don’t know how). Below I will develop a pseudo-code example of the sort of features that would be nice. If anyone knows of existing code that does this sort of thing or would like to provide recommendations on developing it then I would be thrilled.
Pseudo-code:
First, set up the model. As a simple example to illustrate the ideas, I am setting up a latent-variable AR(1) model in which the latent variable causes an outcome. x follows an AR(1) process with fixed-effect terms. x is observed with error: what is observed is x̃. y is a function of the latent variable x and an observed variable w.
N = 500
T = 4
@parameters β μ σε σϵ σα σξ a b c
@observations i t
@variables y x x̃ w ε ϵ α ξ
eqns = [xᵢₜ₊₁ ~ αᵢ + β*xᵢₜ + εᵢₜ ∀ t ∈ 1:T-1, i ∈ 1:N
x̃ᵢₜ ~ xᵢₜ + ϵᵢₜ ∀ i t
yᵢₜ ~ a + b*xᵢₜ + c*wᵢₜ + ξᵢₜ
εᵢₜ ~ Normal(0, σε) ∀ i t
ϵᵢₜ ~ Normal(0, σϵ) ∀ i t
αᵢ ~ Normal(μ, σα) ∀ i
ξᵢₜ ~ Normal(0, σξ) ∀ i t]
In the imagined syntax above, @parameters and @variables would work in a way similar to ModelingToolkit. @observations would define the units of observation: in this case we have i and t, so the constructed data would be a panel (currently DataFrames does not support convenient ways to represent or manipulate panel data; that is something else that would be great to add, but it is beside the point for this thread).
The last four equations would define how random variables are distributed.
Note that y depends on w, but the process generating w is not defined. This is an incompletely specified model, but that is okay if w is given.
We can simulate data from this process for given parameter values and given values of w. A function method might look like:
simulate(empirical_data::DataFrame, eqns::Array, parms::NamedTuple, observations::Array{observations,1})::DataFrame = …
This method produces a dataframe with all simulated variables. We feed it a dataframe that must have w as one of its variables, because the process generating w is not specified. If we were to feed it a dataframe containing ε, then ε would be treated as given rather than drawn. The dataframe must also contain columns corresponding to the units of observation, t and i.
Simulation is crucial for SMM. SMM estimates the model's parameters by selecting the values that minimize a distance between moments (or estimated coefficients) calculated from the actual data and the same statistics calculated from the synthesized data. That is, we have good parameters in the model if statistics calculated on the synthetic data look like the statistics calculated on the actual data.
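To make the idea concrete, here is a minimal plain-Julia sketch of what such a simulation step could do for the example model. It is not the imagined package syntax: `simulate_panel` is a hypothetical name, the data are held as N×T matrices rather than a DataFrame, and starting each unit at its stationary mean αᵢ/(1 − β) is my own assumption, since the model above does not specify an initial condition.

```julia
using Random

# Hypothetical helper: simulate the latent AR(1) panel for parameters θ,
# taking the N×T matrix w as given (its process is not modelled).
function simulate_panel(w::AbstractMatrix, θ::NamedTuple; rng = Random.default_rng())
    N, T = size(w)
    α = θ.μ .+ θ.σα .* randn(rng, N)       # αᵢ ~ Normal(μ, σα)
    ε = θ.σε .* randn(rng, N, T)           # εᵢₜ ~ Normal(0, σε)
    ϵ = θ.σϵ .* randn(rng, N, T)           # measurement error in observed x
    ξ = θ.σξ .* randn(rng, N, T)           # outcome shock

    x = zeros(N, T)
    x[:, 1] .= α ./ (1 - θ.β)              # assumed initial condition
    for t in 1:T-1
        x[:, t+1] .= α .+ θ.β .* x[:, t] .+ ε[:, t]
    end

    x_obs = x .+ ϵ                         # what the econometrician sees (x̃)
    y = θ.a .+ θ.b .* x .+ θ.c .* w .+ ξ
    return (; x, x_obs, y, w)
end
```

A call would look like `simulate_panel(w, (β = 0.5, μ = 1.0, σε = 1.0, σϵ = 0.5, σα = 1.0, σξ = 1.0, a = 0.0, b = 1.0, c = 2.0))`, where the parameter values are arbitrary. Passing an explicit `rng` makes the draws reproducible, which matters once the simulator sits inside an optimizer.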
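Written out, one common form of this criterion is a weighted quadratic distance. The notation is my own, not from any package: m̂ is the vector of data moments, m̄_S(θ) is the average of the same moments over S simulated datasets, and W is a positive-definite weighting matrix.

```latex
\hat{\theta} \;=\; \arg\min_{\theta}\;
  \bigl(\hat{m} - \bar{m}_S(\theta)\bigr)^{\top} W \,\bigl(\hat{m} - \bar{m}_S(\theta)\bigr),
\qquad
\bar{m}_S(\theta) \;=\; \frac{1}{S}\sum_{s=1}^{S} m\bigl(\text{data simulated under } \theta,\ \text{draw } s\bigr)
```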
So we would want to define an array of moments (or auxiliary model coefficients):
moments = [mean(yᵢ)
mean(xᵢ)
var(xᵢₜ)
cov(xᵢₜ₊₁, xᵢₜ)
...]
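In plain Julia, a moment vector along these lines could be computed as below. This is only a sketch: `panel_moments` is a hypothetical name, the data are assumed to be N×T matrices, and because x is latent I compute the moments on the observed x̃ (called `x_obs` here), which is a choice on my part.

```julia
using Statistics

# Hypothetical helper: stack a few moments into one vector.
# x_obs and y are N×T matrices (rows = units i, columns = periods t).
function panel_moments(x_obs::AbstractMatrix, y::AbstractMatrix)
    lead = vec(x_obs[:, 2:end])       # x̃ at t+1
    lag  = vec(x_obs[:, 1:end-1])     # x̃ at t
    return [mean(y),                  # overall mean of the outcome
            mean(x_obs),              # overall mean of observed x
            var(x_obs),               # pooled variance of observed x
            cov(lead, lag)]           # first-order autocovariance
end
```

The same function would be applied to both the actual data and each simulated dataset, so that the two moment vectors are directly comparable.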
and we would fit the model with a command like
fitted_model = optim(df::Data, moments, W, method=:Wald, optimizer=:MCMC)
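For comparison, here is roughly what a call like that has to do by hand today, as a self-contained sketch using only the standard library. Everything named here (`simulate_obs`, `moments`, `smm_objective`) is hypothetical, the "data" are a stand-in simulated from known parameters rather than real data, the weighting matrix is the identity, and a crude grid search over β alone replaces the MCMC optimizer in the imagined call.

```julia
using Random, Statistics, LinearAlgebra

# Compact hypothetical simulator: returns the observables (x̃, y) given w and θ.
function simulate_obs(w, θ, rng)
    N, T = size(w)
    α = θ.μ .+ θ.σα .* randn(rng, N)
    x = zeros(N, T)
    x[:, 1] .= α ./ (1 - θ.β)                        # assumed initial condition
    for t in 1:T-1
        x[:, t+1] .= α .+ θ.β .* x[:, t] .+ θ.σε .* randn(rng, N)
    end
    x_obs = x .+ θ.σϵ .* randn(rng, N, T)
    y = θ.a .+ θ.b .* x .+ θ.c .* w .+ θ.σξ .* randn(rng, N, T)
    return x_obs, y
end

moments(x_obs, y) = [mean(y), mean(x_obs), var(x_obs),
                     cov(vec(x_obs[:, 2:end]), vec(x_obs[:, 1:end-1]))]

# SMM criterion: quadratic form in the gap between data and simulated moments.
# The same seed is reused for every θ (common random numbers), so the
# objective does not jump around because of fresh simulation noise.
function smm_objective(θ, m_data, w, W; seed = 1234, S = 5)
    rng = MersenneTwister(seed)
    m_sim = sum(moments(simulate_obs(w, θ, rng)...) for _ in 1:S) ./ S
    g = m_data .- m_sim
    return g' * W * g
end

# Stand-in "data": simulated from known parameters (not real data).
θ_true = (β = 0.5, μ = 1.0, σε = 1.0, σϵ = 0.5, σα = 1.0, σξ = 1.0,
          a = 0.0, b = 1.0, c = 2.0)
w = randn(MersenneTwister(1), 500, 4)
m_data = moments(simulate_obs(w, θ_true, MersenneTwister(2))...)
W = Matrix(1.0I, length(m_data), length(m_data))    # identity weighting (assumption)

# Crude grid search over β only, holding the other parameters fixed.
βgrid = 0.1:0.05:0.9
Q = [smm_objective(merge(θ_true, (β = β,)), m_data, w, W) for β in βgrid]
β_hat = βgrid[argmin(Q)]
```

In a real application the grid search would be replaced by a proper optimizer over all parameters, and four moments would not be enough to identify nine parameters. The point is only that the simulator, the moment function and the criterion are the three pieces a package would need to generate from the model description.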
An approach like this would make it very simple to code up structural models for SMM estimation, and would make the code very readable, reducing risks of mistakes.
Thoughts?