Stata used way more by economists than R/Python/Matlab?

I came across this chart from a recent paper (can’t attach PDFs) in the AEA (American Economic Association) Papers and Proceedings, and was very surprised to see that Stata dominates all other software used to produce empirical work in economics (at least in AEA papers). This is really odd. Does anyone have an explanation for it?


This is not really that odd. Stata makes it very easy to do the standard econometric tests. They are almost all one-liners. The manual is extremely detailed, so it is also easy to pick up. Most economics research does not fall into the realm of big data or complicated quantitative models (although all of the economists that frequent the Julia community obviously deal more with that). If you have an average-sized data set and want to run a standard test on that data, Stata is the path of least resistance.

In fact, Stata Corp actually posted a job on the AEA jobs board this year, perhaps to protect their market share from new and exciting languages? :wink:

Being a (financial) economist, I have some clues.

While I cannot explain the magnitude, it seems pretty clear that Stata is popular in the field of microeconomics (and related fields), and the AEA has moved in that direction.

Stata has robust data wrangling routines and plenty of very good econometrics “packages.” Also, the tradition in microeconomics has typically put less emphasis on coding skills than other economics fields, which might tilt the numbers away from the alternatives. However, the trend seems to be towards R.


Simple answer: “it does the job”. Learning something new, while fun, takes a lot of time, and in most cases there is no need for it. Juniors (i.e., pre-tenure) have little time because of tenure pressures, and seniors tend to co-author, in which case a new package is often only adopted when all coauthors sign up for it. Hence, you only see Julia being adopted by young economists who work in technical fields (“structural” fields), on papers with few authors.

As a former micro-economist and current political scientist who still regularly uses Stata for my final statistical analyses, I’d point to a few things:

  • It wasn’t designed to be a general purpose language; it was designed for tabular data manipulation. And as a result, the syntax is really nice for that specific use case. And most economists only do tabular data manipulation.
  • It’s what people are taught in their intro econ classes, so it’s the default. And because it doesn’t use standard programming conventions, switching costs (to R or Julia) are high.
  • Cost isn’t a factor since all universities have licenses.
  • It has every estimator used by social scientists built in, with good documentation, and basically all of them are one-liners.