A few years ago I worked through Richard McElreath’s Statistical Rethinking. In it, McElreath discusses the use of Directed Acyclic Graphs to specify/analyze causal models, briefly covers do-calculus, etc. I don’t work in academia, but I do occasionally read research related to low wage workers, labor law violations, and related topics. One thing that I almost never see in the publications are DAGs.
I know there are a lot of scientists/academics in this community, so I’m wondering how common it is to use DAGs in causal inference, and also how common it is to actually include the DAGs in the publications themselves.
Also, I was recently discussing an issue with someone whose title is Principal Research Scientist, and this person was unfamiliar with DAGs and their use in causal inference, which surprised me a little. Are DAGs commonly part of PhD programs, or do they tend to be taught/used only in certain fields?
The short answer to your question is yes: DAGs are very common in causal inference, and many causal inference applications use DAGs for experiment design. If you are familiar with the do-calculus, you probably remember that DAGs are how the dependencies between variables are represented there.
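To make the "a DAG represents the dependencies between variables" point concrete, here is a minimal sketch in plain Python. The three-variable model (Z, X, Y) and the helper names is_acyclic, ancestors and shared_ancestors are made up for illustration and are not taken from any particular package. Shared ancestors are only a rough first list of candidate confounders, not a full backdoor-criterion check.

```python
from graphlib import TopologicalSorter, CycleError  # standard library, Python 3.9+

# Hypothetical toy model: each key maps a variable to its direct causes (parents).
dag = {
    "Z": set(),        # a common cause
    "X": {"Z"},        # treatment, influenced by Z
    "Y": {"X", "Z"},   # outcome, influenced by X and Z
}

def is_acyclic(graph):
    """Return True if the parent map has no directed cycles."""
    try:
        TopologicalSorter(graph).prepare()
        return True
    except CycleError:
        return False

def ancestors(graph, node):
    """Collect every variable that has a directed path into `node`."""
    seen = set()
    stack = list(graph[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

def shared_ancestors(graph, treatment, outcome):
    """Variables upstream of both treatment and outcome (candidate confounders)."""
    return ancestors(graph, treatment) & ancestors(graph, outcome)

print(is_acyclic(dag))                        # True
print(sorted(shared_ancestors(dag, "X", "Y")))  # ['Z']
```

In this toy graph Z sits upstream of both X and Y, which is the kind of structure a drawn DAG makes obvious at a glance. Tools like Dagitty go much further and derive valid adjustment sets from the graph.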
However, it is true that there are two traditions, or two parallel developments, in causal inference. The one that relies strongly on DAGs is the one developed by people like Judea Pearl, which defines DAGs and builds probabilistic models on top of them. If you are interested in this approach, a seminal piece of work is this paper, and the Book of Why introduces the same framework. On the other hand, there is another tradition (not sure if this is the best word to describe it) that does causal inference without dealing with DAGs (or at least not explicitly), and instead just handles the probabilistic dependencies between variables. An excellent introductory reference for causal inference is Peng Ding’s notes on causal inference.
So I think you can always assume there is a DAG behind your causal model, but it is common in modern frameworks to use different jargon, which is why some people doing causal inference may be unfamiliar with DAGs.
Afaict DAGs have only been taught widely within the last decade, so a scientist whose training was earlier may not have trained on them.
As for publications, I think the conventions are still being worked out, so “it depends”. They’re often used in the development of models; the DAG drawing may or may not actually be presented in the text. They’re common in epidemiology & public health, and increasingly so in the social sciences. You can see papers citing Dagitty, a popular R package for DAGs, on Google Scholar.
I find it very disappointing that they aren’t more widely included in the papers I read…for me, they are extremely helpful in understanding the structure of the problem and it makes it so much easier to understand the statistical models/data/tables/graphs that are always included.
Came here to post the Imbens paper which I think does a great job of summarising the two approaches and discussing why Pearl’s approach hasn’t found as much application as he’d like in the mainstream. Pearl of course takes issue with this, see e.g. here: