I’ve been reading about causal inference recently and trying to incorporate those methods into my research as much as I can. My impression is that many others are trying to do the same in various fields. I thought I’d start a new topic to see if anyone here has more insight, tips, ideas, or advice on doing this in Julia. What does the workflow look like?
My favorite learning resource has been the Hernán & Robins book Causal Inference: What If. This book presents methods such as inverse probability weighting, parametric g-formula, g-estimation, doubly robust machine learning estimators, instrumental variable estimation, causal mediation analysis, target trial emulation, and of course DAGs and the backdoor criterion, but also SWIGs (single-world intervention graphs that extend DAGs to the counterfactual realm).
(I did just find CausalInference.jl and CausalELM.jl.)
Do you happen to be on the Julia Slack? You might get some additional thoughts and insight posting within the #health-and-medicine channel there. I’d post more right now, but am swamped with some work/research recently.
I’ve been meaning to answer here for a while - I use causal inference techniques a lot in my work, although largely those from the Rubin world rather than from the Pearl world.
I’m sorry to say that Julia is light years behind R in particular, but also Stata and Python when it comes to causal inference.
Some packages I’ve used (not necessarily recently or professionally, so not endorsements):
I think Julia is a great language for these estimators, but unfortunately no project has gone beyond the single-maintainer, narrow-focus stage so far.
I think this depends a lot on the exact methodology you are using: a Bayesian approach with a custom model would require something totally different from a well-used frequentist estimator, which may even be available in a package.
That said, I think most people would:
1. Clean the data. You will find a lot of packages for this, e.g. TidierOrg follows R’s tidyverse approach.
2. Make preliminary/exploratory plots to check overlap etc. Again, there are tons of plotting packages, all with a different approach, and unifying APIs like Plots.jl. I would find one you are comfortable with before proceeding to the actual estimation.
3. Perform inference. Like I said, the details depend on the approach.
4. Run simulations, model checking, etc. Similarly, this depends on the approach.
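To make the steps above concrete, here is a minimal, language-agnostic sketch in plain Python (standard library only) rather than any particular Julia package. Everything in it is an assumption for illustration: the toy data-generating process, the variable names `l`, `a`, `y`, and the choice of nonparametric standardization and inverse probability weighting as the estimators. It simulates confounded data, checks overlap per stratum, and compares a naive contrast with two adjusted estimates.

```python
import random
from statistics import mean

random.seed(1)

def simulate(n):
    """Toy data: binary confounder l affects both treatment a and outcome y."""
    rows = []
    for _ in range(n):
        l = 1 if random.random() < 0.4 else 0
        a = 1 if random.random() < (0.7 if l else 0.3) else 0
        y = 1.0 * a + 2.0 * l + random.gauss(0, 1)  # true ATE is 1.0 by construction
        rows.append((l, a, y))
    return rows

rows = simulate(20000)

# Overlap check: share of treated units within each confounder stratum.
# Values strictly between 0 and 1 mean both arms are observed in that stratum.
propensity = {s: mean(a for l, a, _ in rows if l == s) for s in (0, 1)}

# Naive contrast, ignoring the confounder (biased here).
naive = (mean(y for _, a, y in rows if a == 1)
         - mean(y for _, a, y in rows if a == 0))

# Standardization: stratum-specific contrasts averaged over the distribution of l.
def stratum_mean(a_val, l_val):
    return mean(y for l, a, y in rows if a == a_val and l == l_val)

p_l = {s: mean(1 if l == s else 0 for l, _, _ in rows) for s in (0, 1)}
standardized = sum((stratum_mean(1, s) - stratum_mean(0, s)) * p_l[s]
                   for s in (0, 1))

# Inverse probability weighting with normalized weights.
def ipw_mean(a_val):
    num = den = 0.0
    for l, a, y in rows:
        if a == a_val:
            p = propensity[l] if a_val == 1 else 1 - propensity[l]
            num += y / p
            den += 1 / p
    return num / den

ipw = ipw_mean(1) - ipw_mean(0)

print(f"overlap: {propensity}")
print(f"naive={naive:.3f} standardized={standardized:.3f} ipw={ipw:.3f}")
```

With a single binary confounder and fully nonparametric nuisance estimates, the two adjusted estimates coincide; with continuous covariates and fitted models they generally would not, which is where the choice of approach starts to matter.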
I don’t know this book, but I skimmed through the text and it is surprisingly light on plots. I personally like books by Andrew Gelman and coauthors, which are full of real-world examples and plots, e.g. the recent
Very nice! This is a core research tool. You can define your own causal assumptions (or a few alternatives) based on previous knowledge, evidence, etc., and derive the adjustment set they imply in your particular situation. You can then go in the other direction with causal discovery to see if there are any surprises compared to your assumed graph.
I’ll just mention that I would love to see more causal modeling tooling within Julia in general. Unfortunately I am neither expert enough nor have the time to dedicate to such endeavors at the moment, but it would be great to see it happen, especially within the context of JuliaHealth or JuliaStats!
Hi! As someone also quite interested in doing causal inference in Julia, I just saw this post today and figured I’d plug my new package. It’s called CausalTables.jl and it’s designed to help people implement new causal estimators in Julia.
The package implements a Table that couples data with a DAG-like structure. This simplifies cleaning data for causal tasks (e.g. intervening on treatment, selecting parents of different variables, etc.). The package also provides an interface to randomly draw data from a known structural causal model, allowing users to extract things like the true outcome regression and propensity scores, and approximate common causal parameters like the ATE. If you’re interested in checking it out, I’m happy to take any feedback as well!
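For readers unfamiliar with the idea of drawing from a known structural causal model, here is a rough sketch of the concept in plain Python. It is not the CausalTables.jl API: the helper `draw`, its `do_a` argument, and the toy model (L affects A and Y, A affects Y) are all hypothetical names made up for illustration. Because the model is known, the true propensity score and outcome regression are available in closed form, and the ATE can be approximated by sampling under each intervention.

```python
import math
import random
from statistics import mean

def propensity(l):
    """True P(A = 1 | L = l) in the toy model."""
    return 1 / (1 + math.exp(-l))

def outcome_regression(a, l):
    """True E[Y | A = a, L = l] in the toy model."""
    return 2.0 * a + l

def draw(n, rng, do_a=None):
    """Sample n units; if do_a is given, treatment is set by intervention."""
    units = []
    for _ in range(n):
        l = rng.gauss(0, 1)
        if do_a is None:
            a = int(rng.random() < propensity(l))  # observational regime
        else:
            a = do_a                               # interventional regime
        y = outcome_regression(a, l) + rng.gauss(0, 1)
        units.append({"L": l, "A": a, "Y": y})
    return units

rng = random.Random(0)

# Approximate the ATE by Monte Carlo: mean outcome under do(A=1) minus do(A=0).
y1 = mean(u["Y"] for u in draw(50000, rng, do_a=1))
y0 = mean(u["Y"] for u in draw(50000, rng, do_a=0))
ate_approx = y1 - y0

print(f"approximate ATE: {ate_approx:.3f} (true value in this toy model: 2.0)")
```

Having the ground truth available like this is what makes such a setup useful for checking a new estimator: you run it on the observational draws and compare against the simulated target.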
Thanks for the plug, exactly what I was hoping for!
And indeed it’d be a dream to have an organized causal methods ecosystem in Julia, bringing together all the different approaches and fields dealing with the same issue. Of course, one can go very far with just plain statistics tooling, but there is so much room for improving and extending the tools to interventions, counterfactuals, etc., the way this package extends Tables.
Absolutely agree. And if you’re ever interested in building any resources out yourself in the vein of “Causal Inference: What If,” I’d be happy to help out or collaborate.
BTW something I neglected to mention: for actually performing doubly-robust estimation of estimands like the ATE, there is also the quite sophisticated TMLE.jl which builds on top of MLJ for doing causal machine learning.
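In case "doubly robust" is unfamiliar, here is a small sketch of the idea using the augmented inverse probability weighting (AIPW) estimator, a simpler relative of TMLE, in plain Python. It does not use or mirror the TMLE.jl or MLJ APIs; the toy data, the stratum-mean nuisance estimates, and the function name `aipw` are assumptions for illustration. The point is that the estimate stays close to the truth when either the outcome model or the propensity model is wrong, as long as the other one is right.

```python
import random
from statistics import mean

rng = random.Random(42)

# Toy data: binary confounder l, treatment a, outcome y; true ATE is 1.5.
data = []
for _ in range(20000):
    l = int(rng.random() < 0.5)
    a = int(rng.random() < (0.8 if l else 0.2))
    y = 1.5 * a + 2.0 * l + rng.gauss(0, 1)
    data.append((l, a, y))

# Nuisance estimates: outcome regression mu[(a, l)] and propensity e_hat[l],
# both fitted as simple stratum means.
mu = {(a_val, l_val): mean(y for l, a, y in data if a == a_val and l == l_val)
      for a_val in (0, 1) for l_val in (0, 1)}
e_hat = {l_val: mean(a for l, a, _ in data if l == l_val) for l_val in (0, 1)}

def aipw(mu, e):
    """AIPW estimate of the ATE: outcome-model contrast plus weighted residuals."""
    terms = []
    for l, a, y in data:
        m1, m0 = mu[(1, l)], mu[(0, l)]
        p = e[l]
        terms.append(m1 - m0
                     + a * (y - m1) / p
                     - (1 - a) * (y - m0) / (1 - p))
    return mean(terms)

# Both nuisance models correct.
ate_aipw = aipw(mu, e_hat)

# Deliberately wrong propensity (constant 0.5), correct outcome model.
ate_wrong_ps = aipw(mu, {0: 0.5, 1: 0.5})

# Deliberately wrong outcome model (grand mean, ignores a and l), correct propensity.
grand_mean = mean(y for _, _, y in data)
mu_wrong = {key: grand_mean for key in mu}
ate_wrong_outcome = aipw(mu_wrong, e_hat)

print(f"both correct:      {ate_aipw:.3f}")
print(f"wrong propensity:  {ate_wrong_ps:.3f}")
print(f"wrong outcome:     {ate_wrong_outcome:.3f}")
```

In practice the nuisance models would be flexible learners fitted with cross-fitting rather than stratum means, which is the part that a dedicated package takes care of.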