A very vanilla (at the moment) Julia package for causal inference, graphical models and structure learning with the PC algorithm. The package contains for now the classical PC algorithm and some related functionality.
The algorithms use the Julia package LightGraphs. Graphs are represented by sorted adjacency lists (vectors in the implementation). CPDAGs are just DiGraphs where unoriented edges are represented by both a forward and a backward directed edge.
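To make the CPDAG convention concrete, here is a minimal sketch that mimics it with plain sorted vectors instead of the package's graph types. The helper names `has_arc`, `is_unoriented` and `is_oriented` are made up for illustration and are not part of CausalInference.jl or LightGraphs; `insorted` requires Julia 1.6 or later.

```julia
# Forward adjacency lists of a hypothetical 3-vertex CPDAG, each list kept sorted:
# 1 - 2 is unoriented (stored as 1 → 2 and 2 → 1), 2 → 3 is oriented.
fadj = [[2], [1, 3], Int[]]

# Membership test on a sorted list via binary search.
has_arc(fadj, u, v) = insorted(v, fadj[u])

# Illustrative helpers (assumed names, not the package API).
is_unoriented(fadj, u, v) = has_arc(fadj, u, v) && has_arc(fadj, v, u)
is_oriented(fadj, u, v) = has_arc(fadj, u, v) && !has_arc(fadj, v, u)

println(is_unoriented(fadj, 1, 2))
println(is_oriented(fadj, 2, 3))
```

Keeping the lists sorted is what allows the binary-search lookup; an unoriented edge simply shows up in both endpoints' lists.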
References
D. M. Chickering: Learning Equivalence Classes of Bayesian-Network Structures. Journal of Machine Learning Research 2 (2002), 445-498.
D. Colombo, M. H. Maathuis: Order-Independent Constraint-Based Causal Structure Learning. Journal of Machine Learning Research 15 (2014), 3921-3962.
Just in time for JuliaCon we have added, with the help of @RobertGregg, the Parallel Greedy Equivalence Search (GES) as a score-based alternative to the PC algorithm!
Marcel Wienöbst at the same time added an extensive suite of adjustment set search functions, which I believe is matched in functionality only by Dagitty (but perhaps not in performance).
Finally, CausalInference now uses Threads at two crucial steps.
GES Example
using CausalInference
using TikzGraphs
using Random
Random.seed!(1)
# Generate some sample data to use with the GES algorithm
N = 2000 # number of data points
# define simple linear model with added noise
x = randn(N)
v = x + randn(N)*0.25
w = x + randn(N)*0.25
z = v + w + randn(N)*0.25
s = z + randn(N)*0.25
df = (x=x, v=v, w=w, z=z, s=s)
With this data ready, we can now see to what extent we can back out the underlying causal structure from the data using the GES algorithm. Under the hood, GES uses a score to determine the causal relationships between different variables in a given data set. By default, ges uses a Gaussian BIC to score different causal models.
est_g, score = ges(df; penalty=1.0, parallel=true)
tp = plot_pc_graph_tikz(est_g, [String(k) for k in keys(df)])
We can conclude from observational data that v and w are causes of z, which in turn causes s, but we can't be so sure about the direction of the relationship between x and v, or between x and w.
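To give a feeling for what a Gaussian BIC score rewards, here is a rough sketch using only the standard library. It assumes one common form of the local score, namely -n/2 · log(RSS/n) minus a penalty of (number of parents) · log(n)/2. The helper `local_bic` is hypothetical, and the exact score used inside `ges` may differ in constants and details.

```julia
using Random

# Hypothetical helper (not part of CausalInference.jl): BIC-style local score
# for regressing `y` on the columns of `X` (the candidate parents), with intercept.
function local_bic(y::AbstractVector, X::AbstractMatrix; penalty=1.0)
    n = length(y)
    A = hcat(ones(n), X)            # design matrix with intercept column
    β = A \ y                       # least-squares coefficients
    rss = sum(abs2, y - A * β)      # residual sum of squares
    k = size(X, 2)                  # number of parents
    return -n / 2 * log(rss / n) - penalty * k * log(n) / 2
end

# Same kind of linear model as in the example above.
Random.seed!(1)
N = 2000
x = randn(N)
v = x + randn(N) * 0.25
w = x + randn(N) * 0.25
z = v + w + randn(N) * 0.25

score_empty = local_bic(z, zeros(N, 0))   # z with no parents
score_vw = local_bic(z, hcat(v, w))       # z with parents v and w
println(score_vw > score_empty)
```

A higher score is better here: explaining z by v and w reduces the residual variance by far more than the penalty for two extra parents costs.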
Adjustment set search
The causal model we are going to study can be represented using the following DAG concerning a set of variables numbered 1 to 8:
We are interested in the average causal effect (ACE) of a treatment (variable nr. 6) on an outcome (variable nr. 8). Variables nr. 1 and nr. 2 are unobserved.
Ordinary regression will fail to measure the effect because of the presence of a confounder (variable nr. 4). The first intuition is to control for the confounder, but that is not so straightforward here because of the presence of variables 1 and 2. The new function list_covariate_adjustment tells us what to do:
Zs = list_covariate_adjustment(dag, 6, 8, Int[], setdiff(Set(1:8), [1, 2]))
# here exclude variables nr. 1 and nr. 2 because they are unobserved.
lists possible adjustment sets,
println.(Zs);
Set([4, 3])
Set([5, 4])
Set([5, 4, 3])
tells us to control either for variables 4 and 3, or 5 and 4, etc. With these control variables in the regression we are able to measure the causal effect.
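As a rough illustration of why adjusting matters, here is a sketch on a made-up three-variable linear model (confounder c, treatment t, outcome y). This is not the 8-variable DAG from above, and the coefficients and the `ols` helper are assumptions chosen for illustration only.

```julia
using Random

Random.seed!(1)
n = 10_000
c = randn(n)                               # confounder
t = 0.8 .* c .+ randn(n)                   # treatment depends on the confounder
y = 2.0 .* t .+ 1.5 .* c .+ randn(n)       # true causal effect of t on y is 2.0

# Hypothetical helper: least-squares coefficients with an intercept in position 1.
ols(X, y) = hcat(ones(length(y)), X) \ y

naive = ols(t, y)[2]                # regression without the confounder: biased
adjusted = ols(hcat(t, c), y)[2]    # regression controlling for c: close to 2.0
println((naive, adjusted))
```

In this toy model the naive coefficient absorbs part of the confounder's effect, while the adjusted regression recovers a value near the true effect.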
With that, CausalInference.jl is fast, with performance comparable to that of the C implementation in the R package pcalg. This matters because causal model discovery is NP-hard in general and becomes expensive when the causal graph is not sparse.
PS: Also, can someone with rights edit the thread title to ANN:CausalInference.jl - Causal Inference in Julia so it becomes searchable?
Although (as the author of https://github.com/nilshg/SynthControl.jl and someone who works mostly in Rubin/Imbens causal inference) I gripe about the very general name of the package, these are some cool updates!
I only see this post now, and I only now learned of this package. @mschauer have you seen Associations.jl? It has several implementations for causal inference and causal graph construction on the basis of time series. Are the two frameworks on common ground, and if so, can we collaborate?
@Datseris To be honest, I didn’t understand why you guys (Associations.jl devs, not you personally) decided against reaching out or collaborating on, say, your reimplementation of the PC algorithm and the skeleton algorithm. You are even using the same representation (partially oriented graphs represented as Graphs.SimpleDiGraph) and test yours against mine, showing that they can be exchanged.
I don’t think such a decision was ever made. I just became aware of your project now, and reached out literally within the next 5 minutes. I can’t speak for my collaborator @kahaaga, but given the work we’ve done so far in ComplexityMeasures.jl, I would wager that he wasn’t aware of your project either. Note that Associations.jl was originally named CausalityTools.jl, a project started in August 2018. Looking at CausalInference.jl, it appears to have started even earlier, right? I think CausalityTools.jl started as a repo implementing novel PhD work, but over time it grew a lot in size and later added functionality for causal graph inference. In any case, the separation is unfortunate, as we are very big on collaboration; “stop reinventing the wheel” is my middle name.
I take it you are interested in collaborating? If yes, send me a DM so that we can arrange a discussion, so that we don’t over-use this thread, which is mainly for updates on CausalInference.jl.