How to Learn Causal Inference on Your Own for Free

While everyone focuses on AI and predictive inference, standing out requires mastering not just prediction, but understanding the "why" behind the data – in other words, mastering Causal Inference.
You have heard that "correlation does not imply causation", but few truly grasp its implications or know when to confidently assert causality.
The distinction between predictive inference and causal inference is profound, and the latter is often overlooked, which leads to costly mistakes. The two approaches rely on very different logic and models, and this guide aims to equip you with the knowledge to discern causal relationships with confidence.

I am convinced that causal inference is one of the most valuable skills to acquire today, for three reasons:
- It is tremendously useful for virtually any job, extending beyond data scientists to include business leaders and managers (see next Section).
- It remains a niche: few people are experts in this field, yet interest is growing fast (cf. image above).
- As revealed by the Google Trends results (see image above), "causal machine learning" is the latest associated trend. Hence, knowing causal inference will help you connect this knowledge with the current AI focus and put you one step ahead.
To help you master causal inference and have a valuable asset on the job market and beyond, I crafted this self-study guide, suitable for all levels, requiring no prerequisites, and composed exclusively of free online resources.
Plan of the guide:
- Introduction: Causal inference key concepts
- Technical tools
- Randomized Experiments (A/B testing)
- Quasi-experimental design
- Advanced topics
- Conclusion
1. Introduction: Causal inference key concepts
Causality, the field focused on understanding the relationships between cause and effect, seeks to answer critical questions such as 'Why?' and 'What if?'. Understanding causality is crucial in areas ranging from fighting climate change to our quest for happiness to strategic decision-making.
Examples of major questions requiring causal inference:
- What impact might banning fuel cars have on pollution?
- What are the causes behind the spread of certain health issues?
- Could reducing screen time lead to increased happiness?
- What is the Return On Investment of our ad campaign?
In what follows I will essentially refer to two free e-books available with Python code and data to play with. The first e-book offers quick overviews, while the second allows for a more in-depth exploration of the content.
- Causal Inference for the Brave and True by Matheus Facure
- Causal Inference: The Mixtape by Scott Cunningham

1.1 The fundamental problem of causal inference
Let's dive into the most fundamental concept necessary to understand causal inference through a situation we might all be familiar with.
Imagine that you have been working on your computer all day long, a deadline is approaching, and you start to feel a headache coming on. You still have a few hours of work ahead, so you decide to take a pill. After a while, your headache is gone.
But then, you start questioning: Was it really the pill that made the difference? Or was it because you drank tea or took a break? The fascinating but ultimately frustrating part is that it is impossible to answer this question, because all those effects are confounded.
The only way to know for certain if it was the pill that cured your headache would be to have two parallel worlds.
In one of the two worlds you take the pill, and in the other you don't, or ideally you take a placebo. Because the pill is the only difference between the two worlds, you can attribute the relief to the pill only if you feel better in the world where you took it and not in the other.
Unfortunately, we do not have access to parallel worlds to experiment with and assess causality. In the real world, many factors occur simultaneously and are confounded (e.g., taking a pill for a headache, drinking tea, and taking a break; increasing ad spending during peak sales seasons; assigning more police officers to areas with higher crime rates).
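The parallel-worlds idea can be made concrete with a small simulation. The sketch below is a toy illustration, not taken from either e-book: the probabilities, the `PILL_EFFECT` value, and the choice of "taking a break" as the only confounder are all assumptions made up for this example. Because it is a simulation, we can record both worlds for every person (with and without the pill), something real data never allows, and compare the true effect with what a naive comparison of observed outcomes would suggest.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N = 100_000
PILL_EFFECT = 0.2  # assumed boost in recovery probability due to the pill

people = []
for _ in range(N):
    # Hypothetical confounder: taking a break helps recovery on its own
    # and, in this toy world, also makes taking the pill more likely.
    took_break = random.random() < 0.5
    base = 0.6 if took_break else 0.2

    # The two "parallel worlds": headache gone (1) or not (0),
    # without the pill (y0) and with the pill (y1).
    u = random.random()
    y0 = int(u < base)
    y1 = int(u < base + PILL_EFFECT)

    took_pill = random.random() < (0.8 if took_break else 0.2)
    observed = y1 if took_pill else y0  # only one world is ever observed
    people.append((took_pill, y0, y1, observed))

# True average effect: computable only because the simulation sees both worlds.
true_effect = sum(y1 - y0 for _, y0, y1, _ in people) / N

# Naive comparison based on observed outcomes only.
treated = [obs for pill, _, _, obs in people if pill]
untreated = [obs for pill, _, _, obs in people if not pill]
naive_difference = sum(treated) / len(treated) - sum(untreated) / len(untreated)

print(f"True average effect: {true_effect:.3f}")
print(f"Naive difference:    {naive_difference:.3f}")
```

By construction the true average effect is close to 0.2, while the naive difference comes out noticeably larger, because people who took a break were both more likely to take the pill and more likely to recover anyway. The observed data alone cannot separate the two.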
To quickly grasp this fundamental concept in more depth without requiring any additional technical knowledge, you can dive into the following article on Towards Data Science: