Workshop: Matching Estimation Strategies

Motivation

While randomized controlled trials (RCTs) remain the gold standard for causal inference, many research settings make randomization infeasible or unethical. In these observational settings, matching methods offer a quasi-experimental approach to estimate causal treatment effects.

Matching represents an intuitive method for addressing causal questions, primarily because it pushes the analyst to confront both the process of treatment assignment and the limitations of available data. By constructing comparable treatment and control groups based on observed characteristics, matching attempts to replicate the balance we would expect from randomization.

As Stuart (2010) notes:

“When estimating causal effects using observational data, it is desirable to replicate a randomized experiment as closely as possible by obtaining treated and control groups with similar covariate distributions. This goal can often be achieved by choosing well-matched samples of the original treated and control groups, thereby reducing bias due to the covariates. Since the 1970’s, work on matching methods has examined how to best choose treated and control subjects for comparison.”

Matching is a method of strategic subsampling from among treated and control cases. The investigator selects a nontreated control case for each treated case based on the characteristics in \(X_i\). All treated cases and matched control cases are retained, and all nonmatched control cases are discarded. Differences in \(Y_i\) are then calculated for treated and matched cases, with the average difference serving as the treatment effect estimate for the group of individuals given the treatment.
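The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a single covariate, absolute difference as the distance, and one-to-one nearest-neighbor matching with replacement. The function name `att_nearest_neighbor` and the toy data are hypothetical.

```python
import numpy as np

def att_nearest_neighbor(x, t, y):
    """Average treated-minus-matched-control difference (an ATT estimate).

    Assumes one covariate x, a binary treatment t, and an outcome y.
    Each treated unit is paired with its closest control, with replacement.
    """
    x, t, y = map(np.asarray, (x, t, y))
    treated = np.flatnonzero(t == 1)
    controls = np.flatnonzero(t == 0)
    # Absolute covariate distance between every treated and every control unit
    dist = np.abs(x[treated][:, None] - x[controls][None, :])
    # Closest control for each treated unit; unmatched controls are dropped
    match = controls[dist.argmin(axis=1)]
    diffs = y[treated] - y[match]
    return diffs.mean(), match

# Hypothetical toy data: three treated units, four controls
x = np.array([1.0, 2.0, 3.0, 1.1, 2.2, 2.9, 8.0])
t = np.array([1, 1, 1, 0, 0, 0, 0])
y = np.array([5.0, 6.0, 7.0, 4.0, 4.5, 5.0, 9.0])

att, match = att_nearest_neighbor(x, t, y)
print(att, match)
```

In this toy example the control with \(X_i = 8.0\) is far from every treated case, so it is never selected and plays no role in the estimate.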

Overview

Matching is usually introduced in one of two ways:

As a method to form quasi-experimental contrasts by sampling comparable treatment and control cases from among two larger pools of such cases
As a nonparametric method of adjustment for treatment assignment patterns when parametric regression estimators may not be trusted

Both approaches provide a means of recovering the causal effect of some treatment \(T_i\) on an outcome \(Y_i\) given observable characteristics \(X_i\).

Core Workflow

To implement a matching method, you will typically follow a five-step procedure:

1. Collecting observable characteristics: Gather covariate information on observations in both the treatment and control groups.
2. Defining closeness: Select a distance measure to determine whether one individual is a good match for another.
3. Implementing the match: Apply a matching algorithm given your distance measure.
4. Assessing match quality: Evaluate the balance of the resulting matched samples, iterating Steps 2 and 3 until balance is satisfactory.
5. Estimating the effect: Analyze outcomes and estimate the treatment effect given the matched sample.
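The five steps above can be walked through end to end on simulated data. The sketch below is illustrative only and rests on several assumptions not taken from the workshop materials: the data-generating process (with a true effect of 2), the choice of Mahalanobis distance, nearest-neighbor matching with replacement, and the standardized mean difference as the balance diagnostic. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: observable characteristics X, treatment T, outcome Y (simulated)
n = 500
X = rng.normal(size=(n, 2))
p = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))   # assignment depends on X
T = rng.binomial(1, p)
Y = 2.0 * T + X[:, 0] + X[:, 1] + rng.normal(size=n)  # assumed true effect: 2

treated = np.flatnonzero(T == 1)
controls = np.flatnonzero(T == 0)

# Step 2: define closeness with the squared Mahalanobis distance
VI = np.linalg.inv(np.cov(X, rowvar=False))
diff = X[treated][:, None, :] - X[controls][None, :, :]
dist2 = np.einsum("ijk,kl,ijl->ij", diff, VI, diff)

# Step 3: match each treated unit to its nearest control (with replacement)
matched = controls[dist2.argmin(axis=1)]

# Step 4: assess balance with standardized mean differences per covariate
def smd(a, b):
    pooled_sd = np.sqrt((a.var(axis=0, ddof=1) + b.var(axis=0, ddof=1)) / 2)
    return (a.mean(axis=0) - b.mean(axis=0)) / pooled_sd

smd_before = smd(X[treated], X[controls])
smd_after = smd(X[treated], X[matched])

# Step 5: estimate the effect on the matched sample
naive = Y[treated].mean() - Y[controls].mean()
att = (Y[treated] - Y[matched]).mean()

print("SMD before:", smd_before, "after:", smd_after)
print("naive difference:", naive, "matched ATT estimate:", att)
```

If the post-matching standardized mean differences remain large, Steps 2 and 3 would be revisited, for example by changing the distance measure or the matching algorithm, before moving on to estimation.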

Workshop Outline

This workshop is organized into the following modules:

Foundations

The Concept: Introduction to matching fundamentals and the intuition behind matching estimators
Distance Measures: Metrics for measuring similarity between units (Euclidean, Mahalanobis, propensity scores)

Matching Methods

Propensity Score Matching: Using predicted treatment probabilities to match observations
Covariate Matching: Direct matching on observed characteristics
LASSO Matching: High-dimensional variable selection with penalized propensity scores
Inverse Probability Weighting: Reweighting observations by inverse treatment probabilities to estimate causal effects without discarding observations
Entropy Balancing: Constructing optimization-based weights that exactly balance selected covariate moments between treatment groups
Doubly Robust Estimation: Combining propensity weighting with outcome regression for robustness

Validation & Application

Sensitivity Analysis: Testing the robustness of your findings to unobserved confounding
Coding Practice: Hands-on exercises with real data in R, Python, and Stata
The PSP Program: Empirical application using LASSO matching and matched-pair difference-in-differences (J-PAL MEL project)

Additional Resources

Recommended readings, software packages, and further learning materials

Video Review

This workshop was graciously recorded by the MEL J-PAL staff and is provided as a publicly available video to accompany the written materials.

References

Stuart, Elizabeth A. 2010. “Matching Methods for Causal Inference: A Review and a Look Forward.” Statistical Science 25 (1): 1–21. https://doi.org/10.1214/09-STS313.