In this article, we explore a statistical tool that sits at the intersection of economics and data analysis: the Regression Kink Design (RKD). It is used to estimate causal effects when the rate at which treatment intensity changes with an assignment variable shifts at a known threshold.
Regression Kink Design: An Overview
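The Regression Kink Design is a quasi-experimental design that allows for causal inference in settings where the slope of the treatment function changes at a known threshold of an assignment variable. It is the slope-based counterpart of the regression discontinuity design, which exploits a jump in the level of treatment rather than a change in its slope. RKD is often applied to policies whose rules are piecewise linear, such as benefit or tax schedules that are capped or phased out beyond a cutoff.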
Assumptions
RKD relies on several key assumptions:
Kink at the Threshold: There's a kink (change in slope) in the treatment function at a certain threshold.
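Smoothness at the Threshold: Apart from the kink in the treatment function, nothing else changes abruptly at the threshold. The expected potential outcomes, and their slopes with respect to the assignment variable, are continuous there.
No Manipulation around the Threshold: Individuals cannot precisely manipulate the assignment variable to fall on either side of the threshold. In practice this implies that the density of the assignment variable should be smooth at the threshold.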
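The no-manipulation assumption can be probed informally by counting observations in bins on either side of the threshold. The sketch below is only an illustration under stated assumptions: the data are simulated with no manipulation by construction, and the names `v`, `threshold`, `bin_width`, and `jump_ratio` are hypothetical. It is an eyeball check, not a substitute for a formal density test.

```python
import numpy as np

# Simulated assignment variable; uniform, so there is no bunching by construction.
rng = np.random.default_rng(0)
v = rng.uniform(-50, 50, 5_000)

threshold = 0.0
bin_width = 5.0

# Bin edges are aligned so that the threshold is itself an edge.
edges = np.arange(-20.0, 20.0 + bin_width, bin_width)
counts, _ = np.histogram(v, bins=edges)
centers = (edges[:-1] + edges[1:]) / 2

left = counts[centers < threshold]
right = counts[centers > threshold]

# Ratio of the two bins adjacent to the threshold; values far from 1
# would suggest bunching on one side.
jump_ratio = right[0] / left[-1]

print("counts left of threshold: ", left)
print("counts right of threshold:", right)
print(f"adjacent-bin ratio (right/left): {jump_ratio:.2f}")
```

With real data, a visible spike or a change in the trend of the counts at the threshold would be a warning sign worth following up with a formal test.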
Method
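In the Regression Kink Design, the causal effect of the treatment is identified from the kink at the threshold of the assignment variable. The method fits a regression model to the data on each side of the threshold and measures how much the slope of the outcome changes there. That change in slope is then divided by the change in slope of the treatment function at the same point. When the treatment function has a slope change of exactly one, the two quantities coincide; otherwise the difference in outcome slopes alone is not the treatment effect.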
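Written out, the estimand for the sharp case looks as follows. The notation is an assumption introduced here for illustration: Y is the outcome, V the assignment variable, c the threshold, and b(v) the known treatment function.

```latex
\[
\tau_{\mathrm{RKD}}
  = \frac{\displaystyle \lim_{v \downarrow c} \frac{d}{dv}\,\mathbb{E}[Y \mid V = v]
          \;-\; \lim_{v \uparrow c} \frac{d}{dv}\,\mathbb{E}[Y \mid V = v]}
         {\displaystyle \lim_{v \downarrow c} b'(v) \;-\; \lim_{v \uparrow c} b'(v)}
\]
```

The numerator is the change in the outcome's slope at the threshold, and the denominator is the change in the slope of the treatment function.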
Example
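Imagine we're interested in the effect of financial aid on students' academic performance. The amount of financial aid a student receives could depend on their family income, with a kink at a certain income threshold: for instance, aid might be constant up to the threshold and then be reduced for each additional dollar of income. If the relationship between academic performance and income changes slope at that same threshold, the size of that change, relative to the change in slope of the aid schedule, provides an estimate of the causal effect of financial aid.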
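To make the idea of a kinked aid schedule concrete, here is a minimal sketch. All numbers and names (`aid_schedule`, `threshold`, `full_aid`, `phase_out_rate`) are hypothetical and chosen only for illustration; they do not describe any real aid program.

```python
import numpy as np

def aid_schedule(income, threshold=30_000.0, full_aid=5_000.0, phase_out_rate=0.10):
    """Hypothetical rule: full aid up to the threshold, then reduced by
    phase_out_rate per extra dollar of income, never falling below zero."""
    income = np.asarray(income, dtype=float)
    reduction = phase_out_rate * np.clip(income - threshold, 0.0, None)
    return np.clip(full_aid - reduction, 0.0, None)

incomes = np.array([20_000, 30_000, 40_000, 50_000])
aid = aid_schedule(incomes)

# Slope of the schedule just below and just above the threshold.
slope_below = (aid_schedule(30_000) - aid_schedule(29_000)) / 1_000
slope_above = (aid_schedule(31_000) - aid_schedule(30_000)) / 1_000
kink_in_aid = slope_above - slope_below

print("aid by income:", dict(zip(incomes.tolist(), aid.tolist())))
print("change in slope of the aid schedule at the threshold:", kink_in_aid)
```

The level of aid is continuous at the threshold, but its slope drops from 0 to -0.10. That slope change is the denominator an RKD estimate would use. Note that this hypothetical schedule has a second kink where aid reaches zero, which would need to lie outside the estimation window.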
Limitations, Pros and Cons
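RKD has its limitations. Because it identifies effects from a change in slope rather than a jump in level, estimates tend to be noisy unless there are many observations close to the threshold, and they can be sensitive to the choice of bandwidth and functional form. It also assumes that there's no other factor that could cause a kink in the outcome at the threshold.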
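The data requirement can be illustrated with a small simulation. The sketch below assumes a simulated outcome with a true slope change of 1.5 at zero and homoskedastic noise; the helper `kink_estimate` and the bandwidth values are hypothetical choices for illustration. It estimates the slope change using only observations within a given distance of the threshold and reports a conventional OLS standard error.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-50, 50, 500)
# Outcome with slope 2 below zero and slope 3.5 above (true slope change = 1.5).
y = 2 * x + 1.5 * np.clip(x, 0, None) + rng.normal(0, 10, x.size)

def kink_estimate(x, y, bandwidth):
    """OLS of y on [1, x, max(x, 0)] using |x| <= bandwidth.
    Returns the estimated slope change and its homoskedastic standard error."""
    keep = np.abs(x) <= bandwidth
    xs, ys = x[keep], y[keep]
    X = np.column_stack([np.ones_like(xs), xs, np.clip(xs, 0, None)])
    beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
    resid = ys - X @ beta
    sigma2 = resid @ resid / (len(ys) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[2], np.sqrt(cov[2, 2])

results = {h: kink_estimate(x, y, h) for h in (50, 25, 10)}
for h, (est, se) in results.items():
    print(f"bandwidth={h:>2}: slope change={est:.2f}, std. error={se:.2f}")
```

Narrower bandwidths reduce the risk of misspecifying the functional form but leave fewer observations, so the standard error of the slope change grows quickly as the window shrinks.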
However, RKD also offers significant advantages. It provides a way to estimate causal effects without needing to randomize treatment. It also allows for heterogeneity in treatment effects and can be combined with other methods for robustness checks.
Regression Kink Design in Practice
To illustrate RKD, let's create a plot using a hypothetical example. We'll generate some data for a treatment function with a kink, apply a regression model, and show how RKD works.
Please note that in a real-world scenario, data would be collected from actual observations rather than being generated.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
# Create an array of x values from -50 to 50
x = np.linspace(-50, 50, 500)
# Set a kink at x = 0
kink = 0
# Generate some noise
np.random.seed(0)
noise = np.random.normal(0, 10, 500)
# Generate y values: for x < 0, y = 2x + noise; for x >= 0, y = 2x + 1.5x (kink) + noise
y = np.where(x < kink, 2*x + noise, 2*x + 1.5*x + noise)
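# Fit linear regression models on either side of the kink
model_left = LinearRegression().fit(x[x < kink].reshape(-1, 1), y[x < kink])
model_right = LinearRegression().fit(x[x >= kink].reshape(-1, 1), y[x >= kink])
# Estimated change in the outcome's slope at the kink (the simulated value is 1.5)
slope_change = model_right.coef_[0] - model_left.coef_[0]
print(f"Estimated change in slope at the kink: {slope_change:.2f}")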
# Generate predicted y values for the fitted models
y_pred_left = model_left.predict(x[x < kink].reshape(-1, 1))
y_pred_right = model_right.predict(x[x >= kink].reshape(-1, 1))
# Plotting
plt.figure(figsize=(10, 6))
plt.scatter(x, y, s=10, label="Data Points")
plt.plot(x[x < kink], y_pred_left, color='r', label="Regression Line (Left of Kink)")
plt.plot(x[x >= kink], y_pred_right, color='g', label="Regression Line (Right of Kink)")
plt.axvline(x=kink, color='b', linestyle='--', label="Kink")
plt.xlabel("Assignment Variable")
plt.ylabel("Outcome Variable")
plt.legend()
plt.title("Regression Kink Design Plot")
plt.show()
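In the plot, the scatter points represent the data, and the red and green lines represent fitted regression lines to the left and right of the kink, respectively. The difference in the slopes of these lines estimates the change in the outcome's slope at the kink, which is 1.5 in this simulation. To turn that into a causal effect of the treatment, it must be divided by the change in slope of the treatment function at the threshold. Because the two lines are fitted separately, they are not forced to meet at the kink, so a small gap between them at the threshold is a product of noise rather than evidence of a discontinuity.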
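The simulation above generates the outcome directly and does not specify a treatment function. As a minimal sketch of the final step, the code below adds one as an explicit assumption: a hypothetical treatment that is zero below the kink and rises with slope 0.5 above it, with a true effect of 3 per unit of treatment. This reproduces the same outcome slopes (2 and 3.5). The names `treatment_slope_change`, `true_effect`, and `rkd_estimate` are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-50, 50, 500)
kink = 0.0

# Assumed treatment rule (not part of the original simulation):
# zero below the kink, slope 0.5 above it.
treatment_slope_change = 0.5
treatment = treatment_slope_change * np.clip(x - kink, 0, None)

# Assumed true effect of one unit of treatment on the outcome.
true_effect = 3.0
y = 2 * x + true_effect * treatment + rng.normal(0, 10, x.size)

# Pooled regression: intercept, slope, and a term that switches on above the kink.
# The coefficient on the last column is the change in the outcome's slope.
X = np.column_stack([np.ones_like(x), x - kink, np.clip(x - kink, 0, None)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
outcome_slope_change = beta[2]

# RKD estimate: outcome slope change divided by treatment slope change.
rkd_estimate = outcome_slope_change / treatment_slope_change

print(f"change in outcome slope: {outcome_slope_change:.2f}")
print(f"RKD estimate of the treatment effect: {rkd_estimate:.2f}")
```

Unlike the two separate fits in the plot, this pooled specification forces the fitted line to be continuous at the kink, which matches the smoothness assumption and uses the data somewhat more efficiently.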
Conclusion
The Regression Kink Design provides a valuable tool for estimating causal effects in a wide range of settings. It's widely used in economics, public policy, and other fields. While it comes with its own set of assumptions and limitations, when used correctly, RKD can provide important insights into the causal impact of interventions.