It is not just a gap - Fuzzy Regression Discontinuity Design


Posted by ar851060 on 2023-08-04

Introduction

Regression Discontinuity Design (RDD) is a quasi-experimental design that leverages a cutoff or threshold in an assignment variable to estimate the causal effect of a treatment. The basic idea is that units just above and below this cutoff are nearly identical, except that those above the cutoff receive treatment, while those below do not. By comparing these groups right at the threshold, we can estimate the local treatment effect.

Fuzzy Regression Discontinuity Design (FRDD) relaxes a key assumption of sharp RDD - that the treatment assignment is a deterministic function of the assignment variable. In FRDD, the probability of treatment jumps at the threshold, but by less than one, so treated and untreated units can be found on both sides of the cutoff. This "fuzziness" of the discontinuity complicates estimation of the treatment effect, because the jump in the outcome has to be rescaled by the jump in treatment probability.

Assumptions

The key assumptions of FRDD are:

  • There exists a threshold or cutoff c in the assignment variable Z
  • The probability of treatment jumps discontinuously at c
  • Units just above and below c are similar on all covariates X, except for the probability of treatment
  • The relationship between the outcome Y and Z is continuous through c (in the absence of treatment)

Methods

FRDD is typically estimated using a two-stage least squares (2SLS) instrumental variables approach:

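  1. Model the first stage: Regress treatment T on an indicator for being above the threshold c (the instrument), a smooth function of the assignment variable Z, and other covariates X. The coefficient on the indicator captures the jump in treatment probability at c, which is the "fuzziness" in treatment assignment.
  2. Model the second stage: Regress the outcome Y on the predicted treatment values from step 1, the same function of Z, and the covariates X. The threshold indicator is left out of this stage, because it serves as the instrument and may affect Y only through treatment. The coefficient on predicted treatment estimates the local treatment effect at the threshold c.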
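
The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a definitive implementation: the function name fuzzy_rdd_2sls, the bandwidth argument, and the choice of a linear trend in Z with a different slope on each side of the cutoff are all assumptions made for the example, and other covariates X are omitted. Standard errors read off a hand-rolled second stage are not valid, so for inference a dedicated IV or RDD routine would be needed.

```python
import numpy as np

def fuzzy_rdd_2sls(y, t, z, cutoff, bandwidth):
    """Local linear 2SLS point estimate of the treatment effect at the cutoff.

    y: outcome, t: treatment received (0/1), z: assignment variable.
    Only observations with |z - cutoff| <= bandwidth are used.
    """
    y, t, z = (np.asarray(a, dtype=float) for a in (y, t, z))
    keep = np.abs(z - cutoff) <= bandwidth
    y, t, zc = y[keep], t[keep], z[keep] - cutoff

    # Instrument: indicator for being at or above the cutoff (assumed convention).
    above = (zc >= 0).astype(float)

    # Controls shared by both stages: intercept, centered score,
    # and a slope change above the cutoff.
    controls = np.column_stack([np.ones_like(zc), zc, zc * above])

    # First stage: treatment on controls plus the instrument.
    x1 = np.column_stack([controls, above])
    gamma, *_ = np.linalg.lstsq(x1, t, rcond=None)
    t_hat = x1 @ gamma

    # Second stage: outcome on controls plus predicted treatment.
    # The instrument itself is excluded here.
    x2 = np.column_stack([controls, t_hat])
    beta, *_ = np.linalg.lstsq(x2, y, rcond=None)

    return {"first_stage_jump": gamma[-1], "effect": beta[-1]}
```

Calling fuzzy_rdd_2sls(y, t, z, cutoff=5.0, bandwidth=2.0) returns the estimated jump in treatment probability and the estimated effect. A first-stage jump close to zero is a warning sign, since the effect estimate then becomes very noisy.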

Example

Consider a job training program that probabilistically prioritizes individuals based on a score from 1 to 10. Those with a score above 5 have a 60% probability of receiving training, while those with a score below 5 have a 30% probability. We want to estimate the effect of the training on subsequent wages.

In the first stage, we would regress actual treatment receipt on an indicator for scoring above 5, the assignment score Z, and other covariates X. The coefficient on the indicator would show the jump in treatment probability at the threshold, here about 30 percentage points (from 30% to 60%).

In the second stage, we would regress wages on the predicted treatment values from the first stage along with Z and X, leaving out the threshold indicator because it serves as the instrument. The coefficient on predicted treatment gives the FRDD estimate of the local treatment effect.
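
To make the example concrete, here is a small simulation sketch. Everything beyond the 30% and 60% treatment probabilities and the cutoff at 5 is an assumption made for illustration: the score is drawn as a continuous value between 1 and 10, a score of exactly 5 counts as above the cutoff, the wage equation and the "true" training effect of 3.0 are invented, and other covariates X are left out. With a single binary instrument, the 2SLS estimate equals the jump in wages at the cutoff divided by the jump in training probability, so the sketch computes that ratio directly from local linear fits on each side.

```python
import numpy as np

def jump_at_cutoff(v, z, cutoff, bandwidth):
    """Right-side minus left-side local linear fit of v, evaluated at the cutoff."""
    left = (z < cutoff) & (z >= cutoff - bandwidth)
    right = (z >= cutoff) & (z <= cutoff + bandwidth)
    # np.polyfit returns [slope, intercept]; with z centered at the cutoff,
    # the intercept is the fitted value at the cutoff.
    fit_left = np.polyfit(z[left] - cutoff, v[left], 1)
    fit_right = np.polyfit(z[right] - cutoff, v[right], 1)
    return fit_right[1] - fit_left[1]

rng = np.random.default_rng(42)
n = 50_000
CUTOFF = 5.0
TRUE_EFFECT = 3.0  # hypothetical wage gain from training, chosen for the demo

# Synthetic data: score, training receipt, and wages.
score = rng.uniform(1, 10, n)
p_train = np.where(score >= CUTOFF, 0.6, 0.3)
trained = (rng.random(n) < p_train).astype(float)
wage = 20 + 1.5 * score + TRUE_EFFECT * trained + rng.normal(0, 2, n)

jump_t = jump_at_cutoff(trained, score, CUTOFF, bandwidth=2.0)  # first-stage jump
jump_y = jump_at_cutoff(wage, score, CUTOFF, bandwidth=2.0)     # jump in wages
wald = jump_y / jump_t                                          # fuzzy RDD estimate

print(f"jump in training probability: {jump_t:.3f}")
print(f"jump in wages:                {jump_y:.3f}")
print(f"estimated training effect:    {wald:.3f}")
```

The jump in wages on its own understates the effect of training, because crossing the cutoff changes training status for only part of the sample. Dividing by the first-stage jump rescales it.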

Limitations

Key limitations and challenges with FRDD include:

  • Sensitivity to functional form assumptions in the two stages
  • Difficulty in modeling the first stage treatment probability accurately
  • Need for large sample sizes near the cutoff
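  • Interpretation as a local treatment effect - the estimate applies at the cutoff, and only to units whose treatment status changes when they cross it, so external validity is limited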
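
The first limitation can be illustrated by re-estimating the effect over several bandwidths. The sketch below is only a demonstration on synthetic data: the cubic term in the outcome, the true effect of 3.0, the bandwidth grid, and the helper names are all assumptions. Because the linear fit on each side cannot follow the curvature, wider bandwidths drift away from the true effect, while narrower ones are closer but use fewer observations.

```python
import numpy as np

def side_jump(v, zc, h):
    """Jump at zero between linear fits on [0, h] and [-h, 0)."""
    left = (zc < 0) & (zc >= -h)
    right = (zc >= 0) & (zc <= h)
    return np.polyfit(zc[right], v[right], 1)[1] - np.polyfit(zc[left], v[left], 1)[1]

def wald_estimate(y, t, z, cutoff, h):
    """Fuzzy RDD estimate: outcome jump divided by treatment-probability jump."""
    zc = z - cutoff
    return side_jump(y, zc, h) / side_jump(t, zc, h)

rng = np.random.default_rng(7)
n = 60_000
TRUE_EFFECT = 3.0  # hypothetical, chosen for the demo

z = rng.uniform(1, 10, n)
t = (rng.random(n) < np.where(z >= 5, 0.6, 0.3)).astype(float)
# The outcome is curved in z (assumed cubic), which a linear fit cannot capture.
y = TRUE_EFFECT * t + 0.1 * (z - 5) ** 3 + rng.normal(0, 1, n)

estimates = {h: wald_estimate(y, t, z, 5.0, h) for h in (0.5, 1.0, 2.0, 4.0)}
for h, est in estimates.items():
    print(f"bandwidth={h}: estimate={est:.2f}")
```

If the estimates move a lot as the bandwidth changes, the functional form is probably doing much of the work, and the result should be read with caution.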

Pros and Cons

Pros:

  • Quasi-experimental, leverages naturally occurring thresholds
  • Unlike RCTs, does not require random assignment
  • More realistic than sharp RDD when the cutoff influences treatment but compliance is imperfect

Cons:

  • Functional form assumptions can be restrictive
  • Treatment effect is local to the cutoff point
  • Need large samples near cutoff
  • Modeling the first stage adds complexity

Difference from Sharp RDD

The key difference between standard sharp RDD and fuzzy RDD is whether treatment assignment at the cutoff is deterministic or probabilistic:

  • In sharp RDD, units above the cutoff are always treated, while those below never are. There is a sharp discontinuity in treatment.

  • In fuzzy RDD, the probability of treatment jumps at the cutoff, but there is not a deterministic assignment. The discontinuity in treatment probability is "fuzzy".

So while sharp RDD relies on a sharp separation of treated and control groups at the threshold, fuzzy RDD models the jump in the likelihood of treatment at the cutoff. Fuzzy RDD is more widely applicable, because in practice a threshold often influences treatment without fully determining it. The tradeoff is that fuzzy RDD requires more complex estimation and additional assumptions to model the first stage treatment process.
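
The same contrast can be written as a formula. Using the document's Y, T, Z and c, and introducing the symbol \tau only for this note, the fuzzy RDD estimand is commonly expressed as the jump in the expected outcome divided by the jump in treatment probability at the cutoff:

```latex
\tau_{\text{FRD}}
  = \frac{\lim_{z \downarrow c} E[Y \mid Z = z] - \lim_{z \uparrow c} E[Y \mid Z = z]}
         {\lim_{z \downarrow c} E[T \mid Z = z] - \lim_{z \uparrow c} E[T \mid Z = z]}
```

In sharp RDD the probability of treatment goes from 0 to 1 at the cutoff, so the denominator equals 1 and the expression reduces to the jump in the outcome alone.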


#causal inference








