A Case Study About Why It Can Be Difficult To Test Whether Propensity Score Analysis Works in Field Experiments
Peikes, Moreno and Orzol (2008) sensibly caution researchers that propensity score analysis may not lead to valid causal inference in field applications. At the same time, however, they make the far stronger claim that they performed an ideal test of whether propensity score matching in quasi-experimental data can approximate the results of a randomized experiment in their dataset, and that this test showed that such matching could not do so. In this article we show that their study does not support that conclusion because it failed to meet a number of basic criteria for an ideal test. By implication, many other purported tests of the effectiveness of propensity score analysis probably also fail to meet these criteria and are therefore questionable contributions to that literature.
Keywords: Propensity Scores, Strong Ignorability, Quasi-Experiments, Within-Study Comparison