(with Nicole Pashley)
Factorial experiments are ubiquitous in the social and biomedical sciences, but when units fail to comply with their assigned factors, identification and estimation of the average treatment effects become impossible. Leveraging an instrumental variables approach, previous studies have shown how to identify and estimate the causal effect of treatment uptake among respondents who comply with their assigned treatment. A major caveat is that these identification results rely on strong assumptions about the effect of randomization on treatment uptake. This paper shows how to bound these complier average treatment effects under milder assumptions about noncompliance.
(with Ruofan Ma and Aleksei Opacic)
Political scientists are increasingly attuned to the promises and pitfalls of establishing causal effects. But the vital question for many is not whether a causal effect exists but why and how it exists. Even so, many researchers avoid causal mediation analyses due to the assumptions required, instead opting to explore causal mechanisms through what we call intermediate outcome tests. These tests apply the research design used to estimate the effect of treatment on the outcome to instead estimate the effect of treatment on one or more mediators, with authors often concluding that evidence of the latter is evidence of a causal mechanism. We show in this paper that, without further assumptions, this can neither establish nor rule out the existence of a causal mechanism. Instead, such conclusions about the indirect effect of treatment rely on implicit and usually very strong assumptions that are often unmet. Thus, such causal mechanism tests, though very common in political science, should not be viewed as a free lunch but rather should be used judiciously, and researchers should explicitly state and defend the requisite assumptions.
(with Adam Glynn, Hanno Hilbig, and Connor Halloran Phillips)
Recent experimental studies in the social sciences have demonstrated that short, perspective-taking conversations are effective at reducing prejudicial attitudes and support for discriminatory public policies, but it is unclear if such interventions can directly affect policy views without changing prejudice. Unfortunately, identification and estimation of the controlled direct effect—the natural causal quantity of interest for this question—have required strong selection-on-observables assumptions for any mediator. We leverage a recent experimental study with multiple survey waves of follow-up to identify and estimate the controlled direct effect using the changes in the outcome and mediator over time. This design lets us weaken the identification assumptions to permit linear and time-constant unmeasured confounding between the mediator and the outcome. Furthermore, we develop a semiparametrically efficient and doubly robust estimator for these quantities. We find that there is a robust controlled direct effect of perspective-taking conversations when subjective feelings are neutral but not positive or negative.
(with Soichiro Yamauchi)
Corporations, unions, and other interest groups have become key sponsors of television advertising in United States elections after the Supreme Court’s decision in Citizens United v. FEC, which eliminated restrictions on such spending. This paper estimates the partisan effects of ads sponsored by these groups to obtain a more complete picture of voter behavior and electoral politics. Advertising strategies vary over the course of the campaign, and so marginal structural models are a natural tool to estimate these effects. Unfortunately, this approach requires an assumption of no unobserved confounders between the treatment and outcome, which may not be plausible with observational electoral data. To address this, we propose a novel inverse probability of treatment weighting estimator with propensity-score fixed effects to adjust for time-constant unmeasured confounding in marginal structural models of fixed-length treatment histories. We show that these estimators are consistent and asymptotically normal when the number of units and time periods grow at a similar rate. Unlike traditional fixed effects models, this approach works even when the outcome is only measured at a single point in time, as in our setting, though the method does rely on some degree of treatment switching within units. Contrary to conventional wisdom, we find that interest group ads are only effective when run by groups supporting Democratic candidates and that these effects are most prominent after Donald Trump became a presidential candidate in 2016.
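To make the weighting step concrete, here is a minimal R sketch of inverse probability of treatment weighting for a two-period binary treatment history. It omits the paper’s propensity-score fixed effects and inference procedures, and the data frame df and its columns (y, a1, a2, x1, x2) are hypothetical.

    ## fit a treatment model at each period, conditioning on past treatment and covariates
    ps1 <- glm(a1 ~ x1, data = df, family = binomial)
    ps2 <- glm(a2 ~ a1 + x1 + x2, data = df, family = binomial)

    ## probability of the treatment actually received at each period
    p1 <- ifelse(df$a1 == 1, fitted(ps1), 1 - fitted(ps1))
    p2 <- ifelse(df$a2 == 1, fitted(ps2), 1 - fitted(ps2))

    ## (unstabilized) inverse probability weights for the full treatment history
    df$w <- 1 / (p1 * p2)

    ## weighted outcome regression as the marginal structural model
    msm <- lm(y ~ a1 * a2, data = df, weights = w)
    summary(msm)

In practice these weights are usually stabilized or truncated, and the estimator proposed in the paper replaces the simple treatment models above with propensity-score fixed effects to absorb time-constant unmeasured confounding.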
(with Nicole Pashley and Dominic Valentino)
Experiments are vital for assessing causal effects, but their high cost often leads to small, suboptimal sample sizes. We show how a particular experimental design—the Neyman allocation—can lead to more efficient experiments, achieving levels of statistical power similar to traditional designs with significantly fewer units. This design relies on unknown variances, and so previous work has proposed what we call the batch adaptive Neyman allocation (BANA) design, which uses an initial pilot study to approximate the optimal Neyman allocation for a second, larger batch. We extend BANA to multi-arm experiments common in political science, derive an unbiased estimator for the design, and show how to perform inference in this setting. Simulations verify that the design’s advantages are most apparent when the outcome variance differs across treatment conditions. Finally, we review the heteroskedasticity of recent experimental studies and find that political scientists using BANA could achieve sample size savings of 15–30%.
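As a minimal sketch of the allocation step only (the full BANA design also accounts for the pilot units and supports unbiased estimation and inference), the Neyman allocation assigns second-batch units in proportion to each arm’s estimated outcome standard deviation; the pilot data frame and its columns are hypothetical.

    ## per-arm standard deviations of the outcome, estimated from the pilot batch
    sds <- tapply(pilot$y, pilot$arm, sd)

    ## Neyman allocation: split the n2 second-batch units in proportion to the SDs
    n2 <- 1000
    alloc <- round(n2 * sds / sum(sds))
    alloc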
(with Avi Acharya and Maya Sen)
(with Jacob R. Brown, Sophie Hill, Kosuke Imai, and Teppei Yamamoto)
Conditioning on variables affected by treatment can induce post-treatment bias when estimating causal effects. Although this suggests that researchers should measure potential moderators before administering the treatment in an experiment, doing so may also bias causal effect estimation if the covariate measurement primes respondents to react differently to the treatment. This paper formally analyzes this trade-off between post-treatment and priming biases in three experimental designs that vary when moderators are measured: pre-treatment, post-treatment, or a randomized choice between the two. We derive nonparametric bounds for interactions between the treatment and the moderator in each design and show how to use substantive assumptions to narrow these bounds. These bounds allow researchers to assess the sensitivity of their empirical findings to either source of bias. We extend the basic framework in two ways. First, we apply the framework to the case of post-treatment attention checks and bound how much inattentive respondents can attenuate estimated treatment effects. Second, we develop a parametric Bayesian approach to incorporate pre-treatment covariates in the analysis to sharpen our inferences and quantify estimation uncertainty. We apply these methods to a survey experiment on electoral messaging. We conclude with practical recommendations for scholars designing experiments.
(with Nicole Pashley)
Factorial experiments are widely used to assess the marginal, joint, and interactive effects of multiple concurrent factors. While a robust literature covers the design and analysis of these experiments, there is less work on how to handle treatment noncompliance in this setting. To fill this gap, we introduce a new methodology that uses the potential outcomes framework for analyzing 2^K factorial experiments with noncompliance on any number of factors. This framework builds on and extends the literature on both instrumental variables and factorial experiments in several ways. First, we define novel, complier-specific quantities of interest for this setting and show how to generalize key instrumental variables assumptions. Second, we show how partial compliance across factors gives researchers a choice over different types of compliers to target in estimation. Third, we show how to conduct inference for these new estimands from both the finite-population and superpopulation asymptotic perspectives. Finally, we illustrate these techniques by applying them to a field experiment on the effectiveness of different forms of get-out-the-vote canvassing. New easy-to-use, open-source software implements the methodology.
(with Michael Olson)
Analyzing variation in treatment effects across subsets of the population is an important way for social scientists to evaluate theoretical arguments. A common strategy in assessing such treatment effect heterogeneity is to include a multiplicative interaction term between the treatment and a hypothesized effect modifier in a regression model. In this paper, we show that this approach results in biased inferences due to unmodeled interactions between the effect modifier and other covariates. Researchers can include the additional interactions, but this can lead to unstable estimates due to overfitting. Machine learning algorithms can stabilize these estimates but can also lead to bias due to regularization and model selection mistakes. To overcome these issues, we use a post-double selection approach that utilizes several lasso estimators to select the interactions to include in the final model. We extend this approach to estimate uncertainty for both interaction and marginal effects. Simulation evidence shows that this approach has lower bias and uncertainty than competing methods, even when the number of covariates is large. We show in two empirical examples that the choice of method leads to dramatically different conclusions about effect heterogeneity.
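A rough sketch of the post-double selection idea for a single treatment-by-moderator interaction, using the glmnet package, is below; the data frame df (outcome y, randomized treatment d, moderator v, covariates x1 through x3) is hypothetical, and the sketch omits the paper’s procedures for marginal effects and uncertainty estimates.

    library(glmnet)

    ## candidate controls: covariates and their interactions with the moderator
    W <- model.matrix(~ (x1 + x2 + x3) * v, data = df)[, -1]

    ## lasso-select the columns of W that predict a given target variable
    sel <- function(target) {
      cf <- as.vector(coef(cv.glmnet(W, target, alpha = 1), s = "lambda.min"))[-1]
      colnames(W)[cf != 0]
    }

    ## union of controls selected for the outcome and for each focal regressor
    keep <- Reduce(union, list(sel(df$y), sel(df$d), sel(df$d * df$v)))

    ## final OLS with the focal interaction plus the selected controls
    final <- lm(reformulate(c("d * v", keep), response = "y"), data = df)
    summary(final)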
(with Anton Strezhnev)
Time-varying treatments are prevalent in the social sciences. For example, a political campaign might decide to air attack ads against an opponent, but this decision to go negative will impact polling and, thus, future campaign strategy. If an analyst naively applies methods for point exposures to estimate the effect of earlier treatments, this would lead to post-treatment bias. Several existing methods can adjust for this type of time-varying confounding, but they typically rely on strong modeling assumptions. In this paper, we propose a novel two-step matching procedure for estimating the effect of two-period treatments. This method, telescope matching, reduces model dependence without inducing post-treatment bias by using matching with replacement to impute missing counterfactual outcomes. It then employs flexible regression models to correct for bias induced by imperfect matches. We derive the asymptotic properties of the telescope matching estimator and provide a consistent estimator for its variance. We illustrate telescope matching by investigating the effect of negative campaigning in U.S. Senate and gubernatorial elections. Using the method, we uncover a positive effect on turnout of negative ads early in a campaign but no effect of early negativity on vote shares.
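The core building block, matching with replacement followed by a regression correction for imperfect matches, can be illustrated in a single period with a single covariate; the full telescope matching procedure applies this logic twice, working backward from the later treatment. The data frame df (binary treatment a, covariate x, outcome y) is hypothetical.

    ## split the sample by treatment status
    tr <- df[df$a == 1, ]
    co <- df[df$a == 0, ]

    ## nearest-neighbor match with replacement on x for each treated unit
    idx <- sapply(tr$x, function(xi) which.min(abs(co$x - xi)))

    ## regression correction for bias from imperfect matches, fit among controls
    mod0 <- lm(y ~ x, data = co)
    y0_hat <- co$y[idx] + predict(mod0, newdata = tr) -
      predict(mod0, newdata = co[idx, , drop = FALSE])

    ## bias-corrected estimate of the effect on the treated
    mean(tr$y - y0_hat)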
(with Adam Glynn)
Repeated measurements of the same countries, people, or groups over time are vital to many fields of political science. These measurements, sometimes called time-series cross-sectional (TSCS) data, allow researchers to estimate a broad set of causal quantities, including contemporaneous and lagged treatment effects. Unfortunately, popular methods for TSCS data can only produce valid inferences for lagged effects under very strong assumptions. In this paper, we use potential outcomes to define causal quantities of interest in this setting and clarify how standard models like the autoregressive distributed lag model can produce biased estimates of these quantities due to post-treatment conditioning. We then describe two estimation strategies that avoid these post-treatment biases—inverse probability weighting and structural nested mean models—and show via simulations that they can outperform standard approaches in small sample settings. We illustrate these methods in a study of how welfare spending affects terrorism.
(with Avi Acharya and Maya Sen)
We present an approach to investigating causal mechanisms in experiments that include mediators, in particular survey experiments that provide or withhold information as in vignettes or conjoint designs. We propose an experimental design that can identify the controlled direct effect of a treatment and also, in some cases, what we call an intervention effect. These quantities can be used to address substantive questions about causal mechanisms and can be estimated with simple estimators using standard statistical software. We illustrate the approach via two examples, one on characteristics of U.S. Supreme Court nominees and the other on public perceptions of the democratic peace.
(with Avi Acharya and Maya Sen)
The standard approach in positive political theory posits that action choices are the consequences of attitudes. Could it be, however, that an individual’s actions also affect her fundamental preferences? We present a broad theoretical framework that captures the simple, yet powerful, intuition that actions frequently alter attitudes as individuals seek to minimize cognitive dissonance. This framework is particularly appropriate for the study of political attitudes and enables political scientists to formally address important questions that have remained inadequately answered by conventional rational choice approaches – questions such as “What are the origins of partisanship?” and “What drives ethnic and racial attitudes?” We illustrate our ideas with three examples from the literature: (1) how partisanship emerges naturally in a two-party system despite policy being multi-dimensional, (2) how ethnic or racial hostility increases after acts of violence, and (3) how interactions with people who express different views can lead to empathetic changes in political positions.
In this paper, I introduce a Bayesian model for detecting changepoints in a time series of overdispersed count data, such as contributions to candidates over the course of a campaign or counts of terrorist violence. While many extant changepoint models force researchers to choose the number of changepoints ex ante, this model incorporates a hierarchical Dirichlet process prior in order to estimate the number of changepoints as well as their locations. This allows researchers to discover salient structural breaks and perform inference on the number of such breaks in a given time series. I demonstrate the usefulness of the model with applications to campaign contributions in the 2012 U.S. Republican presidential primary and incidents of global terrorism from 1970 to 2015.
(with James Honaker and Gary King)
We extend a unified and easy-to-use approach to measurement error and missing data. Blackwell, Honaker, and King (2015a) gives an intuitive overview of the new technique, along with practical suggestions and empirical applications. Here, we offer more precise technical details; more sophisticated measurement error model specifications and estimation procedures; and analyses to assess the approach’s robustness to correlated measurement errors and to errors in categorical variables. These results support using the technique to reduce bias and increase efficiency in a wide variety of empirical research.
(with James Honaker and Gary King)
Although social scientists devote considerable effort to mitigating measurement error during data collection, they often ignore the issue during data analysis. And although many statistical methods have been proposed for reducing measurement error-induced biases, few have been widely used because of implausible assumptions, high levels of model dependence, difficult computation, or inapplicability with multiple mismeasured variables. We develop an easy-to-use alternative without these problems; it generalizes the popular multiple imputation (MI) framework by treating missing data problems as a limiting special case of extreme measurement error, and corrects for both. Like MI, the proposed framework is a simple two-step procedure, so that in the second step researchers can use whatever statistical method they would have if there had been no problem in the first place. We also offer empirical illustrations, open source software that implements all the methods described herein, and a companion paper with technical details and extensions.
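The two-step logic can be sketched in R with the companion Amelia package: create m completed datasets (cell-level measurement-error priors can be supplied through the package’s prior options, omitted here), run the intended analysis on each, and pool the results with Rubin’s rules. The data frame df and the analysis model are hypothetical.

    library(Amelia)

    ## step 1: create m completed datasets
    m <- 5
    imp <- amelia(df, m = m)

    ## step 2: run the analysis you would have run absent the problem, once per dataset
    fits <- lapply(imp$imputations, function(d) lm(y ~ x1 + x2, data = d))
    b  <- sapply(fits, function(f) coef(f)["x1"])
    se <- sapply(fits, function(f) summary(f)$coefficients["x1", "Std. Error"])

    ## pool with Rubin's rules: mean estimate, within + between variance
    qbar <- mean(b)
    Tvar <- mean(se^2) + (1 + 1/m) * var(b)
    c(estimate = qbar, std.error = sqrt(Tvar))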
In democratic countries, voting is one of the most important ways for citizens to influence policy and hold their representatives accountable. And yet, in the United States and many other countries, rates of voter turnout are alarmingly low. Every election cycle, mobilization efforts encourage citizens to vote and ensure that elections reflect the true will of the people. To establish the most effective way of encouraging voter turnout, this paper seeks to differentiate between (1) the synergy hypothesis, that multiple instances of voter contact increase the effectiveness of a single form of contact, and (2) the backlash hypothesis, that multiple instances of contact are less effective or even counterproductive. Remarkably, previous studies have been unable to compare these hypotheses because extant approaches to analyzing experiments with noncompliance cannot speak to questions of causal interaction. I resolve this impasse by extending the traditional instrumental variables framework to accommodate multiple treatment-instrument pairs, which allows for the estimation of conditional and interaction effects to adjudicate between synergy and backlash. The analysis of two voter mobilization field experiments provides the first evidence of backlash to follow-up contact and a cautionary tale about experimental design for these quantities.
(with Avi Acharya and Maya Sen)
We show that contemporary differences in political attitudes across counties in the American South in part trace their origins to slavery’s prevalence more than 150 years ago. Whites who currently live in Southern counties that had high shares of slaves in 1860 are more likely to identify as a Republican, oppose affirmative action, and express racial resentment and colder feelings toward blacks. These results cannot be explained by existing theories, including the theory of contemporary racial threat. To explain these results, we offer evidence for a new theory involving the historical persistence of political and racial attitudes. Following the Civil War, Southern whites faced political and economic incentives to reinforce existing racist norms and institutions to maintain control over the newly free African-American population. This amplified local differences in racially conservative political attitudes, which in turn have been passed down locally across generations. Our results challenge the interpretation of a vast literature on racial attitudes in the American South.
(with Avi Acharya and Maya Sen)
Researchers seeking to establish causal relationships frequently control for variables on the purported causal pathway, checking whether the original treatment effect then disappears. Unfortunately, this common approach may lead to biased estimates. In this paper, we show that the bias can be avoided by focusing on a quantity of interest called the controlled direct effect. Under certain conditions, the controlled direct effect enables researchers to rule out competing explanations—an important objective for political scientists. To estimate the controlled direct effect without bias, we describe an easy-to-implement estimation strategy from the biostatistics literature. We extend this approach by deriving a consistent variance estimator and demonstrating how to conduct a sensitivity analysis. Two examples—one on ethnic fractionalization’s effect on civil war and one on the impact of historical plough use on contemporary female political participation—illustrate the framework and methodology.
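One such easy-to-implement strategy from the biostatistics literature is sequential g-estimation; the following minimal sketch, with hypothetical variable names (outcome y, treatment a, mediator m, pre-treatment covariate x, intermediate confounder z), shows the two-stage “demediation” logic for the controlled direct effect with the mediator fixed at zero. Naive second-stage standard errors ignore the first-stage estimation, which is why a consistent variance estimator is needed.

    ## stage 1: estimate the mediator's effect on the outcome,
    ## adjusting for intermediate confounders affected by treatment
    stage1 <- lm(y ~ a + x + z + m, data = df)

    ## demediate the outcome by removing the mediator's estimated contribution
    df$y_demed <- df$y - coef(stage1)["m"] * df$m

    ## stage 2: regress the demediated outcome on treatment and pre-treatment covariates
    stage2 <- lm(y_demed ~ a + x, data = df)
    coef(stage2)["a"]  # estimated controlled direct effect (mediator set to zero)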
The estimation of causal effects has a revered place in all fields of empirical political science, but a large volume of methodological and applied work ignores a fundamental fact: most people are skeptical of estimated causal effects. In particular, researchers are often worried about the assumption of no omitted variables or no unmeasured confounders. This paper combines two approaches to sensitivity analysis to provide researchers with a tool to investigate how specific violations of the no-omitted-variables assumption alter their estimates. This approach can help researchers determine which narratives imply weaker results and which actually strengthen their claims. This gives researchers and critics a reasoned and quantitative approach to assessing the plausibility of causal effects. To demonstrate the approach, I present applications to three causal inference estimation strategies: regression, matching, and weighting.
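As a simplified, linear illustration of the kind of calculation such a sensitivity analysis formalizes, one can posit how strongly a hypothetical omitted confounder relates to the outcome (gamma) and how imbalanced it is across treatment groups (delta), and then trace out the implied bias-adjusted estimates; all values below are hypothetical.

    ## hypothetical estimated effect and a grid of sensitivity parameters
    est <- 0.12
    grid <- expand.grid(gamma = seq(-0.5, 0.5, by = 0.25),   # effect of confounder on outcome
                        delta = seq(-0.5, 0.5, by = 0.25))   # treated-control imbalance in confounder
    grid$adjusted <- est - grid$gamma * grid$delta            # bias-adjusted estimates

    ## scenarios that most attenuate the estimated effect
    head(grid[order(abs(grid$adjusted)), ])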
Dynamic strategies are an essential part of politics. In the context of campaigns, for example, candidates continuously recalibrate their campaign strategy in response to polls and opponent actions. Traditional causal inference methods, however, assume that these dynamic decisions are made all at once, an assumption that forces a choice between omitted variable bias and post-treatment bias. Thus, these kinds of “single-shot” causal inference methods are inappropriate for dynamic processes like campaigns. I resolve this dilemma by adapting models from biostatistics to estimate the effectiveness of an inherently dynamic process: a candidate’s decision to “go negative.” Using data from U.S. Senate and gubernatorial elections (2002–2006), I find, in contrast to previous literature and alternative methods, that negative advertising is an effective campaign strategy for Democratic non-incumbents. Democratic incumbents, on the other hand, are hurt by their negativity.
Amelia II is a complete R package for multiple imputation of missing data. The package implements a new expectation-maximization with bootstrapping algorithm that is faster, handles larger numbers of variables, and is far easier to use than various Markov chain Monte Carlo approaches, yet gives essentially the same answers. The program also improves imputation models by allowing researchers to put Bayesian priors on individual cell values, thereby incorporating a great deal of potentially valuable information. It also includes features to accurately impute cross-sectional datasets, individual time series, or sets of time series for different cross-sections. A full set of graphical diagnostics is also available. The program is easy to use, and the simplicity of the algorithm makes it far more robust; both a simple command-line interface and an extensive graphical user interface are included.
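A minimal usage sketch, following the package documentation’s africa example (the dataset ships with the package), illustrates the basic workflow and diagnostics:

    library(Amelia)
    data(africa)   # example time-series cross-sectional dataset included with the package

    ## five imputed datasets, declaring the time and cross-section index variables
    a.out <- amelia(africa, m = 5, ts = "year", cs = "country")

    ## summaries and graphical diagnostics (e.g., observed vs. imputed densities)
    summary(a.out)
    plot(a.out)

Cell-level Bayesian priors and time-series settings are passed through additional arguments documented in the package.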
In this article, we introduce a Stata implementation of coarsened exact matching, a new method for improving the estimation of causal effects by reducing imbalance in covariates between treated and control groups. Coarsened exact matching is faster, is easier to use and understand, requires fewer assumptions, is more easily automated, and possesses more attractive statistical properties for many applications than do existing matching methods. In coarsened exact matching, users temporarily coarsen their data, exact match on these coarsened data, and then run their analysis on the uncoarsened, matched data. Coarsened exact matching bounds the degree of model dependence and causal effect estimation error by ex ante user choice, is monotonic imbalance bounding (so that reducing the maximum imbalance on one variable has no effect on others), does not require a separate procedure to restrict data to common support, meets the congruence principle, is approximately invariant to measurement error, balances all nonlinearities and interactions in sample (i.e., not merely in expectation), and works with multiply imputed datasets. Other matching methods inherit many of the coarsened exact matching method’s properties when applied to further match data preprocessed by coarsened exact matching. The cem command implements the coarsened exact matching algorithm in Stata.