# Causal Inference

### Outline

Modern microeconometrics has the goal of empirically establishing and quantifying causal relations between interesting variables, such as participating in some government intervention and individual labour market outcomes.

In this course we discuss the most relevant methods (also called research designs in microeconometrics) used in empirical practice, such as matching, instrumental variables, regression discontinuity designs, and difference-in-difference estimation based on cross-sectional and panel data. The methods are explained together with their potential virtues and limitations in different disciplines. For each research design we start with a general discussion of its key assumptions and how they relate to empirical settings. Next, we discuss estimation principles as well as particular estimators suited to the design. Finally, some time is devoted to looking in more detail at an empirical paper that uses the method. These empirical papers come from the fields of labour, health and sports economics. Usually, we discuss the theory in the morning, while the afternoon is devoted to empirical applications.

The course has the following structure:

Introduction and social experiments

This part starts with a discussion of the concept and notation of causality as used in microeconometrics. To simplify, we base all theoretical analysis on a binary model, i.e. a model in which we are interested in learning the causal effect of changing a ‘treatment’ variable from 0 (control) to 1 (treated). For this model, the usual causal parameters of interest are introduced.
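
As a sketch of the kind of notation meant here, assuming the potential-outcome framework that is common in this literature (the symbols below are illustrative and may differ from those used in the course): let $D \in \{0, 1\}$ be the treatment, $Y(1)$ and $Y(0)$ the outcomes a unit would realise with and without treatment, and $Y$ the observed outcome. Two frequently used parameters are the average treatment effect (ATE) and the average treatment effect on the treated (ATT).

$$
Y = D\,Y(1) + (1 - D)\,Y(0), \qquad \text{ATE} = E[Y(1) - Y(0)], \qquad \text{ATT} = E[Y(1) - Y(0) \mid D = 1]
$$

Only one of $Y(1)$ and $Y(0)$ is observed for any unit, which is why these parameters cannot be computed directly from the data.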

We will look in depth at the so-called confounding (omitted variable) problem and its impact on empirical results. We will see that without identifying assumptions, which *cannot* be justified by the data, identification of causal effects is impossible.
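
The confounding problem can be made concrete with a small simulation. The sketch below is not course material: the data-generating process, the function names `simulate` and `naive_difference`, and all parameter values are assumptions chosen for illustration. A binary confounder `x` raises both the probability of treatment and the outcome, so the naive difference in mean outcomes between treated and controls overstates the true effect of 1.0.

```python
import random
from statistics import mean

def simulate(n=20000, true_effect=1.0, seed=0):
    """Draw (x, d, y) rows in which the binary confounder x raises both
    the treatment probability and the outcome (hypothetical process)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p_treat = 0.8 if x else 0.2          # x makes treatment more likely
        d = 1 if rng.random() < p_treat else 0
        y = true_effect * d + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, d, y))
    return rows

def naive_difference(rows):
    """Difference in mean outcomes, ignoring the confounder."""
    treated = [y for _, d, y in rows if d == 1]
    controls = [y for _, d, y in rows if d == 0]
    return mean(treated) - mean(controls)

rows = simulate()
naive = naive_difference(rows)
print(f"true effect: 1.0, naive difference in means: {naive:.2f}")
```

Under these assumed parameters the naive comparison converges to 1.0 + 2.0 × (0.8 − 0.2) = 2.2 rather than 1.0, because 80% of the treated but only 20% of the controls have `x = 1`.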

Usually, the most credible set of identifying assumptions is obtained in (social) experiments. Therefore, the pros and cons of social experiments, as well as examples, are discussed together with the key concepts of ‘external’ and ‘internal’ validity of an empirical study.

Matching

The rationale of matching estimation is that the confounding (selection) problem can be solved by econometrically creating two artificial samples in which the treated (1) and the control (0) observations have the same distribution of a set of covariates. We will discuss the implications of the underlying identifying assumptions as well as popular matching estimation methods. Finally, we use examples from the evaluation of German active labour market programmes to empirically illustrate the methods.
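
To illustrate the idea, here is a minimal sketch of one popular variant, one-to-one nearest-neighbour matching with replacement on a single covariate. It is an assumption that this particular estimator is among those covered in the course; the function name `nearest_neighbour_att` and the toy numbers are made up for illustration and are not real data.

```python
def nearest_neighbour_att(treated, controls):
    """Average, over treated units, of the own outcome minus the outcome
    of the control unit closest in x (matching with replacement).
    Each unit is a tuple (x, y)."""
    diffs = []
    for x_t, y_t in treated:
        # Pick the control whose covariate value is closest to x_t.
        _, y_c = min(controls, key=lambda c: abs(c[0] - x_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)

# Toy (x, y) pairs, invented purely for illustration.
treated = [(1.0, 5.0), (2.0, 7.0), (3.0, 9.0)]
controls = [(0.9, 3.0), (2.2, 5.5), (3.1, 6.0), (5.0, 10.0)]

att = nearest_neighbour_att(treated, controls)
print(round(att, 3))
```

The control with `x = 5.0` is never used because no treated unit is close to it. The matched control sample therefore has a covariate distribution similar to that of the treated sample, which is the point of the procedure.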

Instrumental Variables

Instrumental variables are variables that influence the outcomes of interest only because they influence the assignment of the treatment (and have no other impact on the outcomes). We analyse the basic set of assumptions necessary for this method to yield causal effects and discuss some possible estimation methods. These estimation methods turn out to share several similarities with the matching methods. A paper from sports and health economics will illustrate these methods.
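
For a binary instrument, a simple estimator in this spirit is the Wald estimator: the effect of the instrument on the outcome divided by its effect on treatment take-up. The sketch below assumes a binary instrument `z` and binary treatment `d`; the function name `wald_estimate` and the toy numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

def wald_estimate(z, d, y):
    """Wald estimator for a binary instrument z:
    (E[y | z=1] - E[y | z=0]) / (E[d | z=1] - E[d | z=0])."""
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    d1 = mean([di for zi, di in zip(z, d) if zi == 1])
    d0 = mean([di for zi, di in zip(z, d) if zi == 0])
    first_stage = d1 - d0
    if first_stage == 0:
        raise ValueError("instrument does not shift treatment take-up")
    return (y1 - y0) / first_stage

# Toy data, invented for illustration only.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 0, 1, 1, 1]
y = [1.0, 2.0, 1.0, 4.0, 2.0, 4.0, 5.0, 5.0]

wald = wald_estimate(z, d, y)
print(wald)
```

Here the instrument raises the mean outcome by 2.0 and the treatment share by 0.5, so the ratio is 4.0. The division only makes sense if the instrument affects the outcome solely through the treatment, which is the assumption stated above.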

Regression Discontinuity Designs

The regression discontinuity design is a special, highly relevant case of an instrumental variable method. It exploits the fact that the probability of treatment assignment changes discontinuously at some value of a so-called ‘running variable’. Again, identification and inference are discussed. Finally, if time permits, the method will be illustrated by an example from the evaluation of Swiss active labour market policy.
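
The link to instrumental variables can be seen in the usual estimand. As a sketch, assuming a running variable $R$ with cutoff $c$ (these symbols are not from the course description), the jump in the mean outcome at the cutoff is scaled by the jump in the treatment probability:

$$
\tau_{\text{RD}} = \frac{\lim_{r \downarrow c} E[Y \mid R = r] - \lim_{r \uparrow c} E[Y \mid R = r]}{\lim_{r \downarrow c} E[D \mid R = r] - \lim_{r \uparrow c} E[D \mid R = r]}
$$

In a ‘sharp’ design, where treatment switches from 0 to 1 exactly at the cutoff, the denominator equals 1 and the effect is simply the jump in the mean outcome.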

Difference-in-Difference Estimation

Difference-in-difference estimation has a long tradition in epidemiology and has more recently been adopted in many other disciplines. It is based on exploiting cross-sectional and time series variation in relevant variables. These methods will be illustrated by an empirical example concerning the effects of minimum wages.
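
In its simplest form, with two groups and two periods, the estimator subtracts the change over time in the control group from the change over time in the treated group. The sketch below assumes this 2×2 setting; the function name `did_estimate` and the numbers are hypothetical and unrelated to the minimum wage example.

```python
from statistics import mean

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """2x2 difference-in-difference: change in the treated group's mean
    outcome minus change in the control group's mean outcome."""
    change_treated = mean(treat_post) - mean(treat_pre)
    change_control = mean(ctrl_post) - mean(ctrl_pre)
    return change_treated - change_control

# Toy outcomes per group and period, invented for illustration only.
did = did_estimate(
    treat_pre=[10.0, 12.0], treat_post=[15.0, 17.0],
    ctrl_pre=[8.0, 10.0], ctrl_post=[10.0, 12.0],
)
print(did)
```

The treated group's mean rises by 5.0 and the control group's by 2.0, giving an estimate of 3.0. Reading this as a causal effect requires that, without treatment, both groups would have followed the same trend.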

**Econometric background**

Ideally, participants have an econometric background at the level of Wooldridge (2002, 2010), *Econometric Analysis of Cross Section and Panel Data*, 1st or 2nd edition. However, even if this is not the case, participants should be able to benefit from the course.

**Reading**

In preparation for the course, it will be useful to read the following two papers:

Imbens, G.W., and J.M. Wooldridge (2009): “Recent Developments in the Econometrics of Program Evaluation”, *Journal of Economic Literature*, 47 (1), 5-86.

Heckman, J.J. (2000): “Causal Parameters and Policy Analysis in Economics: A Twentieth Century Retrospective”, *Quarterly Journal of Economics*, 115, 45-97.

Moreover, the chapters of Angrist and Pischke, *Mostly Harmless Econometrics*, that correspond to the topics above are good preparation for this course.

Participants will receive an extensive list of references during the course.

**Exam**

At the end of the course, students will receive data sets and instructions. With these data, they are expected to conduct a small-scale empirical study. The resulting short paper (15 pages max) has to be submitted no later than 4 weeks after the end of the course and will be graded.

**Software and Hardware**

During the course, there will be no computational exercises. For the empirical paper, access to statistical software like Gauss, R, or Stata is necessary.
