# Bayesian Data Analysis

**Prerequisites (knowledge of topic)**

Linear regression (strong), Maximum Likelihood Estimation (some familiarity), Linear/Matrix Algebra (some exposure is helpful), R (not required, but helpful).

**Hardware**

Access to a laptop will be useful, but not absolutely necessary.

**Software**

R/RStudio, JAGS (both are freely available online).

**Learning objectives**

Participants will understand what the Bayesian approach to statistical modeling is and appreciate how it differs from the Frequentist approach. They will be able to estimate a wide variety of models in the Bayesian framework and to adjust example code to fit their specific modeling needs.

**Course content**

- Theory/foundations of the Bayesian approach, including:
  - objective vs subjective probability
  - how to derive and incorporate prior information
  - the basics of MCMC sampling
  - assessing convergence of Markov chains
- Bayesian difference of means/ANOVA
- Bayesian versions of: linear models, logit/probit (dichotomous/ordered/unordered choice models), count models, latent variable and measurement models, multilevel models
- Presentation of results

**Structure**

Day 1 a.m.: Overview of the Bayesian approach (Bayes vs Frequentism), history of Bayesian statistics, problems with null hypothesis significance testing (NHST), the Beta-Binomial model.

Day 1 p.m.: Review of GLM/MLE. Probability review. Application of Bayes Rule.

Day 2 a.m.: Priors, Sampling methods (Inversion, Rejection, Gibbs sampling)

Day 2 p.m.: Convergence diagnostics. Using JAGS to estimate Bayesian models.

Day 3 a.m.: Estimating parameters of the Normal Distribution

Day 3 p.m.: Bayesian linear models, imputing missing data.

Day 4 a.m.: Choice models (dichotomous, ordered, unordered)

Day 4 p.m.: Latent variable models

Day 5 a.m.: Multilevel models: linear models.

Day 5 p.m.: Multilevel models: non-linear models, best practices for model presentation.
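As a small illustration of the Beta-Binomial model listed for Day 1, the conjugate update can be sketched in a few lines. This is a minimal sketch, not course material: the course itself works in R and JAGS, while the snippet below uses plain Python so that it runs without extra software. The function names and the data (7 successes in 10 trials, with a uniform Beta(1, 1) prior) are hypothetical and chosen only for illustration.

```python
# Beta-Binomial conjugate update: a Beta(a, b) prior combined with
# y successes in n binomial trials gives a Beta(a + y, b + n - y) posterior.
# Function names and data are illustrative assumptions, not course code.

def beta_binomial_update(a, b, successes, trials):
    """Return the posterior Beta parameters after observing binomial data."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return a + successes, b + trials - successes


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical data: uniform Beta(1, 1) prior, 7 successes in 10 trials.
post_a, post_b = beta_binomial_update(1, 1, successes=7, trials=10)
posterior_mean = beta_mean(post_a, post_b)
print(post_a, post_b, round(posterior_mean, 3))  # 8 4 0.667
```

In this example the posterior mean (about 0.667) falls between the prior mean (0.5) and the sample proportion (0.7), which is the usual behavior of a conjugate update with a weak prior.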

**Literature**

**Mandatory**

Gill, J. (2008). *Bayesian Methods: A Social And Behavioral Sciences Approach.* Chapman and Hall, Boca Raton, FL

Gelman, A. and Hill, J. (2007). *Data Analysis Using Regression and Multilevel/Hierarchical Models.* Cambridge University Press, New York.

Jackman, S. (2000). Estimation and Inference Are Missing Data Problems: Unifying Social Science Statistics via Bayesian Simulation. *Political Analysis*, 8(4):307–332. http://pan.oxfordjournals.org/content/8/4/307.full.pdf+html

**Supplementary / voluntary**

Siegfried, T. (2010). Odds are, it’s wrong: Science fails to face the shortcomings of statistics. *Science News*, 177(7):26–29. http://dx.doi.org/10.1002/scin.5591770721

Stegmueller, D. (2013). How Many Countries for Multilevel Modeling? A Comparison of Frequentist and Bayesian Approaches. *American Journal of Political Science.*

Bakker, R. (2009). Re-measuring left–right: A comparison of SEM and Bayesian measurement models for extracting left–right party placements. *Electoral Studies*, 28(3):413–421.

Bakker, R. and Poole, K. T. (2013). Bayesian Metric Multidimensional Scaling. *Political Analysis*, 21(1):125–140.

For those unfamiliar with R: Fox, J. and Weisberg, S. (2011). *An R Companion to Applied Regression.* Sage.

**Mandatory readings before course start**

Western, B. and Jackman, S. (1994). Bayesian Inference for Comparative Research. *American Political Science Review*, 88(2):412–423. http://www.jstor.org/stable/2944713

Efron, B. (1986). Why Isn’t Everyone a Bayesian? *The American Statistician*, 40(1):1–5. http://www.jstor.org/stable/2683105

**Examination part**

The examination is a written homework assignment that consists of estimating a variety of models using JAGS, together with a brief essay in which students describe how they would incorporate Bayesian methods into their own work and what they see as the main advantages and disadvantages of doing so.

**Supplementary aids**

Open book/practical examinations. The students should use the example code from the lectures to help complete the practical component as well as both required texts to help answer the essay component. Specifically, the linear model and dichotomous choice model examples will be very useful as well as the first 3 chapters of the Gill text and Section 3 of the Gelman and Hill text.

**Examination content**

Bayesian versions of the linear and dichotomous choice models, including presenting the appropriate results in a professionally acceptable manner. This includes creating graphical representations of the model results as well as a thorough discussion of how to interpret the results.

For the essay component, students will need to be aware of the benefits of the Bayesian approach for their own research (or the lack thereof) and to describe, in detail, the types of choices they would need to make in order to apply Bayesian methods to their own work. This includes a detailed description and justification of what priors they would choose as well as what differences they would expect to see between the Bayesian and Frequentist approaches, if any, and why they would expect such differences.

**Literature**

The only literature required to complete the examinations is the two required texts and the code examples from the lectures.