# Advanced Biostatistics

**Prerequisites (knowledge of topic)**

1. Course “Introduction to Biostatistics”, required.

2. Basic knowledge of statistics, such as distributions (e.g., normal, t, F, chi-square), mean/standard deviation calculations, and linear regression. (I will review some of these concepts in the first class to bring all students to the same level of understanding.)

3. R/SAS knowledge. It would be greatly helpful if students have some basic knowledge of R or SAS. The course will be taught in R since it is free and open source, but SAS code will be available.

**Hardware**

A laptop with the latest version of R installed is required for practice.

**Software**

R is required. Please download and install the latest version of R from http://r-project.org. RStudio is an optional, convenient front end for R.

**Course content**

Introduction

1. This «Advanced Biostatistics» course is based on the book "Clinical Trial Data Analysis Using R and SAS", co-authored by Din Chen, Karl E. Peace and Pinggao Zhang and published in the Chapman and Hall/CRC Biostatistics Series in 2017 (hereafter referred to as CTDA in this document).

2. This class aims to provide a thorough presentation of biostatistical analyses, with detailed step-by-step illustrations of their implementation using R and SAS. Examples are based on the authors' actual experience in many areas of biostatistical clinical drug development. For each application, we first understand the clinical setting and then identify the biostatistical methods appropriate for analyzing the data. Analysis code is then developed using appropriate R/SAS packages and functions. Code development and results are presented in a stepwise fashion. This stepwise approach should enable students to follow the logic and gain an understanding of the analysis methods and their R/SAS implementation, so that they can use R/SAS to analyze their own biostatistical data.

3. Students are encouraged to bring their own research data to be used in the class as further examples.

Topics To Be Covered

1. R basics: Introduction to R, with Monte Carlo simulation for clinical trial applications.

2. Treatment comparisons with continuous/categorical endpoints: We start with simple two-treatment comparisons using the t-test, extend the analysis to multiple-treatment comparisons (analysis of variance), and then to analysis of covariance with clinical covariates, using the "lm" function in R for continuous endpoints and "glm" for categorical endpoints.

3. Longitudinal clinical trials: We will illustrate longitudinal trials using the R "lattice" graphics package and analyze them using linear mixed models for continuous endpoints (R function "lmer" from the "lme4" package) and generalized linear mixed models and GEE for categorical endpoints (R function "glmmPQL" from the "MASS" package and "gee" from the "gee" package).

4. Meta-analysis in clinical trials: Both the fixed-effect model and the random-effect model will be discussed for categorical and continuous endpoints, using the powerful graphical features in the R “meta” package.

5. Bayesian analysis in clinical trials using MCMC simulations with MCMCpack.
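
As a flavor of topics 1 and 2, the following is a minimal base R sketch of a Monte Carlo simulation for a two-arm trial analyzed with a t-test. It is not taken from CTDA; the function name `simulate_trial` and all design values (sample size, treatment difference, standard deviation) are hypothetical and chosen only for illustration.

```r
# Monte Carlo sketch: power of a two-sample t-test in a two-arm trial.
# All design values below are hypothetical, chosen only for illustration.
set.seed(123)

# Simulate one trial and return the p-value of the pooled-variance t-test
simulate_trial <- function(n_per_arm, mean_diff, sd) {
  placebo <- rnorm(n_per_arm, mean = 0, sd = sd)
  drug    <- rnorm(n_per_arm, mean = mean_diff, sd = sd)
  t.test(drug, placebo, var.equal = TRUE)$p.value
}

# Repeat the trial many times; power is the share of significant results
n_sim    <- 2000
p_values <- replicate(n_sim, simulate_trial(n_per_arm = 50, mean_diff = 5, sd = 10))
power_mc <- mean(p_values < 0.05)

# Analytic power from base R, for comparison with the simulation
power_exact <- power.t.test(n = 50, delta = 5, sd = 10, sig.level = 0.05)$power

cat("Monte Carlo power: ", round(power_mc, 3), "\n")
cat("power.t.test power:", round(power_exact, 3), "\n")
```

The two numbers should agree up to simulation error, which shrinks as `n_sim` grows.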

**Structure**

*Day 1-Monday (Chapters 1 and 2, CTDA): Introduction to R and Review of Biostatistics*

*Morning Session: R Basics*

Introduction to the R system, Monte-Carlo simulation in clinical trials

*Afternoon Session: R for Basic Biostatistical Analysis*

Clinical trial simulations and data generation

Data distribution plotting, summary statistics and simple regression

Summary and Discussions

*Day 2-Tuesday (Chapters 3 and 4, CTDA): Treatment Comparison*

*Morning Session (Chapter 3): Treatment Comparison*

Data from Clinical Trials: Diastolic Blood Pressure and Data on Duodenal Ulcer Healing

Statistical Models for Treatment Comparisons:

Models for Continuous Endpoints

Student's t-Tests

One-Way Analysis of Variance (ANOVA)

Multi-Way ANOVA: Factorial Design

Multivariate Analysis of Variance (MANOVA)

Models for Categorical Endpoints: Pearson's χ²-test

R Step-by-Step Illustration on Data Analysis

*Afternoon Session (Chapter 4): Treatment Comparisons with Covariates*

Data from Clinical Trials

Diastolic Blood Pressure

Clinical Trials for Betablockers

Clinical Trial on Familial Adenomatous Polyposis

Statistical Models Incorporating Covariates

ANCOVA Models for Continuous Endpoints

Logistic Regression for Binary/Binomial Endpoints

Poisson Regression for Clinical Endpoint with Counts

Overdispersion

Data Analysis in R
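
The models listed for this session can be sketched in base R with `lm` and `glm`. The example below uses simulated data, not the CTDA trial data; the variable names (`dbp_change`, `healed`, `polyps`) and all true parameter values are hypothetical and only loosely echo the endpoints named above.

```r
# Sketch on simulated data (NOT the CTDA trial data); all values are hypothetical.
set.seed(2017)
n   <- 200
trt <- factor(rep(c("placebo", "drug"), each = n / 2), levels = c("placebo", "drug"))
age <- round(runif(n, 40, 70))
is_drug <- as.numeric(trt == "drug")

# Continuous endpoint: ANCOVA, i.e. treatment effect adjusted for a covariate
dbp_change <- -2 - 6 * is_drug + 0.1 * (age - 55) + rnorm(n, sd = 4)
ancova_fit <- lm(dbp_change ~ trt + age)

# Binary endpoint: logistic regression
healed    <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * is_drug))
logit_fit <- glm(healed ~ trt, family = binomial)

# Count endpoint: Poisson regression
polyps   <- rpois(n, lambda = exp(2 - 0.7 * is_drug))
pois_fit <- glm(polyps ~ trt + age, family = poisson)

# Simple overdispersion check: Pearson chi-square divided by residual df.
# Values well above 1 suggest overdispersion (e.g. refit with family = quasipoisson).
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)

print(coef(summary(ancova_fit)))
print(exp(coef(logit_fit)))   # odds ratios
print(dispersion)
```

Because the counts here are simulated from a Poisson model, the dispersion estimate should sit near 1; with real trial data it often does not.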

*Day 3-Wednesday (Chapter 6, CTDA): Longitudinal Data Analysis*

*Morning Session: Data and Longitudinal Modelling*

Longitudinal Data Structure

Diastolic Blood Pressure Data

Clinical Trial on Duodenal Ulcer Healing

Longitudinal Statistical Models

Linear Mixed Models

Generalized Linear Mixed Models

*Afternoon Session: Step-by-Step Data Analysis using R*

Analysis of Diastolic Blood Pressure Data

Data Graphics and Response Feature Analysis

Longitudinal Modeling with R package «nlme»

Analysis of Cimetidine Duodenal Ulcer Trial

Preliminary Analysis

Fit Logistic Regression to Binomial Data

Fit Generalized Linear Mixed Model

Summary and Discussions
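
A linear mixed model of the kind covered on this day can be sketched with the `nlme` package as follows. The data are simulated, not the diastolic blood pressure trial data, and the column names (`id`, `visit`, `trt`, `dbp`) and parameter values are hypothetical. A random-intercept model is used here only to keep the sketch small.

```r
# Sketch on simulated longitudinal data (NOT the CTDA data); values are hypothetical.
library(nlme)  # a recommended package shipped with standard R installations
set.seed(42)

n_subj <- 40
visits <- 0:4

# Long format: one row per subject per visit
dat <- expand.grid(visit = visits, id = seq_len(n_subj))
dat$trt <- factor(ifelse(dat$id <= n_subj / 2, "placebo", "drug"),
                  levels = c("placebo", "drug"))

# Subject-specific random intercepts plus a steeper decline on drug
b0 <- rnorm(n_subj, sd = 5)
dat$dbp <- 115 + b0[dat$id] - 1 * dat$visit -
  2 * dat$visit * (dat$trt == "drug") + rnorm(nrow(dat), sd = 3)

# Linear mixed model: fixed treatment-by-time effects, random intercept per subject
fit <- lme(dbp ~ trt * visit, random = ~ 1 | id, data = dat)
fixed_effects <- fixef(fit)
print(round(fixed_effects, 2))
```

The `trtdrug:visit` coefficient estimates how much faster the response declines per visit in the drug arm than in the placebo arm.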

*Day 4-Thursday (Chapter 8, CTDA): Biostatistics Meta-Analysis*

*Morning Session: Fixed-effect and Random-effect Models*

Statistical Models for Meta-Analysis

Clinical Hypotheses and Effect Size

Fixed-Effects Meta-Analysis Model: The Weighted-Average

Random-Effects Meta-Analysis Model: DerSimonian-Laird

Publication Bias

Meta-Data Analysis in R package “metafor”

*Afternoon Session: Meta-Regression*

Statistical Models for Meta-Regression

Heterogeneity

Meta-Regression in R package “metafor”

Summary and Discussions
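
The fixed-effects weighted average and the DerSimonian-Laird random-effects estimate from the morning session can be computed by hand in base R, as in the sketch below. This does not use the “metafor” API; the study-level effect estimates `yi` and variances `vi` are made-up numbers for illustration only.

```r
# Hand-computed meta-analysis sketch in base R.
# yi and vi are made-up study effects (e.g. log odds ratios) and variances.
yi <- c(-0.40, -0.15, -0.55, 0.05, -0.30)
vi <- c(0.04, 0.09, 0.06, 0.12, 0.05)
k  <- length(yi)

# Fixed-effects model: inverse-variance weighted average
wi       <- 1 / vi
theta_fe <- sum(wi * yi) / sum(wi)
se_fe    <- sqrt(1 / sum(wi))

# Heterogeneity: Cochran's Q and the DerSimonian-Laird estimate of tau^2
Q    <- sum(wi * (yi - theta_fe)^2)
C    <- sum(wi) - sum(wi^2) / sum(wi)
tau2 <- max(0, (Q - (k - 1)) / C)

# Random-effects model: weights include the between-study variance
wi_re    <- 1 / (vi + tau2)
theta_re <- sum(wi_re * yi) / sum(wi_re)
se_re    <- sqrt(1 / sum(wi_re))

cat("Fixed effect: ", round(theta_fe, 3), "SE", round(se_fe, 3), "\n")
cat("Random effect:", round(theta_re, 3), "SE", round(se_re, 3), "\n")
cat("Q =", round(Q, 3), " tau^2 =", round(tau2, 3), "\n")
```

When the estimated between-study variance is zero, the two models coincide; otherwise the random-effects standard error is the larger of the two.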

*Day 5-Friday (Chapter 9, CTDA): Bayesian Biostatistics*

*Morning Session: Bayesian Models*

Bayes' Theorem

From Prior Distribution to Posterior Distributions for Some Standard Distributions

Normal Distribution with Known Variance

Normal Distribution with Unknown Variance

Simulation from the Posterior Distribution

Direct Simulation

Importance Sampling

Gibbs Sampling

Metropolis-Hastings Algorithm

*Afternoon Session: R Packages for Bayesian Models*

R Packages using WinBUGS, R2WinBUGS, BRugs, rbugs, MCMCpack

Bayesian Data Analysis

Blood Pressure Data: Bayesian Linear Regression

Binomial Data: Bayesian Logistic Regression

Count Data: Bayesian Poisson Regression

Summary and Discussions
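
The step from prior to posterior for a normal mean with known variance, followed by direct simulation from the posterior, can be sketched in base R as below. The data are simulated, and the prior settings (`mu0`, `tau0`) and the known standard deviation `sigma` are hypothetical values for illustration; MCMCpack is not needed for this conjugate case.

```r
# Conjugate normal model with known variance; all numbers are hypothetical.
set.seed(9)

sigma <- 10                              # known sampling standard deviation
y     <- rnorm(30, mean = -5, sd = sigma) # simulated observations
n     <- length(y)

mu0  <- 0                                # prior mean
tau0 <- 20                               # prior standard deviation

# Posterior precision is prior precision plus data precision
post_prec <- 1 / tau0^2 + n / sigma^2
post_mean <- (mu0 / tau0^2 + sum(y) / sigma^2) / post_prec
post_sd   <- sqrt(1 / post_prec)

# Direct simulation: draw from the known posterior distribution
draws <- rnorm(10000, mean = post_mean, sd = post_sd)

cat("Posterior mean:", round(post_mean, 3), " posterior SD:", round(post_sd, 3), "\n")
print(quantile(draws, c(0.025, 0.975)))  # simulation-based 95% credible interval
```

The posterior mean is a precision-weighted compromise between the prior mean and the sample mean, so it always lies between the two.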

**Literature**

**Mandatory**

*Textbook*

Clinical Trial Data Analysis Using R and SAS (2017)

by Din Chen, Karl E. Peace and Pinggao Zhang,

Chapman and Hall/CRC Biostatistics Series

«Students are recommended to obtain this textbook or e-book from the library»

**Supplementary / voluntary**

NA

**Mandatory readings before course start**

• Install the latest version of R

**Examination part**

• (20%) Class participation

• (80%) Take-home project (due 3 weeks after class)

**Supplementary aids**

• Take-home, open-book project (see below for details)

**Examination content**

Write a data analysis report applying the longitudinal models learned in the class to the data provided (see “Data Sources” below). The report should include at least 5 sections:

o (15%) Section 1 is the literature review, used to formulate the research questions and objectives;

o (30%) Section 2 discusses the statistical model development;

o (30%) Section 3 presents the data analysis that addresses the research questions, with conclusions;

o (20%) Section 4 covers the discussion and future research; and

o (5%) Section 5 is the appendix, which includes all references and analysis code.

• Suggested research objectives:

o Present a general overview of measuring change over an 8-month period in individual perceptions of mood and social adjustment by women who had recently (1 month previously) undergone breast cancer surgery (intra-individual change).

o Present the longitudinal modeling that measures change across all subjects (inter-individual change).

o Demonstrate the addition of age and type of surgical treatment as possible predictors that may account for any change in the individual growth trajectories (i.e., intercept and slope) of mood and social adjustment.

• Data Sources:

o Excel file “hkcancer.xlsx”, with variables explained in the sheet “readme”

o A study of whether 405 Hong Kong Chinese women who underwent breast cancer surgery exhibited evidence of change in their “mood” and “social adjustment” at 1, 4, and 8 months post-surgery.

• Reference:

o Byrne, B.M., Lam, W. W. T. and Fielding, R. (2008) Measuring pattern of change in personality assessments: an annotated application of latent growth curve modelling. Journal of Personality Assessment, 90:1-11.

o Data are from this reference

o PDF file attached

**Literature**

None.