Machine Learning with R - Introduction

Prerequisites (knowledge of topic)
This course assumes no prior experience with machine learning or R, though familiarity with introductory statistics and programming may be helpful.

Hardware
A laptop computer is required to complete the in-class exercises.

Software
R (https://www.r-project.org/) and RStudio (https://www.rstudio.com/products/rstudio/) are required for this course. Both are available at no cost.

Course content
Machine learning, put simply, involves teaching computers to learn from experience, typically for the purpose of identifying or responding to patterns or making predictions about what may happen in the future. This course is intended to be an introduction to machine learning methods through the exploration of real-world examples. We will cover the basic math and statistical theory needed to understand and apply many of the most common machine learning techniques, but no advanced math or programming skills are required. The target audience may include social scientists or practitioners who are interested in understanding more about these methods and their applications. Students with extensive programming or statistics experience may be better served by a more theoretical course on these methods.

Structure
The course is designed to be interactive, with ample time for hands-on practice with machine learning methods. Each day will include several lectures on a machine learning topic, along with hands-on “lab” sections in which you apply what you have learned to new datasets (or your own data, if desired).

The schedule will be as follows:

Day 1: Introducing Machine Learning with R

Day 2: Intermediate ML Methods – Classification Models

Day 3: Intermediate ML Methods – Numeric Prediction

Day 4: Advanced Classification Models

Day 5: Other ML Methods

Literature

Mandatory
Machine Learning with R (3rd ed.) by Brett Lantz (2019). Packt Publishing.

Supplementary / voluntary
None required.

Mandatory readings before course start
Please install R and RStudio on your laptop before the first class. Be sure that both are working correctly and that external packages can be installed. Instructions for doing this are in the first chapter of Machine Learning with R.
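
As a quick self-check of the setup described above, something like the following could be run in the R console. This is only a minimal sketch and is not taken from the course text: the package name jsonlite and the CRAN mirror URL are illustrative assumptions, and any small CRAN package would serve the same purpose. An internet connection is assumed.

```r
# Print the installed R version to confirm that R itself runs.
print(R.version.string)

# Hypothetical test package (an assumption, not a course requirement);
# any small CRAN package would do.
pkg <- "jsonlite"

# Install the package only if it is not already available.
# The mirror URL is an assumption; requires internet access.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg, repos = "https://cloud.r-project.org")
}

# TRUE here suggests that external packages can be installed and loaded.
print(requireNamespace(pkg, quietly = TRUE))
```

If the last line prints FALSE or the installation reports an error, the instructions in the first chapter of the textbook are the place to look before the course begins.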

Examination part
100% of the course grade will be based on a project and final report (approximately 10 pages), to be delivered within two to three weeks after the course. The project is intended to demonstrate your ability to apply the course materials to a dataset of your own choosing. Students should feel free to use a project related to their career or field of study. For example, you may use this opportunity to advance your dissertation research or to complete a task for your job. The project will be graded on its use of the methods covered in class and on whether it draws appropriate conclusions from the data. The exact scoring criteria will be provided on the first day of class.

There will also be brief quizzes at the start of each lecture, covering the previous day's material. These are ungraded and are designed to provoke thought and discussion.

Supplementary aids
Students may reference literature and class materials as needed when writing the final project report.


Examination content

The final project report should demonstrate an ability to apply machine learning methods to a new dataset, which may be on a topic of the student’s choosing. The student should explore the data and explain the methods applied. Detailed instructions will be provided on the first day of class.