Machine Learning with R - Introduction
Prerequisites (knowledge of topic)
This course assumes no prior experience with machine learning or R, though it may be helpful to be familiar with introductory statistics and programming.
A laptop computer is required to complete the in-class exercises.
R (https://www.r-project.org/) and RStudio (https://www.rstudio.com/products/rstudio/) are required for this course; both are available at no cost.
Machine learning, put simply, involves teaching computers to learn from experience, typically for the purpose of identifying or responding to patterns or making predictions about what may happen in the future. This course is intended to be an introduction to machine learning methods through the exploration of real-world examples. We will cover the basic math and statistical theory needed to understand and apply many of the most common machine learning techniques, but no advanced math or programming skills are required. The target audience may include social scientists or practitioners who are interested in understanding more about these methods and their applications. Students with extensive programming or statistics experience may be better served by a more theoretical course on these methods.
The course is designed to be interactive, with ample time for hands-on practice with machine learning methods. Each day will include several lectures on a machine learning topic, along with hands-on “lab” sections in which you apply what you have learned to new datasets (or to your own data, if desired).
The schedule will be as follows:
Day 1: Introducing Machine Learning with R
- How machines learn
- Using R, RStudio, and R Markdown
- k-Nearest Neighbors
- Lab sections – installing R, using R Markdown, choosing your own dataset (if desired)
Day 2: Intermediate ML Methods – Classification Models
- Quiz on Day 1 material
- Naïve Bayes
- Decision Trees and Rule Learners
- Lab sections – practicing with Naïve Bayes and decision trees
Day 3: Intermediate ML Methods – Numeric Prediction
- Quiz on Day 2 material
- Linear Regression
- Regression Trees
- Logistic Regression
- Lab sections – practicing with regression methods
Day 4: Advanced Classification Models
- Quiz on Day 3 material
- Neural Networks
- Support Vector Machines
- Random Forests
- Lab section – practice with neural networks, SVMs, and random forests
Day 5: Other ML Methods
- Quiz on Day 4 material
- Association Rules
- Hierarchical Clustering
- k-Means Clustering
- Lab section – practice with these methods, work on final report
Machine Learning with R (3rd ed.) by Brett Lantz (2019). Packt Publishing.
Supplementary / voluntary
Mandatory readings before course start
Please install R and RStudio on your laptop before the first class. Make sure that both are working correctly and that external packages can be installed. Instructions for doing this are in the first chapter of Machine Learning with R.
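One way to check the installation is to run a few lines in the R console. The following is a minimal sketch, not part of the official course materials: the helper name `package_ready`, the CRAN mirror URL, and the example package `rmarkdown` are assumptions chosen for illustration, and the packages actually used in class may differ.

```r
# Report the R version to confirm that the interpreter itself runs.
cat(R.version.string, "\n")

# Illustrative helper (hypothetical name): returns TRUE if a package can be
# loaded, attempting an install from CRAN first when it is missing.
package_ready <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this should be TRUE on any working installation.
package_ready("stats")

# Uncomment to test installing an external package (requires internet access).
# "rmarkdown" is only an example; the course may ask for other packages.
# package_ready("rmarkdown")
```

If the last call returns TRUE after being uncommented, external packages can most likely be installed without further setup.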
100% of the course grade will be based on a project and final report (approximately 10 pages), to be delivered within 2 to 3 weeks after the course. The project is intended to demonstrate your ability to apply the course materials to a dataset of your own choosing. Students should feel free to use a project related to their career or field of study. For example, you may use this opportunity to advance your dissertation research or complete a task for your job. The exact scoring criteria for this assignment will be provided on the first day of class. The report will be graded on its use of the methods covered in class and on whether it draws appropriate conclusions from the data.
There will also be a brief quiz at the start of each day after the first, covering the previous day's material. These quizzes are ungraded and are designed to provoke thought and discussion.
Students may reference literature and class materials as needed when writing the final project report.
The final project report should illustrate an ability to apply machine learning methods to a new dataset, which may be on a topic of the student’s choosing. The student should explore the data and explain the methods applied. Detailed instructions will be provided on the first day of class.