Data Mining in Business
Prerequisites (knowledge of topic)
Students will need undergraduate-level knowledge or better of the following topics:
Students must have a laptop with more than 4 GB of RAM.
• Software: R and RStudio (if working locally)
1. If you are not familiar with RStudio, please take a short introduction to R course at Lynda.com, DataQuest.com or DataCamp.com
• Access to git software to download data sets and class material or ability to download directly from the Internet
• In this course, we will be using https://rstudio.cloud/ to avoid local laptop issues for students. This ensures that all students work in the same environment and that class time is not spent on technical troubleshooting. Please sign up for a free account.
If you stay engaged in this week-long, mostly case-study-based course and complete the suggested readings and assignments:
You will be able to think systematically about how data is used to make business decisions. This objective will be accomplished using real, messy data and actual business problems. The methods used to solve problems and make business decisions are interdisciplinary, drawing on statistics, economics and computer technology. Students will learn how to implement a variety of popular data mining algorithms in R (free, open-source software) to tackle business problems and identify opportunities. The course also introduces the basics of R for data mining.
As a future business person, you will acquire the skill of applying data mining concepts to various business domains. The practical benefit is that data-driven decisions can improve outcomes and often increase competitive advantage.
If you are an aspiring data scientist or technical practitioner, you will gain practical experience applying data mining methods used in many of today’s most successful organizations.
This is a draft of the lab work and case studies. It may change depending on topics and timing in order to improve learning outcomes. This is particularly true for Day 5.
• R basics, installation and set up
• Basic exploratory data analysis
• Basic R plotting
• Basic clustering techniques: k-means, k-medoids
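As a rough illustration of the clustering topic above, the following is a minimal sketch of k-means in base R. The data are synthetic and the column names (`spend`, `visits`) are hypothetical; they are not taken from the course data sets.

```r
# Minimal k-means sketch on synthetic "customer" data (all values invented).
set.seed(42)  # make the random data and the cluster starts reproducible

# Two hypothetical customer features: annual spend and visits per month.
customers <- data.frame(
  spend  = c(rnorm(50, mean = 200, sd = 30), rnorm(50, mean = 800, sd = 60)),
  visits = c(rnorm(50, mean = 2,   sd = 0.5), rnorm(50, mean = 8,   sd = 1))
)

# Quick exploratory look before clustering.
summary(customers)

# Scale so that both features contribute comparably to the distance.
scaled <- scale(customers)

# Fit k-means with k = 2 clusters and several random starts.
fit <- kmeans(scaled, centers = 2, nstart = 10)

# Attach the cluster label and summarize each segment on the original scale.
customers$cluster <- fit$cluster
persona_summary <- aggregate(cbind(spend, visits) ~ cluster,
                             data = customers, FUN = mean)
print(persona_summary)
```

The per-cluster means are one simple starting point for describing personas. A k-medoids fit could be tried in the same way with `pam()` from the `cluster` package, assuming that package is available.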
Day1 Afternoon Case:
• EDA & Customer Personas for (synthetic) customer data
Case1: 2 slides & supporting code identifying unique customer personas from the lab and methods used, presented to a “marketing executive”
• Forecasting Methods: Holt Winters, Decomposition, Linear Methods
• Scheduling with Erlang-C or similar
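The two topics above can be sketched in base R as follows. This is only an illustration under stated assumptions: the built-in `AirPassengers` series stands in for a demand history, and the offered load of 10 Erlangs and the 20% waiting target are invented numbers, not course parameters.

```r
# --- Forecasting: Holt-Winters on a built-in monthly series (stand-in data) ---
hw_fit <- HoltWinters(AirPassengers)        # level, trend and seasonal smoothing
hw_fc  <- predict(hw_fit, n.ahead = 12)     # forecast the next 12 months
print(round(hw_fc))

# --- Scheduling: Erlang C probability that an arriving contact must wait ---
# agents: number of servers; load: offered load in Erlangs
#         (arrival rate multiplied by average handling time).
erlang_c <- function(agents, load) {
  if (agents <= load) return(1)             # unstable queue: everyone waits
  k   <- 0:(agents - 1)
  top <- load^agents / factorial(agents) * agents / (agents - load)
  top / (sum(load^k / factorial(k)) + top)
}

# Hypothetical staffing question: fewest agents so that at most 20% wait.
offered_load  <- 10                         # assumed value, in Erlangs
p_wait        <- sapply(1:50, erlang_c, load = offered_load)
agents_needed <- min(which(p_wait <= 0.2))
print(agents_needed)
```

In a capacity-planning exercise, the forecast would typically be converted into an offered load per interval before applying the Erlang C function; that conversion depends on the data and is not shown here.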
Day2 Afternoon Case:
• Capacity Planning & Scheduling Recommendations for (synthetic) future periods
Case2: 2 slides & supporting code recommending scheduling and cost considerations from the lab forecast presented to an “operations manager”
• Quantitative Equity Investing: Golden/Death Cross, MACD, RSI methods
• Non-Traditional Market investing: Simulating Risk/Reward
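To make the crossover and MACD ideas above concrete, here is a minimal base R sketch. The price series is a simulated random walk, and the window lengths (50/200 days for the crosses, 12/26/9 for MACD) are the commonly quoted defaults, assumed here rather than taken from the course material.

```r
# Synthetic daily closing prices (random walk; values are invented).
set.seed(1)
prices <- 100 + cumsum(rnorm(500, mean = 0.05, sd = 1))

# Trailing simple moving average; the first n - 1 values are NA.
sma <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

# Exponential moving average with smoothing factor 2 / (n + 1).
ema <- function(x, n) {
  alpha <- 2 / (n + 1)
  Reduce(function(prev, cur) alpha * cur + (1 - alpha) * prev,
         x, accumulate = TRUE)
}

fast <- sma(prices, 50)
slow <- sma(prices, 200)

# A golden cross is the day the fast average moves above the slow one;
# a death cross is the day it moves below.
above       <- fast > slow                 # NA until both averages exist
cross       <- c(NA, diff(above))          # +1 = golden, -1 = death
golden_days <- which(cross == 1)
death_days  <- which(cross == -1)

# MACD line and its signal line.
macd   <- ema(prices, 12) - ema(prices, 26)
signal <- ema(macd, 9)

print(golden_days)
print(death_days)
```

The same functions could be applied to real closing prices once they are loaded into a numeric vector; RSI follows a similar pattern using average gains and losses and is omitted to keep the sketch short.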
Day3 Afternoon Case:
• Investment modeling case
Case3: 2 slides and supporting code with Buy, hold, sell, or short recommendation for 3 equities not covered in class.
• Binary modeling techniques: logistic regression, decision trees (DT), random forests (RF)
• Consumer Credit Loan Default
• Marketing Offer Acceptance Modeling
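A propensity model of the kind listed above can be sketched with logistic regression in base R. Everything in this example is simulated: the household table, the column names (`income`, `age`, `accepted`) and the relationship between income and acceptance are assumptions made for illustration only.

```r
# Simulated household data (all values and relationships are invented).
set.seed(7)
n <- 1000
households <- data.frame(
  id     = seq_len(n),
  income = rnorm(n, mean = 60, sd = 15),    # hypothetical income, in thousands
  age    = round(runif(n, 21, 80))
)

# Simulate offer acceptance so that higher income raises the odds.
true_prob <- plogis(-3 + 0.04 * households$income)
households$accepted <- rbinom(n, size = 1, prob = true_prob)

# Fit a logistic regression for the binary outcome.
fit <- glm(accepted ~ income + age, data = households, family = binomial)
print(summary(fit)$coefficients)

# Score every household with its predicted probability of acceptance.
households$propensity <- predict(fit, type = "response")

# Rank by propensity and keep the 100 most likely to accept.
top100 <- head(households[order(-households$propensity), ], 100)
print(head(top100))
```

In practice the model would be fit on one sample and evaluated on held-out data before ranking; that step is left out here for brevity. Decision trees and random forests would slot into the same score-and-rank pattern.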
Day4 Afternoon Case:
• Consumer propensity modeling for a (fictitious) credit offer
Case4: 2 slides, a list of the top 100 households for the offer & supporting code for the consumer propensity model for a (fictitious) credit offer
• Evaluating startups and venture capital valuation
• Final Examination
Case5 DUE MONDAY: 2 slides & supporting documentation for valuing prospective startup SaaS companies
Supplementary / voluntary
R For Data Science
• Free online: https://r4ds.had.co.nz/
• Physical copy (Amazon): https://smile.amazon.com/Data-Science-Transform-Visualize-Model/dp/1491910399
Each student’s individual case study code and summary slides will be evaluated, together with a paper exam on Friday, to arrive at the final course score. For the case assignments, the code and scripts show the extent to which the technical concepts are understood. The summary and presentation portion allows students to demonstrate business understanding. The precision, accuracy and sophistication of the presentation will be evaluated from the perspective of various personas, such as marketing and operations managers. Each case assignment will have an equal weight. Submissions must be provided by the next day’s class and, for the last assignment, by the Monday following the conclusion of the course.
10% Case1: 2 slides & supporting code identifying unique customer personas from the lab and methods used, presented to a “marketing executive”
10% Case2: 2 slides & supporting code recommending scheduling and cost considerations from the lab forecast presented to an “operations manager”
10% Case3: 2 slides and supporting code with Buy, hold, sell, or short recommendation for 3 equities not covered in class.
10% Case4: 2 slides, a list of the top 100 households for the offer & supporting code for the consumer propensity model for a (fictitious) credit offer
10% Case5 DUE MONDAY: 2 slides & supporting documentation for valuing prospective startup SaaS companies
50% Final Exam: multiple choice and short answer
No supplementary aids are permitted during the exam.
The examination will cover R programming, exploratory data analysis, forecasting basics, financial chart evaluation, and the interpretation and implementation of customer propensity modeling. A short section on private equity evaluation will also be included.
Examination material will be drawn from the course presentations and the course code repository.