Advanced Analytics from Data Science to Supplement a Health Actuary’s Toolkit
Program: Data Science Master's Degree
Location: La Crosse, Wisconsin (remote)
Student: Eric Engholdt
My capstone project applies several data science and machine learning techniques to health insurance claim costs and compares the results with actual 2021 experience. Specifically,
- Part 1: Use several time series forecasting methods to forecast aggregate Indexed Monthly Claim Per Member Per Month (PMPM) values for eight major markets/regions (20 products in total across all eight regions) at my company, anonymized in the final paper as the large national health insurer “XYZ”. The forecasts are also compared with the actual forecasts produced by XYZ’s health actuaries.
- Part 2: Use machine learning (ML) models to predict physician office visits and physician costs per visit, which are fundamental building blocks of healthcare costs leading to an expected Claim PMPM (2019 data is the training set; 2021 data is the test set). Predictions are not compared to actuarial forecasts in this exercise; the purpose is only to demonstrate how ML can be applied to this specific example, which a health actuary could generalize to a full Actuarial Cost Model (ACM).
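To make the Part 2 building blocks concrete, here is a minimal sketch of how a predicted utilization rate and a predicted cost per visit combine into a claim PMPM, and how a prediction can be scored against actual experience. It assumes utilization is expressed as annual visits per 1,000 members; the function names `claim_pmpm` and `pct_error` are hypothetical, and the numbers are illustrative only, not XYZ data or capstone results.

```python
def claim_pmpm(visits_per_1000: float, cost_per_visit: float) -> float:
    """Claim PMPM from annual visits per 1,000 members and cost per visit.

    Assumption: utilization is an annual rate per 1,000 members, so dividing
    by 1,000 gives visits per member per year, and dividing by 12 converts
    the annual cost per member to a monthly figure.
    """
    annual_cost_per_member = (visits_per_1000 / 1000.0) * cost_per_visit
    return annual_cost_per_member / 12.0


def pct_error(predicted: float, actual: float) -> float:
    """Signed relative error of a prediction versus actual experience."""
    return (predicted - actual) / actual


# Illustrative inputs only (not taken from the capstone data).
predicted_pmpm = claim_pmpm(visits_per_1000=3600.0, cost_per_visit=100.0)
actual_pmpm = claim_pmpm(visits_per_1000=3000.0, cost_per_visit=100.0)

print(f"Predicted PMPM: {predicted_pmpm:.2f}")
print(f"Actual PMPM:    {actual_pmpm:.2f}")
print(f"Relative error: {pct_error(predicted_pmpm, actual_pmpm):.1%}")
```

The same pattern extends to other service categories: each contributes a utilization-times-unit-cost term, and the terms sum to the total PMPM of a cost model. The error measure shown is one simple choice; the capstone itself may use a different metric.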
The advantage of my case study is that I can compare forecasts to what actually happened in 2021. In this way, my actuarial colleagues can see real results of data science applied to their own data. At the company where I perform actuarial services, actuaries are often too focused on descriptive analytics. One primary objective of my capstone was to use real company data, apply machine learning, and show how much more accurate predictive analytics methods from data science can be.