ML-Based Employee Absence Prediction for Manufacturing Risk Management
Program: Data Science Master's Degree
Location: Fort Collins, Colorado (onsite)
Student: Trevor Nelson
This project addressed a critical operational challenge for a US-based manufacturing company: unplanned employee absences that disrupt production schedules and customer deliveries. Unlike inventory or tooling shortages, which usually give some advance warning, staffing gaps caused by no-shows appear without notice and leave managers relying on gut-feel heuristics when planning daily operations.
Project Objectives
1. Train a classification model that achieves at least 70% precision on test data
2. Provide a staffing forecasting tool that predicts no-show likelihood one day in advance
3. Ensure model explainability for legal and ethical compliance in employee-related decisions
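Since the first objective is stated in terms of precision, a minimal sketch of that metric may help: precision is the share of predicted no-shows that turn out to be real no-shows, TP / (TP + FP). The function names, the toy labels, and the encoding (1 = no-show, 0 = attended) are illustrative assumptions, not the project's code or data.

```python
def precision(y_true, y_pred):
    """Precision = TP / (TP + FP); returns 0.0 when nothing is flagged."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged no-shows
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def meets_target(y_true, y_pred, target=0.70):
    """Check a set of predictions against the 70% precision objective."""
    return precision(y_true, y_pred) >= target


# Toy labels, made up for illustration only (1 = no-show, 0 = attended).
y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

print(precision(y_true, y_pred))     # 3 true positives, 1 false positive
print(meets_target(y_true, y_pred))
```

Precision ignores missed no-shows (false negatives), so it measures how trustworthy an alert is, not how many absences are caught.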
Using PySpark on distributed computing infrastructure, I developed a Random Forest model that achieved 74% precision despite extreme class imbalance (a 0.15% positive rate). The pipeline processed tens of millions of records from the company’s workforce management system and relied on engineered features such as rolling-window statistics and team-level behavioral metrics.
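To make the rolling-window idea concrete, here is a minimal plain-Python sketch of one such feature: the number of absences an employee had in the days before the day being predicted. The project itself used PySpark; this single-machine version only illustrates the logic. The record layout, the 7-day window, and all names are assumptions made for the example. The current day's outcome is deliberately excluded from its own feature, since a one-day-ahead forecast cannot see it.

```python
from collections import defaultdict, deque


def rolling_absence_counts(records, window_days=7):
    """Count each employee's absences in the `window_days` days before each record.

    records: iterable of (employee_id, day_index, absent) tuples, absent in {0, 1}.
    Returns {(employee_id, day_index): prior_absence_count}.
    """
    history = defaultdict(deque)  # employee_id -> recent (day, absent) pairs
    features = {}
    for emp, day, absent in sorted(records, key=lambda r: (r[0], r[1])):
        window = history[emp]
        # Drop observations older than the window.
        while window and window[0][0] < day - window_days:
            window.popleft()
        # The feature uses past days only; today's label is added afterwards.
        features[(emp, day)] = sum(a for _, a in window)
        window.append((day, absent))
    return features


# Hypothetical records: (employee_id, day_index, absent)
records = [
    ("e1", 1, 1), ("e1", 2, 0), ("e1", 3, 1), ("e1", 10, 0),
    ("e2", 1, 0), ("e2", 2, 1),
]
features = rolling_absence_counts(records)
print(features[("e1", 10)])  # only the day-3 absence is still inside the window
```

The same per-employee, ordered, bounded-lookback pattern can be computed for team-level metrics by grouping on a team key instead of an employee key.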
The solution lets managers replace unreliable staffing assumptions with data-driven forecasts and plan mitigation proactively, for example by shifting employees between production lines. Model interpretability requirements drove the selection of tree-based algorithms, so that leaders can understand how predictions are generated while the company maintains compliance with employee privacy standards.