Capstone Projects

Predictive Sales Forecasting Through the Use of Non-Parametric Models

Program: Data Science Master's
Location: Not Specified (remote)
Student: Nicolas Brown

Current sales forecasting techniques are flawed in that they rely on the intuition of the sales staff. The common approach is to estimate the probability that a sale will close and multiply that probability by the sale amount. Sales forecasting may benefit from using machine learning models to predict whether a sale will be won. This data-driven approach could increase forecasting accuracy, which would improve the quality of resource planning and allocation. This study evaluated four base models, Random Forest, Gradient Boosting, XGBoost, and Decision Tree, each with and without SMOTE oversampling. The goal was to identify which models provided the best precision, recall, and F1 scores. The model with the best precision can be used as the lower estimate of the sales forecast, the model with the best recall as the upper estimate, and the model with the best F1 score as the average estimate. Random Forest with SMOTE provided the best precision, Decision Tree with SMOTE the best recall, and Decision Tree without SMOTE the best F1 score. The number of days since the last activity and the number of days a sale has been in the same stage were the top two predictors of winning a sale. By minimizing these two values, the sales team may be able to improve its win rate and thereby increase revenue.
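To make the quantities in the abstract concrete, the following is a minimal sketch in Python of the probability-weighted forecast and of the precision, recall, and F1 calculations. It is not the study's code: the function names are hypothetical, and the deal amounts, probabilities, and labels are made-up illustrative values.

```python
def weighted_forecast(deals):
    """Traditional forecast: sum of (close probability * sale amount)."""
    return sum(probability * amount for probability, amount in deals)


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = sale won)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted won, was won
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted won, was lost
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted lost, was won
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Illustrative (made-up) open deals: (estimated close probability, amount)
deals = [(0.5, 1000.0), (0.25, 4000.0)]
forecast = weighted_forecast(deals)

# Illustrative (made-up) actual outcomes and model predictions
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

print(f"forecast={forecast:.2f}")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

One plausible reading of the lower and upper estimates is that a high-precision model flags few losing deals as wins, so the total value of its predicted wins tends to be conservative, while a high-recall model misses few actual wins at the cost of more false positives, so its total tends to be optimistic.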