Capstone Projects

Sales Forecasting and Customer Segmentation

Program: Data Science Master's Degree
Location: Not Specified (remote)
Student: Rachel Hancock

The goal of this client-based study was to provide Company X, a kitchen and bath manufacturer, with a machine-learning forecast model to predict first-quarter 2024 sales for approximately 1,600 customers using historical data from the previous five years. Company X does not sell its products directly but through showrooms and web dealers; these dealers are considered its ‘customers.’ The motivation was to improve forecast accuracy and reduce the manual analysis required for Company X’s future forecasting efforts. Autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average (SARIMA), and long short-term memory (LSTM) network models were explored to determine the most accurate model. SARIMA was found to be the best-performing model for overall sales, and ARIMA was found to be the best-performing model for individual customer sales. The ARIMA model surpassed Company X’s current five-year forecast accuracy, while the SARIMA model did not. Customers were also classified as growing or declining based on their forecasts so that preventative action could be taken against customer churn. Additionally, customer segmentation was performed using an RFM (Recency, Frequency, Monetary) analysis to identify ‘Top Customers,’ ‘High-Value Customers,’ ‘Medium-Value Customers,’ ‘Low-Value Customers,’ and ‘Lost Customers.’ The qualities of ‘Top Customers’ were also identified to provide insight into increasing sales for underperformers.
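
To illustrate the kind of RFM segmentation described above, here is a minimal sketch in plain Python. It is not the study's actual implementation: the sample orders, the rank-based 1-to-5 scoring, the score cutoffs, and the rule that the lowest recency score marks a ‘Lost Customer’ are all assumptions made for illustration. Only the five segment names come from the abstract.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample orders: (customer_id, order_date, amount).
ROWS = [
    ("A", date(2023, 12, 20), 500.0), ("A", date(2023, 11, 10), 700.0),
    ("A", date(2023, 10, 5), 300.0), ("A", date(2023, 9, 1), 200.0),
    ("B", date(2023, 12, 1), 400.0), ("B", date(2023, 8, 15), 350.0),
    ("B", date(2023, 6, 1), 100.0),
    ("C", date(2023, 10, 15), 250.0), ("C", date(2023, 5, 1), 150.0),
    ("D", date(2023, 7, 1), 90.0), ("D", date(2023, 3, 1), 60.0),
    ("E", date(2022, 11, 1), 40.0),
]

def rfm_table(rows, as_of):
    """Return {customer: (days since last order, order count, total spend)}."""
    last, freq, money = {}, defaultdict(int), defaultdict(float)
    for cust, day, amount in rows:
        last[cust] = max(last.get(cust, day), day)
        freq[cust] += 1
        money[cust] += amount
    return {c: ((as_of - last[c]).days, freq[c], money[c]) for c in last}

def rank_score(values, reverse=False, bins=5):
    """Score each key from 1 (worst) to bins (best) by its rank.

    Ties are broken by the order in which customers first appear.
    """
    ordered = sorted(values, key=values.get, reverse=reverse)
    n = len(ordered)
    return {k: 1 + (i * bins) // n for i, k in enumerate(ordered)}

def segment(r, f, m):
    """Map R, F, M scores to a segment label (cutoffs are assumptions)."""
    if r == 1:  # assumed rule: least recent buyers are treated as lost
        return "Lost Customers"
    total = r + f + m
    if total >= 13:
        return "Top Customers"
    if total >= 10:
        return "High-Value Customers"
    if total >= 7:
        return "Medium-Value Customers"
    return "Low-Value Customers"

def segment_customers(rows, as_of):
    table = rfm_table(rows, as_of)
    # Fewer days since the last order is better, so sort recency descending.
    r = rank_score({c: v[0] for c, v in table.items()}, reverse=True)
    f = rank_score({c: v[1] for c, v in table.items()})
    m = rank_score({c: v[2] for c, v in table.items()})
    return {c: segment(r[c], f[c], m[c]) for c in table}

segments = segment_customers(ROWS, as_of=date(2023, 12, 31))
for customer, label in sorted(segments.items()):
    print(customer, label)
```

In practice the cutoffs would be tuned to the client's data, and with roughly 1,600 customers a quantile-based score per dimension would likely replace the simple rank binning used here.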