Capstone Projects

Evaluate Forecasting Technique to Predict Sales and Optimize Supply Chain

Program: Data Science Master's
Location: Not Specified (remote)
Student: Pranshu Tiwari

This study evaluated time-series forecasting methods on an online retail store dataset to predict sales for different product categories and regions. The methods explored included statistical and machine learning (ML) approaches applied to univariate and multivariate data at daily and monthly frequencies. The time-series data are grouped by product and region so that users can decide whether to increase sales or to ship items from adjacent regions or warehouses for a more efficient supply chain. For univariate forecasting, the methods included autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average with exogenous factors (SARIMAX), and long short-term memory (LSTM) artificial neural network models.
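As a rough illustration of the grouping step described above, the sketch below aggregates order-level sales into daily and monthly series per product category and region using pandas. The column names (`order_date`, `category`, `region`, `sales`), the helper `to_series`, and the sample rows are hypothetical and are not taken from the project's dataset.

```python
import pandas as pd

# Hypothetical order-level records; the real dataset's schema is not given in the source.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(
        ["2017-01-03", "2017-01-03", "2017-01-20", "2017-02-11", "2017-02-15"]
    ),
    "category": ["Furniture", "Furniture", "Furniture", "Furniture", "Technology"],
    "region": ["West", "West", "West", "West", "East"],
    "sales": [100.0, 50.0, 25.0, 80.0, 300.0],
})


def to_series(frame: pd.DataFrame, freq: str) -> pd.Series:
    """Sum sales per (category, region) at the given frequency.

    freq="D" gives daily totals; freq="MS" gives monthly totals
    labelled by the first day of each month.
    """
    keys = ["category", "region", pd.Grouper(key="order_date", freq=freq)]
    return frame.groupby(keys)["sales"].sum()


daily = to_series(orders, "D")
monthly = to_series(orders, "MS")

# One univariate series for a single product category and region.
furniture_west = monthly.loc[("Furniture", "West")]
print(furniture_west)
```

Each (category, region) slice of `monthly` or `daily` is a univariate series that could then be passed to a forecasting model such as ARIMA, SARIMAX, or an LSTM.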

The project aimed to:

  • Determine the efficacy of different ARIMA, SARIMAX, and recurrent neural network (RNN) models for monthly forecasting by product type
  • Compare the efficacy of the univariate forecasting models above with that of an RNN-based multivariate forecasting model that leverages economic data
  • Determine the efficacy of multivariate daily forecasting when external regressors are added, such as Reddit news encoded with a neural-network-based Doc2Vec (document-to-vector) model. The Doc2Vec model is derived from Reddit news using the gensim library in Python.
  • Determine which product and geography combinations show persistently increasing sales during the validation period, and determine the confidence level associated with the forecast
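One way the aims above could be checked is sketched below: candidate forecasts are scored against held-out validation values, and a simple test flags whether a series rises at every step. The source does not say which error metrics or trend criterion the project used, so RMSE, MAPE, and the strict step-by-step increase rule are assumptions, and all function names and numbers are made up for illustration.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


def mape(actual, forecast):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100.0 * sum(
        abs((a - f) / a) for a, f in zip(actual, forecast)
    ) / len(actual)


def rank_models(actual, forecasts):
    """Return (model name, RMSE) pairs sorted from best to worst."""
    scores = {name: rmse(actual, values) for name, values in forecasts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


def persistently_increasing(values):
    """True if every value is strictly greater than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Made-up validation data for one product and region (not project results).
actual = [100.0, 110.0, 120.0]
forecasts = {
    "model_a": [100.0, 110.0, 120.0],
    "model_b": [90.0, 100.0, 110.0],
}

ranking = rank_models(actual, forecasts)
best_name, best_rmse = ranking[0]
rising = persistently_increasing(forecasts[best_name])
print(best_name, best_rmse, rising)
```

The same scoring can be repeated per product and region at daily or monthly frequency, so that univariate and multivariate models are compared on the same validation window.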