Capstone Projects

Predicting Wine Quality with Machine Learning: Insights for Wineries & Retailers

Program: Data Science Master's Degree
Location: Not Specified (remote)
Student: Haosheng Chen

The project explored how data science can support decision-making in the wine industry. The primary objective was to build a machine learning pipeline that predicts wine quality scores, compare several models, assess the impact of customer review sentiment, and identify the model best suited to practical use. Using a dataset of 156,731 wine records, the study combined structured features such as winery, price, and designation with unstructured customer reviews. A preprocessing pipeline first handled missing values and encoded categorical features, and sentiment analysis then extracted signals from the review text. Several regression models, including Random Forest, XGBoost, KNN, and neural networks, were trained and evaluated to compare their effectiveness. The findings gave wineries actionable strategies for improving marketing and pricing and gave retailers strategies for optimizing inventory and purchasing decisions, and they highlighted the importance of key features such as branding. Through this work, I gained hands-on experience in data preprocessing, model tuning, and translating technical insights into business applications. Future students can expect to develop similar skills while tackling real-world problems and to learn the value of iterative analysis and clear communication in data science projects.
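
The workflow described above (impute missing values, encode categorical features, then compare regression models on held-out data) can be illustrated with a minimal sketch. This is not the project's actual code: the records, the two-feature layout, and all function names are hypothetical. Only a mean baseline and a simple KNN regressor are shown, using the Python standard library. The sentiment step and the Random Forest, XGBoost, and neural network models are left out.

```python
import math
import statistics

# Hypothetical toy records: (winery, price, quality score). None marks a missing price.
records = [
    ("A", 10.0, 84.0),
    ("A", 12.0, 85.0),
    ("B", 40.0, 91.0),
    ("B", None, 90.0),
    ("A", 11.0, 84.0),
    ("B", 45.0, 92.0),
    ("A", 9.0, 83.0),
    ("B", 42.0, 91.0),
]

def fit_preprocessor(train_rows):
    """Learn the median price and the winery vocabulary from training rows only."""
    median_price = statistics.median(p for _, p, _ in train_rows if p is not None)
    wineries = sorted({w for w, _, _ in train_rows})
    return median_price, wineries

def transform(rows, median_price, wineries):
    """Impute missing prices with the median and one-hot encode the winery."""
    features, targets = [], []
    for winery, price, score in rows:
        price = median_price if price is None else price
        one_hot = [1.0 if winery == w else 0.0 for w in wineries]
        features.append([price] + one_hot)
        targets.append(score)
    return features, targets

def predict_mean(train_x, train_y, x):
    """Baseline: always predict the mean training score."""
    return statistics.fmean(train_y)

def predict_knn(train_x, train_y, x, k=3):
    """Average the scores of the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return statistics.fmean(y for _, y in nearest)

def rmse(predict, train_x, train_y, test_x, test_y):
    """Root mean squared error of a predictor on the held-out rows."""
    errors = [(predict(train_x, train_y, x) - y) ** 2 for x, y in zip(test_x, test_y)]
    return math.sqrt(statistics.fmean(errors))

# Hold out the last two rows; fit the preprocessing on the training rows only.
train_rows, test_rows = records[:6], records[6:]
median_price, wineries = fit_preprocessor(train_rows)
train_x, train_y = transform(train_rows, median_price, wineries)
test_x, test_y = transform(test_rows, median_price, wineries)

scores = {
    "mean baseline": rmse(predict_mean, train_x, train_y, test_x, test_y),
    "knn (k=3)": rmse(predict_knn, train_x, train_y, test_x, test_y),
}
best_model = min(scores, key=scores.get)  # model with the lowest held-out RMSE
```

Fitting the imputation and encoding on the training rows alone keeps information from the held-out rows out of the comparison. At the scale of the full dataset, library implementations of these steps and models would normally replace the hand-written versions here.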