Predicting Data Pipeline Failures and Monitoring Data Quality Through Predictive Analytics
Program: Data Science Master's Degree
Location: Dallas, Texas (remote)
Student: Rishabh Gulati
The goal of this capstone project is to improve on traditional data quality methods for data integration in data warehouses and data lakes. Traditional methods are predominantly reactive: teams address pipeline failures only after they occur, which leads to service outages and diminished customer confidence. This project aims to shift to a proactive approach, using data quality monitoring and predictive analytics to detect potential failures before they occur, thereby reducing downtime and improving the reliability of data integration processes.