Healthcare Insurance Fraud Detection Using Machine Learning
Program: Data Science Master's Degree
Location: Not Specified (remote)
Student: Khadidja Malaïka Touré
This capstone project investigated how data science can be used to detect fraudulent activity in the health insurance sector. Insurance providers face significant financial losses from anomalies such as duplicate claims, phantom billing, inflated charges, and prescription fraud, so we set out to build a data-driven system that detects fraudulent claims early, improves the auditing process by targeting high-risk claims, and prevents financial losses in the insurance sector. The project used a dataset that closely mirrors real-life data and was completed in four phases: Data Cleaning and Preprocessing, Exploratory Data Analysis, Time Series Analysis, and Model Development. During these phases, we examined key features for their relationship with fraudulent behavior and analyzed seasonal patterns to uncover temporal trends. We then trained several machine-learning models and compared their performance to select the one best suited to fraud detection. At the end of the project, we evaluated the selected fraud detection model, highlighted the most important features to consider when investigating fraud, and made recommendations for integrating the fraud detection solution into the claims processing environment.