Capstone Projects

Redaction of Protected Health Information (PHI) / Personally Identifiable Information (PII) from X-Ray Files

Program: Data Science Master's
Location: Not Specified (onsite)
Student: Abhishek Alexander

The purpose of this capstone project is to design and develop a machine learning model that identifies and redacts PHI/PII from X-ray files while maintaining the integrity of each file. The model will replace the manual process of redacting PHI/PII from X-ray files, mitigating the risk of inadvertent exposure through manual error or oversight.

The machine learning model will also help the organization comply with the patient privacy requirements laid down in the HIPAA regulations.

The objective of the project is to design and develop a machine learning model to: 

  • Identify the regions of the image file that contain sensitive PHI/PII.
  • Redact the identified PHI/PII from the image file.
  • Ensure the integrity of the image remains unaffected after the PHI/PII is redacted.
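
The second and third objectives can be illustrated with a minimal sketch. It is not the project's implementation: it assumes the identification step has already produced bounding boxes for the PHI/PII regions, represents the image as a plain grid of grayscale values rather than a real X-ray file, and treats "integrity" as meaning that the image dimensions and every pixel outside the redacted boxes are unchanged. The names redact, integrity_preserved, Image, and Box are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple

# Assumptions for illustration only: a grayscale image stored as rows of
# integers, and boxes given as (top, left, bottom, right) with the bottom
# and right edges exclusive. In the project the boxes would come from the
# identification model; here they are supplied by hand.
Image = List[List[int]]
Box = Tuple[int, int, int, int]


def redact(image: Image, boxes: List[Box], fill: int = 0) -> Image:
    """Return a copy of the image with every box overwritten by `fill`."""
    out = [row[:] for row in image]  # copy so the original stays untouched
    for top, left, bottom, right in boxes:
        for r in range(max(top, 0), min(bottom, len(out))):
            for c in range(max(left, 0), min(right, len(out[r]))):
                out[r][c] = fill
    return out


def inside_any(r: int, c: int, boxes: List[Box]) -> bool:
    """True if pixel (r, c) lies inside at least one box."""
    return any(t <= r < b and l <= c < rt for t, l, b, rt in boxes)


def integrity_preserved(original: Image, redacted: Image, boxes: List[Box]) -> bool:
    """True if dimensions match and all pixels outside the boxes are unchanged."""
    if len(original) != len(redacted):
        return False
    for r, (row_o, row_r) in enumerate(zip(original, redacted)):
        if len(row_o) != len(row_r):
            return False
        for c, (p_o, p_r) in enumerate(zip(row_o, row_r)):
            if not inside_any(r, c, boxes) and p_o != p_r:
                return False
    return True


# Toy 4x6 "image" with a pretend PHI/PII label in the top-left corner.
image = [[10 * r + c for c in range(6)] for r in range(4)]
boxes = [(0, 0, 1, 3)]

redacted = redact(image, boxes)
print(redacted[0])
print(integrity_preserved(image, redacted, boxes))
```

Checking the pixels outside the boxes is only one possible reading of image integrity. A real pipeline would likely also need to handle text stored in the file's metadata, which this sketch does not address.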

This project addresses the risk of inadvertently exposing sensitive information to unauthorized people. The machine learning model will automatically redact sensitive information such as PHI/PII from X-ray files. A successful implementation can lead to the following:

  • Protection of Patient Privacy: Automated redaction will mitigate the risk of unauthorized access to PHI/PII when X-ray files are shared outside the health provider’s domain. This will also help maintain trust between the patient and the healthcare provider.
  • HIPAA Compliance: The model will also mitigate ethical and legal risks by helping the organization remain in compliance with HIPAA and other patient data privacy laws and regulations.