TNBC Biomarkers and Gene Expression: Understanding Subtype-Specific Data Through Machine Learning
Program: Applied Biotechnology Master's Degree
Host Company: Ann and Robert H. Lurie Children’s Hospital
Location: Chicago, Illinois (hybrid)
Student: Rosetta Sellers-Varela
Triple-negative breast cancer (TNBC) is highly invasive and prone to early recurrence. Black women are disproportionately affected by the disease, experiencing a 40% higher death rate. An mRNA microarray provides a snapshot of transcript activity in a tumor sample, and interpreting these data is critical to treatment efficacy and positive outcomes. The perform_analysis() algorithm takes an Excel table of mRNA expression assay data as input and returns a summary, boxplots, a heatmap with dendrogram, and a principal component analysis (PCA) that together highlight trends in the assay data. The unsupervised machine learning methods used in the algorithm identify clusters with similar gene expression, which can assist with early detection, the discovery of subtype-specific mutations, and the understanding of targeted drug responses.
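The numerical core of the analysis described above can be illustrated with a minimal sketch. This is not the project's perform_analysis() implementation, whose internals are not given here. The function name sketch_analysis, its parameters, the choice of Ward linkage, and the z-score standardization are all assumptions made for illustration, and the input is synthetic data generated in the example, not patient data. Plotting is omitted. In practice the table might be loaded with pandas.read_excel and the clustered heatmap drawn with a tool such as seaborn.clustermap.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def sketch_analysis(expr, n_clusters=2, n_components=2):
    """Hypothetical stand-in for perform_analysis(): summary, clustering, PCA.

    expr: 2-D array with rows = samples and columns = genes
          (log-scale expression values are assumed).
    """
    expr = np.asarray(expr, dtype=float)

    # Per-gene five-number summary: the values a boxplot would display.
    summary = np.percentile(expr, [0, 25, 50, 75, 100], axis=0)

    # Standardize each gene so high-variance genes do not dominate.
    std = expr.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant genes
    z = (expr - expr.mean(axis=0)) / std

    # Agglomerative clustering of samples (Ward linkage, an assumption).
    # The linkage matrix is what a dendrogram next to a heatmap is drawn from.
    tree = linkage(z, method="ward")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # PCA via singular value decomposition of the centered, scaled matrix.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)

    return {
        "summary": summary,
        "linkage": tree,
        "labels": labels,
        "pca_scores": scores,
        "explained_variance_ratio": explained,
    }


# Synthetic example (NOT real assay data): 12 samples x 12 genes, where the
# last 6 samples have 10 genes shifted upward to mimic a distinct subtype.
rng = np.random.default_rng(0)
data = rng.normal(loc=8.0, scale=1.0, size=(12, 12))
data[6:, :10] += 5.0

result = sketch_analysis(data)
print(result["labels"])
print(result["explained_variance_ratio"][:2])
```

In this sketch the cluster labels and the first principal component both reflect the planted group difference, which is the kind of trend the heatmap, dendrogram, and PCA plot are meant to make visible.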