Universities of Wisconsin
Wisconsin Online Collaboratives

Topic Modeling and Feature Extraction of Medical School Admissions Essays

Program: Data Science Master's Degree
Location: Minnesota (onsite)
Student: Jackie Dockendorf

The medical education continuum generates large amounts of both structured and unstructured data, yet unstructured data is used in research far less often than structured data. Raw text cannot be used directly in most traditional statistical and machine learning analyses; it must be transformed first. A review of the literature found previous work that applied thematic, text mining, and computational linguistic analyses to higher education admissions essays, as well as work that used the output of similar analyses to predict different types of outcomes. This paper discusses how a set of features was extracted from 1,361 medical school admissions personal statements. The purpose was to derive features from otherwise unstructured text that could be linked to student, clinical, and workforce outcomes. Methods included natural language processing techniques and unsupervised machine learning topic modeling methods, including Latent Dirichlet Allocation and Non-Negative Matrix Factorization. The resulting topic model had interpretable topics, which gave insight into the contents of the personal statements. The model was then applied to the dataset of essays to create a feature vector, which was exported. The results have the potential to serve as input to other studies, and the methods could be replicated for similar unstructured text data.

Copyright © 2026 Board of Regents of the University of Wisconsin System. | Privacy Policy