Capstone Projects

Utilizing Natural Language Processing Techniques to Drive Source to Target Mappings in ETL Processes

Program: Data Science Master's
Host Company: Cogitativo, Inc.
Location: Berkeley, California (onsite)
Student: Shawn Chapler

This project explores the use of NLP techniques to drive source-to-target mappings in ETL (extract, transform, load) processes. It explores a noisy channel model for resolving abbreviations, so that common tokens can be created across terms. It also explores whether topic modeling techniques can be used to identify sub-topics within file headers. Although the focus is on the healthcare domain, both the techniques and the findings can be applied more broadly.
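
To illustrate the noisy channel idea mentioned above, here is a minimal Python sketch. It is not the project's implementation. It assumes a tiny hypothetical vocabulary with made-up counts as the prior, and a simple channel model in which an abbreviation is produced by keeping a word's first letter and independently dropping later characters. The names `VOCAB_COUNTS`, `expand`, `normalize_header`, and the `p_drop` parameter are all invented for this example.

```python
import math

# Hypothetical vocabulary with made-up counts; stands in for a real prior P(word).
VOCAB_COUNTS = {
    "patient": 50,
    "payment": 30,
    "provider": 40,
    "procedure": 20,
    "date": 60,
    "diagnosis": 25,
    "number": 45,
}


def is_subsequence(abbr, word):
    """Return True if the characters of abbr appear in word in the same order."""
    it = iter(word)
    return all(ch in it for ch in abbr)


def channel_log_prob(abbr, word, p_drop=0.3):
    """Assumed channel model: log P(abbr | word).

    The abbreviation must share the word's first letter and be a subsequence
    of it. Each dropped character costs log(p_drop); each kept character
    costs log(1 - p_drop).
    """
    if not abbr or abbr[0] != word[0] or not is_subsequence(abbr, word):
        return float("-inf")
    dropped = len(word) - len(abbr)
    kept = len(abbr)
    return dropped * math.log(p_drop) + kept * math.log(1 - p_drop)


def expand(abbr, vocab_counts=VOCAB_COUNTS):
    """Pick the word maximizing log P(word) + log P(abbr | word), or None."""
    abbr = abbr.lower()
    total = sum(vocab_counts.values())
    best_word, best_score = None, float("-inf")
    for word, count in vocab_counts.items():
        score = math.log(count / total) + channel_log_prob(abbr, word)
        if score > best_score:
            best_word, best_score = word, score
    return best_word


def normalize_header(header):
    """Split a file header on underscores or hyphens and expand each token.

    Tokens with no candidate expansion are kept unchanged.
    """
    tokens = header.lower().replace("-", "_").split("_")
    return [expand(t) or t for t in tokens if t]


if __name__ == "__main__":
    for header in ["PT_DT", "prov_num", "pt_dx"]:
        print(header, "->", normalize_header(header))
```

With this toy vocabulary, both "patient" and "payment" are valid candidates for `pt`; the prior breaks the tie in favor of the more frequent word. A realistic system would estimate both the prior and the channel probabilities from domain data rather than hard-coding them.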