Capstone Projects

Classification of Podcasts using Content-Based and Context-Based Data

Program: Data Science Master's
Location: Not Specified (onsite)
Student: Samuel A. Bailey

This case study applies supervised and unsupervised natural language processing techniques for text classification to compare the utility of larger podcast transcript data with that of smaller podcast metadata for the task of categorization. Transcripts and metadata from 300 podcasts across 10 categories were used to evaluate different text-classification methods. Understanding how transcripts and metadata differ in their usefulness for category classification matters to a podcast streaming platform, because it helps the platform focus its time, effort, and investment on properly categorizing audio content to deliver an easy-to-navigate, personalized user experience.
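
The comparison described above can be illustrated with a minimal sketch. The abstract does not name the specific classifiers used, so a small bag-of-words Naive Bayes model stands in here as an assumption, and only the supervised side is shown. The podcast records, field names (`transcript`, `metadata`, `category`), and helper names are all hypothetical, and the toy data says nothing about the study's actual findings.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            n_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def accuracy(model, texts, labels):
    """Fraction of texts whose predicted category matches the true one."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)

# Hypothetical toy records: invented for illustration, not taken from the study.
train = [
    {"category": "sports", "metadata": "weekly sports talk",
     "transcript": "the team scored a late goal and the coach praised the defense"},
    {"category": "sports", "metadata": "match recap show",
     "transcript": "players trained hard before the match and the goal keeper saved a penalty"},
    {"category": "cooking", "metadata": "kitchen tips weekly",
     "transcript": "simmer the sauce then add garlic and salt to the pan"},
    {"category": "cooking", "metadata": "home recipes show",
     "transcript": "bake the bread and season the soup with salt and pepper"},
]
test = [
    {"category": "sports", "metadata": "weekly match recap",
     "transcript": "the coach said the goal changed the match"},
    {"category": "cooking", "metadata": "home kitchen tips",
     "transcript": "add salt and garlic to the soup"},
]

def evaluate(field):
    """Train on one text field and score the held-out podcasts on the same field."""
    model = NaiveBayes().fit([p[field] for p in train],
                             [p["category"] for p in train])
    return accuracy(model, [p[field] for p in test],
                    [p["category"] for p in test])

transcript_acc = evaluate("transcript")
metadata_acc = evaluate("metadata")
print(f"transcript accuracy: {transcript_acc:.2f}")
print(f"metadata accuracy:   {metadata_acc:.2f}")
```

Training the same model once per text source and scoring both on the same held-out podcasts keeps the comparison about the data rather than the classifier. With a real corpus, the two accuracy figures would be the quantities to compare.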