Capstone Projects

Classification and Data Wrangling on the Amateur Astronomy Frontier

Program: Data Science Master's
Location: Eau Claire, Wisconsin (remote)
Student: Zachary Jacobson

AstroBin is an image hosting platform built for astrophotographers. It is unprecedented in its focus on collecting and organizing the metadata surrounding each uploaded image: the acquisition details, the equipment used to take the image, the post-processing steps performed, and even the average phase of the moon during which the photos were taken. Having these details organized and structured in a queryable form is an invaluable resource for astrophotographers at all skill levels.

This client-based capstone sought to improve the metadata collection efforts of the AstroBin website through two deliverables: an Elasticsearch synonym filter of space objects and their alias names and IDs, and an image title text classification model that could provide an automated layer of image classification during a user’s image upload process. These deliverables were achieved through four objectives: 1) a comprehensive exploration of text classification modeling techniques and training data augmentations; 2) a thorough review and grading of each modeling technique and training data augmentation explored, leading to the final selection of the deliverable classification model; 3) a process for compiling space object aliases into the Solr-formatted synonym filter required by the Elasticsearch search engine, which depended upon 4) a novel data structure mapping all space objects to their respective aliases (a Space Object Alias Map).
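
As a rough illustration of how objectives 3 and 4 fit together, the sketch below flattens a small alias map into Solr-format equivalent-synonym lines (comma-separated terms, one group per line). The map contents, the dictionary layout, and the names ALIAS_MAP and to_solr_synonyms are assumptions made for this example, not the project's actual data structure or code. The sketch also assumes that no alias contains a comma or the "=>" mapping operator.

```python
from typing import Dict, List

# Hypothetical stand-in for a Space Object Alias Map:
# one primary identifier mapped to its known alias names and IDs.
ALIAS_MAP: Dict[str, List[str]] = {
    "M31": ["Messier 31", "NGC 224", "Andromeda Galaxy", "m31"],
    "M42": ["Messier 42", "NGC 1976", "Orion Nebula"],
    "NGC 9999": [],  # no aliases, so no synonym line is produced
}


def to_solr_synonyms(alias_map: Dict[str, List[str]]) -> List[str]:
    """Return one Solr-format equivalent-synonym line per space object."""
    lines: List[str] = []
    for primary, aliases in alias_map.items():
        terms: List[str] = []
        seen = set()
        for term in [primary, *aliases]:
            cleaned = term.strip()
            key = cleaned.casefold()  # de-duplicate case-insensitively
            if cleaned and key not in seen:
                seen.add(key)
                terms.append(cleaned)
        # A synonym group needs at least two terms to be meaningful.
        if len(terms) > 1:
            lines.append(", ".join(terms))
    return lines


SYNONYM_LINES = to_solr_synonyms(ALIAS_MAP)

if __name__ == "__main__":
    # Each printed line could be one entry in a synonyms file.
    print("\n".join(SYNONYM_LINES))
```

Lines of this shape can be supplied to an Elasticsearch synonym token filter, which accepts Solr-format rules. How the real alias map is stored and compiled is not specified here, so this should be read only as one plausible arrangement.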