Universities of Wisconsin
Wisconsin Online Collaboratives

The Development of an Augmented Data Management Package to Aid in the Development of Extract, Transform, and Load Processes

Program: Data Science Master's Degree
Location: Not Specified (onsite)
Student: David Kendall

In many extract, transform, and load (ETL) implementations, parts of the development effort are candidates for automation. When data is stored in a relational database, primary and foreign keys must be discovered. When data is stored in a graph database, candidate nodes and edges must be discovered. Once these steps are complete, data quality must be considered: the data must be analyzed for inconsistencies and cleaned before it is loaded into any destination system. This project attempts to create a package that uses machine learning and other computational methods to optimize and improve these steps, a practice that is becoming known as Augmented Data Management. Various software tools already perform these functions. The goal of this project is to show how a package that performs them can be built from scratch and eventually incorporated into regular ETL processes. The method used to identify key relationships depends on the storage implementation chosen: primary and foreign keys for relational databases, and nodes and edges for graph databases. A measure called the Wharf Coefficient is used to find potential graph relationships, while the ratio of unique values to dataset rows is used to determine primary and foreign keys. Finally, machine learning methods are used to detect anomalies in the datasets. These methods include Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Isolation Forest, and Local Outlier Factor (LOF). The results of each method are discussed in detail and provide a glimpse into the capabilities of Python programming.
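The abstract does not include the package's code, so the following is only a minimal sketch of how a ratio of unique values to dataset rows might flag key candidates. The function names, the dictionary-of-columns table layout, the threshold of 1.0, and the sample data are all hypothetical. The foreign-key check shown here (every non-null child value appears in the parent key column) is an assumed stand-in, not the project's stated method.

```python
from typing import Dict, List, Sequence

# Hypothetical table layout: column name -> list of column values.
Table = Dict[str, Sequence]


def uniqueness_ratio(values: Sequence) -> float:
    """Distinct non-null values divided by the total number of rows."""
    if not values:
        return 0.0
    non_null = [v for v in values if v is not None]
    return len(set(non_null)) / len(values)


def primary_key_candidates(table: Table, threshold: float = 1.0) -> List[str]:
    """Columns whose uniqueness ratio meets the (assumed) threshold."""
    return [name for name, col in table.items()
            if uniqueness_ratio(col) >= threshold]


def foreign_key_candidates(child: Table, parent: Table, parent_key: str) -> List[str]:
    """Child columns whose non-null values all occur in the parent key column.

    This containment test is an illustrative assumption, not the source's method.
    """
    parent_values = set(parent[parent_key])
    candidates = []
    for name, col in child.items():
        non_null = [v for v in col if v is not None]
        if non_null and set(non_null) <= parent_values:
            candidates.append(name)
    return candidates


# Made-up sample data for illustration only.
customers = {
    "customer_id": [1, 2, 3, 4],
    "city": ["Madison", "Madison", "Milwaukee", "Eau Claire"],
}
orders = {
    "order_id": [10, 11, 12, 13, 14],
    "customer_id": [1, 1, 3, 4, 4],
    "amount": [5.0, 7.5, 5.0, 9.0, 2.5],
}

pk_customers = primary_key_candidates(customers)   # ["customer_id"]
pk_orders = primary_key_candidates(orders)         # ["order_id"]
fk_orders = foreign_key_candidates(orders, customers, "customer_id")  # ["customer_id"]
```

In practice a threshold slightly below 1.0 might be preferable for dirty data, since a true key column can contain a few duplicated or missing rows before cleaning.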


Copyright © 2026 Board of Regents of the University of Wisconsin System.