Universities of Wisconsin
Wisconsin Online Collaboratives

Handling Missing Data: Effects of Different Approaches on the Performance of Predictive Models Built on Complex Big Data Datasets

Program: Data Science Master's Degree
Location: Not Specified (remote)
Student: David Vlosak

The purpose of this project was to explore how different missing value imputation (MVI) approaches affect the performance of predictive binary-classification models built on imputed, complex, large (Big Data) datasets. Using a simulation-based experimental strategy within a postpositivist framework, the project found that predictive performance on binary-classification problems depended on the combination of MVI approach (bagging, KNN, mixed models), model (logistic regression, naïve Bayes, boosted trees), and dataset characteristics (number of cases and predictors, feature data types, missingness mechanisms, distribution of missing values among predictor data types, and missing-data rates). On the one hand, MVI approaches that blend different imputation methods (e.g., mixed models) produced the most accurate predictive performance regardless of model type or dataset characteristics (e.g., missing-data rate). Similarly, predictive performance for imputed complex datasets that initially had a 25% missing-data rate was relatively accurate regardless of model and imputation type. On the other hand, MVI approaches that use a single imputation method (e.g., bagging and KNN) produced predictive performance that varied with the model used and with dataset characteristics (e.g., missing-data rate). Predictive performance was evaluated using overall classification accuracy (OCA), and the trustworthiness of the OCA values was confirmed with the accuracy variation percentage (AVP) metric. The project findings help fill a gap in the literature by including complex datasets in the study of how MVI approaches and models affect binary-classification predictive performance.
The project findings also contribute to data-practitioner praxis by identifying some combinations of MVI approaches, models, and dataset characteristics that are likely to result in relatively accurate predictive performance and other combinations that might be best to avoid.
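
The general shape of such a simulation experiment can be illustrated with a minimal Python sketch. It is not the project's code, and every name in it (`make_dataset`, `inject_mcar`, `impute_mean`, `impute_knn`, `overall_accuracy`, `run_experiment`) is hypothetical. To stay dependency-free, it assumes a small synthetic dataset with two numeric predictors, a missing-completely-at-random mechanism, illustrative missing-data rates of 10%, 25%, and 50%, and a nearest-centroid classifier standing in for the logistic regression, naïve Bayes, and boosted-tree models. Mean imputation and a simple KNN imputation stand in for the MVI approaches. Only overall classification accuracy is computed.

```python
import math
import random


def make_dataset(n, seed):
    """Simulate n cases with two numeric predictors and a binary label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        # Assumption: class 1 is shifted by +2 on both predictors.
        rows.append([rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0)])
        labels.append(y)
    return rows, labels


def inject_mcar(rows, rate, seed):
    """Blank each cell independently with probability `rate` (MCAR)."""
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def impute_mean(rows):
    """Fill each missing cell with the mean of its column's observed values."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def impute_knn(rows, k=5):
    """Fill each missing cell with the column mean over the k nearest rows.

    Distance uses only the columns observed in both rows; a row with no
    usable neighbor falls back to the column mean.
    """
    fallback = impute_mean(rows)
    out = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared))
                candidates.append((dist, other[j]))
            if candidates:
                nearest = sorted(candidates)[:k]
                new_row[j] = sum(val for _, val in nearest) / len(nearest)
            else:
                new_row[j] = fallback[i][j]
        out.append(new_row)
    return out


def overall_accuracy(rows, labels, train_fraction=0.7):
    """Train a nearest-centroid classifier on the first part of the data and
    return the share of correct predictions on the remainder."""
    cut = int(len(rows) * train_fraction)
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows[:cut], labels[:cut]) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    correct = 0
    for r, y in zip(rows[cut:], labels[cut:]):
        pred = min(centroids, key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(r, centroids[c])))
        correct += pred == y
    return correct / (len(rows) - cut)


def run_experiment(n=300, rates=(0.10, 0.25, 0.50), seed=0):
    """Return accuracy for every (imputer name, missing-data rate) pair."""
    rows, labels = make_dataset(n, seed)
    results = {}
    for rate in rates:
        holed = inject_mcar(rows, rate, seed + 1)
        for name, imputer in (("mean", impute_mean), ("knn", impute_knn)):
            results[(name, rate)] = overall_accuracy(imputer(holed), labels)
    return results


if __name__ == "__main__":
    for (name, rate), acc in sorted(run_experiment().items()):
        print(f"{name:>4}  rate={rate:.2f}  accuracy={acc:.3f}")
```

For brevity the sketch imputes the full predictor matrix before the train/test split. A more careful design would fit each imputer on the training cases only, and would repeat the run over many seeds before comparing approaches.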

Copyright © 2026 Board of Regents of the University of Wisconsin System. | Privacy Policy