Capstone Projects

A Machine Learning Model and Web API for Predicting the Destruction of Single-Family Homes Due to High-Intensity Wildfires in California

Program: Data Science Master's
Host Company: Pyrologix, LLC
Location: Not Specified (remote)
Student: Joshua M Clark

From 2005 to 2020, wildfires destroyed nearly 60,000 structures across California, accounting for 67.2% of all structures lost to wildfire in the United States over the same period. California wildfires are projected to keep increasing in size, intensity, and duration, with consequences for homeowners, communities, and many organizations (e.g., land management agencies, emergency managers, insurers). We built a model that predicts the likelihood of a single-family home being destroyed by wildfire, applicable to wildfire-prone areas throughout the state. Its purpose is to increase homeowner awareness, inform risk reduction and suppression planning, and help insurers develop more accurate pricing and incentive programs. Our final model (extreme gradient boosting) drew on building characteristics (e.g., siding type, roof vents, window panes), exposure to wildland vegetation, and housing density for 16,477 homes affected by wildfires in California, and achieved a balanced accuracy of 0.82, a precision of 0.83, and an AUC of 0.90. In our analysis, non-combustible siding (e.g., stucco, brick, cement, metal), multi-pane windows, the absence of roof vents, and absent or enclosed eaves generally decreased the likelihood of home loss. Features that increased the likelihood of home loss included being a mobile home, having large or unscreened roof vents, having an attached patio or carport, and having single-pane windows. We deployed the final model behind a web API so that others can integrate its predictions into their own applications. We recommend that organizations involved in wildfire risk reduction draw on our results when advising California homeowners on appropriate strategies for protecting their homes from wildfire.
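
For readers unfamiliar with the reported metrics, the sketch below shows how balanced accuracy and precision are computed from the counts of a binary confusion matrix, treating "destroyed" as the positive class. The function names and the example counts are hypothetical and chosen only for illustration; they are not the project's code or data.

```python
def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of sensitivity (recall on destroyed homes) and
    specificity (recall on surviving homes)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def precision(tp: int, fp: int) -> float:
    """Share of homes predicted as destroyed that were actually destroyed."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not the study's results).
    tp, fp, tn, fn = 80, 10, 90, 20
    print(f"balanced accuracy: {balanced_accuracy(tp, fp, tn, fn):.2f}")
    print(f"precision: {precision(tp, fp):.2f}")
```

Balanced accuracy weights both classes equally, which makes it a more informative summary than plain accuracy when destroyed and surviving homes are not equally represented in the data.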