Exploring Rural Road Crash Data with Statistical Models
Program: Data Science Master's Degree
Location: Not Specified (onsite)
Student: Ryan Loos
The purpose of this study was to explore the County Road Safety Plan (CRSP) data set using traditional regression and classification methodologies to identify rural roadway features that are characteristic of severe crashes. Statistical methods were used to assess the best-fitting traditional count regression model for a base set of variables that matched previous phases of the CRSP process. Once a model type was selected, the base set of variables and a wider range of available variables were assessed for significance in explaining severe crashes on segments, intersections, and curves. Similarly, an XGBoost classification model was used to assess feature importance, and its results were compared with the regression results. Both model types were used to predict severe crashes and to rank each segment, intersection, and curve included in the data set. The assessment concluded that these methods could be used either in place of, or in combination with, existing analytical methods to identify a different ranked set of locations based on critical roadway features. The hope is that the methodologies and analysis conducted here can be used in future iterations of the CRSP process to continue identifying and prioritizing rural segments, intersections, and curves that share the characteristics of other severe crash locations for systemic, proactive treatment.