Evaluating Risk in Construction Contracts: Utilizing Text Mining to Build a Contract Evaluation Tool for Subcontractors
Program: Data Science Master's Degree
Location: Wisconsin (onsite)
Student: R.D. Long
In the construction industry, the legal contract between a prime contractor and the subcontractors it hires is a frequent source of financial risk. Compared to the prime contractor, the subcontractor is usually a smaller company that lacks the resources for a thorough contract review by an attorney, which puts it at a disadvantage and exposes it to additional risk of loss. An automated contract review powered by text mining methods would streamline the legal review and bring it within the subcontractor’s means on more jobs. A review of the available research and the marketplace showed rapid growth in the use of text mining across many legal fields, including contract analysis. Traditional “bag of words” text mining techniques, including word frequency correlation, term frequency-inverse document frequency, and bigrams, were applied to a sample set of construction contracts and templates to measure their similarity to one another and to an industry-standard template, and to identify the differences in word and bigram frequency that account for any dissimilarity. The comparisons were made both at the level of the entire subcontract and within contract section classifications determined by topic modeling. The project confirmed wide variation among the contracts in the industry sample. However, it provided little insight into the causes of that variation that could be translated into risk-increasing legal terms and communicated to a subcontractor. The project findings can be built upon by applying emerging text mining technologies that retain more of the context of the subcontracts.
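As an illustration of the kind of bag-of-words comparison described above, the following is a minimal sketch and not the project's actual code. It weights words and bigrams by term frequency-inverse document frequency and scores each document against a template using cosine similarity. The three clause texts, the function names, the smoothed IDF formula, and the use of cosine similarity as the similarity measure are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def terms(text):
    """Return the words and bigrams (adjacent word pairs) of one document."""
    words = re.findall(r"[a-z]+", text.lower())
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def tfidf_vectors(docs):
    """Build one {term: TF-IDF weight} dictionary per document.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, so that terms
    appearing in every document still receive a small positive weight.
    """
    counts = [Counter(terms(doc)) for doc in docs]
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for count in counts for term in count)
    vectors = []
    for count in counts:
        total = sum(count.values())
        vectors.append({
            term: (n / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, n in count.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical clause texts, invented for this example only.
documents = {
    "template": "The subcontractor shall be paid within thirty days of invoice.",
    "subcontract_a": "The subcontractor shall be paid within forty five days of invoice.",
    "subcontract_b": "Payment to the subcontractor is conditioned on payment "
                     "by the owner to the contractor.",
}

names = list(documents)
vectors = tfidf_vectors([documents[name] for name in names])
template_vector = vectors[0]

# Similarity of every document to the template (the first entry).
similarities = {
    name: cosine(template_vector, vector)
    for name, vector in zip(names, vectors)
}

for name, score in similarities.items():
    print(f"{name}: {score:.3f}")
```

A score like this shows only how far a subcontract departs from the template, not which departures raise legal risk, which matches the limitation noted above: word order and context are discarded once the text is reduced to counts.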