Microsoft has collected 13 million work items and bugs since 2001, and used that data to create a machine learning model to fight software bugs. According to the company, the model distinguishes between security and non-security bugs 99% of the time and identifies the high-priority bugs 97% of the time.

“At Microsoft, 47,000 developers generate nearly 30 thousand bugs a month. These items get stored across over 100 AzureDevOps and GitHub repositories. To better label and prioritize bugs at that scale, we couldn’t just apply more people to the problem,” Microsoft senior security program manager Scott Christiansen and data and applied scientist Mayana Pereira wrote in a post. “Large volumes of semi-curated data are perfect for machine learning.”

At the beginning of the project, Microsoft knew it needed data general enough that the model would not be fitted to a small number of examples, data that didn’t infringe on privacy regulations, and, where necessary, data generated in a simulated environment to overcome issues that come with data extracted from the wild.

As part of the process, security experts approved training data before it was fed to the machine learning model, and statistical sampling was used to give the security experts a manageable amount of data to review.
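
Microsoft’s post doesn’t describe the sampling mechanics, but a minimal sketch of that kind of workflow might look like the following. The sample size, field names, and helper functions are illustrative assumptions, not details of Microsoft’s pipeline.

```python
import random

def sample_for_expert_review(bug_reports, sample_size=500, seed=42):
    """Draw a manageable random sample of bug reports for security experts to review.

    `bug_reports` is assumed to be a list of dicts with an 'id' and a proposed
    label; the sample size and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    return rng.sample(bug_reports, min(sample_size, len(bug_reports)))

def build_training_set(bug_reports, expert_approved_ids):
    """Keep only the reports an expert has approved before they reach training."""
    return [b for b in bug_reports if b["id"] in expert_approved_ids]

# Hypothetical usage: sample, have experts approve a subset, then train on it.
reports = [{"id": i, "title": f"bug {i}"} for i in range(10_000)]
to_review = sample_for_expert_review(reports)
approved_ids = {b["id"] for b in to_review[:400]}   # stand-in for expert sign-off
training_data = build_training_set(reports, approved_ids)
```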

“Our classification system needs to perform like a security expert, which means the subject matter expert is as important to the process as the data scientist,” Christiansen and Pereira wrote.

Collaboration between subject matter experts and data scientists was key to identifying the data types and sources, and to the review process once viable data was identified. Data scientists select a data modeling technique, train the model and evaluate model performance, while security experts evaluate the model in production by monitoring the average number of bugs and manually reviewing a random sampling of bugs, Microsoft explained.
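
Microsoft doesn’t publish its monitoring code, but the production check described above, watching the average number of bugs the model labels as security, could be sketched roughly as follows. The baseline, tolerance, and function name are assumptions for illustration.

```python
from statistics import mean

def monitor_security_bug_rate(daily_security_bug_counts, baseline_rate, tolerance=0.25):
    """Flag the model for expert review if the average number of bugs labeled
    'security' per day drifts too far from a historical baseline.

    The 25% tolerance and the baseline value are illustrative, not Microsoft's.
    """
    current_rate = mean(daily_security_bug_counts)
    drift = abs(current_rate - baseline_rate) / baseline_rate
    return {"current_rate": current_rate, "drift": drift, "needs_review": drift > tolerance}

# Hypothetical usage with a week of counts against a baseline of 120 bugs/day.
print(monitor_security_bug_rate([130, 118, 160, 175, 142, 155, 168], baseline_rate=120))
```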

In the end, the model could accurately classify bugs as security or non-security and, in a second step, apply severity labels to the security bugs.
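
A two-step pipeline of that shape can be sketched with off-the-shelf scikit-learn components. The feature choice (bug titles), the model types, the toy data, and the label names below are assumptions for illustration, not details of Microsoft’s model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data standing in for labeled bug titles (illustrative only).
titles = ["Buffer overflow in parser", "Button misaligned on settings page",
          "SQL injection via search box", "Typo in tooltip text"]
is_security = [1, 0, 1, 0]                         # 1 = security, 0 = non-security
security_titles = ["Buffer overflow in parser", "SQL injection via search box"]
severity = ["critical", "important"]               # hypothetical severity labels

# Step 1: classify a bug as security or non-security.
security_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
security_clf.fit(titles, is_security)

# Step 2: apply a severity label, trained only on the security bugs.
severity_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
severity_clf.fit(security_titles, severity)

def classify(title):
    """Run the two models in sequence on a new bug title."""
    if security_clf.predict([title])[0] == 1:
        return "security", severity_clf.predict([title])[0]
    return "non-security", None

print(classify("Heap overflow when opening crafted file"))
```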

“The process didn’t end once we had a model that worked. To make sure our bug modeling system keeps pace with the ever-evolving products at Microsoft, we conduct automated re-training. The data is still approved by a security expert before the model is retrained, and we continuously monitor the number of bugs generated in production,” Christiansen and Pereira wrote.
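
The retraining loop itself isn’t shown in the post; a rough sketch of a cycle under those constraints (an expert-approval gate, then retraining, then continued monitoring) might look like this, with every function name a placeholder rather than a real API.

```python
def retrain_if_approved(new_bug_reports, expert_review, train_model, monitor):
    """Hypothetical automated re-training cycle: data is approved by a security
    expert before the model is retrained, and the new model's production output
    keeps being monitored afterward.

    `expert_review`, `train_model`, and `monitor` are caller-supplied callables;
    none of them correspond to a published Microsoft interface.
    """
    approved_data = [b for b in new_bug_reports if expert_review(b)]
    if not approved_data:
        return None                      # nothing approved, keep the current model
    model = train_model(approved_data)
    monitor(model)                       # e.g. track how many bugs it labels as security
    return model
```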

Additional details are available in Microsoft’s blog post.