This week, Facebook released Opacus, a new open-source library for training PyTorch models with differential privacy (DP). According to the company, differential privacy is a mathematical framework for quantifying the anonymization of sensitive data. Opacus is designed to be more scalable than existing methods and to make adopting differentially private machine learning easier.
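For context, the announcement does not spell out the formal guarantee, but the standard (ε, δ) definition of differential privacy bounds how much any single record can shift a mechanism's output distribution:

```latex
% A randomized mechanism M is (\varepsilon, \delta)-differentially private
% if, for all datasets D and D' differing in a single record and for
% every measurable set of outcomes S:
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```

Smaller values of ε and δ correspond to a stronger privacy guarantee, at the cost of adding more noise during training.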

“With the release of Opacus, we hope to provide an easier path for researchers and engineers to adopt differential privacy in ML, as well as to accelerate DP research in the field,” Davide Testuggine and Ilya Mironov, applied research scientists at Facebook, wrote in a post.

Key features of the release include:

  • Ability to compute batched per-sample gradients through Autograd hooks in PyTorch. Facebook explained this approach is much faster than the microbatching used by existing DP libraries (see the usage sketch after this list).
  • Security features such as a cryptographically safe pseudo-random number generator 
  • Ability to quickly prototype ideas by mixing and matching Opacus code with PyTorch and pure Python code
  • Tutorials and helper functions to improve productivity
  • Ability to keep track of privacy budgets
  • Pre-trained and fine-tuned models
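To make these features concrete, here is a minimal sketch of the kind of training setup Opacus enables: a standard optimizer is wrapped by a `PrivacyEngine`, after which training proceeds as usual. The constructor arguments (`alphas`, `noise_multiplier`, `max_grad_norm`), the `attach`/`get_privacy_spent` calls, and all hyperparameter values reflect the library's early API and are assumptions for illustration; they may differ in later Opacus versions.

```python
import torch
from torch import nn, optim
from opacus import PrivacyEngine  # pip install opacus

# Toy model and synthetic data, stand-ins for a real training setup
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
optimizer = optim.SGD(model.parameters(), lr=0.05)
dataset = torch.utils.data.TensorDataset(
    torch.randn(256, 784), torch.randint(0, 10, (256,))
)
data_loader = torch.utils.data.DataLoader(dataset, batch_size=64)

# The engine hooks into the optimizer: each step clips per-sample
# gradients and adds calibrated Gaussian noise before the update
privacy_engine = PrivacyEngine(
    model,
    batch_size=64,
    sample_size=len(dataset),
    alphas=range(2, 32),       # orders used for RDP-based accounting
    noise_multiplier=1.3,
    max_grad_norm=1.0,
)
privacy_engine.attach(optimizer)

# The training loop is unchanged from ordinary PyTorch
for x, y in data_loader:
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()  # clipping and noise happen transparently here

# Privacy budget spent so far, for a chosen delta
epsilon, best_alpha = privacy_engine.get_privacy_spent(1e-5)
print(f"epsilon = {epsilon:.2f} at delta = 1e-5 (alpha = {best_alpha})")
```

Because the engine modifies the optimizer rather than the model, an existing training loop typically needs no changes beyond the attachment step.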

“Our goal with Opacus is to preserve the privacy of each training sample while limiting the impact on the accuracy of the final model. Opacus does this by modifying a standard PyTorch optimizer in order to enforce (and measure) DP during training. More specifically, our approach is centered on differentially private stochastic gradient descent (DP-SGD),” the researchers wrote. 
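In DP-SGD, each example's gradient is clipped to a fixed L2 norm, and Gaussian noise calibrated to that norm is added to the aggregated update, which bounds any single sample's influence on the model. The sketch below is a schematic of one such step, with a hypothetical helper name and naive per-example backward passes; it is not Opacus's implementation, which, as noted above, computes per-sample gradients in a single batched pass via Autograd hooks.

```python
import torch

def dp_sgd_step(model, loss_fn, xb, yb, lr=0.05,
                max_grad_norm=1.0, noise_multiplier=1.1):
    """Illustrative DP-SGD step (hypothetical helper, not Opacus's API)."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Naive per-sample gradients: one backward pass per example
    for x, y in zip(xb, yb):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        # Clip this example's gradient to L2 norm <= max_grad_norm
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        clip_coef = min(1.0, max_grad_norm / (total_norm.item() + 1e-6))
        for s, g in zip(summed, grads):
            s.add_(g, alpha=clip_coef)

    # Noise scale is tied to the clipping norm (the sensitivity bound)
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = noise_multiplier * max_grad_norm * torch.randn_like(p)
            p.add_((s + noise) / len(xb), alpha=-lr)
```

Clipping caps each sample's contribution at `max_grad_norm`, so noise scaled to that same bound masks the presence or absence of any one record; the `noise_multiplier` setting trades privacy for accuracy.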

Testuggine and Mironov went on to explain that privacy-preserving machine learning is important because it minimizes attack surfaces and allows app developers to focus on building products. 

“We hope that by developing PyTorch tools like Opacus, we’re democratizing access to such privacy-preserving resources. We’re bridging the divide between the security community and general ML engineers with a faster, more flexible platform using PyTorch,” they wrote.