LinkedIn is sharing the work behind its “Project Every Member” initiative by open sourcing spark-inequality-impact, an Apache Spark library that other organizations can use in any domain where they want to measure and reduce inequality or avoid unintended inequality consequences.

“This work is furthering our commitment to closing the network gap and making sure everyone has a fair shot at finding and accessing opportunities, regardless of their background or connections,” LinkedIn wrote in a blog post.

LinkedIn announced last month that it would build inclusive products through A/B testing as part of an initiative called Project Every Member.

LinkedIn stated that any change to its platform goes through a series of tests and analyses, based on A/B testing, to ensure that it achieves the intended product goals and business objectives. According to the company, the best way to do this is to preview the change or feature to a small group of members for a limited time and then measure the results.

The Atkinson index is then used to determine which end of the distribution contributed most to the observed inequality. It also lets developers encode other information about the population being measured into the analysis, which helps overcome any shortcomings of A/B testing.
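For reference, the Atkinson index is commonly defined as shown below. The notation is ours rather than LinkedIn's: x_1, ..., x_N are the strictly positive values being measured (for example, a per-member metric), μ is their mean, and ε ≥ 0 is the inequality aversion parameter, which controls how strongly the lower end of the distribution is weighted.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
A_\varepsilon(x_1,\dots,x_N) =
\begin{cases}
1 - \dfrac{1}{\mu}\left(\dfrac{1}{N}\sum_{i=1}^{N} x_i^{\,1-\varepsilon}\right)^{\frac{1}{1-\varepsilon}}, & \varepsilon \ge 0,\ \varepsilon \ne 1,\\[2ex]
1 - \dfrac{1}{\mu}\left(\prod_{i=1}^{N} x_i\right)^{\frac{1}{N}}, & \varepsilon = 1.
\end{cases}
```

An index of 0 means every individual has the same value, and values closer to 1 indicate greater inequality.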

LinkedIn decided to implement the Atkinson index computations on Apache Spark for scalability reasons, citing both the size of the data over which inequality is computed (for example, the number of individuals in a specific A/B test) and the number of times inequality needs to be computed.

While inequality metrics can already be computed in R and Python, the existing tools typically require users to fit all the data in memory on a single machine.

“We are releasing a package that leverages the fact that the Atkinson index can be decomposed as a sum, which means the data does not need to be held in memory all at once. We then use it as part of a larger pipeline that applies it to many A/B tests at once,” LinkedIn wrote.
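To illustrate the decomposition LinkedIn describes, here is a minimal sketch in plain Python. It is not the spark-inequality-impact API; the names `PartialSums`, `summarize`, and `atkinson_chunked` are hypothetical, and the code assumes strictly positive values. Each chunk of data is reduced to three running totals, and those totals can be merged in any order, which is the same pattern a distributed aggregation would rely on.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PartialSums:
    """Running totals for one chunk of data (hypothetical helper)."""
    count: int = 0
    total: float = 0.0        # sum of x_i
    power_total: float = 0.0  # sum of x_i ** (1 - epsilon), or sum of log(x_i) when epsilon == 1

    def merge(self, other: "PartialSums") -> "PartialSums":
        # Totals are plain sums, so chunks can be combined in any order.
        return PartialSums(
            self.count + other.count,
            self.total + other.total,
            self.power_total + other.power_total,
        )


def summarize(chunk: Iterable[float], epsilon: float) -> PartialSums:
    """Reduce one chunk to its totals; only the chunk itself is held in memory."""
    count, total, power_total = 0, 0.0, 0.0
    for x in chunk:  # assumes x > 0
        count += 1
        total += x
        power_total += math.log(x) if epsilon == 1 else x ** (1 - epsilon)
    return PartialSums(count, total, power_total)


def atkinson_from_sums(sums: PartialSums, epsilon: float) -> float:
    """Compute the Atkinson index from merged totals."""
    mean = sums.total / sums.count
    if epsilon == 1:
        # Geometric mean, recovered from the sum of logs.
        equal_value = math.exp(sums.power_total / sums.count)
    else:
        equal_value = (sums.power_total / sums.count) ** (1 / (1 - epsilon))
    return 1 - equal_value / mean


def atkinson_chunked(chunks: Iterable[Iterable[float]], epsilon: float) -> float:
    """Stream over chunks, keeping only the merged totals."""
    merged = PartialSums()
    for chunk in chunks:
        merged = merged.merge(summarize(chunk, epsilon))
    return atkinson_from_sums(merged, epsilon)
```

For instance, `atkinson_chunked([[1.0], [4.0]], 0.5)` evaluates to approximately 0.1, the same result as processing both values in a single chunk. Because only counts and sums cross chunk boundaries, the same idea carries over to a Spark job in which each partition produces its own totals.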

The code is available on GitHub.