For the past two years, the Open Source Initiative (OSI) has been developing a definition of Open Source AI that the industry can use to determine which models are actually open. It now has a new draft of the definition to share as it nears the final release in October.
In May, the organization said that many companies claim their models are open source when they may not be. Having a definition will make it easier for developers to make those determinations themselves.
Draft version 0.0.9 further clarifies the components for Open Source models and Open Source weights, and states that all components of a system need to meet the open source standard in order for that system to be considered open source.
The OSI has also decided that training data won’t play a role in classification. “After long deliberation and co-design sessions we have concluded that defining training data as a benefit, not a requirement, is the best way to go,” the OSI wrote in a post. “Training data is valuable to study AI systems: to understand the biases that have been learned, which can impact system behavior. But training data is not part of the preferred form for making modifications to an existing AI system. The insights and correlations in that data have already been learned.”
Other changes in draft 0.0.9: the Checklist is now its own document, there are now references to the conditions of availability of components, and the word “Model” was changed to “Weights” under “Preferred form to make modifications,” because the word was used there inconsistently with the rest of the document.
According to the OSI, the items still on the roadmap before October include continuing to improve the drafts based on feedback from meetings with stakeholders at events around the world, updating the FAQ, establishing a review process for future versions of the definition, and deciding how to address reviews of new licenses for datasets, documentation, and agreements around model parameters.
“Creating an Open Source AI Definition is an arduous task over the past two years, but we know the importance of creating this standard so the freedoms to use, study, share and modify AI systems can be guaranteed. Those are the core tenets of Open Source, and it warrants the dedicated work it has required,” the OSI concluded.