The Linux Foundation, through its Open Compliance Program, is working on a standard for describing software packages and their licenses. The program is a partnership between the Foundation and several companies, and the first version of the Software Package Data Exchange (SPDX) standard is available, with updates planned.
SPDX, currently available in a beta version with hopes for a full version in August, is meant to give developers and companies a machine-readable way to deliver license information along with a package, and a standard way to explain how the software can be used, no matter what license developers or code distributors decide to use.
Black Duck, Canonical, OpenLogic, Protecode and others are working with the Foundation to push this standard out into the software development world. Mahshad Koohgoli, CEO of Protecode, said that the push for this standard resulted from a lack of clear understanding of licensing associated with software.
“There has never been a good description of the components of a software program. Packages can contain hundreds, thousands or tens of thousands of lines of code or files [and it is hard to keep track of the licenses associated with each],” he said.
“SPDX will be used to standardize information associated with distributed software packages.”
Noirin Shirley, the Apache Software Foundation’s executive vice president, said that software licenses are generally applied in an ad hoc way, project by project.
“In software, particularly in open-source software, there are two groups of people: the person(s) who hold the copyright and can do anything to the code, and the license holder(s) who have the right to use the code,” Shirley said.
She said that some licenses, like the GPL, LGPL, OGPL, etc., require the user to return modifications made to the code to the community at large, which is something a company would want to know before installing the code onto a proprietary piece of software, in terms of protecting business secrets and intellectual property.
Shirley added that the Apache licenses, which can be put on any open-source software (not just Apache projects), do not ask for a copyright grant, just a license. They also allow Apache and others to re-license the software for profit or otherwise.
A unique feature of SPDX, according to Koohgoli, is that it is machine generated and machine readable, to be consumed by business management software. “[This] helps computers keep track of licenses without developers and managers regulating it,” he said.
The possibilities, Koohgoli added, are unique as well. “You could auto-generate a bill of materials for clients, which would be a list of all files and licenses quickly [packaged in the SPDX standard]. It also standardizes the text of the licenses.
“There are 80 Open Source Initiative licenses and thousands of variations. It is an effort to read each license and truly understand your obligations [so SPDX will standardize that in a machine-readable file]. There is also a template available to fit your own license language into the SPDX standard.”
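The bill of materials Koohgoli describes can be pictured with a short sketch. This is only an illustration under stated assumptions: the file paths, the license identifiers and the function names are hypothetical, and a real report would be built from an SPDX file rather than a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical scan results: file path -> license identifier.
# In practice these would come from an SPDX file or a scanning tool.
scanned_files = {
    "src/main.c": "GPL-2.0",
    "src/util.c": "GPL-2.0",
    "lib/json.c": "Apache-2.0",
    "docs/manual.txt": "CC-BY-3.0",
}


def build_bill_of_materials(files):
    """Group file paths by license so obligations can be reviewed per license."""
    by_license = defaultdict(list)
    for path, license_id in files.items():
        by_license[license_id].append(path)
    # Sort licenses and paths so the report is stable from run to run.
    return {lic: sorted(paths) for lic, paths in sorted(by_license.items())}


def format_bill_of_materials(bom):
    """Render the grouped files as a plain-text list for a client."""
    lines = []
    for license_id, paths in bom.items():
        lines.append(f"{license_id} ({len(paths)} file(s))")
        lines.extend(f"  {path}" for path in paths)
    return "\n".join(lines)


bom = build_bill_of_materials(scanned_files)
print(format_bill_of_materials(bom))
```

Grouping by license rather than by file puts the question a reviewer usually asks first, which obligations apply and to what, at the top of the report.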
John Ellis, vice president of business development at Scanbuy and a member of the business team for the SPDX committee, said that understanding the obligations associated with each license is a big issue in the community.
“A lot of companies use open-source software,” he said. “When I was at Motorola, we wanted to know where [portions of code and software] came from, who wrote it, who holds the copyright, etc. Companies want to comply with the copyright laws, because just because it is open source doesn’t mean it doesn’t have a copyright. We wanted to use software with a strong understanding of where it came from.”
The SPDX project intends to create a unified and standard way of delivering software packages, from files to components and licenses to copyrights.
Kate Stewart, technical lead on the SPDX project, worked with Ellis on the standard. She decided, along with others in the industry who were concerned about the licensing issues, that there should be particular fields that can populate a spreadsheet for commercial use, which is one portion of the SPDX standard. Both forms of the SPDX standard, the RDF file and the tag file, can be read by a computer, but the tag file is the one that populates a spreadsheet that both the open-source community and commercial users can include as their list of materials.
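To show how a tag file could feed a spreadsheet, here is a minimal sketch that reads simple "Tag: value" lines and writes them out as CSV. The sample text, the tag names and the helper functions are assumptions made for the example and should be checked against the SPDX specification; multi-line values are not handled.

```python
import csv
import io

# Illustrative tag file content. The tag names are assumptions for this
# example, not a verified excerpt of the SPDX specification.
TAG_TEXT = """\
PackageName: example-lib
PackageDownloadLocation: http://example.com/example-lib-1.0.tar.gz
PackageLicenseDeclared: Apache-2.0
PackageCopyrightText: Copyright 2011 Example Corp
"""


def parse_tag_lines(text):
    """Parse single-line 'Tag: value' pairs into a dict."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so URLs in the value stay intact.
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # not a tag/value line
        record[tag.strip()] = value.strip()
    return record


def to_csv(records):
    """Write records as CSV text: one column per tag, one row per record."""
    columns = []
    for rec in records:
        for tag in rec:
            if tag not in columns:
                columns.append(tag)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


record = parse_tag_lines(TAG_TEXT)
csv_text = to_csv([record])
print(csv_text)
```

Any spreadsheet program can open the resulting text, with one row per package and one column per tag.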
She added that sharing software packages in a supply-chain fashion involves a bit of trust, so the first part of creating a standard is creating a way for developers and their managers to see who did the analysis of the software package being used. Then, each package needs to show its name, ID, verification code, download location, and its licenses and copyrights.
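The items Stewart lists map naturally onto a small record type. The sketch below is a hypothetical illustration: the class, its field names and the sample values are invented for the example. The verification code is computed with one plausible scheme, assumed here, in which each file is hashed with SHA-1, the digests are sorted and concatenated, and the result is hashed again; consult the SPDX specification for the exact rules.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PackageRecord:
    """Hypothetical record mirroring the items a package needs to show."""
    name: str
    package_id: str
    download_location: str
    analyzed_by: str  # who did the analysis of the package
    licenses: list = field(default_factory=list)
    copyrights: list = field(default_factory=list)
    verification_code: str = ""


def verification_code(file_contents):
    """Derive a single code from the contents of every file in a package.

    Assumed scheme: SHA-1 each file, sort the hex digests, concatenate
    them, and SHA-1 the result. Sorting makes the code independent of
    the order in which files are listed.
    """
    digests = sorted(
        hashlib.sha1(data).hexdigest() for data in file_contents.values()
    )
    return hashlib.sha1("".join(digests).encode("ascii")).hexdigest()


# Sample package contents: path -> file bytes (invented for the example).
files = {
    "src/main.c": b"int main(void) { return 0; }\n",
    "README": b"Example package\n",
}

package = PackageRecord(
    name="example-lib",
    package_id="example-lib-1.0",
    download_location="http://example.com/example-lib-1.0.tar.gz",
    analyzed_by="Example Corp compliance team",
    licenses=["Apache-2.0"],
    copyrights=["Copyright 2011 Example Corp"],
    verification_code=verification_code(files),
)
print(package.name, package.verification_code)
```

A recipient who recomputes the code over the files they received and gets a different value knows the package contents differ from what was analyzed.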
Cooperation wanted
The working group is trying to get companies on board, and Ellis said that several big companies are working with the Linux Foundation to foster education and push the standard out into the world of proprietary software, much of which has open-source roots or components.
The SPDX project group has identified about 150 licenses and is working with Debian, an open-source operating system, to ensure that DEP5 (a Debian packaging standard for recording copyright and license information in a machine-readable form) is compatible with the SPDX system, according to Stewart.
Of the 150 licenses identified, Ellis said there are about 10 used in the majority of cases, namely GPL version 2 and 3; LGPL version 2 and 3; Apache 1 and 2; EPL; Mozilla’s public license; and the Creative Commons licenses.
Shirley said that the learning and innovation gained from these open-source software communities is part of Apache’s mission and why Apache structures its projects in the incubator and with top-level projects, ensuring that a strong community looks after the advancement of the technology.
While Shirley said Apache doesn’t have problems with people understanding the licenses themselves, she said that sometimes there are issues with the legal nuances.
Tech attorneys often have only a broad understanding of licenses, and not necessarily of open-source licenses, Ellis said. He added that a lack of education about software licensing is the biggest problem facing the community today.
Companies, he said, sometimes ship embedded systems without understanding what is embedded in the chips they are selling. Many devices use Linux today, and companies should be more aware of the technical aspects of their coding, he said.
Shirley said Project Harmony is working to “harmonize licenses on the contribution side, instead of on the packaging side.” This project is a community-based group that works on contributor agreements for free and open-source software. It was launched in May 2010 by Amanda Brock, general counsel for Canonical, the distributor of Ubuntu Linux, and is hosted by Oregon State University’s Open Source Lab.
The project focuses on creating templates for open-source software contributor agreements as well as explaining the necessity of such. The contributor agreement templates can be adopted, in whole or part, by free and open-source projects, depending on the needs of the project and members on the project team.
Common licenses explained
Ellis and Shirley both agreed that there are a few licenses developers choose to use more often than others for open-source software. Here’s the list of the licenses cited to help give you a better understanding of their use and which projects would benefit.
GNU General Public License (GPL)
The GNU GPL (currently at version 3 although, as Ellis said, version 2 is still used) is a “free copyleft license for software and other kinds of works.” Free or open-source software does not refer to the price of the software; it refers to whether or not the source code can be accessed and whether it can be altered, added on to and added to different, new software packages. There is no warranty for GPL licensed software, modified versions must be marked as changed, and the use of the license gives future license holders the ability to run the unmodified program. The work covered under the GNU GPL must be, according to GNU, a covered work, which is further explained at its website.
GNU Lesser General Public License (LGPL)
The GNU LGPL covers the parts of a library associated with applications, combined works and other features of a software package as covered by the GPL.
It requires the user to convey a copy of the modified version of the software. The difference between the GPL and the LGPL is that the LGPL, in versions 2 and 3, covers combined works.
GNU defines a combined work as a “work produced by combining or linking an application with the library, referring to a covered work governed by the license,” according to its website.
Apache Software Foundation License
The Apache license, approved by the Apache Software Foundation, is often used in commercial projects, according to ASF’s Shirley.
Version 2.0, which according to Ellis is used most often in addition to version 1.0, was approved by the ASF in 2004.
The license is reusable without modification by any project, can be included by reference instead of in every file, can require patent licenses on contributions, and helps clarify the license on submissions of contributions, according to the ASF.
It has not been determined if the Apache License is compatible with the GPL, but the SPDX standard for licensing might change that as a standard text for licenses is included in the software package data exchange project, said Mahshad Koohgoli, CEO of Protecode.
Apache’s license also includes contributor license agreements, much like Project Harmony, which is trying to standardize agreements and licenses on the contributor side rather than on the packaging side.
Eclipse Public License (EPL)
The EPL is provided by the Eclipse Foundation, and it enables a contributor (any person or entity that distributes a program) to grant a recipient (anyone who receives the program under the EPL) a “non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works, or publicly display and perform, distribute and sublicense the contribution and derivative works in source code and object code form,” according to its website.
Mozilla Public License (MPL)
The MPL has a variety of definitions, and in general it grants a worldwide, royalty-free, non-exclusive license subject to third-party intellectual property claims. Modifications do not have to be included, and they can be included as part of a larger work or not; it is up to the packager.
Creative Commons
There are three layers to these licenses, and they include a machine-readable copy, a legal code copy and a human-readable copy. The Commons’ website lists six licenses that are used for a variety of commercial property on the Web, not only software. Licenses include:
Attribution: This allows others to distribute, modify and build upon the original work freely as long as proper credit is given to the original owner.
Attribution-ShareAlike: This allows others to distribute, modify and build upon the original work as long as the new copy is licensed under a like license, with all the same benefits given to new users. And, of course, attribution is required.
Attribution-NoDerivs: This allows for redistribution, both commercial and non-commercial, but the project cannot be modified and proper credit must be given.
Attribution-NonCommercial: This allows for modification and distribution, only in a non-commercial fashion. New works must credit the original creator, but they do not have to be licensed on the same terms.
Attribution-NonCommercial-ShareAlike: Same as NonCommercial, except that new works must be licensed under the same terms as the original.
Attribution-NonCommercial-NoDerivs: This is the most restrictive, according to the Creative Commons. New users must credit the original holder of the work, it cannot be modified, and it cannot be distributed without proper credit or used commercially.
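The six licenses differ along three yes-or-no terms, which makes them easy to express as data. The sketch below is a simplified reading of the descriptions above, not the legal code; the type, table and function names are hypothetical, and attribution is assumed to be required in every case.

```python
from typing import NamedTuple


class CCTerms(NamedTuple):
    commercial_use: bool  # may the work be used commercially?
    derivatives: bool     # may the work be modified?
    share_alike: bool     # must new works carry the same license?


# Simplified summary of the six licenses; attribution is always required.
CC_LICENSES = {
    "Attribution": CCTerms(True, True, False),
    "Attribution-ShareAlike": CCTerms(True, True, True),
    "Attribution-NoDerivs": CCTerms(True, False, False),
    "Attribution-NonCommercial": CCTerms(False, True, False),
    "Attribution-NonCommercial-ShareAlike": CCTerms(False, True, True),
    "Attribution-NonCommercial-NoDerivs": CCTerms(False, False, False),
}


def use_allowed(license_name, *, commercial, modified):
    """Return True if the intended use fits the license's basic terms."""
    terms = CC_LICENSES[license_name]
    if commercial and not terms.commercial_use:
        return False
    if modified and not terms.derivatives:
        return False
    return True


# Example: selling a modified copy of a NonCommercial work is not allowed.
print(use_allowed("Attribution-NonCommercial", commercial=True, modified=True))
# Example: giving away a modified copy of the same work is allowed.
print(use_allowed("Attribution-NonCommercial", commercial=False, modified=True))
```

This kind of lookup table is the sort of thing a machine-readable license layer makes possible: software can answer the basic questions, leaving the legal text for the cases that need a closer reading.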