As open-source software becomes an ever more critical part of the software industry, GitHub wants to ensure the community understands the landscape. The organization recently released an open set of data designed to help researchers, data enthusiasts and open-source community members understand the community's overall needs.

Some of the major findings highlight how much developers value documentation, even though it is often overlooked. The open-source data also reveals the impact of negative interactions, how the world uses open source, and who makes up the open-source community.

GitHub collaborated with researchers, the industry and the open-source community to design its 2017 Open Source Survey. With more than 50 questions, the survey covers a range of topics, from documentation to the open-source industry as a whole.

One of the key findings from the survey is that documentation is highly valued, but often overlooked. According to GitHub, documentation helps new GitHub users contribute to projects, use projects, and understand the overall conduct and standards of the community. The findings also indicate that improving documentation is an impactful way to contribute back to open source.

The survey found that incomplete or outdated documentation is a pervasive problem, according to 93% of respondents. However, 60% of contributors say that they rarely or never contribute to documentation. To combat this, GitHub suggests that developers help a maintainer out by opening a pull request to improve the documentation whenever they spot a problem.

The survey also found that documentation helps create inclusive communities, and that licenses are by far the most important type of documentation to both users and contributors. The findings show that 64% of respondents said an open-source license is very important when deciding whether or not to use a project.

In addition, GitHub found that negative interactions are infrequent but highly visible, and that their effects can extend beyond the individuals directly involved. Eighteen percent of respondents said they have personally experienced a negative interaction with another open-source user, and 50% have witnessed one between other people.

“It’s not possible to know from this data whether the gap is due to people who experienced such interactions leaving open source, or broad visibility of incidents,” according to GitHub. “Either way, negative interactions impact many more than the immediate participants, so address problematic behavior swiftly, politely, and publicly, to send a signal to potential contributors that such behavior isn’t typical or tolerated.”

This negative behavior includes things like rudeness and name calling, as well as more serious incidents such as stalking, sexual advances, or doxxing, each of which is encountered by less than 5% of respondents. To address this behavior, GitHub suggests giving users the proper tools to protect themselves. This can include blocking a user, turning to ISPs or hosting services, or even using legal resources.


Another key finding from the survey is that open-source contributors do not yet reflect open source's broad audience of users. Improving projects’ accessibility could unlock future contributors and fix the “huge” gaps in representation in open source.

The gender imbalance remains a problem for GitHub. In fact, 95% of respondents are men, just 3% are women and 1% are non-binary. Women are still more likely than men to encounter language or content that makes them feel unwelcome (25% vs. 15%), as well as stereotyping (12% vs. 2%) and unsolicited sexual advances (6% vs. 3%).

“Collaboration between strangers is one of open source’s most remarkable aspects: strive to build a community where everyone feels welcome to participate,” according to GitHub.

Other data points look into using and contributing to open source on the job, as well as open source being the default when choosing software.