While engineers often turn to hardware-based solutions for network load balancing, Facebook’s scale of operation made hardware load balancing impractical and required the development of a lightweight software solution instead. The current result of Facebook’s efforts is its latest open-source release, the scalable network load balancer Katran.

The company has run software load balancing on its backend for years, but the latest iteration was designed to meet a number of requirements for handling network traffic at its points of presence around the world, now and in the future.

According to the blog post announcing Katran’s open-source release, those requirements were:

  • Run on commodity Linux servers. This allows us to run the load balancer on part or all of the large fleet of currently deployed servers. A software-based load balancer satisfies this criteria.
  • Coexist with other services on a given server. This removes the need for dedicated servers that run the load balancer exclusively, thereby increasing fault tolerance.
  • Allow low-disruption maintenance. Facebook’s software must be able to evolve quickly in order to support new or improved products and services. Maintenance and upgrades are a norm, not exceptions, for the load balancer and backend layers. Minimizing disruption during these events allows us to iterate faster.
  • Offer easy instrumentation and debugging. All large distributed infrastructures must contend with anomalies and unexpected events, so reducing the time to debug and troubleshoot issues is important. The load balancer needs to be instrumentable and friendly to standard tools like tcpdump.

Two recent developments in kernel technology helped to achieve those goals, network engineers Ranjeeth Dasineni and Nikita Shirokov explained in the blog post: eXpress Data Path (XDP) and the eBPF virtual machine.

“The XDP provides a fast, programmable network data path without resorting to a full-fledged kernel bypass method and works in conjunction with the Linux networking stack. The eBPF virtual machine provides a flexible, efficient, and more reliable way to interact with the Linux kernel and to extend its functionality by running user-space supplied programs at specific points in the kernel. eBPF has already brought dramatic improvements to several areas, including tracing and filtering,” the company wrote. 
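As a rough illustration of the kind of per-packet decision an XDP program makes, the following user-space C sketch classifies a raw Ethernet frame and returns a verdict. It is not Katran's code and it is not loadable into the kernel: a real XDP program is compiled to eBPF and receives a `struct xdp_md` context rather than two plain pointers. The names `classify_frame` and `enum verdict` are hypothetical, and sending every IPv4 frame back out is an assumption made only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdict names; the values mirror the kernel's
 * XDP_DROP, XDP_PASS, and XDP_TX action codes. */
enum verdict { VERDICT_DROP = 1, VERDICT_PASS = 2, VERDICT_TX = 3 };

#define ETH_HEADER_LEN 14       /* dst MAC + src MAC + EtherType */
#define ETHERTYPE_IPV4 0x0800

/* Decide what to do with one frame occupying [data, data_end). */
enum verdict classify_frame(const uint8_t *data, const uint8_t *data_end)
{
    /* Bounds check before touching the header: a truncated frame is dropped.
     * The in-kernel eBPF verifier requires an equivalent check. */
    if (data_end - data < ETH_HEADER_LEN)
        return VERDICT_DROP;

    /* The EtherType is stored big-endian in bytes 12 and 13. */
    uint16_t ethertype = (uint16_t)((data[12] << 8) | data[13]);

    /* Anything that is not IPv4 is handed to the normal networking stack. */
    if (ethertype != ETHERTYPE_IPV4)
        return VERDICT_PASS;

    /* Assumption: IPv4 traffic is what this sketch "load balances". */
    return VERDICT_TX;
}
```

The pass-through branch reflects the point made in the quote: traffic the program does not handle still reaches the regular Linux networking stack, which is also what keeps standard tools such as tcpdump usable.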

While the specific nature of the Katran forwarding plane means that the library has a few limitations, which are outlined in the blog post, the developers hope that the open-source release will lead to rapid improvement of the library and more efficient load balancing for developers everywhere.

More information about Katran and the underlying technology can be found in the blog post and at the project’s repository.