The financial services company Robinhood has announced it is open-sourcing the distributed stream processing library Faust. According to the company, its scalable and reliable distributed systems led to the creation of Faust. Faust is designed to process large amounts of data in real time and simplify the design and deployment of complex streaming architectures.
Developers Ask Solem and Vineet Goel explained that the Python 3 library was inspired by Kafka Streams and leverages recent updates to the language, along with the new asyncio module, to provide high-performance asynchronous I/O.
“Unlike most stream processing frameworks, Faust does not use a DSL,” Solem and Goel wrote in a post. “Instead it provides stream processing as a Python library so you can reuse the tools you already use when stream processing. Anyone already familiar with Python programming will find it familiar and intuitive to use. We built Faust as a library that you can drop into any existing Python code, with support for all the libraries and frameworks that you like to use. Further, there is no need for resource managers such as Yarn or Mesos, deploy your application the way you already prefer.”
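To give a rough sense of what "stream processing as a Python library" can look like, here is a minimal sketch that uses only the standard library's asyncio module. It is not Faust's API and does not connect to Kafka. The names `event_source`, `count_large_orders`, and `run_demo`, along with the sample events and the threshold, are hypothetical and chosen for illustration. An async generator stands in for a topic, and the processing step is an ordinary coroutine that iterates over it with `async for`.

```python
import asyncio
from collections import Counter
from typing import AsyncIterator, Dict, Iterable, Tuple

# Hypothetical event shape for illustration: (account_id, order_amount).
Event = Tuple[str, float]


async def event_source(events: Iterable[Event]) -> AsyncIterator[Event]:
    """Stand-in for a message topic: yields events one at a time."""
    for event in events:
        # Hand control back to the event loop, as a real network read would.
        await asyncio.sleep(0)
        yield event


async def count_large_orders(
    stream: AsyncIterator[Event], threshold: float
) -> Dict[str, int]:
    """Count orders at or above the threshold per account, using plain Python."""
    counts: Counter = Counter()
    async for account_id, amount in stream:
        if amount >= threshold:
            counts[account_id] += 1
    return dict(counts)


def run_demo() -> Dict[str, int]:
    # Assumed sample data; a real deployment would read from a broker.
    events = [
        ("alice", 50.0),
        ("bob", 1200.0),
        ("alice", 5000.0),
        ("bob", 3000.0),
    ]
    return asyncio.run(count_large_orders(event_source(events), threshold=1000.0))


if __name__ == "__main__":
    print(run_demo())
```

The processing logic here is a normal loop with a normal `if` statement, so existing Python tools and libraries can be used inside it without learning a separate query language. A library such as Faust would additionally handle the broker connection, partitioning, and fault-tolerant state, none of which this sketch attempts.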
The developers say that they designed the tool to be easy to implement in any Python project, and they hope its succinct source code will be a valuable resource for developers who want to learn more about how systems like Kafka Streams operate.
The tool has spread rapidly through Robinhood, according to the developers, and the company’s teams have found uses for it, including:
- Risk and fraud detection
- Ad tracking
- Order execution quality monitoring
- Distributed streaming of databases across Robinhood
- Robinhood Feed (chat feed on the cryptocurrency pages)
- Event logging pipelines
- News aggregation and tagging
“We plan on adding a lot more features in the future,” the developers wrote. “The most interesting planned feature is the ‘exactly once’ semantics recently introduced by Kafka.”