Topic: chaos engineering

Learn to harness chaos to build resilient systems

Systems in production fail: nodes go down and networks become inaccessible. Chaos engineering is the practice of intentionally failing production infrastructure to see how resilient the system is. At this year’s ChaosConf, attendees will learn how to fail parts of their infrastructure, and the benefits of doing so, to see how their systems hold up, and to see where …

SD Times news digest: Gremlin’s integration with Spinnaker, Django 2.2, and MapR Data Platform update

Gremlin has announced a new integration with Spinnaker that will enable it to help companies automate chaos engineering. Gremlin is a company that specializes in helping companies implement chaos engineering, while Spinnaker is a continuous delivery platform first created by Netflix. According to Gremlin, chaos engineering can now be automated across platforms like AWS EC2, …

Industry Watch: Column as a service

As cloud services become more granular, more functionality can be had “as a service.” Two new interesting services caught my eye, and so I present this column as a service to you, dear readers. The first is “failure as a service,” which sounds counterintuitive. Wouldn’t people rather have success as a service? Most would, I’m …

Netflix details chaos engineering

Chaos engineering is not just for single servers anymore. These days, Netflix kicks entire regions of its servers offline, just to align priorities for developers. Casey Rosenthal, engineering manager for the traffic team and the chaos team at Netflix, explained the philosophies behind the company’s development and testing practices at the STARWEST testing …
