Observability helps developers and operators (“DevOps”) understand distributed systems: what’s slow, what’s broken, and what to change to improve performance. Managing and understanding multi-layered architectures takes more than traditional logs and infrastructure metrics.
In this guide, we cover:
  • Common observability challenges in distributed systems
  • Understanding telemetry data: logs, metrics, and traces
  • The “three pillars of observability”
  • Requirements for effective observability solutions
  • Managing observability with SLAs, SLOs, and SLIs