Why OpenTelemetry is driving a new wave of innovation on top of observability data

Published: September 29th, 2021

- Ramon Guiu

The last decade has brought a progressive transition from monolithic applications that run on static infrastructure to microservices that run on highly dynamic cloud-native infrastructure. This shift has led to the rapid emergence of lots of new technologies, frameworks, and architectures and a new set of monitoring and observability tools that give engineers full visibility into the health and performance of these new systems.

Visibility is essential to ensure that a system and its dependencies behave as expected and to identify and speed resolution of any issues that may arise. To that end, teams need to gather complete health and performance telemetry data (metrics, logs, and traces) from all those components. This is accomplished through instrumentation.

Why do we need OpenTelemetry?

For many years, there has been a wide variety of open-source and proprietary instrumentation tools, such as StatsD, Nagios plugins, Prometheus exporters, Datadog integrations, and New Relic agents. Unfortunately, despite the many open-source tools, the developer community and vendors have never aligned on specific instrumentation standards, such as StatsD. This makes interoperability a challenge.

The lack of instrumentation standards and interoperability has required every monitoring and observability tool to build its own collection of integrations to instrument the technologies developers use and need visibility into. For example, many monitoring tools have built their own integrations for widely used databases like MySQL, including the Prometheus MySQL Exporter, the Datadog MySQL integration, and the New Relic MySQL integration.

This is also true for application code instrumentation, where New Relic, Dynatrace, Datadog and other vendors have built complex agents that automatically instrument popular application frameworks and libraries. Developers spend years building instrumentation, and it requires a sizable investment to build a large enough catalog of integrations and maintain it as new versions of the technologies monitored are released. Not only is this a very inefficient use of global developer resources, it also creates vendor lock-in since you need to re-instrument your systems if you want to change your observability tool.

Finally, customers benefit most from innovation in what you can do with the collected data, not from innovation in the instrumentation itself. Yet new tools have had to make a large investment in instrumentation, the area that delivers little benefit to end users, before they could enter the market. This has created a big barrier to entry and has severely limited innovation in the space.

This is all about to dramatically change, thanks to OpenTelemetry: an emerging open-source standard that is democratizing instrumentation.

OpenTelemetry has already gained a lot of momentum, with support from all major observability vendors, cloud providers, and many end users contributing to the project. It has become the second most active CNCF project by number of contributions, behind only Kubernetes. (It was also recently accepted as a CNCF incubating project, which underscores its importance to engineering communities.)

Why is OpenTelemetry so popular?

OpenTelemetry approaches the instrumentation “problem” in a different way. Like other (usually proprietary) attempts, it provides a lot of out-of-the-box instrumentation for application frameworks and infrastructure components, as well as SDKs for developers to add their own instrumentation.

Unlike other instrumentation frameworks, OpenTelemetry covers metrics, traces, and logs, and it defines an API, semantic conventions, and a standard communication protocol (the OpenTelemetry protocol, or OTLP). Moreover, it is completely vendor agnostic, with a plugin architecture to export data to any backend.
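To make the plugin idea concrete, here is a minimal sketch in plain Python. It does not use the OpenTelemetry SDK: `Span`, `SpanExporter`, `ConsoleExporter`, `InMemoryExporter`, and `Tracer` are hypothetical names invented for illustration. It only shows the general shape of such a design, where instrumentation code talks to one API and backends are swapped by changing the list of exporters.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass
class Span:
    """A hypothetical, simplified span: one timed operation with attributes."""
    name: str
    duration_ms: float
    attributes: dict = field(default_factory=dict)


class SpanExporter(Protocol):
    """Anything with an export() method can act as a backend plugin."""
    def export(self, span: Span) -> None: ...


class ConsoleExporter:
    """Illustrative backend that prints each span."""
    def export(self, span: Span) -> None:
        print(f"{span.name} took {span.duration_ms} ms {span.attributes}")


class InMemoryExporter:
    """Illustrative backend that keeps spans in a list."""
    def __init__(self) -> None:
        self.spans = []

    def export(self, span: Span) -> None:
        self.spans.append(span)


class Tracer:
    """Hypothetical API called by instrumentation; it knows nothing about backends."""
    def __init__(self, exporters) -> None:
        self._exporters = list(exporters)

    def record(self, name: str, duration_ms: float,
               attributes: Optional[dict] = None) -> None:
        span = Span(name, duration_ms, attributes or {})
        # Every configured backend receives the same span.
        for exporter in self._exporters:
            exporter.export(span)


# The instrumentation call stays the same; only the exporter list changes.
memory = InMemoryExporter()
tracer = Tracer([ConsoleExporter(), memory])
# The attribute key is written in the dotted style of semantic conventions.
tracer.record("SELECT orders", 12.5, {"db.system": "mysql"})
```

Sending the same data to a second backend means adding one more exporter to the list; the `tracer.record(...)` call in the instrumented code does not change.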

Beyond that, OpenTelemetry’s goal is for developers who build technologies for others to use (e.g., application frameworks, databases, web servers, and service meshes) to bake instrumentation directly into the code they produce. This will make instrumentation readily available to anyone who uses the code in the future and avoid the need for another developer to learn the technology and figure out how to write instrumentation for it, which in some cases requires complex techniques like bytecode injection.

OpenTelemetry unlocks a lot of new value for all developers:

Interoperability. Analyze the entire flow of requests to your application as they go through your microservices, cloud services, and third-party SaaS in your observability tool of choice. Effortlessly send your observability data to a data warehouse to be analyzed alongside your business data. OpenTelemetry’s common API, data semantics, and protocol make all of the above, and more, possible out of the box.
Ubiquitous instrumentation. Thanks to a much larger community working together rather than in siloed, duplicative efforts, everyone benefits from the broadest, deepest, and highest-quality instrumentation available.
Future-proof. You can instrument your code once and use it anywhere since the vendor-agnostic approach enables you to send data to and run analysis in your backend of choice. Before OpenTelemetry, changing observability backends typically required a time-consuming reinstrumentation of your system.
Lower resource footprint. More and more instrumentation is directly baked into frameworks and technologies instead of injected, resulting in reduced CPU and memory utilization.
Improved uptime. With OpenTelemetry’s shared metadata, observability tools deliver better correlation between metrics, traces, and logs, so you troubleshoot and resolve production problems faster.
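The correlation benefit can be illustrated with a small sketch. It assumes that logs and spans carry the same trace identifier; the field names `trace_id`, `service`, `duration_ms`, and `message` are hypothetical and not taken from any OpenTelemetry schema. With a shared key, relating the two signals becomes a simple lookup.

```python
from collections import defaultdict

# Hypothetical telemetry records; in practice these would come from a backend.
spans = [
    {"trace_id": "a1", "service": "checkout", "duration_ms": 840},
    {"trace_id": "b2", "service": "checkout", "duration_ms": 35},
]
logs = [
    {"trace_id": "a1", "message": "payment gateway timeout"},
    {"trace_id": "a1", "message": "retrying request"},
    {"trace_id": "b2", "message": "order placed"},
]


def logs_by_trace(log_records):
    """Index log messages by the trace they belong to."""
    index = defaultdict(list)
    for record in log_records:
        index[record["trace_id"]].append(record["message"])
    return index


def slow_spans_with_logs(span_records, log_records, threshold_ms):
    """Return (span, related log messages) for spans slower than threshold_ms."""
    index = logs_by_trace(log_records)
    return [
        (span, index.get(span["trace_id"], []))
        for span in span_records
        if span["duration_ms"] > threshold_ms
    ]


for span, messages in slow_spans_with_logs(spans, logs, threshold_ms=500):
    print(span["service"], span["duration_ms"], messages)
```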

More importantly, companies no longer have to devote time, people, and money to developing their own product-specific instrumentation and can focus on improving developer experience. With access to a broad, deep, and high-quality observability data set of metrics, traces, and logs, and with no multi-million dollar investment in instrumentation, a new wave of solutions that leverage observability data is about to arrive.

Let’s look at some examples of what OpenTelemetry is already enabling developers to do, and will continue to enable:

AWS is embedding OpenTelemetry instrumentation across their services. For example, they have released automatic trace instrumentation for Java Lambda functions with no code changes. This gives developers immediate visibility into the performance of their Java code and enables them to send any collected data to their backend of choice. As a result, they’re not tied to a specific vendor and can send the data to multiple backends to solve for different use cases.
Kubernetes and the popular GraphQL Apollo Server have added initial OpenTelemetry tracing instrumentation to their code. This provides efficient out-of-the-box instrumentation that’s directly embedded in the code through the Go and JavaScript OpenTelemetry libraries, and the instrumentation is written by the experts that have built those technologies.
Jenkins, the open-source CI/CD server, offers an OpenTelemetry plugin to monitor and troubleshoot jobs using distributed tracing. This gives developers visibility into where time in jobs is spent and where errors are occurring to help troubleshoot and improve those jobs.
Rookout, a debugger for cloud-native applications, has integrated OpenTelemetry traces to provide additional context within the debugger itself. This helps developers understand the entire flow of the request traversing the code they are troubleshooting, with additional context from tags in the OpenTelemetry data.
Promscale lets developers store their OpenTelemetry trace data inside Postgres via OTLP. They can then use powerful SQL queries to analyze their traces and correlate them with other business data that’s stored in Postgres. For example, if you develop a SaaS service that uses a database, you could analyze database query response time by customer ARR band to ensure that your most valuable customers (who are most likely to suffer from bad query performance, since they store more data in your application) are seeing the best possible performance with your product.
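As a rough sketch of that kind of analysis, the following uses Python’s built-in `sqlite3` module as a stand-in for Postgres. The `spans` and `customers` tables, their columns, and the ARR bands are all hypothetical; this is not the Promscale schema, only an illustration of joining trace data with business data in SQL.

```python
import sqlite3

# In-memory database standing in for Postgres (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, arr_band TEXT);
    CREATE TABLE spans (customer_id INTEGER, name TEXT, duration_ms REAL);
    INSERT INTO customers VALUES (1, 'under 10k'), (2, 'over 100k');
    INSERT INTO spans VALUES
        (1, 'db.query', 10.0),
        (1, 'db.query', 20.0),
        (2, 'db.query', 100.0),
        (2, 'db.query', 300.0);
""")

# Average database query duration per ARR band, slowest band first.
rows = conn.execute("""
    SELECT c.arr_band, AVG(s.duration_ms) AS avg_ms
    FROM spans AS s
    JOIN customers AS c ON c.id = s.customer_id
    WHERE s.name = 'db.query'
    GROUP BY c.arr_band
    ORDER BY avg_ms DESC
""").fetchall()

for arr_band, avg_ms in rows:
    print(arr_band, avg_ms)
```

In this made-up data set the higher ARR band shows the slower average query time, which is the kind of signal the example above is meant to surface.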

OpenTelemetry is still being (very!) actively developed, so this is just the beginning. While many of the above products and projects will improve the lives of engineers who operate production environments, there is a greenfield of possibilities. With interoperability and ubiquitous instrumentation, there’s massive potential for existing companies to improve their products or develop new tools, and for upstarts and entrepreneurs to leverage OpenTelemetry instrumentation to solve new problems or to tackle existing problems with innovative approaches.

Learn more about OpenTelemetry at KubeCon + CloudNativeCon Oct. 11-15.

Article Tags

data, OpenTelemetry

About Ramon Guiu

Ramon Guiu is VP of Observability at Timescale, a Cloud Native Computing Foundation member.

