OpenTelemetry Metrics

What are metrics?

Metrics are used to measure, monitor, and compare performance. For example, you can measure server response time, memory utilization, error rate, and more.

OpenTelemetry specifies how to collect, aggregate, and send metrics to backend platforms. Using OpenTelemetry instruments, you can create counter, gauge, and histogram metrics.

Instruments

You capture measurements by creating instruments that have:

  • A unique name, for example, http.server.duration.
  • An instrument kind, for example, Counter.
  • An optional unit of measure, for example, milliseconds or bytes.
  • An optional description.

Instruments can be synchronous or asynchronous, additive (summable numbers) or grouping (histograms or non-summable numbers). Additive instruments that measure non-decreasing numbers are also called monotonic.

A single instrument can produce multiple timeseries. A timeseries is a metric with a unique set of attributes. For example, each host has a separate timeseries for the same metric name.
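To make the relationship between one instrument and its timeseries concrete, here is a minimal stdlib-only sketch. It is not the OpenTelemetry API: the SketchCounter class, its add method, and the host names are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class SketchCounter:
    """Toy counter: one instrument, one timeseries per unique attribute set."""

    def __init__(self, name, description=""):
        self.name = name
        self.description = description
        # Maps a sorted tuple of attribute pairs to that timeseries' running total.
        self.timeseries = defaultdict(int)

    def add(self, value, attributes):
        # Sort the items so attribute order does not create a new timeseries.
        key = tuple(sorted(attributes.items()))
        self.timeseries[key] += value


requests = SketchCounter("http.server.requests", description="Processed requests")
requests.add(1, {"host.name": "host-a"})
requests.add(1, {"host.name": "host-b"})
requests.add(1, {"host.name": "host-a"})

# One metric name, two timeseries: one per host.
for attrs, total in sorted(requests.timeseries.items()):
    print(requests.name, dict(attrs), total)
```

The instrument is created once, and each distinct combination of attribute values seen by add becomes its own timeseries.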

Additive instruments

Additive or summable instruments produce timeseries that, when added together, produce another meaningful and accurate timeseries.

For example, http.server.requests is an additive timeseries, because you can sum the number of requests from different hosts to get the total number of requests.

But system.memory.utilization (percent) is not additive, because the sum of memory utilization from different hosts is meaningless: 90% + 90% = 180%, which is not a valid utilization.
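The arithmetic behind this distinction can be checked with a short sketch. The per-host numbers and the field names (requests, memory_used, memory_total) are made up for illustration and are not taken from any real system.

```python
# Hypothetical per-host samples; the numbers are made up for illustration.
hosts = {
    "host-a": {"requests": 120, "memory_used": 9 * 2**30, "memory_total": 10 * 2**30},
    "host-b": {"requests": 80, "memory_used": 9 * 2**30, "memory_total": 10 * 2**30},
}

# Additive: summing per-host request counts gives a valid total.
total_requests = sum(h["requests"] for h in hosts.values())

# Not additive: summing per-host utilization gives about 1.8 (180%), which is meaningless.
naive_utilization = sum(h["memory_used"] / h["memory_total"] for h in hosts.values())

# A combined utilization has to be derived from the additive byte counts instead.
fleet_utilization = (
    sum(h["memory_used"] for h in hosts.values())
    / sum(h["memory_total"] for h in hosts.values())
)

print(total_requests, naive_utilization, fleet_utilization)
```

Request counts and byte counts can be summed directly, while a ratio such as utilization must be recomputed from its summable parts.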

Synchronous instruments

Synchronous instruments are invoked together with the operations they measure. For example, to measure the number of requests, you can call counter.Add(ctx, 1) whenever there is a new request. Synchronous measurements can have an associated execution context.

For synchronous instruments, the difference between additive and grouping instruments is that additive instruments produce summable timeseries and grouping instruments produce a histogram.

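Instrument kind | Properties | Aggregation       | Example
--------------- | ---------- | ----------------- | --------------------------------
Counter         | monotonic  | sum -> delta      | number of requests, request size
UpDownCounter   | additive   | last value -> sum | number of connections
Histogram       | grouping   | histogram         | request duration, request size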

Asynchronous instruments

Asynchronous instruments (observers) periodically invoke a callback function to collect measurements. For example, you can use observers to periodically measure memory or CPU usage. Asynchronous measurements can't have an associated context.

When choosing between UpDownCounterObserver (additive) and GaugeObserver (grouping), choose UpDownCounterObserver for summable timeseries and GaugeObserver otherwise. For example, to measure system.memory.usage (bytes), you should use UpDownCounterObserver. But to measure system.memory.utilization (percent), you should use GaugeObserver.

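Instrument kind       | Properties | Aggregation       | Example
--------------------- | ---------- | ----------------- | ----------------------
CounterObserver       | monotonic  | sum -> delta      | CPU time
UpDownCounterObserver | additive   | last value -> sum | Memory usage (bytes)
GaugeObserver         | grouping   | last value        | Memory utilization (%)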
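The callback model can be sketched without any library. In the example below, SketchObserver, its collect method, and the memory figures are hypothetical. A real SDK would schedule collection itself, whereas here the caller triggers it by hand.

```python
class SketchObserver:
    """Toy asynchronous instrument: stores a callback, measures only on collect()."""

    def __init__(self, name, callback):
        self.name = name
        self.callback = callback

    def collect(self):
        # A real SDK would call this periodically; here the caller drives it.
        return self.callback()


# Hypothetical memory figures; a real callback would query the OS or runtime.
state = {"used": 6 * 2**30, "total": 8 * 2**30}

# Summable value in bytes: the UpDownCounterObserver case.
usage = SketchObserver("system.memory.usage", lambda: state["used"])
# Non-summable ratio: the GaugeObserver case.
utilization = SketchObserver(
    "system.memory.utilization", lambda: state["used"] / state["total"]
)

for observer in (usage, utilization):
    print(observer.name, observer.collect())
```

The application code never calls the instrument when memory changes. The value is read only at collection time, which is why no per-operation context is available.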

Counter

synchronous monotonic

Counter is a synchronous instrument that measures additive non-decreasing values, for example, the number of:

  • processed requests
  • received bytes
  • disk reads

For Counter timeseries, backends usually compute deltas and display rate values. For example, per_min(http.server.requests) returns the number of processed requests per minute.
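As a rough illustration of how a backend might turn cumulative counter values into a per-minute rate, consider the sketch below. The rate_per_minute function and the sample data are hypothetical and do not describe how any particular backend implements per_min. Counter resets are ignored for brevity.

```python
def rate_per_minute(samples):
    """Turn cumulative (timestamp_seconds, total) samples into per-minute rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        delta = v1 - v0  # non-negative as long as the counter is monotonic
        rates.append(delta * 60 / (t1 - t0))
    return rates


# Hypothetical cumulative request totals sampled every 30 seconds.
samples = [(0, 100), (30, 130), (60, 190), (90, 190)]
print(rate_per_minute(samples))  # [60.0, 120.0, 0.0]
```

Each rate comes from the difference between two consecutive totals, so the absolute value of the counter does not matter, only how fast it grows.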

CounterObserver

asynchronous monotonic

CounterObserver is the asynchronous version of the Counter instrument.

UpDownCounter

synchronous additive

UpDownCounter is a synchronous instrument which measures additive values that increase or decrease with time, for example, the number of:

  • active requests
  • open connections
  • memory in use (megabytes)

For additive non-decreasing values you should use Counter or CounterObserver.

For UpDownCounter timeseries, backends usually display the last value, but different timeseries can be added together. For example, go.sql.connections_open returns the total number of open connections, and go.sql.connections_open{service.name = myservice} returns the number of open connections for one service.
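The following sketch shows what summing the last values of several timeseries, with and without an attribute filter, could look like. The sum_last_values helper, the host names, and the connection counts are hypothetical. Real backends expose this through their own query languages.

```python
# Hypothetical last values of go.sql.connections_open, one per timeseries.
last_values = [
    ({"service.name": "myservice", "host.name": "host-a"}, 4),
    ({"service.name": "myservice", "host.name": "host-b"}, 6),
    ({"service.name": "billing", "host.name": "host-c"}, 3),
]


def sum_last_values(series, where=None):
    """Sum the last value of every timeseries whose attributes match `where`."""
    where = where or {}
    return sum(
        value
        for attrs, value in series
        if all(attrs.get(key) == expected for key, expected in where.items())
    )


total_open = sum_last_values(last_values)
myservice_open = sum_last_values(last_values, {"service.name": "myservice"})
print(total_open, myservice_open)  # 13 10
```

The sum is meaningful only because the underlying values are additive. The same operation on a utilization percentage would not be.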

UpDownCounterObserver

asynchronous additive

UpDownCounterObserver is the asynchronous version of the UpDownCounter instrument.

Histogram

synchronous grouping

Histogram is a synchronous instrument that produces a histogram from recorded values, for example:

  • request latency
  • request size

For Histogram timeseries, backends usually display percentiles, a histogram, or a heatmap.
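To show how recorded values become buckets and how a percentile can be estimated from them, here is a minimal sketch. The bucket bounds, latency values, and function names are assumptions chosen for illustration. Actual SDKs and backends choose their own boundaries and interpolation methods.

```python
from bisect import bisect_left

# Hypothetical upper bucket bounds in milliseconds; the last bucket is open-ended.
BOUNDS = [10, 50, 100, 500]


def record(counts, value):
    """Increment the bucket whose upper bound is the first one >= value."""
    counts[bisect_left(BOUNDS, value)] += 1


def percentile_upper_bound(counts, q):
    """Return the upper bound of the bucket holding the q-th quantile (0 < q <= 1)."""
    target = q * sum(counts)
    seen = 0
    for i, count in enumerate(counts):
        seen += count
        if seen >= target:
            return BOUNDS[i] if i < len(BOUNDS) else float("inf")
    return float("inf")


counts = [0] * (len(BOUNDS) + 1)
for latency_ms in [3, 8, 12, 40, 45, 70, 90, 120, 300, 900]:
    record(counts, latency_ms)

print(counts)                                # [2, 3, 2, 2, 1]
print(percentile_upper_bound(counts, 0.5))   # 50
print(percentile_upper_bound(counts, 0.9))   # 500
```

Only the bucket counts are kept, not the individual values, so the percentile is an estimate bounded by the bucket edges rather than an exact figure.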

GaugeObserver

asynchronous grouping

GaugeObserver is an asynchronous instrument that measures non-additive values, for which a sum does not produce a meaningful or correct result, for example:

  • error rate
  • memory utilization
  • cache hit rate

For GaugeObserver timeseries, backends usually display the last value and don't allow summing different timeseries.

What's next?

Next, learn about the OpenTelemetry Metrics API for your programming language: