mirror of https://github.com/prometheus/prometheus
Import querying documentation from prometheus/docs
parent
299802dfd0
commit
e6cdc2d355
|
@ -1,5 +1,6 @@
|
||||||
---
|
---
|
||||||
title: Configuration
|
title: Configuration
|
||||||
|
sort_rank: 3
|
||||||
---
|
---
|
||||||
|
|
||||||
# Configuration
|
# Configuration
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
---
|
---
|
||||||
title: Getting started
|
title: Getting started
|
||||||
sort_rank: 10
|
sort_rank: 1
|
||||||
---
|
---
|
||||||
|
|
||||||
# Getting started
|
# Getting started
|
||||||
|
|
|
@ -14,3 +14,4 @@ The documentation is available alongside all the project documentation at
|
||||||
- [Installing](install.md)
|
- [Installing](install.md)
|
||||||
- [Getting started](getting_started.md)
|
- [Getting started](getting_started.md)
|
||||||
- [Configuration](configuration.md)
|
- [Configuration](configuration.md)
|
||||||
|
- [Querying](querying/basics.md)
|
||||||
|
|
|
@ -1,8 +1,9 @@
|
||||||
---
|
---
|
||||||
title: Installing
|
title: Installation
|
||||||
|
sort_rank: 2
|
||||||
---
|
---
|
||||||
|
|
||||||
# Installing
|
# Installation
|
||||||
|
|
||||||
## Using pre-compiled binaries
|
## Using pre-compiled binaries
|
||||||
|
|
||||||
|
|
|
@ -0,0 +1,417 @@
|
||||||
|
---
|
||||||
|
title: HTTP API
|
||||||
|
sort_rank: 7
|
||||||
|
---
|
||||||
|
|
||||||
|
# HTTP API
|
||||||
|
|
||||||
|
The current stable HTTP API is reachable under `/api/v1` on a Prometheus
|
||||||
|
server. Any non-breaking additions will be added under that endpoint.
|
||||||
|
|
||||||
|
## Format overview
|
||||||
|
|
||||||
|
The API response format is JSON. Every successful API request returns a `2xx`
|
||||||
|
status code.
|
||||||
|
|
||||||
|
Invalid requests that reach the API handlers return a JSON error object
|
||||||
|
and one of the following HTTP response codes:
|
||||||
|
|
||||||
|
- `400 Bad Request` when parameters are missing or incorrect.
|
||||||
|
- `422 Unprocessable Entity` when an expression can't be executed
|
||||||
|
([RFC4918](http://tools.ietf.org/html/rfc4918#page-78)).
|
||||||
|
- `503 Service Unavailable` when queries time out or abort.
|
||||||
|
|
||||||
|
Other non-`2xx` codes may be returned for errors occurring before the API
|
||||||
|
endpoint is reached.
|
||||||
|
|
||||||
|
The JSON response envelope format is as follows:
|
||||||
|
|
||||||
|
```
|
||||||
|
{
|
||||||
|
"status": "success" | "error",
|
||||||
|
"data": <data>,
|
||||||
|
|
||||||
|
// Only set if status is "error". The data field may still hold
|
||||||
|
// additional data.
|
||||||
|
"errorType": "<string>",
|
||||||
|
"error": "<string>"
|
||||||
|
}
|
||||||
|
```
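A client typically unwraps this envelope before it touches the payload. The following is a minimal Python sketch of that step, assuming the response body has already been read as a string. The names `unwrap_envelope` and `PrometheusAPIError` are hypothetical and do not belong to any Prometheus client library.

```python
import json


class PrometheusAPIError(Exception):
    """Hypothetical exception carrying the envelope's errorType and error."""

    def __init__(self, error_type, message):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        self.message = message


def unwrap_envelope(body):
    """Parse a response body and return its "data" field, or raise on error."""
    envelope = json.loads(body)
    if envelope.get("status") != "success":
        raise PrometheusAPIError(
            envelope.get("errorType", "unknown"),
            envelope.get("error", ""),
        )
    return envelope.get("data")


# Illustrative body shaped like the envelope described above.
ok_body = '{"status": "success", "data": ["node", "prometheus"]}'
labels = unwrap_envelope(ok_body)
```

Raising on any non-`success` status keeps error handling in one place, although the `data` field of an error response is discarded in this sketch.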
|
||||||
|
|
||||||
|
Input timestamps may be provided either in
|
||||||
|
[RFC3339](https://www.ietf.org/rfc/rfc3339.txt) format or as a Unix timestamp
|
||||||
|
in seconds, with optional decimal places for sub-second precision. Output
|
||||||
|
timestamps are always represented as Unix timestamps in seconds.
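As a sketch of how a client might convert between the two accepted input formats, the following Python helpers translate an RFC3339 timestamp into Unix seconds and back. The function names are hypothetical, and the helpers assume UTC timestamps that end in `Z`.

```python
from datetime import datetime, timezone


def to_unix_seconds(rfc3339):
    """Convert an RFC3339 UTC timestamp ending in "Z" to Unix seconds."""
    # Older Python versions of datetime.fromisoformat() reject a trailing "Z",
    # so it is rewritten as an explicit UTC offset first.
    parsed = datetime.fromisoformat(rfc3339.replace("Z", "+00:00"))
    return parsed.timestamp()


def to_rfc3339(unix_seconds):
    """Convert Unix seconds to an RFC3339 UTC timestamp with millisecond precision."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


unix_time = to_unix_seconds("2015-07-01T20:10:51.781Z")
```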
|
||||||
|
|
||||||
|
Names of query parameters that may be repeated end with `[]`.
|
||||||
|
|
||||||
|
`<series_selector>` placeholders refer to Prometheus [time series
|
||||||
|
selectors](basics.md#time-series-selectors) like `http_requests_total` or
|
||||||
|
`http_requests_total{method=~"^GET|POST$"}` and need to be URL-encoded.
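The URL encoding of selectors can be delegated to a standard library. This Python sketch builds a request URL with repeated `match[]` parameters; the helper name `series_url` is hypothetical, and the base URL is only an example.

```python
from urllib.parse import urlencode


def series_url(base_url, selectors):
    """Build a URL with one URL-encoded match[] parameter per selector."""
    # A list of pairs lets the same parameter name appear more than once.
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url}/api/v1/series?{query}"


url = series_url(
    "http://localhost:9090",
    ["up", 'process_start_time_seconds{job="prometheus"}'],
)
```

Both the brackets in the parameter name and the braces, quotes, and equals signs in the selector are percent-encoded by `urlencode`.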
|
||||||
|
|
||||||
|
`<duration>` placeholders refer to Prometheus duration strings of the form
|
||||||
|
`[0-9]+[smhdwy]`. For example, `5m` refers to a duration of 5 minutes.
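A duration string of this form can be converted to seconds with a small parser. The following Python sketch is illustrative only: `parse_duration` is a hypothetical helper, and treating a week as 7 days and a year as 365 days is an assumption of this sketch.

```python
import re

# Seconds per unit. The week and year lengths are assumptions of this sketch.
_UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 60 * 60,
    "d": 24 * 60 * 60,
    "w": 7 * 24 * 60 * 60,
    "y": 365 * 24 * 60 * 60,
}
_DURATION_RE = re.compile(r"([0-9]+)([smhdwy])")


def parse_duration(text):
    """Return the number of seconds described by a string such as "5m"."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```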
|
||||||
|
|
||||||
|
## Expression queries
|
||||||
|
|
||||||
|
Query language expressions may be evaluated at a single instant or over a range
|
||||||
|
of time. The sections below describe the API endpoints for each type of
|
||||||
|
expression query.
|
||||||
|
|
||||||
|
### Instant queries
|
||||||
|
|
||||||
|
The following endpoint evaluates an instant query at a single point in time:
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/query
|
||||||
|
```
|
||||||
|
|
||||||
|
URL query parameters:
|
||||||
|
|
||||||
|
- `query=<string>`: Prometheus expression query string.
|
||||||
|
- `time=<rfc3339 | unix_timestamp>`: Evaluation timestamp. Optional.
|
||||||
|
- `timeout=<duration>`: Evaluation timeout. Optional. Defaults to and
|
||||||
|
is capped by the value of the `-query.timeout` flag.
|
||||||
|
|
||||||
|
The current server time is used if the `time` parameter is omitted.
|
||||||
|
|
||||||
|
The `data` section of the query result has the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
{
|
||||||
|
"resultType": "matrix" | "vector" | "scalar" | "string",
|
||||||
|
"result": <value>
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`<value>` refers to the query result data, which has varying formats
|
||||||
|
depending on the `resultType`. See the [expression query result
|
||||||
|
formats](#expression-query-result-formats).
|
||||||
|
|
||||||
|
The following example evaluates the expression `up` at the time
|
||||||
|
`2015-07-01T20:10:51.781Z`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
$ curl 'http://localhost:9090/api/v1/query?query=up&time=2015-07-01T20:10:51.781Z'
|
||||||
|
{
|
||||||
|
"status" : "success",
|
||||||
|
"data" : {
|
||||||
|
"resultType" : "vector",
|
||||||
|
"result" : [
|
||||||
|
{
|
||||||
|
"metric" : {
|
||||||
|
"__name__" : "up",
|
||||||
|
"job" : "prometheus",
|
||||||
|
"instance" : "localhost:9090"
|
||||||
|
},
|
||||||
|
"value": [ 1435781451.781, "1" ]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metric" : {
|
||||||
|
"__name__" : "up",
|
||||||
|
"job" : "node",
|
||||||
|
"instance" : "localhost:9100"
|
||||||
|
},
|
||||||
|
"value" : [ 1435781451.781, "0" ]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
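A response like the one above could be consumed as follows. This Python sketch parses the same JSON from a string rather than from a live server; `vector_to_dict` is a hypothetical helper, and keying the result by the `job` and `instance` labels is a choice made for this example.

```python
import json

# The response body from the example above, condensed.
response_body = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
       "value": [1435781451.781, "1"]},
      {"metric": {"__name__": "up", "job": "node", "instance": "localhost:9100"},
       "value": [1435781451.781, "0"]}
    ]
  }
}
"""


def vector_to_dict(data):
    """Map each (job, instance) pair of an instant-vector result to its float value."""
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']!r}")
    values = {}
    for element in data["result"]:
        labels = element["metric"]
        _timestamp, raw_value = element["value"]
        # Sample values arrive as JSON strings and are converted here.
        values[(labels.get("job"), labels.get("instance"))] = float(raw_value)
    return values


up_by_target = vector_to_dict(json.loads(response_body)["data"])
down_targets = sorted(key for key, value in up_by_target.items() if value == 0.0)
```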
|
||||||
|
|
||||||
|
### Range queries
|
||||||
|
|
||||||
|
The following endpoint evaluates an expression query over a range of time:
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/query_range
|
||||||
|
```
|
||||||
|
|
||||||
|
URL query parameters:
|
||||||
|
|
||||||
|
- `query=<string>`: Prometheus expression query string.
|
||||||
|
- `start=<rfc3339 | unix_timestamp>`: Start timestamp.
|
||||||
|
- `end=<rfc3339 | unix_timestamp>`: End timestamp.
|
||||||
|
- `step=<duration>`: Query resolution step width.
|
||||||
|
- `timeout=<duration>`: Evaluation timeout. Optional. Defaults to and
|
||||||
|
is capped by the value of the `-query.timeout` flag.
|
||||||
|
|
||||||
|
The `data` section of the query result has the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
{
|
||||||
|
"resultType": "matrix",
|
||||||
|
"result": <value>
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
For the format of the `<value>` placeholder, see the [range-vector result
|
||||||
|
format](#range-vectors).
|
||||||
|
|
||||||
|
The following example evaluates the expression `up` over a 30-second range with
|
||||||
|
a query resolution of 15 seconds.
|
||||||
|
|
||||||
|
```json
|
||||||
|
$ curl 'http://localhost:9090/api/v1/query_range?query=up&start=2015-07-01T20:10:30.781Z&end=2015-07-01T20:11:00.781Z&step=15s'
|
||||||
|
{
|
||||||
|
"status" : "success",
|
||||||
|
"data" : {
|
||||||
|
"resultType" : "matrix",
|
||||||
|
"result" : [
|
||||||
|
{
|
||||||
|
"metric" : {
|
||||||
|
"__name__" : "up",
|
||||||
|
"job" : "prometheus",
|
||||||
|
"instance" : "localhost:9090"
|
||||||
|
},
|
||||||
|
"values" : [
|
||||||
|
[ 1435781430.781, "1" ],
|
||||||
|
[ 1435781445.781, "1" ],
|
||||||
|
[ 1435781460.781, "1" ]
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metric" : {
|
||||||
|
"__name__" : "up",
|
||||||
|
"job" : "node",
|
||||||
|
"instance" : "localhost:9091"
|
||||||
|
},
|
||||||
|
"values" : [
|
||||||
|
[ 1435781430.781, "0" ],
|
||||||
|
[ 1435781445.781, "0" ],
|
||||||
|
[ 1435781460.781, "1" ]
|
||||||
|
]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
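The timestamps in this response follow from the `start`, `end`, and `step` parameters. The Python sketch below reproduces them under the assumption, inferred from the example above, that evaluation happens at `start` and then at every `step` up to and including `end`. The helper name is hypothetical, and integer milliseconds are used to avoid floating-point drift.

```python
def evaluation_times_ms(start_ms, end_ms, step_ms):
    """Return the evaluation timestamps (in milliseconds) of a range query."""
    if step_ms <= 0:
        raise ValueError("step must be positive")
    # range() excludes its stop value, so one millisecond is added to keep
    # an evaluation point that falls exactly on the end timestamp.
    return list(range(start_ms, end_ms + 1, step_ms))


# start=2015-07-01T20:10:30.781Z, end=2015-07-01T20:11:00.781Z, step=15s
times = evaluation_times_ms(1435781430781, 1435781460781, 15000)
```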
|
||||||
|
|
||||||
|
## Querying metadata
|
||||||
|
|
||||||
|
### Finding series by label matchers
|
||||||
|
|
||||||
|
The following endpoint returns the list of time series that match a certain label set.
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/series
|
||||||
|
```
|
||||||
|
|
||||||
|
URL query parameters:
|
||||||
|
|
||||||
|
- `match[]=<series_selector>`: Repeated series selector argument that selects the
|
||||||
|
series to return. At least one `match[]` argument must be provided.
|
||||||
|
- `start=<rfc3339 | unix_timestamp>`: Start timestamp.
|
||||||
|
- `end=<rfc3339 | unix_timestamp>`: End timestamp.
|
||||||
|
|
||||||
|
The `data` section of the query result consists of a list of objects that
|
||||||
|
contain the label name/value pairs which identify each series.
|
||||||
|
|
||||||
|
The following example returns all series that match either of the selectors
|
||||||
|
`up` or `process_start_time_seconds{job="prometheus"}`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
$ curl -g 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_start_time_seconds{job="prometheus"}'
|
||||||
|
{
|
||||||
|
"status" : "success",
|
||||||
|
"data" : [
|
||||||
|
{
|
||||||
|
"__name__" : "up",
|
||||||
|
"job" : "prometheus",
|
||||||
|
"instance" : "localhost:9090"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"__name__" : "up",
|
||||||
|
"job" : "node",
|
||||||
|
"instance" : "localhost:9091"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"__name__" : "process_start_time_seconds",
|
||||||
|
"job" : "prometheus",
|
||||||
|
"instance" : "localhost:9090"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Querying label values
|
||||||
|
|
||||||
|
The following endpoint returns a list of label values for a provided label name:
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/label/<label_name>/values
|
||||||
|
```
|
||||||
|
|
||||||
|
The `data` section of the JSON response is a list of string label values.
|
||||||
|
|
||||||
|
This example queries for all label values for the `job` label:
|
||||||
|
|
||||||
|
```json
|
||||||
|
$ curl http://localhost:9090/api/v1/label/job/values
|
||||||
|
{
|
||||||
|
"status" : "success",
|
||||||
|
"data" : [
|
||||||
|
"node",
|
||||||
|
"prometheus"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Deleting series
|
||||||
|
|
||||||
|
The following endpoint deletes matched series entirely from a Prometheus server:
|
||||||
|
|
||||||
|
```
|
||||||
|
DELETE /api/v1/series
|
||||||
|
```
|
||||||
|
|
||||||
|
URL query parameters:
|
||||||
|
|
||||||
|
- `match[]=<series_selector>`: Repeated label matcher argument that selects the
|
||||||
|
series to delete. At least one `match[]` argument must be provided.
|
||||||
|
|
||||||
|
The `data` section of the JSON response has the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
{
|
||||||
|
"numDeleted": <number of deleted series>
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The following example deletes all series that match either of the selectors
|
||||||
|
`up` or `process_start_time_seconds{job="prometheus"}`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
$ curl -XDELETE -g 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_start_time_seconds{job="prometheus"}'
|
||||||
|
{
|
||||||
|
"status" : "success",
|
||||||
|
"data" : {
|
||||||
|
"numDeleted" : 3
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Expression query result formats
|
||||||
|
|
||||||
|
Expression queries may return the following response values in the `result`
|
||||||
|
property of the `data` section. `<sample_value>` placeholders are numeric
|
||||||
|
sample values. JSON does not support special float values such as `NaN`, `Inf`,
|
||||||
|
and `-Inf`, so sample values are transferred as quoted JSON strings rather than
|
||||||
|
raw numbers.
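On the client side, the quoted sample values have to be converted back to numbers. The following Python sketch relies on the built-in `float()`, which accepts spellings such as `NaN`, `+Inf`, and `-Inf`. The exact spellings a server emits are an assumption here, and the sample pairs are made up for illustration.

```python
import math


def parse_sample_value(raw):
    """Convert a quoted sample value to a float, including NaN and infinities."""
    return float(raw)


# Illustrative [ <unix_time>, "<sample_value>" ] pairs.
pairs = [
    [1435781451.781, "1"],
    [1435781466.781, "NaN"],
    [1435781481.781, "+Inf"],
]

# Keep only the samples whose value is an ordinary finite number.
finite = [
    (timestamp, parse_sample_value(raw))
    for timestamp, raw in pairs
    if math.isfinite(parse_sample_value(raw))
]
```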
|
||||||
|
|
||||||
|
### Range vectors
|
||||||
|
|
||||||
|
Range vectors are returned as result type `matrix`. The corresponding
|
||||||
|
`result` property has the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"metric": { "<label_name>": "<label_value>", ... },
|
||||||
|
"values": [ [ <unix_time>, "<sample_value>" ], ... ]
|
||||||
|
},
|
||||||
|
...
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Instant vectors
|
||||||
|
|
||||||
|
Instant vectors are returned as result type `vector`. The corresponding
|
||||||
|
`result` property has the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"metric": { "<label_name>": "<label_value>", ... },
|
||||||
|
"value": [ <unix_time>, "<sample_value>" ]
|
||||||
|
},
|
||||||
|
...
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Scalars
|
||||||
|
|
||||||
|
Scalar results are returned as result type `scalar`. The corresponding
|
||||||
|
`result` property has the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
[ <unix_time>, "<scalar_value>" ]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Strings
|
||||||
|
|
||||||
|
String results are returned as result type `string`. The corresponding
|
||||||
|
`result` property has the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
[ <unix_time>, "<string_value>" ]
|
||||||
|
```
|
||||||
|
|
||||||
|
## Targets
|
||||||
|
|
||||||
|
> This API is experimental as it is intended to be extended with targets
|
||||||
|
> dropped due to relabelling in the future.
|
||||||
|
|
||||||
|
The following endpoint returns an overview of the current state of the
|
||||||
|
Prometheus target discovery:
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/targets
|
||||||
|
```
|
||||||
|
|
||||||
|
Currently only the active targets are part of the response.
|
||||||
|
|
||||||
|
```json
|
||||||
|
$ curl http://localhost:9090/api/v1/targets
|
||||||
|
{
|
||||||
|
"status": "success", [3/11]
|
||||||
|
"data": {
|
||||||
|
"activeTargets": [
|
||||||
|
{
|
||||||
|
"discoveredLabels": {
|
||||||
|
"__address__": "127.0.0.1:9090",
|
||||||
|
"__metrics_path__": "/metrics",
|
||||||
|
"__scheme__": "http",
|
||||||
|
"job": "prometheus"
|
||||||
|
},
|
||||||
|
"labels": {
|
||||||
|
"instance": "127.0.0.1:9090",
|
||||||
|
"job": "prometheus"
|
||||||
|
},
|
||||||
|
"scrapeUrl": "http://127.0.0.1:9090/metrics",
|
||||||
|
"lastError": "",
|
||||||
|
"lastScrape": "2017-01-17T15:07:44.723715405+01:00",
|
||||||
|
"health": "up"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Alertmanagers
|
||||||
|
|
||||||
|
> This API is experimental as it is intended to be extended with Alertmanagers
|
||||||
|
> dropped due to relabelling in the future.
|
||||||
|
|
||||||
|
The following endpoint returns an overview of the current state of the
|
||||||
|
Prometheus Alertmanager discovery:
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/alertmanagers
|
||||||
|
```
|
||||||
|
|
||||||
|
Currently only the active Alertmanagers are part of the response.
|
||||||
|
|
||||||
|
```json
|
||||||
|
$ curl http://localhost:9090/api/v1/alertmanagers
|
||||||
|
{
|
||||||
|
"status": "success",
|
||||||
|
"data": {
|
||||||
|
"activeAlertmanagers": [
|
||||||
|
{
|
||||||
|
"url": "http://127.0.0.1:9090/api/v1/alerts"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
|
@ -0,0 +1,215 @@
|
||||||
|
---
|
||||||
|
title: Querying basics
|
||||||
|
nav_title: Basics
|
||||||
|
sort_rank: 1
|
||||||
|
---
|
||||||
|
|
||||||
|
# Querying Prometheus
|
||||||
|
|
||||||
|
Prometheus provides a functional expression language that lets the user select
|
||||||
|
and aggregate time series data in real time. The result of an expression can
|
||||||
|
either be shown as a graph, viewed as tabular data in Prometheus's expression
|
||||||
|
browser, or consumed by external systems via the [HTTP API](api.md).
|
||||||
|
|
||||||
|
## Examples
|
||||||
|
|
||||||
|
This document is meant as a reference. For learning, it might be easier to
|
||||||
|
start with a couple of [examples](examples.md).
|
||||||
|
|
||||||
|
## Expression language data types
|
||||||
|
|
||||||
|
In Prometheus's expression language, an expression or sub-expression can
|
||||||
|
evaluate to one of four types:
|
||||||
|
|
||||||
|
* **Instant vector** - a set of time series containing a single sample for each time series, all sharing the same timestamp
|
||||||
|
* **Range vector** - a set of time series containing a range of data points over time for each time series
|
||||||
|
* **Scalar** - a simple numeric floating point value
|
||||||
|
* **String** - a simple string value; currently unused
|
||||||
|
|
||||||
|
Depending on the use-case (e.g. when graphing vs. displaying the output of an
|
||||||
|
expression), only some of these types are legal as the result from a
|
||||||
|
user-specified expression. For example, an instant vector is the only type
|
||||||
|
that can be directly graphed.
|
||||||
|
|
||||||
|
## Literals
|
||||||
|
|
||||||
|
### String literals
|
||||||
|
|
||||||
|
Strings may be specified as literals in single quotes, double quotes or
|
||||||
|
backticks.
|
||||||
|
|
||||||
|
PromQL follows the same [escaping rules as
|
||||||
|
Go](https://golang.org/ref/spec#String_literals). In single or double quotes a
|
||||||
|
backslash begins an escape sequence, which may be followed by `a`, `b`, `f`,
|
||||||
|
`n`, `r`, `t`, `v` or `\`. Specific characters can be provided using octal
|
||||||
|
(`\nnn`) or hexadecimal (`\xnn`, `\unnnn` and `\Unnnnnnnn`).
|
||||||
|
|
||||||
|
No escaping is processed inside backticks. Unlike Go, Prometheus does not discard newlines inside backticks.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
"this is a string"
|
||||||
|
'these are unescaped: \n \\ \t'
|
||||||
|
`these are not unescaped: \n ' " \t`
|
||||||
|
|
||||||
|
### Float literals
|
||||||
|
|
||||||
|
Scalar float values can be literally written as numbers of the form
|
||||||
|
`[-](digits)[.(digits)]`.
|
||||||
|
|
||||||
|
-2.43
|
||||||
|
|
||||||
|
## Time series selectors
|
||||||
|
|
||||||
|
### Instant vector selectors
|
||||||
|
|
||||||
|
Instant vector selectors allow the selection of a set of time series and a
|
||||||
|
single sample value for each at a given timestamp (instant): in the simplest
|
||||||
|
form, only a metric name is specified. This results in an instant vector
|
||||||
|
containing elements for all time series that have this metric name.
|
||||||
|
|
||||||
|
This example selects all time series that have the `http_requests_total` metric
|
||||||
|
name:
|
||||||
|
|
||||||
|
http_requests_total
|
||||||
|
|
||||||
|
It is possible to filter these time series further by appending a set of labels
|
||||||
|
to match in curly braces (`{}`).
|
||||||
|
|
||||||
|
This example selects only those time series with the `http_requests_total`
|
||||||
|
metric name that also have the `job` label set to `prometheus` and their
|
||||||
|
`group` label set to `canary`:
|
||||||
|
|
||||||
|
http_requests_total{job="prometheus",group="canary"}
|
||||||
|
|
||||||
|
It is also possible to negatively match a label value, or to match label values
|
||||||
|
against regular expressions. The following label matching operators exist:
|
||||||
|
|
||||||
|
* `=`: Select labels that are exactly equal to the provided string.
|
||||||
|
* `!=`: Select labels that are not equal to the provided string.
|
||||||
|
* `=~`: Select labels that regex-match the provided string.
|
||||||
|
* `!~`: Select labels that do not regex-match the provided string.
|
||||||
|
|
||||||
|
For example, this selects all `http_requests_total` time series for `staging`,
|
||||||
|
`testing`, and `development` environments and HTTP methods other than `GET`.
|
||||||
|
|
||||||
|
http_requests_total{environment=~"staging|testing|development",method!="GET"}
|
||||||
|
|
||||||
|
Label matchers that match empty label values also select all time series that do
|
||||||
|
not have the specific label set at all. Regex-matches are fully anchored.
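The matching rules above can be illustrated with a small Python sketch. It is not how Prometheus implements matching: `matches` is a hypothetical helper, labels are modelled as a plain dictionary, and Python's `re` module stands in for the regular expression engine, whose syntax may differ in detail.

```python
import re


def matches(labels, name, op, pattern):
    """Report whether a label set satisfies a single label matcher."""
    # A label that is not set behaves like a label with an empty value.
    value = labels.get(name, "")
    if op == "=":
        return value == pattern
    if op == "!=":
        return value != pattern
    if op == "=~":
        # fullmatch() models fully anchored regex matching.
        return re.fullmatch(pattern, value) is not None
    if op == "!~":
        return re.fullmatch(pattern, value) is None
    raise ValueError(f"unknown operator: {op!r}")
```

With these rules, `matches({"job": "apiserver"}, "job", "=~", "server")` is false because the pattern must match the whole value, while the pattern `.*server` matches.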
|
||||||
|
|
||||||
|
Vector selectors must either specify a name or at least one label matcher
|
||||||
|
that does not match the empty string. The following expression is illegal:
|
||||||
|
|
||||||
|
{job=~".*"} # Bad!
|
||||||
|
|
||||||
|
In contrast, these expressions are valid as they both have a selector that does not
|
||||||
|
match empty label values.
|
||||||
|
|
||||||
|
{job=~".+"} # Good!
|
||||||
|
{job=~".*",method="get"} # Good!
|
||||||
|
|
||||||
|
Label matchers can also be applied to metric names by matching against the internal
|
||||||
|
`__name__` label. For example, the expression `http_requests_total` is equivalent to
|
||||||
|
`{__name__="http_requests_total"}`. Matchers other than `=` (`!=`, `=~`, `!~`) may also be used.
|
||||||
|
The following expression selects all metrics that have a name starting with `job:`:
|
||||||
|
|
||||||
|
{__name__=~"^job:.*"}
|
||||||
|
|
||||||
|
### Range vector selectors
|
||||||
|
|
||||||
|
Range vector literals work like instant vector literals, except that they
|
||||||
|
select a range of samples back from the current instant. Syntactically, a range
|
||||||
|
duration is appended in square brackets (`[]`) at the end of a vector selector
|
||||||
|
to specify how far back in time values should be fetched for each resulting
|
||||||
|
range vector element.
|
||||||
|
|
||||||
|
Time durations are specified as a number, followed immediately by one of the
|
||||||
|
following units:
|
||||||
|
|
||||||
|
* `s` - seconds
|
||||||
|
* `m` - minutes
|
||||||
|
* `h` - hours
|
||||||
|
* `d` - days
|
||||||
|
* `w` - weeks
|
||||||
|
* `y` - years
|
||||||
|
|
||||||
|
In this example, we select all the values we have recorded within the last 5
|
||||||
|
minutes for all time series that have the metric name `http_requests_total` and
|
||||||
|
a `job` label set to `prometheus`:
|
||||||
|
|
||||||
|
http_requests_total{job="prometheus"}[5m]
|
||||||
|
|
||||||
|
### Offset modifier
|
||||||
|
|
||||||
|
The `offset` modifier allows changing the time offset for individual
|
||||||
|
instant and range vectors in a query.
|
||||||
|
|
||||||
|
For example, the following expression returns the value of
|
||||||
|
`http_requests_total` 5 minutes in the past relative to the current
|
||||||
|
query evaluation time:
|
||||||
|
|
||||||
|
http_requests_total offset 5m
|
||||||
|
|
||||||
|
Note that the `offset` modifier always needs to follow the selector
|
||||||
|
immediately, i.e. the following would be correct:
|
||||||
|
|
||||||
|
sum(http_requests_total{method="GET"} offset 5m) # GOOD.
|
||||||
|
|
||||||
|
While the following would be *incorrect*:
|
||||||
|
|
||||||
|
sum(http_requests_total{method="GET"}) offset 5m # INVALID.
|
||||||
|
|
||||||
|
The same works for range vectors. This returns the 5-minute rate that
|
||||||
|
`http_requests_total` had a week ago:
|
||||||
|
|
||||||
|
rate(http_requests_total[5m] offset 1w)
|
||||||
|
|
||||||
|
## Operators
|
||||||
|
|
||||||
|
Prometheus supports many binary and aggregation operators. These are described
|
||||||
|
in detail in the [expression language operators](operators.md) page.
|
||||||
|
|
||||||
|
## Functions
|
||||||
|
|
||||||
|
Prometheus supports several functions to operate on data. These are described
|
||||||
|
in detail in the [expression language functions](functions.md) page.
|
||||||
|
|
||||||
|
## Gotchas
|
||||||
|
|
||||||
|
### Interpolation and staleness
|
||||||
|
|
||||||
|
When queries are run, timestamps at which to sample data are selected
|
||||||
|
independently of the actual present time series data. This is mainly to support
|
||||||
|
cases like aggregation (`sum`, `avg`, and so on), where multiple aggregated
|
||||||
|
time series do not exactly align in time. Because of their independence,
|
||||||
|
Prometheus needs to assign a value at those timestamps for each relevant time
|
||||||
|
series. It does so by simply taking the newest sample before this timestamp.
|
||||||
|
|
||||||
|
If no stored sample is found within (by default) 5 minutes before a sampling timestamp,
|
||||||
|
no value is assigned for this time series at this point in time. This
|
||||||
|
effectively means that time series "disappear" from graphs at times where their
|
||||||
|
latest collected sample is older than 5 minutes.
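The lookup described in this section can be sketched in Python as follows. This is an illustration rather than the actual implementation: `value_at` is a hypothetical helper, samples are modelled as a time-sorted list of `(unix_time, value)` pairs, and the handling of samples exactly at the query timestamp or exactly 5 minutes old is an assumption.

```python
from bisect import bisect_right

LOOKBACK_SECONDS = 5 * 60  # the default window described above


def value_at(samples, timestamp, lookback=LOOKBACK_SECONDS):
    """Return the newest sample value at or before timestamp, or None if stale."""
    times = [sample_time for sample_time, _ in samples]
    # Index of the last sample whose time is not after the query timestamp.
    index = bisect_right(times, timestamp) - 1
    if index < 0:
        return None  # no sample exists yet at this point in time
    sample_time, value = samples[index]
    if timestamp - sample_time > lookback:
        return None  # the newest sample is too old, so the series "disappears"
    return value
```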
|
||||||
|
|
||||||
|
NOTE: Staleness and interpolation handling might change. See
|
||||||
|
https://github.com/prometheus/prometheus/issues/398 and
|
||||||
|
https://github.com/prometheus/prometheus/issues/581.
|
||||||
|
|
||||||
|
### Avoiding slow queries and overloads
|
||||||
|
|
||||||
|
If a query needs to operate on a very large amount of data, graphing it might
|
||||||
|
time out or overload the server or browser. Thus, when constructing queries
|
||||||
|
over unknown data, always start building the query in the tabular view of
|
||||||
|
Prometheus's expression browser until the result set seems reasonable
|
||||||
|
(hundreds, not thousands, of time series at most). Only when you have filtered
|
||||||
|
or aggregated your data sufficiently, switch to graph mode. If the expression
|
||||||
|
still takes too long to graph ad-hoc, pre-record it via a [recording
|
||||||
|
rule](rules.md#recording-rules).
|
||||||
|
|
||||||
|
This is especially relevant for Prometheus's query language, where a bare
|
||||||
|
metric name selector like `api_http_requests_total` could expand to thousands
|
||||||
|
of time series with different labels. Also keep in mind that expressions which
|
||||||
|
aggregate over many time series will generate load on the server even if the
|
||||||
|
output is only a small number of time series. This is similar to how it would
|
||||||
|
be slow to sum all values of a column in a relational database, even if the
|
||||||
|
output value is only a single number.
|
|
@ -0,0 +1,83 @@
|
||||||
|
---
|
||||||
|
title: Querying examples
|
||||||
|
nav_title: Examples
|
||||||
|
sort_rank: 4
|
||||||
|
---
|
||||||
|
|
||||||
|
# Query examples
|
||||||
|
|
||||||
|
## Simple time series selection
|
||||||
|
|
||||||
|
Return all time series with the metric `http_requests_total`:
|
||||||
|
|
||||||
|
http_requests_total
|
||||||
|
|
||||||
|
Return all time series with the metric `http_requests_total` and the given
|
||||||
|
`job` and `handler` labels:
|
||||||
|
|
||||||
|
http_requests_total{job="apiserver", handler="/api/comments"}
|
||||||
|
|
||||||
|
Return a whole range of time (in this case 5 minutes) for the same vector,
|
||||||
|
making it a range vector:
|
||||||
|
|
||||||
|
http_requests_total{job="apiserver", handler="/api/comments"}[5m]
|
||||||
|
|
||||||
|
Note that an expression resulting in a range vector cannot be graphed directly,
|
||||||
|
but it can be viewed in the tabular ("Console") view of the expression browser.
|
||||||
|
|
||||||
|
Using regular expressions, you could select time series only for jobs whose
|
||||||
|
names match a certain pattern, in this case, all jobs that end with `server`.
|
||||||
|
Note that regex matches are fully anchored, so the pattern needs a leading `.*`:
|
||||||
|
|
||||||
|
http_requests_total{job=~".*server"}
|
||||||
|
|
||||||
|
To select all HTTP status codes except 4xx ones, you could run:
|
||||||
|
|
||||||
|
http_requests_total{status!~"^4..$"}
|
||||||
|
|
||||||
|
## Using functions, operators, etc.
|
||||||
|
|
||||||
|
Return the per-second rate for all time series with the `http_requests_total`
|
||||||
|
metric name, as measured over the last 5 minutes:
|
||||||
|
|
||||||
|
rate(http_requests_total[5m])
|
||||||
|
|
||||||
|
Assuming that the `http_requests_total` time series all have the labels `job`
|
||||||
|
(fanout by job name) and `instance` (fanout by instance of the job), we might
|
||||||
|
want to sum over the rate of all instances, so we get fewer output time series,
|
||||||
|
but still preserve the `job` dimension:
|
||||||
|
|
||||||
|
sum(rate(http_requests_total[5m])) by (job)
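To make the grouping behaviour of `by (job)` concrete, here is a Python sketch that sums an instant vector per label combination. It only models the aggregation step, not `rate()`. `sum_by` is a hypothetical helper, and the input rates and label values are made up.

```python
from collections import defaultdict


def sum_by(vector, label_names):
    """Sum sample values per combination of the given label names."""
    totals = defaultdict(float)
    for labels, value in vector:
        # Only the grouping labels survive; all other labels are dropped.
        key = tuple((name, labels.get(name, "")) for name in label_names)
        totals[key] += value
    return dict(totals)


# Illustrative per-instance request rates.
rates = [
    ({"job": "apiserver", "instance": "a:80"}, 1.5),
    ({"job": "apiserver", "instance": "b:80"}, 2.5),
    ({"job": "frontend", "instance": "c:80"}, 4.0),
]
per_job = sum_by(rates, ["job"])
```

Three input series collapse into two output series, one per distinct `job` value, which mirrors how the expression above reduces the number of output time series.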
|
||||||
|
|
||||||
|
If we have two different metrics with the same dimensional labels, we can apply
|
||||||
|
binary operators to them and elements on both sides with the same label set
|
||||||
|
will get matched and propagated to the output. For example, this expression
|
||||||
|
returns the unused memory in MiB for every instance (on a fictional cluster
|
||||||
|
scheduler exposing these metrics about the instances it runs):
|
||||||
|
|
||||||
|
(instance_memory_limit_bytes - instance_memory_usage_bytes) / 1024 / 1024
|
||||||
|
|
||||||
|
The same expression, but summed by application, could be written like this:
|
||||||
|
|
||||||
|
sum(
|
||||||
|
instance_memory_limit_bytes - instance_memory_usage_bytes
|
||||||
|
) by (app, proc) / 1024 / 1024
|
||||||
|
|
||||||
|
If the same fictional cluster scheduler exposed CPU usage metrics like the
|
||||||
|
following for every instance:
|
||||||
|
|
||||||
|
instance_cpu_time_ns{app="lion", proc="web", rev="34d0f99", env="prod", job="cluster-manager"}
|
||||||
|
instance_cpu_time_ns{app="elephant", proc="worker", rev="34d0f99", env="prod", job="cluster-manager"}
|
||||||
|
instance_cpu_time_ns{app="turtle", proc="api", rev="4d3a513", env="prod", job="cluster-manager"}
|
||||||
|
instance_cpu_time_ns{app="fox", proc="widget", rev="4d3a513", env="prod", job="cluster-manager"}
|
||||||
|
...
|
||||||
|
|
||||||
|
...we could get the top 3 CPU users grouped by application (`app`) and process
|
||||||
|
type (`proc`) like this:
|
||||||
|
|
||||||
|
topk(3, sum(rate(instance_cpu_time_ns[5m])) by (app, proc))
|
||||||
|
|
||||||
|
Assuming this metric contains one time series per running instance, you could
|
||||||
|
count the number of running instances per application like this:
|
||||||
|
|
||||||
|
count(instance_cpu_time_ns) by (app)
|
|
@ -0,0 +1,408 @@
|
||||||
|
---
|
||||||
|
title: Query functions
|
||||||
|
nav_title: Functions
|
||||||
|
sort_rank: 3
|
||||||
|
---
|
||||||
|
|
||||||
|
# Functions
|
||||||
|
|
||||||
|
Some functions have default arguments, e.g. `year(v=vector(time())
|
||||||
|
instant-vector)`. This means that there is one argument `v`, which is an instant
|
||||||
|
vector and, if not provided, defaults to the value of the expression
|
||||||
|
`vector(time())`.
|
||||||
|
|
||||||
|
## `abs()`
|
||||||
|
|
||||||
|
`abs(v instant-vector)` returns the input vector with all sample values converted to
|
||||||
|
their absolute value.
|
||||||
|
|
||||||
|
## `absent()`
|
||||||
|
|
||||||
|
`absent(v instant-vector)` returns an empty vector if the vector passed to it
|
||||||
|
has any elements and a 1-element vector with the value 1 if the vector passed to
|
||||||
|
it has no elements.
|
||||||
|
|
||||||
|
This is useful for alerting on when no time series exist for a given metric name
|
||||||
|
and label combination.
|
||||||
|
|
||||||
|
```
|
||||||
|
absent(nonexistent{job="myjob"})
|
||||||
|
# => {job="myjob"}
|
||||||
|
|
||||||
|
absent(nonexistent{job="myjob",instance=~".*"})
|
||||||
|
# => {job="myjob"}
|
||||||
|
|
||||||
|
absent(sum(nonexistent{job="myjob"}))
|
||||||
|
# => {}
|
||||||
|
```
|
||||||
|
|
||||||
|
In the second example, `absent()` tries to be smart about deriving labels of the
|
||||||
|
1-element output vector from the input vector.
|
||||||
|
|
||||||
|
## `ceil()`
|
||||||
|
|
||||||
|
`ceil(v instant-vector)` rounds the sample values of all elements in `v` up to
|
||||||
|
the nearest integer.
|
||||||
|
|
||||||
|
## `changes()`
|
||||||
|
|
||||||
|
For each input time series, `changes(v range-vector)` returns the number of
|
||||||
|
times its value has changed within the provided time range as an instant
|
||||||
|
vector.
|
||||||
|
|
||||||
|
## `clamp_max()`
|
||||||
|
|
||||||
|
`clamp_max(v instant-vector, max scalar)` clamps the sample values of all
|
||||||
|
elements in `v` to have an upper limit of `max`.
|
||||||
|
|
||||||
|
## `clamp_min()`
|
||||||
|
|
||||||
|
`clamp_min(v instant-vector, min scalar)` clamps the sample values of all
|
||||||
|
elements in `v` to have a lower limit of `min`.
|
||||||
|
|
||||||
|
## `count_scalar()`
|
||||||
|
|
||||||
|
`count_scalar(v instant-vector)` returns the number of elements in a time series
|
||||||
|
vector as a scalar. This is in contrast to the `count()`
|
||||||
|
[aggregation operator](operators.md#aggregation-operators), which
|
||||||
|
always returns a vector (an empty one if the input vector is empty) and allows
|
||||||
|
grouping by labels via a `by` clause.
|
||||||
|
|
||||||
|
## `day_of_month()`
|
||||||
|
|
||||||
|
`day_of_month(v=vector(time()) instant-vector)` returns the day of the month
|
||||||
|
for each of the given times in UTC. Returned values are from 1 to 31.
|
||||||
|
|
||||||
|
## `day_of_week()`
|
||||||
|
|
||||||
|
`day_of_week(v=vector(time()) instant-vector)` returns the day of the week for
|
||||||
|
each of the given times in UTC. Returned values are from 0 to 6, where 0 means
|
||||||
|
Sunday etc.
|
||||||
|
|
||||||
|
## `days_in_month()`
|
||||||
|
|
||||||
|
`days_in_month(v=vector(time()) instant-vector)` returns the number of days in the
|
||||||
|
month for each of the given times in UTC. Returned values are from 28 to 31.
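The value ranges of `day_of_month()`, `day_of_week()`, and `days_in_month()` can be checked against Python's standard library. This is an illustrative sketch that takes a single Unix timestamp rather than an instant vector; the Python function names merely mirror the query functions.

```python
import calendar
from datetime import datetime, timezone


def day_of_month(ts):
    """Day of the month (1 to 31) in UTC for a Unix timestamp."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).day


def day_of_week(ts):
    """Day of the week in UTC, where 0 means Sunday and 6 means Saturday."""
    # isoweekday() is 1 (Monday) to 7 (Sunday); modulo 7 maps Sunday to 0.
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoweekday() % 7


def days_in_month(ts):
    """Number of days (28 to 31) in the month containing the timestamp."""
    t = datetime.fromtimestamp(ts, tz=timezone.utc)
    return calendar.monthrange(t.year, t.month)[1]


# Timestamp 0 is Thursday, January 1, 1970 UTC.
print(day_of_month(0), day_of_week(0), days_in_month(0))
```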
|
||||||
|
|
||||||
|
## `delta()`
|
||||||
|
|
||||||
|
`delta(v range-vector)` calculates the difference between the
|
||||||
|
first and last value of each time series element in a range vector `v`,
|
||||||
|
returning an instant vector with the given deltas and equivalent labels.
|
||||||
|
The delta is extrapolated to cover the full time range as specified in
|
||||||
|
the range vector selector, so that it is possible to get a non-integer
|
||||||
|
result even if the sample values are all integers.
|
||||||
|
|
||||||
|
The following example expression returns the difference in CPU temperature
|
||||||
|
between now and 2 hours ago:
|
||||||
|
|
||||||
|
```
|
||||||
|
delta(cpu_temp_celsius{host="zeus"}[2h])
|
||||||
|
```
|
||||||
|
|
||||||
|
`delta` should only be used with gauges.
|
||||||
|
|
||||||
|
## `deriv()`
|
||||||
|
|
||||||
|
`deriv(v range-vector)` calculates the per-second derivative of the time series in a range
|
||||||
|
vector `v`, using [simple linear regression](http://en.wikipedia.org/wiki/Simple_linear_regression).
|
||||||
|
|
||||||
|
`deriv` should only be used with gauges.
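As a rough illustration of simple linear regression, the following Python sketch computes the least-squares slope of one series. The `(timestamp, value)` sample model and the handling of degenerate input are assumptions; the server's numerical details may differ.

```python
def deriv(samples):
    """Least-squares slope (value units per second) of one series.

    `samples` is a list of (timestamp_seconds, value) pairs.
    """
    n = len(samples)
    if n < 2:
        return None  # a slope needs at least two samples
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        return None  # all samples share one timestamp
    return covariance / variance


print(deriv([(0, 1.0), (10, 3.0), (20, 5.0)]))
```

Unlike a plain difference between the first and last value, the regression uses every sample in the range, so a single noisy sample has less influence on the result.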
|
||||||
|
|
||||||
|
## `drop_common_labels()`
|
||||||
|
|
||||||
|
`drop_common_labels(instant-vector)` drops all labels that have the same name
|
||||||
|
and value across all series in the input vector.
|
||||||
|
|
||||||
|
## `exp()`
|
||||||
|
|
||||||
|
`exp(v instant-vector)` calculates the exponential function for all elements in `v`.
|
||||||
|
Special cases are:
|
||||||
|
|
||||||
|
* `exp(+Inf) = +Inf`
|
||||||
|
* `exp(NaN) = NaN`
|
||||||
|
|
||||||
|
## `floor()`
|
||||||
|
|
||||||
|
`floor(v instant-vector)` rounds the sample values of all elements in `v` down
|
||||||
|
to the nearest integer.
|
||||||
|
|
||||||
|
## `histogram_quantile()`
|
||||||
|
|
||||||
|
`histogram_quantile(φ float, b instant-vector)` calculates the φ-quantile (0 ≤ φ
|
||||||
|
≤ 1) from the buckets `b` of a
|
||||||
|
[histogram](https://prometheus.io/docs/concepts/metric_types/#histogram). (See
|
||||||
|
[histograms and summaries](https://prometheus.io/docs/practices/histograms) for
|
||||||
|
a detailed explanation of φ-quantiles and the usage of the histogram metric type
|
||||||
|
in general.) The samples in `b` are the counts of observations in each bucket.
|
||||||
|
Each sample must have a label `le` where the label value denotes the inclusive
|
||||||
|
upper bound of the bucket. (Samples without such a label are silently ignored.)
|
||||||
|
The [histogram metric type](https://prometheus.io/docs/concepts/metric_types/#histogram)
|
||||||
|
automatically provides time series with the `_bucket` suffix and the appropriate
|
||||||
|
labels.
|
||||||
|
|
||||||
|
Use the `rate()` function to specify the time window for the quantile
|
||||||
|
calculation.
|
||||||
|
|
||||||
|
Example: A histogram metric is called `http_request_duration_seconds`. To
|
||||||
|
calculate the 90th percentile of request durations over the last 10m, use the
|
||||||
|
following expression:
|
||||||
|
|
||||||
|
histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))
|
||||||
|
|
||||||
|
The quantile is calculated for each label combination in
|
||||||
|
`http_request_duration_seconds`. To aggregate, use the `sum()` aggregator
|
||||||
|
around the `rate()` function. Since the `le` label is required by
|
||||||
|
`histogram_quantile()`, it has to be included in the `by` clause. The following
|
||||||
|
expression aggregates the 90th percentile by `job`:
|
||||||
|
|
||||||
|
histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (job, le))
|
||||||
|
|
||||||
|
To aggregate everything, specify only the `le` label:
|
||||||
|
|
||||||
|
histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (le))
|
||||||
|
|
||||||
|
The `histogram_quantile()` function interpolates quantile values by
|
||||||
|
assuming a linear distribution within a bucket. The highest bucket
|
||||||
|
must have an upper bound of `+Inf`. (Otherwise, `NaN` is returned.) If
|
||||||
|
a quantile is located in the highest bucket, the upper bound of the
|
||||||
|
second highest bucket is returned. A lower limit of the lowest bucket
|
||||||
|
is assumed to be 0 if the upper bound of that bucket is greater than
|
||||||
|
0. In that case, the usual linear interpolation is applied within that
|
||||||
|
bucket. Otherwise, the upper bound of the lowest bucket is returned
|
||||||
|
for quantiles located in the lowest bucket.
|
||||||
|
|
||||||
|
If `b` contains fewer than two buckets, `NaN` is returned. For φ < 0, `-Inf` is
|
||||||
|
returned. For φ > 1, `+Inf` is returned.
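The interpolation rules above can be sketched in Python. This is a simplified model rather than the Prometheus implementation: it assumes the buckets of one histogram are given as `(upper_bound, cumulative_count)` pairs (cumulative because `le` is an inclusive upper bound), and the order of the edge-case checks and the empty-bucket guard are choices made for this sketch.

```python
import math


def histogram_quantile(phi, buckets):
    """Estimate the phi-quantile from (upper_bound, cumulative_count) pairs."""
    if phi < 0:
        return -math.inf
    if phi > 1:
        return math.inf
    buckets = sorted(buckets)
    # Fewer than two buckets, or no +Inf bucket on top: no estimate.
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        return math.nan
    rank = phi * buckets[-1][1]
    # Index of the first bucket whose cumulative count reaches the rank.
    i = next(k for k, (_, count) in enumerate(buckets) if count >= rank)
    if i == len(buckets) - 1:
        # Quantile falls in the +Inf bucket: return the second highest bound.
        return buckets[-2][0]
    upper = buckets[i][0]
    if i == 0:
        if upper <= 0:
            return upper  # no lower limit of 0 can be assumed
        lower, below = 0.0, 0.0
    else:
        lower, below = buckets[i - 1]
    in_bucket = buckets[i][1] - below
    if in_bucket == 0:
        return math.nan  # empty bucket; this guard is specific to the sketch
    # Linear interpolation within the bucket.
    return lower + (upper - lower) * (rank - below) / in_bucket


buckets = [(0.1, 10), (0.5, 30), (1.0, 40), (math.inf, 40)]
print(histogram_quantile(0.5, buckets))
```

With these example buckets the median rank is 20, which lies halfway through the second bucket (counts 10 to 30), so the estimate lands halfway between 0.1 and 0.5.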
|
||||||
|
|
||||||
|
## `holt_winters()`
|
||||||
|
|
||||||
|
`holt_winters(v range-vector, sf scalar, tf scalar)` produces a smoothed value
|
||||||
|
for time series based on the range in `v`. The lower the smoothing factor `sf`,
|
||||||
|
the more importance is given to old data. The higher the trend factor `tf`, the
|
||||||
|
more trends in the data are considered. Both `sf` and `tf` must be between 0 and
|
||||||
|
1.
|
||||||
|
|
||||||
|
`holt_winters` should only be used with gauges.
|
||||||
|
|
||||||
|
## `hour()`
|
||||||
|
|
||||||
|
`hour(v=vector(time()) instant-vector)` returns the hour of the day
|
||||||
|
for each of the given times in UTC. Returned values are from 0 to 23.
|
||||||
|
|
||||||
|
## `idelta()`
|
||||||
|
|
||||||
|
`idelta(v range-vector)` calculates the difference between the last two samples
|
||||||
|
in the range vector `v`, returning an instant vector with the given deltas and
|
||||||
|
equivalent labels.
|
||||||
|
|
||||||
|
`idelta` should only be used with gauges.
|
||||||
|
|
||||||
|
## `increase()`
|
||||||
|
|
||||||
|
`increase(v range-vector)` calculates the increase in the
|
||||||
|
time series in the range vector. Breaks in monotonicity (such as counter
|
||||||
|
resets due to target restarts) are automatically adjusted for. The
|
||||||
|
increase is extrapolated to cover the full time range as specified
|
||||||
|
in the range vector selector, so that it is possible to get a
|
||||||
|
non-integer result even if a counter increases only by integer
|
||||||
|
increments.
|
||||||
|
|
||||||
|
The following example expression returns the number of HTTP requests as measured
|
||||||
|
over the last 5 minutes, per time series in the range vector:
|
||||||
|
|
||||||
|
```
|
||||||
|
increase(http_requests_total{job="api-server"}[5m])
|
||||||
|
```
|
||||||
|
|
||||||
|
`increase` should only be used with counters. It is syntactic sugar
|
||||||
|
for `rate(v)` multiplied by the number of seconds under the specified
|
||||||
|
time range window, and should be used primarily for human readability.
|
||||||
|
Use `rate` in recording rules so that increases are tracked consistently
|
||||||
|
on a per-second basis.
|
||||||
|
|
||||||
|
## `irate()`
|
||||||
|
|
||||||
|
`irate(v range-vector)` calculates the per-second instant rate of increase of
|
||||||
|
the time series in the range vector. This is based on the last two data points.
|
||||||
|
Breaks in monotonicity (such as counter resets due to target restarts) are
|
||||||
|
automatically adjusted for.
|
||||||
|
|
||||||
|
The following example expression returns the per-second rate of HTTP requests
|
||||||
|
looking up to 5 minutes back for the two most recent data points, per time
|
||||||
|
series in the range vector:
|
||||||
|
|
||||||
|
```
|
||||||
|
irate(http_requests_total{job="api-server"}[5m])
|
||||||
|
```
|
||||||
|
|
||||||
|
`irate` should only be used when graphing volatile, fast-moving counters.
|
||||||
|
Use `rate` for alerts and slow-moving counters, as brief changes
|
||||||
|
in the rate can reset the `FOR` clause and graphs consisting entirely of rare
|
||||||
|
spikes are hard to read.
|
||||||
|
|
||||||
|
Note that when combining `irate()` with an
|
||||||
|
[aggregation operator](operators.md#aggregation-operators) (e.g. `sum()`)
|
||||||
|
or a function aggregating over time (any function ending in `_over_time`),
|
||||||
|
always take an `irate()` first, then aggregate. Otherwise `irate()` cannot detect
|
||||||
|
counter resets when your target restarts.
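To make the "last two data points" rule concrete, here is a minimal Python sketch of an instant rate for one series. The sample model and the treatment of a reset (assuming the counter restarted from zero) are simplifications for illustration, not a copy of the server code.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples."""
    if len(samples) < 2:
        return None  # two samples are required
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    if v_last < v_prev:
        # Counter reset: assume the counter restarted from zero.
        increase = v_last
    else:
        increase = v_last - v_prev
    return increase / (t_last - t_prev)


print(irate([(0, 100), (15, 130)]))
print(irate([(0, 100), (15, 30)]))
```

Older samples in the range are ignored entirely, which is what makes the result responsive but also noisy.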
|
||||||
|
|
||||||
|
## `label_join()`
|
||||||
|
|
||||||
|
For each time series in `v`, `label_join(v instant-vector, dst_label string, separator string, src_label_1 string, src_label_2 string, ...)` joins all the values of all the `src_labels`
|
||||||
|
using `separator` and returns the time series with the label `dst_label` containing the joined value.
|
||||||
|
There can be any number of `src_labels` in this function.
|
||||||
|
|
||||||
|
This example will return a vector with each time series having a `foo` label with the value `a,b,c` added to it:
|
||||||
|
|
||||||
|
```
|
||||||
|
label_join(up{job="api-server",src1="a",src2="b",src3="c"}, "foo", ",", "src1", "src2", "src3")
|
||||||
|
```
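The effect on a single label set can be sketched in Python as follows. The dictionary model is an assumption, as is treating a missing source label as an empty string.

```python
def label_join(labels, dst_label, separator, *src_labels):
    """Return a copy of `labels` with `dst_label` set to the joined values."""
    result = dict(labels)
    # A source label that is not present contributes an empty string.
    result[dst_label] = separator.join(labels.get(name, "") for name in src_labels)
    return result


series = {"job": "api-server", "src1": "a", "src2": "b", "src3": "c"}
print(label_join(series, "foo", ",", "src1", "src2", "src3")["foo"])
```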
|
||||||
|
|
||||||
|
## `label_replace()`
|
||||||
|
|
||||||
|
For each time series in `v`, `label_replace(v instant-vector, dst_label string,
|
||||||
|
replacement string, src_label string, regex string)` matches the regular
|
||||||
|
expression `regex` against the label `src_label`. If it matches, then the
|
||||||
|
time series is returned with the label `dst_label` replaced by the expansion of
|
||||||
|
`replacement`. `$1` is replaced with the first matching subgroup, `$2` with the
|
||||||
|
second etc. If the regular expression doesn't match then the time series is
|
||||||
|
returned unchanged.
|
||||||
|
|
||||||
|
This example will return a vector with each time series having a `foo`
|
||||||
|
label with the value `a` added to it:
|
||||||
|
|
||||||
|
```
|
||||||
|
label_replace(up{job="api-server",service="a:c"}, "foo", "$1", "service", "(.*):.*")
|
||||||
|
```
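A Python sketch of the same idea for one label set is given here. It is an approximation: it assumes the regular expression must match the whole label value, it uses Python's `re` syntax rather than the server's regex engine, and it only translates `$1`-style references.

```python
import re


def label_replace(labels, dst_label, replacement, src_label, regex):
    """Return a copy of `labels` with `dst_label` rewritten if `regex` matches."""
    match = re.fullmatch(regex, labels.get(src_label, ""))
    if match is None:
        return dict(labels)  # no match: the series is returned unchanged
    # Translate "$1" into Python's "\g<1>" group reference syntax.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    result = dict(labels)
    result[dst_label] = match.expand(template)
    return result


series = {"job": "api-server", "service": "a:c"}
print(label_replace(series, "foo", "$1", "service", "(.*):.*")["foo"])
```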
|
||||||
|
|
||||||
|
## `ln()`
|
||||||
|
|
||||||
|
`ln(v instant-vector)` calculates the natural logarithm for all elements in `v`.
|
||||||
|
Special cases are:
|
||||||
|
|
||||||
|
* `ln(+Inf) = +Inf`
|
||||||
|
* `ln(0) = -Inf`
|
||||||
|
* `ln(x < 0) = NaN`
|
||||||
|
* `ln(NaN) = NaN`
|
||||||
|
|
||||||
|
## `log2()`
|
||||||
|
|
||||||
|
`log2(v instant-vector)` calculates the binary logarithm for all elements in `v`.
|
||||||
|
The special cases are equivalent to those in `ln`.
|
||||||
|
|
||||||
|
## `log10()`
|
||||||
|
|
||||||
|
`log10(v instant-vector)` calculates the decimal logarithm for all elements in `v`.
|
||||||
|
The special cases are equivalent to those in `ln`.
|
||||||
|
|
||||||
|
## `minute()`
|
||||||
|
|
||||||
|
`minute(v=vector(time()) instant-vector)` returns the minute of the hour for each
|
||||||
|
of the given times in UTC. Returned values are from 0 to 59.
|
||||||
|
|
||||||
|
## `month()`
|
||||||
|
|
||||||
|
`month(v=vector(time()) instant-vector)` returns the month of the year for each
|
||||||
|
of the given times in UTC. Returned values are from 1 to 12, where 1 means
|
||||||
|
January etc.
|
||||||
|
|
||||||
|
## `predict_linear()`
|
||||||
|
|
||||||
|
`predict_linear(v range-vector, t scalar)` predicts the value of time series
|
||||||
|
`t` seconds from now, based on the range vector `v`, using [simple linear
|
||||||
|
regression](http://en.wikipedia.org/wiki/Simple_linear_regression).
|
||||||
|
|
||||||
|
`predict_linear` should only be used with gauges.
|
||||||
|
|
||||||
|
## `rate()`
|
||||||
|
|
||||||
|
`rate(v range-vector)` calculates the per-second average rate of increase of the
|
||||||
|
time series in the range vector. Breaks in monotonicity (such as counter
|
||||||
|
resets due to target restarts) are automatically adjusted for. Also, the
|
||||||
|
calculation extrapolates to the ends of the time range, allowing for missed
|
||||||
|
scrapes or imperfect alignment of scrape cycles with the range's time period.
|
||||||
|
|
||||||
|
The following example expression returns the per-second rate of HTTP requests as measured
|
||||||
|
over the last 5 minutes, per time series in the range vector:
|
||||||
|
|
||||||
|
```
|
||||||
|
rate(http_requests_total{job="api-server"}[5m])
|
||||||
|
```
|
||||||
|
|
||||||
|
`rate` should only be used with counters. It is best suited for alerting,
|
||||||
|
and for graphing of slow-moving counters.
|
||||||
|
|
||||||
|
Note that when combining `rate()` with an aggregation operator (e.g. `sum()`)
|
||||||
|
or a function aggregating over time (any function ending in `_over_time`),
|
||||||
|
always take a `rate()` first, then aggregate. Otherwise `rate()` cannot detect
|
||||||
|
counter resets when your target restarts.
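The following Python sketch illustrates why the order matters. It is deliberately simplified: it computes only the reset-adjusted increase of a list of counter values, without the extrapolation or the division by the time range that `rate()` performs, and the two example series are made up.

```python
def increase_with_resets(values):
    """Total increase of a counter, treating any drop as a reset to zero."""
    total = 0.0
    for prev, cur in zip(values, values[1:]):
        # After a reset the counter grew from 0 to `cur`.
        total += cur if cur < prev else cur - prev
    return total


a = [100, 110, 120]
b = [50, 5, 15]  # this target restarted between the first two samples

# Correct order: adjust each series for resets, then aggregate.
rate_then_sum = increase_with_resets(a) + increase_with_resets(b)

# Wrong order: the sum drops from 150 to 115, which looks like a reset
# of the whole aggregate and inflates the result.
summed = [x + y for x, y in zip(a, b)]
sum_then_rate = increase_with_resets(summed)

print(rate_then_sum, sum_then_rate)
```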
|
||||||
|
|
||||||
|
## `resets()`
|
||||||
|
|
||||||
|
For each input time series, `resets(v range-vector)` returns the number of
|
||||||
|
counter resets within the provided time range as an instant vector. Any
|
||||||
|
decrease in the value between two consecutive samples is interpreted as a
|
||||||
|
counter reset.
|
||||||
|
|
||||||
|
`resets` should only be used with counters.
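Counting resets amounts to counting decreases between consecutive samples. The Python sketch below shows this next to the analogous count used by `changes()`; both helpers work on a plain list of sample values, which is an assumed model.

```python
def resets(values):
    """Number of times a sample is lower than its predecessor."""
    return sum(1 for prev, cur in zip(values, values[1:]) if cur < prev)


def changes(values):
    """Number of times a sample differs from its predecessor."""
    return sum(1 for prev, cur in zip(values, values[1:]) if cur != prev)


print(resets([1, 2, 0, 1, 0]))
print(changes([1, 1, 2, 2, 1]))
```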
|
||||||
|
|
||||||
|
## `round()`
|
||||||
|
|
||||||
|
`round(v instant-vector, to_nearest=1 scalar)` rounds the sample values of all
|
||||||
|
elements in `v` to the nearest integer. Ties are resolved by rounding up. The
|
||||||
|
optional `to_nearest` argument allows specifying the nearest multiple to which
|
||||||
|
the sample values should be rounded. This multiple may also be a fraction.
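One way to express this rounding rule in Python is shown here. The helper name `prom_round` is hypothetical, and floating-point behavior for fractional multiples may differ slightly from the server.

```python
import math


def prom_round(value, to_nearest=1.0):
    """Round to the nearest multiple of `to_nearest`; ties round up."""
    return math.floor(value / to_nearest + 0.5) * to_nearest


print(prom_round(2.5))        # tie, rounds up to 3.0
print(prom_round(-2.5))       # tie, rounds up to -2.0
print(prom_round(1.26, 0.5))  # nearest multiple of 0.5 is 1.5
```

Note that Python's built-in `round()` resolves ties to the nearest even integer, so it is not a drop-in equivalent.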
|
||||||
|
|
||||||
|
## `scalar()`
|
||||||
|
|
||||||
|
Given a single-element input vector, `scalar(v instant-vector)` returns the
|
||||||
|
sample value of that single element as a scalar. If the input vector does not
|
||||||
|
have exactly one element, `scalar` will return `NaN`.
|
||||||
|
|
||||||
|
## `sort()`
|
||||||
|
|
||||||
|
`sort(v instant-vector)` returns vector elements sorted by their sample values,
|
||||||
|
in ascending order.
|
||||||
|
|
||||||
|
## `sort_desc()`
|
||||||
|
|
||||||
|
Same as `sort`, but sorts in descending order.
|
||||||
|
|
||||||
|
## `sqrt()`
|
||||||
|
|
||||||
|
`sqrt(v instant-vector)` calculates the square root of all elements in `v`.
|
||||||
|
|
||||||
|
## `time()`
|
||||||
|
|
||||||
|
`time()` returns the number of seconds since January 1, 1970 UTC. Note that
|
||||||
|
this does not actually return the current time, but the time at which the
|
||||||
|
expression is to be evaluated.
|
||||||
|
|
||||||
|
## `vector()`
|
||||||
|
|
||||||
|
`vector(s scalar)` returns the scalar `s` as a vector with no labels.
|
||||||
|
|
||||||
|
## `year()`
|
||||||
|
|
||||||
|
`year(v=vector(time()) instant-vector)` returns the year
|
||||||
|
for each of the given times in UTC.
|
||||||
|
|
||||||
|
## `<aggregation>_over_time()`
|
||||||
|
|
||||||
|
The following functions allow aggregating each series of a given range vector
|
||||||
|
over time and return an instant vector with per-series aggregation results:
|
||||||
|
|
||||||
|
* `avg_over_time(range-vector)`: the average value of all points in the specified interval.
|
||||||
|
* `min_over_time(range-vector)`: the minimum value of all points in the specified interval.
|
||||||
|
* `max_over_time(range-vector)`: the maximum value of all points in the specified interval.
|
||||||
|
* `sum_over_time(range-vector)`: the sum of all values in the specified interval.
|
||||||
|
* `count_over_time(range-vector)`: the count of all values in the specified interval.
|
||||||
|
* `quantile_over_time(scalar, range-vector)`: the φ-quantile (0 ≤ φ ≤ 1) of the values in the specified interval.
|
||||||
|
* `stddev_over_time(range-vector)`: the population standard deviation of the values in the specified interval.
|
||||||
|
* `stdvar_over_time(range-vector)`: the population variance of the values in the specified interval.
|
||||||
|
|
||||||
|
Note that all values in the specified interval have the same weight in the
|
||||||
|
aggregation even if the values are not equally spaced throughout the interval.
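The equal-weight behavior is easy to see in a small Python sketch. The helpers below operate on a list of `(timestamp, value)` samples for one series, which is an assumed model; the timestamps are intentionally ignored.

```python
import statistics


def avg_over_time(samples):
    """Unweighted mean of the sample values; timestamps play no role."""
    values = [v for _, v in samples]
    return sum(values) / len(values)


def stddev_over_time(samples):
    """Population standard deviation of the sample values."""
    return statistics.pstdev([v for _, v in samples])


# Three samples within two seconds, then one sample five minutes later.
samples = [(0, 1.0), (1, 1.0), (2, 1.0), (300, 10.0)]
print(avg_over_time(samples))
```

A time-weighted average of the same data would sit much closer to 1, because the value 1 was observed for almost the entire interval.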
|
|
@ -0,0 +1,4 @@
|
||||||
|
---
|
||||||
|
title: Querying
|
||||||
|
sort_rank: 4
|
||||||
|
---
|
|
@ -0,0 +1,250 @@
|
||||||
|
---
|
||||||
|
title: Operators
|
||||||
|
sort_rank: 2
|
||||||
|
---
|
||||||
|
|
||||||
|
# Operators
|
||||||
|
|
||||||
|
## Binary operators
|
||||||
|
|
||||||
|
Prometheus's query language supports basic logical and arithmetic operators.
|
||||||
|
For operations between two instant vectors, the [matching behavior](#vector-matching)
|
||||||
|
can be modified.
|
||||||
|
|
||||||
|
### Arithmetic binary operators
|
||||||
|
|
||||||
|
The following binary arithmetic operators exist in Prometheus:
|
||||||
|
|
||||||
|
* `+` (addition)
|
||||||
|
* `-` (subtraction)
|
||||||
|
* `*` (multiplication)
|
||||||
|
* `/` (division)
|
||||||
|
* `%` (modulo)
|
||||||
|
* `^` (power/exponentiation)
|
||||||
|
|
||||||
|
Binary arithmetic operators are defined between scalar/scalar, vector/scalar,
|
||||||
|
and vector/vector value pairs.
|
||||||
|
|
||||||
|
**Between two scalars**, the behavior is obvious: they evaluate to another
|
||||||
|
scalar that is the result of the operator applied to both scalar operands.
|
||||||
|
|
||||||
|
**Between an instant vector and a scalar**, the operator is applied to the
|
||||||
|
value of every data sample in the vector. E.g. if a time series instant vector
|
||||||
|
is multiplied by 2, the result is another vector in which every sample value of
|
||||||
|
the original vector is multiplied by 2.
|
||||||
|
|
||||||
|
**Between two instant vectors**, a binary arithmetic operator is applied to
|
||||||
|
each entry in the left-hand-side vector and its [matching element](#vector-matching)
|
||||||
|
in the right-hand vector. The result is propagated into the result vector and the metric
|
||||||
|
name is dropped. Entries for which no matching entry in the right-hand vector can be
|
||||||
|
found are not part of the result.
|
||||||
|
|
||||||
|
### Comparison binary operators
|
||||||
|
|
||||||
|
The following binary comparison operators exist in Prometheus:
|
||||||
|
|
||||||
|
* `==` (equal)
|
||||||
|
* `!=` (not-equal)
|
||||||
|
* `>` (greater-than)
|
||||||
|
* `<` (less-than)
|
||||||
|
* `>=` (greater-or-equal)
|
||||||
|
* `<=` (less-or-equal)
|
||||||
|
|
||||||
|
Comparison operators are defined between scalar/scalar, vector/scalar,
|
||||||
|
and vector/vector value pairs. By default they filter. Their behavior can be
|
||||||
|
modified by providing `bool` after the operator, which will return `0` or `1`
|
||||||
|
for the value rather than filtering.
|
||||||
|
|
||||||
|
**Between two scalars**, the `bool` modifier must be provided and these
|
||||||
|
operators result in another scalar that is either `0` (`false`) or `1`
|
||||||
|
(`true`), depending on the comparison result.
|
||||||
|
|
||||||
|
**Between an instant vector and a scalar**, these operators are applied to the
|
||||||
|
value of every data sample in the vector, and vector elements between which the
|
||||||
|
comparison result is `false` get dropped from the result vector. If the `bool`
|
||||||
|
modifier is provided, vector elements that would be dropped instead have the value
|
||||||
|
`0` and vector elements that would be kept have the value `1`.
|
||||||
|
|
||||||
|
**Between two instant vectors**, these operators behave as a filter by default,
|
||||||
|
applied to matching entries. Vector elements for which the expression is not
|
||||||
|
true or which do not find a match on the other side of the expression get
|
||||||
|
dropped from the result, while the others are propagated into a result vector
|
||||||
|
with their original (left-hand-side) metric names and label values.
|
||||||
|
If the `bool` modifier is provided, vector elements that would have been
|
||||||
|
dropped instead have the value `0` and vector elements that would be kept have
|
||||||
|
the value `1` with the left-hand-side metric names and label values.
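The difference between filtering and the `bool` modifier can be sketched for the vector/scalar case in Python. The dictionary model (label tuple to sample value) and the `compare` helper are assumptions made for illustration.

```python
import operator


def compare(vector, op, scalar, bool_modifier=False):
    """Apply a comparison between each sample of `vector` and `scalar`."""
    result = {}
    for labels, value in vector.items():
        holds = op(value, scalar)
        if bool_modifier:
            # With `bool`, every element is kept and becomes 0 or 1.
            result[labels] = 1 if holds else 0
        elif holds:
            # Without `bool`, elements that fail the comparison are dropped.
            result[labels] = value
    return result


temps = {("host", "zeus"): 71.0, ("host", "hera"): 48.0}
print(compare(temps, operator.gt, 60))
print(compare(temps, operator.gt, 60, bool_modifier=True))
```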
|
||||||
|
|
||||||
|
### Logical/set binary operators
|
||||||
|
|
||||||
|
These logical/set binary operators are only defined between instant vectors:
|
||||||
|
|
||||||
|
* `and` (intersection)
|
||||||
|
* `or` (union)
|
||||||
|
* `unless` (complement)
|
||||||
|
|
||||||
|
`vector1 and vector2` results in a vector consisting of the elements of
|
||||||
|
`vector1` for which there are elements in `vector2` with exactly matching
|
||||||
|
label sets. Other elements are dropped. The metric name and values are carried
|
||||||
|
over from the left-hand-side vector.
|
||||||
|
|
||||||
|
`vector1 or vector2` results in a vector that contains all original elements
|
||||||
|
(label sets + values) of `vector1` and additionally all elements of `vector2`
|
||||||
|
which do not have matching label sets in `vector1`.
|
||||||
|
|
||||||
|
`vector1 unless vector2` results in a vector consisting of the elements of
|
||||||
|
`vector1` for which there are no elements in `vector2` with exactly matching
|
||||||
|
label sets. All matching elements in both vectors are dropped.
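The three set operators can be modeled in Python with dictionaries keyed by label set. This sketch assumes the keys already exclude the metric name and that matching uses the full label set.

```python
def vec_and(v1, v2):
    """Elements of v1 whose label set also appears in v2."""
    return {labels: value for labels, value in v1.items() if labels in v2}


def vec_or(v1, v2):
    """All elements of v1, plus elements of v2 with no match in v1."""
    result = dict(v1)
    for labels, value in v2.items():
        result.setdefault(labels, value)
    return result


def vec_unless(v1, v2):
    """Elements of v1 whose label set does not appear in v2."""
    return {labels: value for labels, value in v1.items() if labels not in v2}


v1 = {frozenset({("job", "a")}): 1, frozenset({("job", "b")}): 2}
v2 = {frozenset({("job", "b")}): 20, frozenset({("job", "c")}): 30}
print(vec_and(v1, v2))
print(vec_or(v1, v2))
print(vec_unless(v1, v2))
```

In all three cases the values of matching elements come from the left-hand side.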
|
||||||
|
|
||||||
|
## Vector matching
|
||||||
|
|
||||||
|
Operations between vectors attempt to find a matching element in the right-hand-side
|
||||||
|
vector for each entry in the left-hand side. There are two basic types of
|
||||||
|
matching behavior:
|
||||||
|
|
||||||
|
**One-to-one** finds a unique pair of entries from each side of the operation.
|
||||||
|
In the default case, that is an operation following the format `vector1 <operator> vector2`.
|
||||||
|
Two entries match if they have the exact same set of labels and corresponding values.
|
||||||
|
The `ignoring` keyword allows ignoring certain labels when matching, while the
|
||||||
|
`on` keyword allows reducing the set of considered labels to a provided list:
|
||||||
|
|
||||||
|
<vector expr> <bin-op> ignoring(<label list>) <vector expr>
|
||||||
|
<vector expr> <bin-op> on(<label list>) <vector expr>
|
||||||
|
|
||||||
|
Example input:
|
||||||
|
|
||||||
|
method_code:http_errors:rate5m{method="get", code="500"} 24
|
||||||
|
method_code:http_errors:rate5m{method="get", code="404"} 30
|
||||||
|
method_code:http_errors:rate5m{method="put", code="501"} 3
|
||||||
|
method_code:http_errors:rate5m{method="post", code="500"} 6
|
||||||
|
method_code:http_errors:rate5m{method="post", code="404"} 21
|
||||||
|
|
||||||
|
method:http_requests:rate5m{method="get"} 600
|
||||||
|
method:http_requests:rate5m{method="del"} 34
|
||||||
|
method:http_requests:rate5m{method="post"} 120
|
||||||
|
|
||||||
|
Example query:
|
||||||
|
|
||||||
|
method_code:http_errors:rate5m{code="500"} / ignoring(code) method:http_requests:rate5m
|
||||||
|
|
||||||
|
This returns a result vector containing the fraction of HTTP requests with status code
|
||||||
|
of 500 for each method, as measured over the last 5 minutes. Without `ignoring(code)` there
|
||||||
|
would have been no match as the metrics do not share the same set of labels.
|
||||||
|
The entries with methods `put` and `del` have no match and will not show up in the result:
|
||||||
|
|
||||||
|
{method="get"} 0.04 // 24 / 600
|
||||||
|
{method="post"} 0.05 // 6 / 120
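The example result can be reproduced with a short Python sketch of one-to-one matching. The list-of-pairs vector model and the `divide_ignoring` helper are assumptions, and the sketch does not check that each match is unique.

```python
def divide_ignoring(left, right, ignored):
    """One-to-one division of (labels, value) pairs, ignoring some labels."""
    def key(labels):
        return frozenset((k, v) for k, v in labels.items() if k not in ignored)

    right_by_key = {key(labels): value for labels, value in right}
    result = []
    for labels, value in left:
        k = key(labels)
        if k in right_by_key:
            # The result carries only the labels used for matching.
            kept = {n: v for n, v in labels.items() if n not in ignored}
            result.append((kept, value / right_by_key[k]))
    return result


# method_code:http_errors:rate5m{code="500"}
errors_500 = [
    ({"method": "get", "code": "500"}, 24),
    ({"method": "post", "code": "500"}, 6),
]
# method:http_requests:rate5m
requests = [({"method": "get"}, 600), ({"method": "del"}, 34), ({"method": "post"}, 120)]

for labels, value in divide_ignoring(errors_500, requests, {"code"}):
    print(labels, value)
```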
|
||||||
|
|
||||||
|
**Many-to-one** and **one-to-many** matchings refer to the case where each vector element on
|
||||||
|
the "one"-side can match with multiple elements on the "many"-side. This has to
|
||||||
|
be explicitly requested using the `group_left` or `group_right` modifier, where
|
||||||
|
left/right determines which vector has the higher cardinality.
|
||||||
|
|
||||||
|
<vector expr> <bin-op> ignoring(<label list>) group_left(<label list>) <vector expr>
|
||||||
|
<vector expr> <bin-op> ignoring(<label list>) group_right(<label list>) <vector expr>
|
||||||
|
<vector expr> <bin-op> on(<label list>) group_left(<label list>) <vector expr>
|
||||||
|
<vector expr> <bin-op> on(<label list>) group_right(<label list>) <vector expr>
|
||||||
|
|
||||||
|
The label list provided with the group modifier contains additional labels from
|
||||||
|
the "one"-side to be included in the result metrics. For `on` a label can only
|
||||||
|
appear in one of the lists. Every time series of the result vector must be
|
||||||
|
uniquely identifiable.
|
||||||
|
|
||||||
|
_Grouping modifiers can only be used for
|
||||||
|
[comparison](#comparison-binary-operators) and
|
||||||
|
[arithmetic](#arithmetic-binary-operators) operations. The `and`, `unless`, and
|
||||||
|
`or` operations match with all possible entries in the right vector by
|
||||||
|
default._
|
||||||
|
|
||||||
|
Example query:
|
||||||
|
|
||||||
|
method_code:http_errors:rate5m / ignoring(code) group_left method:http_requests:rate5m
|
||||||
|
|
||||||
|
In this case the left vector contains more than one entry per `method` label
|
||||||
|
value. Thus, we indicate this using `group_left`. The elements from the right
|
||||||
|
side are now matched with multiple elements with the same `method` label on the
|
||||||
|
left:
|
||||||
|
|
||||||
|
{method="get", code="500"} 0.04 // 24 / 600
|
||||||
|
{method="get", code="404"} 0.05 // 30 / 600
|
||||||
|
{method="post", code="500"} 0.05 // 6 / 120
|
||||||
|
{method="post", code="404"} 0.175 // 21 / 120
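A Python sketch of the many-to-one case follows. It differs from one-to-one matching mainly in that each left-hand element keeps its full label set. The data mirrors the example input; the `divide_group_left` helper is hypothetical.

```python
def divide_group_left(left, right, ignored):
    """Many-to-one division: many left elements may share one right element."""
    def key(labels):
        return frozenset((k, v) for k, v in labels.items() if k not in ignored)

    right_by_key = {key(labels): value for labels, value in right}
    result = []
    for labels, value in left:
        k = key(labels)
        if k in right_by_key:
            # The "many" side keeps all of its labels, including `code`.
            result.append((dict(labels), value / right_by_key[k]))
    return result


errors = [
    ({"method": "get", "code": "500"}, 24),
    ({"method": "get", "code": "404"}, 30),
    ({"method": "put", "code": "501"}, 3),
    ({"method": "post", "code": "500"}, 6),
    ({"method": "post", "code": "404"}, 21),
]
requests = [({"method": "get"}, 600), ({"method": "del"}, 34), ({"method": "post"}, 120)]

for labels, value in divide_group_left(errors, requests, {"code"}):
    print(labels, value)
```

The `put` entry has no counterpart on the right-hand side, so it is absent from the output.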
|
||||||
|
|
||||||
|
_Many-to-one and one-to-many matching are advanced use cases that should be carefully considered.
|
||||||
|
Often a proper use of `ignoring(<labels>)` provides the desired outcome._
|
||||||
|
|
||||||
|
## Aggregation operators
|
||||||
|
|
||||||
|
Prometheus supports the following built-in aggregation operators that can be
|
||||||
|
used to aggregate the elements of a single instant vector, resulting in a new
|
||||||
|
vector of fewer elements with aggregated values:
|
||||||
|
|
||||||
|
* `sum` (calculate sum over dimensions)
|
||||||
|
* `min` (select minimum over dimensions)
|
||||||
|
* `max` (select maximum over dimensions)
|
||||||
|
* `avg` (calculate the average over dimensions)
|
||||||
|
* `stddev` (calculate population standard deviation over dimensions)
|
||||||
|
* `stdvar` (calculate population variance over dimensions)
|
||||||
|
* `count` (count number of elements in the vector)
|
||||||
|
* `count_values` (count number of elements with the same value)
|
||||||
|
* `bottomk` (smallest k elements by sample value)
|
||||||
|
* `topk` (largest k elements by sample value)
|
||||||
|
* `quantile` (calculate φ-quantile (0 ≤ φ ≤ 1) over dimensions)
|
||||||
|
|
||||||
|
These operators can either be used to aggregate over **all** label dimensions
|
||||||
|
or preserve distinct dimensions by including a `without` or `by` clause.
|
||||||
|
|
||||||
|
<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)] [keep_common]
|
||||||
|
|
||||||
|
`parameter` is only required for `count_values`, `quantile`, `topk` and
|
||||||
|
`bottomk`. `without` removes the listed labels from the result vector, while
|
||||||
|
all other labels are preserved in the output. `by` does the opposite and drops
|
||||||
|
labels that are not listed in the `by` clause, even if their label values are
|
||||||
|
identical between all elements of the vector. The `keep_common` clause allows
|
||||||
|
keeping those extra labels (labels that are identical between elements, but not
|
||||||
|
in the `by` clause).
|
||||||
|
|
||||||
|
`count_values` outputs one time series per unique sample value. Each series has
|
||||||
|
an additional label. The name of that label is given by the aggregation
|
||||||
|
parameter, and the label value is the unique sample value. The value of each
|
||||||
|
time series is the number of times that sample value was present.
|
||||||
|
|
||||||
|
`topk` and `bottomk` are different from other aggregators in that a subset of
|
||||||
|
the input samples, including the original labels, are returned in the result
|
||||||
|
vector. `by` and `without` are only used to bucket the input vector.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
If the metric `http_requests_total` had time series that fan out by
|
||||||
|
`application`, `instance`, and `group` labels, we could calculate the total
|
||||||
|
number of seen HTTP requests per application and group over all instances via:
|
||||||
|
|
||||||
|
sum(http_requests_total) without (instance)
|
||||||
|
|
||||||
|
If we are just interested in the total of HTTP requests we have seen in **all**
|
||||||
|
applications, we could simply write:
|
||||||
|
|
||||||
|
sum(http_requests_total)
|
||||||
|
|
||||||
|
To count the number of binaries running each build version we could write:
|
||||||
|
|
||||||
|
count_values("version", build_version)
|
||||||
|
|
||||||
|
To get the 5 largest HTTP request counts across all instances we could write:
|
||||||
|
|
||||||
|
topk(5, http_requests_total)
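The grouping logic behind these examples can be sketched in Python. The vector model (a list of label dictionary and value pairs), the sample data, and the helper names are assumptions; the sketch covers `sum ... without`, `count_values`, and `topk` without grouping.

```python
from collections import Counter, defaultdict


def sum_without(vector, dropped):
    """Sum values per group, where a group is the label set minus `dropped`."""
    totals = defaultdict(float)
    for labels, value in vector:
        group = frozenset((k, v) for k, v in labels.items() if k not in dropped)
        totals[group] += value
    return dict(totals)


def count_values(vector):
    """Count how often each distinct sample value occurs."""
    return Counter(value for _, value in vector)


def topk(k, vector):
    """The k elements with the largest sample values, labels preserved."""
    return sorted(vector, key=lambda item: item[1], reverse=True)[:k]


http_requests_total = [
    ({"application": "a", "group": "canary", "instance": "0"}, 10.0),
    ({"application": "a", "group": "canary", "instance": "1"}, 20.0),
    ({"application": "b", "group": "prod", "instance": "0"}, 5.0),
]
print(sum_without(http_requests_total, {"instance"}))
print(topk(2, http_requests_total))
```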
|
||||||
|
|
||||||
|
## Binary operator precedence
|
||||||
|
|
||||||
|
The following list shows the precedence of binary operators in Prometheus, from
|
||||||
|
highest to lowest.
|
||||||
|
|
||||||
|
1. `^`
|
||||||
|
2. `*`, `/`, `%`
|
||||||
|
3. `+`, `-`
|
||||||
|
4. `==`, `!=`, `<=`, `<`, `>=`, `>`
|
||||||
|
5. `and`, `unless`
|
||||||
|
6. `or`
|
||||||
|
|
||||||
|
Operators on the same precedence level are left-associative. For example,
|
||||||
|
`2 * 3 % 2` is equivalent to `(2 * 3) % 2`. However, `^` is right-associative,
|
||||||
|
so `2 ^ 3 ^ 2` is equivalent to `2 ^ (3 ^ 2)`.
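Python happens to follow the same associativity rules for these operators, so the two examples can be checked directly. In this sketch Python's `**` stands in for `^`; it says nothing about the other precedence levels.

```python
# Same precedence level, left-associative: parsed as (2 * 3) % 2.
left_assoc = 2 * 3 % 2

# Exponentiation is right-associative: parsed as 2 ** (3 ** 2).
right_assoc = 2 ** 3 ** 2

print(left_assoc, 2 * (3 % 2))       # the two groupings differ
print(right_assoc, (2 ** 3) ** 2)    # the two groupings differ
```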
|
|
@ -0,0 +1,66 @@
|
||||||
|
---
|
||||||
|
title: Recording rules
|
||||||
|
sort_rank: 6
|
||||||
|
---
|
||||||
|
|
||||||
|
# Defining recording rules
|
||||||
|
|
||||||
|
## Configuring rules
|
||||||
|
|
||||||
|
Prometheus supports two types of rules which may be configured and then
|
||||||
|
evaluated at regular intervals: recording rules and [alerting
|
||||||
|
rules](https://prometheus.io/docs/alerting/rules/). To include rules in
|
||||||
|
Prometheus, create a file containing the necessary rule statements and have
|
||||||
|
Prometheus load the file via the `rule_files` field in the [Prometheus
|
||||||
|
configuration](../configuration.md).
|
||||||
|
|
||||||
|
The rule files can be reloaded at runtime by sending `SIGHUP` to the Prometheus
|
||||||
|
process. The changes are only applied if all rule files are well-formatted.
|
||||||
|
|
||||||
|
## Syntax-checking rules
|
||||||
|
|
||||||
|
To quickly check whether a rule file is syntactically correct without starting
|
||||||
|
a Prometheus server, install and run Prometheus's `promtool` command-line
|
||||||
|
utility:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
go get github.com/prometheus/prometheus/cmd/promtool
|
||||||
|
promtool check-rules /path/to/example.rules
|
||||||
|
```
|
||||||
|
|
||||||
|
When the file is syntactically valid, the checker prints a textual
|
||||||
|
representation of the parsed rules to standard output and then exits with
|
||||||
|
a `0` return status.
|
||||||
|
|
||||||
|
If there are any syntax errors, it prints an error message to standard error
|
||||||
|
and exits with a `1` return status. On invalid input arguments the exit status
|
||||||
|
is `2`.
|
||||||
|
|
||||||
|
## Recording rules
|
||||||
|
|
||||||
|
Recording rules allow you to precompute frequently needed or computationally
|
||||||
|
expensive expressions and save their result as a new set of time series.
|
||||||
|
Querying the precomputed result will then often be much faster than executing
|
||||||
|
the original expression every time it is needed. This is especially useful for
|
||||||
|
dashboards, which need to query the same expression repeatedly every time they
|
||||||
|
refresh.
|
||||||
|
|
||||||
|
To add a new recording rule, add a line of the following syntax to your rule
|
||||||
|
file:
|
||||||
|
|
||||||
|
<new time series name>[{<label overrides>}] = <expression to record>
|
||||||
|
|
||||||
|
Some examples:
|
||||||
|
|
||||||
|
# Saving the per-job HTTP in-progress request count as a new set of time series:
|
||||||
|
job:http_inprogress_requests:sum = sum(http_inprogress_requests) by (job)
|
||||||
|
|
||||||
|
# Drop or rewrite labels in the result time series:
|
||||||
|
new_time_series{label_to_change="new_value",label_to_drop=""} = old_time_series
|
||||||
|
|
||||||
|
Recording rules are evaluated at the interval specified by the
|
||||||
|
`evaluation_interval` field in the Prometheus configuration. During each
|
||||||
|
evaluation cycle, the right-hand-side expression of the rule statement is
|
||||||
|
evaluated at the current instant in time and the resulting sample vector is
|
||||||
|
stored as a new set of time series with the current timestamp and a new metric
|
||||||
|
name (and perhaps an overridden set of labels).
|