mirror of https://github.com/prometheus/prometheus
PromQL engine: Delay deletion of __name__ label to the end of the query evaluation

- This change allows optionally preserving the `__name__` label via the `label_replace` and `label_join` functions, and helps prevent the dreaded "vector cannot contain metrics with the same labelset" error.
- The implementation extends the `Series` and `Sample` structs with a boolean flag indicating whether the `__name__` label should be deleted at the end of the query evaluation.
- The `label_replace` and `label_join` functions can still access the value of the `__name__` label, even if it has been previously marked for deletion. If `__name__` is used as target label, it won't be dropped at the end of the query evaluation.
- Fixes https://github.com/prometheus/prometheus/issues/11397
- See https://github.com/jcreixell/prometheus/pull/2 for previous discussion, including the decision to create this PR and benchmark it before considering other alternatives (like refactoring `labels.Labels`).
- See https://github.com/jcreixell/prometheus/pull/1 for an alternative implementation using a special label instead of boolean flags.
- Note: a feature flag `promql-delayed-name-removal` has been added as it changes the behavior of some "weird" queries (see https://github.com/prometheus/prometheus/issues/11397#issuecomment-1451998792)

Example (this always fails, as `__name__` is being dropped by `count_over_time`):

```
count_over_time({__name__!=""}[1m])
=> Error executing query: vector cannot contain metrics with the same labelset
```

Before:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=> Error executing query: vector cannot contain metrics with the same labelset
```

After:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=>
count_go_gc_cycles_automatic_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
count_go_gc_cycles_forced_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
...
```

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

---------

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>

pull/14765/head
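The flag-based approach in the commit message can be modelled roughly as below. This is a minimal sketch, not the engine's code: the `Sample` type, the `DropName` field, and the helpers `markForDrop`, `labelReplace`, and `finalize` are hypothetical names chosen for illustration, and label sets are plain maps rather than `labels.Labels`. It assumes that name-dropping functions only set a flag, that a `label_replace`-style rewrite can still read the flagged name, and that writing to `__name__` clears the flag.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// Sample is a hypothetical stand-in for the engine's sample type.
// DropName models the boolean flag described in the commit message.
type Sample struct {
	Labels   map[string]string
	Value    float64
	DropName bool
}

// markForDrop models a function such as rate(): instead of deleting
// __name__ immediately, it only sets the flag.
func markForDrop(s Sample) Sample {
	s.DropName = true
	return s
}

// labelReplace is a simplified model of label_replace. The source label is
// readable even when flagged; writing to __name__ clears the flag.
func labelReplace(s Sample, dst, repl, src, regex string) Sample {
	re := regexp.MustCompile("^(?:" + regex + ")$")
	val := s.Labels[src]
	m := re.FindStringSubmatchIndex(val)
	if m == nil {
		return s // no match: the sample is returned unchanged
	}
	out := Sample{Labels: map[string]string{}, Value: s.Value, DropName: s.DropName}
	for k, v := range s.Labels {
		out.Labels[k] = v
	}
	out.Labels[dst] = string(re.ExpandString(nil, repl, val, m))
	if dst == "__name__" {
		out.DropName = false
	}
	return out
}

// finalize runs once at the end of evaluation: it removes flagged names and
// rejects results that contain the same label set twice.
func finalize(in []Sample) ([]string, error) {
	seen := map[string]bool{}
	var out []string
	for _, s := range in {
		var parts []string
		for k, v := range s.Labels {
			if k == "__name__" && s.DropName {
				continue
			}
			parts = append(parts, fmt.Sprintf("%s=%q", k, v))
		}
		sort.Strings(parts)
		key := "{" + strings.Join(parts, ",") + "}"
		if seen[key] {
			return nil, errors.New("vector cannot contain metrics with the same labelset")
		}
		seen[key] = true
		out = append(out, key)
	}
	return out, nil
}

func main() {
	rated := []Sample{
		markForDrop(Sample{Labels: map[string]string{"__name__": "metric", "env": "1"}, Value: 0.2}),
		markForDrop(Sample{Labels: map[string]string{"__name__": "another_metric", "env": "1"}, Value: 0.2}),
	}
	_, err := finalize(rated)
	fmt.Println("without label_replace:", err)

	var renamed []Sample
	for _, s := range rated {
		renamed = append(renamed, labelReplace(s, "__name__", "rate_$1", "__name__", "(.+)"))
	}
	res, err := finalize(renamed)
	fmt.Println("with label_replace:", res, err)
}
```

In this model the duplicate check only runs in `finalize`, which is why rewriting `__name__` before the end of the evaluation is enough to keep the two series distinct.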
Jorge Creixell, 3 months ago, committed by GitHub
9 changed files with 309 additions and 80 deletions
@@ -0,0 +1,84 @@
# Test for __name__ label drop.
load 5m
  metric{env="1"} 0 60 120
  another_metric{env="1"} 60 120 180

# Does not drop __name__ for vector selector
eval instant at 15m metric{env="1"}
  metric{env="1"} 120

# Drops __name__ for unary operators
eval instant at 15m -metric
  {env="1"} -120

# Drops __name__ for binary operators
eval instant at 15m metric + another_metric
  {env="1"} 300

# Does not drop __name__ for binary comparison operators
eval instant at 15m metric <= another_metric
  metric{env="1"} 120

# Drops __name__ for binary comparison operators with "bool" modifier
eval instant at 15m metric <= bool another_metric
  {env="1"} 1

# Drops __name__ for vector-scalar operations
eval instant at 15m metric * 2
  {env="1"} 240

# Drops __name__ for instant-vector functions
eval instant at 15m clamp(metric, 0, 100)
  {env="1"} 100

# Drops __name__ for range-vector functions
eval instant at 15m rate(metric{env="1"}[10m])
  {env="1"} 0.2

# Does not drop __name__ for last_over_time function
eval instant at 15m last_over_time(metric{env="1"}[10m])
  metric{env="1"} 120

# Drops __name__ for other _over_time functions
eval instant at 15m max_over_time(metric{env="1"}[10m])
  {env="1"} 120

# Allows relabeling (to-be-dropped) __name__ via label_replace
eval instant at 15m label_replace(rate({env="1"}[10m]), "my_name", "rate_$1", "__name__", "(.+)")
  {my_name="rate_metric", env="1"} 0.2
  {my_name="rate_another_metric", env="1"} 0.2

# Allows preserving __name__ via label_replace
eval instant at 15m label_replace(rate({env="1"}[10m]), "__name__", "rate_$1", "__name__", "(.+)")
  rate_metric{env="1"} 0.2
  rate_another_metric{env="1"} 0.2

# Allows relabeling (to-be-dropped) __name__ via label_join
eval instant at 15m label_join(rate({env="1"}[10m]), "my_name", "_", "__name__")
  {my_name="metric", env="1"} 0.2
  {my_name="another_metric", env="1"} 0.2

# Allows preserving __name__ via label_join
eval instant at 15m label_join(rate({env="1"}[10m]), "__name__", "_", "__name__", "env")
  metric_1{env="1"} 0.2
  another_metric_1{env="1"} 0.2

# Does not drop metric names for aggregation operators
eval instant at 15m sum by (__name__, env) (metric{env="1"})
  metric{env="1"} 120

# Aggregation operators by __name__ lead to duplicate labelset errors (aggregation is partitioned by not yet removed __name__ label)
# This is an accidental side effect of delayed __name__ label dropping
eval_fail instant at 15m sum by (__name__) (rate({env="1"}[10m]))

# Aggregation operators aggregate metrics with same labelset and to-be-dropped names
# This is an accidental side effect of delayed __name__ label dropping
eval instant at 15m sum(rate({env="1"}[10m])) by (env)
  {env="1"} 0.4

# Aggregation operators propagate __name__ label dropping information
eval instant at 15m topk(10, sum by (__name__, env) (metric{env="1"}))
  metric{env="1"} 120

eval instant at 15m topk(10, sum by (__name__, env) (rate(metric{env="1"}[10m])))
  {env="1"} 0.2
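The two aggregation side effects noted in the test comments above can be reproduced with a small model. This is a sketch under assumptions, not the engine's aggregation code: `Series`, `DropName`, `sumBy`, and `visibleLabelsets` are hypothetical names, grouping uses a naive string key, and a group is assumed to inherit the drop flag from its first input. It shows that grouping by a still-present `__name__` yields two groups whose label sets collide once the name is removed, while grouping by `env` merges both series into one.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Series is a hypothetical, simplified result series; DropName mirrors the
// delayed-removal flag from the commit message.
type Series struct {
	Labels   map[string]string
	Value    float64
	DropName bool
}

// sumBy groups series by the given labels and sums their values. While
// grouping, a flagged __name__ is still visible as an ordinary label, and
// each group inherits the flag from its first input (an assumption).
func sumBy(in []Series, by ...string) []Series {
	groups := map[string]*Series{}
	var order []string
	for _, s := range in {
		lbls := map[string]string{}
		var parts []string
		for _, l := range by {
			if v, ok := s.Labels[l]; ok {
				lbls[l] = v
				parts = append(parts, l+"="+v)
			}
		}
		key := strings.Join(parts, ";")
		g, ok := groups[key]
		if !ok {
			g = &Series{Labels: lbls, DropName: s.DropName}
			groups[key] = g
			order = append(order, key)
		}
		g.Value += s.Value
	}
	out := make([]Series, 0, len(order))
	for _, key := range order {
		out = append(out, *groups[key])
	}
	return out
}

// visibleLabelsets renders each series as it would look after the final
// name removal, so collisions become obvious.
func visibleLabelsets(in []Series) []string {
	out := make([]string, 0, len(in))
	for _, s := range in {
		var parts []string
		for k, v := range s.Labels {
			if k == "__name__" && s.DropName {
				continue
			}
			parts = append(parts, fmt.Sprintf("%s=%q", k, v))
		}
		sort.Strings(parts)
		out = append(out, "{"+strings.Join(parts, ",")+"}")
	}
	return out
}

func main() {
	// Models rate({env="1"}[10m]): two series, both flagged for name removal.
	rated := []Series{
		{Labels: map[string]string{"__name__": "metric", "env": "1"}, Value: 0.2, DropName: true},
		{Labels: map[string]string{"__name__": "another_metric", "env": "1"}, Value: 0.2, DropName: true},
	}

	byName := sumBy(rated, "__name__")
	fmt.Println("sum by (__name__):", visibleLabelsets(byName))

	byEnv := sumBy(rated, "env")
	fmt.Println("sum by (env):", visibleLabelsets(byEnv), byEnv[0].Value)
}
```

The first call leaves two groups that render identically after name removal, which corresponds to the `eval_fail` case; the second leaves a single group, matching the `sum(...) by (env)` case.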