mixin: exclude iowait and steal from CPU Utilisation (#2194)

'iowait' and 'steal' indicate specific idle/wait states, which shouldn't
be counted into CPU Utilisation. Also see
https://github.com/prometheus-operator/kube-prometheus/pull/796 and
https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/667.

Per the iostat man page:

%idle
    Show the percentage of time that the CPU or CPUs were idle and the
    system did not have an outstanding disk I/O request.

%iowait
    Show the percentage of time that the CPU or CPUs were idle during
    which the system had an outstanding disk I/O request.

%steal
    Show the percentage of time spent in involuntary wait by the
    virtual CPU or CPUs while the hypervisor was servicing another
    virtual processor.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Julian Wiedmann 3 years ago committed by GitHub
parent
commit
3e6f4ce627
  1. docs/node-mixin/dashboards/node.libsonnet (4 changed lines)
  2. docs/node-mixin/rules/rules.libsonnet (6 changed lines)

docs/node-mixin/dashboards/node.libsonnet (4 changed lines)

@@ -23,9 +23,9 @@ local gauge = promgrafonnet.gauge;
 .addTarget(prometheus.target(
 |||
 (
-(1 - rate(node_cpu_seconds_total{%(nodeExporterSelector)s, mode="idle", instance="$instance"}[$__rate_interval]))
+(1 - sum without (mode) (rate(node_cpu_seconds_total{%(nodeExporterSelector)s, mode=~"idle|iowait|steal", instance="$instance"}[$__rate_interval])))
 / ignoring(cpu) group_left
-count without (cpu)( node_cpu_seconds_total{%(nodeExporterSelector)s, mode="idle", instance="$instance"})
+count without (cpu, mode) (node_cpu_seconds_total{%(nodeExporterSelector)s, mode="idle", instance="$instance"})
 )
 ||| % $._config,
 legendFormat='{{cpu}}',
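
The dashboard expression charts each CPU's share of the instance's total utilisation: the busy fraction of that CPU divided by the number of CPUs. Because the left-hand side no longer carries a `mode` label after `sum without (mode)`, the count on the right-hand side has to drop `mode` as well for the vector match to succeed. The arithmetic can be sketched in Python as follows; the sample rates and the name `per_cpu_share` are hypothetical and only serve the illustration.

```python
# Hypothetical per-second rates of node_cpu_seconds_total, keyed by (cpu, mode).
# Each CPU's modes sum to 1.0, i.e. one CPU-second per second.
RATES = {
    ("0", "idle"): 0.5, ("0", "iowait"): 0.25, ("0", "steal"): 0.125, ("0", "user"): 0.125,
    ("1", "idle"): 0.75, ("1", "iowait"): 0.0, ("1", "steal"): 0.0, ("1", "user"): 0.25,
}

# Modes that the commit excludes from CPU utilisation.
NON_BUSY_MODES = ("idle", "iowait", "steal")


def per_cpu_share(rates):
    """Return {cpu: share of total instance utilisation contributed by that CPU}."""
    # Equivalent of: sum without (mode) (rate(...{mode=~"idle|iowait|steal"}))
    non_busy = {}
    for (cpu, mode), rate in rates.items():
        if mode in NON_BUSY_MODES:
            non_busy[cpu] = non_busy.get(cpu, 0.0) + rate

    # Equivalent of: count without (cpu, mode) (...{mode="idle"}), i.e. the CPU count.
    cpu_count = len({cpu for (cpu, mode) in rates if mode == "idle"})

    # (1 - non-busy fraction) / ignoring(cpu) group_left cpu_count
    return {cpu: (1.0 - fraction) / cpu_count for cpu, fraction in non_busy.items()}


shares = per_cpu_share(RATES)
for cpu in sorted(shares):
    print(f"cpu {cpu}: {shares[cpu]:.4f}")
```

With these sample values the shares add up to the instance-wide utilisation, which is why the panel can plot them as one series per CPU.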

docs/node-mixin/rules/rules.libsonnet (6 changed lines)

@@ -14,11 +14,11 @@
 ||| % $._config,
 },
 {
-// CPU utilisation is % CPU is not idle.
+// CPU utilisation is % CPU without {idle,iowait,steal}.
 record: 'instance:node_cpu_utilisation:rate%(rateInterval)s' % $._config,
 expr: |||
-1 - avg without (cpu, mode) (
-rate(node_cpu_seconds_total{%(nodeExporterSelector)s, mode="idle"}[%(rateInterval)s])
+1 - avg without (cpu) (
+sum without (mode) (rate(node_cpu_seconds_total{%(nodeExporterSelector)s, mode=~"idle|iowait|steal"}[%(rateInterval)s]))
 )
 ||| % $._config,
 },
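
To see what the recording-rule change does to the reported value, the following Python sketch evaluates both the old definition (idle only) and the new one (idle, iowait and steal) on the same made-up sample. The function name `instance_utilisation` and the sample rates are assumptions for illustration, not part of the mixin.

```python
# Hypothetical per-second rates of node_cpu_seconds_total, keyed by (cpu, mode).
SAMPLE = {
    ("0", "idle"): 0.5, ("0", "iowait"): 0.25, ("0", "steal"): 0.125, ("0", "user"): 0.125,
    ("1", "idle"): 0.75, ("1", "iowait"): 0.0, ("1", "steal"): 0.0, ("1", "user"): 0.25,
}


def instance_utilisation(rates, non_busy_modes):
    """Mimic: 1 - avg without (cpu) (sum without (mode) (rate(...{mode=~modes})))."""
    per_cpu = {}
    for (cpu, mode), rate in rates.items():
        if mode in non_busy_modes:
            # Inner aggregation: add up the selected modes for each CPU.
            per_cpu[cpu] = per_cpu.get(cpu, 0.0) + rate
    # Outer aggregation: average the per-CPU non-busy fraction across CPUs.
    return 1.0 - sum(per_cpu.values()) / len(per_cpu)


old_value = instance_utilisation(SAMPLE, ("idle",))
new_value = instance_utilisation(SAMPLE, ("idle", "iowait", "steal"))
print(f"old rule (idle only):         {old_value:.4f}")
print(f"new rule (idle/iowait/steal): {new_value:.4f}")
```

On this sample the old rule counts time spent in iowait and steal as utilisation, so it reports a higher figure than the new rule does for the same workload.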
