Taylor Sly
9f9473859b
Fix description for NodeDiskIOSaturation alert (#2929)
...
NodeDiskIOSaturation description should say 30m per the "for" clause
Signed-off-by: Taylor Sly <slyt@users.noreply.github.com>
9 months ago
Anton Lugovoi
81fc05c45f
Make filesystem space prediction window configurable (#2844)
...
Signed-off-by: fitz123 <alugovoi@ordercapital.com>
1 year ago
Ayoub NASR
7333465abf
Add NodeBondingDegraded alert (#2843)
...
Signed-off-by: Ayoub Nasr <ayoub.nasr@scality.com>
1 year ago
Vitaly Zhuravlev
e8d7f4e8b3
Revert alerts pending durations
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly
3e250a95a0
Update NodeSystemSaturation severity
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
b7dfb32bfc
Set severity to NodeCPUHighUsage to info
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
6bdc1d9c98
Add thresholds for memory, disk and system alerts
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
77ae769179
Add thresholds for memory alerts
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
2111e70ac7
Add comma after 'mounted on'
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
e48e7909f4
Extend alert description
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
da32f8de17
Decrease NodeSystemdServiceFailed severity to warning
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
580c497261
Add NodeSystemSaturation and NodeMemoryMajorPagesFaults
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
e15e7d6a7b
Fix NodeMemoryHighUtilization alert
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
c3ec6e8af1
Add diskDevice selector
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
962de6c921
Add %(nodeExporterSelector)s to Network and conntrack alerts
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
94fc82e418
Add NodeDiskIOSaturation alert
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
614030bb80
Set 'at' everywhere as preposition for instance
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
3d8075da7d
Decrease NodeNetwork*Errs pending period
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
74794182a7
Add failed systemd service alert
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
fd2d62af63
Add CPU and memory alerts
...
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
0e0399d41e
Decrease NodeFilesystem pending time to 15m
...
30m is too long and there is a risk of running out of disk space/inodes completely if something is filling up the disk very fast (like a log file).
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Vitaly Zhuravlev
fc967aa992
Add mountpoint to NodeFilesystem alerts
...
This helps to identify the alerting filesystem.
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
1 year ago
Will Bollock
0a17e17718
docs (node/mixin): fix annotation for Skew alert (#2671)
...
This updates the annotation for the NodeClockSkewDetected mixin alert to
match the new threshold set.
Original discussion was in this PR: https://github.com/prometheus/node_exporter/pull/1480
I spent an embarrassingly large amount of time trying to figure out how
the heck that alert would mean 300s of clock skew. Turns out the
annotation was just left the same after the threshold change.
Signed-off-by: Will Bollock <wbollock@linode.com>
2 years ago
Jan Fajerski
87b8e3790d
docs/node-mixin: add fsMountpointSelector to alerts and dashboards (#2446)
...
* docs/node-mixin: add fsMountpointSelector
This adds the option to add a `mountpoint` selector to filesystem
related alerts. The default is `mountpoint!=""`.
* docs/node-mixins: add fsMountpointSelector to dashboards
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2 years ago
Paweł Krupa (paulfantom)
8571536327
docs/node-mixin: add missing selectors
...
Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
2 years ago
Daniel Lenar
0b50eb7294
Reverse fsSpaceAvailableCriticalThreshold and fsSpaceAvailableWarningThreshold
...
Currently the critical alert for available space fires at the warning
threshold and the warning alert fires at the critical threshold.
Signed-off-by: Daniel Lenar <dlenar@vailsys.com>
3 years ago
Vitaly Zhuravlev
8823605f12
Fix NodeFileDescriptorLimit alerts
...
Signed-off-by: Vitaly Zhuravlev <zhuravlev.vitaly@gmail.com>
3 years ago
paulfantom
832909dd25
docs/node-mixin/alerts: make NodeFilesystemAlmostOutOfSpace fire earlier
...
Signed-off-by: paulfantom <pawel@krupa.net.pl>
3 years ago
Loïc Blot
55ffe57cbc
feat(rules): add NodeFileDescriptorLimit kernel exhaustion alert
...
Add a new alert when fs.file-nr is close to fs.file-max
Signed-off-by: Loic Blot <loic.blot@unix-experience.fr>
4 years ago
Ben Kochie
eefb18db02
Merge pull request #1764 from dhoppe/patch-1
...
Use description instead of message as field for annotations
4 years ago
Ben Kochie
4b68aeb80a
Merge pull request #1862 from fsschmitt/fix/alerts-label-naming
...
fix: node_md_disks state label from fail to failed
4 years ago
Björn Rabenstein
9c9c636305
Merge pull request #1861 from paulfantom/network-alerts
...
docs/node-mixin/alerts: use ratio for network alerts
4 years ago
paulfantom
f81747e608
docs/node-mixin/alerts: add max error condition to alert about desynchronized clock
...
Signed-off-by: paulfantom <pawel@krupa.net.pl>
4 years ago
fsschmitt
effa4da989
fix: node_md_disks state label as failed
...
Signed-off-by: fsschmitt <492108+fsschmitt@users.noreply.github.com>
4 years ago
paulfantom
d7cbe85d22
docs/node-mixin/alerts: use a rate for network alerts
...
Signed-off-by: paulfantom <pawel@krupa.net.pl>
4 years ago
Nicolas Lamirault
ff2ff3410f
Configure 2 thresholds for NodeFilesystemAlmostOutOfSpace alert (#1835)
...
* Add: configure 2 thresholds for NodeFilesystemAlmostOutOfSpace alert
Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>
4 years ago
Rajat Vig
7dd8adf7ed
Fix NodeRAIDDegraded to not use a string rule expression
...
Signed-off-by: Rajat Vig <rvig@etsy.com>
4 years ago
Simon Pasquier
02212dd2c6
Run jsonnetfmt
...
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
4 years ago
Hao Ke
9b7a0d06a1
Fix syntax error
...
Signed-off-by: Hao Ke <hao.ke@auryc.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
4 years ago
paulfantom
e4ec8e04c5
docs/node-mixin: add alerts about failing RAID array
...
Signed-off-by: paulfantom <pawel@krupa.net.pl>
4 years ago
Dennis Hoppe
fc64b70386
Use description instead of message as field for annotations
...
Signed-off-by: Dennis Hoppe <github@debian-solutions.de>
4 years ago
Frederic Branczyk
b42819b69d
Merge pull request #1657 from povilasv/NodeTextFileCollectorScrapeError
...
Add NodeTextFileCollectorScrapeError alert to mixin
5 years ago
Povilas Versockas
bd3e6d224c
Add NodeTextFileCollectorScrapeError alert to mixin
...
Signed-off-by: Povilas Versockas <p.versockas@gmail.com>
5 years ago
beorn7
8b00b22904
Fix sign error in `NodeClockSkewDetected`
...
Signed-off-by: beorn7 <beorn@grafana.com>
5 years ago
paulfantom
820f8d595e
docs/node-mixin: alert on desynchronised clock
...
Signed-off-by: paulfantom <pawel@krupa.net.pl>
5 years ago
Neraud
1006a2c4bb
Add missing comma
...
Signed-off-by: Neraud <neraud.login@gmail.com>
5 years ago
Povilas Versockas
48bb6f670c
Add NodeHighNumberConntrackEntriesUsed
...
Signed-off-by: Povilas Versockas <p.versockas@gmail.com>
5 years ago
iuri aranda
0107bc7942
Make FS space alerts thresholds configurable (#1624)
...
* Make FS space alerts thresholds configurable (#1)
This makes it possible to tweak the thresholds for
the NodeFilesystemSpaceFillingUp alerts, which
might be necessary in systems like Kubernetes,
where the image garbage collector runs at 85%,
so it's not a problem that the disk reaches that usage %.
Signed-off-by: iuri aranda <iuri@skyscrapers.eu>
5 years ago
Leo
dfeec07f2f
Fix node-mixin prometheus alert rules to use percentage
...
Signed-off-by: Leo <leonardjonathanoh@live.com>
5 years ago
beorn7
97ef113762
Make the severity of "critical" alerts configurable
...
This addresses the blissful scenario where single-node failures are
unproblematic. No reason to wake somebody up if a node is about to
screw itself up by filling the disk.
Signed-off-by: beorn7 <beorn@grafana.com>
5 years ago