* Make FS space alerts thresholds configurable (#1)
This makes it possible to tweak the thresholds for
the NodeFilesystemSpaceFillingUp alerts, which
might be necessary in systems like Kubernetes,
where the image garbage collector runs at 85% disk usage,
so it is not a problem when the disk reaches that level.
Signed-off-by: iuri aranda <iuri@skyscrapers.eu>
Update CHANGELOG/VERSION for 1.0.0-rc.0 release.
* Add a note about new https settings to top-level README.
* Mark --web.config flag as experimental.
Signed-off-by: Ben Kochie <superq@gmail.com>
* Use `strconv.Itoa()` instead of `fmt.Sprintf()` for simple conversion.
* Eliminate copy-paste in collector setup.
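
As a minimal sketch of the conversion change above: for a plain integer, `strconv.Itoa()` yields the same string as `fmt.Sprintf("%d", ...)` without going through format-string parsing. The helper name `cpuLabel` is hypothetical and not taken from the collector code.

```go
package main

import (
	"fmt"
	"strconv"
)

// cpuLabel formats a CPU index as a label value.
// The name is hypothetical and only used for illustration.
func cpuLabel(cpu int) string {
	// Equivalent to fmt.Sprintf("%d", cpu), but skips format parsing.
	return strconv.Itoa(cpu)
}

func main() {
	// Both conversions produce the same string for a plain integer.
	fmt.Println(cpuLabel(3) == fmt.Sprintf("%d", 3))
}
```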
Signed-off-by: Ben Kochie <superq@gmail.com>
* Add a map of profilers to CPU ids
`runtime.NumCPU()` returns the number of CPUs that the process can run
on. This number does not necessarily correspond to the CPU ids if the
affinity mask of the process is set.
This change maintains the current behavior as the default, but also allows
the user to specify a range of CPU ids to use instead.
The CPU id is stored as the value of a map keyed on the profiler
object's address.
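
The mapping described above could look roughly like the following sketch. The `profiler` type, the `parseCPURange` and `newProfilers` helpers, and the `start-end` range syntax are all assumptions made for illustration. They are not the collector's actual types or flag format.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// profiler is a hypothetical stand-in for a per-CPU profiler object.
// It has a field so that every allocation gets a distinct address.
type profiler struct {
	name string
}

// parseCPURange parses an inclusive range such as "2-5" into CPU ids.
// The "start-end" syntax is an assumption made for this sketch.
func parseCPURange(s string) ([]int, error) {
	lo, hi, found := strings.Cut(s, "-")
	if !found {
		return nil, fmt.Errorf("invalid CPU range %q", s)
	}
	start, err := strconv.Atoi(lo)
	if err != nil {
		return nil, fmt.Errorf("invalid range start %q: %w", lo, err)
	}
	end, err := strconv.Atoi(hi)
	if err != nil {
		return nil, fmt.Errorf("invalid range end %q: %w", hi, err)
	}
	if start < 0 || end < start {
		return nil, fmt.Errorf("invalid CPU range %q", s)
	}
	ids := make([]int, 0, end-start+1)
	for id := start; id <= end; id++ {
		ids = append(ids, id)
	}
	return ids, nil
}

// newProfilers creates one profiler per CPU id and stores the id as the
// value of a map keyed on the profiler's address.
func newProfilers(ids []int) map[*profiler]int {
	m := make(map[*profiler]int, len(ids))
	for _, id := range ids {
		m[&profiler{name: "profiler-" + strconv.Itoa(id)}] = id
	}
	return m
}

func main() {
	ids, err := parseCPURange("2-5")
	if err != nil {
		panic(err)
	}
	// Map iteration order is not defined, so lines may print in any order.
	for p, id := range newProfilers(ids) {
		fmt.Printf("%s -> CPU %d\n", p.name, id)
	}
}
```

Without a user-supplied range, the same map could be built from the ids 0 through `runtime.NumCPU()-1`, which would correspond to the default behavior.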
Signed-off-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Signed-off-by: Daniel Hodges <hodges@uber.com>
Co-authored-by: jdamato-fsly <55214354+jdamato-fsly@users.noreply.github.com>
Many collectors depend on underlying features being enabled. This causes
confusion about what "success" means. This commit changes the behavior of the
`node_scrape_collector_success` metric.
* When a collector is unable to find data, don't return success.
* Catch the no-data error and log it at Debug level to avoid log spam.
* Update collectors to support this new functionality.
* Fix copy-pasta mistake in infiniband debug message.
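
The success handling listed above might be sketched as follows. The sentinel `errNoData` and the helper `scrapeResult` are hypothetical names for this illustration. The sketch only shows the idea of treating "no data" as a non-success that is logged quietly.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoData is a hypothetical sentinel a collector returns when the
// underlying feature is missing or disabled.
var errNoData = errors.New("collector returned no data")

// scrapeResult maps a collector error to the value of the success metric
// and the log level to use ("" means nothing is logged).
func scrapeResult(err error) (success float64, level string) {
	switch {
	case err == nil:
		return 1, ""
	case errors.Is(err, errNoData):
		// No data is not a success, but it is only logged at debug level.
		return 0, "debug"
	default:
		return 0, "error"
	}
}

func main() {
	wrapped := fmt.Errorf("infiniband: %w", errNoData)
	for _, err := range []error{nil, wrapped, errors.New("read failed")} {
		success, level := scrapeResult(err)
		fmt.Printf("err=%v success=%v level=%q\n", err, success, level)
	}
}
```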
Closes: https://github.com/prometheus/node_exporter/issues/1323
Signed-off-by: Ben Kochie <superq@gmail.com>
Let the node exporter collect the non-numeric data from
/sys/class/infiniband: board ID, firmware version, and HCA type.
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
Reuse the Go-only implementation already in place for FreeBSD (#385) on
Darwin, DragonflyBSD, NetBSD and OpenBSD.
Tested on all affected platforms.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
This exposes RAPL statistics from /sys/class/powercap.
Co-Authored-By: Ben Kochie <superq@gmail.com>
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
* Add unix socket support for supervisord collector
For example:
--collector.supervisord.url=unix:///var/run/supervisor.sock
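
A rough sketch of how a `unix://` URL like the one above can be turned into an HTTP client that dials a socket. The helpers `socketPath` and `unixClient` are hypothetical, and the sketch covers only the transport, not the supervisord protocol itself.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"net/url"
)

// socketPath returns the filesystem path of a unix:// URL. The boolean is
// false when the URL uses another scheme, such as http.
func socketPath(raw string) (string, bool, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", false, err
	}
	if u.Scheme != "unix" {
		return "", false, nil
	}
	return u.Path, true, nil
}

// unixClient returns an HTTP client whose connections always go to the
// given Unix socket, regardless of the host in the request URL.
func unixClient(path string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", path)
			},
		},
	}
}

func main() {
	path, isUnix, err := socketPath("unix:///var/run/supervisor.sock")
	if err != nil {
		panic(err)
	}
	fmt.Println(path, isUnix)
	_ = unixClient(path) // the client is created but not used in this sketch
}
```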
Fixes prometheus/node_exporter#262
Signed-off-by: Paul Cameron <cameronpm@gmail.com>
Integer division and the order of operations when converting Mbps to Bps
result in a loss of accuracy if the interface speeds are set low.
For example, 100 Mbps is reported as 12000000 Bps but should be 12500000,
and 10 Mbps is reported as 1000000 Bps but should be 1250000.
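
The numbers above can be reproduced with a small sketch. The function names are hypothetical, and the two expressions are an assumption chosen because they match the reported values. Dividing by 8 first discards the remainder, while multiplying first keeps it.

```go
package main

import "fmt"

// truncatingBytesPerSec divides before multiplying, so the remainder of
// mbps/8 is lost to integer division.
func truncatingBytesPerSec(mbps int64) int64 {
	return mbps / 8 * 1000 * 1000
}

// bytesPerSec multiplies first and divides last, keeping full precision.
func bytesPerSec(mbps int64) int64 {
	return mbps * 1000 * 1000 / 8
}

func main() {
	for _, mbps := range []int64{100, 10} {
		fmt.Printf("%d Mbps: truncating=%d correct=%d\n",
			mbps, truncatingBytesPerSec(mbps), bytesPerSec(mbps))
	}
}
```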
Signed-off-by: Thomas Lin <t.lin@mail.utoronto.ca>
This will now use `bcstats.numbufpages` instead of `uvmexp.vnodepages`.
Inspired by OpenBSD's `src/usr.bin/top`
Signed-off-by: Matthieu Guegan <matthieu.guegan@deindeal.ch>
* Add diskstat flush request counters for Linux 5.5+
* Update tests for diskstat flush request counters with Linux 5.5+
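
As a hedged illustration of the flush counters: on Linux 5.5 and later, each `/proc/diskstats` line carries two extra trailing fields, the number of flush requests and the time spent flushing in milliseconds. The helper `flushStats` and the sample line are made up for this sketch, which assumes the 20-column layout.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// flushStats extracts the flush request count and flush time (ms) from one
// /proc/diskstats line. ok is false when the line has fewer than 20 columns,
// as on kernels older than 5.5.
func flushStats(line string) (requests, timeMs uint64, ok bool, err error) {
	fields := strings.Fields(line)
	if len(fields) < 20 {
		return 0, 0, false, nil
	}
	requests, err = strconv.ParseUint(fields[18], 10, 64)
	if err != nil {
		return 0, 0, false, fmt.Errorf("flush requests: %w", err)
	}
	timeMs, err = strconv.ParseUint(fields[19], 10, 64)
	if err != nil {
		return 0, 0, false, fmt.Errorf("flush time: %w", err)
	}
	return requests, timeMs, true, nil
}

func main() {
	// Sample line with made-up values in the 20-column layout.
	line := "259 0 nvme0n1 47114 4 4643973 21650 1078320 43950 39451633 1011053 0 222766 1032546 0 0 0 0 1295 12"
	requests, timeMs, ok, err := flushStats(line)
	fmt.Println(requests, timeMs, ok, err)
}
```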
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
* Add makefile target to update sysfs fixtures.
* Use similar style for fixtures from procfs.
* Re-pack fixtures ttar file.
Signed-off-by: Ben Kochie <superq@gmail.com>
Collect the InfiniBand port state, the physical state, and the maximum
signal transfer rate.
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
Add support for https connections.
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Ben RIdley <benridley29@gmail.com>
We actually have to count or sum, respectively, _all_ the selected
metrics for the cluster-wide view, which means it's easiest to use the
`scalar` approach after all (but only in the cluster dashboard). This
still propagates all the labels.
I have extended the comment for the `nodeExporterSelector` to note
that the cluster dashboard only makes sense if all the selected node
exporters actually belong to the same cluster.
Since this is jsonnet, users can easily disable the cluster
dashboard, or even create multiple instances of the dashboards with
different `nodeExporterSelector`s for different clusters.
Signed-off-by: beorn7 <beorn@grafana.com>
The `instance:node_memory_swap_io_pages:rate1m` rule was intended to
measure the amount of memory pressure a system is under, but its name is
a bit misleading (it specifically refers to swap), and the rate of
`node_vmstat_pgmajfault` is a better metric for memory pressure
(see #1524).
This commit renames `instance:node_memory_swap_io_pages:rate1m` to
`instance:node_vmstat_pgmajfault:rate1m`, and defines it as
`rate(node_vmstat_pgmajfault{%(nodeExporterSelector)s}[1m])`. The
dashboards are updated accordingly.
Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>