In certain instances on heavily loaded nodes with many network
devices, there may be concurrent access to the netdev collector's
`metricDescs` map, resulting in a panic. This adds a mutex to prevent
concurrent reads and writes to the map.
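
A minimal sketch of the locking pattern, not the collector's actual code. The `descCache` type, its `get` method, and the string values are hypothetical stand-ins for the `metricDescs` map and its `*prometheus.Desc` entries:

```go
package main

import (
	"fmt"
	"sync"
)

// descCache is a hypothetical stand-in for the collector's metricDescs map.
// Strings are stored instead of *prometheus.Desc to keep the sketch stdlib-only.
type descCache struct {
	mu    sync.Mutex
	descs map[string]string
}

func newDescCache() *descCache {
	return &descCache{descs: make(map[string]string)}
}

// get returns the cached description for key, creating it on first use.
// The lock covers both the lookup and the insert, so concurrent scrapes
// never read the map while another goroutine writes to it.
func (c *descCache) get(key string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	d, ok := c.descs[key]
	if !ok {
		d = "node_network_" + key + "_total"
		c.descs[key] = d
	}
	return d
}

func main() {
	c := newDescCache()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.get("receive_bytes")
		}()
	}
	wg.Wait()
	fmt.Println(c.get("receive_bytes"))
}
```

Running the sketch with `go run -race` should report no data race; removing the lock from `get` should make the race detector complain.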
Signed-off-by: Brad Ison <bison@xvdf.io>
Move the systemd version function to an exporter method. This way we can
update the version information at every scrape, in case the underlying
version changes.
Signed-off-by: Ben Kochie <superq@gmail.com>
systemd patch versions are as important as the major version number;
they indicate security or bug fixes or other behavioural changes between
versions.
Use float64 over float32 as the rounding error with float32 rendered
250.3 as 250.3000030517578 in my testing.
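
The rounding difference can be reproduced with a small sketch. The helper names `parseVersion32` and `parseVersion64` are hypothetical and only illustrate the two precisions, they are not the exporter's functions:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseVersion32 parses s at float32 precision. strconv.ParseFloat still
// returns a float64, but the value is rounded to what a float32 can hold.
func parseVersion32(s string) (float64, error) {
	return strconv.ParseFloat(s, 32)
}

// parseVersion64 parses s at full float64 precision.
func parseVersion64(s string) (float64, error) {
	return strconv.ParseFloat(s, 64)
}

func main() {
	v32, _ := parseVersion32("250.3")
	v64, _ := parseVersion64("250.3")
	// The two values differ: the float32 path carries the rounding error.
	fmt.Println(v32, v64, v32 == v64)
}
```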
Signed-off-by: Joe Groocock <jgroocock@cloudflare.com>
Signed-off-by: Joe Groocock <me@frebib.net>
Analogous to the /var/lib/docker exclude added in
https://github.com/prometheus/node_exporter/pull/814
Rootful podman containers mount e.g. shm filesystems at
/var/lib/containers/storage/*-containers/*/userdata/shm. These should be
treated like mounts under /var/lib/docker by default.
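
A rough sketch of how such an exclude could be expressed as a regular expression. The pattern below is illustrative and assumed, not necessarily the exporter's exact default, and `excluded` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"regexp"
)

// mountPointsExcluded is an illustrative pattern (an assumption, not the
// exporter's exact default) covering docker and podman storage paths.
var mountPointsExcluded = regexp.MustCompile(
	`^/(dev|proc|sys|var/lib/docker/.+|var/lib/containers/storage/.+)($|/)`)

// excluded reports whether a mount point should be skipped.
func excluded(mountPoint string) bool {
	return mountPointsExcluded.MatchString(mountPoint)
}

func main() {
	for _, mp := range []string{
		"/var/lib/containers/storage/overlay-containers/abc/userdata/shm",
		"/var/lib/docker/containers/abc/mounts/shm",
		"/home",
	} {
		fmt.Printf("%s excluded=%v\n", mp, excluded(mp))
	}
}
```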
Signed-off-by: Lauri Tirkkonen <lauri@hacktheplanet.fi>
Allow filtering ARP entries based on device. Useful for ignoring
entries for network namespaces (containers).
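
A minimal sketch of an include/exclude device filter, assuming regular-expression patterns. The `deviceFilter` type and the example patterns are hypothetical and only show the idea:

```go
package main

import (
	"fmt"
	"regexp"
)

// deviceFilter is a hypothetical helper: a device is ignored when it
// matches the exclude pattern or fails to match the include pattern.
type deviceFilter struct {
	exclude *regexp.Regexp
	include *regexp.Regexp
}

// newDeviceFilter compiles the patterns; an empty pattern disables that side.
func newDeviceFilter(excludePattern, includePattern string) deviceFilter {
	var f deviceFilter
	if excludePattern != "" {
		f.exclude = regexp.MustCompile(excludePattern)
	}
	if includePattern != "" {
		f.include = regexp.MustCompile(includePattern)
	}
	return f
}

// ignored reports whether entries for the named device should be skipped.
func (f deviceFilter) ignored(name string) bool {
	return (f.exclude != nil && f.exclude.MatchString(name)) ||
		(f.include != nil && !f.include.MatchString(name))
}

func main() {
	// Example pattern only: skip container-side veth devices.
	f := newDeviceFilter("^veth.*", "")
	for _, dev := range []string{"eth0", "veth1a2b3c"} {
		fmt.Printf("%s ignored=%v\n", dev, f.ignored(dev))
	}
}
```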
Signed-off-by: Ben Kochie <superq@gmail.com>
This adds a new Linux metric, node_softirqs_total, which corresponds
to the 'softirq' line in /proc/stat. This metric is disabled by
default and it can be enabled with '--collector.stat.softirq'.
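
For illustration, a sketch of parsing the 'softirq' line. The first number is the total across all types, followed by one counter per softirq type. The label names and the `parseSoftirqLine` helper are assumptions for this sketch, not the collector's code:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// softirqNames lists the per-type counters in the order they follow the
// total on the softirq line. The label spellings are assumptions.
var softirqNames = []string{
	"hi", "timer", "net_tx", "net_rx", "block",
	"irq_poll", "tasklet", "sched", "hrtimer", "rcu",
}

// parseSoftirqLine turns "softirq <total> <hi> <timer> ..." into a map
// from softirq type to count. The leading total is skipped.
func parseSoftirqLine(line string) (map[string]uint64, error) {
	fields := strings.Fields(line)
	if len(fields) < 2+len(softirqNames) || fields[0] != "softirq" {
		return nil, fmt.Errorf("unexpected softirq line: %q", line)
	}
	counts := make(map[string]uint64, len(softirqNames))
	for i, name := range softirqNames {
		v, err := strconv.ParseUint(fields[i+2], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %s count: %w", name, err)
		}
		counts[name] = v
	}
	return counts, nil
}

func main() {
	line := "softirq 229245889 94 60001584 13619 5175704 2471304 28 51212741 59130143 0 51240672"
	counts, err := parseSoftirqLine(line)
	fmt.Println(counts, err)
}
```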
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Use the non-cgo version for all OpenBSD architectures.
The old code only pulled some defines from header files. Just add them
as enumerations in native Go. Also be careful about what SysctlRaw returns.
Implement a way that supports both recent and old pre-6.4 OpenBSD systems.
With go-1.16, OpenBSD binaries link to libc, and because of this binaries
built on OpenBSD 6.9-current do not run on OpenBSD 6.3. OpenBSD 6.3 has also
been unsupported for more than 2 years, so maybe the compat code is not needed.
Still, validating the object length before doing an unsafe pointer conversion
is probably reasonable, but I'm no Go expert.
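
A sketch of the length check mentioned above, under stated assumptions: `timeval64` is a hypothetical fixed-layout struct standing in for whatever object the sysctl returns, and `decodeTimeval` is not the collector's function. The raw bytes are copied into the struct rather than reinterpreted in place, which sidesteps alignment concerns:

```go
package main

import (
	"fmt"
	"unsafe"
)

// timeval64 is a hypothetical fixed-layout struct standing in for the
// kernel object returned as raw bytes.
type timeval64 struct {
	Sec  int64
	Usec int64
}

const timevalSize = int(unsafe.Sizeof(timeval64{}))

// decodeTimeval validates the buffer length first and only then uses an
// unsafe pointer conversion to fill the struct from the raw bytes.
func decodeTimeval(raw []byte) (timeval64, error) {
	var tv timeval64
	if len(raw) != timevalSize {
		return tv, fmt.Errorf("unexpected object length %d, want %d", len(raw), timevalSize)
	}
	copy((*[timevalSize]byte)(unsafe.Pointer(&tv))[:], raw)
	return tv, nil
}

func main() {
	raw := make([]byte, timevalSize)
	tv, err := decodeTimeval(raw)
	fmt.Println(tv, err)

	// A short buffer is rejected instead of being read past its end.
	_, err = decodeTimeval(raw[:4])
	fmt.Println(err)
}
```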
Signed-off-by: Claudio Jeker <claudio@openbsd.org>
The TCP timeout count is a useful signal of
abnormal network performance and another
aid to debugging. This metric can be
used to generate proactive alerts for workloads
in the host network namespace.
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
The new `lnstat` collector produces a high number of metrics, per-cpu,
and results in approximately double the number of metrics previously
scraped. For example, a typical server with 64 cores produces 3832
lnstat metrics compared to 4147 metrics for the remaining collectors.
Therefore disable the `lnstat` collector by default.
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Sanitizing the metric names can lead to duplicate metric names:
```
caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer #2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values"
```
Generate a map from the sanitized metric names to the metric names from
ethtool. In case of duplicate sanitized metric names drop both metrics,
because it is unknown which one to take.
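
The mapping described above could look roughly like this. The `sanitize` rule (lowercase, then replace anything outside `[a-z0-9_]` with `_`) and the `buildNameMap` helper are assumptions for this sketch, not the collector's exact implementation:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var invalidChars = regexp.MustCompile(`[^a-z0-9_]`)

// sanitize is an assumed sanitizing rule used only for this sketch.
func sanitize(name string) string {
	return invalidChars.ReplaceAllString(strings.ToLower(name), "_")
}

// buildNameMap maps sanitized names to the original ethtool names. When
// two or more originals collide on one sanitized name, all of them are
// dropped, because it is unknown which one to take.
func buildNameMap(stats map[string]uint64) map[string]string {
	names := make(map[string]string, len(stats))
	dropped := make(map[string]bool)
	for raw := range stats {
		s := sanitize(raw)
		if dropped[s] {
			continue
		}
		if _, dup := names[s]; dup {
			delete(names, s)
			dropped[s] = true
			continue
		}
		names[s] = raw
	}
	return names
}

func main() {
	stats := map[string]uint64{"giant hdr": 0, "giant_hdr": 0, "rx_packets": 42}
	fmt.Println(buildNameMap(stats))
}
```

Tracking dropped names separately keeps the result independent of map iteration order, even when three or more names collide.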
Fixes: https://github.com/prometheus/node_exporter/issues/2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Use SysctlTimeval from the golang.org/x/sys/unix package to
simplify the implementation of the boottime collector for the BSDs and
allow building it without cgo.
Tested on macOS 11.6, FreeBSD 13 and OpenBSD 7.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Add a DMI collector to expose the Desktop Management Interface (DMI)
info from `/sys/class/dmi/id/`. This will expose information about the
BIOS, mainboard, chassis, and product.
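
A minimal sketch of reading a few DMI attributes from sysfs. The attribute subset and the `readDMI` helper are chosen for illustration and are not the collector's implementation. Unreadable files are skipped, since some attributes may require elevated privileges:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// dmiAttrs is a small, illustrative subset of the files under the DMI
// sysfs directory.
var dmiAttrs = []string{
	"bios_vendor", "bios_version", "board_name",
	"chassis_vendor", "product_name", "sys_vendor",
}

// readDMI reads each attribute file under dir and returns the trimmed
// contents. Missing or unreadable files are silently skipped.
func readDMI(dir string) map[string]string {
	info := make(map[string]string)
	for _, attr := range dmiAttrs {
		data, err := os.ReadFile(filepath.Join(dir, attr))
		if err != nil {
			continue
		}
		info[attr] = strings.TrimSpace(string(data))
	}
	return info
}

func main() {
	for attr, value := range readDMI("/sys/class/dmi/id") {
		fmt.Printf("%s=%q\n", attr, value)
	}
}
```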
Closes: https://github.com/prometheus/node_exporter/issues/303
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Use `time.NewTimer()` and explicit `Stop()` to avoid memory bloat / GC problems with `time.After()` in the Linux filesystem collector timeout handling.
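
A minimal sketch of the pattern, assuming a hypothetical `waitWithTimeout` helper rather than the collector's actual code. In Go versions before 1.23, a timer created by `time.After()` is not reclaimed until it fires, so stopping an explicit timer releases it as soon as the wait finishes:

```go
package main

import (
	"fmt"
	"time"
)

// waitWithTimeout reports whether done was signalled before the timeout.
// The explicit timer is stopped on return, so it does not linger until
// expiry when the work finishes early.
func waitWithTimeout(done <-chan struct{}, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-done:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	done := make(chan struct{})
	go func() {
		// Stand-in for a filesystem stat call that returns quickly.
		close(done)
	}()
	fmt.Println("finished in time:", waitWithTimeout(done, 5*time.Second))
}
```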
Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
The ethtool_cmd struct from the Linux kernel contains information about the speeds and features supported by a
network device. This includes speeds and duplex but also features like autonegotiation and 802.3x pause frames.
Closes #1444
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
* collector: Unwrap glob textfile directories
* collector: Store full path in mtime's file label
The point is to avoid duplicated gauges from files with the same name in
different directories.
This introduces support for exporting from multiple directories matching
a given pattern (e.g. `/home/*/metrics/`).
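
A rough sketch of both changes, with hypothetical helper names. `expandDirs` falls back to the literal path when the pattern is invalid or matches nothing (an assumed behaviour), and `mtimeFileLabel` uses the full path so that files with the same name in different directories get distinct label values:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// expandDirs expands a glob pattern into directories. If the pattern is
// invalid or matches nothing, the pattern itself is returned as a single
// directory (an assumption made for this sketch).
func expandDirs(pattern string) []string {
	paths, err := filepath.Glob(pattern)
	if err != nil || len(paths) == 0 {
		return []string{pattern}
	}
	return paths
}

// mtimeFileLabel builds the file label from the full path rather than
// the base name alone.
func mtimeFileLabel(dir, name string) string {
	return filepath.Join(dir, name)
}

func main() {
	for _, dir := range expandDirs("/home/*/metrics/") {
		fmt.Println(mtimeFileLabel(dir, "example.prom"))
	}
}
```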
Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>
Expose GPU metrics using `sysfs/drm`.
`amdgpu` is the only driver which exposes this information through DRM.
Signed-off-by: Siavash Safi <siavash.safi@gmail.com>
Use the same flag pattern as netdev to make filtering methods the same.
* Move SanitizeMetricName to helper.go
Signed-off-by: Ben Kochie <superq@gmail.com>
* Refactor diskstats_linux to use procfs.
* Add `node_disk_info` metric.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
Co-authored-by: W. Andrew Denton <git@flying-snail.net>