Add new metrics for the InfiniBand network protocol, including the number of packets sent and received, the number of times the link has gone down, and the number of times the link has recovered from an error state.
Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
Removed all global types that were unnecessary, and refactored to use constructor-created values and inline values instead of globals.
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
This also involves removing zfs_zpool code for now.
Signed-Off-By: Corey Stewart <stewa169@purdue.edu>
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
This patch makes stylistic changes to error strings, unexports method names by lowercasing them, removes the unused dataSetMetric, and adds copyright/licence information.
Signed-Off-By: Corey Stewart <stewa169@purdue.edu>
It is tested on FreeBSD 10.2-RELEASE and Linux (ZFS on Linux 0.6.5.4).
On FreeBSD, Solaris, etc., ZFS metrics are exposed through sysctls.
ZFS on Linux exposes the same metrics through procfs `/proc/spl/...`.
In addition to sysctl metrics, the collector exposes 'computed metrics',
which are derived from several sysctl values.
There is some conditional logic involved in computing these metrics
which cannot be easily mapped to PromQL.
Not all 92 ARC sysctls are exposed yet, but each remaining one can be
added with one additional line of code.
The devstat API expects us to reuse one devinfo for many invocations of
devstat_getstats. In particular, it allocates and resizes memory
referenced by devinfo.
Querying the number of devices separately from the device list itself is
racy. Devices may be added or removed between the two calls, and a
removed device would lead to a segfault.
The memory allocated by calloc was never freed. Since the devinfo struct
never leaves the function anyway, we might as well allocate it on
the stack.
It seems Solaris prefers "sys/loadavg.h" over "stdlib.h" when
fetching the load average.
For illumos-based OSes, "sys/time.h" had to be included to
ensure that "hrtime_t" was defined.
https://www.illumos.org/issues/6002
It also required setting the ldflags "-fno-stack-protector -lssp" to
avoid undefined symbols when linking with gcc.
    /opt/local/go/pkg/tool/solaris_amd64/link: running gcc failed: exit status 1
    Undefined                       first referenced
     symbol                             in file
    __stack_chk_fail                    /tmp/go-link-138622936/000002.o
    __stack_chk_guard                   /tmp/go-link-138622936/000002.o
Instead of doing the whole metric exposition in a platform-specific collector
implementation, this creates and updates the metrics in meminfo.go and
expects a platform-specific implementation of getMemInfo on
*meminfoCollector.
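
The split described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual code: the return type of getMemInfo, the update method, the node_memory_ prefix, and the sample values are all chosen for the example. In a real tree, getMemInfo would live in one build-tagged file per operating system.

```go
package main

import (
	"fmt"
	"sort"
)

// meminfoCollector is the shared, platform-independent collector type.
type meminfoCollector struct{}

// getMemInfo stands in for the platform-specific implementation. The map
// return type and the values below are assumptions made for illustration.
func (c *meminfoCollector) getMemInfo() (map[string]float64, error) {
	return map[string]float64{
		"MemTotal": 4096,
		"MemFree":  1024,
	}, nil
}

// update is the shared code path: it asks the platform-specific function for
// raw values and turns each one into a metric line. Lines are sorted so the
// output is deterministic despite Go's random map iteration order.
func (c *meminfoCollector) update() ([]string, error) {
	memInfo, err := c.getMemInfo()
	if err != nil {
		return nil, fmt.Errorf("couldn't get meminfo: %w", err)
	}
	lines := make([]string, 0, len(memInfo))
	for name, value := range memInfo {
		lines = append(lines, fmt.Sprintf("node_memory_%s %g", name, value))
	}
	sort.Strings(lines)
	return lines, nil
}

func main() {
	c := &meminfoCollector{}
	lines, err := c.update()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

With this shape, supporting a new platform only requires a new getMemInfo; the metric creation code is written once.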
This removes some error handling, which should be fine. If the calls
fail, we will get zeroes, which is a safe enough fallback.
Additionally, if the first sysctl (page_size) succeeds, it is unlikely
that the other ones will fail.
node_exporter currently triggers autofs to mount the underlying
filesystem on every scrape. This is undesirable, so it is better to
ignore autofs. The underlying filesystem that autofs mounts will still
be monitored once the (real) filesystem is mounted.
They get printed all the time, as there are some tokens in the /proc
file that we simply don't support. It's better to keep these as
debugging messages, which may come in useful if new tags start to
appear.
- Use the right number of printf() arguments. Use %q where it makes sense.
- Use "DRBD" instead of "Drbd", per Go's style guide.
- Add _total suffixes to counter metrics.
- Mention the unit (bytes) in documentation strings once more.
This collector exposes most of the useful information that can be found
in /proc/drbd. Sizes are normalised to be in bytes, as /proc/drbd uses
kibibytes.
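
The unit normalisation amounts to a multiplication by 1024. A minimal sketch, assuming the value arrives as a decimal string; the helper name kibToBytes is hypothetical:

```go
package main

import (
	"fmt"
	"strconv"
)

// kibToBytes parses a decimal count of kibibytes and returns the equivalent
// number of bytes (1 KiB = 1024 bytes).
func kibToBytes(s string) (float64, error) {
	kib, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid KiB value %q: %w", s, err)
	}
	return kib * 1024, nil
}

func main() {
	bytes, err := kibToBytes("4")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(bytes)
}
```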
This change adds a new collector called "nfs" that parses the contents
of /proc/net/rpc/nfs and turns it into metrics. It can be used to
inspect the number of operations per type, but also to keep an eye on an
excessive number of retransmissions, which may indicate connectivity
issues.
I've picked the name "nfs", as most operating systems use "nfs" for the
client component and "nfsd" for the server component. If we want to add
stats for the NFS server as well, we'd better call such a collector
"nfsd".
The chip label generation has been changed in #334 to prefer the
unique device path (e.g. the location on the PCI bus) due to #333.
Here, a new annotation metric ``node_hwmon_chip_names`` is
introduced, which links the unique chip sysfs path to a
human-readable chip name that may not be unique among chip sysfs
paths (for example, dual-slot systems have multiple
chipType="coretemp" sensors).
This mitigates the downsides of the solution to #333
(namely that the device path may not be stable across kernels and
reboots) in cases where it does not matter that multiple devices
may have the same human-readable name (e.g. aggregation, or where
at most one device with a given chip name is present).
For cases where no human-readable name can be derived, the
annotation metric is not emitted.
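
The behaviour described above can be sketched as follows: one constant-valued series per chip that has a human-readable name, and nothing for chips without one. This is an illustration only; the label names chip and chip_name, the helper chipNameLines, and the sample chip identifiers are assumptions, and the output is plain text rather than anything produced through the Prometheus client library.

```go
package main

import (
	"fmt"
	"sort"
)

// chipNameLines takes a map from unique chip identifier to human-readable
// name and returns one exposition-style line per chip that has a name.
// Chips with an empty name are skipped, so no annotation is emitted for them.
func chipNameLines(names map[string]string) []string {
	var lines []string
	for chip, name := range names {
		if name == "" {
			continue
		}
		lines = append(lines, fmt.Sprintf(
			"node_hwmon_chip_names{chip=%q,chip_name=%q} 1", chip, name))
	}
	sort.Strings(lines) // deterministic order despite map iteration
	return lines
}

func main() {
	// Hypothetical chips: two share a name, one has no derivable name.
	chips := map[string]string{
		"platform_coretemp_0": "coretemp",
		"platform_coretemp_1": "coretemp",
		"unnamed_device":      "",
	}
	for _, line := range chipNameLines(chips) {
		fmt.Println(line)
	}
}
```

Because the annotation series carries both labels, a query can join it against the sensor metrics on the unique chip label to obtain the human-readable name.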