node_exporter

Commit Graph

Author	SHA1	Message	Date
Ben Kochie	1ab4a460c7	Update ppc64le end-to-end fixture. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-18 09:12:21 +02:00
Johannes 'fish' Ziemke	fd66a86a30	Remove gmond collector Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>	2018-04-17 20:20:24 +02:00
Ben Kochie	0f5be132ac	Merge pull request #904 from prometheus/superq/if_alias Fix parsing of interface aliases in netdev linux	2018-04-17 13:37:21 +02:00
Ben Kochie	a528966dcd	Fix parsing of interface aliases in netdev linux Very old kernels expose interface aliases as `foo0:0`, adjust the line parsing to handle these names. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-17 13:15:02 +02:00
Ben Kochie	f6008b242b	Merge pull request #901 from mischief/bsd_boottime collector: implement node_boot_time_seconds for OpenBSD/NetBSD/Darwin	2018-04-17 07:48:39 +02:00
Jürgen Hötzel	de0632c2e9	Fix memory corruption when number of filesystems > 16 (#900 ) Signed-off-by: Juergen Hoetzel <juergen@archlinux.org>	2018-04-16 12:39:15 +02:00
mischief	26a385d7ab	collector: implement node_boot_time_seconds for OpenBSD/NetBSD/Darwin Signed-off-by: mischief <mischief@offblast.org>	2018-04-15 08:26:46 +00:00
Ben Kochie	015b86670a	Update ppc64le e2e output. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-14 15:28:06 +02:00
Ben Kochie	0507b0c9a2	Fix formatting. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-14 15:02:20 +02:00
Dmitriy Lukyanchikov	eddd1b9357	Fix netdev collector for linux (#890 ) fix variable name, fix transmitHeader extracting modify fixtures to run tests with updated netdev_linux collector Signed-off-by: dmitriy-lukyanchikov <d.lukyanchikov@anchorfree.com>	2018-04-14 13:58:56 +02:00
Derek Marcotte	fe86e908da	Update ppc64 fixtures to unbreak end-to-end. `efc1fdb` added new labels. Signed-off-by: Derek Marcotte <554b8425@razorfever.net>	2018-04-13 06:33:38 -04:00
Karsten Weiss	7e392e6634	Fix spelling mistakes found by codespell Signed-off-by: Karsten Weiss <knweiss@gmail.com>	2018-04-09 18:27:17 +02:00
Karsten Weiss	efc1fdb6d0	cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total (#871 ) * cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total This commit fixes the node_cpu_core_throttles_total metrics on multi-socket systems as the core_ids are the same for each package. I.e. we need to count them separately. Rename the node_package_throttles_total metric label `node` to `package`. Reorganize the sys.ttar archive and use the same symlinks as the Linux kernel. Also, the new fixtures now use a dual-socket dual-core cpu w/o HT/SMT (node0: cpu0+1, node1: cpu2+3) as well as processor-less (memory-only) NUMA node 'node2' (this is a very rare case). Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Use the direct /sys path to the cpu files. Use the direct path /sys/devices/system/cpu/cpu[0-9]* (without symlinks) instead of /sys/bus/cpu/devices/cpu[0-9]. The latter path also does not exist e.g. on RHEL 6.9's kernel. Signed-off-by: Karsten Weiss <knweiss@gmail.com> cpu: Reverse core+package throttle processing order Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Add documentation URLs Signed-off-by: Karsten Weiss <knweiss@gmail.com>	2018-04-09 18:01:52 +02:00
Brian Brazil	31ce32f1fe	Greatly trim what netstat collector exposes by default (#876 ) Netstat is 40% of the metrics on my laptop, many of which are highly detailed information about IP internals in the kernel. ~300 such metrics on every machine in your fleet is excessive, so focus on key metrics by default, overridable by the user. Fixes #515 Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>	2018-03-30 19:28:08 +01:00
Ben Kochie	cf3edadcbb	Update fixtures * Add oom_kill to fixture. * Update e2e outputs. * Put regexp in order. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-03-29 22:00:02 +01:00
Brian Brazil	499c342fed	Greatly reduce the metrics vmstat returns by default. Vmstat has over 100 fields, most of which are highly detailed debug information. Trim this down to only essential fields by default, configurable by flag. Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>	2018-03-29 22:00:02 +01:00
Brian Brazil	c8c144587e	Enable bonding collector by default. (#872 ) Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>	2018-03-29 15:18:12 +01:00
Ben Kochie	779090db7e	Update ppc64le fixture (#867 ) Update to match standard e2e output. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-03-27 17:05:20 +02:00
Mario Trangoni	1f11a86d59	Fix nfs golint issues (#863 ) * procfs: update vendoring Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com> * procfs: fix e2e tests after nfs changes Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-03-22 22:25:37 +01:00
Ben Kochie	7b720df1c5	Use lowercase cpu label name in interrupts (#849 ) To match other CPU related metric labels, use a lowercase named label.	2018-03-08 15:04:49 +01:00
Johannes 'fish' Ziemke	424ca8e322	Drop exec_ in boot_timestamp_seconds on *bsd (#839 ) This closes #827.	2018-03-08 12:59:48 +01:00
colmbuckley	098f975b48	Correct the ClocksPerSec scaling factor on Darwin (#846 ) * Update cpu_darwin.go Change the definition of ClocksPerSec to read from limits.h * Update cpu_darwin.go	2018-03-07 11:56:57 +01:00
Julius Volz	864a6ee935	Treat custom textfile metric timestamps as errors (#769 ) This is clearer behavior and users will notice and fix their textfiles faster than if we just output a warning.	2018-02-27 19:43:38 +01:00
Rene Treffer	c504c7e264	Only report core throttles per core, not per cpu (#836 ) * Only report core throttles per core, not per cpu * Add topology/core_id to the cpu sysfs fixtures * Add new cpu fixtures to ttar file * Merge core_id reading and thermal throttle accounting * Declare core_id	2018-02-27 19:43:15 +01:00
Ben Kochie	e0d54a509c	Cleanup NFS metrics (#834 ) * Cleanup NFS metrics * Update `nfs` metric names to match `nfsd`. * Remove unneeded `tcp` label from TCP connections metric. * Remove unneeded `v` on `nfsd` metrics. * Enable all `nfs` v4 client metrics. * Remove `nfs` metric name overrides. * Add ppc64le fixture. * Fix typo.	2018-02-21 07:25:41 +01:00
Ben Kochie	3f41a2fecb	Update ppc64le fixture (#832 ) Updates fixture for ppc64le arch to latest output.	2018-02-19 20:43:33 +01:00
Ben Kochie	d33a447047	Remove deprecated prometheus.InstrumentHandlerFunc (#831 ) Update Prometheus client golang use to use `promhttp.Handler()` instead of `prometheus.InstrumentHandlerFunc()`.	2018-02-19 15:44:59 +01:00
Richard Elling	d7348a5c78	updates for zfsonlinux 0.7.5 (#779 ) * updates for zfsonlinux 0.7.5 * add constants for KSTAT_DATA_* types * added e2e test for negative values represented by uint64 that can result from ZFS bugs	2018-02-16 15:46:31 +01:00
Ben Kochie	6468e7c80b	Enable NFS client metrics by default. (#828 ) Enable NFS client metrics by default now that it no longer prints errors on scrape if there are no metrics to display. Also fixup the nfsd README to match the nfs entry.	2018-02-16 15:42:47 +01:00
Ralf Horstmann	8d9c7ca659	Use swpginuse instead of swpgonly in meminfo_openbsd (#813 ) All tools in OpenBSD base system use swpginuse instead of swpgonly for reporting swap usage (snmpd, swapctl, top, vmstat), so let memory collector use that as well for consistency.	2018-02-16 11:34:41 +01:00
Kasinath Kottukkal	f6965e1812	Add overlay to defIgnoredFSTypes (#824 ) * Add overlay to defIgnoredFSTypes To avoid statfs() errors if node_exporter is running as non privileged user. * Updated defIgnoredFSTypes values in sorted order	2018-02-16 09:47:50 +01:00
Ben Kochie	01bd99fb1a	Refactor NFS client collector (#816 ) * Update vendor github.com/prometheus/procfs/... * Refactor NFS collector Use new procfs library to parse NFS client stats. * Ignore nfs proc file not existing. * Refactor with reflection to walk the structs.	2018-02-15 13:40:38 +01:00
Brian Brazil	52c031890e	Add _seconds suffix to node_time. (#823 )	2018-02-14 16:59:08 +00:00
Ben Kochie	05eabe60fb	Fix error output in nfsd collector. (#821 )	2018-02-14 13:57:35 +01:00
Ben Kochie	3de2542d21	Fix NFSd metric type (#819 ) RPC Count should be a counter, not a gauge.	2018-02-13 17:03:22 +01:00
Matt Layher	544488ddd6	Fix remaining metric naming issues (#799 )	2018-02-12 18:53:31 +01:00
Ben Kochie	6a041692ed	Add NFS Server metrics collector. (#803 ) * Add NFS Server metrics collector. * Add File Handles metrics. * Add nfsd IO stats. * Add metrics for NFSd threads. * Add metrics for NFSd read ahead cache. * Add NFSd network traffic counters. * Add RPC metrics. * Add V2 requests metrics. * Add NFSv3 metrics. * Add NFSv4 metrics. * Update reply cache comment. * Update help text.	2018-02-12 17:56:05 +01:00
Brian Brazil	1072f2868d	Fix log level regression in #533	2018-02-07 15:16:20 +00:00
Brian Brazil	7e41a2b279	Ignore /var/lib/docker by default. (#814 ) The node exporter runs unprivileged, so it cannot statfs any filesystems under this directory causing log spam. In addition there tends to be high churn in the filesystems here (as it's basically application monitoring) which can cause high cardinality and in one case caused Prometheus's index symbol table to get very large. Accordingly this should be ignored to reduce log spam and avoid performance issues. The filesystems themselves can in principle be monitored via container oriented exporters, and the underlying filesystems will still be monitored.	2018-02-06 17:10:59 +01:00
Ralf Horstmann	29ac809e48	Use unified CPU metric description on OpenBSD (#810 )	2018-02-01 23:59:19 +01:00
Derek Marcotte	fde5d2c6c9	Remove unsafe typecasts from sysctl_bsd getStructTimeval. (#741 ) There is a simpler way.	2018-02-01 18:43:40 +01:00
Ben Kochie	14d60958d6	Unify CPU collector conventions (#806 ) * Unify CPU collector conventions Add a common CPU metric description. * All collectors use the same `nodeCpuSecondsDesc`. * All collectors drop the `cpu` prefix for `cpu` label values. * Fix subsystem string in cpu_freebsd. * Fix Linux CPU freq label names.	2018-02-01 18:42:20 +01:00
Ralf Horstmann	e3c76b1f0c	Add OpenBSD CPU collector (#805 )	2018-02-01 18:33:49 +01:00
Tom Wilkie	6833eec187	Fix tests.	2018-01-31 15:22:17 +00:00
Tom Wilkie	0316bacceb	Only use one dbus connection, required some refactoring.	2018-01-31 15:19:18 +00:00
Tom Wilkie	a7fd6b8743	Export systemd timer last trigger sec.	2018-01-31 15:07:04 +00:00
Ben Kochie	111e3af437	Remove obsolete megacli collector. (#798 ) This collector has been replaced by the textfile collector tool `storcli.py`.	2018-01-23 11:25:42 +01:00
Julius Volz	6cac74f0e0	Add unit suffix to textfile collector mtime metric (#796 )	2018-01-22 14:02:19 +01:00
Brian Brazil	a98067a294	Make metrics better follow guidelines (#787 ) * Improve stat linux metric names. cpu is no longer used. * node_cpu -> node_cpu_seconds_total for Linux * Improve filesystem metric names with units * Improve units and names of linux disk stats Remove sector metrics, the bytes metrics cover those already. * Infiniband counters should end in _total * Improve timex metric names, convert to more normal units. See `3c073991eb/kernel/time/ntp.c (L909)` for what stabil means, looks like a moving average of some form. * Update test fixture * For meminfo metrics that had "kB" units, add _bytes * Interrupts counter should have _total	2018-01-17 17:55:55 +01:00
Ben Kochie	b4d7ba119a	Add fixture for ppc64le (#785 ) * Add support for per-architecture fixtures. * Add output for ppc64le.	2018-01-11 13:56:19 +01:00
Nick Owens	0629a081db	multiply page size after float64 coercion to avoid signed integer overflow (#780 )	2018-01-08 15:36:49 +01:00
Franz Pletz	d432f9857e	Use uint64 in the ZFS collector (#714 ) ZFS metrics can also be unsigned 64-bit integers that won't fit in int64 and causes the whole collector to fail.	2018-01-06 12:36:55 +01:00
Derek Marcotte	477fe4665a	Move FreeBSD/DragonflyBSD out of meminfo add kvm. (#547 ) * Move FreeBSD/DragonflyBSD out of meminfo add kvm. This gives us SwapUsed, and everything under one roof. * Fix typos per review. * Update to use newer API. * Remove premature optimization per PR feedback.	2018-01-04 12:23:26 +01:00
Sevag Hanssian	4329b0a86b	Add summary metrics for systemd exporter (#765 )	2018-01-04 11:49:36 +01:00
Matthieu Guegan	d6ef10bb56	Add openbsd meminfo (#724 ) * Implements meminfo collector for OpenBSD This is a rework of #151. * Fix CGO import * Add some useful metrics * Rename total -> size for normalization	2018-01-04 10:32:08 +01:00
Ben Kochie	7f6c59e198	Ignore more virtual filesystems (#775 ) Add additional Linux virtual filesystem types to the default list.	2018-01-03 17:22:02 +01:00
Netmonk	2aa8d0eb0c	[FIX] Exclude Linux proc from filesystem type regexp (#774 ) * [FIX] Issue 63, error on excluding proc filesystem on linux, improving regexp * [FIX] Reordering filter order	2018-01-03 11:40:32 +01:00
Julius Volz	f536857ac6	Fix e2e tests after textfile custom timestamp removal (#768 )	2017-12-24 11:54:33 +01:00
Shubheksha Jalan	1f2458f42c	Filter out textfile metrics correctly when using `collect[]` filters (#763 ) * remove injection hook for textfile metrics, convert them to prometheus format * add support for summaries * add support for histograms * add logic for handling inconsistent labels within a metric family for counter, gauge, untyped * change logic for parsing the metrics textfile * fix logic to adding missing labels * Export time and error metrics for textfiles * Add tests for new textfile collector, fix found bugs * refactor Update() to split into smaller functions * remove parseTextFiles(), fix import issue * add mtime metric directly to channel, fix handling of mtime during testing * rename variables related to labels * refactor: add default case, remove if guard for metrics, remove extra loop and slice * refactor: remove extra loop iterating over metric families * test: add test case for different metric type, fix found bug * test: add test for metrics with inconsistent labels * test: add test for histogram * test: add test for histogram with extra dimension * test: add test for summary * test: add test for summary with extra dimension * remove unnecessary creation of protobuf * nit: remove extra blank line	2017-12-23 20:21:58 +01:00
Ben Kochie	cd2a17176a	Add full make to CircleCI (#761 ) * Add full make to CircleCI Ensure end-to-end test is run. * Fix go fmt error. * Fix end-to-end output.	2017-12-21 16:24:23 +01:00
Wei Li	1e9bb4ec3a	textfile: fix duplicate metrics error (#738 ) The textfile gatherer should only be added to gatherer list once. Signed-off-by: Li Wei <liwei@anbutu.com>	2017-12-06 17:05:40 +01:00
Kristian Klausen	a96f1738b3	netdev: Change valueType to CounterValue (#749 ) All the metric only goes up, so the type should be counter. This also add _total to all the metric name. Fix: #747	2017-12-06 13:58:35 +01:00
Ben Kochie	2a80537547	Split out guest cpu metrics on Linux. (#744 ) Linux "guest" metrics for VMs are already accounted for in node_cpu `user` and `nice` metrics. Separate these into their own metric to avoid duplication of data.	2017-11-23 15:04:47 +01:00
Karsten Weiss	a8d7d1101a	cpu: Support processor-less (memory-only) NUMA nodes (#734 ) * cpu: Support processor-less (memory-only) NUMA nodes Processor-less (memory-only) NUMA nodes exist e.g. in systems that use Intel Optane drives for RAM expansion using Intel Memory Drive Technology (IMDT). IMDT RAM expansion supports two modes: * "Unify Remote Memory domains": present a processor-less (memory-only) NUMA domain, which is the default * "Expand local memory domains": to expand each processor’s memory domain with a portion of the memory made available by Optane and IMDT This commit fixes a crash in the first case (when "cpulist" is empty). Here's an example of such a system: $ numastat -m|head -n5 Per-node system memory usage (in MBs): Node 0 Node 1 Node 2 Total --------------- --------------- --------------- --------------- MemTotal 118239.56 130816.00 464384.00 713439.56 $ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done 0: 0-7,16-23 1: 8-15,24-31 2: $ /opt/vsmp/bin/vsmpversion -vvv Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59) System configuration: Boards: 3 1 x Proc. + I/O + Memory 2 x NVM devices (Intel SSDPED1K375GAQ) Processors: 2, Cores: 16, Threads: 32 Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01 Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562 1 x 249088MB [262036/ 678/12270] 1 x 232192MB [357707/125369/ 146] 82:00.0#1 1 x 232192MB [357707/125369/ 146] 83:00.0#1 * cpu: rename some variables (pkg => node) * cpu: Use %v not %q in log.Debugf() format strings	2017-11-10 15:31:26 +01:00
Matt Layher	f6f9c8d6cc	Add and use sysReadFile in hwmon collector (#728 )	2017-11-07 07:49:37 +01:00
Tobias Klauser	d73f1e60c4	Simplify Utsname string conversion (#716 ) * Update golang.org/x/sys/unix This allows to use simplified string conversion of Utsname members. * Simplify Utsname string conversion Use Utsname from golang.org/x/sys/unix which contains byte array instead of int8/uint8 array members. This allows to simplify the string conversions of these members.	2017-11-02 11:57:14 +01:00
Ben Kochie	ea250d73f4	Fix off by one in Linux interrupts collector (#721 ) * Fix off by one in Linux interrupts collector * Fix off by one in CPU column handler. * Add test. * Enable interrupts in end-to-end test.	2017-11-02 09:59:46 +01:00
Matt Layher	296b62acb7	netstat: return nothing when /proc/net/snmp6 not found	2017-10-31 15:26:32 -04:00
Derek Marcotte	0eecaa9547	Correct buffer_bytes > INT_MAX on BSD/amd64. (#712 ) * Correct buffer_bytes > INT_MAX on BSD/amd64. The sysctl vfs.bufspace returns either an int or a long, depending on the value. Large values of vfs.bufspace will result in error messages like: couldn't get meminfo: cannot allocate memory This will detect the returned data type, and cast appropriately. * Added explicit length checks per feedback. * Flatten Value() to make it easier to read. * Simplify per feedback. * Fix style. * Doc updates.	2017-10-25 20:55:22 +02:00
Matt Layher	f9ad88fc03	xfs: expose correct fields, fix metric names	2017-10-20 18:41:51 -04:00
Siavash Safi	f3a7022602	Add `collect[]` parameter (#699 ) * Add `collect[]` parameter * Add TODO comment about staticcheck ignored * Restore promhttp.HandlerOpts * Log a warning and return HTTP error instead of failing * Check collector existence and status, cleanups * Fix warnings and error messages * Don't panic, return error if collector registration failed * Update README	2017-10-14 14:23:42 +02:00
Ben Kochie	deadfef4c9	Update vendoring (#685 ) * Update vendor github.com/coreos/go-systemd/dbus@v15 * Update vendor github.com/ema/qdisc * Update vendor github.com/godbus/dbus * Update vendor github.com/golang/protobuf/proto * Update vendor github.com/lufia/iostat * Update vendor github.com/matttproud/golang_protobuf_extensions/pbutil@v1.0.0 * Update vendor github.com/prometheus/client_golang/... * Update vendor github.com/prometheus/common/... * Update vendor github.com/prometheus/procfs/... * Update vendor github.com/sirupsen/logrus@v1.0.3 Adds vendor golang.org/x/crypto * Update vendor golang.org/x/net/... * Update vendor golang.org/x/sys/... * Update end to end output.	2017-10-05 16:20:47 +02:00
Brett Vickers	b62c7bc0ad	Updated vendored ntp package (#681 ) The github.com/beevik/ntp package was recently updated with some API changes that broke node_exporter. This commit fetches the latest version of the ntp package and brings node_exporter in line with the latest API.	2017-10-04 08:33:49 +02:00
Calle Pettersson	859a825bb8	Replace --collectors.enabled with per-collector flags (#640 ) * Move NodeCollector into package collector * Refactor collector enabling * Update README with new collector enabled flags * Fix out-of-date inline flag reference syntax * Use new flags in end-to-end tests * Add flag to disable all default collectors * Track if a flag has been set explicitly * Add --collectors.disable-defaults to README * Revert disable-defaults flag * Shorten flags * Fixup timex collector registration * Fix end-to-end tests * Change procfs and sysfs path flags * Fix review comments	2017-09-28 15:06:26 +02:00
Sami Kerola	3762191e66	Add timex collector (#664 ) This collector is based on adjtimex(2) system call. The collector returns three values, status if time is synchronised, offset to remote reference, and local clock frequency adjustment. Values are taken from kernel time keeping data structures to avoid getting involved how the synchronisation is implemented. By that I mean one should not care if time is update using ntpd, systemd.timesyncd, ptpd, and so on. Since all time sync implementation will always end up telling to kernel what is the status with time one can simply omit the software in between, and look results of the syncing. As a positive side effect this makes collector very quick and conceptually specific, this does not monitor availability of NTP server, or network in between, or dns resolution, and other unrelated but necessary things. Minimum set of values to keep eye on are the following three: The node_timex_sync_status tells if local clock is in sync with a remote clock. Value is set to zero when synchronisation to a reliable server is lost, or a time sync software is misconfigured. The node_timex_offset_seconds tells how much local clock is off when compared to reference. In case of multiple time references this value is outcome of RFC 5905 adjustment algorithm. Ideally offset should be close to zero, and it depends about use case how large value is acceptable. For example a typical web server is probably fine if offset is about 0.1 or less, but that would not be good enough for mobile phone base station operator. The node_timex_freq tells amount of adjustment to local clock tick frequency. For example if offset is one second and growing the local clock will need instruction to tick quicker. Number value itself is not very important, and occasional small adjustments are fine. When frequency is unusually unstable one can assume quality of time stamps will not be accurate to very far in sub second range. 
Obviously explaining why local clock frequency behaves like a passenger in roller coaster is different matter. Explanations can vary from system load, to environmental issues such as a machine being physically too hot. Rest of the measurements can help when debugging. If you run a clock server you probably want to collect and keep track of everything. Pull-request: https://github.com/prometheus/node_exporter/pull/664	2017-09-19 07:54:06 -07:00
Leonid Evdokimov	c169b4b1c5	Add metrics from SNTPv4 packet to ntp collector & add ntpd sanity check (#655 ) * Add metrics from SNTPv4 packet to ntp collector & add ntpd sanity check 1. Checking local clock against remote NTP daemon is bad idea, local ntpd acting as a client should do it better and avoid excessive load on remote NTP server so the collector is refactored to query local NTP server. 2. Checking local clock against remote one does not check local ntpd itself. Local ntpd may be down or out of sync due to network issues, but clock will be OK. 3. Checking NTP server using sanity of its response is tricky and depends on ntpd implementation, that's why common `node_ntp_sanity` variable is exported. * `govendor add golang.org/x/net/ipv4`, it is dependency of github.com/beevik/ntp * Update github.com/beevik/ntp to include boring SNTP fix * Use variable name from RFC5905 * ntp: move code to make export of raw metrics more explicit * Move NTP math to `github.com/beevik/ntp` * Make `golint` happy * Add some brief docs explaining `ntp` #655 and `timex` #664 modules * ntp: drop XXX comment that got its decision * ntp: add `_seconds` suffix to relevant metrics * Better `node_ntp_leap` comment * s/node_ntp_reftime/node_ntp_reference_timestamp_seconds/ as requested by @discordianfish * Extract subsystem name to const as suggested by @SuperQ	2017-09-19 10:36:14 +02:00
Karsten Weiss	b0d5c00832	cpu: Metric 'package_throttles_total' is per package. (#657 ) * cpu: Metric 'package_throttles_total' is per package. 'package_throttles_total' is per package, not per cpu. This also reduces the total number of cpu time series a lot (esp for multi core cpus). * cpu: Better handling of a cpulist edge-case. * cpu: Extract the package number from the directory name. Do not rely on the range index. * cpu: Add package_throttle_count for node0 cpu1 This file must be ignored by the cpu collector.	2017-09-07 23:24:18 +02:00
Matthias Rampke	e1f129c729	Use int64 throughout the ZFS collector. This avoids issues with integer overflows on 32-bit architectures. The Prometheus data format is float64, so regardless of the architecture we should handle large numbers. Fixes #629.	2017-08-21 16:40:16 +00:00
Ben Kochie	8839640cd1	Ignore wifi collector permission errors (#646 ) Ignore the permission denied error when the wifi collector has no permission to read metrics.	2017-08-18 10:19:48 +02:00
Calle Pettersson	dfe07eaae8	Switch to kingpin flags (#639 ) * Switch to kingpin flags * Fix logrus vendoring * Fix flags in main tests * Fix vendoring versions	2017-08-12 15:07:24 +02:00
Ben Kochie	46c31d8a7e	Enable IPVS collector by default (#623 ) * Silence error output when no IPVS present. * Enable by default. * Update end-to-end fixture. * Update README.	2017-07-26 15:20:28 +02:00
Tobias Schmidt	515b5a933d	Fix build tags of loadavg collector The collector is only implemented for a subset of all operating systems supported by go. Compilation will fail if attempted for another OS target.	2017-07-20 15:13:58 -04:00
Tobias Schmidt	016d79535d	Fix build tags of meminfo collector The meminfo collector only supports darwin, dragonfly, freebsd and linux and must not be included in other architectures.	2017-07-20 14:37:10 -04:00
Andrea De Pasquale	1369763067	Change raid0 status line regexp for mdadm collector (#619 )	2017-07-20 17:04:33 +02:00
Tobias Schmidt	921319c7eb	Merge pull request #583 from knweiss/golint Golint fixes	2017-07-10 23:49:36 +02:00
Aleksey Zhukov	7a914e58f2	Add parsing /proc/net/snmp6 file for netstat-linux (#615 ) * Add parsing /proc/net/snmp6 file * add /proc/net/snmp6 fixture * fix e2e test * gofmt * remove unused variable * safe checks * add tests * change help format	2017-07-08 20:16:35 +02:00
Matt Layher	6e82fd1c56	Add XFS block mapping and block map B-tree stats (#575 )	2017-07-07 07:27:52 +02:00
ideaship	8d90276283	Add bcache collector (#597 ) * Add bcache collector for Linux This collector gathers metrics related to the Linux block cache (bcache) from sysfs. * Removed commented out code * Use project comment style * Add _sectors to metric name to indicate unit * Really use project comment style * Rename bcache.go to bcache_linux.go * Keep collector namespace clean Rename: - metric -> bcacheMetric - periodStatsToMetrics -> bcachePeriodStatsToMetric * Shorten slice initialization * Change label names to backing_device, cache_device * Remove five minute metrics (keep only total) * Include units in additional metric names * Enable bcache collector by default * Provide metrics in seconds, not nanoseconds * remove metrics with label "all" * Add fixtures, update end-to-end for bcache collector * Move fixtures/sys into tar.gz This changeset moves the collector/fixtures/sys directory into collector/fixtures/sys.tar.gz and tweaks the Makefile to unpack the tarball before tests are run. The reason for this change is that Windows does not allow colons in a path (colons are present in some of the bcache fixture files), nor can it (out of the box) deal with pathnames longer than 260 characters (which we would be increasingly likely to hit if we tried to replace colons with longer codes that are guaranteed not to turn up in regular file names). * Add ttar: plain text archive, replacement for tar This changeset adds ttar, a plain text replacement for tar, and uses it for the sysfs fixture archive. The syntax is loosely based on tar(1). Using a plain text archive makes it possible to review changes without downloading and extracting the archive. Also, when working on the repo, git diff and git log become useful again, allowing a committer to verify and track changes over time. The code is written in bash, because bash is available out of the box on all major flavors of Linux and on macOS. 
The feature set used is restricted to bash version 3.2 because that is what Apple is still shipping. The program also works on Windows if bash is installed. Obviously, it does not solve the Windows limitations (path length limited to 260 characters, no symbolic links) that prompted the move to an archive format in the first place.	2017-07-07 07:20:18 +02:00
kadota kyohei	a077024f51	add diskstats on Darwin (#593 ) * Add diskstats collector for Darwin * Update year in the header * Update README.md * Add github.com/lufia/iostat to vendored packages * Change stats to follow naming guidelines * Add a entry of github.com/lufia/iostat into vendor.json * Remove /proc/diskstats from description	2017-07-06 13:51:24 +02:00
Rene Treffer	56bf8d4b2d	Add link to kernel documentation for sysfs/cpufreq files	2017-06-27 11:25:06 +02:00
Rene Treffer	bcc3cd92b8	Fix cpufreq statistics by converting kHz to Hz	2017-06-27 11:05:55 +02:00
Ben Kochie	182810056f	Fix Linux cpu errors (#606 ) Make the Linux cpu collector soft-error on missing `cpufreq` and `thermal_throttle` features.	2017-06-20 07:51:26 +02:00
Rene Treffer	2e9f1913b8	Move stat_linux to cpu_linux and add cpufreq stats (#548 )	2017-06-13 11:21:53 +02:00
Emanuele Rocca	047003b6bb	Add qdisc collector for Linux (#580 ) * Add qdisc collector for Linux This collector gathers basic queueing discipline metrics via netlink, similarly to what `tc -s qdisc show` does. * qdisc collector: nl-specific code moved, names fixed - netlink-specific parts moved to github.com/ema/qdisc - avoid using shortened names - counters renamed into XXX_total * Get rid of parseMessage error checking leftover * Add github.com/ema/qdisc to vendored packages * Update help texts and comments * Add qdisc collector to README file * qdisc collector end-to-end testing * Update qdisc dependency to latest version Update github.com/ema/qdisc dependency to revision 2c7e72d, which includes unit testing. * qdisc collector: rename "iface" label into "device"	2017-05-23 11:55:50 +02:00
Karsten Weiss	b2f4fd5776	Remove unused devstatCollector struct member 'bytes_total'. This also fixes this golint issue: devstat_freebsd.go:40:2: don't use underscores in Go names; struct field bytes_total should be bytesTotal	2017-05-14 19:51:53 +02:00
Jonas Große Sundrup	e6d031788f	Correct typo (#582 )	2017-05-14 19:46:23 +02:00
Karsten Weiss	bca09abf1c	golint: Fix NewStatCollector() doc string.	2017-05-14 13:51:47 +02:00
Karsten Weiss	b3e7420a27	cpu_darwin.go: s/cpu_ticks/cpuTicks/g	2017-05-14 13:51:42 +02:00
Karsten Weiss	b05c7d8dab	cpu_darwin.go: Fix doc strings.	2017-05-14 13:51:34 +02:00
Karsten Weiss	fff03c6c0c	Fix NewTCPStatCollector doc string.	2017-05-14 13:23:57 +02:00
Karsten Weiss	6720cfdbfe	golint: Fix comment on exported function NewDevstatCollector.	2017-05-14 13:21:39 +02:00
Karsten Weiss	b73af72853	Explicitly check for the rc 3 in call to getloadavg(). Reorder logic.	2017-05-14 13:07:54 +02:00
Karsten Weiss	af358ec800	golint fixes: if block ends with a return statement, so drop this else and outdent its block.	2017-05-14 12:55:44 +02:00
Karsten Weiss	732f839810	sysctl_bsd.go: golint fixes. Typo fix.	2017-05-14 12:51:57 +02:00
Robert Clark	58f50b31f2	Multiply port data XMIT/RCV metrics by 4 (#579 ) According to Mellanox, it is standard practice that the port_xmit_data and port_rcv_data files are split into 4 lanes. To get the actual transmit and receive values for each port, the metric needs to be multiplied by 4. Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>	2017-05-12 07:28:53 +02:00
Ben Kochie	8f3cddf734	Merge pull request #568 from mdlayher/xfs-init Initial XFS collector	2017-04-25 09:54:28 +02:00
Kai S	59f9b8c5c1	Handle nonexisting bonding_masters file (#569 ) * silently ignore nonexisting bonding_masters file Add an empty fixtures dir without a bonding_masters file to test. * Moved the check to the Update() method Dropped the empty test dir.	2017-04-24 23:19:17 +04:00
Matt Layher	1feb091b36	Initial XFS collector	2017-04-22 11:53:07 -04:00
Ben Kochie	e9aad0157c	Merge pull request #550 from derekmarcotte/dm-boottime Add exec_boot_time for freebsd, dragonfly	2017-04-22 09:18:05 +02:00
Derek Marcotte	5b557bf973	Fix metric name per review.	2017-04-21 16:25:31 -04:00
Derek Marcotte	db8ec9c6b4	Add exec_boot_time for freebsd, dragonfly Adds new sysctl type, bsdSysctlTypeStructTimeval to enable parsing of timevals from raw memory.	2017-04-21 10:23:19 -04:00
Daniele Sluijters	bb9d4ade0b	uname_linux: Build for 32bit MIPS too Since Go 1.8 32bit MIPS Big/Little Endian are supported assuming the target runs Linux and the kernel either emulates an FPU or can access the CPU one. This allows the node_collector to build for mips and mipsle opening up the possibility of running it on things like home routers (DD-|Open|ASUS-)Wrt firmware usually has the necessary bits in place.	2017-04-20 13:30:40 +02:00
Brian Brazil	f291d2d6dd	Get full resolution for node_time (#555 )	2017-04-19 18:31:21 +01:00
Karsten Weiss	d9703ff7c6	edac: Fix typo in csrow label of node_edac_csrow_uncorrectable_errors_total metric.	2017-04-18 12:45:06 +02:00
Tobias Schmidt	266f0958d2	Merge pull request #561 from derekmarcotte/dm-fix-dfly-build Fixes broken build on Dragonfly.	2017-04-17 17:31:12 +02:00
Derek Marcotte	83cecfa696	Fixes broken build on Dragonfly. Undefined err: `84eaa8fecd/collector/devstat_dragonfly.go (L145)`	2017-04-17 10:50:49 -04:00
Karsten Weiss	45ca8db352	Support the 'guest_nice' cpu mode of /proc/stat. 'guest_nice' is available since Linux 2.6.33.	2017-04-14 12:50:37 +02:00
Sam Kottler	6eafa51fa8	Add ARP collector for Linux (#540 ) * Implement commonalities and linux support for ARP collection * Add ARP collector to fixtures and run as part of e2e tests * Bubble up scanner errors * Use single return values where it makes sense * Add missing annotation * Move arp_common into arp_linux * Add license header to arp_linux.go * Address initial feedback * Use strings.Fields instead of strings.Split * Deal with scanner.Err() rather than throwing away errors * Check for scan errors in-line before interacting with the entries map * Don't interact with potentially empty text from scan * Check for scan errors outside the scan loop * Add comment about moving procfs parsing * Add more direct comment * Update initialism style to match go style guide * Put function args on the same line * Add TODO in front of comment about procfs extraction * Guard against strings.Fields returning an empty slice * Be more defensive about ARP table format and use upcase more broadly * Enable the ARP collector by default * Add ARP collector to the README * Remove 'entry'	2017-04-11 17:45:19 +02:00
Tobias Schmidt	8aec44617a	Remove Windows support Use https://github.com/martinlindhe/wmi_exporter instead.	2017-04-10 23:27:23 -03:00
Tobias Schmidt	41a44a4d24	Merge pull request #532 from prometheus/grobie/remove-extra-file-check mdadm: Remove extra file existence check	2017-03-31 05:35:12 +02:00
Ben Kochie	5f43211f67	Blacklist systemd scope units Blacklist `scope` units from systemd collector by default. These units are created with unique IDs programmatically[0]. This leads to huge cardinality problems. [0]: https://www.freedesktop.org/software/systemd/man/systemd.scope.html	2017-03-23 14:02:46 +01:00
Tobias Schmidt	d290ea94b8	Fix export of stale device error metrics for unmounted filesystems Instead of maintaining a counter metric for device errors in memory, this change exports a gauge and uses const metrics to avoid leaking metrics for unmounted filesystems.	2017-03-22 21:48:18 -03:00
Tobias Schmidt	7b93b52010	Fix lint issues on filesystem BSD implementation	2017-03-22 21:48:12 -03:00
Tobias Schmidt	445ed44082	mdadm: Remove extra file existence check	2017-03-22 10:11:19 -03:00
Johannes 'fish' Ziemke	9676f5f2dc	Merge pull request #523 from roclark/support-legacy-infiniband Add support for legacy InfiniBand drivers	2017-03-21 10:52:07 +01:00
Johannes 'fish' Ziemke	620e9937e6	Merge pull request #524 from mdlayher/wifi-expand Expand wifi collector for more interface types	2017-03-21 10:32:44 +01:00
Juergen Hoetzel	aef2601cf6	Add missing dependency for static FreeBSD build	2017-03-20 16:59:45 +00:00
Matt Layher	2bfe410fb7	Expand wifi collector for more interface types	2017-03-20 12:25:01 -04:00
Robert Clark	3a5917dfdc	Add support for legacy InfiniBand drivers Older versions of the OFED drivers contain 64-bit variants of the port counters and are located in a directory named 'counters_ext'. This patch includes these older metrics that have since been deprecated with OFED 4.0. Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>	2017-03-20 10:37:21 -05:00
Tobias Schmidt	0400e437be	Fix and simplify parsing of raid metrics Fixes the wrong reporting of active+total disk metrics for inactive raids. Also simplifies the code and removes a couple of redundant comments.	2017-03-19 08:03:58 -03:00
Matt Layher	42c8a20545	Unexport wifiCollector metrics	2017-03-16 17:11:09 -04:00
Matt Layher	69368b7f9c	Add synthetic node_wifi_station_info metric for BSS information	2017-03-16 16:24:23 -04:00
Brian Brazil	a02e469b07	Report collector success/failure and duration per scrape. (#516 ) This is in line with best practices, and also saves us 63 timeseries on a default Linux setup.	2017-03-16 17:21:00 +00:00
Robert Clark	413e5af502	Skip metric files that don't exist In case a metric file within the InfiniBand collector doesn't exist, skip the metric in order to allow collection of the remaining valid InfiniBand metrics. Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>	2017-03-09 11:05:36 -06:00
Derek Marcotte	72d8576185	Refactor meminfo_bsd.go to use sysctl_bsd.go (#501 ) * Refactor meminfo_bsd.go to use sysctl_bsd.go * Fixed spelling.	2017-03-07 21:54:28 -04:00
Ben Kochie	5d22d41ed7	Merge pull request #484 from prometheus/grobie/update-vendored-packages Update vendored packages	2017-03-01 08:05:45 +01:00
Derek Marcotte	bdc2131332	Added node_memory_buffer, node_memory_swaptotal to meminfo_bsd (#451 )	2017-03-01 01:36:02 -04:00
Tobias Schmidt	ce117d7a40	Update vendored packages	2017-02-28 18:20:24 -04:00
Tobias Schmidt	84eaa8fecd	Remove more unnecessarily named return values	2017-02-28 17:33:46 -04:00
Derek Marcotte	5c28ab044d	Add BSD exec statistics collector (#457 ) * First pass of a sysctl_bsd source, exec_bsd + exec metrics * Incorporate PR feedback, including removing pre-build descriptions, unit conversion callback. * Remove redundant cached_description field, per PR feedback * Incorporate PR feedback	2017-02-28 17:23:10 -04:00
Tobias Schmidt	1bd94074dd	Delete unused code	2017-02-28 17:20:16 -04:00
Tobias Schmidt	922e74d58f	Remove unnecessarily named return variables Named return variables should only be used to describe the returned type further, e.g. `err error` doesn't add any new information and is just stutter.	2017-02-28 16:04:25 -04:00
Tobias Schmidt	084e585c2a	Fix scanner usage without error handling	2017-02-28 16:04:25 -04:00
Tobias Schmidt	d1dfda86ee	Fix wrong end-to-end expectation	2017-02-28 16:02:43 -04:00
Tobias Schmidt	abdebef47c	Fix gofmt -s and spelling issues	2017-02-28 14:01:28 -04:00
Tobias Schmidt	195b4d596c	Merge pull request #480 from prometheus/grobie/gosimple Simplify go code	2017-02-28 13:59:01 -04:00
Tobias Schmidt	694294baf5	Remove unnecessary conversions	2017-02-28 13:57:49 -04:00
Tobias Schmidt	21e13c7f52	Simplify code	2017-02-28 13:54:27 -04:00
Tobias Schmidt	c703435790	Fix all open go lint and vet issues	2017-02-28 13:05:38 -04:00
Ben Kochie	38cd07ebb9	Merge pull request #450 from roclark/add-infiniband infiniband: Add new collector for InfiniBand statistics	2017-02-16 14:33:19 +01:00

600 Commits (fc73586c971225037aa09b5462031b9694278c74)