Adds a counter for TCP packets received out of order. This can
indicate packet loss on the path that packets take towards this
server. In that case, the sender will retransmit (and we can already
monitor Tcp_RetransSegs there), but we have no way to monitor the
packet loss on the receiver side. When a packet is received and the
receiver detects that a previous one is missing, it increments the
TCPOFOQueue counter and replies with a selective ACK to the sender;
both are possible indications of packet loss. Packet loss can be
confirmed by taking packet captures, ignoring Wireshark's analysis, and
carefully looking at which data is retransmitted based on the TCP
sequence numbers.
Just like RetransSegs, TCPOFOQueue should be of interest to any
deployment as a means of detecting packet loss, so this change suggests
adding it to the default list.
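
For illustration only, here is a minimal sketch of how a counter such as
TCPOFOQueue could be read from text in the `/proc/net/netstat` layout
(a header line of field names followed by a line of values, per
protocol). The function name `parse_netstat` and the sample numbers are
made up for this example and are not the exporter's code.

```python
def parse_netstat(text):
    """Parse header/value line pairs into {protocol: {field: int}}."""
    stats = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: field names first, then the matching values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError(f"mismatched lines: {proto!r} vs {value_proto!r}")
        stats[proto] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return stats


# Hypothetical sample; real files contain many more fields.
SAMPLE = """\
TcpExt: SyncookiesSent TCPOFOQueue TCPTimeouts
TcpExt: 0 42 7
IpExt: InNoRoutes InOctets
IpExt: 0 123456
"""

stats = parse_netstat(SAMPLE)
print(stats["TcpExt"]["TCPOFOQueue"])
```

A value that keeps growing between two reads would be the receiver-side
hint of loss described above.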
Signed-off-by: François Rigault <frigo@amadeus.com>
Co-authored-by: François Rigault <frigo@amadeus.com>
* Bump exporter-toolkit to the latest release.
* Use new toolkit landing page function.
* Update kingpin flags.
Signed-off-by: Ben Kochie <superq@gmail.com>
The TCP timeouts count is a useful signal of
abnormal network performance and is another
signal to aid debugging. This metric can be
used to generate proactive alerts for workloads
running in the host network namespace.
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Fix the error logging of the promhttp handler by connecting it to the
promlog setup.
* Switch to go-kit/log.
* Cleanup CHANGELOG.
Fixes: https://github.com/prometheus/node_exporter/issues/1886
Signed-off-by: Ben Kochie <superq@gmail.com>
TCP "OutRsts" is the number of TCP Resets sent by the node. This can be
useful for monitoring connection failures and flooding.
Signed-off-by: Ben Kochie <superq@gmail.com>
* netstat: Add TCP In/Out Segs
In order to get a better idea of TCP packet loss, we need to know how
many `node_netstat_Tcp_OutSegs` there are so we can compare this to
`node_netstat_Tcp_RetransSegs`.
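
As a rough sketch of the comparison described above, the ratio of
retransmitted to sent segments can be computed from two readings of the
counters. The helper `retrans_ratio` and the dictionary keys are
assumptions made for this example; the result is only a coarse
indicator of loss.

```python
def retrans_ratio(prev, curr):
    """Return retransmitted segments per sent segment between two samples."""
    sent = curr["OutSegs"] - prev["OutSegs"]
    retransmitted = curr["RetransSegs"] - prev["RetransSegs"]
    if sent <= 0:
        # No traffic (or a counter reset) in the interval: report no loss.
        return 0.0
    return retransmitted / sent


# Hypothetical counter readings taken some time apart.
earlier = {"OutSegs": 1000, "RetransSegs": 10}
later = {"OutSegs": 3000, "RetransSegs": 30}
print(retrans_ratio(earlier, later))
```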
Signed-off-by: Ben Kochie <superq@gmail.com>
* Update fixtures
Signed-off-by: Ben Kochie <superq@gmail.com>
TCP SYN packet retransmits are a very useful signal, as they affect
network performance disproportionately compared to regular TCP
retransmits.
Signed-off-by: Ben Kochie <superq@gmail.com>
Netstat accounts for 40% of the metrics on my laptop, many of which
expose highly detailed information about IP internals in the kernel.
~300 such metrics on every machine in your fleet is excessive,
so focus on key metrics by default, overridable by the user.
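
A minimal sketch of the idea, assuming fields are selected by matching
their full name against a regular expression that the user can replace.
The pattern and the name `filter_fields` are hypothetical and do not
reproduce the exporter's actual default.

```python
import re

# Hypothetical default: a handful of key fields instead of ~300.
KEY_FIELDS = re.compile(
    r"Tcp_(OutSegs|RetransSegs)|TcpExt_(ListenOverflows|TCPSynRetrans)"
)


def filter_fields(metrics, pattern=KEY_FIELDS):
    """Keep only the metrics whose full name matches the pattern."""
    return {name: value for name, value in metrics.items()
            if pattern.fullmatch(name)}


metrics = {"Tcp_OutSegs": 1, "IpExt_InOctets": 2, "TcpExt_ListenOverflows": 3}
print(filter_fields(metrics))                           # default selection
print(filter_fields(metrics, re.compile(r"IpExt_.*")))  # user override
```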
Fixes #515
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Move NodeCollector into package collector
* Refactor collector enabling
* Update README with new collector enabled flags
* Fix out-of-date inline flag reference syntax
* Use new flags in end-to-end tests
* Add flag to disable all default collectors
* Track if a flag has been set explicitly
* Add --collectors.disable-defaults to README
* Revert disable-defaults flag
* Shorten flags
* Fixup timex collector registration
* Fix end-to-end tests
* Change procfs and sysfs path flags
* Fix review comments
Named return variables should only be used to describe the returned type
further; e.g. `err error` adds no new information and is just
stutter.
Remove all hardcoded references to `/proc`. For all collectors that do
not use `github.com/prometheus/procfs` yet, provide a wrapper to
generate the full paths.
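
The wrapper idea can be sketched as follows, assuming the procfs mount
point comes from a single configurable value. The names
`make_proc_path` and `proc_path` are invented for this illustration and
are not the exporter's API.

```python
import posixpath


def make_proc_path(root="/proc"):
    """Return a helper that builds full paths under the given procfs root."""
    def proc_path(*parts):
        # Collectors pass only the relative part, e.g. ("net", "netstat").
        return posixpath.join(root, *parts)
    return proc_path


proc_path = make_proc_path("/host/proc")  # e.g. procfs mounted elsewhere
print(proc_path("net", "netstat"))
```

With this shape, a collector never spells out `/proc` itself, so
pointing it at a different mount point or at test fixtures only requires
changing the root.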
Reformulate help strings, errors and comments to remove absolute
references to `/proc`.
This is a breaking change: the `-collector.ipvs.procfs` flag is removed
in favor of the general flag. Since it only affected that collector it
was only useful for development, so this should not cause many issues.
This catches listen overflows, retransmits and other events
that are very useful for retroactive debugging,
so I think it's justified to have it on by default.