1 Commits (0662673ad6a626e93eb79bc784776a95164024a3)
Author | SHA1 | Message | Date |
---|---|---|---|
Matt Bostock | 516e5d4beb | Add metric for outdated libraries (#957) | 7 years ago |

Add metrics that count how many running processes are linking to deleted libraries on each machine. Deleted libraries are usually outdated libraries, and outdated libraries may have known security vulnerabilities.

The rationale behind storing these as metrics is to allow the rollout of security fixes to be tracked across a fleet of machines, ensuring that all affected processes are restarted (e.g. via a reboot).

I'm parsing the output from `/proc/*/maps` because using `lsof -d DEL` can be too slow, particularly if you have sockets that bind to thousands of IP addresses.

The metric labels include the library path and the base filename, which allows us to pinpoint the exact path of the deleted library but also allows us to aggregate on the library name (or approximations of it) even if library locations differ between operating system versions.

The metrics output and the CPU time consumed are as follows:

    user@host:~$ time sudo python processes.py
    # HELP node_processes_linking_deleted_libraries Count of running processes that link a deleted library
    # TYPE node_processes_linking_deleted_libraries gauge
    node_processes_linking_deleted_libraries{library_path="locale-archive", library_name="/usr/lib/locale"} 3
    node_processes_linking_deleted_libraries{library_path="libevent-2.0.so.5.1.9", library_name="/usr/lib/x86_64-linux-gnu"} 4

    real 0m0.071s
    user 0m0.030s
    sys 0m0.041s

Including the library filename and path will result in reasonably high metrics cardinality; however, I think the benefits when an urgent security patch is being deployed outweigh concerns around cardinality.

This script assumes that library files do not contain spaces in their path.

Signed-off-by: Matt Bostock <mbostock@cloudflare.com>
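The approach described in the commit message (scan each process's `/proc/<pid>/maps` for mappings marked `(deleted)`, then count processes per library) could be sketched roughly as follows. This is not the committed script: the function names, the `"/lib/"` heuristic for deciding what counts as a library, and the 7-field check are assumptions made for illustration. The label values mirror the sample output in the commit message, where `library_path` carries the base filename and `library_name` carries the directory.

```python
import glob
import os
from collections import Counter

METRIC = "node_processes_linking_deleted_libraries"


def deleted_libraries(maps_text):
    """Return the set of deleted library paths mapped by one process."""
    found = set()
    for line in maps_text.splitlines():
        fields = line.split()
        # A mapping of a deleted file splits into 7 fields:
        # address, perms, offset, dev, inode, pathname, "(deleted)".
        # As the commit message notes, paths are assumed to contain no spaces.
        if len(fields) != 7 or fields[6] != "(deleted)":
            continue
        path = fields[5]
        # Assumption: anything under a "/lib/" directory is treated as a library.
        if "/lib/" in path:
            found.add(path)
    return found


def count_processes(maps_texts):
    """Count, per library path, how many processes map a deleted copy."""
    counts = Counter()
    for text in maps_texts:
        # A set is returned per process, so each process counts once per library.
        counts.update(deleted_libraries(text))
    return counts


def render_metrics(counts):
    """Format the counts in the Prometheus text exposition format."""
    lines = [
        f"# HELP {METRIC} Count of running processes that link a deleted library",
        f"# TYPE {METRIC} gauge",
    ]
    for path, count in sorted(counts.items()):
        lines.append(
            f'{METRIC}{{library_path="{os.path.basename(path)}", '
            f'library_name="{os.path.dirname(path)}"}} {count}'
        )
    return "\n".join(lines)


def read_proc_maps(pattern="/proc/*/maps"):
    """Yield the contents of every readable maps file."""
    for maps_file in glob.glob(pattern):
        try:
            with open(maps_file, errors="replace") as handle:
                yield handle.read()
        except OSError:
            continue  # the process exited or its maps file is not readable
```

On a Linux host, `print(render_metrics(count_processes(read_proc_maps())))` would emit the metrics; running it as root is likely needed to read the maps files of other users' processes. Keeping the parsing separate from the `/proc` access means the parsing logic can be exercised with plain strings.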