prometheus

Commit Graph

Author	SHA1	Message	Date
Julius Volz	6eecee55b7	Fix acronym caps in GeneratorURL. Change-Id: Ib18c1f617dcde1039e848059545a6d8831d9bf66	10 years ago
Bjoern Rabenstein	0ae1d8889a	Fix tests after merge. Change-Id: Ia90da9a3e48ed780ec38c4a6a1fd9ea34e7f6a58	10 years ago
Julius Volz	b7bf11230a	Add absent() function. A common problem in Prometheus alerting is to detect when no timeseries exist for a given metric name and label combination. Unfortunately, Prometheus alert expressions need to be of vector type, and "count(nonexistent_metric)" results in an empty vector, yielding no output vector elements to base an alert on. The newly introduced absent() function solves this issue: ALERT FooAbsent IF absent(foo{job="myjob"}) [...] absent() has the following behavior: - if the vector passed to it has any elements, it returns an empty vector. - if the vector passed to it has no elements, it returns a 1-element vector with the value 1. In the second case, absent() tries to be smart about deriving labels of the 1-element output vector from the input vector: absent(nonexistent{job="myjob"}) => {job="myjob"} absent(nonexistent{job="myjob",instance=~".*"}) => {job="myjob"} absent(sum(nonexistent{job="myjob"})) => {} That is, if the passed vector is a literal vector selector, it takes all "=" label matchers as the basis for the output labels, but ignores all non-equals or regex matchers. Also, if the passed vector results from a non-selector expression, no labels can be derived. Change-Id: I948505a1488d50265ab5692a3286bd7c8c70cd78	10 years ago
Julius Volz	3d47f94149	Drop metric names after transformations. After many transformations, it doesn't make sense to keep the metric names, since the result of the transformation is no longer that metric. This drops the metric name after such transformations and makes the web UI deal well with missing metric names. This depends on the current branch on the following things: - prometheus/client_golang needs to be at `e237cf15c6` in branch "julius/int-fingerprints" (to be merged with new storage) - prometheus/promdash needs to be at `dd7691c9c2` Change-Id: Ib3c8cad8d647d9854e8c653c424b8c235ccc231d	10 years ago
Bjoern Rabenstein	14bda4180c	Changes after pair code review. Change-Id: Ib72d40f8e9027818cfbbd32a7a7201eebda07455	10 years ago
Bjoern Rabenstein	006b5517e2	Simplify makefiles. This removes the dependancy on C leveldb and snappy. It also takes care of fewer dependencies as they would anyway not work on any non-Debian, non-Brew system. Change-Id: Ia70dce1ba8a816a003587927e0b3a3f8ad2fd28c	10 years ago
Bjoern Rabenstein	74c143c4c9	Improve scraper shutdown time. - Stop target pools in parallel. - Stop individual scrapers in goroutines, too. - Timing tweaks. Change-Id: I9dff1ee18616694f14b04408eaf1625d0f989696	10 years ago
Julius Volz	0712d738d1	Allow alternative "by"-clause position in grammar. In addition to the existing by-clause syntax: sum(<expression>) by (<labels>) [keeping_extra] ...this allows the following new syntax: sum by (<labels>) [keeping_extra] (<expression>) Both orderings may be used in a single expression. It is up to the users to establish guidelines around their usage. Change-Id: Iba10c9cc5fb6ac62edfcf246d281473e82467992	10 years ago
Julius Volz	0e48c18bbf	Allow omitting the metric name in queries. This allows the following expression syntaxes for selecting timeseries: foo (already valid before) foo{} (already valid before) {job="prometheus"} (new, select all timeseries for job "prometheus") Omitting both the metric name and any label matchers ("" or "{}") will still yield a syntax error. To get all timeseries, you could do: {__name__=~"."} or, without relying on knowledge about __metric__: {job=~"."} Change-Id: Ifee000b9ac0184ef6ced18411069c7f2699a2dda	10 years ago
Bjoern Rabenstein	096fa0f8b2	Squash a number of TODOs. - Staleness delta is no a proper function parameter and not replicated from package ast. - Named type 'chunks' replaced by explicit '[]chunk' to avoid confusion. - For the same reason, replaced 'chunkDescs' by '[]*chunkDescs'. - Verified that math.Modf is not a speed enhancement over conversion (actually 5x slower). - Renamed firstTimeField, lastTimeField into chunkFirstTime and chunkLastTime. - Verified unpin() is sufficiently goroutine-safe. - Decided not to update archivedFingerprintToTimeRange upon series truncation and added a rationale why. Change-Id: I863b8d785e5ad9f71eb63e229845eacf1bed8534	10 years ago
Bjoern Rabenstein	b3ed9aa7a2	Clean up start-up and shut-down. Change-Id: Idff4bbb0a15a9f879bfbb3da5b1025179cab5e2c	10 years ago
Bjoern Rabenstein	38fc24d0ed	Fix targetpool_test.go and other tests. Change-Id: I91a4dd1d39e01f174e1aaae653ce1ed7aecaa624	10 years ago
Julius Volz	7f5d3c2c29	Fix and improve the fp locker. Benchmark: $ go test -bench 'Fingerprint' -test.run 'Fingerprint' -test.cpu=1,2,4 OLD BenchmarkFingerprintLockerParallel 500000 3618 ns/op BenchmarkFingerprintLockerParallel-2 100000 12257 ns/op BenchmarkFingerprintLockerParallel-4 500000 10164 ns/op BenchmarkFingerprintLockerSerial 10000000 283 ns/op BenchmarkFingerprintLockerSerial-2 10000000 284 ns/op BenchmarkFingerprintLockerSerial-4 10000000 288 ns/op NEW BenchmarkFingerprintLockerParallel 1000000 1018 ns/op BenchmarkFingerprintLockerParallel-2 1000000 1164 ns/op BenchmarkFingerprintLockerParallel-4 2000000 910 ns/op BenchmarkFingerprintLockerSerial 50000000 56.0 ns/op BenchmarkFingerprintLockerSerial-2 50000000 47.9 ns/op BenchmarkFingerprintLockerSerial-4 50000000 54.5 ns/op Change-Id: I3c65a43822840e7e64c3c3cfe759e1de51272581	10 years ago
Julius Volz	358f97791d	Minor cleanups. Change-Id: Ia8685d8439a421fe2143d9ec7120d5bb5ab88d78	10 years ago
Bjoern Rabenstein	f5f9f3514a	Major code cleanup. - Make it go-vet and golint clean. - Add comments, TODOs, etc. Change-Id: If1392d96f3d5b4cdde597b10c8dff1769fcfabe2	10 years ago
Julius Volz	e7ed39c9a6	Initial experimental snapshot of next-gen storage. Change-Id: Ifb8709960dbedd1d9f5efd88cdd359ee9fa9d26d	10 years ago
Julius Volz	85497e3f38	Add function to drop common labels in a vector. This fixes https://github.com/prometheus/prometheus/issues/384. Change-Id: I2973c4baeb8a4618ec3875fb11c6fcf5d111784b	10 years ago
Julius Volz	3fdb74e571	Add more topk() / bottomk() tests. Test what happens if k > number of input elements. Change-Id: Ie724b850939e297ebf085f0a5a3522e9cfcc6534	10 years ago
Julius Volz	c582ae73c2	Implement topk() and bottomk() functions. To achieve O(log n * k) runtime, this uses a heap to track the current bottom-k or top-k elements while iterating over the full set of available elements. It would be possible to reuse more code between topk and bottomk, but I decided for some more duplication for the sake of clarity. This fixes https://github.com/prometheus/prometheus/issues/399 Change-Id: I7487ddaadbe7acb22ca2cf2283ba6e7915f2b336	10 years ago
Bjoern Rabenstein	1909686789	Make metrics exported by the Prometheus server itself more consistent. - Always spell out the time unit (e.g. milliseconds instead of ms). - Remove "_total" from the names of metrics that are not counters. - Make use of the "Namespace" and "Subsystem" fields in the options. - Removed the "capacity" facet from all metrics about channels/queues. These are all fixed via command line flags and will never change during the runtime of a process. Also, they should not be part of the same metric family. I have added separate metrics for the capacity of queues as convenience. (They will never change and are only set once.) - I left "metric_disk_latency_microseconds" unchanged, although that metric measures the latency of the storage device, even if it is not a spinning disk. "SSD" is read by many as "solid state disk", so it's not too far off. (It should be "solid state drive", of course, but "metric_drive_latency_microseconds" is probably confusing.) - Brian suggested to not mix "failure" and "success" outcome in the same metric family (distinguished by labels). For now, I left it as it is. We are touching some bigger issue here, especially as other parts in the Prometheus ecosystem are following the same principle. We still need to come to terms here and then change things consistently everywhere. Change-Id: If799458b450d18f78500f05990301c12525197d3	10 years ago
Julius Volz	00b9489f1c	Fix time() behavior. time() should return the timestamp for which the query is executed, not the actual current time. Change-Id: I430a45cabad7785cd58f95b1028a71dff4c87710	10 years ago
Julius Volz	c5984f1818	Add abs() and over-time aggregation functions. This implements aggregation functions over time as request in https://github.com/prometheus/prometheus/issues/383. Change-Id: Ifd69b850de8cfdf6e7a6c0e042056fa4c672410e	10 years ago
Brian Brazil	f525ca5d9e	Let consoles get graph links from experssions. Rename ConsoleLinkFromExpression, as we now have consoles. Change-Id: I7ed2c9c83863adb390b51121dd9736845f7bcdfc	10 years ago
Bjoern Rabenstein	8956faeccb	Migrate to new client_golang. This change will only be submitted when the new client_golang has been moved to the new version. Change-Id: Ifceb59333072a08286a8ac910709a8ba2e3a1581	10 years ago
Brian Brazil	960ede66dc	Use html/template for console templates and add template libary support. Add a function to bypass the new auto-escaping. Add a function to workaround go's templates only allowing passing in one argument. Change-Id: Id7aa3f95e7c227692dc22108388b1d9b1e2eec99	10 years ago
Julius Volz	2c4cab07b1	Fix acronym caps in GeneratorURL. Change-Id: Ib18c1f617dcde1039e848059545a6d8831d9bf66	10 years ago
Julius Volz	f1aac54104	Allow alternative "by"-clause position in grammar. In addition to the existing by-clause syntax: sum(<expression>) by (<labels>) [keeping_extra] ...this allows the following new syntax: sum by (<labels>) [keeping_extra] (<expression>) Both orderings may be used in a single expression. It is up to the users to establish guidelines around their usage. Change-Id: Iba10c9cc5fb6ac62edfcf246d281473e82467992	10 years ago
Julius Volz	080b952647	Allow omitting the metric name in queries. This allows the following expression syntaxes for selecting timeseries: foo (already valid before) foo{} (already valid before) {job="prometheus"} (new, select all timeseries for job "prometheus") Omitting both the metric name and any label matchers ("" or "{}") will still yield a syntax error. To get all timeseries, you could do: {__name__=~"."} or, without relying on knowledge about __metric__: {job=~"."} Change-Id: Ifee000b9ac0184ef6ced18411069c7f2699a2dda	10 years ago
Brian Brazil	35fb5378bc	Add back consoles link. Goes in index.html in consoles or else user data, if present. Change-Id: I5303d30aa24ca0c20d2e0f49121e04a260b9c4f4	10 years ago
Julius Volz	b65c5dd752	Add function to drop common labels in a vector. This fixes https://github.com/prometheus/prometheus/issues/384. Change-Id: I2973c4baeb8a4618ec3875fb11c6fcf5d111784b	10 years ago
Julius Volz	f7cd18abdf	Add more topk() / bottomk() tests. Test what happens if k > number of input elements. Change-Id: Ie724b850939e297ebf085f0a5a3522e9cfcc6534	10 years ago
Julius Volz	200d02effe	Implement topk() and bottomk() functions. To achieve O(log n * k) runtime, this uses a heap to track the current bottom-k or top-k elements while iterating over the full set of available elements. It would be possible to reuse more code between topk and bottomk, but I decided for some more duplication for the sake of clarity. This fixes https://github.com/prometheus/prometheus/issues/399 Change-Id: I7487ddaadbe7acb22ca2cf2283ba6e7915f2b336	10 years ago
Bjoern Rabenstein	24ece38f7c	Make metrics exported by the Prometheus server itself more consistent. - Always spell out the time unit (e.g. milliseconds instead of ms). - Remove "_total" from the names of metrics that are not counters. - Make use of the "Namespace" and "Subsystem" fields in the options. - Removed the "capacity" facet from all metrics about channels/queues. These are all fixed via command line flags and will never change during the runtime of a process. Also, they should not be part of the same metric family. I have added separate metrics for the capacity of queues as convenience. (They will never change and are only set once.) - I left "metric_disk_latency_microseconds" unchanged, although that metric measures the latency of the storage device, even if it is not a spinning disk. "SSD" is read by many as "solid state disk", so it's not too far off. (It should be "solid state drive", of course, but "metric_drive_latency_microseconds" is probably confusing.) - Brian suggested to not mix "failure" and "success" outcome in the same metric family (distinguished by labels). For now, I left it as it is. We are touching some bigger issue here, especially as other parts in the Prometheus ecosystem are following the same principle. We still need to come to terms here and then change things consistently everywhere. Change-Id: If799458b450d18f78500f05990301c12525197d3	10 years ago
Julius Volz	54eb21b22a	Fix time() behavior. time() should return the timestamp for which the query is executed, not the actual current time. Change-Id: I430a45cabad7785cd58f95b1028a71dff4c87710	10 years ago
Julius Volz	527831ec6e	Add abs() and over-time aggregation functions. This implements aggregation functions over time as request in https://github.com/prometheus/prometheus/issues/383. Change-Id: Ifd69b850de8cfdf6e7a6c0e042056fa4c672410e	10 years ago
Brian Brazil	8be993b8f8	Let consoles get graph links from experssions. Rename ConsoleLinkFromExpression, as we now have consoles. Change-Id: I7ed2c9c83863adb390b51121dd9736845f7bcdfc	10 years ago
Bjoern Rabenstein	2128d9d811	Migrate to new client_golang. This change will only be submitted when the new client_golang has been moved to the new version. Change-Id: Ifceb59333072a08286a8ac910709a8ba2e3a1581	11 years ago
Brian Brazil	29cf4d6ad9	Use html/template for console templates and add template libary support. Add a function to bypass the new auto-escaping. Add a function to workaround go's templates only allowing passing in one argument. Change-Id: Id7aa3f95e7c227692dc22108388b1d9b1e2eec99	11 years ago
Brian Brazil	e041c0cd46	Add console and alert templates with access to all data. Move rulemanager to it's own package to break cicrular dependency. Make NewTestTieredStorage available to tests, remove duplication. Change-Id: I33b321245a44aa727bfc3614a7c9ae5005b34e03	11 years ago
Bjoern Rabenstein	ca6a4fccef	Weed out our homegrown test.Tester. The Go stdlib has testing.TB now, which fulfills the exact same purpose. Change-Id: I0db9c73400e208ca376b932a02b7e3402234b87c	11 years ago
Julius Volz	6297a405f2	Do not indent API JSON responses. In one example response, this reduced the uncompressed size by 25% and the gzipped size by 11%. Change-Id: Ie80d44253124b9f8601b8ef9fc978e92dacff523	11 years ago
Julius Volz	01f652cb4c	Separate storage implementation from interfaces. This was initially motivated by wanting to distribute the rule checker tool under `tools/rule_checker`. However, this was not possible without also distributing the LevelDB dynamic libraries because the tool transitively depended on Levigo: rule checker -> query layer -> tiered storage layer -> leveldb This change separates external storage interfaces from the implementation (tiered storage, leveldb storage, memory storage) by putting them into separate packages: - storage/metric: public, implementation-agnostic interfaces - storage/metric/tiered: tiered storage implementation, including memory and LevelDB storage. I initially also considered splitting up the implementation into separate packages for tiered storage, memory storage, and LevelDB storage, but these are currently so intertwined that it would be another major project in itself. The query layers and most other parts of Prometheus now have notion of the storage implementation anymore and just use whatever implementation they get passed in via interfaces. The rule_checker is now a static binary :) Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0	11 years ago
Julius Volz	d411a7d810	Allow reversing vector and scalar arguments in binops. This allows putting a scalar as the first argument of a binary operator in which the second argument is a vector: <scalar> <binop> <vector> For example, 1 / http_requests_total ...will output a vector in which every sample value is 1 divided by the respective input vector element. This even works for filter binary operators now: 1 == http_requests_total Returns a vector with all values set to 1 for every element in http_requests_total whose initial value was 1. Note: For filter binary operators, the resulting values are always taken from the left-hand-side of the operation, no matter whether the scalar or the vector argument is the left-hand-side. That is, 1 != http_requests_total ...will set all result vector sample values to 1, although these are exactly the sample elements that were != 1 in the input vector. If you want to just filter elements without changing their sample values, you still need to do: http_requests_total != 1 The new filter form is a bit exotic, and so probably won't be used often. But it was easier to implement it than disallow it completely or change its behavior. Change-Id: Idd083f2bd3a1219ba1560cf4ace42f5b82e797a5	11 years ago
Julius Volz	c7c0b33d0b	Add regex-matching support for labels. There are four label-matching ops for selecting timeseries now: - Equal: = - NotEqual: != - RegexMatch: =~ - RegexNoMatch: !~ Instead of looking up labels by a simple clientmodel.LabelSet (basically an equals op for every key/value pair in the set), timeseries fingerprint selection is now done via a list of metric.LabelMatchers. Change-Id: I510a83f761198e80946146770ebb64e4abc3bb96	11 years ago
Bjoern Rabenstein	0a65b691cc	Disallow ":" in identifiers, but still allow it in metric names. Change-Id: Iace925ab1b71a360bd63357e87f68e727f7afbcb	11 years ago
Julius Volz	86fc13a52e	Convert metric.Values to slice of values. The initial impetus for this was that it made unmarshalling sample values much faster. Other relevant benchmark changes in ns/op: Benchmark old new speedup ================================================================== BenchmarkMarshal 179170 127996 1.4x BenchmarkUnmarshal 404984 132186 3.1x BenchmarkMemoryGetValueAtTime 57801 50050 1.2x BenchmarkMemoryGetBoundaryValues 64496 53194 1.2x BenchmarkMemoryGetRangeValues 66585 54065 1.2x BenchmarkStreamAdd 45.0 75.3 0.6x BenchmarkAppendSample1 1157 1587 0.7x BenchmarkAppendSample10 4090 4284 0.95x BenchmarkAppendSample100 45660 44066 1.0x BenchmarkAppendSample1000 579084 582380 1.0x BenchmarkMemoryAppendRepeatingValues 22796594 22005502 1.0x Overall, this gives us good speedups in the areas where they matter most: decoding values from disk and accessing the memory storage (which is also used for views). Some of the smaller append examples take minimally longer, but the cost seems to get amortized over larger appends, so I'm not worried about these. Also, we're currently not bottlenecked on the write path and have plenty of other optimizations available in that area if it becomes necessary. Memory allocations during appends don't change measurably at all. Change-Id: I7dc7394edea09506976765551f35b138518db9e8	11 years ago
Julius Volz	bc6ee6611e	Rename persistence_adapter.go -> view_adapter.go Change-Id: Ib45081393b734531d2f85a02f46e87930aab3273	11 years ago
Julius Volz	3f226c9724	Rename {Scalar,Vector}Literal to {Scalar,Vector}Selector. Change-Id: Ie92301f47f5f49f30b3a62c365e377108982b080	11 years ago
Bjoern Rabenstein	682cf6fc51	Simplify QueryAnalizer.Visit(). Change-Id: I628582a1903b7273e78921e22a475f1dae5ebaae	11 years ago
Bjoern Rabenstein	fd63500ed3	Make rules/ast golint clean. Mostly, that means adding compliant doc strings to exported items. Also, remove 'go vet' warnings where possible. (Some are unfortunately not to avoid, arguably bugs in 'go vet'.) Change-Id: I2827b6dd317492864c1383c3de1ea9eac5a219bb	11 years ago

... 6 7 8 9 10 ...

535 Commits (f53af5c849a1c3944fb67f74a429ddd540ad7fb9)