prometheus

Commit Graph

Author	SHA1	Message	Date
zenador	32ee1b15de	Fix error on ingesting out-of-order exemplars (#13021 ) Fix and improve ingesting exemplars for native histograms. See code comment for a detailed explanation of the algorithm. Note that this changes the current behavior for all kind of samples slightly: We now allow exemplars with the same timestamp as during the last scrape if the value or the labels have changed. Also note that we now do not ingest exemplars without timestamps for native histograms anymore. Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com> Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Co-authored-by: Björn Rabenstein <github@rabenste.in> --------- Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com> Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: zenador <zenador@users.noreply.github.com> Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Co-authored-by: Björn Rabenstein <github@rabenste.in>	1 year ago
Julien Pivotto	0fe34f6d78	Follow-up to #13060 : Add test to ensure staleness tracking This commit introduces an additional test in `scrape_test.go` to verify staleness tracking when `trackTimestampStaleness` is enabled. The new `TestScrapeLoopAppendStalenessIfTrackTimestampStaleness` function asserts that the scrape loop correctly appends staleness markers when necessary, reflecting the expected behavior with the feature flag turned on. The previous tests were only testing end of scrape staleness. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	1 year ago
Matthieu MOREL	7eaefcf379	ci(lint): enable errorlint on scrape (#12923 ) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>	1 year ago
George Krajcsovits	e399395b01	Native histograms vs labels (#13005 ) * Document le and quantile label transition due to native histograms Fixes: #12984 For full explanation see the related issue. The le and quantile labels are formatted as float with trailing .0 for whole number values when native histograms is enabled, e.g. 10.0. This changes the resulting series in Prometheus if previously we scraped the whole number itself, e.g. 10 over the text format. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> --------- Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>	1 year ago
Julien Pivotto	fed6f81377	scrape: Added trackTimestampsStaleness configuration option Add the ability to track staleness when an explicit timestamp is set. Useful for cAdvisor. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	1 year ago
Julien Pivotto	84aadfc45b	scrape: Added trackTimestampsStaleness configuration option Add the ability to track staleness when an explicit timestamp is set. Useful for cAdvisor. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	1 year ago
Paulin Todev	5752050b42	Scrape metrics can now be registered with a non-default registry. * A registerer is passed to the scrape Manager, and all scrape metrics register with it. * For now the registry which we pass to the scrape Manager is still the global one. Signed-off-by: Paulin Todev <paulin.todev@gmail.com>	1 year ago
Bartlomiej Plotka	624b973ebf	Added ability to specify scrape protocols to accept during HTTP content type negotiation. (#12738 ) * Added ability to specify scrape protocols to accept during HTTP content type negotiation. This is done via new option in GlobalConfig and ScrapeConfig: "scrape_protocol" Signed-off-by: bwplotka <bwplotka@gmail.com> * Fixed readability and log message. Signed-off-by: bwplotka <bwplotka@gmail.com> --------- Signed-off-by: bwplotka <bwplotka@gmail.com>	1 year ago
Bryan Boreham	f6d9c84fde	scraping: delay creating buffer, to save memory (#12953 ) We don't need the buffer to read the response until the scrape http call returns; creating it earlier makes the buffer pool larger. I split `scrape()` into `scrape()` which returns with the http response, and `readResponse()` which decompresses and copies the data into the supplied buffer. This design was chosen to minimize impact on the logic. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Arve Knudsen	6daee89e5f	Add context argument to Querier.Select (#12660 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
György Krajcsovits	983c0c5e9d	Add missing buckets My previous proposal for a fix was wrong and also missed these. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	1 year ago
György Krajcsovits	2ae8c2bd3d	Set expected values in test The parsing doesn't seem to be perfect as I don't get all classic buckets possibly another bug found? Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	1 year ago
György Krajcsovits	2a781ec5ac	Replicate infinite loop in native-classic histogram scrape Enable scraping a native histogram with exemplars that leads to infinite loop. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	1 year ago
Bryan Boreham	627c99424b	scrape: extend TestDroppedTargetsList to check counts Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Bryan Boreham	1e3fef6ab0	scraping: limit detail on dropped targets, to save memory (#12647 ) It's possible (quite common on Kubernetes) to have a service discovery return thousands of targets then drop most of them in relabel rules. The main place this data is used is to display in the web UI, where you don't want thousands of lines of display. The new limit is `keep_dropped_targets`, which defaults to 0 for backwards-compatibility. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
beorn7	536a487af4	scrape: Refactor names of float samples Continue to remove confusion that histogram samples are also samples and histogram values are also values etc. by renaming float values and float samples using the same schema as for histograms. Concretely: - result → resultFloats (corresponding to resultHistograms) - pendingResult → pendingFloats (corresponding to pendingHistograms) - rolledbackResult → rolledbackFloats (corresponding to rolledbackHistograms) - sample → floatSample (corresponding to histogramSample) This also order the fields in `collectResultAppender` more consistently. Signed-off-by: beorn7 <beorn@grafana.com>	1 year ago
beorn7	0e3f35324b	scrape: Enable ingestion of multiple exemplars per sample This has become a requirement for native histograms, as a single histogram sample commonly has many buckets, so that providing many exemplars makes sense. Since OM text doesn't support native histograms yet, the test had to be expanded to also support protobuf test cases. Signed-off-by: beorn7 <beorn@grafana.com>	1 year ago
beorn7	9e500345f3	textparse/scrape: Add option to scrape both classic and native histograms So far, if a target exposes a histogram with both classic and native buckets, a native-histogram enabled Prometheus would ignore the classic buckets. With the new scrape config option `scrape_classic_histograms` set, both buckets will be ingested, creating all the series of a classic histogram in parallel to the native histogram series. For example, a histogram `foo` would create a native histogram series `foo` and classic series called `foo_sum`, `foo_count`, and `foo_bucket`. This feature can be used in a migration strategy from classic to native histograms, where it is desired to have a transition period during which both native and classic histograms are present. Note that two bugs in classic histogram parsing were found and fixed as a byproduct of testing the new feature: 1. Series created from classic _gauge_ histograms didn't get the _sum/_count/_bucket prefix set. 2. Values of classic _float_ histograms weren't parsed properly. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
György Krajcsovits	19a4f314f5	Refactor testutil/protobuf.go into scrape package Renamed to clientprotobuf.go and added comments to indicate the intended usage. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2 years ago
Russ Cox	28f5502828	scrape: fix two loop variable scoping bugs in test Consider code like: for i := 0; i < numTargets; i++ { stopFuncs = append(stopFuncs, func() { time.Sleep(i20time.Millisecond) }) } Because the loop variable i is shared by all closures, all the stopFuncs sleep for numTargets20 ms. If the i were made per-iteration, as we are considering for a future Go release, the stopFuncs would have sleep durations ranging from 0 to (numTargets-1)20 ms. Two tests had code like this and were checking that the aggregate sleep was at least numTargets20 ms ("at least as long as the last target slept"). This is only true today because i == numTarget during all the sleeps. To keep the code working even if the semantics of this loop change, this PR computes d := time.Duration((i+1)20) * time.Millisecond outside the closure (but inside the loop body), and then each closure has its own d. Now the sleeps range from 20 ms to numTargets*20 ms, keeping the test passing (and probably behaving closer to the intent of the test author). The failure being fixed can be reproduced by using the current Go development branch with GOEXPERIMENT=loopvar go test Signed-off-by: Russ Cox <rsc@golang.org>	2 years ago
Jeanette Tan	dfabc69303	Add tests according to code review Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2 years ago
Jeanette Tan	2ad39baa72	Treat bucket limit like sample limit and make it fail the whole scrape and return an error Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2 years ago
György Krajcsovits	071426f72f	Add unit test for bucket limit appender Refactors textparser test to use a common test utility to create protobuf representation from MetricFamily Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2 years ago
Jeanette Tan	4d21ac23e6	Implement bucket limit for native histograms Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2 years ago
beorn7	5b53aa1108	style: Replace `else if` cascades with `switch` Wiser coders than myself have come to the conclusion that a `switch` statement is almost always superior to a statement that includes any `else if`. The exceptions that I have found in our codebase are just these two: * The `if else` is followed by an additional statement before the next condition (separated by a `;`). * The whole thing is within a `for` loop and `break` statements are used. In this case, using `switch` would require tagging the `for` loop, which probably tips the balance. Why are `switch` statements more readable? For one, fewer curly braces. But more importantly, the conditions all have the same alignment, so the whole thing follows the natural flow of going down a list of conditions. With `else if`, in contrast, all conditions but the first are "hidden" behind `} else if `, harder to spot and (for no good reason) presented differently from the first condition. I'm sure the aforemention wise coders can list even more reasons. In any case, I like it so much that I have found myself recommending it in code reviews. I would like to make it a habit in our code base, without making it a hard requirement that we would test on the CI. But for that, there has to be a role model, so this commit eliminates all `if else` occurrences, unless it is autogenerated code or fits one of the exceptions above. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
beorn7	c3c7d44d84	lint: Adjust to the lint warnings raised by current versions of golint-ci We haven't updated golint-ci in our CI yet, but this commit prepares for that. There are a lot of new warnings, and it is mostly because the "revive" linter got updated. I agree with most of the new warnings, mostly around not naming unused function parameters (although it is justified in some cases for documentation purposes – while things like mocks are a good example where not naming the parameter is clearer). I'm pretty upset about the "empty block" warning to include `for` loops. It's such a common pattern to do something in the head of the `for` loop and then have an empty block. There is still an open issue about this: https://github.com/mgechev/revive/issues/810 I have disabled "revive" altogether in files where empty blocks are used excessively, and I have made the effort to add individual `// nolint:revive` where empty blocks are used just once or twice. It's borderline noisy, though, but let's go with it for now. I should mention that none of the "empty block" warnings for `for` loop bodies were legitimate. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
Matthieu MOREL	fb3eb21230	enable gocritic, unconvert and unused linters Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2 years ago
Julien Pivotto	0c56e5d014	Update our own dependencies, support proxy from env Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Bryan Boreham	bec5abc4dc	scrape: remove unsafe code The `yolostring` routine was intended to avoid an allocation when converting from a `[]byte` to a `string` for map lookup. However, since 2014 Go has recognized this pattern and does not make a copy of the data when looking up a map. So the unsafe code is not necessary. In line with this, constants like `scrapeHealthMetricName` also become `[]byte`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	9bc6d7a7db	Update package scrape tests for new labels.Labels type Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	3c7de69059	storage: allow re-use of iterators Patterned after `Chunk.Iterator()`: pass the old iterator in so it can be re-used to avoid allocating a new object. (This commit does not do any re-use; it is just changing all the method signatures so re-use is possible in later commits.) Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Xiaochao Dong (@damnever)	9979024a30	Report error if the series contains invalid metric names or labels during scrape Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2 years ago
Ganesh Vernekar	3cbf87b83d	Enable protobuf negotiation only when histograms are enabled Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Bryan Boreham	4927e13537	scrape tests: undo EmptyLabels change Needs other code changes otherwise tests fail Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	14780c3b4e	scrape: in tests use labels.FromStrings And a few cases of `EmptyLabels()`. Replacing code which assumes the internal structure of `Labels`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bogdan Drutu	3cde9287a6	scrape: remove unused member from cacheEntry (#11281 ) Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>	2 years ago
Bogdan Drutu	c8cfe5c25d	scrape: remove unused argument in newScrapeLoop (#11283 ) Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com> Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>	2 years ago
Paschalis Tsilias	5a8e202f94	Append metadata to the WAL in the scrape loop (#10312 ) * Append metadata to the WAL Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra whitespace; Reword some docstrings and comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use RLock() for hasNewMetadata check Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use single byte for metric type in RefMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Update proposed WAL format for single-byte type metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Address first round of review comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Amend description of metadata in wal.md Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Correct key used to retrieve metadata from cache When we're setting metadata entries in the scrapeCace, we're using the p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and use it as the cache key. When checking for cache entries though, we used p.Series() as the key, which included the metric name _with_ its labels. That meant that we were never actually hitting the cache. We're fixing this by utiling the __name__ internal label for correctly getting the cache entries after they've been set by setHelp(), setType() or setUnit(). Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Put feature behind a feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reorder WAL format document Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix CR comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Extract logic about changing metadata in an anonymous function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Implement new proposed WAL format and amend relevant tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use 'const' for metadata field names Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Apply metadata to head memSeries in Commit, not in AppendMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add docstring and rename extracted helper in scrape.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix review comments around TestMetadata* tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rebase with merged TSDB changes; fix duplicate definitions after rebase Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove leftover changes on db_test.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rename feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Simplify updateMetadata helper function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra newline Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>	2 years ago
Levi Harrison	d61459d826	`no-default-scrape-port` feature flag (#9523 ) * Add `no-default-scrape-port` flag Signed-off-by: Levi Harrison <git@leviharrison.dev>	2 years ago
Julien Pivotto	90583c8906	TestScrapeLoopCache: Display content of the appender (#10937 ) This should help identifying windows tests flakiness. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Xiaonan Shen	0c3abdc26d	Keep relabeled scrape interval and timeout on reloads (#10916 ) * Preserve relabeled scrape interval and timeout on reloads Signed-off-by: Xiaonan Shen <s@sxn.dev>	2 years ago
Goutham Veeramachaneni	2381d7be57	Send target and metadata cache in context (again) (#10636 ) * Send target and metadata cache in context (again) The previous attempt was rolled back in #10590 due to memory issues. `sl.parentCtx` and `sl.ctx` both had a copy of the cache and target info in the previous attempt and it was hard to pin-point where the context was being retained causing the memory increase. I've experimented a bunch in #10627 to figure out that this approach doesn't cause memory increase. Beyond that, just using this info in _any_ other context is causing a memory increase. The change fixed a bunch of long-standing in the OTel Collector that the community was waiting on and release is blocked on a few downstream distrubutions of OTel Collector waiting on a fix. I propose to merge this change in while I investigate what is happening. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Gate the change behind a manager option Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	3 years ago
Matthieu MOREL	e2ede285a2	refactor: move from io/ioutil to io and os packages (#10528 ) * refactor: move from io/ioutil to io and os packages * use fs.DirEntry instead of os.FileInfo after os.ReadDir Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>	3 years ago
Robert Fratto	f0ec619eec	scrape: allow providing a custom Dialer for scraping (#10415 ) * scrape: allow providing a custom Dialer for scraping This commit extends config.ScrapeConfig with an optional field to override how HTTP connections to targets are created. This field is not set directly in Prometheus, and is only added for the convenience of downstream importers. Closes #9706 Signed-off-by: Robert Fratto <robertfratto@gmail.com> * scrape: move custom dial function to scrape.Options Signed-off-by: Robert Fratto <robertfratto@gmail.com>	3 years ago
Jayapriya Pai	edfe657b54	scrape: Fix label_limits cache usage (#10370 ) Fixes #10344 Signed-off-by: Jayapriya Pai <janantha@redhat.com>	3 years ago
Matheus Pimenta	8d8ce641a4	error for invalid media type should not be completely swallowed (#10186 ) * error for invalid media type should not be completely swallowed Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>	3 years ago
beorn7	86cc83b13c	storage: iterator fixes after merge Signed-off-by: beorn7 <beorn@grafana.com>	3 years ago
Julien Pivotto	e94a0b28e1	Append reporting metrics without limit If reporting metrics fails due to reaching the limit, this makes the target appear as UP in the UI, but the metrics are missing. This commit bypasses that limit for report metrics. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	3 years ago
Björn Rabenstein	7e42acd3b1	tsdb: Rework iterators (#9877 ) - Pick At... method via return value of Next/Seek. - Do not clobber returned buckets. - Add partial FloatHistogram suppert. Note that the promql package is now _only_ dealing with FloatHistograms, following the idea that PromQL only knows float values. As a byproduct, I have removed the histogramSeries metric. In my understanding, series can have both float and histogram samples, so that metric doesn't make sense anymore. As another byproduct, I have converged the sampleBuf and the histogramSampleBuf in memSeries into one. The sample type stored in the sampleBuf has been extended to also contain histograms even before this commit. Signed-off-by: beorn7 <beorn@grafana.com>	3 years ago
beorn7	c954cd9d1d	Move packages out of deprecated pkg directory This creates a new `model` directory and moves all data-model related packages over there: exemplar labels relabel rulefmt textparse timestamp value All the others are more or less utilities and have been moved to `util`: gate logging modetimevfs pool runtime Signed-off-by: beorn7 <beorn@grafana.com>	3 years ago

1 2 3

139 Commits (9509ad082a0528625de6752516e58d0d381caa4e)