prometheus

Commit Graph

Author	SHA1	Message	Date
Jesus Vazquez	e934d0f011	Merge 'main' into sparsehistogram Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>	2022-10-05 22:14:49 +02:00
Ganesh Vernekar	f34aeefe6e	Allow overlapping blocks by default (#11331 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-28 19:17:54 +05:30
Paschalis Tsilias	f2ee959354	Remove 'metadata-storage' CLI flag (#11351 ) Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>	2022-09-27 12:05:09 +05:30
Jesus Vazquez	c1b669bf9b	Add out-of-order sample support to the TSDB (#11075 ) * Introduce out-of-order TSDB support This implementation is based on this design doc: https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing This commit adds support to accept out-of-order ("OOO") samples into the TSDB up to a configurable time allowance. If OOO is enabled, overlapping queries are automatically enabled. Most of the additions have been borrowed from https://github.com/grafana/mimir-prometheus/ Here is the list of the original commits cherry picked from mimir-prometheus into this branch: - `4b2198d7ec` - `2836e5513f` - `00b379c3a5` - `ff0dc75758` - `a632c73352` - `c6f3d4ab33` - `5e8406a1d4` - `abde1e0ba1` - `e70e769889` - `df59320886` Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Dieter Plaetinck <dieter@grafana.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * gofumpt files Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Add license header to missing files Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix OOO tests due to existing chunk disk mapper implementation Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix truncate int overflow Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Add Sync method to the WAL and update tests Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * remove useless sync Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Update minOOOTime after truncating Head Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add a unit test Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Load OutOfOrderTimeWindow only once per appender Signed-off-by: Jesus Vazquez 
<jesus.vazquez@grafana.com> * Fix OOO Head LabelValues and PostingsForMatchers Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix replay of OOO mmap chunks Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Remove unnecessary err check Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Prevent panic with ApplyConfig Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Run OOO compaction after restart if there is OOO data from WBL Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Apply Bartek's suggestions Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Refactor OOO compaction Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Address comments and TODOs - Added a comment explaining why we need the allow overlapping compaction toggle - Clarified TSDBConfig OutOfOrderTimeWindow doc - Added an owner to all the TODOs in the code Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Run go format Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix remaining review comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix tests Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Change wbl reference when truncating ooo in TestHeadMinOOOTimeUpdate Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix TestWBLAndMmapReplay test failure on windows Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Address most of the feedback Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Refactor the block meta for out of order Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix windows error Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix review comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> 
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Dieter Plaetinck <dieter@grafana.com> Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2022-09-20 22:35:50 +05:30
Ganesh Vernekar	d354f20c2a	Add a feature flag to control native histogram ingestion (#11253 ) * Add runtime config to control native histogram ingestion Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Make the config into a CLI flag Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-14 17:38:34 +05:30
Bryan Boreham	c438b50133	cmd/promtool: in tests use labels.FromStrings Replacing code which assumes the internal structure of `Labels`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-09 13:34:49 +02:00
Bryan Boreham	735914f692	cmd/prometheus: in tests use labels.FromStrings Replacing code which assumes the internal structure of `Labels`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-09 13:34:49 +02:00
Cosrider	bef6556ca5	delete redundant alias (#11180 ) Signed-off-by: Cosrider <cosrider7@gmail.com> Signed-off-by: Cosrider <cosrider7@gmail.com>	2022-08-31 15:50:38 +02:00
Paschalis Tsilias	5a8e202f94	Append metadata to the WAL in the scrape loop (#10312 ) * Append metadata to the WAL Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra whitespace; Reword some docstrings and comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use RLock() for hasNewMetadata check Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use single byte for metric type in RefMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Update proposed WAL format for single-byte type metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Address first round of review comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Amend description of metadata in wal.md Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Correct key used to retrieve metadata from cache When we're setting metadata entries in the scrapeCache, we're using the p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and use it as the cache key. When checking for cache entries though, we used p.Series() as the key, which included the metric name _with_ its labels. That meant that we were never actually hitting the cache. We're fixing this by utilizing the __name__ internal label for correctly getting the cache entries after they've been set by setHelp(), setType() or setUnit(). 
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Put feature behind a feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reorder WAL format document Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix CR comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Extract logic about changing metadata in an anonymous function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Implement new proposed WAL format and amend relevant tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use 'const' for metadata field names Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Apply metadata to head memSeries in Commit, not in AppendMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add docstring and rename extracted helper in scrape.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix review comments around TestMetadata* tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rebase with merged TSDB changes; fix duplicate definitions after rebase Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove leftover changes on db_test.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rename feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Simplify updateMetadata helper function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra newline Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>	2022-08-31 15:50:05 +02:00
Bryan Boreham	8b863c42dd	Optimise relabeling by re-using memory (#11147 ) * model/relabel: Add benchmark Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * model/relabel: re-use Builder across relabels Saves memory allocations. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * labels.Builder: allow re-use of result slice This reduces memory allocations where the caller has a suitable slice available. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * model/relabel: re-use source values slice To reduce memory allocations. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Unwind one change causing test failures Restore original behaviour in PopulateLabels, where we must not overwrite the input set. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * relabel: simplify values optimisation Use a stack-based array for up to 16 source labels, which will be the vast majority of cases. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * lint Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-08-19 15:27:52 +05:30
beorn7	c9fd3c235d	Merge branch 'main' into sparsehistogram	2022-08-10 17:54:37 +02:00
Levi Harrison	fa9bc5184a	Update and fix interface (#11131 ) Signed-off-by: Levi Harrison <git@leviharrison.dev>	2022-08-10 10:14:52 +02:00
Levi Harrison	d61459d826	`no-default-scrape-port` feature flag (#9523 ) * Add `no-default-scrape-port` flag Signed-off-by: Levi Harrison <git@leviharrison.dev>	2022-07-20 13:35:47 +02:00
Paschalis Tsilias	d1122e0743	Introduce TSDB changes for appending metadata to the WAL (#10972 ) * Append metadata to the WAL Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra whitespace; Reword some docstrings and comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use RLock() for hasNewMetadata check Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use single byte for metric type in RefMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Update proposed WAL format for single-byte type metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Implement MetadataAppender interface for the Agent Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Address first round of review comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Amend description of metadata in wal.md Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Correct key used to retrieve metadata from cache When we're setting metadata entries in the scrapeCache, we're using the p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and use it as the cache key. When checking for cache entries though, we used p.Series() as the key, which included the metric name _with_ its labels. That meant that we were never actually hitting the cache. We're fixing this by utilizing the __name__ internal label for correctly getting the cache entries after they've been set by setHelp(), setType() or setUnit(). 
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Put feature behind a feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix AppendMetadata docstring Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reorder WAL format document Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Change error message of AppendMetadata; Fix access of s.meta in AppendMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reuse temporary buffer in Metadata encoder Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Only keep latest metadata for each refID during checkpointing Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix test that's referencing decoding metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Avoid creating metadata block if no new metadata are present Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add tests for corrupt metadata block and relevant record type Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix CR comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Extract logic about changing metadata in an anonymous function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Implement new proposed WAL format and amend relevant tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use 'const' for metadata field names Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Apply metadata to head memSeries in Commit, not in AppendMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add docstring and rename extracted helper in scrape.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add tests for tsdb-related cases Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix linter issues vol1 Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix linter issues vol2 Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix Windows test by closing WAL reader files Signed-off-by: 
Paschalis Tsilias <paschalist0@gmail.com> * Use switch instead of two if statements in metadata decoding Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix review comments around TestMetadata* tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add code for replaying WAL; test correctness of in-memory data after a replay Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove scrape-loop related code from PR Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Address first round of comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Simplify tests by sorting slices before comparison Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix test to use separate transactions Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Empty out buffer and record slices after encoding latest metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix linting issue Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Update calculation for DroppedMetadata metric Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rename MetadataAppender interface and AppendMetadata method to MetadataUpdater/UpdateMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reuse buffer when encoding latest metadata for each series Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix review comments; Check all returned error values using two helpers Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Simplify use of helpers Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Satisfy linter Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>	2022-07-19 10:58:52 +02:00
beorn7	28f028e938	Merge branch 'main' into sparsehistogram	2022-07-12 19:07:13 +02:00
Julien Pivotto	7a2d24b76a	Fix flakiness in windows tests (#10983 ) Our windows CI is too slow, process takes lots of time to start. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2022-07-06 10:33:14 +02:00
Julien Pivotto	13bd4fd3c8	Fix promtool check config not erroring properly on failures (#10952 ) Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2022-07-01 14:38:49 +02:00
lixin18	735a07444a	Update main_unix_test.go (#10917 ) so->,so Signed-off-by: lixin18 <68135097+lixin963@users.noreply.github.com>	2022-06-27 16:15:51 +02:00
beorn7	40ad5e284a	Merge branch 'main' into beorn7/sparsehistogram	2022-06-09 20:50:30 +02:00
David Leadbeater	355b8bcf0b	Add --lint-fatal option (#10815 ) This keeps the previous behaviour of printing details about duplicate rules but doesn't exit with a fatal exit code unless turned on. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2022-06-03 23:33:39 +10:00
Ben Kochie	9570924511	Merge pull request #9638 from prometheus/superq/agentMode Add agent mode identifier	2022-05-24 10:11:21 +02:00
Matthieu MOREL	36eee11434	refactor (package cmd): move from github.com/pkg/errors to 'errors' and 'fmt' packages (#10733 ) Signed-off-by: Matthieu MOREL <mmorel-35@users.noreply.github.com> Co-authored-by: Matthieu MOREL <mmorel-35@users.noreply.github.com>	2022-05-24 16:58:59 +10:00
Łukasz Mierzwa	44e5f220c0	Move prometheus_ready metric to web package (#10729 ) This moves prometheus_ready to the web package and links it with the ready variable that decides if HTTP requests should return 200 or 503. This is a follow up change from #10682 Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-05-23 16:00:59 +02:00
Łukasz Mierzwa	070e409dba	Add prometheus_ready metric (#10682 ) When Prometheus starts it can take a long time before WAL is replayed and it can do anything useful. While it's starting it exposes metrics and other Prometheus servers can scrape it. We do have alerts that fire if any Prometheus server is not ingesting samples and so far we've been dealing with instances that are starting for a long time by adding a check on Prometheus process uptime. Relying on uptime isn't ideal because the time needed to start depends on the number of metrics scraped, and so on the amount of data in WAL. To help write better alerts it would be great if Prometheus exposed a metric that tells us it's fully started; that way any alert that is supposed to notify us about any runtime issue can filter out starting instances. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-05-23 11:42:01 +02:00
Ben Ye	af5ea213f7	promtool: support matchers when querying label values (#10727 ) * promtool: support matchers when querying label values Signed-off-by: Ben Ye <ben.ye@bytedance.com> * address review comment Signed-off-by: Ben Ye <ben.ye@bytedance.com>	2022-05-23 11:10:45 +10:00
Łukasz Mierzwa	d3c9c4f574	Stop rule manager before TSDB is stopped (#10680 ) During shutdown TSDB is stopped before rule manager is stopped. Since TSDB shutdown can take a long time (minutes or 10s of minutes) it keeps rule manager running while parts of Prometheus are already stopped (most notably scrape manager). This can cause false positive alerts to fire, mostly those that rely on absent() calls since new sample appends will stop while alert queries are still evaluated. Stop rules before stopping TSDB and scrape manager to avoid this problem. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-05-20 23:26:06 +02:00
Alban Hurtaud	41630b8e88	Add hidden flag to configure discovery loop interval (#10634 ) * Add hidden flag to configure discovery loop interval Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>	2022-05-06 00:42:04 +02:00
beorn7	3bc711e333	Merge branch 'main' into sparsehistogram	2022-05-04 13:37:13 +02:00
Matthieu MOREL	e2ede285a2	refactor: move from io/ioutil to io and os packages (#10528 ) * refactor: move from io/ioutil to io and os packages * use fs.DirEntry instead of os.FileInfo after os.ReadDir Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>	2022-04-27 11:24:36 +02:00
Filip Petkovski	1c1b174a8e	Add a --lint flag to the promtool check rules and check config commands (#10435 ) * Add a --lint flag to the promtool check rules and check config commands Checking rules with promtool emits warnings in the case of duplicate rules. These warnings do not result in a non-zero exit code and are difficult to spot in CI environments. Additionally, checking for duplicates is closer to a lint check rather than a syntax check. This commit adds a --lint flag to commands which include checking rules. The flag can be used to enable or disable certain linting options and cause the execution to return a non-zero exit code in case those options are not met. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com> * Exit with status 3 on lint error Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	2022-04-06 00:05:11 -04:00
beorn7	7ee1836ef5	Merge branch 'main' into sparsehistogram	2022-04-05 18:31:19 +02:00
Julien Pivotto	390956d317	Log gomaxprocs messages (#10506 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-03-30 19:16:22 +02:00
TomasKohout	c0fd228bad	Add dependency on go.uber.org/automaxprocs (#10498 ) * add dependency on go.uber.org/automaxprocs Signed-off-by: Tomas Kohout <tomas.kohout1995@gmail.com> Co-authored-by: Peter Bourgon <peterbourgon@users.noreply.github.com> Co-authored-by: Julien Pivotto <roidelapluie@gmail.com>	2022-03-30 12:50:11 +02:00
Julien Pivotto	f9d8e5245a	Plugins support (#10495 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-03-29 14:44:39 +02:00
Wilbert Guo	83a2e52bc2	Add SyncForState Implementation for Ruler HA (#10070 ) * continuously syncing activeAt for alerts Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * add import Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Refactor SyncForState and add unit tests Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Format code Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Add hook for syncForState Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go lint Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Refactor syncForState override implementation Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Add syncForState override func as argument to Update() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go formatting Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix circleci test errors Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Remove overrideFunc as argument to run() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * remove the syncForState Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use the override function to decide if need to replace the activeAt or not Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix test case Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix format Signed-off-by: Yijie Qin <qinyijie@amazon.com> * Trigger build Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * return the result of map of alerts instead of single one Signed-off-by: Yijie Qin <qinyijie@amazon.com> * upper case the QueryforStateSeries Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use a more generic rule group post process function type Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix indentation Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix gofmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix lint Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing naming 
Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * add the lastEvalTimestamp as parameter Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * change funcType to func Signed-off-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <63399121+qinxx108@users.noreply.github.com>	2022-03-29 02:16:46 +02:00
beorn7	4210aac74a	Merge branch 'main' into sparsehistogram	2022-03-22 14:47:42 +01:00
Alan Protasio	606ef33d91	Track and report Samples Queried per query We always track total samples queried and add those to the standard set of stats queries can report. We also allow optionally tracking per-step samples queried. This must be enabled both at the engine and query level to be tracked and rendered. The engine flag is exposed via a Prometheus feature flag, while the query flag is set when stats=all. Co-authored-by: Alan Protasio <approtas@amazon.com> Co-authored-by: Andrew Bloomgarden <blmgrdn@amazon.com> Co-authored-by: Harkishen Singh <harkishensingh@hotmail.com> Signed-off-by: Andrew Bloomgarden <blmgrdn@amazon.com>	2022-03-21 23:49:17 +01:00
Mauro Stettler	b025390cb4	Disable chunk write queue by default, allow user to configure the exact size (#10425 ) * Disable chunk write queue by default Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com> * update flag description Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>	2022-03-11 17:26:59 +01:00
ian woolf	025528a5d6	cmd: use os.MkdirTemp instead of ioutil.TempDir (#10320 ) Signed-off-by: ianwoolf <btw515wolf2@gmail.com>	2022-03-08 14:08:53 +01:00
Łukasz Mierzwa	a4317bf0ec	Run gofumpt on all files (#10392 ) * Run gofumpt on all files Getting golangci-lint errors when building on my laptop, possibly because I have a newer version of gofumpt than what it was formatted with. Run gofumpt -w -extra on all files as it will be needed in the future anyway. * Update golangci-lint to v1.44.2 v1.44.0 upgraded gofumpt so bumping version in CI will help keep formatting correct for everyone * Address golangci-lint error Getting 'error-strings: error strings should not be capitalized or end with punctuation or a newline' from revive here. Drop new line. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-03-03 17:21:05 +01:00
SuperQ	b297520666	Add agent mode identifier Identify in the logs and liveness endpoints if the server is running in Agent mode or not. Signed-off-by: SuperQ <superq@gmail.com>	2022-02-17 05:27:09 +01:00
Tobias Klausmann	b998636893	Improve error logging for missing config and QL dir (#10260 ) * Improve error logging for missing config and QL dir Currently, when Prometheus can't open its config file or the query logging dir under the data dir, it only logs what it has been given by default or via commandline/config. Depending on the environment this can be less than helpful, since the working directory may be unclear to the user. I have specifically kept the existing error messages as intact as possible to a) still log the parameter as given and b) cause as little disruption for log-parsers/-analyzers as possible. So in case of the config file or the data dir being non-absolute paths, I use os.Getwd to find the working dir and assemble an absolute path for error logging purposes. If Getwd fails, we just log "unknown", as recovering from an error there would be a very complex measure, likely not worth the code/effort. Example errors: ``` $ ./prometheus ts=2022-02-06T16:00:53.034Z caller=main.go:445 level=error msg="Error loading config (--config.file=prometheus.yml)" fullpath=/home/klausman/src/prometheus/prometheus.yml err="open prometheus.yml: no such file or directory" $ touch prometheus.yml $ ./prometheus [...] ts=2022-02-06T16:01:00.992Z caller=query_logger.go:99 level=error component=activeQueryTracker msg="Error opening query log file" file=data/queries.active fullpath=/home/klausman/src/prometheus/data/queries.active err="open data/queries.active: permission denied" panic: Unable to create mmap-ed active query log [...] 
$ ``` Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Replace our own logic with just using filepath.Abs() Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Further simplification Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Review edits Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Review edits Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Review edits Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de>	2022-02-16 17:43:15 +01:00
Julien Pivotto	9a2e93228e	Switch to grafana/regexp everywhere (#10268 ) Let's have a consistent library for regexp. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-02-13 00:58:27 +01:00
Matej Gera	2c61d29b2a	Tracing: Migrate to OpenTelemetry library (#9724 ) Signed-off-by: Matej Gera <matejgera@gmail.com>	2022-01-25 11:08:04 +01:00
Rodrigo Queiro	70c1446a64	Clarify units of --storage.tsdb.retention.size (#10154 ) The flag uses Base2Bytes: `129ed4ec8b/cmd/prometheus/main.go (L1476)` Signed-off-by: Rodrigo Queiro <rodrigoq@google.com>	2022-01-13 00:55:57 +01:00
beorn7	b39f2739e5	PromQL: Always enable negative offset and @ modifier This follows the line of argument that the invariant of not looking ahead of the query time was merely emerging behavior and not a documented stable feature. Any query that looks ahead of the query time was simply invalid before the introduction of the negative offset and the @ modifier. Signed-off-by: beorn7 <beorn@grafana.com>	2022-01-11 17:08:55 +01:00
beorn7	61509fc840	PromQL: Promote negative offset and @ modifier to stable Following the argument that breaking the invariant that PromQL does not look ahead of the evaluation time implies a breaking change, we still need to keep the feature flag around, but at least we can communicate that the feature is considered stable, and that the feature flags will be ignored from v3 on. Signed-off-by: beorn7 <beorn@grafana.com>	2022-01-11 00:34:33 +01:00
Björn Rabenstein	0f4a1e6eac	Merge pull request #10119 from prometheus/beorn7/remote API: Promote remote-write-receiver to stable	2022-01-10 15:55:10 +01:00
chenlujjj	2ce94ac196	Add '--weight' flag to 'promtool check metrics' command (#10045 )	2022-01-07 16:58:28 -05:00
beorn7	8fdfa52976	API: Promote remote-write-receiver to stable Since `/api/v1/write` is a mutating endpoint, we should still activate the remote-write-receiver explicitly. But we should do it in the same way as the other mutating endpoints, i.e. via a flag `--web.enable-remote-write-receiver`. This commit marks the feature flag as deprecated, i.e. it still works but logs a warning on startup. This enables users to seamlessly migrate. With the next minor release, we can start ignoring the feature flag (but still warn a user that is trying to use it). Signed-off-by: beorn7 <beorn@grafana.com>	2022-01-05 15:36:07 +01:00
David Leadbeater	a961062c37	Disable time based retention in tests (#8818 ) Fixes #7699. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2022-01-02 23:46:03 +01:00
Jessica G	174a1147d5	Merge pull request #9861 from JessicaGreben/minor-prom-improvements Add exit code constants in promtool	2021-12-31 12:07:02 -08:00
jessicagreben	4b03fa3100	replace config exit code with failure exit code Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-12-30 05:37:57 -08:00
beorn7	64c7bd2b08	Merge branch 'main' into sparsehistogram	2021-12-18 14:04:25 +01:00
jessicagreben	59f7ef06d0	update exit code for sd Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-12-18 04:45:15 -08:00
Nicholas Blott	c92673fb14	Remove check against cfg so interval/ timeout are always set (#10023 ) Signed-off-by: Nicholas Blott <blottn@tcd.ie>	2021-12-16 13:28:46 +01:00
beorn7	6f33ab2b35	Merge branch 'main' into sparsehistogram	2021-12-15 13:49:33 +01:00
Julien Pivotto	db1551bd21	Merge pull request #10016 from prometheus/release-2.32 Merge back release 2.32	2021-12-14 20:58:58 +01:00
Ben Ye	d9bbe7f3dd	mention agent mode in enable-feature flag help description (#9939 ) Signed-off-by: Ben Ye <ben.ye@bytedance.com>	2021-12-04 21:13:24 +01:00
zzehring	42628899b5	promtool: Add `--syntax-only` flag for `check config` This commit adds a `--syntax-only` flag for `promtool check config`. When passing in this flag, promtool will omit various file existence checks that would cause the check to fail (e.g. the check would not fail if `rule_files` files don't exist at their respective paths). This functionality will allow CI systems to check the syntax of configs without worrying about referenced files. Fixes: #5222 Signed-off-by: zzehring <zack.zehring@grafana.com>	2021-12-02 15:33:11 -05:00
jessicagreben	99bb56fc46	add errcodes from sd file Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-12-01 04:45:18 -08:00
beorn7	e8e9155a11	Merge branch 'main' into sparsehistogram	2021-11-30 18:22:37 +01:00
beorn7	e4e24453fa	Merge branch 'main' into beorn7/merge2	2021-11-30 17:19:06 +01:00
Filip Petkovski	5849521e90	promtool: Fix credentials file check (#9883 ) The promtool check config command still uses the bearer_token_file field which is deprecated in favour of authorization.credentials_file. This commit modifies the command to use the new field instead. Fixes #9874 Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	2021-11-30 15:02:07 +11:00
Björn Rabenstein	7e42acd3b1	tsdb: Rework iterators (#9877 ) - Pick At... method via return value of Next/Seek. - Do not clobber returned buckets. - Add partial FloatHistogram support. Note that the promql package is now _only_ dealing with FloatHistograms, following the idea that PromQL only knows float values. As a byproduct, I have removed the histogramSeries metric. In my understanding, series can have both float and histogram samples, so that metric doesn't make sense anymore. As another byproduct, I have converged the sampleBuf and the histogramSampleBuf in memSeries into one. The sample type stored in the sampleBuf has been extended to also contain histograms even before this commit. Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-29 13:24:23 +05:30
jessicagreben	764f2d03a5	add const for exit codes Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-11-24 09:17:49 -08:00
Matheus Alcantara	d9a8c453a0	cmd: use t.TempDir instead of ioutil.TempDir on tests (#9852 )	2021-11-23 20:09:28 -05:00
beorn7	5d4db805ac	Merge branch 'main' into sparsehistogram	2021-11-17 19:57:31 +01:00
beorn7	4c28d9fac7	Move to histogram.Histogram pointers This is to avoid copying the many fields of a histogram.Histogram all the time. This also fixes a bunch of formerly broken tests. Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-12 23:17:35 +01:00
Robert Fratto	72a9f7fee9	Share TSDB locker code with agent (#9623 ) * share tsdb db locker code with agent Closes #9616 Signed-off-by: Robert Fratto <robertfratto@gmail.com> * add flag to disable lockfile for agent Signed-off-by: Robert Fratto <robertfratto@gmail.com> * use agentOnlySetting instead of PreAction Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb: address review feedback 1. Rename Locker to DirLocker 2. Move DirLocker to tsdb/tsdbutil 3. Name metric using fmt.Sprintf 4. Refine error checking in DirLocker test Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb: create test utilities to assert expected DirLocker behavior Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb/tsdbutil: fix lint errors Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb/agent: fix windows test failure Use new DB variable instead of overriding the old one. Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2021-11-11 11:45:25 -05:00
Mateusz Gozdek	c08bb86be0	cmd/prometheus: use random listen port in TestStartupInterrupt test So it can be run in parallel safely. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-11 01:37:24 +01:00
Mateusz Gozdek	7bd7573891	cmd/prometheus/main_unix_test.go: fix unix test styling * Formatting of error message is missing a space after ':'. * t.Fatalf should be used instead of t.Errorf+return. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-11 01:37:24 +01:00
Robert Fratto	4a83e6f453	Remove agent mode warnings when loading configs (#9622 ) PR #9618 introduced failing to load the config file when agent mode is configured to run with unsupported settings. This made the block that logs a warning on their configuration a no-op, which is now removed. Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2021-11-10 19:39:30 +05:30
Mateusz Gozdek	fa1b14e146	cmd/prometheus: randomize test port and isolate test data directory Between the tests. This enables parallelizing those tests, which should cut the test execution time. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-10 09:40:43 +01:00
beorn7	c954cd9d1d	Move packages out of deprecated pkg directory This creates a new `model` directory and moves all data-model related packages over there: exemplar labels relabel rulefmt textparse timestamp value All the others are more or less utilities and have been moved to `util`: gate logging modetimevfs pool runtime Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-09 08:03:10 +01:00
Dieter Plaetinck	cda025b5b5	TSDB: demistify SeriesRefs and ChunkRefs (#9536 ) * TSDB: demistify seriesRefs and ChunkRefs The TSDB package contains many types of series and chunk references, all shrouded in uint types. Often the same uint value may actually mean one of different types, in non-obvious ways. This PR aims to clarify the code and help navigating to relevant docs, usage, etc much quicker. Concretely: * Use appropriately named types and document their semantics and relations. * Make multiplexing and demuxing of types explicit (on the boundaries between concrete implementations and generic interfaces). * Casting between different types should be free. None of the changes should have any impact on how the code runs. TODO: Implement BlockSeriesRef where appropriate (for a future PR) Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * feedback Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * agent: demistify seriesRefs and ChunkRefs Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-11-06 15:40:04 +05:30
Bartlomiej Plotka	789274bf9c	cmd: Fixed storage flag regression introduced in #9660 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-11-06 00:16:43 +01:00
Sunil Thaha	4bdaea7663	fix: storage.tsdb.path randomly initialised to data-agent/ (#9660 ) Using the same variable for storage.tsdb.path and storage.agent.path as below in main.go causes cfg.localStoragePath to be data/ or data-agent/ at random. a.Flag("storage.tsdb.path", "Base path for metrics storage."). PreAction(serverOnlySetting()). Default("data/").StringVar(&cfg.localStoragePath) a.Flag("storage.agent.path", "Base path for metrics storage."). PreAction(agentOnlySetting()). Default("data-agent/").StringVar(&cfg.localStoragePath) This patch fixes it by using a different variable for storage.agent.path Signed-off-by: Sunil Thaha sthaha@redhat.com Signed-off-by: Sunil Thaha <sthaha@redhat.com>	2021-11-04 10:08:01 +00:00
Bartlomiej Plotka	e68ccc7708	Fix misleading agent-only/server-only check messages. (#9650 ) * Fix misleading agent-only/server-only check messages. Issue: ``` [root@host01 ~]# docker run -it --net=host --rm -v /root/editor/prom-agent-batcopter.yaml:/etc/prometheus/prometheus.yaml -v /root/prom-batcopter-data:/prometheus -u root --name prom-agent-batcopter quay.io/prometheus/prometheus:main --enable-feature=agent --config.file=/etc/prometheus/prometheus.yaml --storage.tsdb.path=/prometheus --web.listen-address=:9091 ts=2021-11-02T16:00:59.789Z caller=main.go:205 level=info msg="Experimental agent mode enabled." The following flag(s) can not be used in agent mode: ["--enable-feature"] ``` Problem was that PreAction gives us all parsed flag. Context does not give us any info on what flag clause it was defined. Also added info for flag help about being server or agent only. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * gofumpt. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-11-04 09:08:53 +00:00
Mateusz Gozdek	c3beca72e2	cmd/prometheus: wait for Prometheus to shutdown in tests So temporary data directory can be successfully removed, as on Windows, a directory cannot be in use during removal. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 20:14:19 +01:00
Mateusz Gozdek	b7bdf6fab2	Fix imports formatting According to `2829908806 (r58457095)`. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Mateusz Gozdek	1a6c2283a3	Format Go source files using 'gofumpt -w -s -extra' Part of #9557 Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Julien Pivotto	807f46a1ed	Gate agent behind a feature flag, validate mode flags (#9620 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-11-02 13:03:35 +00:00
Darshan Chaudhary	a7e554b158	add check service-discovery command (#8970 ) Signed-off-by: darshanime <deathbullet@gmail.com>	2021-11-01 14:42:12 +01:00
Hu Shuai	4b799c361a	Fix typo in cmd/prometheus/main.go (#9632 ) Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>	2021-11-01 16:08:23 +05:30
Arthur Silva Sens	be2599c853	config: Make remote-write required for Agent mode (#9618 ) * config: Make remote-write required for Agent mode Signed-off-by: ArthurSens <arthursens2005@gmail.com>	2021-10-30 01:41:40 +02:00
Robert Fratto	bc72a718c4	Initial draft of prometheus-agent (#8785 ) * Initial draft of prometheus-agent This commit introduces a new binary, prometheus-agent, based on the Grafana Agent code. It runs a WAL-only version of prometheus without the TSDB, alerting, or rule evaluations. It is intended to be used to remote_write to Prometheus or another remote_write receiver. By default, prometheus-agent will listen on port 9095 to not collide with the prometheus default of 9090. Truncation of the WAL cooperates on a best-effort case with Remote Write. Every time the WAL is truncated, the minimum timestamp of data to truncate is determined by the lowest sent timestamp of all samples across all remote_write endpoints. This gives loose guarantees that data from the WAL will not try to be removed until the maximum sample lifetime passes or remote_write starts functioning. Signed-off-by: Robert Fratto <robertfratto@gmail.com> * add tests for Prometheus agent (#22) * add tests for Prometheus agent * add tests for Prometheus agent * rearranged tests as per the review comments * update tests for Agent * changes as per code review comments Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com> * incremental changes to prometheus agent Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com> * changes as per code review comments Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com> * Commit feedback from code review Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Port over some comments from grafana/agent Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Rename agent.Storage to agent.DB for tsdb consistency Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Consolidate agentMode ifs in cmd/prometheus/main.go Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Document PreAction usage requirements better for agent mode flags Signed-off-by: Robert Fratto <robertfratto@gmail.com> * remove unnecessary defaultListenAddr Signed-off-by: Robert Fratto <robertfratto@gmail.com> * `go fmt ./tsdb/agent` and fix lint errors Signed-off-by: Robert Fratto <robertfratto@gmail.com> Co-authored-by: SriKrishna Paparaju <paparaju@gmail.com>	2021-10-29 16:25:05 +01:00
David Leadbeater	c91c2bbea5	promtool: Show more human readable got/exp output (#8064 ) Avoid using %#v, nothing needs to parse this, so escaping " and so on leads to hard to read output. Add new lines, number and indentation to each alert series output. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2021-10-28 22:17:18 +11:00
DrAuYueng	69e309d202	Expose TargetsFromGroup/AlertmanagerFromGroup func and reuse this for (#9343 ) static/file sd config check in promtool Signed-off-by: DrAuYueng <ouyang1204@gmail.com>	2021-10-28 02:01:28 +02:00
Julien Pivotto	73255e15f6	Address golint failures from revive Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-23 00:53:11 +02:00
Will Tran	97b0738895	add --max-block-duration in promtool create-blocks-from rules (#9511 ) * support maxBlockDuration for promtool tsdb create-blocks-from rules Fixes #9465 Signed-off-by: Will Tran <will@autonomic.ai> * don't hardcode 2h as the default block size in rules test Signed-off-by: Will Tran <will@autonomic.ai>	2021-10-21 23:28:37 +02:00
Furkan Türkal	9d0058a09e	Bind port 0 in main_test (#9558 ) Fixes #9499 Signed-off-by: Furkan <furkan.turkal@trendyol.com>	2021-10-21 14:59:20 +02:00
Julien Pivotto	432005826d	Add a feature flag to enable the new discovery manager (#9537 ) * Add a feature flag to enable the new manager This PR creates a copy of the legacy manager and uses it by default. It is a companion PR to #9349. With this PR, users can enable the new discovery manager and provide us with any feedback / side effects that the new behaviour might have. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-20 10:15:54 +02:00
beorn7	a9008f5423	Merge branch 'main' into sparsehistogram	2021-10-19 17:14:23 +02:00
jessicagreben	60d0990886	add more explicit label values Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-10-18 01:04:13 +02:00
jessicagreben	3da87d2f39	add unit test to check label rule labels override Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-10-18 01:04:13 +02:00
Julien Pivotto	f8372bc6b9	backfill: Apply rule labels after query labels Fix #9419 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-18 01:04:13 +02:00
beorn7	7a8bb8222c	Style cleanup of all the changes in sparsehistogram so far A lot of this code was hacked together, literally during a hackathon. This commit intends not to change the code substantially, but just make the code obey the usual style practices. A (possibly incomplete) list of areas: * Generally address linter warnings. * The `pkg` directory is deprecated as per dev-summit. No new packages should be added to it. I moved the new `pkg/histogram` package to `model` anticipating what's proposed in #9478. * Make the naming of the Sparse Histogram more consistent. Including abbreviations, there were just too many names for it: SparseHistogram, Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in general. Only add "Sparse" if it is needed to avoid confusion with conventional Histograms (which is rare because the TSDB really has no notion of conventional Histograms). Use abbreviations only in local scope, and then really abbreviate (not just removing three out of seven letters like in "Histo"). This is in the spirit of https://github.com/golang/go/wiki/CodeReviewComments#variable-names * Several other minor name changes. * A lot of formatting of doc comments. For one, following https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences , but also layout question, anticipating how things will look like when rendered by `godoc` (even where `godoc` doesn't render them right now because they are for unexported types or not a doc comment at all but just a normal code comment - consistency is queen!). * Re-enabled `TestQueryLog` and `TestEndopints` (they pass now, leaving them disabled was presumably an oversight). * Bucket iterator for histogram.Histogram is now created with a method. * HistogramChunk.iterator now allows iterator recycling. (I think @dieterbe only commented it out because he was confused by the question in the comment.) * HistogramAppender.Append panics now because we decided to treat staleness marker differently. Signed-off-by: beorn7 <beorn@grafana.com>	2021-10-11 13:02:03 +02:00
beorn7	fd5ea4e0b5	Merge branch 'main' into sparsehistogram	2021-10-07 23:16:42 +02:00
Julien Pivotto	bd217c58a7	Backfill: Do not query after --end (#9340 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-15 16:02:41 +02:00
Julien Pivotto	1ea774f184	Merge pull request #9339 from roidelapluie/remove-double-align backfill: Do not align the start of the group since we align every rule.	2021-09-14 23:46:25 +02:00
Julien Pivotto	2bde71ec5f	Merge pull request #9338 from prometheus/release-2.30 merge back release 2.30	2021-09-14 23:46:11 +02:00
Julien Pivotto	691ce066fb	backfill: Do not align the start of the group since we align every rule. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-14 23:13:06 +02:00
jessicagreben	b0a21f9eab	rm overlap, add label builder to fix name bug Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-09-13 10:32:08 -07:00
Julien Pivotto	0111aa987e	Merge pull request #9312 from fpetkovski/promtool-analyze-compaction promtool: add extended flag for tsdb analysis	2021-09-08 17:27:01 +02:00
Julien Pivotto	48a101be1b	Allow to tune the scrape tolerance (#9283 ) * Allow to tune the scrape tolerance In most of the classic monitoring use cases, a few milliseconds difference can be omitted. In Prometheus, a few millisecond difference can however make a big difference. Currently, Prometheus will ignore up to 2 ms difference in the alignments. It turns out that for users who can afford a 10ms difference, there is a lot of resources and disk space to win, as shown in this graph, which shows the bytes / samples over a production Prometheus server. You can clearly see the switch from 2ms to 10ms tolerance. This pull request enables the adjustment of the scrape timestamp alignment tolerance. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix golint Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-08 17:27:33 +05:30
fpetkovski	449f874679	promtool: add extended flag for tsdb analysis The compaction analysis which runs under promtool tsdb analyze can be an intensive process which slows down the entire command. This commit adds an --extended flag to tsdb analyze which can be toggled for running long running tasks, such as compaction analysis. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	2021-09-08 10:50:01 +02:00
Julien Pivotto	ad642a85c0	Merge pull request #9304 from LeviHarrison/backfill-fix-date Rules backfill: fix new rule importer message	2021-09-07 18:01:03 +02:00
Julien Pivotto	bd24e2fb92	Merge pull request #9303 from LeviHarrison/backfill-return-1 Rules backfill: return 1 if unsuccessful	2021-09-07 18:00:42 +02:00
Levi Harrison	ded95ff434	Fix new rule importer message Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-09-06 22:19:29 -04:00
Levi Harrison	34e1b47968	Fixed error handling Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-09-06 21:55:57 -04:00
Holger Hans Peter Freyther	5edec40d60	promtool: Speed up checking for duplicate rules Trade space for speed. Convert all rules into our temporary struct, sort and then iterate. This is significant when there are many rules. Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>	2021-09-06 23:10:26 +08:00
Holger Hans Peter Freyther	3a309c1ae5	promtool: Add simple benchmark checkDuplicates benchmark Add a simple benchmark with a large number of rules. Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>	2021-09-06 23:10:26 +08:00
Holger Hans Peter Freyther	794937b3d6	promtool: Add testcase for detecting duplicates Introduce a basic test for checking for duplicate rules. Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>	2021-09-06 23:10:26 +08:00
SuperQ	31f4108758	Add scrape_timeout_seconds metric Add a new built-in metric `scrape_timeout_seconds` to allow monitoring of the ratio of scrape duration to the scrape timeout. Hide behind a feature flag to avoid additional cardinality by default. Signed-off-by: SuperQ <superq@gmail.com>	2021-09-02 12:15:35 +02:00
SuperQ	e167a45c65	Add new Go build tags. Add new go:build comments based on 1.17 formatting[0]. [0]: https://golang.org/doc/go1.17#gofmt Signed-off-by: SuperQ <superq@gmail.com>	2021-08-27 10:24:14 +02:00
Julien Pivotto	cab96a06ef	Merge release 2.29 in main (#9196 ) * PromQL: Fix start and end keywords masking label and metric names This commit fixes an issue with the "at modifier" that introduced two new keywords: `start` and `end`. In grouping options and in metric names, these keywords took precedence over metric or label names, so that those metrics and labels could no longer be referenced. Signed-off-by: Clayton Peters <clayton.peters@man.com> * Add in additional tests for metrics and/or labels called start/end. Signed-off-by: Clayton Peters <clayton.peters@man.com> * : Cut 2.29.0-rc.0 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> VERSION: bump to 2.29.0-rc.0 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> * Remove experimental wording on size-based retention Followup of #9004 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix PR reference in changelog Signed-off-by: George Brighton <george@gebn.co.uk> * Describe EC2 availability zone IDs at most once per refresh (#9142) Signed-off-by: George Brighton <george@gebn.co.uk> * Describe EC2 availability zones at most once per SD load Closes #9142. Signed-off-by: George Brighton <george@gebn.co.uk> * Incorporate feedback Signed-off-by: George Brighton <george@gebn.co.uk> * Integrate feedback Signed-off-by: George Brighton <george@gebn.co.uk> * Add a compatibility note for macOS users. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * : Cut v2.29.0-rc.1 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> Fix `kuma_sd` targetgroup reporting (#9157) * Bundle all xDS targets into a single group Signed-off-by: austin ce <austin.cawley@gmail.com> * : cut v2.29.0-rc.2 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> Rename links Signed-off-by: Levi Harrison <git@leviharrison.dev> * bump codemirror-promql to 0.17.0 Signed-off-by: Augustin Husson <husson.augustin@gmail.com> * : cut v2.29.0 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> tsdb: align atomically accessed int64 (#9192) This prevents a panic in 32-bit archs: https://pkg.go.dev/sync/atomic#pkg-note-BUG Fixed #9190 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Release 2.29.1 (#9193) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> Co-authored-by: Clayton Peters <clayton.peters@man.com> Co-authored-by: Frederic Branczyk <fbranczyk@gmail.com> Co-authored-by: George Brighton <george@gebn.co.uk> Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com> Co-authored-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Augustin Husson <husson.augustin@gmail.com>	2021-08-12 18:38:06 +02:00
Ganesh Vernekar	095f572d4a	Sync sparsehistogram branch with main (#9189 ) * Fix `kuma_sd` targetgroup reporting (#9157) * Bundle all xDS targets into a single group Signed-off-by: austin ce <austin.cawley@gmail.com> * Snapshot in-memory chunks on shutdown for faster restarts (#7229) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Rename links Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove Individual Data Type Caps in Per-shard Buffering for Remote Write (#8921) * Moved everything to nPending buffer Signed-off-by: Levi Harrison <git@leviharrison.dev> * Simplify exemplar capacity addition Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added pre-allocation Signed-off-by: Levi Harrison <git@leviharrison.dev> * Don't allocate if not sending exemplars Signed-off-by: Levi Harrison <git@leviharrison.dev> * Avoid deadlock when processing duplicate series record (#9170) * Avoid deadlock when processing duplicate series record `processWALSamples()` needs to be able to send on its output channel before it can read the input channel, so reads to allow this in case the output channel is full. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * processWALSamples: update comment Previous text seems to relate to an earlier implementation. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Optimise WAL loading by removing extra map and caching min-time (#9160) * BenchmarkLoadWAL: close WAL after use So that goroutines are stopped and resources released Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * BenchmarkLoadWAL: make series IDs co-prime with #workers Series are distributed across workers by taking the modulus of the ID with the number of workers, so multiples of 100 are a poor choice. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * BenchmarkLoadWAL: simulate mmapped chunks Real Prometheus cuts chunks every 120 samples, then skips those samples when re-reading the WAL. Simulate this by creating a single mapped chunk for each series, since the max time is all the reader looks at. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Fix comment Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Remove series map from processWALSamples() The locks that is commented to reduce contention in are now sharded 32,000 ways, so won't be contended. Removing the map saves memory and goes just as fast. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * loadWAL: Cache the last mmapped chunk time So we can skip calling append() for samples it will reject. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Improvements from code review Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Full stops and capitals on comments Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Cache max time in both places mmappedChunks is updated Including refactor to extract function `setMMappedChunks`, to reduce code duplication. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Update head min/max time when mmapped chunks added This ensures we have the correct values if no WAL samples are added for that series. Note that `mSeries.maxTime()` was always `math.MinInt64` before, since that function doesn't consider mmapped chunks. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Split Go and React Tests (#8897) * Added go-ci and react-ci Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove search keymap from new expression editor (#9184) Signed-off-by: Julius Volz <julius.volz@gmail.com> Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com> Co-authored-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Co-authored-by: Bryan Boreham <bjboreham@gmail.com> Co-authored-by: Julius Volz <julius.volz@gmail.com>	2021-08-11 15:43:17 +05:30
Ganesh Vernekar	ee7e0071d1	Snapshot in-memory chunks on shutdown for faster restarts (#7229 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-06 17:51:01 +01:00
Ganesh Vernekar	8b70e87ab9	Merge remote-tracking branch 'upstream/main' into sparse-refactor Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-05 12:16:08 +05:30
jinglina	ed24e51e7c	remove redundant type conversion (#9126 ) Signed-off-by: jinglina <jinglinax@163.com>	2021-07-28 13:33:46 +05:30
Julien Pivotto	04f33e88f7	Merge pull request #9121 from LeviHarrison/revert-klog-fix Revert klog fix	2021-07-27 14:07:59 +02:00
Levi Harrison	58556c19be	Revert "Fix logging after the move to go-kit/log (#9021 )" This reverts commit `642722e5d0`. Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-07-27 07:37:03 -04:00
Ganesh Vernekar	507d61fdeb	Remove experimental tag on `--storage.tsdb.allow-overlapping-blocks` (#9117 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-27 14:38:20 +05:30
Martin Disibio	1bcd13d6b5	Exemplar resize (#8974 ) * Create experimental circular buffer resize method, benchmarks Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Optimize exemplar resize to only replay as many exemplars as needed Signed-off-by: Martin Disibio <mdisibio@gmail.com> * More comments, benchmark AddExemplar Signed-off-by: Martin Disibio <mdisibio@gmail.com> * optimizations Signed-off-by: Martin Disibio <mdisibio@gmail.com> * comment Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Slight refactor of resize benchmark + make use of resize via runtime reloadable storage config. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Some more config related changes. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address some review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address more review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Refactor to remove usage of noopExemplarStorage and avoid race condition when resizing from Head code. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix or add comments to clarify some of the new behaviour. Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix potential panics related to negative exemplar buffer lengths Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Callum Styan <callumstyan@gmail.com>	2021-07-20 10:22:57 +05:30
Levi Harrison	3b5257d869	Changed disabled_features to feature_flags Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-07-13 22:03:51 -04:00
Ganesh Vernekar	78d68d5972	Make query_range serve histograms (#9036 ) * Modify query_range to serve only sparse histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Finish CumulativeExpandSparseHistogram for positive schema Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix bug and comment out tests for query_range Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint 2 Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-03 19:23:56 +05:30
Filip Petkovski	7c125aa5fb	Promtool: Add support for compaction analysis (#8940 ) * Extend promtool to support compaction analysis This commit extends the promtool tsdb analyze command to help troubleshoot high Prometheus disk usage. The command now plots a distribution of how full chunks are relative to the maximum capacity of 120 samples per chunk. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com> * Update cmd/promtool/tsdb.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-07-02 11:08:52 +01:00
Julius Volz	441e6cd7d6	Merge release-2.28 back into main (#9035 ) * Cut v2.28.0-rc.0 (#8954) * Cut v2.28.0-rc.0 Signed-off-by: Julius Volz <julius.volz@gmail.com> * Changelog fixup Signed-off-by: Julius Volz <julius.volz@gmail.com> * Address review comments Signed-off-by: Julius Volz <julius.volz@gmail.com> * Downgrade some features to enhancements Signed-off-by: Julius Volz <julius.volz@gmail.com> * Adjust release date to today Signed-off-by: Julius Volz <julius.volz@gmail.com> * Migrate HTTP SD docs from docs repo (#8972) See discussion in https://github.com/prometheus/docs/pull/1975 Signed-off-by: Julius Volz <julius.volz@gmail.com> * Cut Prometheus v2.28.0 (#8973) Signed-off-by: Julius Volz <julius.volz@gmail.com> * HTTP SD: Allow charset in content type (#8981) * Added content type regex Signed-off-by: Levi Harrison <git@leviharrison.dev> Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * fixed disappeared target groups in http_sd #9019 Signed-off-by: servak <fservak@gmail.com> * Add a testcase for http-sd Signed-off-by: servak <fservak@gmail.com> * HTTP SD: Simplify logic of disappeared targetgroups (#9026) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix logging after the move to go-kit/log (#9021) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Cut Prometheus v2.28.1 (#9034) Signed-off-by: Julius Volz <julius.volz@gmail.com> Co-authored-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Co-authored-by: servak <fservak@gmail.com>	2021-07-01 18:02:13 +02:00
Levi Harrison	90976e7505	Promtool: Add feature flags to unit tests (#8958 ) * Added feature flag support to unit tests Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added/fixed tests Signed-off-by: Levi Harrison <git@leviharrison.dev> * Addressed review comments Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-30 22:43:39 +01:00
Ankit Goel	d437cee73a	Move storage.tsdb.retention.size out of experimental #8728 (#9004 ) * Move storage.tsdb.retention.size out of experimental #8728 Signed-off-by: Ankit Goel <ankit.goel@deliveryhero.com>	2021-06-30 01:30:11 +02:00
Levi Harrison	ca1896c15b	Promtool: Validate service discovery files (#8950 ) * Check SD files in promtool Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-29 17:32:59 +02:00
Ganesh Vernekar	04ad56d9b8	Append sparse histograms into the Head block (#9013 ) * Append sparse histograms into the Head block Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add AtHistogram() to Iterator interface. Make HistoChunk conform to Chunk interface. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-29 20:08:46 +05:30
Steve Kuznetsov	fd6c852567	promtool: backfill: allow configuring block duration (#8919 ) * promtool: backfill: allow configuring block duration When backfilling large amounts of data across long periods of time, it may in certain circumstances be useful to use a longer block duration to increase the efficiency and speed of the backfilling process. This patch adds a flag --block-duration-power to allow a user to choose the power N where the block duration is 2^(N+1)h. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com> * promtool: use sub-tests in backfill testing Signed-off-by: Steve Kuznetsov <skuznets@redhat.com> * backfill: add messages to tests for clarity When someone new breaks a test, seeing "expected: false, got: true" is really not useful. A nice message helps here. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com> * backfill: test long block durations A test that uses a long block duration to write bigger blocks is added. The check to make sure all blocks are the default duration is removed. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>	2021-06-29 14:53:38 +05:30
Ganesh Vernekar	64bea6999e	HistogramAppender interface for sparse histograms (#9007 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-28 20:30:55 +05:30
Ben Kochie	7cb55d5732	Merge pull request #8802 from mwasilew2/yaml-linting Adds yamllinting to Makefile.common	2021-06-24 15:59:35 +02:00
Julien Pivotto	ba76bceb6b	Merge pull request #8917 from stevekuznetsov/skuznets/silence-backfill promtool: backfill: allow silencing output	2021-06-14 23:27:18 +02:00
Michal Wasilewski	3f686cad8b	fixes yamllint errors Signed-off-by: Michal Wasilewski <mwasilewski@gmx.com>	2021-06-12 12:47:47 +02:00
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-11 12:28:36 -04:00
Steve Kuznetsov	ee771a2a66	promtool: backfill: allow silencing output When using the backfill command to add data to an ephemeral/test Prometheus instance, it is not important to see which data was added as it is often generated ahead of time and mostly irrelevant to the use-case. The current approach prints information about each block that is written, but does so in a generally inefficient and costly manner. This patch adds a `--quiet` flag that allows a user to opt out of this behavior. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>	2021-06-10 15:31:16 -07:00
Levi Harrison	7bc11dcb06	React UI: Add Starting Screen (#8662 ) * Added walreplay API endpoint Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added starting page to react-ui Signed-off-by: Levi Harrison <git@leviharrison.dev> * Documented the new endpoint Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed typos Signed-off-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julius Volz <julius.volz@gmail.com> * Removed logo Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed isResponding to isUnexpected Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed width of progress bar Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed width of progress bar Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added DB stats object Signed-off-by: Levi Harrison <git@leviharrison.dev> * Updated starting page to work with new fields Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (pt. 2) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (pt. 3) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (and also implementing a method this time) (pt. 4) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (and also implementing a method this time) (pt. 5) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed const to let Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (pt. 
6) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove SetStats method Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added comma Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed api Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed to triple equals Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed data response types Signed-off-by: Levi Harrison <git@leviharrison.dev> * Don't return pointer Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed version Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed interface issue Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed pointer Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed copying lock value error Signed-off-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julius Volz <julius.volz@gmail.com>	2021-06-05 15:29:32 +01:00
Levi Harrison	17ea8d006a	Added external URL access Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-05-30 23:35:26 -04:00
Bartlomiej Plotka	80545bfb2e	Instrumented circular exemplar storage. (#8712 ) * Instrumented circular storage. Fixes: https://github.com/prometheus/prometheus/issues/8708 Fixes: https://github.com/prometheus/prometheus/issues/8707 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Fixed CB. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed Julien comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed Callum comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-04-16 13:44:53 +01:00
nberkley	f9e2dd0697	Add support for smaller block chunk segment allocations (#8478 ) * Add support for --storage.tsdb.max-chunk-size to support small chunks for space limited prometheus instances. Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update tsdb/compact.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update tsdb/db.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update cmd/prometheus/main.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Change naming scheme to Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Add a lower bound to --storage.tsdb.max-block-chunk-segment-size Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update storage.md to explain what a chunk segment is Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Apply suggestions from code review Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Force tests Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Fix code style Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>	2021-04-15 14:25:01 +05:30
Julien Pivotto	ae73a6296a	Merge pull request #8683 from cuirunxing-hub/main typos correct	2021-04-02 20:14:55 +02:00
cuirunxing-hub	57bc2e94e2	typos correct Signed-off-by: cuirunxing-hub <cuirunxing@inspur.com>	2021-04-02 09:03:00 +08:00
Jess G	731545ad34	Add documentation for recording rule backfiller (#8674 ) * add docs for rule backfiller Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-04-01 22:38:00 +02:00
Julien Pivotto	e635ca834b	Add environment variable expansion in external label values Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-03-30 01:36:28 +02:00
Björn Rabenstein	9549a15c6f	Merge pull request #7675 from JessicaGreben/jg/11-retroactive-rule-eval Add rule importer to backfill	2021-03-29 19:09:21 +02:00
jessicagreben	896c828bb5	close writer after flush Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-29 06:45:12 -07:00
jessicagreben	d89a1d999f	add log with start/end times, close blocks before end of func Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-28 12:13:58 -07:00
Ben Kochie	f0bccba1c3	Update Go modules for 2.26 (#8636 ) * Update Go modules for 2.26 Bump all Go modules to the latest upstream. Signed-off-by: Ben Kochie <superq@gmail.com> * Fix promtool for new client_golang LabelValues now requires a list of string matchers. Signed-off-by: Ben Kochie <superq@gmail.com>	2021-03-24 09:41:12 +00:00
Julien Pivotto	c0c36b1155	Improve promql-negative-offset docs (#8631 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-03-22 10:16:43 +01:00
jessicagreben	8de4da3716	add changes per comments, fix tests Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-20 12:38:30 -07:00
Callum Styan	289ba11b79	Add circular in-memory exemplars storage (#6635 ) * Add circular in-memory exemplars storage Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com> Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Signed-off-by: Martin Disibio <mdisibio@gmail.com> Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> Co-authored-by: Martin Disibio <mdisibio@gmail.com> * Fix some comments, clean up exemplar metrics struct and exemplar tests. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix exemplar query api null vs empty array issue. Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> Co-authored-by: Martin Disibio <mdisibio@gmail.com>	2021-03-16 15:17:45 +05:30
jessicagreben	e3a8132bb3	fix block alignment, add sample alignment Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-15 12:44:58 -07:00
jessicagreben	7c26642460	add block alignment and write in 2 hr blocks Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-14 10:10:55 -07:00
Julien Pivotto	63ea88af82	Merge pull request #8575 from pfreixes/add-scrapes-parameter Add num scrapes as tsdb write benchmark command flag	2021-03-11 13:09:50 +01:00
Pau Freixes	b1ac4a45e6	Add num scrapes as tsdb write benchmark command flag By default same value that was hardcoded is used, but with the new flag added the number of scrapes can be increased to any value. Signed-off-by: Pau Freixes <pfreixes@gmail.com>	2021-03-10 11:17:07 +01:00
Julien Pivotto	ad5ed416ba	Merge pull request #8487 from pschou/dev_neg_offset allow negative offset	2021-03-08 22:18:45 +01:00
Julien Pivotto	5742a18590	Fix subqueries with default resolution in promql unit tests Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-03-07 09:20:04 +01:00
jessicagreben	9fc53b7edf	fix appender.Add -> appender.Append Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-01 05:49:49 -08:00
Arthur Silva Sens	537c0aff49	Prometheus and Promtool binaries now print help and usage to stdout (#8542 ) Signed-off-by: ArthurSens <arthursens2005@gmail.com>	2021-02-25 19:52:34 +01:00
jessicagreben	78e84aed89	resolve merge conflict Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-02-24 09:47:29 -08:00
jessicagreben	f2db9dc722	add multi rule integration tests Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-02-24 09:42:31 -08:00
pschou	f80b52be69	Merge branch 'main' into dev_neg_offset	2021-02-23 20:52:57 -05:00
schou	22cd48868a	adding feature flag, promql-negative-offset Signed-off-by: schou <pschou@users.noreply.github.com>	2021-02-23 20:25:56 -05:00
Julien Pivotto	8c8de46003	Merge pull request #8036 from dgl/promtool-alert-err promtool: Don't end alert tests early, in some failure situations	2021-02-20 22:35:00 +01:00
Tom Wilkie	7369561305	Combine Appender.Add and AddFast into a single Append method. (#8489 ) This moves the label lookup into TSDB, whilst still keeping the cached-ref optimisation for repeated Appends. This makes the API easier to consume and implement. In particular this change is motivated by the scrape-time-aggregation work, which I don't think is possible to implement without it as it needs access to label values. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2021-02-18 17:37:00 +05:30
Julien Pivotto	1fac1c783b	Merge pull request #8504 from rbauduin/require_alertname promtool: alert_rule_test items require alertname	2021-02-17 22:07:52 +01:00
Julien Pivotto	2d172d0896	Merge pull request #8508 from prometheus/release-2.25 Merge back release 2.25	2021-02-17 16:26:34 +01:00
Raphael Bauduin	a7d64cad21	promtool: alert_rule_test items require alertname Accepting alert_rule_test without alertname is confusing as it will always pass with empty exp_alerts, and never with non-empty exp_alerts. Signed-off-by: Raphael Bauduin <raphael.bauduin@tessares.net>	2021-02-17 16:23:12 +01:00
Ganesh Vernekar	c4536fa28c	Increase block writer size for backfilling Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2021-02-17 15:45:41 +05:30
Julien Pivotto	a419b75abd	Merge pull request #8485 from hryniuk/promtool-query-errors-details Print details of API errors received by promtool	2021-02-16 22:47:08 +01:00
Łukasz Hryniuk	ab41de68b4	Print details of API errors Signed-off-by: Łukasz Hryniuk <code@hryniuk.pl>	2021-02-15 23:42:06 +01:00
David Leadbeater	3e30f72af1	promtool: Add more negative alert tests Signed-off-by: David Leadbeater <dgl@dgl.cx>	2021-02-15 17:00:49 +00:00
Julien Pivotto	e29b47b39e	Merge pull request #8440 from mishamo/master Add optional name property to testgroup for better test failure output	2021-02-09 21:23:24 +01:00
misha	1c3e7b4241	Use strings.Builder for neater error formatting Signed-off-by: misha <DL-OTTCloudPlatform-Nova@bskyb.internal>	2021-02-09 15:00:26 +00:00
Tom Wilkie	d479151f1f	Various enhancements and refactorings for remote write receiver: - Remove unrelated changes - Refactor code out of the API module - that is already getting pretty crowded. - Don't track reference for AddFast in remote write. This has the potential to consume unlimited server-side memory if a malicious client pushes a different label set for every series. For now, it's easier and safer to always use the 'slow' path. - Return 400 on out of order samples. - Use remote.DecodeWriteRequest in the remote write adapters. - Put this behind the 'remote-write-server' feature flag - Add some (very) basic docs. - Used named return & add test for commit error propagation Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2021-02-08 20:41:23 +00:00
fuling	72475b8a0c	[ENHANCEMENT] remote storage:Add default api implementation of remote write Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>	2021-02-07 18:12:48 +00:00
misha	c2c5aeb16b	Add optional name property to testgroup for better test failure output Signed-off-by: misha <DL-OTTCloudPlatform-Nova@bskyb.internal>	2021-02-04 10:07:22 +00:00
Julien Pivotto	c1f8bd9944	Merge pull request #8432 from roidelapluie/backfillpanic backfill: move checkErr before we close the mmapped file	2021-02-03 16:32:35 +01:00
Julien Pivotto	9334269f2b	backfill: move checkErr before we close the mmapped file When printing the error, we still need access to the mmapped byte array of the file. Therefore, we make sure that we run it before closing the file. I could have done something more complex like a defer, or not closing the file, knowing that we would exit the program anyway. However, I think that in case we extend this in the future, or this is copy/pasted elsewhere, we should continue closing the file. As it is small enough, I went for the solution to call the function 3 times instead of playing with a defer. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-02-01 21:18:42 +01:00
Jeremy Albinet	4a1f2c097e	Typo on plural in checkRules/checkDuplicates Signed-off-by: Jeremy Albinet <jalbinet@synthesio.com>	2021-02-01 15:43:05 +01:00
Julien Pivotto	2316062d4e	Deprecate --alertmanager.timeout Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-01-25 12:36:13 +01:00
Ganesh Vernekar	9199fcb8d1	'@ <timestamp>' modifier (#8121 ) This commit adds `@ <timestamp>` modifier as per this design doc: https://docs.google.com/document/d/1uSbD3T2beM-iX4-Hp7V074bzBRiRNlqUdcWP6JTDQSs/edit. An example query: ``` rate(process_cpu_seconds_total[1m]) and topk(7, rate(process_cpu_seconds_total[1h] @ 1234)) ``` which ranks based on last 1h rate and w.r.t. unix timestamp 1234 but actually plots the 1m rate. Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2021-01-20 16:27:39 +05:30
Julien Pivotto	ac2626757c	Update exporter-toolkit to 0.5.0 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-01-13 21:49:54 +01:00
Guangwen Feng	2df1a482da	Fix misspelled word in comment (#8348 ) Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>	2021-01-07 10:01:08 +00:00
Julien Pivotto	bc9f9ee3aa	Backfilling: fast-path for non-consecutive blocks (#8324 ) * Backfilling: optimize for non-consecutive blocks When you have missing data for > 2 hours, you spend a lot of time re-reading the complete file. It is not optimal. This introduces a fastpath for this scenario. Next, we do parse the metric even when we know we will not use it, based on its timestamp. This only computes the metric when we know its timestamp is right. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-30 02:06:41 +01:00
Julien Pivotto	003d6451fc	Promtool: add web config validation Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-29 16:55:29 +01:00
Julien Pivotto	5b4f46a348	Add TLS and basic authentication Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-28 21:33:44 +01:00
Ben Kochie	5055dfbbe4	Listen on web early in startup Avoid starting up components like the TSDB if we can't bind to the web listening port. Signed-off-by: Ben Kochie <superq@gmail.com>	2020-12-28 20:13:05 +01:00
beorn7	6bfa33308e	promtool: Print block meta-data slightly more nicely I initially thought I could somehow rescue the current column layout by recycling the tabwriter, but flushing completely blanks it. However, by setting a minimum width of 13, we get a slightly broader DURATION column but otherwise nice formatting, unless numbers get really big, but that's OK, I guess. Before: ``` BLOCK ULID MIN TIME MAX TIME DURATION NUM SAMPLES NUM CHUNKS NUM SERIES SIZE 01ETN0KGNP5WWK9T5QMQGBG9F1 2020-11-19 07:39:17 +0000 UTC 2020-11-19 07:44:17 +0000 UTC 5m0.001s 8 2 2 624B 01ETN0KGQSFF0AB2QDZVQG3CWC 2020-11-19 10:25:57 +0000 UTC 2020-11-19 10:30:57 +0000 UTC 5m0.001s 8 2 2 622B 01ETN0KGSW8KYP3YPG4X20P60Z 2020-11-19 13:12:37 +0000 UTC 2020-11-19 13:17:37 +0000 UTC 5m0.001s 8 2 2 625B ``` After: ``` BLOCK ULID MIN TIME MAX TIME DURATION NUM SAMPLES NUM CHUNKS NUM SERIES SIZE 01ETN0R72SXN9A1FG732P7KFFN 2020-11-19 07:39:17 +0000 UTC 2020-11-19 07:44:17 +0000 UTC 5m0.001s 8 2 2 624B 01ETN0R74Y9AG1A1MKN4MZK7WM 2020-11-19 10:25:57 +0000 UTC 2020-11-19 10:30:57 +0000 UTC 5m0.001s 8 2 2 622B 01ETN0R76KXZ5VQECMDNES49J6 2020-11-19 13:12:37 +0000 UTC 2020-11-19 13:17:37 +0000 UTC 5m0.001s 8 2 2 625B ``` After without the `-r` flag: ``` BLOCK ULID MIN TIME MAX TIME DURATION NUM SAMPLES NUM CHUNKS NUM SERIES SIZE 01ETN0RFFJ42274NWR1GH0RTV6 1605771557000 1605771857001 5m0.001s 8 2 2 624 01ETN0RFJ1MZCHHS2SBZS8XC27 1605781557000 1605781857001 5m0.001s 8 2 2 622 01ETN0RFM98N3V4KD2DZXFGHGN 1605791557000 1605791857001 5m0.001s 8 2 2 625 ``` Signed-off-by: beorn7 <beorn@grafana.com>	2020-12-28 16:55:12 +01:00
beorn7	651b57b9ab	Merge branch 'backfillhr' of git://github.com/roidelapluie/prometheus into review	2020-12-28 16:18:00 +01:00
yeya24	cedd2dbec9	create output directory before backfilling Signed-off-by: yeya24 <yb532204897@gmail.com>	2020-12-24 23:36:36 -05:00
Julien Pivotto	53480c168d	Backfill: print created blocks only, add human-readable option Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-23 20:42:30 +01:00
AdaephonBen	dca6954b0a	promtool: Add URL scheme when not provided (#7956 ) Signed-off-by: AdaephonBen <ma18btech11011@iith.ac.in>	2020-12-23 19:52:04 +01:00
lzhfromustc	27a6e1e174	test: add buffer to channel to avoid goroutine leak (#8274 ) Signed-off-by: lzhfromustc <lzhfromustc@gmail.com>	2020-12-10 09:09:21 +00:00
Julien Pivotto	7957731339	Inline defer Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-09 09:23:39 +01:00
Julien Pivotto	82b5f1d8b1	Backfill: Use mmap to reuse parser code Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-08 23:48:31 +01:00
jessicagreben	e32e4fcc53	fix unit test Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-30 11:02:45 -08:00
jessicagreben	cec3515fa3	fix linter Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-30 08:17:51 -08:00
jessicagreben	2e9946e4d7	add test Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-28 07:58:33 -08:00
jessicagreben	ac06d0a657	merge master/resolve conflict Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-26 08:43:07 -08:00
jessicagreben	ee85c22adb	flush samples to disk every 5k samples Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-26 08:30:06 -08:00
Atibhi Agrawal	b317b6ab9c	Backfill from OpenMetrics format (#8084 ) * get parser working Signed-off-by: aSquare14 <atibhi.a@gmail.com> * import file created Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Find min and max ts Signed-off-by: aSquare14 <atibhi.a@gmail.com> * make two passes over file and write to tsdb Signed-off-by: aSquare14 <atibhi.a@gmail.com> * print error messages Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix Max and Min initializer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Start with unit tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * reset file read Signed-off-by: aSquare14 <atibhi.a@gmail.com> * align blocks to two hour range Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add cleanup test Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove .ds_store Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add license to import_test Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix Circle CI error Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Refactor code Move backfill from tsdb to promtool directory Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix gitignore Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Remove panic Rename ContenType Signed-off-by: aSquare14 <atibhi.a@gmail.com> * adjust mint Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix return statement Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix go modules Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Added unit test for backfill Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix CI error Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix file handling Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Close DB Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Close directory Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Error Handling Signed-off-by: aSquare14 <atibhi.a@gmail.com> * inline err Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix command line flags Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add spaces before func fix pointers 
Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add defer'd calls Signed-off-by: aSquare14 <atibhi.a@gmail.com> * move openmetrics.go content to backfill Signed-off-by: aSquare14 <atibhi.a@gmail.com> * changed args to flags Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add tests for wrong OM files Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Added additional tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add comment to warn of func reuse Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Make input required in main.go Signed-off-by: aSquare14 <atibhi.a@gmail.com> * defer blockwriter close Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix defer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * defer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Remove contentType Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove defer from backfilltest Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix defer remove in backfill_test Signed-off-by: aSquare14 <atibhi.a@gmail.com> * changes to fix CI errors Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix go.mod Signed-off-by: aSquare14 <atibhi.a@gmail.com> * change package name Signed-off-by: aSquare14 <atibhi.a@gmail.com> * assert->require Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove todo Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix format Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix todo Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix createblock Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix defer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix return Signed-off-by: aSquare14 <atibhi.a@gmail.com> * check err for anon func Signed-off-by: aSquare14 <atibhi.a@gmail.com> * change comments Signed-off-by: aSquare14 <atibhi.a@gmail.com> * update comment Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix for the Flush Bug Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix formatting, comments, names Signed-off-by: aSquare14 
<atibhi.a@gmail.com> * Print Blocks Signed-off-by: aSquare14 <atibhi.a@gmail.com> * cleanup Signed-off-by: aSquare14 <atibhi.a@gmail.com> * refactor test to take care of multiple samples Signed-off-by: aSquare14 <atibhi.a@gmail.com> * refactor tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove om Signed-off-by: aSquare14 <atibhi.a@gmail.com> * I dont know what I fixed Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix tests, add test description, print blocks Signed-off-by: aSquare14 <atibhi.a@gmail.com> * commit after 5000 samples Signed-off-by: aSquare14 <atibhi.a@gmail.com> * reviews part 1 Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Series Count Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix CI Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove extra func Signed-off-by: aSquare14 <atibhi.a@gmail.com> * make timestamp into sec Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Reviews 2 Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add Todo Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fixes Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fixes reviews Signed-off-by: aSquare14 <atibhi.a@gmail.com> * =0 Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove backfill.om Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add global err var, remove stuff Signed-off-by: aSquare14 <atibhi.a@gmail.com> * change var name Signed-off-by: aSquare14 <atibhi.a@gmail.com> * sampleLimit pass as parameter Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add test when number of samples greater than batch size Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Change name of batchsize Signed-off-by: aSquare14 <atibhi.a@gmail.com> * revert export Signed-off-by: aSquare14 <atibhi.a@gmail.com> * nits Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add comment, remove newline,consistent err Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Print Blocks 
Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Modify comments Signed-off-by: aSquare14 <atibhi.a@gmail.com> * db.Querier Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add sanity check , get maxt and mint Signed-off-by: aSquare14 <atibhi.a@gmail.com> * ci error Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix Signed-off-by: aSquare14 <atibhi.a@gmail.com> * comment change Signed-off-by: aSquare14 <atibhi.a@gmail.com> * nits Signed-off-by: aSquare14 <atibhi.a@gmail.com> * NoError Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix Signed-off-by: aSquare14 <atibhi.a@gmail.com>	2020-11-26 10:37:06 +05:30
jessicagreben	5dd3577424	change name of promtool subcommand to create-blocks-from Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-22 15:05:02 -08:00
jessicagreben	19dee0a569	add name and labels to metric, eval all rules for each block Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-22 14:24:38 -08:00
gotjosh	4eca4dffb8	Allow metric metadata to be propagated via Remote Write. (#6815 ) * Introduce a metadata watcher Similarly to the WAL watcher, its purpose is to observe the scrape manager and pull metadata. Then, send it to a remote storage. Signed-off-by: gotjosh <josue@grafana.com> * Additional fixes after rebasing. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Rework samples/metadata metrics. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Use more descriptive variable names in MetadataWatcher collect. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix issues caused during rebasing. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix missing metric add and unneeded config code. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address some review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix metrics and docs Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Replace assert with require Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Bring back max_samples_per_send metric Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Fix tests Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Co-authored-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2020-11-19 20:53:03 +05:30
jessicagreben	75654715d3	fix panics Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-01 07:54:04 -08:00
jessicagreben	61c9a89120	use milliseconds for blocksize Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-10-31 07:11:54 -07:00
jessicagreben	6980bcf671	unexport backfiller Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-10-31 06:40:56 -07:00
jessicagreben	3ed6457dd4	use blockwriter, rm multiwriter code Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-10-31 06:32:07 -07:00
Julien Pivotto	6c56a1faaa	Testify: move to require (#8122 ) * Testify: move to require Moving testify to require to fail tests early in case of errors. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * More moves Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-29 09:43:23 +00:00
Bartlomiej Plotka	3d8826a3d4	MultiError: Refactored MultiError for more concise and safe usage. (#8066 ) * MultiError: Refactored MultiError for more concise and safe usage. * Less lines * Goland IDE was marking every usage of old MultiError "potential nil" error * It was easy to forget using Err() when error was returned, now it's safely assured on compile time. NOTE: Potentially I would rename package to merrors. (: In different PR. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed review comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Fix after rebase. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-10-28 15:24:58 +00:00
Julien Pivotto	1282d1b39c	Refactor test assertions (#8110 ) * Refactor test assertions This pull request gets rid of assert.True where possible to use fine-grained assertions. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-27 11:06:53 +01:00
David Leadbeater	e7e60623ff	promtool: Calculate mint and maxt per test (#8096 ) * promtool: Calculate mint and maxt per test Previously a single test that used a later eval time would make all other tests in the file share the [mint, maxt] and potentially evaluate far more samples than needed. Fixes: #8019 Signed-off-by: David Leadbeater <dgl@dgl.cx>	2020-10-24 12:03:55 +01:00
Julien Pivotto	4e5b1722b3	Move away from testutil, refactor imports (#8087 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-22 11:00:08 +02:00
jessicagreben	36ac0b68f1	merge master, fix conflicts	2020-10-17 08:20:21 -07:00
Björn Rabenstein	71577e45eb	Merge pull request #8044 from prometheus/beorn7/metrics Instrumentation: Report valid configs in the respective metrics from the beginning	2020-10-12 23:32:02 +02:00
Arthur Silva Sens	4f45e201cc	Promtool tsdb list now prints block sizes (#7993 ) * promtool tsdb list now prints blocks' size Signed-off-by: arthursens <arthursens2005@gmail.com>	2020-10-12 23:15:40 +02:00
beorn7	0f3c1bf6cf	Report valid configs in the respective metrics from the beginning In #7399, an early validity check of the config was introduced to prevent the scenario where an invalid config is only detected after a possibly very long startup procedure. However, the respective success metrics are not updated after the initial validation so that the success metrics suggest an invalid config. If the startup procedure, like replaying the WAL, really takes very long, alerts about invalid config will trigger. This commit sets the success metrics after initial validation. They will be set again after the "real" config (re-)load, but that shouldn't be a problem. The metric now truthfully represents whenever the config was successfully loaded, no matter if the result was then thrown away (because it was just for validation) or actually used. Signed-off-by: beorn7 <beorn@grafana.com>	2020-10-12 21:30:59 +02:00
David Leadbeater	5393ec22cb	promtool: Don't end alert tests early, in some failure situations If an alert test had a failing test, then any other alert test interval specified after that point would result in the test exiting early. This made debugging some tests more difficult than needed. Now only exit early for evaluation failures. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2020-10-09 12:59:59 +01:00
Frederic Branczyk	da3ea43242	Merge pull request #7976 from roidelapluie/tolerance Introduce timestamp tolerance in scrapes	2020-10-08 09:21:19 +02:00
Julien Pivotto	be5ba1a62d	Fix wordings Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-07 21:44:36 +02:00
Julien Pivotto	4617d16b4b	Specify the removal Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-07 18:32:04 +02:00
Julien Pivotto	e2a2bf3c06	Add context Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-07 18:30:32 +02:00
Julien Pivotto	627ff84599	Adjust flag Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-07 18:25:52 +02:00
Julien Pivotto	6b618ecf02	Better description Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-07 17:43:42 +02:00
Julien Pivotto	536dfb6234	Add an experimental, hidden flag Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-07 17:31:46 +02:00
Frederic Branczyk	6be3ebdfe7	Merge pull request #8015 from simonpasquier/bump-k8s-deps Bump k8s dependencies + support k8s.io/klog/v2	2020-10-07 09:54:58 +02:00
Julien Pivotto	946819e16e	cmd/prometheus: Issue a warning on 32 bit archs (#8012 ) * cmd/prometheus: Issue a warning on 32 bit archs Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-06 21:42:56 +02:00
Simon Pasquier	9bb3555fe4	cmd/prometheus: support k8s.io/klog/v2 Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2020-10-06 14:56:14 +02:00
David Leadbeater	77c784ac93	Ensure alert rules are marked as restored in unit tests (#7661 ) This makes sure the ALERTS timeseries is created when unit testing alerting rules. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2020-09-21 18:15:34 +02:00
jessicagreben	2e526cf2a7	add output dir parameter Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-09-13 08:38:32 -07:00
jessicagreben	dfa510086b	add alignment, mv rule importer to promtool dir, add queryRange Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-09-13 08:07:59 -07:00
Julien Pivotto	442b3364d7	Promtool: add evaluation time to instant query (#7829 ) * Promtool: add evaluation time to instant query Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Apply suggestion Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-08-25 11:32:25 +01:00
Andy Bursavich	4e6a94a27d	Invert service discovery dependencies (#7701 ) This also fixes a bug in query_log_file, which now is relative to the config file like all other paths. Signed-off-by: Andy Bursavich <abursavich@gmail.com>	2020-08-20 13:48:26 +01:00
Harold Dost	21a753c4e2	Make file permissions set to allow for wider umask options. (#7782 ) 0644 -> 0666 on all non vendored code. Fixes #7717 Signed-off-by: Harold Dost <harolddost@gmail.com>	2020-08-12 23:23:17 +02:00
Julien Pivotto	d661f84748	Log duration of reloads Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-08-06 21:49:26 +02:00
Annanay	9bba8a6eae	Merge branch 'master' into appender-context Signed-off-by: Annanay <annanayagarwal@gmail.com>	2020-07-30 16:43:18 +05:30
Julien Pivotto	01e3bfcd1a	Add warnings about NFS (#7691 ) * Add warnings about NFS Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-30 11:22:44 +02:00
Javier Palomo Almena	b58a613443	Replace sync/atomic with uber-go/atomic (#7683 ) * storage: Replace usage of sync/atomic with uber-go/atomic Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * tsdb: Replace usage of sync/atomic with uber-go/atomic Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * web: Replace usage of sync/atomic with uber-go/atomic Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * notifier: Replace usage of sync/atomic with uber-go/atomic Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * cmd: Replace usage of sync/atomic with uber-go/atomic Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * scripts: Verify that we are not using restricted packages It checks that we are not directly importing 'sync/atomic'. Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * Reorganise imports in blocks Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * notifier/test: Apply PR suggestions Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * storage/remote: avoid storing references on newEntry Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * Revert "scripts: Verify that we are not using restricted packages" This reverts commit `278d32748e`. Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com> * web: Group imports accordingly Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>	2020-07-30 13:15:42 +05:30
jessicagreben	7504b5ce7c	add rule importer with tsdb block writer Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>	2020-07-27 07:44:49 -07:00
Annanay	7f98a744e5	Add context to Appender interface Signed-off-by: Annanay <annanayagarwal@gmail.com>	2020-07-24 19:40:51 +05:30
chinhnc	e05c19da5d	Display block duration in promtool list blocks command (#7653 ) * Update tsdb.go Added DURATION column to `tsdb list` command Signed-off-by: soup <chicknsoupuds@gmail.com> * Use time.Duration instead of hardcoded hour Signed-off-by: soup <chicknsoupuds@gmail.com>	2020-07-24 19:01:20 +05:30
Ben Ye	50c261502e	add tsdb cmds into promtool (#6088 ) Signed-off-by: yeya24 <yb532204897@gmail.com> update tsdb cli in makefile and promu Signed-off-by: yeya24 <yb532204897@gmail.com> remove building tsdb bin Signed-off-by: yeya24 <yb532204897@gmail.com> remove useless func Signed-off-by: yeya24 <yb532204897@gmail.com> refactor analyzeBlock Signed-off-by: yeya24 <yb532204897@gmail.com> Fix Makefile Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2020-07-23 19:35:50 +01:00
Bartlomiej Plotka	a0df8a383a	promql: Removed global and add ability to have better interval for subqueries if not specified (#7628 ) * promql: Removed global and add ability to have better interval for subqueries if not specified ## Changes * Refactored tests for better hints testing * Added various TODO in places to enhance. * Moved DefaultEvalInterval global to opts with func(rangeMillis int64) int64 function instead Motivation: At Thanos we would love to have better control over the subqueries step/interval. This is important to choose proper resolution. I think having proper step also does not harm for Prometheus and remote read users. Especially on stateless querier we do not know evaluation interval and in fact putting global can be wrong to assume for Prometheus even. I think ideally we could try to have at least 3 samples within the range, the same way Prometheus UI and Grafana assumes. Anyway this interfaces allows to decide on promQL user basis. Open question: Is taking parent interval a smart move? Motivation for removing global: I spent 1h fighting with: === RUN TestEvaluations TestEvaluations: promql_test.go:31: unexpected error: error evaluating query "absent_over_time(rate(nonexistant[5m])[5m:])" (line 687): unexpected error: runtime error: integer divide by zero --- FAIL: TestEvaluations (0.32s) FAIL At the end I found that this fails on most of the versions including this master if you run this test alone. If run together with many other tests it passes. This is due to SetDefaultEvaluationInterval(1 * time.Minute) in test that is ran before TestEvaluations. Thanks to globals (: Let's fix it by dropping this global. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Added issue links for TODOs. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Removed irrelevant changes. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-07-22 14:39:51 +01:00
Julien Pivotto	b83cbacbdd	Rule manager: remove blocking channel in mail (#7631 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-22 00:13:24 +02:00
Ben Ye	e6ea798c32	promtool range query should exit when fail to parse time (#7505 ) Signed-off-by: yeya24 <yb532204897@gmail.com>	2020-07-16 23:53:04 +01:00
yeya24	797e48c1a3	support time range in promtool query labels Updated prometheus/client_golang and json-iterator/go Signed-off-by: yeya24 <yb532204897@gmail.com>	2020-07-03 11:29:39 -04:00
Frederic Branczyk	d17d88935c	rules: Use narrower interface for rule manager loading of for state (#7472 ) To load ALERT_FOR_STATE only the `storage.Queryable` interface is required, so this patch uses this narrower interface to do so. Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>	2020-06-26 19:06:36 +01:00
Bartlomiej Plotka	b788986717	storage: Adjusted fully storage layer support for chunk iterators: Remote read client, readyStorage, fanout. (#7059 ) * Fixed nits introduced by https://github.com/prometheus/prometheus/pull/7334 * Added ChunkQueryable implementation to fanout and readyStorage. * Added more comments. * Changed NewVerticalChunkSeriesMerger to CompactingChunkSeriesMerger, removed tiny interface by reusing VerticalSeriesMergeFunc for overlapping algorithm for both chunks and series, for both querying and compacting (!) + made sure duplicates are merged. * Added ErrChunkSeriesSet * Added Samples interface for seamless []prompb.Sample to []tsdbutil.Sample conversion. * Deprecating non chunks serieset based StreamChunkedReadResponses, added chunk one. * Improved tests. * Split remote client into Write (old storage) and read. * Queryable client is now SampleAndChunkQueryable. Since we cannot use nice QueryableFunc I moved all config based options to sampleAndChunkQueryableClient to avoid boilerplate. In next commit: Changes for TSDB. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-06-24 14:41:52 +01:00
Harkishen Singh	70b0a34616	Exit early on invalid config file (#7399 ) * Reload config file at start Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com> * relocated config checking Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com> * change log level Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com> * add helpful comment Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>	2020-06-21 21:26:59 +05:30
Ben Kochie	8d3c2f6829	Enable WAL compression by default (#7410 ) Enable the `--storage.tsdb.wal-compression` flag by default. Signed-off-by: Ben Kochie <superq@gmail.com>	2020-06-18 17:59:40 +01:00
Jordan Neufeld	268b4c29e1	Support extended durations in promtool unit tests (Fixes #6285 ) (#6297 ) * Fixed evaluation_time duration parsing in promtool unit tests (Fixes #6285) Signed-off-by: Jordan Neufeld <jordan@neufeldtech.com>	2020-06-15 16:03:07 +01:00
Arthur Silva Sens	7727b9012e	Correction of misleading help text(#5142 ) (#7231 ) * Correction of misleading help text(#5142) Signed-off-by: arthursens <arthursens2005@gmail.com>	2020-05-11 12:15:01 +01:00
Julien Pivotto	9e265aba10	Merge pull request #7225 from prometheus/release-2.18 [Merge without Squash] Merge release-2.18 back to master for 2.18.1 fixes.	2020-05-07 21:23:59 +02:00
Hongcai Ren	c7e82274c6	replace github.com/prometheus/prometheus/testutil/promlint by github.com/prometheus/client_golang/prometheus/testutil/promlint from our codebase (#7209 ) Signed-off-by: RainbowMango <renhongcai@huawei.com>	2020-05-07 11:34:39 +01:00
Julien Pivotto	645b71e9ef	Fix snapshots (#7217 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-05-07 10:03:48 +01:00
Ganesh Vernekar	d4b9fe801f	M-map full chunks of Head from disk (#6679 ) When appending to the head and a chunk is full, it is flushed to the disk and m-mapped (memory mapped) to free up memory. Prom startup now happens in these stages - Iterate the m-mapped chunks from disk and keep a map of series reference to its slice of m-mapped chunks. - Iterate the WAL as usual. Whenever we create a new series, look for its m-mapped chunks in the map created before and add them to that series. If a head chunk is corrupted, the corrupted one and all chunks after it are deleted and the data after the corruption is recovered from the existing WAL, which means that a corruption in m-mapped files results in NO data loss. [M-mapped chunks format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/head_chunks.md) - main difference is that the chunk for m-mapping now also includes the series reference because there is no index for mapping series to chunks. [The block chunks](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/chunks.md) are accessed from the index which includes the offsets for the chunks in the chunks file - example - chunks of series ID have offsets 200, 500 etc in the chunk files. In case of m-mapped chunks, the offsets are stored in memory and accessed from that. During WAL replay, these offsets are restored by iterating all m-mapped chunks as stated above by matching the series id present in the chunk header and offset of that chunk in that file.
Prombench results _WAL Replay_ 1h WAL replay time: 30% less WAL replay time - 4m31 vs 3m36. 2h WAL replay time: 20% less WAL replay time - 8m16 vs 7m. _Memory During WAL Replay_ High Churn: 10-15% less RAM - 32gb vs 28gb; 20% less RAM after compaction - 34gb vs 27gb. No Churn: 20-30% less RAM - 23gb vs 18gb; 40% less RAM after compaction - 32.5gb vs 20gb. Screenshots are in [this comment](https://github.com/prometheus/prometheus/pull/6679#issuecomment-621678932) Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2020-05-06 21:00:00 +05:30
Ben Ye	1e4e37144d	Fixed wrongly handled not ready TSDB on web and API. (#7182 ) * fix federate endpoint panic Signed-off-by: yeya24 <yb532204897@gmail.com> * Fixed all cases of not ready TSDB being wrongly handled. * Fixed issue for federation. * Ensured this will never happen again thanks to interfaces * Fixes same issue for stats. * Added tests for readiness. * Fixed bug in stats. It was: status.MaxTime = db.Head().MaxTime() status.MinTime = db.Head().MaxTime() Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed Brian's comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed Brian's comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-04-29 17:16:14 +01:00
Vasily Sliouniaev	0393b188c9	Add Jaeger (#7148 ) * Trace remote read Signed-off-by: vas <vasily.sliouniaev@jet.com> * Use jaeger Signed-off-by: vas <vasily.sliouniaev@jet.com>	2020-04-23 02:05:55 +02:00
Marek Slabicki	8224ddec23	Capitalizing first letter of all log lines (#7043 ) Signed-off-by: Marek Slabicki <thaniri@gmail.com>	2020-04-11 09:22:18 +01:00
Brian Brazil	7646cbca32	Use .UTC everywhere we use time.Unix (#7066 ) time.Unix attaches the local timezone, which can then leak out (e.g. in the alert json). While this is harmless, we should be consistent. Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>	2020-03-29 17:35:39 +01:00
Ben Kochie	269e7c8091	Fix golint issues. Signed-off-by: Ben Kochie <superq@gmail.com>	2020-03-23 20:38:43 +01:00
johncming	bbacd2dd09	remove needless break. (#7008 ) Signed-off-by: johncming <johncming@yahoo.com>	2020-03-19 11:21:00 +00:00
李国忠	52025bd7a9	[comments] change word ‘wheter’ to ‘whether’ (#6912 ) * [comments] change word ‘wheter’ to ‘whether’ Signed-off-by: fuling <fuling.lgz@alibaba-inc.com> * [comments] change word ‘wheter’ to ‘whether’ Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>	2020-03-02 13:51:24 +05:30
Tobias Guggenmos	4835bbf376	Merge branch 'master' into split_parser	2020-02-19 15:18:13 +01:00
Bartlomiej Plotka	48ead578a0	Moved tsdbconfig to main. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-02-18 11:25:36 +00:00
Bartlomiej Plotka	a20bebf7eb	Moved readyStorage to main. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-02-17 18:03:57 +00:00
Bartlomiej Plotka	8a775bc468	Moved unit agnostic options to separate pkg. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-02-17 18:03:57 +00:00
Bartlomiej Plotka	59c9d6ef45	Addressed Brian's comments, moved metrics to main.go Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-02-17 18:03:57 +00:00
Bartlomiej Plotka	cfba92a133	Addressed comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-02-17 18:03:57 +00:00
Bartlomiej Plotka	34426766d8	Unify Iterator interfaces. All point to storage now. This is part of https://github.com/prometheus/prometheus/pull/5882 that can be done to simplify things. All todos I added will be fixed in follow up PRs. * querier.Querier, querier.Appender, querier.SeriesSet, and querier.Series interfaces merged with storage interface.go. All imports that. * querier.SeriesIterator replaced by chunkenc.Iterator * Added chunkenc.Iterator.Seek method and tests for xor implementation (?) * Since we properly handle SelectParams for Select methods I adjusted min max based on that. This should help in terms of performance for queries with functions like offset. * added Seek to deletedIterator and test. * storage/tsdb was removed as it was only a unnecessary glue with incompatible structs. No logic was changed, only different source of abstractions, so no need for benchmarks. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-02-17 18:03:54 +00:00
Tobias Guggenmos	454ba12676	Fix build errors in promtool Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>	2020-02-17 16:09:23 +01:00
Björn Rabenstein	af04cb22c8	Merge pull request #6821 from prometheus/release-2.16 Release 2.16	2020-02-14 13:10:14 +01:00
Julien Pivotto	ff0003e072	Make lookbackDelta an option of QueryEngine (#6746 ) * Make lookbackDelta an option of QueryEngine Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * julius' suggestion Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * remove trivial getter Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Assume lookback delta is always > 0 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * add debug log Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * don't expose lookback delta Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Specify that lookback delta is also used in federation Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix federation test While we have added some logic to the promql engine to keep it backwards compatible and have a 5 minute lookback by default, the web/ package is likely to really be internal to Prometheus and we should not add the same kind of heuristics here. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * lookback delta: Fix debug log Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-02-10 00:58:23 +01:00
Julien Pivotto	d799078c88	also test start and end Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-02-08 16:42:50 +01:00
Julien Pivotto	881dde505a	promql: fix promql query log step unit Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-02-08 16:26:56 +01:00
Julien Pivotto	3c4c01eae2	Fix race in Query Log Test (#6727 ) A data race can happen if we run t.Log after the test t is done -- which in this case is highly possible because of the use of subtests and the fact that we call t.Log in a goroutine. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-30 13:51:18 -08:00
Julien Pivotto	9adad8ad30	Remove MaxConcurrent from the PromQL engine opts (#6712 ) Since we use ActiveQueryTracker to check for concurrency in `d992c36b3a` it does not make sense to keep the MaxConcurrent value as an option of the PromQL engine. This pull request removes it from the PromQL engine options, sets the max concurrent metric to -1 if there is no active query tracker, and use the value of the active query tracker otherwise. It removes dead code and also will inform people who import the promql package that we made that change, as it breaks the EngineOpts struct. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-28 20:38:49 +00:00
Julien Pivotto	5f27ac3583	Refactor query log fields (#6694 ) * Refactor query log fields Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-27 09:53:10 +00:00
Julien Pivotto	2b2eb79e8b	Add windows tests for query logger (#6653 ) * Add windows tests * Do not rely on time.Time in timer Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-20 13:17:11 +00:00
Julien Pivotto	0eb34299da	End-to-end Query Log test (#6600 ) * End-to-end Query Log test Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-19 21:56:13 +00:00
Julien Pivotto	1a58d2657d	Removed compilation step inside main_test (#6658 ) Inspired by https://github.com/prometheus/prometheus/pull/6347 and https://github.com/prometheus/prometheus/pull/6347#issuecomment-570151979 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-19 07:14:25 +00:00
Harkishen Singh	84e6459c4d	Adds support for line-column numbers for invalid rules, promtool (#6533 ) Signed-off-by: Harkishen Singh <harkishensingh@hotmail.com>	2020-01-15 18:07:54 +00:00
Julien Pivotto	3885562587	Query Logging styling (#6594 ) - Fix Json vs JSON in activequerylogger - Fix SetQueryLogger always returns nil Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-09 21:11:39 +00:00
Julien Pivotto	9d9bc524e5	Add query log (#6520 ) * Add query log, make stats logged in JSON like in the API Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-01-08 13:28:43 +00:00
Simon Pasquier	cccd542891	*: avoid missed Alertmanager targets (#6455 ) This change makes sure that nearly-identical Alertmanager configurations aren't merged together. The config's identifier was the MD5 hash of the configuration serialized to JSON but because `relabel.Regexp` has no public field and doesn't implement the JSON.Marshaler interface, it was always serialized to "{}". In practice, the identifier can be based on the index of the configuration in the list. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-12-12 17:00:19 +01:00
Brooks Swinnerton	0ea3a2218d	Add time units to storage.tsdb.retention.size flag (#6365 ) * Add time units to storage.tsdb.retention.size flag In an effort to reduce confusion with the `m` option of the `ParseDuration()` function, this commit adds the available time units to the `storage.tsdb.retention.time` flag to help showcase that there is no option for months (which could be assumed to be `m`). If someone were looking to set the retention to six months, they may mistakenly do so with `6m`, which would reduce their retention to six minutes. Signed-off-by: Brooks Swinnerton <bswinnerton@gmail.com>	2019-11-30 08:00:51 +00:00
johncming	ad4bc5701e	remove unwanted break (#6338 ) Signed-off-by: johncming <johncming@yahoo.com>	2019-11-18 23:01:03 -08:00
akerele abraham	9d39fdad0c	unittest: check for rule files existence (#6075 ) Signed-off-by: akerele abraham <abrahamakerele38@gmail.com>	2019-11-18 13:54:52 -08:00
Chris Marchbanks	1d1f64b4bc	Fix Promtool showing false duplicate rule warnings (#6270 ) Alert rules do not use the Record field, so any alerts with the same labels and different names would be counted as being duplicates. Promtool will now consider either field when finding duplicates. Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>	2019-11-05 11:22:31 -07:00
Simon Pasquier	ddff1480a7	cmd/promtool: improve output for PromQL tests (#6052 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-09-25 09:26:29 +02:00
Harkishen Singh	e097c70e6d	add checks for metrics and display duplicate fields (#6026 ) Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>	2019-09-20 11:29:47 +01:00
Simon Pasquier	06066a3619	*: improve error messages when parsing bad rules (#5965 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-08-28 17:36:48 +02:00
Sayan Chowdhury	cb66e325d8	Show the warnings during label query (#5924 ) This patch loops through the warnings while querying the label and spits the output to stderr Fixes #5885 Signed-off-by: Sayan Chowdhury <sayan.chowdhury2012@gmail.com>	2019-08-24 19:42:21 +02:00
Bartek Płotka	48b2c9c8ea	remote-read: streamed chunked server side; Extended protobuf; Added chunked, checksummed reader (#5703 ) Part of: https://github.com/prometheus/prometheus/issues/4517 and https://github.com/improbable-eng/thanos/issues/488 Changes: * Extended protobuf for chunked remote read and negotiation. * Added checksummed, chunked Writer/Reader. * Added Server side implementation for chunked streamed remote-read. Signed-off-by: Bartek Plotka <bwplotka@gmail.com>	2019-08-19 21:16:10 +01:00
Bartek Płotka	5cb32d67f9	Merge pull request #5893 from prometheus/unify-tsdbutil Removed extra tsdb/testutil after merge.	2019-08-15 12:07:59 +01:00

868 Commits (d1c251ff775b36ac7e84360659cd9ed82a314882)