prometheus

Commit Graph

Author	SHA1	Message	Date
Łukasz Mierzwa	070e409dba	Add prometheus_ready metric (#10682) When Prometheus starts it can take a long time before WAL is replayed and it can do anything useful. While it's starting it exposes metrics and other Prometheus servers can scrape it. We do have alerts that fire if any Prometheus server is not ingesting samples and so far we've been dealing with instances that are starting for a long time by adding a check on Prometheus process uptime. Relying on uptime isn't ideal because the time needed to start depends on the number of metrics scraped, and so on the amount of data in WAL. To help write better alerts it would be great if Prometheus exposed a metric that tells us it's fully started, that way any alert that is supposed to notify us about any runtime issue can filter out starting instances. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-05-23 11:42:01 +02:00
Ben Ye	af5ea213f7	promtool: support matchers when querying label values (#10727 ) * promtool: support matchers when querying label values Signed-off-by: Ben Ye <ben.ye@bytedance.com> * address review comment Signed-off-by: Ben Ye <ben.ye@bytedance.com>	2022-05-23 11:10:45 +10:00
Łukasz Mierzwa	d3c9c4f574	Stop rule manager before TSDB is stopped (#10680) During shutdown TSDB is stopped before rule manager is stopped. Since TSDB shutdown can take a long time (minutes or 10s of minutes) it keeps rule manager running while parts of Prometheus are already stopped (most notably scrape manager). This can cause false positive alerts to fire, mostly those that rely on absent() calls since new sample appends will stop while alert queries are still evaluated. Stop rules before stopping TSDB and scrape manager to avoid this problem. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-05-20 23:26:06 +02:00
Alban Hurtaud	41630b8e88	Add hidden flag to configure discovery loop interval (#10634 ) * Add hidden flag to configure discovery loop interval Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>	2022-05-06 00:42:04 +02:00
beorn7	3bc711e333	Merge branch 'main' into sparsehistogram	2022-05-04 13:37:13 +02:00
Matthieu MOREL	e2ede285a2	refactor: move from io/ioutil to io and os packages (#10528 ) * refactor: move from io/ioutil to io and os packages * use fs.DirEntry instead of os.FileInfo after os.ReadDir Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>	2022-04-27 11:24:36 +02:00
Filip Petkovski	1c1b174a8e	Add a --lint flag to the promtool check rules and check config commands (#10435 ) * Add a --lint flag to the promtool check rules and check config commands Checking rules with promtool emits warnings in the case of duplicate rules. These warnings do not result in a non-zero exit code and are difficult to spot in CI environments. Additionally, checking for duplicates is closer to a lint check rather than a syntax check. This commit adds a --lint flag to commands which include checking rules. The flag can be used to enable or disable certain linting options and cause the execution to return a non-zero exit code in case those options are not met. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com> * Exit with status 3 on lint error Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	2022-04-06 00:05:11 -04:00
beorn7	7ee1836ef5	Merge branch 'main' into sparsehistogram	2022-04-05 18:31:19 +02:00
Julien Pivotto	390956d317	Log gomaxprocs messages (#10506 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-03-30 19:16:22 +02:00
TomasKohout	c0fd228bad	Add dependency on go.uber.org/automaxprocs (#10498 ) * add dependency on go.uber.org/automaxprocs Signed-off-by: Tomas Kohout <tomas.kohout1995@gmail.com> Co-authored-by: Peter Bourgon <peterbourgon@users.noreply.github.com> Co-authored-by: Julien Pivotto <roidelapluie@gmail.com>	2022-03-30 12:50:11 +02:00
Julien Pivotto	f9d8e5245a	Plugins support (#10495 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-03-29 14:44:39 +02:00
Wilbert Guo	83a2e52bc2	Add SyncForState Implementation for Ruler HA (#10070 ) * continuously syncing activeAt for alerts Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * add import Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Refactor SyncForState and add unit tests Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Format code Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Add hook for syncForState Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go lint Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Refactor syncForState override implementation Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Add syncForState override func as argument to Update() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go formatting Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix circleci test errors Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Remove overrideFunc as argument to run() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * remove the syncForState Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use the override function to decide if need to replace the activeAt or not Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix test case Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix format Signed-off-by: Yijie Qin <qinyijie@amazon.com> * Trigger build Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * return the result of map of alerts instead of single one Signed-off-by: Yijie Qin <qinyijie@amazon.com> * upper case the QueryforStateSeries Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use a more generic rule group post process function type Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix indentation Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix gofmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix lint Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing naming 
Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * add the lastEvalTimestamp as parameter Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * change funcType to func Signed-off-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <63399121+qinxx108@users.noreply.github.com>	2022-03-29 02:16:46 +02:00
beorn7	4210aac74a	Merge branch 'main' into sparsehistogram	2022-03-22 14:47:42 +01:00
Alan Protasio	606ef33d91	Track and report Samples Queried per query We always track total samples queried and add those to the standard set of stats queries can report. We also allow optionally tracking per-step samples queried. This must be enabled both at the engine and query level to be tracked and rendered. The engine flag is exposed via a Prometheus feature flag, while the query flag is set when stats=all. Co-authored-by: Alan Protasio <approtas@amazon.com> Co-authored-by: Andrew Bloomgarden <blmgrdn@amazon.com> Co-authored-by: Harkishen Singh <harkishensingh@hotmail.com> Signed-off-by: Andrew Bloomgarden <blmgrdn@amazon.com>	2022-03-21 23:49:17 +01:00
Mauro Stettler	b025390cb4	Disable chunk write queue by default, allow user to configure the exact size (#10425 ) * Disable chunk write queue by default Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com> * update flag description Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>	2022-03-11 17:26:59 +01:00
ian woolf	025528a5d6	cmd: use os.MkdirTemp instead of ioutil.TempDir (#10320 ) Signed-off-by: ianwoolf <btw515wolf2@gmail.com>	2022-03-08 14:08:53 +01:00
Łukasz Mierzwa	a4317bf0ec	Run gofumpt on all files (#10392) * Run gofumpt on all files Getting golangci-lint errors when building on my laptop, possibly because I have a newer version of gofumpt than what it was formatted with. Run gofumpt -w -extra on all files as it will be needed in the future anyway. * Update golangci-lint to v1.44.2 v1.44.0 upgraded gofumpt so bumping version in CI will help keep formatting correct for everyone * Address golangci-lint error Getting 'error-strings: error strings should not be capitalized or end with punctuation or a newline' from revive here. Drop new line. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-03-03 17:21:05 +01:00
SuperQ	b297520666	Add agent mode identifier Identify in the logs and liveness endpoints if the server is running in Agent mode or not. Signed-off-by: SuperQ <superq@gmail.com>	2022-02-17 05:27:09 +01:00
Tobias Klausmann	b998636893	Improve error logging for missing config and QL dir (#10260 ) * Improve error logging for missing config and QL dir Currently, when Prometheus can't open its config file or the query logging dir under the data dir, it only logs what it has been given default or commandline/config. Depending on the environment this can be less than helpful, since the working directory may be unclear to the user. I have specifically kept the existing error messages as intact as possible to a) still log the parameter as given and b) cause as little disruption for log-parsers/-analyzers as possible. So in case of the config file or the data dir being non-absolute paths, I use os.GetWd to find the working dir and assemble an absolute path for error logging purposes. If GetWd fails, we just log "unknown", as recovering from an error there would be very complex measure, likely not worth the code/effort. Example errors: ``` $ ./prometheus ts=2022-02-06T16:00:53.034Z caller=main.go:445 level=error msg="Error loading config (--config.file=prometheus.yml)" fullpath=/home/klausman/src/prometheus/prometheus.yml err="open prometheus.yml: no such file or directory" $ touch prometheus.yml $ ./prometheus [...] ts=2022-02-06T16:01:00.992Z caller=query_logger.go:99 level=error component=activeQueryTracker msg="Error opening query log file" file=data/queries.active fullpath=/home/klausman/src/prometheus/data/queries.active err="open data/queries.active: permission denied" panic: Unable to create mmap-ed active query log [...] 
$ ``` Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Replace our own logic with just using filepath.Abs() Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Further simplification Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Review edits Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Review edits Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de> * Review edits Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de>	2022-02-16 17:43:15 +01:00
Julien Pivotto	9a2e93228e	Switch to grafana/regexp everywhere (#10268 ) Let's have a consistent library for regexp. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-02-13 00:58:27 +01:00
Matej Gera	2c61d29b2a	Tracing: Migrate to OpenTelemetry library (#9724 ) Signed-off-by: Matej Gera <matejgera@gmail.com>	2022-01-25 11:08:04 +01:00
Rodrigo Queiro	70c1446a64	Clarify units of --storage.tsdb.retention.size (#10154 ) The flag uses Base2Bytes: `129ed4ec8b/cmd/prometheus/main.go (L1476)` Signed-off-by: Rodrigo Queiro <rodrigoq@google.com>	2022-01-13 00:55:57 +01:00
beorn7	b39f2739e5	PromQL: Always enable negative offset and @ modifier This follows the line of argument that the invariant of not looking ahead of the query time was merely emerging behavior and not a documented stable feature. Any query that looks ahead of the query time was simply invalid before the introduction of the negative offset and the @ modifier. Signed-off-by: beorn7 <beorn@grafana.com>	2022-01-11 17:08:55 +01:00
beorn7	61509fc840	PromQL: Promote negative offset and @ modifier to stable Following the argument that breaking the invariant that PromQL does not look ahead of the evaluation time implies a breaking change, we still need to keep the feature flag around, but at least we can communicate that the feature is considered stable, and that the feature flags will be ignored from v3 on. Signed-off-by: beorn7 <beorn@grafana.com>	2022-01-11 00:34:33 +01:00
Björn Rabenstein	0f4a1e6eac	Merge pull request #10119 from prometheus/beorn7/remote API: Promote remote-write-receiver to stable	2022-01-10 15:55:10 +01:00
chenlujjj	2ce94ac196	Add '--weight' flag to 'promtool check metrics' command (#10045 )	2022-01-07 16:58:28 -05:00
beorn7	8fdfa52976	API: Promote remote-write-receiver to stable Since `/api/v1/write` is a mutating endpoint, we should still activate the remote-write-receiver explicitly. But we should do it in the same way as the other mutating endpoints, i.e. via a flag `--web.enable-remote-write-receiver`. This commit marks the feature flag as deprecated, i.e. it still works but logs a warning on startup. This enables users to seamlessly migrate. With the next minor release, we can start ignoring the feature flag (but still warn a user that is trying to use it). Signed-off-by: beorn7 <beorn@grafana.com>	2022-01-05 15:36:07 +01:00
David Leadbeater	a961062c37	Disable time based retention in tests (#8818 ) Fixes #7699. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2022-01-02 23:46:03 +01:00
Jessica G	174a1147d5	Merge pull request #9861 from JessicaGreben/minor-prom-improvements Add exit code constants in promtool	2021-12-31 12:07:02 -08:00
jessicagreben	4b03fa3100	replace config exit code with failure exit code Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-12-30 05:37:57 -08:00
beorn7	64c7bd2b08	Merge branch 'main' into sparsehistogram	2021-12-18 14:04:25 +01:00
jessicagreben	59f7ef06d0	update exit code for sd Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-12-18 04:45:15 -08:00
Nicholas Blott	c92673fb14	Remove check against cfg so interval/ timeout are always set (#10023 ) Signed-off-by: Nicholas Blott <blottn@tcd.ie>	2021-12-16 13:28:46 +01:00
beorn7	6f33ab2b35	Merge branch 'main' into sparsehistogram	2021-12-15 13:49:33 +01:00
Julien Pivotto	db1551bd21	Merge pull request #10016 from prometheus/release-2.32 Merge back release 2.32	2021-12-14 20:58:58 +01:00
Ben Ye	d9bbe7f3dd	mention agent mode in enable-feature flag help description (#9939 ) Signed-off-by: Ben Ye <ben.ye@bytedance.com>	2021-12-04 21:13:24 +01:00
zzehring	42628899b5	promtool: Add `--syntax-only` flag for `check config` This commit adds a `--syntax-only` flag for `promtool check config`. When passing in this flag, promtool will omit various file existence checks that would cause the check to fail (e.g. the check would not fail if `rule_files` files don't exist at their respective paths). This functionality will allow CI systems to check the syntax of configs without worrying about referenced files. Fixes: #5222 Signed-off-by: zzehring <zack.zehring@grafana.com>	2021-12-02 15:33:11 -05:00
jessicagreben	99bb56fc46	add errcodes from sd file Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-12-01 04:45:18 -08:00
beorn7	e8e9155a11	Merge branch 'main' into sparsehistogram	2021-11-30 18:22:37 +01:00
beorn7	e4e24453fa	Merge branch 'main' into beorn7/merge2	2021-11-30 17:19:06 +01:00
Filip Petkovski	5849521e90	promtool: Fix credentials file check (#9883) The promtool check config command still uses the bearer_token_file field which is deprecated in favour of authorization.credentials_file. This commit modifies the command to use the new field instead. Fixes #9874 Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	2021-11-30 15:02:07 +11:00
Björn Rabenstein	7e42acd3b1	tsdb: Rework iterators (#9877) - Pick At... method via return value of Next/Seek. - Do not clobber returned buckets. - Add partial FloatHistogram support. Note that the promql package is now _only_ dealing with FloatHistograms, following the idea that PromQL only knows float values. As a byproduct, I have removed the histogramSeries metric. In my understanding, series can have both float and histogram samples, so that metric doesn't make sense anymore. As another byproduct, I have converged the sampleBuf and the histogramSampleBuf in memSeries into one. The sample type stored in the sampleBuf has been extended to also contain histograms even before this commit. Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-29 13:24:23 +05:30
jessicagreben	764f2d03a5	add const for exit codes Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-11-24 09:17:49 -08:00
Matheus Alcantara	d9a8c453a0	cmd: use t.TempDir instead of ioutil.TempDir on tests (#9852 )	2021-11-23 20:09:28 -05:00
beorn7	5d4db805ac	Merge branch 'main' into sparsehistogram	2021-11-17 19:57:31 +01:00
beorn7	4c28d9fac7	Move to histogram.Histogram pointers This is to avoid copying the many fields of a histogram.Histogram all the time. This also fixes a bunch of formerly broken tests. Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-12 23:17:35 +01:00
Robert Fratto	72a9f7fee9	Share TSDB locker code with agent (#9623 ) * share tsdb db locker code with agent Closes #9616 Signed-off-by: Robert Fratto <robertfratto@gmail.com> * add flag to disable lockfile for agent Signed-off-by: Robert Fratto <robertfratto@gmail.com> * use agentOnlySetting instead of PreAction Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb: address review feedback 1. Rename Locker to DirLocker 2. Move DirLocker to tsdb/tsdbutil 3. Name metric using fmt.Sprintf 4. Refine error checking in DirLocker test Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb: create test utilities to assert expected DirLocker behavior Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb/tsdbutil: fix lint errors Signed-off-by: Robert Fratto <robertfratto@gmail.com> * tsdb/agent: fix windows test failure Use new DB variable instead of overriding the old one. Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2021-11-11 11:45:25 -05:00
Mateusz Gozdek	c08bb86be0	cmd/prometheus: use random listen port in TestStartupInterrupt test So it can be run in parallel safely. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-11 01:37:24 +01:00
Mateusz Gozdek	7bd7573891	cmd/prometheus/main_unix_test.go: fix unix test styling * Formatting of error message is missing a space after ':'. * t.Fatalf should be used instead of t.Errorf+return. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-11 01:37:24 +01:00
Robert Fratto	4a83e6f453	Remove agent mode warnings when loading configs (#9622) PR #9618 introduced failing to load the config file when agent mode is configured to run with unsupported settings. This made the block that logs a warning on their configuration no-op, which is now removed. Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2021-11-10 19:39:30 +05:30
Mateusz Gozdek	fa1b14e146	cmd/prometheus: randomize test port and isolate test data directory Between the tests. This enables parallelizing those tests, which should cut the test execution time. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-10 09:40:43 +01:00
beorn7	c954cd9d1d	Move packages out of deprecated pkg directory This creates a new `model` directory and moves all data-model related packages over there: exemplar labels relabel rulefmt textparse timestamp value All the others are more or less utilities and have been moved to `util`: gate logging modetimevfs pool runtime Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-09 08:03:10 +01:00
Dieter Plaetinck	cda025b5b5	TSDB: demystify SeriesRefs and ChunkRefs (#9536) * TSDB: demystify seriesRefs and ChunkRefs The TSDB package contains many types of series and chunk references, all shrouded in uint types. Often the same uint value may actually mean one of different types, in non-obvious ways. This PR aims to clarify the code and help navigating to relevant docs, usage, etc much quicker. Concretely: * Use appropriately named types and document their semantics and relations. * Make multiplexing and demuxing of types explicit (on the boundaries between concrete implementations and generic interfaces). * Casting between different types should be free. None of the changes should have any impact on how the code runs. TODO: Implement BlockSeriesRef where appropriate (for a future PR) Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * feedback Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * agent: demystify seriesRefs and ChunkRefs Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-11-06 15:40:04 +05:30
Bartlomiej Plotka	789274bf9c	cmd: Fixed storage flag regression introduced in #9660 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-11-06 00:16:43 +01:00
Sunil Thaha	4bdaea7663	fix: storage.tsdb.path randomly initialised to data-agent/ (#9660 ) Using the same variable for storage.tsdb.path and storage.agent.path as below in main.go causes cfg.localStoragePath to be data/ or data-agent/ at random. a.Flag("storage.tsdb.path", "Base path for metrics storage."). PreAction(serverOnlySetting()). Default("data/").StringVar(&cfg.localStoragePath) a.Flag("storage.agent.path", "Base path for metrics storage."). PreAction(agentOnlySetting()). Default("data-agent/").StringVar(&cfg.localStoragePath) This patch fixes it by using a different variable for storage.agent.path Signed-off-by: Sunil Thaha sthaha@redhat.com Signed-off-by: Sunil Thaha <sthaha@redhat.com>	2021-11-04 10:08:01 +00:00
Bartlomiej Plotka	e68ccc7708	Fix misleading agent-only/server-only check messages. (#9650 ) * Fix misleading agent-only/server-only check messages. Issue: ``` [root@host01 ~]# docker run -it --net=host --rm -v /root/editor/prom-agent-batcopter.yaml:/etc/prometheus/prometheus.yaml -v /root/prom-batcopter-data:/prometheus -u root --name prom-agent-batcopter quay.io/prometheus/prometheus:main --enable-feature=agent --config.file=/etc/prometheus/prometheus.yaml --storage.tsdb.path=/prometheus --web.listen-address=:9091 ts=2021-11-02T16:00:59.789Z caller=main.go:205 level=info msg="Experimental agent mode enabled." The following flag(s) can not be used in agent mode: ["--enable-feature"] ``` Problem was that PreAction gives us all parsed flag. Context does not give us any info on what flag clause it was defined. Also added info for flag help about being server or agent only. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * gofumpt. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-11-04 09:08:53 +00:00
Mateusz Gozdek	c3beca72e2	cmd/prometheus: wait for Prometheus to shut down in tests So the temporary data directory can be successfully removed, as on Windows a directory cannot be in use during removal. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 20:14:19 +01:00
Mateusz Gozdek	b7bdf6fab2	Fix imports formatting According to `2829908806 (r58457095)`. Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Mateusz Gozdek	1a6c2283a3	Format Go source files using 'gofumpt -w -s -extra' Part of #9557 Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Julien Pivotto	807f46a1ed	Gate agent behind a feature flag, validate mode flags (#9620) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-11-02 13:03:35 +00:00
Darshan Chaudhary	a7e554b158	add check service-discovery command (#8970 ) Signed-off-by: darshanime <deathbullet@gmail.com>	2021-11-01 14:42:12 +01:00
Hu Shuai	4b799c361a	Fix typo in cmd/prometheus/main.go (#9632) Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>	2021-11-01 16:08:23 +05:30
Arthur Silva Sens	be2599c853	config: Make remote-write required for Agent mode (#9618 ) * config: Make remote-write required for Agent mode Signed-off-by: ArthurSens <arthursens2005@gmail.com>	2021-10-30 01:41:40 +02:00
Robert Fratto	bc72a718c4	Initial draft of prometheus-agent (#8785 ) * Initial draft of prometheus-agent This commit introduces a new binary, prometheus-agent, based on the Grafana Agent code. It runs a WAL-only version of prometheus without the TSDB, alerting, or rule evaluations. It is intended to be used to remote_write to Prometheus or another remote_write receiver. By default, prometheus-agent will listen on port 9095 to not collide with the prometheus default of 9090. Truncation of the WAL cooperates on a best-effort case with Remote Write. Every time the WAL is truncated, the minimum timestamp of data to truncate is determined by the lowest sent timestamp of all samples across all remote_write endpoints. This gives loose guarantees that data from the WAL will not try to be removed until the maximum sample lifetime passes or remote_write starts functionining. Signed-off-by: Robert Fratto <robertfratto@gmail.com> * add tests for Prometheus agent (#22) * add tests for Prometheus agent * add tests for Prometheus agent * rearranged tests as per the review comments * update tests for Agent * changes as per code review comments Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com> * incremental changes to prometheus agent Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com> * changes as per code review comments Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com> * Commit feedback from code review Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Port over some comments from grafana/agent Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Rename agent.Storage to agent.DB for tsdb consistency Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Consolidate agentMode ifs in cmd/prometheus/main.go Signed-off-by: Robert Fratto <robertfratto@gmail.com> * Document PreAction usage requirements better for agent mode flags Signed-off-by: Robert 
Fratto <robertfratto@gmail.com> * remove unnecessary defaultListenAddr Signed-off-by: Robert Fratto <robertfratto@gmail.com> * `go fmt ./tsdb/agent` and fix lint errors Signed-off-by: Robert Fratto <robertfratto@gmail.com> Co-authored-by: SriKrishna Paparaju <paparaju@gmail.com>	2021-10-29 16:25:05 +01:00
David Leadbeater	c91c2bbea5	promtool: Show more human readable got/exp output (#8064 ) Avoid using %#v, nothing needs to parse this, so escaping " and so on leads to hard to read output. Add new lines, number and indentation to each alert series output. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2021-10-28 22:17:18 +11:00
DrAuYueng	69e309d202	Expose TargetsFromGroup/AlertmanagerFromGroup func and reuse this for (#9343 ) static/file sd config check in promtool Signed-off-by: DrAuYueng <ouyang1204@gmail.com>	2021-10-28 02:01:28 +02:00
Julien Pivotto	73255e15f6	Address golint failures from revive Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-23 00:53:11 +02:00
Will Tran	97b0738895	add --max-block-duration in promtool create-blocks-from rules (#9511 ) * support maxBlockDuration for promtool tsdb create-blocks-from rules Fixes #9465 Signed-off-by: Will Tran <will@autonomic.ai> * don't hardcode 2h as the default block size in rules test Signed-off-by: Will Tran <will@autonomic.ai>	2021-10-21 23:28:37 +02:00
Furkan Türkal	9d0058a09e	Bind port 0 in main_test (#9558 ) Fixes #9499 Signed-off-by: Furkan <furkan.turkal@trendyol.com>	2021-10-21 14:59:20 +02:00
Julien Pivotto	432005826d	Add a feature flag to enable the new discovery manager (#9537 ) * Add a feature flag to enable the new manager This PR creates a copy of the legacy manager and uses it by default. It is a companion PR to #9349. With this PR, users can enable the new discovery manager and provide us with any feedback / side effects that the new behaviour might have. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-20 10:15:54 +02:00
beorn7	a9008f5423	Merge branch 'main' into sparsehistogram	2021-10-19 17:14:23 +02:00
jessicagreben	60d0990886	add more explicit label values Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-10-18 01:04:13 +02:00
jessicagreben	3da87d2f39	add unit test to check label rule labels override Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-10-18 01:04:13 +02:00
Julien Pivotto	f8372bc6b9	backfill: Apply rule labels after query labels Fix #9419 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-18 01:04:13 +02:00
beorn7	7a8bb8222c	Style cleanup of all the changes in sparsehistogram so far A lot of this code was hacked together, literally during a hackathon. This commit intends not to change the code substantially, but just make the code obey the usual style practices. A (possibly incomplete) list of areas: * Generally address linter warnings. * The `pgk` directory is deprecated as per dev-summit. No new packages should be added to it. I moved the new `pkg/histogram` package to `model` anticipating what's proposed in #9478. * Make the naming of the Sparse Histogram more consistent. Including abbreviations, there were just too many names for it: SparseHistogram, Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in general. Only add "Sparse" if it is needed to avoid confusion with conventional Histograms (which is rare because the TSDB really has no notion of conventional Histograms). Use abbreviations only in local scope, and then really abbreviate (not just removing three out of seven letters like in "Histo"). This is in the spirit of https://github.com/golang/go/wiki/CodeReviewComments#variable-names * Several other minor name changes. * A lot of formatting of doc comments. For one, following https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences , but also layout question, anticipating how things will look like when rendered by `godoc` (even where `godoc` doesn't render them right now because they are for unexported types or not a doc comment at all but just a normal code comment - consistency is queen!). * Re-enabled `TestQueryLog` and `TestEndopints` (they pass now, leaving them disabled was presumably an oversight). * Bucket iterator for histogram.Histogram is now created with a method. * HistogramChunk.iterator now allows iterator recycling. (I think @dieterbe only commented it out because he was confused by the question in the comment.) * HistogramAppender.Append panics now because we decided to treat staleness marker differently. 
Signed-off-by: beorn7 <beorn@grafana.com>	2021-10-11 13:02:03 +02:00
beorn7	fd5ea4e0b5	Merge branch 'main' into sparsehistogram	2021-10-07 23:16:42 +02:00
Julien Pivotto	bd217c58a7	Backfill: Do not query after --end (#9340 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-15 16:02:41 +02:00
Julien Pivotto	1ea774f184	Merge pull request #9339 from roidelapluie/remove-double-align backfill: Do not align the start of the group since we align every rule.	2021-09-14 23:46:25 +02:00
Julien Pivotto	2bde71ec5f	Merge pull request #9338 from prometheus/release-2.30 merge back release 2.30	2021-09-14 23:46:11 +02:00
Julien Pivotto	691ce066fb	backfill: Do not align the start of the group since we align every rule. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-14 23:13:06 +02:00
jessicagreben	b0a21f9eab	rm overlap, add label builder to fix name bug Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-09-13 10:32:08 -07:00
Julien Pivotto	0111aa987e	Merge pull request #9312 from fpetkovski/promtool-analyze-compaction promtool: add extended flag for tsdb analysis	2021-09-08 17:27:01 +02:00
Julien Pivotto	48a101be1b	Allow to tune the scrape tolerance (#9283 ) * Allow to tune the scrape tolerance In most of the classic monitoring use cases, a few milliseconds difference can be omitted. In Prometheus, a few millisecond difference can however make a big difference. Currently, Prometheus will ignore up to 2 ms difference in the alignments. It turns out that for users who can afford a 10ms difference, there is a lot of resources and disk space to win, as shown in this graph, which shows the bytes / samples over a production Prometheus server. You can clearly see the switch from 2ms to 10ms tolerance. This pull request enables the adjustment of the scrape timestamp alignment tolerance. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix golint Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-08 17:27:33 +05:30
fpetkovski	449f874679	promtool: add extended flag for tsdb analysis The compaction analysis which runs under promtool tsdb analyze can be an intensive process which slows down the entire command. This commit adds an --extended flag to tsdb analyze which can be toggled for running long running tasks, such as compaction analysis. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	2021-09-08 10:50:01 +02:00
Julien Pivotto	ad642a85c0	Merge pull request #9304 from LeviHarrison/backfill-fix-date Rules backfill: fix new rule importer message	2021-09-07 18:01:03 +02:00
Julien Pivotto	bd24e2fb92	Merge pull request #9303 from LeviHarrison/backfill-return-1 Rules backfill: return 1 if unsuccessful	2021-09-07 18:00:42 +02:00
Levi Harrison	ded95ff434	Fix new rule importer message Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-09-06 22:19:29 -04:00
Levi Harrison	34e1b47968	Fixed error handling Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-09-06 21:55:57 -04:00
Holger Hans Peter Freyther	5edec40d60	promtool: Speed up checking for duplicate rules Trade space for speed. Convert all rules into our temporary struct, sort and then iterate. This is a significant speedup when having many rules. Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>	2021-09-06 23:10:26 +08:00
Holger Hans Peter Freyther	3a309c1ae5	promtool: Add simple checkDuplicates benchmark Add a simple benchmark with a large number of rules. Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>	2021-09-06 23:10:26 +08:00
Holger Hans Peter Freyther	794937b3d6	promtool: Add testcase for detecting duplicates Introduce a basic test for checking for duplicate rules. Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>	2021-09-06 23:10:26 +08:00
SuperQ	31f4108758	Add scrape_timeout_seconds metric Add a new built-in metric `scrape_timeout_seconds` to allow monitoring of the ratio of scrape duration to the scrape timeout. Hide behind a feature flag to avoid additional cardinality by default. Signed-off-by: SuperQ <superq@gmail.com>	2021-09-02 12:15:35 +02:00
SuperQ	e167a45c65	Add new Go build tags. Add new go:build comments based on 1.17 formatting[0]. [0]: https://golang.org/doc/go1.17#gofmt Signed-off-by: SuperQ <superq@gmail.com>	2021-08-27 10:24:14 +02:00
Julien Pivotto	cab96a06ef	Merge release 2.29 in main (#9196 ) * PromQL: Fix start and end keywords masking label and metric names This commit fixes an issue with the "at modifier" that introduced two new keywords: `start` and `end`. In grouping options and in metric names, these keywords took precedence over metric or label names, so that those metrics and labels could no longer be referenced. Signed-off-by: Clayton Peters <clayton.peters@man.com> * Add in additional tests for metrics and/or labels called start/end. Signed-off-by: Clayton Peters <clayton.peters@man.com> * : Cut 2.29.0-rc.0 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> VERSION: bump to 2.29.0-rc.0 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> * Remove experimental wording on size-based retention Followup of #9004 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix PR reference in changelog Signed-off-by: George Brighton <george@gebn.co.uk> * Describe EC2 availability zone IDs at most once per refresh (#9142) Signed-off-by: George Brighton <george@gebn.co.uk> * Describe EC2 availability zones at most once per SD load Closes #9142. Signed-off-by: George Brighton <george@gebn.co.uk> * Incorporate feedback Signed-off-by: George Brighton <george@gebn.co.uk> * Integrate feedback Signed-off-by: George Brighton <george@gebn.co.uk> * Add a compatibility note for macOS users. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * : Cut v2.29.0-rc.1 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> Fix `kuma_sd` targetgroup reporting (#9157) * Bundle all xDS targets into a single group Signed-off-by: austin ce <austin.cawley@gmail.com> * : cut v2.29.0-rc.2 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> Rename links Signed-off-by: Levi Harrison <git@leviharrison.dev> * bump codemirror-promql to 0.17.0 Signed-off-by: Augustin Husson <husson.augustin@gmail.com> * : cut v2.29.0 Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com> tsdb: align atomically accessed int64 (#9192) This prevents a panic in 32-bit archs: https://pkg.go.dev/sync/atomic#pkg-note-BUG Fixed #9190 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Release 2.29.1 (#9193) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> Co-authored-by: Clayton Peters <clayton.peters@man.com> Co-authored-by: Frederic Branczyk <fbranczyk@gmail.com> Co-authored-by: George Brighton <george@gebn.co.uk> Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com> Co-authored-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Augustin Husson <husson.augustin@gmail.com>	2021-08-12 18:38:06 +02:00
Ganesh Vernekar	095f572d4a	Sync sparsehistogram branch with main (#9189 ) * Fix `kuma_sd` targetgroup reporting (#9157) * Bundle all xDS targets into a single group Signed-off-by: austin ce <austin.cawley@gmail.com> * Snapshot in-memory chunks on shutdown for faster restarts (#7229) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Rename links Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove Individual Data Type Caps in Per-shard Buffering for Remote Write (#8921) * Moved everything to nPending buffer Signed-off-by: Levi Harrison <git@leviharrison.dev> * Simplify exemplar capacity addition Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added pre-allocation Signed-off-by: Levi Harrison <git@leviharrison.dev> * Don't allocate if not sending exemplars Signed-off-by: Levi Harrison <git@leviharrison.dev> * Avoid deadlock when processing duplicate series record (#9170) * Avoid deadlock when processing duplicate series record `processWALSamples()` needs to be able to send on its output channel before it can read the input channel, so reads to allow this in case the output channel is full. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * processWALSamples: update comment Previous text seems to relate to an earlier implementation. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Optimise WAL loading by removing extra map and caching min-time (#9160) * BenchmarkLoadWAL: close WAL after use So that goroutines are stopped and resources released Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * BenchmarkLoadWAL: make series IDs co-prime with #workers Series are distributed across workers by taking the modulus of the ID with the number of workers, so multiples of 100 are a poor choice. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * BenchmarkLoadWAL: simulate mmapped chunks Real Prometheus cuts chunks every 120 samples, then skips those samples when re-reading the WAL. Simulate this by creating a single mapped chunk for each series, since the max time is all the reader looks at. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Fix comment Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Remove series map from processWALSamples() The locks that is commented to reduce contention in are now sharded 32,000 ways, so won't be contended. Removing the map saves memory and goes just as fast. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * loadWAL: Cache the last mmapped chunk time So we can skip calling append() for samples it will reject. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Improvements from code review Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Full stops and capitals on comments Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Cache max time in both places mmappedChunks is updated Including refactor to extract function `setMMappedChunks`, to reduce code duplication. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Update head min/max time when mmapped chunks added This ensures we have the correct values if no WAL samples are added for that series. Note that `mSeries.maxTime()` was always `math.MinInt64` before, since that function doesn't consider mmapped chunks. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Split Go and React Tests (#8897) * Added go-ci and react-ci Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove search keymap from new expression editor (#9184) Signed-off-by: Julius Volz <julius.volz@gmail.com> Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com> Co-authored-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Co-authored-by: Bryan Boreham <bjboreham@gmail.com> Co-authored-by: Julius Volz <julius.volz@gmail.com>	2021-08-11 15:43:17 +05:30
Ganesh Vernekar	ee7e0071d1	Snapshot in-memory chunks on shutdown for faster restarts (#7229 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-06 17:51:01 +01:00
Ganesh Vernekar	8b70e87ab9	Merge remote-tracking branch 'upstream/main' into sparse-refactor Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-05 12:16:08 +05:30
jinglina	ed24e51e7c	remove redundant type conversion (#9126 ) Signed-off-by: jinglina <jinglinax@163.com>	2021-07-28 13:33:46 +05:30
Julien Pivotto	04f33e88f7	Merge pull request #9121 from LeviHarrison/revert-klog-fix Revert klog fix	2021-07-27 14:07:59 +02:00
Levi Harrison	58556c19be	Revert "Fix logging after the move to go-kit/log (#9021 )" This reverts commit `642722e5d0`. Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-07-27 07:37:03 -04:00
Ganesh Vernekar	507d61fdeb	Remove experimental tag on `--storage.tsdb.allow-overlapping-blocks` (#9117 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-27 14:38:20 +05:30
Martin Disibio	1bcd13d6b5	Exemplar resize (#8974 ) * Create experimental circular buffer resize method, benchmarks Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Optimize exemplar resize to only replay as many exemplars as needed Signed-off-by: Martin Disibio <mdisibio@gmail.com> * More comments, benchmark AddExemplar Signed-off-by: Martin Disibio <mdisibio@gmail.com> * optimizations Signed-off-by: Martin Disibio <mdisibio@gmail.com> * comment Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Slight refactor of resize benchmark + make use of resize via runtime reloadable storage config. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Some more config related changes. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address some review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address more review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Refactor to remove usage of noopExemplarStorage and avoid race condition when resizing from Head code. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix or add comments to clarify some of the new behaviour. Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix potential panics related to negative exemplar buffer lengths Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Callum Styan <callumstyan@gmail.com>	2021-07-20 10:22:57 +05:30
Levi Harrison	3b5257d869	Changed disabled_features to feature_flags Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-07-13 22:03:51 -04:00
Ganesh Vernekar	78d68d5972	Make query_range serve histograms (#9036 ) * Modify query_range to serve only sparse histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Finish CumulativeExpandSparseHistogram for positive schema Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix bug and comment out tests for query_range Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint 2 Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-03 19:23:56 +05:30
Filip Petkovski	7c125aa5fb	Promtool: Add support for compaction analysis (#8940 ) * Extend promtool to support compaction analysis This commit extends the promtool tsdb analyze command to help troubleshoot high Prometheus disk usage. The command now plots a distribution of how full chunks are relative to the maximum capacity of 120 samples per chunk. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com> * Update cmd/promtool/tsdb.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-07-02 11:08:52 +01:00
Julius Volz	441e6cd7d6	Merge release-2.28 back into main (#9035 ) * Cut v2.28.0-rc.0 (#8954) * Cut v2.28.0-rc.0 Signed-off-by: Julius Volz <julius.volz@gmail.com> * Changelog fixup Signed-off-by: Julius Volz <julius.volz@gmail.com> * Address review comments Signed-off-by: Julius Volz <julius.volz@gmail.com> * Downgrade some features to enhancements Signed-off-by: Julius Volz <julius.volz@gmail.com> * Adjust release date to today Signed-off-by: Julius Volz <julius.volz@gmail.com> * Migrate HTTP SD docs from docs repo (#8972) See discussion in https://github.com/prometheus/docs/pull/1975 Signed-off-by: Julius Volz <julius.volz@gmail.com> * Cut Prometheus v2.28.0 (#8973) Signed-off-by: Julius Volz <julius.volz@gmail.com> * HTTP SD: Allow charset in content type (#8981) * Added content type regex Signed-off-by: Levi Harrison <git@leviharrison.dev> Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * fixed disappeared target groups in http_sd #9019 Signed-off-by: servak <fservak@gmail.com> * Add a testcase for http-sd Signed-off-by: servak <fservak@gmail.com> * HTTP SD: Simplify logic of disappeared targetgroups (#9026) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix logging after the move to go-kit/log (#9021) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Cut Prometheus v2.28.1 (#9034) Signed-off-by: Julius Volz <julius.volz@gmail.com> Co-authored-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Co-authored-by: servak <fservak@gmail.com>	2021-07-01 18:02:13 +02:00
Levi Harrison	90976e7505	Promtool: Add feature flags to unit tests (#8958 ) * Added feature flag support to unit tests Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added/fixed tests Signed-off-by: Levi Harrison <git@leviharrison.dev> * Addressed review comments Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-30 22:43:39 +01:00
Ankit Goel	d437cee73a	Move storage.tsdb.retention.size out of experimental #8728 (#9004 ) * Move storage.tsdb.retention.size out of experimental #8728 Signed-off-by: Ankit Goel <ankit.goel@deliveryhero.com>	2021-06-30 01:30:11 +02:00
Levi Harrison	ca1896c15b	Promtool: Validate service discovery files (#8950 ) * Check SD files in promtool Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-29 17:32:59 +02:00
Ganesh Vernekar	04ad56d9b8	Append sparse histograms into the Head block (#9013 ) * Append sparse histograms into the Head block Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add AtHistogram() to Iterator interface. Make HistoChunk conform to Chunk interface. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-29 20:08:46 +05:30
Steve Kuznetsov	fd6c852567	promtool: backfill: allow configuring block duration (#8919 ) * promtool: backfill: allow configuring block duration When backfilling large amounts of data across long periods of time, it may in certain circumstances be useful to use a longer block duration to increase the efficiency and speed of the backfilling process. This patch adds a flag --block-duration-power to allow a user to choose the power N where the block duration is 2^(N+1)h. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com> * promtool: use sub-tests in backfill testing Signed-off-by: Steve Kuznetsov <skuznets@redhat.com> * backfill: add messages to tests for clarity When someone new breaks a test, seeing "expected: false, got: true" is really not useful. A nice message helps here. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com> * backfill: test long block durations A test that uses a long block duration to write bigger blocks is added. The check to make sure all blocks are the default duration is removed. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>	2021-06-29 14:53:38 +05:30
Ganesh Vernekar	64bea6999e	HistogramAppender interface for sparse histograms (#9007 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-28 20:30:55 +05:30
Ben Kochie	7cb55d5732	Merge pull request #8802 from mwasilew2/yaml-linting Adds yamllinting to Makefile.common	2021-06-24 15:59:35 +02:00
Julien Pivotto	ba76bceb6b	Merge pull request #8917 from stevekuznetsov/skuznets/silence-backfill promtool: backfill: allow silencing output	2021-06-14 23:27:18 +02:00
Michal Wasilewski	3f686cad8b	fixes yamllint errors Signed-off-by: Michal Wasilewski <mwasilewski@gmx.com>	2021-06-12 12:47:47 +02:00
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-11 12:28:36 -04:00
Steve Kuznetsov	ee771a2a66	promtool: backfill: allow silencing output When using the backfill command to add data to an ephemeral/test Prometheus instance, it is not important to see which data was added as it is often generated ahead of time and mostly irrelevant to the use-case. The current approach prints information about each block that is written, but does so in a generally inefficient and costly manner. This patch adds a `--quiet` flag that allows a user to opt out of this behavior. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>	2021-06-10 15:31:16 -07:00
Levi Harrison	7bc11dcb06	React UI: Add Starting Screen (#8662 ) * Added walreplay API endpoint Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added starting page to react-ui Signed-off-by: Levi Harrison <git@leviharrison.dev> * Documented the new endpoint Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed typos Signed-off-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julius Volz <julius.volz@gmail.com> * Removed logo Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed isResponding to isUnexpected Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed width of progress bar Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed width of progress bar Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added DB stats object Signed-off-by: Levi Harrison <git@leviharrison.dev> * Updated starting page to work with new fields Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (pt. 2) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (pt. 3) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (and also implementing a method this time) (pt. 4) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (and also implementing a method this time) (pt. 5) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed const to let Signed-off-by: Levi Harrison <git@leviharrison.dev> * Passing nil (pt. 6) Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove SetStats method Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added comma Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed api Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed to triple equals Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed data response types Signed-off-by: Levi Harrison <git@leviharrison.dev> * Don't return pointer Signed-off-by: Levi Harrison <git@leviharrison.dev> * Changed version Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed interface issue Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed pointer Signed-off-by: Levi Harrison <git@leviharrison.dev> * Fixed copying lock value error Signed-off-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julius Volz <julius.volz@gmail.com>	2021-06-05 15:29:32 +01:00
Levi Harrison	17ea8d006a	Added external URL access Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-05-30 23:35:26 -04:00
Bartlomiej Plotka	80545bfb2e	Instrumented circular exemplar storage. (#8712 ) * Instrumented circular storage. Fixes: https://github.com/prometheus/prometheus/issues/8708 Fixes: https://github.com/prometheus/prometheus/issues/8707 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Fixed CB. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed Julien comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed Callum comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2021-04-16 13:44:53 +01:00
nberkley	f9e2dd0697	Add support for smaller block chunk segment allocations (#8478 ) * Add support for --storage.tsdb.max-chunk-size to support small chunks for space limited prometheus instances. Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update tsdb/compact.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update tsdb/db.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update cmd/prometheus/main.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Change naming scheme to Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Add a lower bound to --storage.tsdb.max-block-chunk-segment-size Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Update storage.md to explain what a chunk segment is Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Apply suggestions from code review Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Force tests Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> * Fix code style Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>	2021-04-15 14:25:01 +05:30
Julien Pivotto	ae73a6296a	Merge pull request #8683 from cuirunxing-hub/main typos correct	2021-04-02 20:14:55 +02:00
cuirunxing-hub	57bc2e94e2	typos correct Signed-off-by: cuirunxing-hub <cuirunxing@inspur.com>	2021-04-02 09:03:00 +08:00
Jess G	731545ad34	Add documentation for recording rule backfiller (#8674 ) * add docs for rule backfiller Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-04-01 22:38:00 +02:00
Julien Pivotto	e635ca834b	Add environment variable expansion in external label values Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-03-30 01:36:28 +02:00
Björn Rabenstein	9549a15c6f	Merge pull request #7675 from JessicaGreben/jg/11-retroactive-rule-eval Add rule importer to backfill	2021-03-29 19:09:21 +02:00
jessicagreben	896c828bb5	close writer after flush Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-29 06:45:12 -07:00
jessicagreben	d89a1d999f	add log with start/end times, close blocks before end of func Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-28 12:13:58 -07:00
Ben Kochie	f0bccba1c3	Update Go modules for 2.26 (#8636 ) * Update Go modules for 2.26 Bump all Go modules to the latest upstream. Signed-off-by: Ben Kochie <superq@gmail.com> * Fix promtool for new client_golang LabelValues now requires a list of string matchers. Signed-off-by: Ben Kochie <superq@gmail.com>	2021-03-24 09:41:12 +00:00
Julien Pivotto	c0c36b1155	Improve promql-negative-offset docs (#8631 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-03-22 10:16:43 +01:00
jessicagreben	8de4da3716	add changes per comments, fix tests Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-20 12:38:30 -07:00
Callum Styan	289ba11b79	Add circular in-memory exemplars storage (#6635 ) * Add circular in-memory exemplars storage Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com> Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Signed-off-by: Martin Disibio <mdisibio@gmail.com> Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> Co-authored-by: Martin Disibio <mdisibio@gmail.com> * Fix some comments, clean up exemplar metrics struct and exemplar tests. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix exemplar query api null vs empty array issue. Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> Co-authored-by: Martin Disibio <mdisibio@gmail.com>	2021-03-16 15:17:45 +05:30
jessicagreben	e3a8132bb3	fix block alignment, add sample alignment Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-15 12:44:58 -07:00
jessicagreben	7c26642460	add block alignment and write in 2 hr blocks Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-14 10:10:55 -07:00
Julien Pivotto	63ea88af82	Merge pull request #8575 from pfreixes/add-scrapes-parameter Add num scrapes as tsdb write benchmark command flag	2021-03-11 13:09:50 +01:00
Pau Freixes	b1ac4a45e6	Add num scrapes as tsdb write benchmark command flag By default same value that was hardcoded is used, but with the new flag added the number of scrapes can be increased to any value. Signed-off-by: Pau Freixes <pfreixes@gmail.com>	2021-03-10 11:17:07 +01:00
Julien Pivotto	ad5ed416ba	Merge pull request #8487 from pschou/dev_neg_offset allow negative offset	2021-03-08 22:18:45 +01:00
Julien Pivotto	5742a18590	Fix subqueries with default resolution in promql unit tests Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-03-07 09:20:04 +01:00
jessicagreben	9fc53b7edf	fix appender.Add -> appender.Append Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-03-01 05:49:49 -08:00
Arthur Silva Sens	537c0aff49	Prometheus and Promtool binaries now print help and usage to stdout (#8542 ) Signed-off-by: ArthurSens <arthursens2005@gmail.com>	2021-02-25 19:52:34 +01:00
jessicagreben	78e84aed89	resolve merge conflict Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-02-24 09:47:29 -08:00
jessicagreben	f2db9dc722	add multi rule integration tests Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-02-24 09:42:31 -08:00
pschou	f80b52be69	Merge branch 'main' into dev_neg_offset	2021-02-23 20:52:57 -05:00
schou	22cd48868a	adding feature flag, promql-negative-offset Signed-off-by: schou <pschou@users.noreply.github.com>	2021-02-23 20:25:56 -05:00
Julien Pivotto	8c8de46003	Merge pull request #8036 from dgl/promtool-alert-err promtool: Don't end alert tests early, in some failure situations	2021-02-20 22:35:00 +01:00
Tom Wilkie	7369561305	Combine Appender.Add and AddFast into a single Append method. (#8489 ) This moves the label lookup into TSDB, whilst still keeping the cached-ref optimisation for repeated Appends. This makes the API easier to consume and implement. In particular this change is motivated by the scrape-time-aggregation work, which I don't think is possible to implement without it as it needs access to label values. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2021-02-18 17:37:00 +05:30
Julien Pivotto	1fac1c783b	Merge pull request #8504 from rbauduin/require_alertname promtool: alert_rule_test items require alertname	2021-02-17 22:07:52 +01:00
Julien Pivotto	2d172d0896	Merge pull request #8508 from prometheus/release-2.25 Merge back release 2.25	2021-02-17 16:26:34 +01:00
Raphael Bauduin	a7d64cad21	promtool: alert_rule_test items require alertname Accepting alert_rule_test without alertname is confusing as it will always pass with empty exp_alerts, and never with non-empty exp_alerts. Signed-off-by: Raphael Bauduin <raphael.bauduin@tessares.net>	2021-02-17 16:23:12 +01:00
Ganesh Vernekar	c4536fa28c	Increase block writer size for backfilling Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2021-02-17 15:45:41 +05:30
Julien Pivotto	a419b75abd	Merge pull request #8485 from hryniuk/promtool-query-errors-details Print details of API errors received by promtool	2021-02-16 22:47:08 +01:00
Łukasz Hryniuk	ab41de68b4	Print details of API errors Signed-off-by: Łukasz Hryniuk <code@hryniuk.pl>	2021-02-15 23:42:06 +01:00
David Leadbeater	3e30f72af1	promtool: Add more negative alert tests Signed-off-by: David Leadbeater <dgl@dgl.cx>	2021-02-15 17:00:49 +00:00
Julien Pivotto	e29b47b39e	Merge pull request #8440 from mishamo/master Add optional name property to testgroup for better test failure output	2021-02-09 21:23:24 +01:00
misha	1c3e7b4241	Use strings.Builder for neater error formatting Signed-off-by: misha <DL-OTTCloudPlatform-Nova@bskyb.internal>	2021-02-09 15:00:26 +00:00
Tom Wilkie	d479151f1f	Various enhancements and refactorings for remote write receiver: - Remove unrelated changes - Refactor code out of the API module - that is already getting pretty crowded. - Don't track reference for AddFast in remote write. This has the potential to consume unlimited server-side memory if a malicious client pushes a different label set for every series. For now, it's easier and safer to always use the 'slow' path. - Return 400 on out of order samples. - Use remote.DecodeWriteRequest in the remote write adapters. - Put this behind the 'remote-write-server' feature flag - Add some (very) basic docs. - Used named return & add test for commit error propagation Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2021-02-08 20:41:23 +00:00
fuling	72475b8a0c	[ENHANCEMENT] remote storage:Add default api implementation of remote write Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>	2021-02-07 18:12:48 +00:00
misha	c2c5aeb16b	Add optional name property to testgroup for better test failure output Signed-off-by: misha <DL-OTTCloudPlatform-Nova@bskyb.internal>	2021-02-04 10:07:22 +00:00
Julien Pivotto	c1f8bd9944	Merge pull request #8432 from roidelapluie/backfillpanic backfill: move checkErr before we close the mmaped file	2021-02-03 16:32:35 +01:00
Julien Pivotto	9334269f2b	backfill: move checkErr before we close the mmaped file When printing the error, we still need access to the mmapped byte array of the file. Therefore, we make sure that we run it before closing the file. I could have done something more complex like a defer, or not closing the file, knowing that we would exit the program anyway. However, I think that in case we extend this in the future, or this is copy/pasted elsewhere, we should continue closing the file. As it is small enough, I went for the solution to call the function 3 times instead of playing with a defer. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-02-01 21:18:42 +01:00
Jeremy Albinet	4a1f2c097e	Typo on plural in checkRules/checkDuplicates Signed-off-by: Jeremy Albinet <jalbinet@synthesio.com>	2021-02-01 15:43:05 +01:00
Julien Pivotto	2316062d4e	Deprecate --alertmanager.timeout Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-01-25 12:36:13 +01:00
Ganesh Vernekar	9199fcb8d1	'@ <timestamp>' modifier (#8121 ) This commit adds `@ <timestamp>` modifier as per this design doc: https://docs.google.com/document/d/1uSbD3T2beM-iX4-Hp7V074bzBRiRNlqUdcWP6JTDQSs/edit. An example query: ``` rate(process_cpu_seconds_total[1m]) and topk(7, rate(process_cpu_seconds_total[1h] @ 1234)) ``` which ranks based on last 1h rate and w.r.t. unix timestamp 1234 but actually plots the 1m rate. Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2021-01-20 16:27:39 +05:30
Julien Pivotto	ac2626757c	Update exporter-toolkit to 0.5.0 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-01-13 21:49:54 +01:00
Guangwen Feng	2df1a482da	Fix misspelled word in comment (#8348 ) Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>	2021-01-07 10:01:08 +00:00
Julien Pivotto	bc9f9ee3aa	Backfilling: fast-path for non-consecutive blocks (#8324 ) * Backfilling: optimize for non-consecutive blocks When you have missing data for > 2 hours, you spend a lot of time re-reading the complete file. It is not optimal. This introduces a fastpath for this scenario. Next, we do parse the metric even when we know we will not use it, based on its timestamp. This only computes the metric when we know its timestamp is right. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-30 02:06:41 +01:00
Julien Pivotto	003d6451fc	Promtool: add web config validation Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-29 16:55:29 +01:00
Julien Pivotto	5b4f46a348	Add TLS and basic authentication Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-28 21:33:44 +01:00
Ben Kochie	5055dfbbe4	Listen on web early in startup Avoid starting up components like the TSDB if we can't bind to the web listening port. Signed-off-by: Ben Kochie <superq@gmail.com>	2020-12-28 20:13:05 +01:00
beorn7	6bfa33308e	promtool: Print block meta-data slightly more nicely I initially thought I could somehow rescue the current column layout by recycling the tabwriter, but flushing completely blanks it. However, by setting a minimum width of 13, we get a slightly broader DURATION column but otherwise nice formatting, unless numbers get really big, but that's OK, I guess. Before: ``` BLOCK ULID MIN TIME MAX TIME DURATION NUM SAMPLES NUM CHUNKS NUM SERIES SIZE 01ETN0KGNP5WWK9T5QMQGBG9F1 2020-11-19 07:39:17 +0000 UTC 2020-11-19 07:44:17 +0000 UTC 5m0.001s 8 2 2 624B 01ETN0KGQSFF0AB2QDZVQG3CWC 2020-11-19 10:25:57 +0000 UTC 2020-11-19 10:30:57 +0000 UTC 5m0.001s 8 2 2 622B 01ETN0KGSW8KYP3YPG4X20P60Z 2020-11-19 13:12:37 +0000 UTC 2020-11-19 13:17:37 +0000 UTC 5m0.001s 8 2 2 625B ``` After: ``` BLOCK ULID MIN TIME MAX TIME DURATION NUM SAMPLES NUM CHUNKS NUM SERIES SIZE 01ETN0R72SXN9A1FG732P7KFFN 2020-11-19 07:39:17 +0000 UTC 2020-11-19 07:44:17 +0000 UTC 5m0.001s 8 2 2 624B 01ETN0R74Y9AG1A1MKN4MZK7WM 2020-11-19 10:25:57 +0000 UTC 2020-11-19 10:30:57 +0000 UTC 5m0.001s 8 2 2 622B 01ETN0R76KXZ5VQECMDNES49J6 2020-11-19 13:12:37 +0000 UTC 2020-11-19 13:17:37 +0000 UTC 5m0.001s 8 2 2 625B ``` After without the `-r` flag: ``` BLOCK ULID MIN TIME MAX TIME DURATION NUM SAMPLES NUM CHUNKS NUM SERIES SIZE 01ETN0RFFJ42274NWR1GH0RTV6 1605771557000 1605771857001 5m0.001s 8 2 2 624 01ETN0RFJ1MZCHHS2SBZS8XC27 1605781557000 1605781857001 5m0.001s 8 2 2 622 01ETN0RFM98N3V4KD2DZXFGHGN 1605791557000 1605791857001 5m0.001s 8 2 2 625 ``` Signed-off-by: beorn7 <beorn@grafana.com>	2020-12-28 16:55:12 +01:00
beorn7	651b57b9ab	Merge branch 'backfillhr' of git://github.com/roidelapluie/prometheus into review	2020-12-28 16:18:00 +01:00
yeya24	cedd2dbec9	create output directory before backfilling Signed-off-by: yeya24 <yb532204897@gmail.com>	2020-12-24 23:36:36 -05:00
Julien Pivotto	53480c168d	Backfill: print created blocks only, add human-readable option Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-23 20:42:30 +01:00
AdaephonBen	dca6954b0a	promtool: Add URL scheme when not provided (#7956 ) Signed-off-by: AdaephonBen <ma18btech11011@iith.ac.in>	2020-12-23 19:52:04 +01:00
lzhfromustc	27a6e1e174	test: add buffer to channel to avoid goroutine leak (#8274 ) Signed-off-by: lzhfromustc <lzhfromustc@gmail.com>	2020-12-10 09:09:21 +00:00
Julien Pivotto	7957731339	Inline defer Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-09 09:23:39 +01:00
Julien Pivotto	82b5f1d8b1	Backfill: Use mmap to reuse parser code Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-12-08 23:48:31 +01:00
jessicagreben	e32e4fcc53	fix unit test Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-30 11:02:45 -08:00
jessicagreben	cec3515fa3	fix linter Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-30 08:17:51 -08:00
jessicagreben	2e9946e4d7	add test Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-28 07:58:33 -08:00
jessicagreben	ac06d0a657	merge master/resolve conflict Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-26 08:43:07 -08:00
jessicagreben	ee85c22adb	flush samples to disk every 5k samples Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-26 08:30:06 -08:00
Atibhi Agrawal	b317b6ab9c	Backfill from OpenMetrics format (#8084 ) * get parser working Signed-off-by: aSquare14 <atibhi.a@gmail.com> * import file created Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Find min and max ts Signed-off-by: aSquare14 <atibhi.a@gmail.com> * make two passes over file and write to tsdb Signed-off-by: aSquare14 <atibhi.a@gmail.com> * print error messages Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix Max and Min initializer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Start with unit tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * reset file read Signed-off-by: aSquare14 <atibhi.a@gmail.com> * align blocks to two hour range Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add cleanup test Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove .ds_store Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add license to import_test Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix Circle CI error Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Refactor code Move backfill from tsdb to promtool directory Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix gitignore Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Remove panic Rename ContenType Signed-off-by: aSquare14 <atibhi.a@gmail.com> * adjust mint Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix return statement Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix go modules Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Added unit test for backfill Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix CI error Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix file handling Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Close DB Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Close directory Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Error Handling Signed-off-by: aSquare14 <atibhi.a@gmail.com> * inline err Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix command line flags Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add spaces before func fix pointers 
Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add defer'd calls Signed-off-by: aSquare14 <atibhi.a@gmail.com> * move openmetrics.go content to backfill Signed-off-by: aSquare14 <atibhi.a@gmail.com> * changed args to flags Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add tests for wrong OM files Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Added additional tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add comment to warn of func reuse Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Make input required in main.go Signed-off-by: aSquare14 <atibhi.a@gmail.com> * defer blockwriter close Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix defer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * defer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Remove contentType Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove defer from backfilltest Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix defer remove in backfill_test Signed-off-by: aSquare14 <atibhi.a@gmail.com> * changes to fix CI errors Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix go.mod Signed-off-by: aSquare14 <atibhi.a@gmail.com> * change package name Signed-off-by: aSquare14 <atibhi.a@gmail.com> * assert->require Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove todo Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix format Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix todo Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix createblock Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix defer Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix return Signed-off-by: aSquare14 <atibhi.a@gmail.com> * check err for anon func Signed-off-by: aSquare14 <atibhi.a@gmail.com> * change comments Signed-off-by: aSquare14 <atibhi.a@gmail.com> * update comment Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix for the Flush Bug Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix formatting, comments, names Signed-off-by: aSquare14 
<atibhi.a@gmail.com> * Print Blocks Signed-off-by: aSquare14 <atibhi.a@gmail.com> * cleanup Signed-off-by: aSquare14 <atibhi.a@gmail.com> * refactor test to take care of multiple samples Signed-off-by: aSquare14 <atibhi.a@gmail.com> * refactor tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove om Signed-off-by: aSquare14 <atibhi.a@gmail.com> * I dont know what I fixed Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix tests Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fix tests, add test description, print blocks Signed-off-by: aSquare14 <atibhi.a@gmail.com> * commit after 5000 samples Signed-off-by: aSquare14 <atibhi.a@gmail.com> * reviews part 1 Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Series Count Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix CI Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove extra func Signed-off-by: aSquare14 <atibhi.a@gmail.com> * make timestamp into sec Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Reviews 2 Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add Todo Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Fixes Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fixes reviews Signed-off-by: aSquare14 <atibhi.a@gmail.com> * =0 Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove backfill.om Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add global err var, remove stuff Signed-off-by: aSquare14 <atibhi.a@gmail.com> * change var name Signed-off-by: aSquare14 <atibhi.a@gmail.com> * sampleLimit pass as parameter Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Add test when number of samples greater than batch size Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Change name of batchsize Signed-off-by: aSquare14 <atibhi.a@gmail.com> * revert export Signed-off-by: aSquare14 <atibhi.a@gmail.com> * nits Signed-off-by: aSquare14 <atibhi.a@gmail.com> * remove Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add comment, remove newline,consistent err Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Print Blocks 
Signed-off-by: aSquare14 <atibhi.a@gmail.com> * Modify comments Signed-off-by: aSquare14 <atibhi.a@gmail.com> * db.Querier Signed-off-by: aSquare14 <atibhi.a@gmail.com> * add sanity check , get maxt and mint Signed-off-by: aSquare14 <atibhi.a@gmail.com> * ci error Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix Signed-off-by: aSquare14 <atibhi.a@gmail.com> * comment change Signed-off-by: aSquare14 <atibhi.a@gmail.com> * nits Signed-off-by: aSquare14 <atibhi.a@gmail.com> * NoError Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix Signed-off-by: aSquare14 <atibhi.a@gmail.com> * fix Signed-off-by: aSquare14 <atibhi.a@gmail.com>	2020-11-26 10:37:06 +05:30
jessicagreben	5dd3577424	change name of promtool subcommand to create-blocks-from Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-22 15:05:02 -08:00
jessicagreben	19dee0a569	add name and labels to metric, eval all rules for each block Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-22 14:24:38 -08:00
gotjosh	4eca4dffb8	Allow metric metadata to be propagated via Remote Write. (#6815 ) * Introduce a metadata watcher Similarly to the WAL watcher, its purpose is to observe the scrape manager and pull metadata. Then, send it to a remote storage. Signed-off-by: gotjosh <josue@grafana.com> * Additional fixes after rebasing. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Rework samples/metadata metrics. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Use more descriptive variable names in MetadataWatcher collect. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix issues caused during rebasing. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix missing metric add and unneeded config code. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address some review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix metrics and docs Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Replace assert with require Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Bring back max_samples_per_send metric Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Fix tests Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Co-authored-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2020-11-19 20:53:03 +05:30
jessicagreben	75654715d3	fix panics Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-01 07:54:04 -08:00
jessicagreben	61c9a89120	use milliseconds for blocksize Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-10-31 07:11:54 -07:00
jessicagreben	6980bcf671	unexport backfiller Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-10-31 06:40:56 -07:00
jessicagreben	3ed6457dd4	use blockwriter, rm multiwriter code Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-10-31 06:32:07 -07:00
Julien Pivotto	6c56a1faaa	Testify: move to require (#8122 ) * Testify: move to require Moving testify to require to fail tests early in case of errors. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * More moves Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-29 09:43:23 +00:00
Bartlomiej Plotka	3d8826a3d4	MultiError: Refactored MultiError for more concise and safe usage. (#8066 ) * MultiError: Refactored MultiError for more concise and safe usage. * Less lines * Goland IDE was marking every usage of old MultiError "potential nil" error * It was easy to forgot using Err() when error was returned, now it's safely assured on compile time. NOTE: Potentially I would rename package to merrors. (: In different PR. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed review comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed comments. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Fix after rebase. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-10-28 15:24:58 +00:00
Julien Pivotto	1282d1b39c	Refactor test assertions (#8110 ) * Refactor test assertions This pull request gets rid of assert.True where possible to use fine-grained assertions. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-27 11:06:53 +01:00
David Leadbeater	e7e60623ff	promtool: Calculate mint and maxt per test (#8096 ) * promtool: Calculate mint and maxt per test Previously a single test that used a later eval time would make all other tests in the file share the [mint, maxt] and potentially evaluate far more samples than needed. Fixes: #8019 Signed-off-by: David Leadbeater <dgl@dgl.cx>	2020-10-24 12:03:55 +01:00
Julien Pivotto	4e5b1722b3	Move away from testutil, refactor imports (#8087 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-22 11:00:08 +02:00
jessicagreben	36ac0b68f1	merge master, fix conflicts	2020-10-17 08:20:21 -07:00
Björn Rabenstein	71577e45eb	Merge pull request #8044 from prometheus/beorn7/metrics Instrumentation: Report valid configs in the respective metrics from the beginning	2020-10-12 23:32:02 +02:00
Arthur Silva Sens	4f45e201cc	Promtool tsdb list now prints block sizes (#7993 ) * promtool tsdb list now prints blocks' size Signed-off-by: arthursens <arthursens2005@gmail.com>	2020-10-12 23:15:40 +02:00
beorn7	0f3c1bf6cf	Report valid configs in the respective metrics from the beginning In #7399, an early validity check of the config was introduced to prevent the scenario where an invalid config is only detected after a possibly very long startup procedure. However, the respective success metrics are not updated after the initial validation so that the success metrics suggest an invalid config. If the startup procedure, like replaying the WAL, really takes very long, alerts about invalid config will trigger. This commit sets the success metrics after initial validation. They will be set again after the "real" config (re-)load, but that shouldn't be a problem. The metric now truthfully represents whenever the config was successfully loaded, no matter if the result was then thrown away (because it was just for validation) or actually used. Signed-off-by: beorn7 <beorn@grafana.com>	2020-10-12 21:30:59 +02:00
David Leadbeater	5393ec22cb	promtool: Don't end alert tests early, in some failure situations If an alert test had a failing test, then any other alert test interval specified after that point would result in the test exiting early. This made debugging some tests more difficult than needed. Now only exit early for evaluation failures. Signed-off-by: David Leadbeater <dgl@dgl.cx>	2020-10-09 12:59:59 +01:00


745 Commits (e0a00f45db839a7f2f1e83895a815f74b5706e9a)