This commit adds a `--syntax-only` flag for `promtool check config`.
When this flag is passed, promtool omits various file existence
checks that would otherwise cause the check to fail (e.g. the check does
not fail if `rule_files` files don't exist at their respective paths).
This functionality will allow CI systems to check the syntax of
configs without worrying about referenced files.
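
As a rough illustration of the behaviour described above (not the actual
promtool code), a file check gated by a syntax-only switch could look
like the sketch below. The `checkFileExists` helper and its parameters
are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"os"
)

// checkFileExists is a hypothetical helper. It returns an error when path
// does not exist, unless syntaxOnly is set, in which case the existence
// check is skipped entirely.
func checkFileExists(path string, syntaxOnly bool) error {
	if syntaxOnly || path == "" {
		return nil
	}
	if _, err := os.Stat(path); err != nil {
		return fmt.Errorf("error checking file %q: %w", path, err)
	}
	return nil
}

func main() {
	missing := "/nonexistent-dir-for-example/first.rules.yml"
	// With syntaxOnly set, the missing file is not reported.
	fmt.Println("syntax-only error:", checkFileExists(missing, true))
	// Without it, the missing file makes the check fail.
	fmt.Println("full check error:", checkFileExists(missing, false))
}
```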
Fixes: #5222
Signed-off-by: zzehring <zack.zehring@grafana.com>
The promtool check config command still uses the bearer_token_file
field which is deprecated in favour of authorization.credentials_file.
This commit modifies the command to use the new field instead.
Fixes #9874
Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
* share tsdb db locker code with agent
Closes #9616
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* add flag to disable lockfile for agent
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* use agentOnlySetting instead of PreAction
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* tsdb: address review feedback
1. Rename Locker to DirLocker
2. Move DirLocker to tsdb/tsdbutil
3. Name metric using fmt.Sprintf
4. Refine error checking in DirLocker test
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* tsdb: create test utilities to assert expected DirLocker behavior
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* tsdb/tsdbutil: fix lint errors
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* tsdb/agent: fix windows test failure
Use new DB variable instead of overriding the old one.
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Formatting of error message is missing a space after ':'.
* t.Fatalf should be used instead of t.Errorf+return.
Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
PR #9618 made loading the config file fail when agent mode is
configured to run with unsupported settings. This turned the block that
logs a warning about such a configuration into a no-op, so it is now
removed.
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
Between the tests. This enables parallelizing those tests, which should
cut the test execution time.
Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
* TSDB: demystify seriesRefs and ChunkRefs
The TSDB package contains many types of series and chunk references,
all shrouded in uint types. Often the same uint value may
actually mean one of several different types, in non-obvious ways.
This PR aims to clarify the code and make it much quicker to navigate
to the relevant docs, usages, etc.
Concretely:
* Use appropriately named types and document their semantics and
relations.
* Make multiplexing and demuxing of types explicit
(on the boundaries between concrete implementations and generic
interfaces).
* Casting between different types should be free. None of the changes
should have any impact on how the code runs.
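
To make the points above concrete, here is a minimal sketch of named
reference types with explicit packing and unpacking. The type names, the
24-bit split and the helper functions are assumptions made for this
example and are not taken from the TSDB code.

```go
package main

import "fmt"

// seriesRef and chunkRef are illustrative named types. Both are plain
// uint64 values underneath, so converting between them costs nothing at
// run time, but the compiler now rejects accidental mix-ups.
type (
	seriesRef uint64
	chunkRef  uint64
)

// chunkIDBits is an assumed layout: the low 24 bits of a chunkRef hold the
// chunk ID and the remaining high bits hold the series reference.
const (
	chunkIDBits = 24
	chunkIDMask = 1<<chunkIDBits - 1
)

// newChunkRef multiplexes a series reference and a chunk ID into one value.
func newChunkRef(s seriesRef, chunkID uint64) chunkRef {
	return chunkRef(uint64(s)<<chunkIDBits | chunkID&chunkIDMask)
}

// unpack demultiplexes a chunkRef back into its two components.
func (r chunkRef) unpack() (seriesRef, uint64) {
	return seriesRef(r >> chunkIDBits), uint64(r) & chunkIDMask
}

func main() {
	ref := newChunkRef(7, 3)
	s, id := ref.unpack()
	fmt.Println(s, id)
}
```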
TODO: Implement BlockSeriesRef where appropriate (for a future PR)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* feedback
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* agent: demystify seriesRefs and ChunkRefs
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Using the same variable for storage.tsdb.path and storage.agent.path
as below in main.go causes cfg.localStoragePath to be data/ or
data-agent/ at random.
    a.Flag("storage.tsdb.path", "Base path for metrics storage.").
        PreAction(serverOnlySetting()).
        Default("data/").StringVar(&cfg.localStoragePath)
    a.Flag("storage.agent.path", "Base path for metrics storage.").
        PreAction(agentOnlySetting()).
        Default("data-agent/").StringVar(&cfg.localStoragePath)
This patch fixes it by using a different variable for storage.agent.path
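
The shape of the fix can be sketched as follows. This example uses the
standard library `flag` package as a stand-in for the flag library used
in main.go, and the `config` field names and `parseStoragePath` function
are hypothetical. Each flag is bound to its own variable, and the mode
decides which one is read.

```go
package main

import (
	"flag"
	"fmt"
)

// config keeps one field per flag so that neither default can overwrite
// the other. The field names are illustrative.
type config struct {
	serverStoragePath string
	agentStoragePath  string
}

// parseStoragePath parses args and returns the storage path for the mode.
func parseStoragePath(args []string, agentMode bool) (string, error) {
	var cfg config
	fs := flag.NewFlagSet("prometheus", flag.ContinueOnError)
	fs.StringVar(&cfg.serverStoragePath, "storage.tsdb.path", "data/",
		"Base path for metrics storage.")
	fs.StringVar(&cfg.agentStoragePath, "storage.agent.path", "data-agent/",
		"Base path for metrics storage.")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if agentMode {
		return cfg.agentStoragePath, nil
	}
	return cfg.serverStoragePath, nil
}

func main() {
	for _, agentMode := range []bool{false, true} {
		path, err := parseStoragePath(nil, agentMode)
		if err != nil {
			panic(err)
		}
		fmt.Printf("agentMode=%v path=%s\n", agentMode, path)
	}
}
```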
Signed-off-by: Sunil Thaha <sthaha@redhat.com>
* Fix misleading agent-only/server-only check messages.
Issue:
```
[root@host01 ~]# docker run -it --net=host --rm -v /root/editor/prom-agent-batcopter.yaml:/etc/prometheus/prometheus.yaml -v /root/prom-batcopter-data:/prometheus -u root --name prom-agent-batcopter quay.io/prometheus/prometheus:main --enable-feature=agent --config.file=/etc/prometheus/prometheus.yaml --storage.tsdb.path=/prometheus --web.listen-address=:9091
ts=2021-11-02T16:00:59.789Z caller=main.go:205 level=info msg="Experimental agent mode enabled."
The following flag(s) can not be used in agent mode: ["--enable-feature"]
```
The problem was that PreAction is given all parsed flags; its context does not tell us which flag clause the action was defined on.
Also added a note to each flag's help text stating whether it is server-only or agent-only.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* gofumpt.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
This way the temporary data directory can be removed successfully, since
on Windows a directory cannot be removed while it is in use.
Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
* Initial draft of prometheus-agent
This commit introduces a new binary, prometheus-agent, based on the
Grafana Agent code. It runs a WAL-only version of prometheus without the
TSDB, alerting, or rule evaluations. It is intended to be used to
remote_write to Prometheus or another remote_write receiver.
By default, prometheus-agent will listen on port 9095 to not collide
with the prometheus default of 9090.
Truncation of the WAL cooperates on a best-effort basis with Remote
Write. Every time the WAL is truncated, the minimum timestamp of data to
truncate is determined by the lowest sent timestamp of all samples
across all remote_write endpoints. This gives loose guarantees that data
will not be removed from the WAL until the maximum sample lifetime
passes or remote_write starts functioning.
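
The truncation rule described above can be sketched roughly as below.
This is an illustration under assumptions, not the agent's code: the
`truncateBefore` function, its parameters, and the behaviour when no
endpoint is configured are all hypothetical.

```go
package main

import "fmt"

// truncateBefore returns the timestamp (in milliseconds) before which WAL
// data may be dropped. lowestSent holds, per remote_write endpoint, the
// lowest timestamp that endpoint has sent so far.
func truncateBefore(lowestSent []int64, nowMs, maxLifetimeMs int64) int64 {
	// Data older than this is dropped even if it has not been sent.
	oldestAllowed := nowMs - maxLifetimeMs

	// Assumption for this sketch: with no endpoints, only the maximum
	// lifetime limits what is kept.
	if len(lowestSent) == 0 {
		return oldestAllowed
	}

	// Take the minimum across endpoints so the slowest one is respected.
	ts := lowestSent[0]
	for _, s := range lowestSent[1:] {
		if s < ts {
			ts = s
		}
	}

	// Do not keep data past its maximum lifetime.
	if ts < oldestAllowed {
		ts = oldestAllowed
	}
	return ts
}

func main() {
	fmt.Println(truncateBefore([]int64{500, 700}, 1000, 600))
	fmt.Println(truncateBefore([]int64{500, 300, 900}, 1000, 600))
}
```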
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* add tests for Prometheus agent (#22)
* add tests for Prometheus agent
* add tests for Prometheus agent
* rearranged tests as per the review comments
* update tests for Agent
* changes as per code review comments
Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>
* incremental changes to prometheus agent
Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>
* changes as per code review comments
Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>
* Commit feedback from code review
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Port over some comments from grafana/agent
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Rename agent.Storage to agent.DB for tsdb consistency
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Consolidate agentMode ifs in cmd/prometheus/main.go
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Document PreAction usage requirements better for agent mode flags
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* remove unnecessary defaultListenAddr
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* `go fmt ./tsdb/agent` and fix lint errors
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
Co-authored-by: SriKrishna Paparaju <paparaju@gmail.com>
Avoid using %#v: nothing needs to parse this, so escaping " and so on
only makes the output hard to read.
Add newlines, numbering, and indentation to each alert series output.
Signed-off-by: David Leadbeater <dgl@dgl.cx>
* support maxBlockDuration for promtool tsdb create-blocks-from rules
Fixes #9465
Signed-off-by: Will Tran <will@autonomic.ai>
* don't hardcode 2h as the default block size in rules test
Signed-off-by: Will Tran <will@autonomic.ai>
* Add a feature flag to enable the new manager
This PR creates a copy of the legacy manager and uses it by default.
It is a companion PR to #9349. With this PR, users can enable the new
discovery manager and provide us with any feedback / side effects that
the new behaviour might have.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Allow tuning the scrape tolerance
In most of the classic monitoring use cases, a few milliseconds
difference can be omitted.
In Prometheus, a few millisecond difference can however make a big
difference.
Currently, Prometheus will ignore up to 2 ms difference in the
alignments.
It turns out that for users who can afford a 10ms difference, there are
a lot of resources and disk space to save, as shown in this graph, which
shows the bytes / samples over a production Prometheus server. You can
clearly see the switch from 2ms to 10ms tolerance.
This pull request enables the adjustment of the scrape timestamp
alignment tolerance.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Fix golint
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
The compaction analysis which runs under promtool tsdb analyze can be an
intensive process which slows down the entire command.
This commit adds an --extended flag to tsdb analyze which enables
long-running tasks, such as compaction analysis.
Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
Trade space for speed. Convert all rules into our temporary struct, sort
and then iterate. This is a significant speedup when there are many rules.
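
The copy, sort, then iterate pattern can be sketched as below. The
message above does not say what the iteration looks for, so this example
assumes the goal is finding duplicate rules. The `rule` struct and the
`duplicates` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// rule is a hypothetical temporary struct holding only the fields needed
// for comparison.
type rule struct {
	name   string
	labels string
}

// duplicates returns each rule that occurs more than once. It copies the
// input (extra space), sorts the copy, and then compares neighbours in a
// single pass instead of comparing every pair.
func duplicates(rules []rule) []rule {
	sorted := make([]rule, len(rules))
	copy(sorted, rules)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].name != sorted[j].name {
			return sorted[i].name < sorted[j].name
		}
		return sorted[i].labels < sorted[j].labels
	})

	var dups []rule
	for i := 1; i < len(sorted); i++ {
		if sorted[i] != sorted[i-1] {
			continue
		}
		// Report each duplicated rule only once.
		if len(dups) == 0 || dups[len(dups)-1] != sorted[i] {
			dups = append(dups, sorted[i])
		}
	}
	return dups
}

func main() {
	rules := []rule{{"a", "x"}, {"b", "y"}, {"a", "x"}, {"a", "x"}}
	fmt.Println(duplicates(rules))
}
```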
Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>
Add a new built-in metric `scrape_timeout_seconds` to allow monitoring
of the ratio of scrape duration to the scrape timeout. Hide behind a
feature flag to avoid additional cardinality by default.
Signed-off-by: SuperQ <superq@gmail.com>