The scrape manager receiver's channel now just saves the target sets,
and another background runner updates the scrape loops every 5 seconds.
This is so that the scrape manager doesn't block the receiving channel
while it does the long background reloading of the scrape loops.
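A minimal sketch of that split, with illustrative names (Manager,
targetSets, reload are stand-ins, not the exact code): the receive
path only stores, the ticker drives the heavy work.

    type Manager struct {
        mtx        sync.Mutex
        targetSets map[string][]string // illustrative: job name -> targets
    }

    // run only stores incoming target sets; applying them is left to
    // the ticker-driven reload, so the receiving channel never blocks.
    func (m *Manager) run(tsets <-chan map[string][]string, done <-chan struct{}) {
        ticker := time.NewTicker(5 * time.Second)
        defer ticker.Stop()
        for {
            select {
            case ts := <-tsets:
                m.mtx.Lock()
                m.targetSets = ts // save only
                m.mtx.Unlock()
            case <-ticker.C:
                m.reload() // the slow work happens off the receive path
            case <-done:
                return
            }
        }
    }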
Active and dropped targets are now saved in each scrape pool instead of
the scrape manager. This is mainly to avoid races when getting the
targets via the web API.
Reloading the scrape loops now happens in parallel, which reaches the
final desired state faster and also speeds up Prometheus's shutdown.
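Filling in the reload from the sketch above, the parallel version
could look roughly like this (syncPool is hypothetical):

    // reload syncs all scrape pools concurrently instead of one by
    // one, so a single slow pool no longer delays the rest or the
    // shutdown.
    func (m *Manager) reload() {
        m.mtx.Lock()
        defer m.mtx.Unlock()
        var wg sync.WaitGroup
        for name, groups := range m.targetSets {
            wg.Add(1)
            go func(name string, groups []string) {
                defer wg.Done()
                m.syncPool(name, groups) // illustrative per-pool sync
            }(name, groups)
        }
        wg.Wait()
    }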
Also updated some function signatures in the web package for consistency.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* promql: Rewrote tests with testutil for functions_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
* pkg/relabel: Rewrote tests with testutil for relabel_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
* discovery/consul: Rewrote tests with testutil for consul_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
* scrape: Rewrote tests with testutil for manager_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
This commit avoids passing the full scrape configuration down to the
scrape loop to fix data races when the scrape configuration is being
reloaded.
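A hedged illustration of the pattern (names are hypothetical): the
loop gets copies of the few values it needs, made at (re)creation
time, instead of a pointer into the live configuration.

    // scrapeLoop no longer holds the shared *ScrapeConfig that a
    // reload may swap or mutate underneath it; it keeps copies.
    type scrapeLoop struct {
        interval    time.Duration
        timeout     time.Duration
        honorLabels bool
    }

    func newScrapeLoop(interval, timeout time.Duration, honorLabels bool) *scrapeLoop {
        return &scrapeLoop{interval: interval, timeout: timeout, honorLabels: honorLabels}
    }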
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Read the bearer token on every request.
Removed an unhelpful scrape manager startup log.
New tests: TestScrapeManagerReloadNoChange (scrape pool is not reloaded
when the config hasn't changed), TestMissingBearerAuthFile, and
TestBearerAuthFileRoundTripper.
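Judging from the test names, the round tripper re-reads the token file
per request, along these lines (a sketch, not the exact code):

    type bearerAuthFileRoundTripper struct {
        file string
        rt   http.RoundTripper
    }

    // RoundTrip reads the bearer token file on every request, so a
    // rotated token is picked up without restarting or reloading.
    func (b *bearerAuthFileRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
        token, err := os.ReadFile(b.file)
        if err != nil {
            return nil, fmt.Errorf("unable to read bearer token file %s: %v", b.file, err)
        }
        req = req.Clone(req.Context()) // don't mutate the caller's request
        req.Header.Set("Authorization", "Bearer "+strings.TrimSpace(string(token)))
        return b.rt.RoundTrip(req)
    }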
Switched to testing by way of static_configs rather than the
dns_sd_config parameter. Verified that the revised test passes without
network access and still catches the bug it's supposed to cover.
Verify that if the configs change, target groups are cleaned on
TargetManager.reload (rather than having old ones linger around, even if
they are no longer present in the configs).
This covers the bug fixed in #1907 -- I verified that by checking out
source from before that commit.
This is a start on #1906
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
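For illustration, the kind of home-grown parsing being consolidated,
handling the 30s/5m/1h-style durations (a sketch, not the actual model
code):

    package main

    import (
        "fmt"
        "regexp"
        "strconv"
        "time"
    )

    var durationRE = regexp.MustCompile(`^([0-9]+)(ms|[smhdwy])$`)

    var units = map[string]time.Duration{
        "ms": time.Millisecond,
        "s":  time.Second,
        "m":  time.Minute,
        "h":  time.Hour,
        "d":  24 * time.Hour,
        "w":  7 * 24 * time.Hour,
        "y":  365 * 24 * time.Hour,
    }

    // parseDuration keeps the Prometheus-style duration logic in one
    // place instead of scattering it around the codebase.
    func parseDuration(s string) (time.Duration, error) {
        m := durationRE.FindStringSubmatch(s)
        if m == nil {
            return 0, fmt.Errorf("invalid duration %q", s)
        }
        n, _ := strconv.Atoi(m[1])
        return time.Duration(n) * units[m[2]], nil
    }

    func main() {
        d, err := parseDuration("5m")
        fmt.Println(d, err) // 5m0s <nil>
    }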
For the SNMP and blackbox exporters, where the port tends not to be
80/443 and indeed there may not be a port at all, this makes the
relabelling a bit simpler as you don't have to know this logic exists
and strip off the :80.
This is a breaking change for the example configs of
those exporters.
With the blackbox exporter, the instance label will commonly be used
for things other than hostnames, so remove this restriction.
https://example.com or https://example.com/probe/me are some examples.
To prevent user error, check that URLs aren't provided as targets
when there's no relabelling that could potentially fix them.
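A sketch of that safety check (hypothetical helper and placeholder
types; the real check runs during config validation):

    // checkTargets rejects scheme-qualified targets when no
    // relabelling exists that could rewrite them into host:port form.
    func checkTargets(targets []string, relabelConfigs []RelabelConfig) error {
        if len(relabelConfigs) > 0 {
            return nil // relabelling might still fix URL-style targets
        }
        for _, t := range targets {
            if strings.Contains(t, "://") {
                return fmt.Errorf("target %q is a URL but no relabelling is configured", t)
            }
        }
        return nil
    }

    type RelabelConfig struct{} // placeholder for the sketch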
The prefixed target provider changed a pointerized target group that was
reused in the wrapped target provider, causing an ever-increasing chain
of source prefixes in target groups from the Consul target provider.
We now make this bug generally impossible by switching the target group
channel from pointer to value type and thus ensuring that target groups
are copied before being passed on to other parts of the system.
I tried to not let the depointerization leak too far outside of the
channel handling (both upstream and downstream) because I tried that
initially and caused some nasty bugs, which I want to minimize.
Fixes https://github.com/prometheus/prometheus/issues/1083
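In miniature, the switch to value types amounts to this (simplified
fields):

    // Sending a value instead of a pointer copies the struct on every
    // send, so a receiver that rewrites Source (e.g. adds a prefix)
    // cannot corrupt the provider's own copy.
    type TargetGroup struct {
        Source  string
        Targets []string // note: the slice contents are still shared
    }

    func publish(ch chan<- TargetGroup, tg TargetGroup) {
        ch <- tg // tg is copied on send
    }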
merge() closes the channel that handleUpdates() reads from when there
are zero target providers configured. In that case, the for-select loop
in handleUpdates() entered a busy loop. It should exit when the
upstream channel is closed.
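Reusing the TargetGroup from the sketch above, the fix is the usual Go
pattern of checking the second value of the receive:

    func handleUpdates(ch <-chan TargetGroup, done <-chan struct{}) {
        for {
            select {
            case tg, ok := <-ch:
                if !ok {
                    return // merge() closed the channel: no providers, stop
                }
                apply(tg) // hypothetical downstream handling
            case <-done:
                return
            }
        }
    }

    func apply(tg TargetGroup) { /* ... */ }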
Include the position of each SD mechanism within the same scrape
configuration, so that multiple instances of the same mechanism stay
distinguishable. Move unique source prefixing out of SD implementations
and the target manager into its own interface.
This calculates how much a counter increases over
a given period of time, which is the area under the curve
of its rate.
increase(x[5m]) is equivalent to rate(x[5m]) * 300.
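As a worked example: a counter that rises from 100 to 700 over the
5-minute window has rate(x[5m]) = 600/300 = 2 per second, and therefore
increase(x[5m]) = 2 * 300 = 600.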
Appending to the storage can block for a long time. Timing out
scrapes can also cause longer blocks. This commit prevents those
blocks from affecting components other than the target itself.
Also the Target interface was removed.
With this commit, sending SIGHUP to the Prometheus process will reload
and apply the configuration file. The different components attempt
to handle failing changes gracefully.
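A stripped-down version of that wiring (reloadConfig is hypothetical;
the real main() does considerably more):

    package main

    import (
        "log"
        "os"
        "os/signal"
        "syscall"
    )

    func main() {
        hup := make(chan os.Signal, 1)
        signal.Notify(hup, syscall.SIGHUP)
        go func() {
            for range hup {
                // hypothetical: re-read the file and hand the new
                // configuration to each component.
                if err := reloadConfig("prometheus.yml"); err != nil {
                    log.Printf("reload failed, keeping old configuration: %v", err)
                }
            }
        }()
        select {} // stands in for the rest of the server
    }

    func reloadConfig(path string) error { return nil }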
This commit adds a relabelling stage on the set of base
labels from which a target is created. It allows dropping
targets and rewriting any regular or internal label.
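Conceptually, each rule either rewrites a label or drops the target,
roughly like this (field names and semantics are illustrative only):

    type Rule struct {
        SourceLabel string
        Regex       *regexp.Regexp
        TargetLabel string
        Replacement string
        Action      string // "replace" or "drop"
    }

    // relabel applies the rules to a target's base labels;
    // a nil result means the target was dropped.
    func relabel(labels map[string]string, rules []Rule) map[string]string {
        for _, r := range rules {
            v := labels[r.SourceLabel]
            switch r.Action {
            case "drop":
                if r.Regex.MatchString(v) {
                    return nil
                }
            case "replace":
                if r.Regex.MatchString(v) {
                    labels[r.TargetLabel] = r.Regex.ReplaceAllString(v, r.Replacement)
                }
            }
        }
        return labels
    }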
This commit changes the configuration interface from job configs to scrape
configs. This includes allowing multiple ways of target definition at once
and moving DNS SD to its own config message. DNS SD can now contain multiple
DNS names per configured discovery.
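In rough Go terms (field names are illustrative, not the exact ones):

    // A scrape config can mix several ways of defining targets at
    // once, and each DNS SD config may now carry several names.
    type ScrapeConfig struct {
        JobName       string
        StaticTargets []string
        DNSSDConfigs  []DNSSDConfig
    }

    type DNSSDConfig struct {
        Names           []string // multiple DNS names per discovery
        RefreshInterval time.Duration
    }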
This commit shifts responsibility for maintaining targets from providers and
pools to the target manager. Target groups have a source name that identifies
them for updates.
/api/targets was undocumented, never used, and broken.
Showing instance and job labels on the status page (next to targets)
does not make sense as those labels are set in an obvious way.
Also add a doc comment to TargetStateToClass.
The one central sample ingestion channel has caused a variety of
trouble. This commit removes it. Targets and rule evaluation call an
Append method directly now. To incorporate multiple storage backends
(like OpenTSDB), storage.Tee forks the Append into two different
appenders.
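The fork can be as small as an appender that writes to two others
(a sketch with simplified types):

    type Sample struct {
        Name  string
        Value float64
    }

    type Appender interface {
        Append(s Sample)
    }

    // teeAppender duplicates every Append into two backends, e.g. the
    // local storage and an OpenTSDB writer.
    type teeAppender struct {
        a, b Appender
    }

    func (t teeAppender) Append(s Sample) {
        t.a.Append(s)
        t.b.Append(s)
    }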
Note that the tsdb queue manager had its own queue anyway. It was a
queue after a queue... Much queue, so overhead...
Targets have their own little buffer (implemented as a channel) to
avoid stalling during an http scrape. But a new scrape will only be
started once the old one is fully ingested.
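Reusing the Sample/Appender sketch above, the per-target buffer boils
down to a small channel:

    type Target struct {
        // a small buffer so a brief storage stall doesn't block the
        // HTTP read; the capacity here is purely illustrative
        buf chan Sample
    }

    func newTarget() *Target {
        return &Target{buf: make(chan Sample, 100)}
    }

    // ingest runs in its own goroutine, draining the buffer into the
    // storage appender. The scraper waits until the previous scrape's
    // samples are fully drained before starting a new one
    // (synchronization elided here).
    func (t *Target) ingest(app Appender) {
        for s := range t.buf {
            app.Append(s)
        }
    }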
The contraption of three pipelined ingesters was removed. A Target is
an ingester itself now. Despite more logic in Target, things should be
less confusing now.
Also, remove lint and vet warnings in ast.go.
The current wording suggests that a target is not reachable at all,
although it might also get set when the target was reachable, but there
was some other error during the scrape (invalid headers or invalid
scrape content). UNHEALTHY is a more general wording that includes all
these cases.
For consistency, ALIVE is also renamed to HEALTHY.