* scrape: Add global jitter for HA server
Covers issue in https://github.com/prometheus/prometheus/pull/4926#issuecomment-449039848
where the HA setup become a problem for targets unable to be scraped simultaneously.
The new jitter per server relies on the hostname and external labels which necessarily to be uniq.
As before, scrape offset will be calculated with regard the absolute time, so even
restart/reload doesn't change scrape time per scrape target + prometheus instance.
Use fqdn if possible, otherwise fall back to the hostname. It adds extra random seed
to calculate server hash to be distinguish on machines with the same hostname, but
different DC.
Signed-off-by: Aleksei Semiglazov <xjewer@gmail.com>
* scrape: catch errors when creating HTTP clients
This change makes sure that no scrape pool is created with a nil HTTP
client.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Address Tariq's comment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Address Brian's comment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
The scrape manage receiver's channel now just saves the target sets
and another backgorund runner updates the scrape loops every 5 seconds.
This is so that the scrape manager doesn't block the receiving channel
when it does the long background reloading of the scrape loops.
Active and dropped targets are now saved in each scrape pool instead of
the scrape manager. This is mainly to avoid races when getting the
targets via the web api.
When reloading the scrape loops now happens in parallel to speed up the
final disared state and this also speeds up the prometheus's shutting
down.
Also updated some funcs signatures in the web package for consistency.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* promql: Rewrote tests with testutil for functions_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
* pkg/relabel: Rewrote tests with testutil for relabel_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
* discovery/consul: Rewrote tests with testutil for consul_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
* scrape: Rewrote tests with testutil for manager_test
Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
This commit avoids passing the full scrape configuration down to the
scrape loop to fix data races when the scrape configuration is being
reloaded.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
read bearer token on every request
removed unuseful scrape manager startup log
new tests -TestScrapeManagerReloadNoChange( scrape pool is not reloaded
when the config hasn't changed), TestMissingBearerAuthFile ,
TestBearerAuthFileRoundTripper
Switched to testing by way of the static_configs rather than
dns_sd_config parameter. Verified that the revised test both passes
without network access, and also still catches the bug it's supposed to
cover.
Verify that if the configs change, target groups are cleaned on
TargetManager.reload (rather than having old ones linger around, even if
they are no longer present in the configs).
This covers the bug fixed in #1907 -- I verified that by checking out
source from before that commit.
This is a start on #1906
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
For the SNMP and blackbox exporters where
the ports tends to not be 80/443 and indeed
there may not be a port this makes the relabelling
a bit simpler as you don't have to figure out this
logic exists and strip off the :80.
This is a breaking change for the example configs of
those exporters.
With the blackbox exporter, the instance label will commonly
be used for things other than hostnames so remove this restriction.
https://example.com or https://example.com/probe/me are some examples.
To prevent user error, check that urls aren't provided as targets
when there's no relabelling that could potentically fix them.
The prefixed target provider changed a pointerized target group that was
reused in the wrapped target provider, causing an ever-increasing chain
of source prefixes in target groups from the Consul target provider.
We now make this bug generally impossible by switching the target group
channel from pointer to value type and thus ensuring that target groups
are copied before being passed on to other parts of the system.
I tried to not let the depointerization leak too far outside of the
channel handling (both upstream and downstream) because I tried that
initially and caused some nasty bugs, which I want to minimize.
Fixes https://github.com/prometheus/prometheus/issues/1083
merge() closes the channel that handleUpdates() reads from when there
are zero configured target providers in the configuration. In that case,
the for-select loop in handleUpdates() entered a busy loop. It should
exit when the upstream channel is closed.
Include position of same SD mechanisms within the same scrape configuration.
Move unique prefixing out of SD implementations and target manager into
its own interface.
This calculates how much a counter increases over
a given period of time, which is the area under the curve
of it's rate.
increase(x[5m]) is equivilent to rate(x[5m]) * 300.
Appending to the storage can block for a long time. Timing out
scrapes can also cause longer blocks. This commit avoids that those
blocks affect other compnents than the target itself.
Also the Target interface was removed.
With this commit, sending SIGHUP to the Prometheus process will reload
and apply the configuration file. The different components attempt
to handle failing changes gracefully.
This commit adds a relabelling stage on the set of base
labels from which a target is created. It allows to drop
targets and rewrite any regular or internal label.
This commit changes the configuration interface from job configs to scrape
configs. This includes allowing multiple ways of target definition at once
and moving DNS SD to its own config message. DNS SD can now contain multiple
DNS names per configured discovery.