Commit Graph

102 Commits (b8159563419c0e87b61f4d3f6482bf898d14dfa1)

Author SHA1 Message Date
Björn Rabenstein 4b8f963847 Merge pull request #1915 from prometheus/release-1.0
Forward-merge the bug fix from release-1.0
2016-08-24 13:04:45 +02:00
beorn7 e2b3626e0c retrieval: Clean up target group map on config reload
Also, remove unused `providers` field in targetSet.

If the config file changes, we recreate all providers (by calling
`providersFromConfig`) and retrieve all targets anew from the newly
created providers. From that perspective, it cannot harm to clean up
the target group map in the targetSet. Not doing so (as it was the
case so far) keeps stale targets around. This mattered if an existing
key in the target group map was not overwritten in the initial fetch
of all targets from the providers. Examples where that mattered:

```
scrape_configs:
- job_name: "foo"
  static_configs:
  - targets: ["foo:9090"]
  - targets: ["bar:9090"]
```
updated to:
```
scrape_configs:
- job_name: "foo"
  static_configs:
  - targets: ["foo:9090"]
```

`bar:9090` would still be monitored. (The static provider just
enumerates the target groups. If the number of target groups
decreases, the old ones stay around.

```
scrape_configs:
- job_name: "foo"
  dns_sd_configs:
  - names:
    - "srv.name.one.example.org"
```
updated to:
```
scrape_configs:
- job_name: "foo"
  dns_sd_configs:
  - names:
    - "srv.name.two.example.org"
```

Now both SRV records are still monitored. The SRV name is part of the
key in the target group map, thus the new one is just added and the
old ane stays around.

Obviously, this should have tests, and should have tests before, not
only for this case. This is the quick fix. I have created
https://github.com/prometheus/prometheus/issues/1906 to track test
creation.

Fixes https://github.com/prometheus/prometheus/issues/1610 .
2016-08-22 19:25:33 +02:00
Frederic Branczyk 7714b9c781 move relabeling functionality to its own package
also remove the returned error as it was always nil
2016-08-09 14:19:20 +02:00
Dmitry Vorobev 273e457da4 web: return status code and error message for config resource 2016-07-15 10:15:24 +02:00
Fabian Reinartz d0eeae9d0e retrieval: don't sync to uninitialized scrape pool
This change does just signal a scrape target update to the scraping loop
once an initial target set is fetched.
Before, the scrape pool was directly synced, causing a race against an
uninitialized scrape pool.

Fixes #1703
2016-06-14 14:04:22 +02:00
Fabian Reinartz 0f21bd31ca config: deprecate `target_groups` for `static_configs`
This change deprecates the `target_groups` option in favor
of `static_configs`. The old configuration is still accepted
but prints a warning.
Configuration loading errors if both options are set.
2016-06-08 15:55:25 +02:00
Fabian Reinartz 74c448386c Merge pull request #1665 from prometheus/fabxc-retrpanic
Fix kubernetes SD crash
2016-05-25 17:13:27 -07:00
Fabian Reinartz 12b03db373 retrieval: handle nil target groups from updates 2016-05-25 16:59:16 -07:00
Fabian Reinartz ea36efbbd1 retrieval: document panic behavior 2016-05-25 16:17:25 -07:00
Fabian Reinartz a5ba166935 retrieval: don't panic on non-HTTP scheme 2016-05-25 16:05:20 -07:00
stuart nelson d02591814b discovery/dns: move dns to own package 2016-05-06 11:14:26 +02:00
Fabian Reinartz f94fc76608 Merge pull request #1592 from prometheus/fabxc-consul-ref
discovery: sanitize Consul service discovery
2016-04-30 21:18:33 +02:00
Fabian Reinartz 9f8feb9ff6 discovery: consolidate Marathon SD files 2016-04-30 11:56:11 +02:00
Fabian Reinartz 5837e6a97f discovery: move consul SD into own package 2016-04-25 16:56:27 +02:00
Seth Miller 0988e3b937 Add support for Azure discovery
This change adds the ability to do target discovery with Microsoft's Azure platform.
2016-04-06 22:47:02 -05:00
Fabian Reinartz 769389e559 Fix potential race in ctx intialization 2016-04-05 20:27:31 +02:00
Fabian Reinartz f2e359962c Sort exported targets 2016-03-08 17:12:27 +01:00
Fabian Reinartz 56fc9bdff3 Handle closed target provider channel
This fixes the case where a target provider closes the update
channel and exits before the context is canceled.
This should only be true for the static provider but it's safer
to generally handle this case.
2016-03-08 15:49:03 +01:00
Fabian Reinartz 2060a0a15b Turn target group members into plain lists.
As the scrape pool deduplicates targets now, it is no longer necessary
to store a hash map for members of each group.
2016-03-01 14:33:12 +01:00
Fabian Reinartz 0d7105abee Remove scrape config from Target.
This commit removes the scrapeConfig entirely from Target.
All identity defining parameters are thus immutable now and the mutex
can be removed..

Target identity is now correctly defined by the labels and the full URL.
This in particular includes URL parameters that are not specified in the
label set.

Fingerprint is also removed from hash to remove an unnecessary tight coupling
to the common/model package.
2016-03-01 14:32:57 +01:00
Fabian Reinartz 75681b691a Extract HTTP client from Target.
The HTTP client is the same across all targets with the same
scrape configuration. Thus, this commit moves it into the scrape
pool.
2016-03-01 14:31:57 +01:00
Fabian Reinartz 76a8c6160d Deduplicate targets in scrape pool.
With this commit the scrape pool deduplicates incoming
targets before scraping them. This way multiple target providers
can produce the same target but it will be scraped only once.
2016-03-01 13:50:51 +01:00
Fabian Reinartz 84f74b9a84 Apply new scrape config on reload.
This commit updates a target set's scrape configuration
on reload. This will cause all running scrape loops to be
stopped and started again with new parameters.
2016-03-01 13:50:51 +01:00
Fabian Reinartz 775316f8d2 Move appender construction from Target to scrapePool 2016-03-01 13:50:51 +01:00
Fabian Reinartz 05de8b7f8d Extract target scraping into scrape loop.
This commit factors out the scrape loop handling into
its own data structure.
For the transition it will be directly attached to the
target.
2016-03-01 13:48:36 +01:00
Fabian Reinartz cebba3efbb Simplify and fix TargetManager reloading 2016-03-01 13:48:36 +01:00
Fabian Reinartz d15adfc917 Preserve target state across reloads.
This commit moves Scraper handling into a separate scrapePool type.
TargetSets only manage TargetProvider lifecycles and sync the
retrieved updates to the scrapePool.

TargetProviders are now expected to send a full initial target set
within 5 seconds. The scrapePools preserve target state across reloads
and only drop targets after the initial set was synced.
2016-03-01 13:48:36 +01:00
Fabian Reinartz 5b30bdb610 Change TargetProvider interface.
This commit changes the TargetProvider interface to use a
context.Context and send lists of TargetGroups, rather than
single ones.
2016-03-01 13:48:36 +01:00
Fabian Reinartz 5bfa4cdd46 Simplify target update handling.
We group providers by their scrape configuration. Each provider produces
target groups with an unique identifier.

On stopping a set of target providers we cancel the target providers,
stop scraping the targets and wait for the scrapers to finish.

On configuration reload all provider sets are stopped and new ones
are created. This will make targets disappear briefly on configuration
reload. Potentially scrapes are missed but due to the consistent
scrape intervals implemented recently, the impact is minor.
2016-03-01 13:48:36 +01:00
Fabian Reinartz 6df1f49c13 Remove fullLabels method and fix target updating
With recent changes to a Target's internal data representation
updating by fullLabels() assigns the additional default
instance label. This breaks target identity comparison and causes
identical targets from service discovery to be constantly swapped.
2016-02-22 13:06:30 +01:00
Fabian Reinartz 825831e98f Use fingerprint for target identity comparison
So far we were using the InstanceIdentifier to compare equality of targets.
This is not always accurate, for example for the blackbox exporter where the 
actual target is in the parameter.
2016-02-17 16:34:53 +01:00
Fabian Reinartz a06bc75519 Remove occurrences of 'base' labels 2016-02-15 10:36:57 +01:00
Julius Volz 3728b5872f Fix target update error handling.
Fixes https://github.com/prometheus/prometheus/issues/1378
2016-02-08 21:42:59 +01:00
beorn7 ec08c9a391 Rework the way to communicate backpressure (AKA suspended ingestion)
This gives up on the idea to communicate throuh the Append() call (by
either not returning as it is now or returning an error as
suggested/explored elsewhere). Here I have added a Throttled() call,
which has the advantage that it can be called before a whole _batch_
of Append()'s. Scrapes will happen completely or not at all. Same for
rule group evaluations. That's a highly desired behavior (as discussed
elsewhere). The code is even simpler now as the whole ingestion buffer
could be removed.

Logging of throttled mode has been streamlined and will create at most
one message per minute.
2016-02-01 14:45:44 +01:00
Julien Dehee 061fe2f364 Support AirBnB's Smartstack Nerve client for SD
nerve's registration format differs from serverset. With this commit
there is now a dedicated treecache file in util,
and two separate files for serverset and nerve.

Reference:
https://github.com/airbnb/nerve
2016-01-18 14:07:28 +01:00
Fabian Reinartz 29a69eecb8 Do not panic in Consul SD creation 2015-11-30 18:41:48 +01:00
Brian Brazil 427bf29db1 Add in default port after relabelling.
For the SNMP and blackbox exporters where
the ports tends to not be 80/443 and indeed
there may not be a port this makes the relabelling
a bit simpler as you don't have to figure out this
logic exists and strip off the :80.

This is a breaking change for the example configs of
those exporters.
2015-11-08 11:42:18 +00:00
Brian Brazil fd2bd81cd8 Allow all instance labels in target groups
With the blackbox exporter, the instance label will commonly
be used for things other than hostnames so remove this restriction.
https://example.com or https://example.com/probe/me are some examples.

To prevent user error, check that urls aren't provided as targets
when there's no relabelling that could potentically fix them.
2015-11-07 14:35:20 +00:00
Julius Volz d88aea7e6f Fix SD mechanism source prefix handling.
The prefixed target provider changed a pointerized target group that was
reused in the wrapped target provider, causing an ever-increasing chain
of source prefixes in target groups from the Consul target provider.

We now make this bug generally impossible by switching the target group
channel from pointer to value type and thus ensuring that target groups
are copied before being passed on to other parts of the system.

I tried to not let the depointerization leak too far outside of the
channel handling (both upstream and downstream) because I tried that
initially and caused some nasty bugs, which I want to minimize.

Fixes https://github.com/prometheus/prometheus/issues/1083
2015-10-09 14:08:22 +02:00
Matt Jibson dcb4856d72 Add SD for Amazon EC2 instances 2015-10-06 18:36:17 -04:00
Fabian Reinartz e3b6ec9784 Switch to common/log 2015-10-03 10:21:43 +02:00
Julius Volz 99e8fff872 Fix target manager CPU busyloop caused by bad done-channel handling.
Unfortunately this isn't nicely testable, as it's timing-dependent and
one would have to detect a stray goroutine doing a CPU busyloop...

Fixes https://github.com/prometheus/prometheus/issues/1114
2015-09-28 11:51:16 +02:00
Fabian Reinartz cc1a2a2061 Remove attachment of global labels upon ingestion 2015-09-03 14:16:23 +02:00
Fabian Reinartz ebf417a282 Fix map initialization 2015-09-01 18:06:22 +02:00
Julius Volz 995d3b831d Fix most golint warnings.
This is with `golint -min_confidence=0.5`.

I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
2015-08-26 12:44:46 +02:00
Julius Volz d36a7f4e6f Fix busylooping in case of no target providers.
merge() closes the channel that handleUpdates() reads from when there
are zero configured target providers in the configuration. In that case,
the for-select loop in handleUpdates() entered a busy loop. It should
exit when the upstream channel is closed.
2015-08-24 16:42:28 +02:00
Fabian Reinartz 438e232c9b Fix grouping of import blocks 2015-08-22 09:42:45 +02:00
Fabian Reinartz 306e8468a0 Switch from client_golang/model to common/model 2015-08-21 13:33:38 +02:00
Fabian Reinartz f592740bac Only exit static target provider on done 2015-08-18 11:51:53 +02:00
Fabian Reinartz 4e84b86510 Improve target discovery pipeline
Replace the TargetProvider Stop method with done channels
that ensure properly broadcasted shutdown of the whole pipeline.
2015-08-14 13:30:27 +02:00