Fix race by properly locking access to scrape pools. Use separate mutex for information needed by UI so that UI isn't blocked when targets are being updated.
* web: replace deprecated InstrumentHandler()
This change replaces the deprecated InstrumentHandler function by the
equivalent functions from the promhttp package.
The following metrics are removed:
* http_request_duration_microseconds (Summary).
* http_request_size_bytes (Summary).
* http_requests_total (Counter).
And the following metrics are added instead:
* prometheus_http_request_duration_seconds (Histogram).
* prometheus_http_response_size_bytes (Histogram).
* promhttp_metric_handler_requests_in_flight (Gauge).
* promhttp_metric_handler_requests_total (Counter).
* Update github.com/prometheus/common/route package
* web: refactor using the new prometheus/common/route package
After removing the checkbox in #3913 the only remaining element that
looked like it was the new Show Annotations checkbox on the Alerts page.
Which in turn didn't look like the Enable query history checkout on the
graph page. So:
1. This takes the Enable query history button as canonical.
2. Updates the show annotations button code to match it.
3. Simplifies the JS for the checkbox.
The new Service Discovery page uses the CSS/JS from the Targets page but
used slightly differently. This makes the job header match in the
Service Discovery page for a more consistent look-n-feel.
* Added only healthy to Targets
This adds a "Only heathly" button to supplement the "Only unhealthy"
button. The two are mutually exclusive.
I've also added a red/green text color to the buttons.
Arguably this could be a toggle instead if folks think this is
worthwhile... Happy to modify it.
* Moved functions above init
* Simplifed code and made prettier
* Appeased codeacy
* Made buttons square
* Fix JS error: cannot read source of undefined
When the page was refreshed with queries on the page,
the updateTypeaheadMetricsSet function was called before
the typeahead had been initialized.
* Fix: updates URL when query submits
When queries were submitted by pressing enter, the URL did not update
to reflect the change. Not sure why, but this was only the case when
the queries were non-simple, meaning when either labels werre specified
or other promql functions were used.
* Rebase master and make assets
This is a very minor UX change. The current "No Alert rules" present
table row has the `alert_header` class attached. This changes the cursor
and some other stuff and makes sense with the populated table but less
sense with the unpopulated table. So removing it the latter case.
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
When you have no alerting rules defined you get a screen sharing this
information in the WebUI. If no rules are defined then you instead see
an empty white screen. This adds a "No rules" defined `else` clause and
a `Rules` header to the page.
* Do not autoselect the first item in the dropdown
* Historical queries only show in dropdown when toggled on
* Move shared behavior to queryHistory.isEnabled function
* Do not auto submit selected history queries
net.Listener converts 0.0.0.0 to :: which fails for hosts where IPv6 is
disabled. This change uses the original listen address parameter instead
of grpcl.Addr().String().
Federation makes use of dedupedSeriesSet to merge SeriesSets for every
query into one output stream. If many match[] arguments are provided,
many dedupedSeriesSet objects will get chained. This has the downside of
causing a potential O(n*k) running time, where n is the number of series
and k the number of match[] arguments.
In the mean time, the storage package provides a mergeSeriesSet that
accomplishes the same with an O(n*log(k)) running time by making use of
a binary heap. Let's just get rid of dedupedSeriesSet and change all
existing callers to use mergeSeriesSet.
When there is an empty result set, the Prometheus server replies with
{"status":"success","data":{"resultType":"vector","result":null}}
That "null" reply was not handled correctly by the graphing library.
This commit handles that case and shows "no data" in the UI console view
instead of throwing an error in the browser javascript console.
Fixes#3515
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
API consumers should be able to get insight into the query run times.
The UI currently measures total roundtrip times. This PR allows for more
fine grained metrics to be exposed.
* adds new timer for total execution time (queue + eval)
* expose new timer, queue timer, and eval timer in stats field of the
range query response:
```json
{
"status": "success",
"data": {
"resultType": "matrix",
"result": [],
"stats": {
"execQueueTimeNs": 4683,
"execTotalTimeNs": 2086587,
"totalEvalTimeNs": 2077851
}
}
}
```
* stats field is optional, only set when query parameter `stats` is not
empty
Try it via
```sh
curl 'http://localhost:9090/api/v1/query_range?query=up&start=1486480279&end=1486483879&step=14000&stats=true'
```
Review feedback
* moved query stats json generation to query_stats.go
* use seconds for all query timers
* expose all timers available
* Changed ExecTotalTime string representation from Exec queue total time to Exec total time
This PR fixes#3072 by providing POST endpoints for `query` and `query_range`.
POST request must be made with `Content-Type: application/x-www-form-urlencoded` header.
* Add UI warning for time drift >30 seconds
* Yellow time drift warning & better warning message
* Set warning threshold to 30 sec
* Include changed assets
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
No matter how we refactor docs, `/docs/` will stay the prefix, so there's not long-term risk in changing this.
One we version docs, we should probably try and keep link & version in sync.
Whenever a route prefix is applied, the router prepends the prefix to
the URL path on the request. For most handlers, this is not an issue
because the request's path is only used for routing and is not actually
needed by the handler itself. However, Prometheus delegates the handling
of the /debug/* endpoints to the http.DefaultServeMux which has it's own
routing logic that depends on the url.Path. As a result, whenever a
prefix is applied, the prefixed URL is passed to the DefaultServeMux
which has no awareness of the prefix and returns a 404.
This change fixes the issue by creating a new serveDebug handler which
routes requests /debug/* requests to appropriate net/http/pprof handler
and removing the net/http/pprof import in cmd/prometheus since it is no
longer necessary.
Fixes#2183.
This PR adds the `/status/config` endpoint which exposes the currently
loaded Prometheus config. This is the same config that is displayed on
`/config` in the UI in YAML format. The response payload looks like
such:
```
{
"status": "success",
"data": {
"yaml": <CONFIG>
}
}
```
Issue #3046 is triggered by html/template changes in go1.9.
See https://tip.golang.org/pkg/html/template. Quote:
// To ease migration to Go 1.9 and beyond, "html" and "urlquery" will
// continue to be allowed as the last command in a pipeline. However, if the
// pipeline occurs in an unquoted attribute value context, "html" is
// disallowed. Avoid using "html" and "urlquery" entirely in new templates.
The commit also includes a trivial whitespace fix.
To cover the cases where stale markers may not be available,
we need to infer the interval and mark series stale based on that.
As we're lacking stale markers this is less accurate, however
it should be good enough for these cases.
We need 4 intervals as if say we had data at t=0 and t=10,
coming via federation. The next data point should be at t=20 however it
could take up to t=30 for it actually to be ingested, t=40 for it to be
scraped via federation and t=50 for it to be ingested.
We then add 10% on to that for slack, as we do elsewhere.
* Use request.Context() instead of a global map of contexts.
* Add some basic opentracing instrumentation on the query path.
* Remove tracehandler endpoint.
This is needed for federating non-instance level metrics, so they don't
end up with the instance label of the prometheus target.
Also sort external labels, so label output order is consistent.
* Fixed int64 overflow for timestamp in v1/api parseDuration and parseTime
This led to unexpected results on wrong query with "(...)&start=148966367200.372&end=1489667272.372"
That query is wrong because of `start > end` but actually internal int64 overflow caused start to be something around MinInt64 (huge negative value) and was passing validation.
BTW: Not sure if negative timestamp makes sense even.. But model.Earliest is actually MinInt64, can someone explain me why?
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
* Added missing trailing periods on comments.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
* MOved to only `<` and `>`. Removed equal.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Expose buildQueryUrl, refactor dispatch to use
buildQueryUrl will allow users to execute queries over the range of an
existing graph. This will be helpful to select data series they wish to
annotate the graph with, for example.
The fuzzy library didn't try to find a "best match", but settled on the
first fuzzy match that exists. This patch includes a modified version of
the fuzzy library, which recursivley tries on the rest of the search
string to find a better match. If found, returns that one.
Another small modification is that if a pattern fully matches, it
skips the lookup entirley and returns the highest score possible for
that match.
For some of the queries, the fuzzy lookup was not filtering properly.
The problem is due to the "replace" beind made on the query itself. It
accidently removes only the first underscore. This patch changes it so
that it removes all of the whitespaces, letting the fuzzy algorithm do
its magic, also fixing this problem.
Originally, the underscore were replaced by a space for this specific
reason, to let the user type a space and have the lookup treat it as the
word break.
Fixes#2380