Commit Graph

340 Commits (89ecb3a3f2e48a66260a9af1a834ed8617b86f27)

Author SHA1 Message Date
Ganesh Vernekar dbe55c1352 Subquery (#4831)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-12-22 13:47:13 +00:00
Tom Wilkie e1d9bf77f1
Export the error field in ErrStorage, so we can 'throw' it outside the package. (#4954)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-12-04 16:49:21 +00:00
mknapphrt f0e9196dca Return warnings on a remote read fail (#4832)
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
2018-11-30 14:27:12 +00:00
Ben Kochie c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Bryan Boreham 9a956872a3 Make ErrorStorage a concrete type not an interface
Since it is used in a type assertion, having it as an alias to the
error interface is the same as saying 'error', i.e. it succeeds for
all types of error.  Change to a struct which is a concrete type and
the type assertion will only succeed if the type is identical.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-10-04 13:13:41 +00:00
Callum Styan 9bca041285 WIP: keep track of samples per query, set a max # of samples (#4513)
* keep track of samples per query, set a max # of samples that can be in
memory at once

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2018-10-02 12:59:19 +01:00
Tom Wilkie 4c52400708
Limit concurrent remote reads. (#4656)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-25 20:07:34 +01:00
Harsh Agarwal 18a9a390b5 Add duplicate-labelset check for range/instant vectors (#4589)
Signed-off-by: Harsh Agarwal <cs15btech11019@iith.ac.in>
2018-09-18 10:46:13 +01:00
Ganesh Vernekar 576ee4d309 Label name check for 'count_values' (#4585)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-09-13 15:27:36 +05:30
Dan Cech 9f4cb06a37 use Welford/Knuth method to compute standard deviation and variance (#4533)
* use Welford/Knuth method to compute standard deviation and variance, avoids float precision issues
* use better method for calculating avg and avg_over_time

Signed-off-by: Dan Cech <dcech@grafana.com>
2018-08-26 10:28:47 +01:00
Goutham Veeramachaneni 71855a22a4
Add tracing spans to promql (#4436)
* Add spans to promql

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Simplify timer and span tracking.

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-08-16 13:11:34 +05:30
Thomas Jackson 56daa1f28a Only add LookbackDelta to vector selectors (#4399)
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>

Related to #4226
2018-07-19 06:16:05 +01:00
Alin Sinpalean 372e7652b7 Reuse (copy) overlapping matrix samples between range evaluation steps (#4315)
* Reuse (copy) overlapping matrix samples between range evaluation steps.

Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-07-18 11:14:02 +01:00
Tony Lee bcdaf8e2d2 add unused pointslices to the pool (#4363)
Signed-off-by: Tony Lee <tl@hudson-trading.com>
2018-07-18 05:29:21 +01:00
Alin Sinpalean e3b775b78b Simplify BufferedSeriesIterator usage (#4294)
* Allow for BufferedSeriesIterator instances to be created without an underlying iterator, to simplify their usage.

Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-07-18 05:10:28 +01:00
Julius Volz 219e477272 Fix some (valid) lint errors (#4287)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 05:07:33 +01:00
Thomas Jackson 92c6f0c92e Add offset to selectParams (#4226)
* Add Start/End to SelectParams
* Make remote read use the new selectParams for start/end

This commit will continue sending the start/end time of the remote read
query as the overarching promql time and the specific range of data that
the query is intersted in receiving a response to is now part of the
ReadHints (upstream discussion in #4226).

* Remove unused vendored code

The genproto.sh script was updated, but the code wasn't regenerated.
This simply removes the vendored deps that are no longer part of the
codegen output.

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2018-07-18 04:58:00 +01:00
Alin Sinpalean 96fb0b2155 Optimize PromQL aggregations (#4248)
* Compute hash of label subsets without creating a LabelSet first.

Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-07-18 04:56:27 +01:00
Tom Wilkie 3228814456 Don't forget to register query_duration_seconds{slice="queue_time"} (#4381)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-07-15 12:24:37 +01:00
Thomas Jackson a6dace8829 Check for timeout in each iteration of matrixSelector (#4300)
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>

Fixes #4288
2018-06-21 22:43:31 +01:00
Thomas Jackson 630f42fcf1 Timeout if populating iterators takes too long (#4291)
Right now promql won't time out a request if populating the iterators
takes a long time.

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>

Fixes #4289
2018-06-21 08:14:51 +01:00
Thomas Jackson 404abe0f1c Bubble up errors to promql from populating iterators (#4136)
This changes the Walk/Inspect API inside the promql package to bubble
up errors. This is done by having the inspector return an error (instead
of a bool) and then bubbling that up in the Walk. This way if any error
is encountered in the Walk() the walk will stop and return the error.
This avoids issues where errors from the Querier where being ignored
(causing incorrect promql evaluation).

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>

Fixes #4136
2018-06-07 17:27:34 +01:00
Mario Trangoni 0e2aa35771 promql: fix unconvert issues (#4040)
See,
$ gometalinter --vendor --disable-all --enable=unconvert --deadline 6m ./...
promql/engine.go:1396:26⚠️ unnecessary conversion (unconvert)
promql/engine.go:1396:40⚠️ unnecessary conversion (unconvert)
promql/engine.go:1398:26⚠️ unnecessary conversion (unconvert)
promql/engine.go:1398:40⚠️ unnecessary conversion (unconvert)
promql/engine.go:1427:26⚠️ unnecessary conversion (unconvert)
promql/engine.go:1427:40⚠️ unnecessary conversion (unconvert)
promql/engine.go:1429:26⚠️ unnecessary conversion (unconvert)
promql/engine.go:1429:40⚠️ unnecessary conversion (unconvert)
promql/engine.go:1505:50⚠️ unnecessary conversion (unconvert)
promql/engine.go:1573:46⚠️ unnecessary conversion (unconvert)
promql/engine.go:1578:46⚠️ unnecessary conversion (unconvert)
promql/engine.go:1591:80⚠️ unnecessary conversion (unconvert)
promql/engine.go:1602:94⚠️ unnecessary conversion (unconvert)
promql/engine.go:1630:18⚠️ unnecessary conversion (unconvert)
promql/engine.go:1631:24⚠️ unnecessary conversion (unconvert)
promql/engine.go:1634:18⚠️ unnecessary conversion (unconvert)
promql/engine.go:1635:34⚠️ unnecessary conversion (unconvert)
promql/functions.go:302:42⚠️ unnecessary conversion (unconvert)
promql/functions.go:315:42⚠️ unnecessary conversion (unconvert)
promql/functions.go:334:26⚠️ unnecessary conversion (unconvert)
promql/functions.go:395:31⚠️ unnecessary conversion (unconvert)
promql/functions.go:406:31⚠️ unnecessary conversion (unconvert)
promql/functions.go:454:27⚠️ unnecessary conversion (unconvert)
promql/functions.go:701:46⚠️ unnecessary conversion (unconvert)
promql/functions.go:701:78⚠️ unnecessary conversion (unconvert)
promql/functions.go:730:43⚠️ unnecessary conversion (unconvert)
promql/functions.go:1220:23⚠️ unnecessary conversion (unconvert)
promql/functions.go:1249:23⚠️ unnecessary conversion (unconvert)
promql/quantile.go:107:54⚠️ unnecessary conversion (unconvert)
promql/quantile.go:182:16⚠️ unnecessary conversion (unconvert)
promql/quantile.go:182:64⚠️ unnecessary conversion (unconvert)

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2018-06-06 18:20:38 +01:00
Brian Brazil dd6781add2 Optimise PromQL (#3966)
* Move range logic to 'eval'

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make aggregegate range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* PromQL is statically typed, so don't eval to find the type.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Extend rangewrapper to multiple exprs

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Start making function evaluation ranged

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make instant queries a special case of range queries

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Eliminate evalString

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Evaluate range vector functions one series at a time

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make unary operators range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make binops range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Pass time to range-aware functions.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make simple _over_time functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reduce allocs when working with matrix selectors

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add basic benchmark for range evaluation

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse objects for function arguments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Do dropmetricname and allocating output vector only once.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add range-aware support for range vector functions with params

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise holt_winters, cut cpu and allocs by ~25%

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make rate&friends range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make more functions range aware. Document calling convention.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make date functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make simple math functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Convert more functions to be range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make more functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Specialcase timestamp() with vector selector arg for range awareness

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove transition code for functions

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove the rest of the engine transition code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove more obselete code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove the last uses of the eval* functions

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove engine finalizers to prevent corruption

The finalizers set by matrixSelector were being called
just before the value they were retruning to the pool
was then being provided to the caller. Thus a concurrent query
could corrupt the data that the user has just been returned.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add new benchmark suite for range functinos

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Migrate existing benchmarks to new system

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Expand promql benchmarks

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Simply test by removing unused range code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* When testing instant queries, check range queries too.

To protect against subsequent steps in a range query being
affected by the previous steps, add a test that evaluates
an instant query that we know works again as a range query
with the tiimestamp we care about not being the first step.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse ring for matrix iters. Put query results back in pool.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse buffer when iterating over matrix selectors

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Unary minus should remove metric name

Cut down benchmarks for faster runs.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reduce repetition in benchmark test cases

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Work series by series when doing normal vectorSelectors

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise benchmark setup, cuts time by 60%

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Have rangeWrapper use an evalNodeHelper to cache across steps

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Use evalNodeHelper with functions

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Cache dropMetricName within a node evaluation.

This saves both the calculations and allocs done by dropMetricName
across steps.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse input vectors in rangewrapper

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse the point slices in the matrixes input/output by rangeWrapper

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make benchmark setup faster using AddFast

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Simplify benchmark code.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add caching in VectorBinop

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Use xor to have one-level resultMetric hash key

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add more benchmarks

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Call Query.Close in apiv1

This allows point slices allocated for the response data
to be reused by later queries, saving allocations.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise histogram_quantile

It's now 5-10% faster with 97% less garbage generated for 1k steps

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make the input collection in rangeVector linear rather than quadratic

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise label_replace, for 1k steps 15x fewer allocs and 3x faster

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise label_join, 1.8x faster and 11x less memory for 1k steps

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Expand benchmarks, cleanup comments, simplify numSteps logic.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Address Fabian's comments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Comments from Alin.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Address jrv's comments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove dead code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Address Simon's comments.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Rename populateIterators, pre-init some sizes

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Handle case where function has non-matrix args first

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Split rangeWrapper out to rangeEval function, improve comments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Cleanup and make things more consistent

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make EvalNodeHelper public

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Fabian's comments.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-06-04 15:47:45 +02:00
David King 6286c10df0 Fix OOM when a large K is used in topk queries (#4087)
This attempts to close #3973.

Handles cases where the length of the input vector to an aggregate topk
/ bottomk function is less than the K paramater. The change updates
Prometheus to allocate a result vector the same length as the input
vector in these cases.

Previously Prometheus would out-of-memory panic for large K values. This
change makes that unlikely unless the size of the input vector is
equally large.

Signed-off-by: David King <dave@davbo.org>
2018-04-16 09:03:04 +01:00
Tony Lee 7cd56f56df add queue_time slice to query_duration_seconds (#4050) 2018-04-05 19:56:58 +01:00
Anton Tereshchenkov 18bbec050c promql: propagate storage errors 2018-03-14 15:19:22 +01:00
Nikunj Aggarwal 998dfcbac6 Expose itemtype outside the package (#3933) 2018-03-08 16:52:44 +00:00
Fabian Reinartz 309c666426
Merge pull request #3671 from prometheus/queryparams
*: implement query params
2018-02-15 12:24:34 +01:00
Fabian Reinartz 7ccd4b39b8 *: implement query params
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
2018-02-13 12:17:22 +01:00
Krasi Georgiev a53d4ed197 drop metric name for bool modifier (#3821)
fixes #3820
2018-02-11 16:15:55 +00:00
Fabian Reinartz f8fccc73d8 promql: remove global metrics 2017-11-24 07:57:54 +01:00
Fabian Reinartz 83cd270ea4 *: adapt to storage interface changes 2017-11-23 19:05:04 +01:00
David Kaltschmidt 87c46ea6c3 Renamed TotalEvalTime to EvalTotalTime
* TotalFoo suggested a comprehensive timing, but TotalEvalTime was part
of the Exec timings, together with Queue timings
* The other option was to rename ExecTotalTime to TotalExecTime, but
 there was already ExecQueueTime, suggesting Exec to be some sort of
group
2017-11-17 17:46:51 +01:00
David Kaltschmidt c93e54d240 Adds execution timer stats to the range query
API consumers should be able to get insight into the query run times.
The UI currently measures total roundtrip times. This PR allows for more
fine grained metrics to be exposed.

* adds new timer for total execution time (queue + eval)

* expose new timer, queue timer, and eval timer in stats field of the
 range query response:
```json
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [],
    "stats": {
      "execQueueTimeNs": 4683,
      "execTotalTimeNs": 2086587,
      "totalEvalTimeNs": 2077851
    }
  }
}
```

* stats field is optional, only set when query parameter `stats` is not
empty

Try it via
```sh
curl 'http://localhost:9090/api/v1/query_range?query=up&start=1486480279&end=1486483879&step=14000&stats=true'
```

Review feedback

* moved query stats json generation to query_stats.go
* use seconds for all query timers
* expose all timers available
* Changed ExecTotalTime string representation from Exec queue total time to Exec total time
2017-11-16 16:05:10 +01:00
Brian Brazil 99905f82a6 Remove keep_common modifier.
See #3060
2017-10-05 13:27:48 +01:00
Brian Brazil 67274f0794 Remove 4 interval staleness heuristic. (#3244)
This means that if there is no stale marker, only the usual staleness
delta (5m) applies.

It has occured to me that there is an oddity in the heurestic. It works
fine as long as you have 2 points within the last 5m, but breaks down
when the time window advances to the point where you have just 1 point.

Consider you had points at t=0 and t=10. With the heurestic it goes stale
at t=51, up until t=300. However from t=301 until t=310 we only
see the t=10 point and the series comes back to life. That is not
desirable.

I don't see a way to keep this form of heurestic working given this
issue, so thus I'm removing it.
2017-10-05 12:55:14 +01:00
Julius Volz f7e8348a88 Re-add contexts to storage.Storage.Querier() (#3230)
* Re-add contexts to storage.Storage.Querier()

These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.

The 1.x query interfaces already had contexts, but they got lost in 2.x.

* Convert promql.Engine to use native contexts
2017-10-04 21:04:15 +02:00
Fabian Reinartz d21f149745 *: migrate to go-kit/log 2017-09-08 22:01:51 +05:30
Fabian Reinartz 25f3e1c424 Merge branch 'master' into mergemaster 2017-08-10 17:04:25 +02:00
Alexey Palazhchenko 695ec0b981 Fix few typos. (#2962) 2017-07-18 13:58:00 +01:00
Goutham Veeramachaneni 4194d2ac79 Call At() only if Next() is true
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-07-13 18:42:45 +02:00
Goutham Veeramachaneni d407bd150c Consolidate the duration params in CLI
* All CLI params moved to model.Duration

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 20:20:57 +05:30
Goutham Veeramachaneni 507790a357
Rework logging to use explicitly passed logger
Mostly cleaned up the global logger use. Still some uses in discovery
package.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 15:52:44 +05:30
Brian Brazil 220e78b9c3 Consider a series stale after 4.1 intervals with no data.
To cover the cases where stale markers may not be available,
we need to infer the interval and mark series stale based on that.
As we're lacking stale markers this is less accurate, however
it should be good enough for these cases.

We need 4 intervals as if say we had data at t=0 and t=10,
coming via federation. The next data point should be at t=20 however it
could take up to t=30 for it actually to be ingested, t=40 for it to be
scraped via federation and t=50 for it to be ingested.
We then add 10% on to that for slack, as we do elsewhere.
2017-05-24 14:27:17 +01:00
Brian Brazil c02c25d5ba Allow peeking back further in buffer. 2017-05-24 14:27:17 +01:00
Brian Brazil a5cf25743c Move stalness check into a function 2017-05-16 18:33:51 +01:00
Brian Brazil 80b40e6d91 Add initial staleness handing to promql.
For instant vectors, if "stale" is the newest sample
ignore the timeseries.

For range vectors, filter out "stale" samples.

Make it possible to inject "stale" samples in promql tests.
2017-05-16 18:33:51 +01:00
Fabian Reinartz 6e804b3497 Merge branch 'master' into dev-2.0 2017-05-12 13:29:58 +02:00
Brian Brazil fcc88f0e1e query/query_range should return eval timestamp
Query and query_range should return the timestamp
at which an evaluation is performed, not the timestamp
of the data. This is as that's what query range asked
for, and we need to keep query consistent with that.

Query for a matrix remains unchanged, returning the literal
matrix.
2017-05-12 12:00:31 +01:00
Brian Brazil 517b81f927 Add timestamp() function.
Make the timestamp of instant vectors be the timestamp of the sample
rather than the evaluation. We were not using this anywhere, so this is
safe.

Add a function to return the timestamp of samples in an instant vector.

Fixes #1557
2017-05-12 12:00:31 +01:00
Tom Wilkie 4d9b917d11 Instrument Prometheus with OpenTracing (#2554)
* Use request.Context() instead of a global map of contexts.

* Add some basic opentracing instrumentation on the query path.

* Remove tracehandler endpoint.
2017-05-02 18:49:29 -05:00
Fabian Reinartz 73b8ff0ddc Merge branch 'master' into dev-2.0 2017-04-27 10:19:55 +02:00
Tom Wilkie f0e8a5f37c Add promql.ErrStorage, which is interpreted by the API as a 500. 2017-04-06 14:41:23 +01:00
Fabian Reinartz c389193b37 Merge branch 'master' into dev-2.0 2017-03-17 16:27:07 +01:00
Fabian Reinartz 0ecd205794 promql: Use buffer pool for matrix allocations 2017-03-14 10:57:34 +01:00
Fabian Reinartz b09b90a940 Correctly close querier on error, revendor tsdb 2017-03-09 15:40:52 +01:00
Goutham Veeramachaneni 6634984a38
Comments and Typo Fixes 2017-03-06 17:16:37 +05:30
Fabian Reinartz 9304179ef7 Merge branch 'master' into dev-2.0 2017-03-02 08:16:58 +01:00
Alex Somesan 18cd7246b5 Instrument query engine timings (#2418)
* Instrument query engine statistics
2017-02-13 16:45:00 +00:00
Fabian Reinartz 1d3cdd0d67 Merge branch 'master' into dev-2.0-rebase 2017-01-30 17:43:01 +01:00
André Carvalho c43dfaba1c Add max concurrent and current queries engine metrics (#2326)
* Add max concurrent and current queries engine metrics

This commit adds two metrics to the promql/engine: the
number of max concurrent queries, as configured by the flag, and
the number of current queries being served+blocked in the engine.
2017-01-07 14:41:25 +00:00
Fabian Reinartz bc20d93f0a storage: rename iterator value getters to At() 2017-01-02 13:33:37 +01:00
Fabian Reinartz 28f547bcc7 api/v1: fix tests, restore series queries 2016-12-30 10:43:44 +01:00
Fabian Reinartz f8fc1f5bb2 *: migrate ingestion to new batch Appender 2016-12-29 11:03:56 +01:00
Fabian Reinartz 71fe0c58a8 promql: misc fixes 2016-12-28 11:32:15 +01:00
Fabian Reinartz fecf9532b9 *: fix misc compile errors 2016-12-25 11:42:57 +01:00
Fabian Reinartz 0492ddbd4d *: fully decouple tsdb, add new storage interfaces 2016-12-25 01:43:22 +01:00
Fabian Reinartz 9ea10d5265 promql: use labels.Builder to modify labels 2016-12-24 14:35:24 +01:00
Fabian Reinartz c6cd998905 promql: use local labels, add conversion 2016-12-24 14:01:37 +01:00
Fabian Reinartz ff504af2aa promql: undo accidental exports 2016-12-24 11:41:37 +01:00
Fabian Reinartz 6dedf89cc3 promql: rename SampleStream to Series 2016-12-24 11:32:42 +01:00
Fabian Reinartz c5f225b920 promql: export Sample 2016-12-24 11:32:10 +01:00
Fabian Reinartz 65581a3d46 promql: export SmapleStream 2016-12-24 11:29:39 +01:00
Fabian Reinartz 6315d00942 promql: export String value 2016-12-24 11:25:26 +01:00
Fabian Reinartz ac5d3bc05e promql: scalar T/V and Point 2016-12-24 11:23:06 +01:00
Fabian Reinartz 09666e2e2a promql: make scalar public 2016-12-24 10:44:04 +01:00
Fabian Reinartz b3f71df350 promql: make matrix exported 2016-12-24 10:42:54 +01:00
Fabian Reinartz a62df87022 promql: rename vector 2016-12-24 10:40:09 +01:00
Fabian Reinartz 15a931dbdb promql: migrate model types, use tsdb interfaces 2016-12-24 00:39:52 +01:00
Tristan Colgate 68fc15fe4e Report type names in the form used in documentation 2016-11-18 10:12:55 +00:00
beorn7 4e3abc6cbf Simply use `math.Mod(float64, float64)` after all
This circumvents all the problems with int overflow, plus it is what was originally intended.
2016-11-08 21:03:31 +01:00
beorn7 5cf5bb427a Check for int64 overflow when converting from float64 2016-11-05 00:48:32 +01:00
beorn7 92c0ef1a92 Merge branch 'release-1.2' into beorn7/release 2016-11-03 22:48:39 +01:00
beorn7 07f1bdfe94 Fix MOD binop for scalars and vectors
Previously, a floating point number that would round down to 0 would
cause a "division by zero" panic.
2016-11-03 19:03:44 +01:00
Fabian Reinartz 8fa18d564a storage: enhance Querier interface usage
This extracts Querier as an instantiateable and closeable object
rather than just defining extending methods of the storage interface.
This improves composability and allows abstracting query transactions,
which can be useful for transaction-level caches, consistent data views,
and encapsulating teardown.
2016-10-16 10:39:29 +02:00
Julius Volz c187308366 storage: Contextify storage interfaces.
This is based on https://github.com/prometheus/prometheus/pull/1997.

This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
2016-09-19 16:29:07 +02:00
Julius Volz ed5a0f0abe promql: Allow per-query contexts.
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.

This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.

Thus, we want to be able to pass per-query contexts into a single engine.

This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).

In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
2016-09-19 15:38:17 +02:00
beorn7 71571a8ec4 promql: Fix (and simplify) populating iterators
This was only relevant so far for the benchmark suite as it would
recycle Expr for repetitions. However, the append is unnecessary as
each node is only inspected once when populating iterators, and
population must always start from scratch.

This also introduces error checking during benchmarks and fixes the so
far undetected test errors during benchmarking.

Also, remove a style nit (two golint warnings less…).
2016-08-24 18:37:09 +02:00
Julius Volz 3bfec97d46 Make the storage interface higher-level.
See discussion in
https://groups.google.com/forum/#!topic/prometheus-developers/bkuGbVlvQ9g

The main idea is that the user of a storage shouldn't have to deal with
fingerprints anymore, and should not need to do an individual preload
call for each metric. The storage interface needs to be made more
high-level to not expose these details.

This also makes it easier to reuse the same storage interface for remote
storages later, as fewer roundtrips are required and the fingerprint
concept doesn't work well across the network.

NOTE: this deliberately gets rid of a small optimization in the old
query Analyzer, where we dedupe instants and ranges for the same series.
This should have a minor impact, as most queries do not have multiple
selectors loading the same series (and at the same offset).
2016-07-25 13:59:22 +02:00
Brian Brazil 0303ccc6a7 Add quantile aggregator. 2016-07-21 00:09:19 +01:00
Brian Brazil 16690736ab Add count_values() aggregator.
This is useful for counting how many instances
of a job are running a particular version/build.

Fixes #622
2016-07-05 17:14:01 +01:00
Brian Brazil 3e5136e36d Make topk/bottomk aggregators. 2016-07-04 13:18:19 +01:00
Brian Brazil 3b89616d82 Allow on, ignoring, by and without wit empty laberls.
This offers new semantics in allowing on() for matching
two single-element vectors with no known common labels.
Previosuly this was often done using on(dummy).

This also allows making it explict that you meant
to do an aggregation without labels via by().

Fixes #1597.
2016-06-24 14:12:51 +01:00
Brian Brazil 246a817300 Flip vector matching to be ignoring by default.
This is a noop semantically.
2016-06-23 17:23:44 +01:00
Julius Volz b7b6717438 Separate query interface out of local.Storage.
PromQL only requires a much narrower interface than local.Storage in
order to run queries. Narrower interfaces are easier to replace and
test, too.

We could also change the web interface to use local.Querier, except that
we'll probably use appending functions from there in the future.
2016-06-23 15:14:38 +02:00
royels 2fdc5717a3 promql: add power binary operation 2016-06-22 23:34:46 -04:00
Ali Reza e7eba75690 remove keeping_extra because it's replaced with keep_common
change all keepExtra label into keepCommon, and move action into removed list

change incorrect token list
2016-05-27 00:02:04 +07:00
Brian Brazil 7201c010c4 Rename On to MatchingLabels 2016-04-26 14:28:36 +01:00
Brian Brazil d991f0cf47 For many-to-one matches, always copy label from one side.
This is a breaking change for everyone using the machine roles
labeling approach.
2016-04-21 19:35:41 +01:00
Brian Brazil 768d09fd2a Change on+group_* to take copy from the one side.
If the label doesn't exist on the one side, it's not copied.

All labels on the many inside are included, this is a breaking change
but likely low impact.
2016-04-21 19:35:40 +01:00
Brian Brazil d1edfb25b3 Add support for OneToMany with IGNORING.
The labels listed in the group_ modifier will be copied from the one
side to the many side. It will be valid to specify no labels.

This is intended to replace the existing ON/GROUP_* support.,
2016-04-21 19:35:35 +01:00
Brian Brazil 1d08c4fef0 Add 'ignoring' as modifier for binops.
Where 'on' uses the given labels to match,
'ignoring' uses all other labels to match.

group_left/right is not supported yet.
2016-04-21 19:34:29 +01:00
Tobias Schmidt 8cc86f25c0 Implement relative complement set operator "unless"
The `unless` set operator can be used to return all vector elements from
the LHS which do not match the elements on the RHS. A use case is to
return all metrics for nodes which do not have a specific role:

    node_load1 unless on(instance) chef_role{role="app"}
2016-04-04 01:29:44 -04:00
beorn7 c740789ce3 Improve predict_linear
Fixes https://github.com/prometheus/prometheus/issues/1401

This remove the last (and in fact bogus) use of BoundaryValues.

Thus, a whole lot of unused (and arguably sub-optimal / ugly) code can
be removed here, too.
2016-02-25 12:10:55 +01:00
beorn7 0e202dacb4 Streamline series iterator creation
This will fix issue #1035 and will also help to make issue #1264 less
bad.

The fundamental problem in the current code:

In the preload phase, we quite accurately determine which chunks will
be used for the query being executed. However, in the subsequent step
of creating series iterators, the created iterators are referencing
_all_ in-memory chunks in their series, even the un-pinned ones. In
iterator creation, we copy a pointer to each in-memory chunk of a
series into the iterator. While this creates a certain amount of
allocation churn, the worst thing about it is that copying the chunk
pointer out of the chunkDesc requires a mutex acquisition. (Remember
that the iterator will also reference un-pinned chunks, so we need to
acquire the mutex to protect against concurrent eviction.) The worst
case happens if a series doesn't even contain any relevant samples for
the query time range. We notice that during preloading but then we
will still create a series iterator for it. But even for series that
do contain relevant samples, the overhead is quite bad for instant
queries that retrieve a single sample from each series, but still go
through all the effort of series iterator creation. All of that is
particularly bad if a series has many in-memory chunks.

This commit addresses the problem from two sides:

First, it merges preloading and iterator creation into one step,
i.e. the preload call returns an iterator for exactly the preloaded
chunks.

Second, the required mutex acquisition in chunkDesc has been greatly
reduced. That was enabled by a side effect of the first step, which is
that the iterator is only referencing pinned chunks, so there is no
risk of concurrent eviction anymore, and chunks can be accessed
without mutex acquisition.

To simplify the code changes for the above, the long-planned change of
ValueAtTime to ValueAtOrBefore time was performed at the same
time. (It should have been done first, but it kind of accidentally
happened while I was in the middle of writing the series iterator
changes. Sorry for that.) So far, we actively filtered the up to two
values that were returned by ValueAtTime, i.e. we invested work to
retrieve up to two values, and then we invested more work to throw one
of them away.

The SeriesIterator.BoundaryValues method can be removed once #1401 is
fixed. But I really didn't want to load even more changes into this
PR.

Benchmarks:

The BenchmarkFuzz.* benchmarks run 83% faster (i.e. about six times
faster) and allocate 95% fewer bytes. The reason for that is that the
benchmark reads one sample after another from the time series and
creates a new series iterator for each sample read.

To find out how much these improvements matter in practice, I have
mirrored a beefy Prometheus server at SoundCloud that suffers from
both issues #1035 and #1264. To reach steady state that would be
comparable, the server needs to run for 15d. So far, it has run for
1d. The test server currently has only half as many memory time series
and 60% of the memory chunks the main server has. The 90th percentile
rule evaluation cycle time is ~11s on the main server and only ~3s on
the test server. However, these numbers might get much closer over
time.

In addition to performance improvements, this commit removes about 150
LOC.
2016-02-19 16:24:38 +01:00
Julius Volz 9b6d69610a Fix various typos in comments.
Helpfully reported by
https://goreportcard.com/report/github.com/prometheus/prometheus :)
2016-02-10 03:47:00 +01:00
Brian Brazil 9d0112d7cf Add without aggregator modifier.
This has the advantage that the user doesn't need
to list all labels they want to keep (as with "by")
but without having to worry about inconsistent labels
as when there's only one time series (as with "keeping_common").

Almost all aggregation should use this rather than the existing
two options as it's much less error prone and easier to maintain
due to not having to always add in "job" plus whatever other common
job-level labels you have like "region".
2016-02-08 14:05:33 +00:00
Brian Brazil 89760dd77d Handle NaN for min/max.
Similar to topk and sort, prefer not returning NaN
where possible.
2016-01-06 12:41:40 +00:00
Fabian Reinartz e3b6ec9784 Switch to common/log 2015-10-03 10:21:43 +02:00
Brian Brazil 29e8dc2c49 promql: Add 'bool' modifier to comparison functions
When doing comparison operations on vectors, filtering
sometimes gets in the way and you have to go to a fair bit of
effort to workaround it in order to always return a result.
The 'bool' modifier instead of filtering returns 0/1 depending
on the result of the compairson.

This is also a prerequisite to removing plain scalar/scalar comparisons,
as it maintains the current behaviour under a new syntax.
2015-09-02 14:51:44 +01:00
Julius Volz 077a753e6b Merge pull request #1006 from prometheus/true-values
promql: Remove interpolation of vector values.
2015-08-25 16:11:07 +02:00
Fabian Reinartz d6b8da8d43 Switch promql types to common/model 2015-08-25 13:49:14 +02:00
Brian Brazil fb585e4591 promql: Remove interpolation of vector values.
The current behaviour produces values that are not
from rules or scrapes. So if for example I have
a boolean 0/1 it can be returned as 0.2344589. This
prevents a number of advanced use cases, introduces
race conditions and can produce misleading graphs.
2015-08-24 17:37:31 +01:00
Fabian Reinartz 1535ef1457 Replace metric.SamplePair with model.SamplePair 2015-08-22 14:52:35 +02:00
Fabian Reinartz 438e232c9b Fix grouping of import blocks 2015-08-22 09:42:45 +02:00
Fabian Reinartz 306e8468a0 Switch from client_golang/model to common/model 2015-08-21 13:33:38 +02:00
Laurie Malau cdf38ab93a Log runtime errors during query evaluation instead of panicking. 2015-08-19 16:56:41 +02:00
Julius Volz 27ed874358 Implement label_replace()
Implements part of https://github.com/prometheus/prometheus/issues/959.
2015-08-18 14:20:07 +02:00
Fabian Reinartz 690b5f1575 Remove multi-statement queries
This commit removes the possibility to have multi-statement queries
which had no full support anyway. This makes the caller responsible
for multi-statement semantics.
Multiple tests are no longer timing-dependent.
2015-08-10 14:26:20 +02:00
Fabian Reinartz 579fdf65e2 Implement unary expression for vector types.
Closes #956
2015-08-04 15:46:36 +02:00
Fabian Reinartz 3d67d75935 promql: implement JSON array format for scalar and string 2015-07-06 13:09:26 +02:00
Fabian Reinartz 77e8983221 promql: add MarshalJSON method for SamplePair 2015-07-06 10:29:59 +02:00
Fabian Reinartz 70d7a987a7 promql: add json tags, fix query constructor. 2015-06-25 13:44:05 +02:00
Fabian Reinartz fe301d7946 promql: remove global flags 2015-06-15 19:01:06 +02:00
Fabian Reinartz c32ae22119 promql: fix missing metric in range results. 2015-06-11 23:50:53 +02:00
Fabian Reinartz cb10ceac18 promql: allow scalar expressions in range queries, improve errors.
These changes allow to do range queries over scalar expressions.
Errors on bad types for range queries are now raised on query creation
rather than evaluation.
2015-06-10 18:36:02 +02:00
Fabian Reinartz 0de6edbdfc Move pkg/ to util/ 2015-06-01 21:12:32 +02:00
Fabian Reinartz ccf51b132e Move stats package to pkg/stats 2015-06-01 21:12:31 +02:00
beorn7 3b9c421a69 Weed out all the [Gg]et* method names.
The only exception is getNumChunksToPersist to avoid naming the struct
member numChunksToPersist in a weird way.
2015-05-20 19:13:06 +02:00
Fabian Reinartz ac4d63b833 Merge pull request #689 from prometheus/fabxc/qltest
Add basic testing language, migrate tests
2015-05-18 19:22:48 +02:00
Fabian Reinartz 6321964738 Add parsing and execution of new test format.
This commit adds a new test structure that parses and executes
the new testing language.
2015-05-18 17:47:47 +02:00
Fabian Reinartz ce487f763e Simplify vector binary evaluation logic 2015-05-17 00:02:34 +02:00
Fabian Reinartz 8a109e061b Extract OR operation into own eval method. 2015-05-16 14:00:11 +02:00
Fabian Reinartz 2c3e9e2e87 Extract AND operation into own eval method. 2015-05-16 13:33:03 +02:00
Fabian Reinartz 9ab1f6c690 Limit maximum number of concurrent queries.
A high number of concurrent queries can slow each other down
so that none of them is reasonbly responsive. This commit limits
the number of queries being concurrently executed.
2015-05-06 11:34:17 +02:00
Fabian Reinartz d59d1cb2c1 Fix Error() methods. 2015-05-01 17:58:58 +02:00
Fabian Reinartz fe935179cd Stop routing rule statements through the engine. 2015-04-29 18:01:43 +02:00
Fabian Reinartz 25cdff3527 Remove `name` arg from `Parse*` functions, enhance parsing errors. 2015-04-29 16:38:41 +02:00
Fabian Reinartz 5602328c7c Refactor query evaluation.
This copies the evaluation logic from the current rules/ package.
The new engine handles the execution process from query string to final result.
It provides query timeout and cancellation and general flexibility for
future changes.

functions.go: Add evaluation implementation. Slight changes to in/out data but
	not to the processing logic.
quantile.go: No changes.
analyzer.go: No changes.
engine.go: Actually new part. Mainly consists of evaluation methods
	which were not changed.
setup_test.go: Copy of rules/helpers_test.go to setup test storage.
promql_test.go: Copy of rules/rules_test.go.
2015-04-28 14:19:05 +02:00