Commit Graph

245 Commits (61e14a6bdb8a97e03ad58a8b11617480a5eff881)

Author SHA1 Message Date
beorn7 836f1db04c Improve MetricsForLabelMatchers
WIP: This needs more tests.

It now gets a from and through value, which it may opportunistically
use to optimize the retrieval. With possible future range indices,
this could be used in a very efficient way. This change merely applies
some easy checks, which should nevertheless solve the use case of
heavy rule evaluations on servers with a lot of series churn.

Idea is the following:

- Only archive series that are at least as old as the headChunkTimeout
  (which was already extremely unlikely to happen).

- Then maintain a high watermark for the last archival, i.e. no
  archived series has a sample more recent than that watermark.

- Any query that doesn't reach to a time before that watermark doesn't
  have to touch the archive index at all. (A production server at
  Soundcloud with the aforementioned series churn and heavy rule
  evaluations spends 50% of its CPU time in archive index
  lookups. Since rule evaluations usually only touch very recent
  values, most of those lookup should disappear with this change.)

- Federation with a very broad label matcher will profit from this,
  too.

As a byproduct, the un-needed MetricForFingerprint method was removed
from the Storage interface.
2016-03-09 00:25:59 +01:00
beorn7 a7408bfb47 Unify duration parsing
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
2016-01-29 15:41:50 +01:00
Julius Volz 1ae23bf5e9 Handle OPTIONS HTTP requests correctly.
Fixes https://github.com/prometheus/prometheus/issues/1346
2016-01-26 12:31:44 +01:00
Tobias Schmidt 7a6a0630d1 Merge pull request #1213 from prometheus/fix-wrong-http-status-codes
Return HTTP server error codes for execution errors
2015-11-12 09:12:17 -08:00
Tobias Schmidt bf84faa010 Return HTTP server error codes for execution errors 2015-11-11 16:22:20 -08:00
Tobias Schmidt 50079a85a1 Make time parameter optional in v1 query API
If no time paramter is provided, the current server timestamp is used.
2015-11-11 13:30:06 -08:00
Fabian Reinartz 33aab4169c Anchor regexes in vector matching
This commit makes the regex behavior of vector matching consistent with
configuration and label_replace() by anchoring it.

Fixes #1200
2015-11-05 11:23:43 +01:00
Julius Volz 0088aa4d45 Merge pull request #1132 from prometheus/fix-quoting-and-escaping
Support escape sequences in strings and add raw strings
2015-10-08 20:51:18 +02:00
Julius Volz 46c5260761 Support escape sequences in strings and add raw strings.
This adapts some functionality from the Go standard library for string
literal lexing and unquoting/unescaping.

The following string types are now supported:

Double- or single-quoted strings:

  These support all escape sequences that Go supports in double-quoted
  string literals. The difference is that Prometheus also has
  single-quoted strings (instead of single-quoted runes in Go). Raw
  newlines are not allowed.

Backtick-quoted raw strings:

  Strings quoted in backticks are treated as raw strings just like in Go
  and may contain raw newlines and other special characters directly.

Fixes https://github.com/prometheus/prometheus/issues/1122
Fixes https://github.com/prometheus/prometheus/issues/1121
2015-10-08 19:17:21 +02:00
Fabian Reinartz e3b6ec9784 Switch to common/log 2015-10-03 10:21:43 +02:00
Fabian Reinartz 398bbf906b Switch to common/route package 2015-09-24 17:08:47 +02:00
Fabian Reinartz 171f50706a Fix unkeyed field errors. 2015-09-18 17:00:08 +02:00
Fabian Reinartz f8a25f6af7 Apply HTTP handler compression everywhere 2015-09-17 14:49:50 +02:00
Julius Volz 995d3b831d Fix most golint warnings.
This is with `golint -min_confidence=0.5`.

I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
2015-08-26 12:44:46 +02:00
Fabian Reinartz d6b8da8d43 Switch promql types to common/model 2015-08-25 13:49:14 +02:00
Fabian Reinartz 1535ef1457 Replace metric.SamplePair with model.SamplePair 2015-08-22 14:52:35 +02:00
Fabian Reinartz 438e232c9b Fix grouping of import blocks 2015-08-22 09:42:45 +02:00
Fabian Reinartz 306e8468a0 Switch from client_golang/model to common/model 2015-08-21 13:33:38 +02:00
Fabian Reinartz 62b4e89b39 Restore legacy API scalar format 2015-07-16 20:19:18 +02:00
Fabian Reinartz 3d67d75935 promql: implement JSON array format for scalar and string 2015-07-06 13:09:26 +02:00
Fabian Reinartz 77e8983221 promql: add MarshalJSON method for SamplePair 2015-07-06 10:29:59 +02:00
Fabian Reinartz 8f904d6a54 api/v1: fix response format tests 2015-07-02 14:12:26 +02:00
Fabian Reinartz b36fa7ad61 api/v1: fix Content-Type in response 2015-07-01 22:47:25 +02:00
Julius Volz bc1c789bab Disallow cross-origin DELETE and POST requests. 2015-06-24 17:26:49 +02:00
Fabian Reinartz 1eff186555 Merge pull request #810 from prometheus/fabxc/lmatch
Match empty labels.
2015-06-22 15:45:50 +02:00
Fabian Reinartz 5b91ea9b36 storage: improve label matching and allow unset matching.
Matching of empty labels now also matches metrics where the label
was not explicitly set to the empty string.
2015-06-22 15:33:44 +02:00
Fabian Reinartz 94cd321be1 promql: error if all label matchers are empty. 2015-06-22 15:33:44 +02:00
Fabian Reinartz 85d7c7640a web: remove flags, refactor handlers 2015-06-15 19:01:06 +02:00
Fabian Reinartz 7bb7e565a4 web/api: add GET and DELETE /series endpoints 2015-06-11 12:24:57 +02:00
Fabian Reinartz 7be94ce962 web/api: improve errors, add tests 2015-06-10 18:36:02 +02:00
Fabian Reinartz 75b0b7420e web/api: replace /metrics/names with /label/:name/values endpoint. 2015-06-08 23:10:52 +02:00
Fabian Reinartz 5b713911e3 web/api: enable running API legacy and v1 in parallel 2015-06-08 19:11:48 +02:00
Fabian Reinartz ab9c98acac web/api: add initial API v1 implementation. 2015-06-06 21:47:36 +02:00
Fabian Reinartz e88e5f680b web: simplify prefix handling using util/route package. 2015-06-03 15:53:04 +02:00
Fabian Reinartz 78047326b4 web: cleanup initialization of web service. 2015-06-03 08:45:43 +02:00
Fabian Reinartz 0de6edbdfc Move pkg/ to util/ 2015-06-01 21:12:32 +02:00
Fabian Reinartz dfaf31a1da Move web/httputils to pkg/httputil and add DeadlineClient to it 2015-06-01 21:12:31 +02:00
Julius Volz d7c015c149 Convert pathPrefix to not have trailing slash. 2015-06-01 12:43:17 +02:00
Björn Rabenstein c44e7cd105 Merge pull request #706 from prometheus/beorn7/persistence2
Improve iterator performance.
2015-05-21 13:48:52 +02:00
beorn7 3b9c421a69 Weed out all the [Gg]et* method names.
The only exception is getNumChunksToPersist to avoid naming the struct
member numChunksToPersist in a weird way.
2015-05-20 19:13:06 +02:00
Julius Volz 267fd34156 Switch Prometheus to use github.com/prometheus/log.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
2015-05-20 18:19:32 +02:00
Fabian Reinartz 279831cdf1 Fix and improve parsing error output. 2015-04-30 12:19:39 +02:00
Fabian Reinartz 25cdff3527 Remove `name` arg from `Parse*` functions, enhance parsing errors. 2015-04-29 16:38:41 +02:00
Fabian Reinartz 3ca11bcaf5 Switch Prometheus to promql package.
This commit removes all functionality from rules/ that is now handled in
promql/.
All parts of Prometheus are changed to use the promql/ package.
2015-04-28 16:19:23 +02:00
Ceesjan Luiten 0e18784c64 Make all paths absolute to support proxies 2015-04-02 20:36:47 +02:00
Julius Volz 33702da8a8 Use simple Now() func in API instead of utility.Time. 2015-03-27 23:43:47 +01:00
Julius Volz a5a553f1da Add initial HTTP API tests.
This covers the /query (instant query) endpoint for now. Others to
follow.
2015-03-27 21:37:55 +01:00
Julius Volz 3f2686d0b3 Remove unused fields from MetricsService. 2015-03-27 18:51:13 +01:00
Julius Volz c9b76def4c Report all query API HTTP errors in JSON format. 2015-03-27 16:48:03 +01:00
Julius Volz df314ead84 Remove unnecessary "else" branch in query API. 2015-03-21 17:54:30 +01:00
Julius Volz a68b880c27 Add tests for new timestamp/duration functions.
...and fix the first bugs in them where they truncate precision below a
second.
2015-03-21 17:50:45 +01:00
Julius Volz cb816ea14a Improve timestamp/duration parsing in query API.
Don't handle `0` as a special timestamp value for "now" anymore, except
in the `QueryRange()` case, where existing API consumers still expect
`0` to mean "now".

Also, properly return errors now for malformed timestamp/duration
float values.
2015-03-21 16:58:45 +01:00
Julius Volz 8a4acefd66 Allow custom timestamps in instant query API. 2015-03-20 23:10:58 +01:00
Julius Volz c78436d707 Remove unused API time dependency injection. 2015-03-20 23:10:26 +01:00
beorn7 fa1935a644 Remove /api/targets call and do not show job and instance labels on status.
/api/targets was undocumented and never used and also broken.

Showing instance and job labels on the status page (next to targets)
does not make sense as those labels are set in an obvious way.

Also add a doc comment to TargetStateToClass.
2015-03-18 18:53:43 +01:00
Fabian Reinartz fa1e90003b Query timeout added.
This is related to #454. Queries now timeout after a duration set by
the -query.timeout flag. The TotalEvalTimer is now started/stopped
inside any of the ast.Eval* functions.
2015-02-03 08:04:27 +01:00
Bjoern Rabenstein 5859b74f1b Clean up license issues.
- Move CONTRIBUTORS.md to the more common AUTHORS.
- Added the required NOTICE file.
- Changed "Prometheus Team" to "The Prometheus Authors".
- Reverted the erroneous changes to the Apache License.
2015-01-21 20:07:45 +01:00
Julius Volz cc27fb8aab Rename remaining all-caps constants in AST layer.
Change-Id: Ibe97e30981969056ffcdb89e63c1468ea1ffa140
2014-12-25 01:30:47 +01:00
Bjoern Rabenstein 39efe6358b Fix typos and import order.
This doesn't make the import order consistend everywhere, just where
it was touched by the previous commit.

Change-Id: I82fc75f8691da9901c7ceb808e6f6fe8e5d62c0e
2014-12-10 17:46:56 +01:00
Bjoern Rabenstein b1e4956142 Apply a giant code cleanup.
Essentially:

- Remove unused code.

- Make it 'go vet' clean. The only remaining warnings are in generated code.

- Make it 'golint' clean. The only remaining warnings are in gerenated code.

- Smoothed out same minor things.

Change-Id: I3fe5c1fbead27b0e7a9c247fee2f5a45bc2d42c6
2014-12-10 16:16:49 +01:00
Julius Volz c3fcea45e3 Support finer time resolutions than 1 second.
Change-Id: I4c5f1d6d2361e841999b23283d1961b1bd0c2859
2014-11-25 17:09:04 +01:00
Brian Brazil f114bbd4e7 Make query_range more robust.
Gracefully handle decimal values, by truncating them.

Limit amount of steps, to avoid accidentally pulling too much data.
This limit returns up to ~500kB per timeseries, and allows
for 60s granularity for a week and 1h granularity for a year.

Change-Id: Ie549fc24deb2eecbc6c5d1b6088a548a6b02e849
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein 71206dbc06 More code cleanups.
Add license text everywhere.
And others....

Change-Id: I11ccde267a2ef7eb366c4788ba7aeae14ba7545c
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein f5f9f3514a Major code cleanup.
- Make it go-vet and golint clean.
- Add comments, TODOs, etc.

Change-Id: If1392d96f3d5b4cdde597b10c8dff1769fcfabe2
2014-11-25 17:02:53 +01:00
Julius Volz e7ed39c9a6 Initial experimental snapshot of next-gen storage.
Change-Id: Ifb8709960dbedd1d9f5efd88cdd359ee9fa9d26d
2014-11-25 17:02:00 +01:00
Julius Volz 21cafe6cd7 Only evict memory series after they are on disk.
This fixes the problem where samples become temporarily unavailable for
queries while they are being flushed to disk. Although the entire
flushing code could use some major refactoring, I'm explicitly trying to
do the minimal change to fix the problem since there's a whole new
storage implementation in the pipeline.

Change-Id: I0f5393a30b88654c73567456aeaea62f8b3756d9
2014-11-25 17:01:59 +01:00
Bjoern Rabenstein 8956faeccb Migrate to new client_golang.
This change will only be submitted when the new client_golang has been
moved to the new version.

Change-Id: Ifceb59333072a08286a8ac910709a8ba2e3a1581
2014-11-25 17:01:59 +01:00
Brian Brazil 1828b1f55c Only log every query when debugging.
Change-Id: I4f988d81cda6f6deb0ed7f497de4aa75409b158f
2014-11-25 17:01:59 +01:00
Julius Volz 01f652cb4c Separate storage implementation from interfaces.
This was initially motivated by wanting to distribute the rule checker
tool under `tools/rule_checker`. However, this was not possible without
also distributing the LevelDB dynamic libraries because the tool
transitively depended on Levigo:

rule checker -> query layer -> tiered storage layer -> leveldb

This change separates external storage interfaces from the
implementation (tiered storage, leveldb storage, memory storage) by
putting them into separate packages:

- storage/metric: public, implementation-agnostic interfaces
- storage/metric/tiered: tiered storage implementation, including memory
                         and LevelDB storage.

I initially also considered splitting up the implementation into
separate packages for tiered storage, memory storage, and LevelDB
storage, but these are currently so intertwined that it would be another
major project in itself.

The query layers and most other parts of Prometheus now have notion of
the storage implementation anymore and just use whatever implementation
they get passed in via interfaces.

The rule_checker is now a static binary :)

Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0
2014-04-16 13:30:19 +02:00
Julius Volz 740d448983 Use custom timestamp type for sample timestamps and related code.
So far we've been using Go's native time.Time for anything related to sample
timestamps. Since the range of time.Time is much bigger than what we need, this
has created two problems:

- there could be time.Time values which were out of the range/precision of the
  time type that we persist to disk, therefore causing incorrectly ordered keys.
  One bug caused by this was:

  https://github.com/prometheus/prometheus/issues/367

  It would be good to use a timestamp type that's more closely aligned with
  what the underlying storage supports.

- sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit
  Unix timestamp (possibly even a 32-bit one). Since we store samples in large
  numbers, this seriously affects memory usage. Furthermore, copying/working
  with the data will be faster if it's smaller.

*MEMORY USAGE RESULTS*
Initial memory usage comparisons for a running Prometheus with 1 timeseries and
100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my
tests, this advantage for some reason decreased a bit the more samples the
timeseries had (to 5-7% for millions of samples). This I can't fully explain,
but perhaps garbage collection issues were involved.

*WHEN TO USE THE NEW TIMESTAMP TYPE*
The new clientmodel.Timestamp type should be used whenever time
calculations are either directly or indirectly related to sample
timestamps.

For example:
- the timestamp of a sample itself
- all kinds of watermarks
- anything that may become or is compared to a sample timestamp (like the timestamp
  passed into Target.Scrape()).

When to still use time.Time:
- for measuring durations/times not related to sample timestamps, like duration
  telemetry exporting, timers that indicate how frequently to execute some
  action, etc.

*NOTE ON OPERATOR OPTIMIZATION TESTS*
We don't use operator optimization code anymore, but it still lives in
the code as dead code. It still has tests, but I couldn't get all of them to
pass with the new timestamp format. I commented out the failing cases for now,
but we should probably remove the dead code soon. I just didn't want to do that
in the same change as this.

Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f
2013-12-03 09:11:28 +01:00
Conor Hennessy eba01d1119 Remove usage of gorest.
Due to on going issues, we've decided to remove gorest. It started with gorest
not being thread-safe (it does introspection to create a new handler which is
an easy process to mess up with multiple threads of execution):
    https://code.google.com/p/gorest/issues/detail?id=15
While the issue has been marked fixed, it looks like the patch has introduced
more problems than the original issue and simply doesn't work properly.
I'm not sure the behaviour was thought through properly. If a new instance is
needed every request then a handler-factory is needed or the library needs to
set expectations about how the new objects should interact with their
constructor state.
While it was tempting to try out another routing library, I think for now
it's better to use dumb vanilla Go routing. At least until we decide which
URL format we intend to standardize on.

Change-Id: Ica3da135d05f8ab8fc206f51eeca4f684f8efa0e
2013-10-23 14:19:14 +02:00
Julius Volz a50ee8df30 Always set CORS headers at beginning of API handler.
Change-Id: Icde9a74260c4bb919f09c3e10c6dd5f372ccdaec
2013-10-16 15:59:47 +02:00
Julius Volz 788587426b Make scrape timeouts configurable per job.
Change-Id: I77a7514ad9e7969771f873d63d6353ec50082a62
2013-08-19 12:21:47 +02:00
Julius Volz 0003027dce Add needed trailing spaces in logs. 2013-08-12 18:22:48 +02:00
Julius Volz aa5d251f8d Use github.com/golang/glog for all logging. 2013-08-12 17:54:36 +02:00
Julius Volz 35ee2cd3cb Add alertmanager notification support to Prometheus.
Alert definitions now also have mandatory SUMMARY and DESCRIPTION fields
that get sent along a firing alert to the alert manager.
2013-07-30 17:23:41 +02:00
Julius Volz 9f07f8677a Generate tabular console view from JSON data. 2013-07-24 12:28:59 +02:00
Matt T. Proud 30b1cf80b5 WIP - Snapshot of Moving to Client Model. 2013-06-25 15:52:42 +02:00
Julius Volz 1fe3d3b06b Remove obsolete argument from target handling code. 2013-06-11 17:54:58 +02:00
Julius Volz 51689d965d Add debug timers to instant and range queries.
This adds timers around several query-relevant code blocks. For now, the
query timer stats are only logged for queries initiated through the UI.
In other cases (rule evaluations), the stats are simply thrown away.

My hope is that this helps us understand where queries spend time,
especially in cases where they sometimes hang for unusual amounts of
time.
2013-06-05 18:32:54 +02:00
Julius Volz 56324d8ce2 Make AST query storage non-global. 2013-05-07 13:15:10 +02:00
Matt T. Proud 3b9b1c6ab4 Define dependencies for web. stack concretely.
This commit destroys the use of AppState, which makes passing
concrete state along to various serving components onerous.
2013-05-06 11:13:12 +02:00
Julius Volz 9cea5d9df8 Convert the Prometheus configuration to protocol buffers. 2013-04-30 22:26:00 +02:00
Matt T. Proud e86f4d9dfd Convert time readers to represent time in UTC.
Go's time.Time represents time as UTC in its fundamental data type.
That said, when using ``time.Unix(...)``, it sets the zone for the
time representation to the local.  Unfortunately with diagnosis and
our tests, it is a PITA to jump between various zones, even though
the serialized version remains the same.

To keep things easy, all places where times are generated or read
are converted into UTC.  These conversions are cheap, for
``Time.In`` merely changes a pointer reference in the struct,
nothing more.  This enables me to diagnose test failures with fixture
data very easily.
2013-04-24 12:19:41 +02:00
Julius Volz a0d311c9e6 Constantize job name label. 2013-04-15 11:47:54 +02:00
Bernerd Schaefer 8af0bbb3a0 Set job label for targets registered through the API
This is set when jobs are statically registered (see
retrieval/targetmanager.go#L92), and should be set here, too.
2013-04-12 14:50:44 +02:00
Bernerd Schaefer 5e9447996b Set CORS Headers on API requests
By setting Access-Control headers, the Prometheus metrics API can be
accessed by cross-origin javascript applications (e.g., an external
dashboard pulling Prometheus metrics).
2013-04-11 14:51:42 +02:00
Julius Volz ec413459fa Depointerize Matrix/Vector types as well as time.Time arguments. 2013-03-28 18:07:12 +01:00
Julius Volz 2b8f0b2cc7 Constantize metric name label name. 2013-03-26 16:20:23 +01:00
Julius Volz dd67ab115b Change GetAllMetricNames() to GetAllValuesForLabel(). 2013-03-26 14:47:07 +01:00
Julius Volz 8e4c5b0cea Use AST query analyzer and views with tiered storage. 2013-03-21 18:16:52 +01:00
Julius Volz 20c5ca1d72 Lower-case web API method arguments. 2013-03-21 18:11:02 +01:00
Julius Volz f1fc7d717a Allow replacing job targets via HTTP API.
This roughly comprises the following changes:

- index target pools by job instead of scrape interval
- make targets within a pool exchangable while preserving existing
  health state for targets
- allow exchanging targets via HTTP API (PUT)
- show target lists in /status (experimental, for own debug use)
2013-02-28 21:33:29 +01:00
Julius Volz 23374788d3 Beginnings of a Prometheus status page. 2013-02-14 19:03:17 +01:00
Julius Volz 0cbd03ccf9 Move web-related code/resources to a subdirectory. 2013-02-08 14:52:36 +01:00