Changes to the UI:
- "Active Since" timestamps are now human-readable.
- Alerting rules are now pretty-printed better.
- Labels are no longer just strings, but alert bubbles (like we do on
the status page for base labels).
- Alert states and target health states are now capitalized in the
presentation layer rather than at the source.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
With this commit, sending SIGHUP to the Prometheus process will reload
and apply the configuration file. The different components attempt
to handle failing changes gracefully.
This commits renames the RuleManager to Manager as the package
name is 'rules' now. The unused layer of abstraction of the
RuleManager interface is removed.
The one central sample ingestion channel has caused a variety of
trouble. This commit removes it. Targets and rule evaluation call an
Append method directly now. To incorporate multiple storage backends
(like OpenTSDB), storage.Tee forks the Append into two different
appenders.
Note that the tsdb queue manager had its own queue anyway. It was a
queue after a queue... Much queue, so overhead...
Targets have their own little buffer (implemented as a channel) to
avoid stalling during an http scrape. But a new scrape will only be
started once the old one is fully ingested.
The contraption of three pipelined ingesters was removed. A Target is
an ingester itself now. Despite more logic in Target, things should be
less confusing now.
Also, remove lint and vet warnings in ast.go.
Unary expressions cause parsing errors if they are done in the lexer
by tokenizing them into the number.
This fix moves unary expressions to the parser.
This commits implements the OR operation between two vectors.
Vector matching using the ON clause is added to limit the set of
labels that define a match between two elements. Group modifiers
(GROUP_LEFT/GROUP_RIGHT) to request many-to-one matching are added.
This adds support for scientific notation in the expression language, as
well as for all possible literal forms of +Inf/-Inf/NaN.
TODO: Keep enough state in the parser/lexer to distinguish contexts in
which "Inf", "NaN", etc. should be parsed as a number vs. parsed as a
label name. Currently, foo{nan="bar"} would be a syntax error. However,
that is an existing bug for all our reserved words. E.g. foo{sum="bar"}
is a syntax error as well. This should be fixed separately.
Since we are now getting really deep into floating point calculation,
the tests had to take into account the precision loss. Since the rule
tests are based on direct line matching in the output, implementing
the "almost equal" semantics was pretty cumbersome, but here we are.
This allows changing the time offset for individual instant and range
vectors in a query.
For example, this returns the value of `foo` 5 minutes in the past
relative to the current query evaluation time:
foo offset 5m
Note that the `offset` modifier always needs to follow the selector
immediately. I.e. the following would be correct:
sum(foo offset 5m) // GOOD.
While the following would be *incorrect*:
sum(foo) offset 5m // INVALID.
The same works for range vectors. This returns the 5-minutes-rate that
`foo` had a week ago:
rate(foo[5m] offset 1w)
This change touches the following components:
* Lexer/parser: additions to correctly parse the new `offset`/`OFFSET`
keyword.
* AST: vector and matrix nodes now have an additional `offset` field.
This is used during their evaluation to adjust query and result times
appropriately.
* Query analyzer: now works on separate sets of ranges and instants per
offset. Isolating different offsets from each other completely in this
way keeps the preloading code relatively simple.
No storage engine changes were needed by this change.
The rules tests have been changed to not probe the internal
implementation details of the query analyzer anymore (how many instants
and ranges have been preloaded). This would also become too cumbersome
to test with the new model, and measuring the result of the query should
be sufficient.
This fixes https://github.com/prometheus/prometheus/issues/529
This fixed https://github.com/prometheus/promdash/issues/201
This is related to #454. Queries now timeout after a duration set by
the -query.timeout flag. The TotalEvalTimer is now started/stopped
inside any of the ast.Eval* functions.
The 2nd isCounter argument to delta is ugly, make it optional as the first step
of deprecating it. This will makes delta only ever applied to gauges.
Add a deriv function to calculate the least squares
slope of a gauge. This is more useful for prediction than delta,
as it isn't as heavily influenced by outliers at the boundaries.