This commit introduces a new `/api/v1/notifications/live` endpoint that
utilizes Server-Sent Events (SSE) to stream notifications to the web UI.
This is used to display alerts such as when a configuration reload
has failed.
I opted for SSE over WebSockets because SSE is simpler to implement and
more robust for our use case. Since we only need one-way communication
from the server to the client, SSE fits perfectly without the overhead
of establishing and maintaining a two-way WebSocket connection.
When the SSE connection fails, we go back to a classic
/api/v1/notifications API endpoint.
This commit also contains the required UI changes for the new Mantine UI.
Signed-off-by: Julien <roidelapluie@o11y.eu>
The pattern of `import _ "net/http/pprof"` adds handlers to the default
http handler, but Prometheus does not use that. There are explicit
handlers in `web/web.go`.
So, we can remove this line with no impact to behaviour.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* scrape/scrape_test.go: reduce the time it takes to reload the manager
TestNativeHistogramMaxSchemaSet took over 3x5s to complete because
there's a minimum reload interval.
I've made the testcases run in parallel and reduced the reload interval
to 10ms. Now the test runs in around 0.1-0.2 seconds.
Ran test 10000 times to check if it's flaky.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
---------
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* [ENHANCEMENT] Alerts: remove metrics for removed Alertmanagers
So they don't continue to report stale values.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The instant vector documentation does not explain which metric samples are selected - in particular, it makes no reference to staleness.
It's confusing when reading the docs to understand how exactly Prometheus selects the metrics to report: the most recent sample older than the search timestamp specified in the API request, so long as that metric is not "stale".
Signed-off-by: Craig Ringer <craig.ringer@enterprisedb.com>
In detail:
- Clarify that label name and value length limits are in byte,
not in UTF-8 data points.
- More consistent formatting to keep 80 characters line limet.
- Clarify various misleading specifications around "per sample",
"per scrape", "per scrape config", "per job"...
- Fix grammar.
Signed-off-by: beorn7 <beorn@grafana.com>
* PromQL.Engine: Refactor Matrix expansion into a method
Add utility method promql.evaluator.expandSeriesToMatrix, for expanding a slice
of storage.Series into a promql.Matrix.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
* Rename to generateMatrix
Rename evaluator.expandSeriesToMatrix into generateMatrix, while also dropping
the start, end, interval arguments since they are evaluator fields.
Write more extensive method documentation.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
* Rename to evalVectorSelector
Rename to evalVectorSelector after discussing with @michahoffmann.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
If the query overlaps the range currently undergoing compaction, we
should only fetch chunks up to that time. Need to store that min time
in `HeadAndOOOIndexReader`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
If the query overlaps the range currently undergoing compaction, we
should only fetch chunks up to that time. Need to store that min time
in `HeadAndOOOIndexReader`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
When a YAML file is invalid, trigger auto-reload anyway so that user is
aware that the configuration file is incorrect.
Failing to do so does not change the reload status in metrics and api.
Signed-off-by: Julien <roidelapluie@o11y.eu>
Previously, scrapes durations that are very short (e.g., connection refused)
could show as empty (durations under 1 millisecond).
This commit ensures that sub-millisecond durations are correctly
displayed as "0ms" or "1ms" when necessary.
- Adjusted `humanizeDuration` to round sub-millisecond durations to the
nearest millisecond.
- Updated unit tests to verify the correct handling of sub-millisecond
values.
Signed-off-by: Julien <roidelapluie@o11y.eu>