This switches from the prehistoric EventSource API to the more modern
fetch-event-source package. That package gives us full control over the
retries.
It also gives us the opportunity to close the event source when the
browser tab is hidden, saving resources.
Signed-off-by: Julien <roidelapluie@o11y.eu>
This commit introduces a new `/api/v1/notifications/live` endpoint that
utilizes Server-Sent Events (SSE) to stream notifications to the web UI.
This is used to display alerts such as when a configuration reload
has failed.
I opted for SSE over WebSockets because SSE is simpler to implement and
more robust for our use case. Since we only need one-way communication
from the server to the client, SSE fits perfectly without the overhead
of establishing and maintaining a two-way WebSocket connection.
When the SSE connection fails, we fall back to the classic
`/api/v1/notifications` API endpoint.
This commit also contains the required UI changes for the new Mantine UI.
Signed-off-by: Julien <roidelapluie@o11y.eu>
These functions operate on whole series, not on samples, so they do not
fit into the table of functions that return a Vector. Remove the stub
entries that were left to help downstream users of the code identify
what changed.
We cannot remove the entries from the `FunctionCalls` map without
breaking `TestFunctionList`, so put some nils in to keep it happy.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The pattern of `import _ "net/http/pprof"` adds handlers to the default
http handler, but Prometheus does not use that. There are explicit
handlers in `web/web.go`.
So, we can remove this line with no impact on behaviour.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* scrape/scrape_test.go: reduce the time it takes to reload the manager
TestNativeHistogramMaxSchemaSet took over 3x5s to complete because
there's a minimum reload interval.
I've made the test cases run in parallel and reduced the reload interval
to 10ms. Now the test runs in around 0.1-0.2 seconds.
Ran the test 10000 times to check whether it's flaky.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* [ENHANCEMENT] Alerts: remove metrics for removed Alertmanagers
So they don't continue to report stale values.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Go's sorting functions can re-order equal elements, so the strategy of
sorting by the fallback ordering first does not always work.
Pulling the fallback into the main comparison function is more reliable
and more efficient.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The instant vector documentation does not explain which metric samples are selected - in particular, it makes no reference to staleness.
When reading the docs, it is hard to tell how exactly Prometheus selects the samples to report: it uses the most recent sample older than the search timestamp specified in the API request, so long as that metric is not "stale".
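
The selection rule can be sketched roughly as follows. This is a simplified model, not the PromQL engine's code: the `sample` type, the `Stale` flag, the `selectSample` name, and the exact window boundaries are assumptions, and the `lookback` parameter stands in for the engine's lookback window, which this commit message does not discuss.

```go
package main

import "fmt"

// sample is a hypothetical, simplified sample. Stale marks a staleness
// marker written when a series stops being reported.
type sample struct {
	T     int64 // timestamp in milliseconds
	V     float64
	Stale bool
}

// selectSample returns the newest sample at or before ts. It reports false
// if that sample is a staleness marker or lies outside the lookback window.
// samples must be sorted by ascending timestamp.
func selectSample(samples []sample, ts, lookback int64) (sample, bool) {
	for i := len(samples) - 1; i >= 0; i-- {
		s := samples[i]
		if s.T > ts {
			continue // newer than the evaluation timestamp
		}
		if s.Stale || s.T < ts-lookback {
			return sample{}, false
		}
		return s, true
	}
	return sample{}, false
}

func main() {
	series := []sample{{T: 1000, V: 1}, {T: 2000, V: 2}, {T: 3000, Stale: true}}
	fmt.Println(selectSample(series, 2500, 5000)) // {2000 2 false} true
	fmt.Println(selectSample(series, 3500, 5000)) // {0 0 false} false
}
```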
Signed-off-by: Craig Ringer <craig.ringer@enterprisedb.com>
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.
This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
In detail:
- Clarify that label name and value length limits are in bytes,
not in UTF-8 code points.
- More consistent formatting to keep to the 80-character line limit.
- Clarify various misleading specifications around "per sample",
"per scrape", "per scrape config", "per job"...
- Fix grammar.
Signed-off-by: beorn7 <beorn@grafana.com>