14111 Commits (e32dbeb3843497a84703560c4e3e5f06c00a6c83)

Author SHA1 Message Date
machine424 97f3219157 test(discovery): add a Configs test showing that the custom unmarshalling/marshalling is broken.
This went under the radar because the utils are never called directly.

We usually marshal/unmarshal Configs embedded in a struct, using UnmarshalYAMLWithInlineConfigs/MarshalYAMLWithInlineConfigs,
which bypass Configs' custom UnmarshalYAML/MarshalYAML.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-09-30 12:33:07 +02:00
Julien 537c5dbbcf
Merge pull request #14994 from roidelapluie/notifications2
Follow-up on notifications via SSE
2024-09-30 10:17:34 +02:00
Bryan Boreham 54de4fb780
Merge pull request #14975 from colega/process-mempostings-delete-with-gomaxprocs-workers
Process `MemPostings.Delete()` with `GOMAXPROCS` workers
2024-09-29 07:58:42 +01:00
Ayoub Mrini 105ab2e95a
fix(test): adjust defer invocations (#14996)
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-09-27 17:13:51 +01:00
Julien e34563bfe0 Retry SSE connection unless max clients have been reached.
This switches from the prehistoric EventSource API to the more modern
fetch-event-source package. That package gives us full control over the
retries.

It also gives us the opportunity to close the event source when the
browser tab is hidden, saving resources.

Signed-off-by: Julien <roidelapluie@o11y.eu>
2024-09-27 16:18:33 +02:00
Julien f9bbad1148 Limit the number of SSE Subscribers to 16 by default
Signed-off-by: Julien <roidelapluie@o11y.eu>
2024-09-27 15:51:51 +02:00
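
A cap on concurrent subscribers, as described in the commit above, can be sketched with a buffered channel used as a counting semaphore. This is only an illustration: the `subscriberLimit` type and its methods are hypothetical names and are not taken from the Prometheus code; only the default of 16 comes from the commit title.

```go
package main

import "fmt"

// subscriberLimit is a hypothetical helper (not from the Prometheus source)
// that admits at most cap(slots) concurrent subscribers.
type subscriberLimit struct {
	slots chan struct{}
}

// newSubscriberLimit creates a limiter that admits at most max subscribers.
func newSubscriberLimit(max int) *subscriberLimit {
	return &subscriberLimit{slots: make(chan struct{}, max)}
}

// tryAcquire reserves a slot without blocking and reports whether it succeeded.
func (l *subscriberLimit) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot previously reserved with tryAcquire.
func (l *subscriberLimit) release() {
	<-l.slots
}

func main() {
	limit := newSubscriberLimit(16) // 16 is the default named in the commit title
	admitted := 0
	for i := 0; i < 20; i++ {
		if limit.tryAcquire() {
			admitted++
		}
	}
	fmt.Println("admitted:", admitted)
}
```

A handler built on this would reject a new subscription when `tryAcquire` returns false and call `release` when the client disconnects.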
Julien 7aa4721373
Merge pull request #14946 from roidelapluie/notifications
Add notifications to the Web UI
2024-09-27 15:50:43 +02:00
Julien 6cde0096e2 Add notifications to the web UI when configuration reload fails.
This commit introduces a new `/api/v1/notifications/live` endpoint that
utilizes Server-Sent Events (SSE) to stream notifications to the web UI.
This is used to display alerts such as when a configuration reload
has failed.

I opted for SSE over WebSockets because SSE is simpler to implement and
more robust for our use case. Since we only need one-way communication
from the server to the client, SSE fits perfectly without the overhead
of establishing and maintaining a two-way WebSocket connection.

When the SSE connection fails, we go back to a classic
/api/v1/notifications API endpoint.

This commit also contains the required UI changes for the new Mantine UI.

Signed-off-by: Julien <roidelapluie@o11y.eu>
2024-09-27 15:28:38 +02:00
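
The one-way streaming that the commit above describes can be sketched as a small Go handler. This is a minimal sketch under stated assumptions, not the Prometheus implementation: `sseHandler`, `streamOnce`, the `/events` path, and the message text are hypothetical, and each message is assumed to be a single line (multi-line payloads need one `data:` field per line).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sseHandler (hypothetical name) streams every message received on msgs to
// the client as a Server-Sent Event until the client goes away or msgs closes.
func sseHandler(msgs <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return // client disconnected
			case m, open := <-msgs:
				if !open {
					return
				}
				// One event: a "data:" field terminated by a blank line.
				fmt.Fprintf(w, "data: %s\n\n", m)
				flusher.Flush()
			}
		}
	}
}

// streamOnce runs the handler against an in-memory recorder and returns the
// raw event stream, which makes the wire format easy to inspect.
func streamOnce(messages ...string) string {
	ch := make(chan string, len(messages))
	for _, m := range messages {
		ch <- m
	}
	close(ch)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/events", nil)
	sseHandler(ch)(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Print(streamOnce("configuration reload failed"))
}
```

The server only ever writes to the response, which is the simplification over WebSockets that the commit message points to.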
Bryan Boreham b8e5b7cda9 [REFACTOR] PromQL: remove label_join and label_replace stubs
These functions operate on whole series, not on samples, so they do not
fit into the table of functions that return a Vector. Remove the stub
entries that were left to help downstream users of the code identify
what changed.

We cannot remove the entries from the `FunctionCalls` map without
breaking `TestFunctionList`, so put some nils in to keep it happy.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-27 11:20:45 +01:00
Oleg Zaytsev ada8a6ef10
Add some more tests for MemPostings_Delete
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-27 10:14:39 +02:00
Bryan Boreham 410fcce6f0
Remove unnecessary pprof import (#14988)
The pattern of `import _ "net/http/pprof"` adds handlers to the default
http handler, but Prometheus does not use that. There are explicit
handlers in `web/web.go`.

So, we can remove this line with no impact to behaviour.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-27 07:45:49 +01:00
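
The side effect mentioned in the commit above can be checked with the standard library alone. The sketch below is illustrative and is not Prometheus code; the helper names are made up. It shows that the blank import registers handlers on `http.DefaultServeMux` only, so a server that routes through its own mux never sees them.

```go
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof" // side effect: registers /debug/pprof/ handlers on http.DefaultServeMux
)

// hasPprofIndex reports whether mux has a handler registered for /debug/pprof/.
func hasPprofIndex(mux *http.ServeMux) bool {
	req, err := http.NewRequest(http.MethodGet, "/debug/pprof/", nil)
	if err != nil {
		return false
	}
	// Handler returns an empty pattern when no registered handler matches.
	_, pattern := mux.Handler(req)
	return pattern != ""
}

// defaultMuxHasPprof checks the process-wide default mux.
func defaultMuxHasPprof() bool { return hasPprofIndex(http.DefaultServeMux) }

// freshMuxHasPprof checks a newly created mux, as a server with its own
// routing would use.
func freshMuxHasPprof() bool { return hasPprofIndex(http.NewServeMux()) }

func main() {
	fmt.Println("default mux has pprof:", defaultMuxHasPprof())
	fmt.Println("fresh mux has pprof:", freshMuxHasPprof())
}
```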
Julius Volz 5f26d86daa
Merge pull request #14982 from prometheus/fix-remove-defunct-alert-close-buttons
Remove Query page alert close buttons that don't do anything
2024-09-26 20:39:45 +02:00
Björn Rabenstein f74722841b
Merge pull request #14160 from alex-kattathra-johnson/issue-13959
Remove no-default-scrape-port featureFlag
2024-09-26 18:45:56 +02:00
George Krajcsovits 79a6238e19
scrape/scrape_test.go: reduce the time it takes to reload the manager (#14447)
* scrape/scrape_test.go: reduce the time it takes to reload the manager

TestNativeHistogramMaxSchemaSet took over 3x5s to complete because
there's a minimum reload interval.

I've made the testcases run in parallel and reduced the reload interval
to 10ms. Now the test runs in around 0.1-0.2 seconds.

Ran test 10000 times to check if it's flaky.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-26 18:35:15 +02:00
Arthur Silva Sens d5f65cfce0
Merge pull request #14694 from prometheus/ct-histogram
Histogram CT Zero ingestion
2024-09-26 12:48:46 -03:00
Bryan Boreham 5710ddf24f
[ENHANCEMENT] Alerts: remove metrics for removed Alertmanagers (#13909)
* [ENHANCEMENT] Alerts: remove metrics for removed Alertmanagers

So they don't continue to report stale values.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-26 15:32:18 +01:00
Arthur Silva Sens 95a53ef982
Join tests for appending float and histogram CTs
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-26 11:29:31 -03:00
Arthur Silva Sens 6bd9b1a7cc
Histogram CT Zero ingestion
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-26 11:29:22 -03:00
Oleg Zaytsev 4fd2556baa
Extract processWithBoundedParallelismAndConsistentWorkers
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-26 15:43:19 +02:00
Björn Rabenstein 751100b3d0
Merge pull request #12998 from ringerc/docs-instant-vector-staleness
Docs: Refer to staleness in instant vector documentation
2024-09-26 14:52:34 +02:00
Craig Ringer 15b68e989c Refer to staleness in instant vector documentation
The instant vector documentation does not explain which metric samples are selected - in particular, it makes no reference to staleness.

It's confusing when reading the docs to understand how exactly Prometheus selects the metrics to report: the most recent sample older than the search timestamp specified in the API request, so long as that metric is not "stale".

Signed-off-by: Craig Ringer <craig.ringer@enterprisedb.com>
2024-09-26 11:54:31 +12:00
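
The selection rule described in the commit above can be sketched as follows. This is a simplified model, not the PromQL engine: the `sample` type, the `selectInstant` function, the explicit `stale` flag, and the `lookback` parameter are all assumptions made for illustration, and a sample exactly at the evaluation timestamp is treated as eligible.

```go
package main

import "fmt"

// sample is a hypothetical, simplified sample: a timestamp, a value, and a
// flag standing in for a staleness marker.
type sample struct {
	t     int64
	v     float64
	stale bool
}

// selectInstant returns the most recent sample at or before ts, provided it
// is not a staleness marker and is no older than lookback. samples must be
// sorted by ascending timestamp.
func selectInstant(samples []sample, ts, lookback int64) (sample, bool) {
	for i := len(samples) - 1; i >= 0; i-- {
		s := samples[i]
		if s.t > ts {
			continue // newer than the evaluation timestamp
		}
		if s.stale || ts-s.t > lookback {
			return sample{}, false // series is stale at ts
		}
		return s, true
	}
	return sample{}, false // no sample at or before ts
}

func main() {
	series := []sample{{t: 10, v: 1}, {t: 20, v: 2}, {t: 30, stale: true}}
	s, ok := selectInstant(series, 25, 100)
	fmt.Println(s.v, ok)
	_, ok = selectInstant(series, 35, 100)
	fmt.Println(ok)
}
```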
Julius Volz fcbd18dabb Remove Query page alert close buttons that don't do anything
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2024-09-25 18:27:27 +02:00
Alex Johnson be0f10054e Remove no-default-scrape-port featureFlag
Signed-off-by: Alex Johnson <alex.kattathra.johnson@gmail.com>
2024-09-25 10:13:19 -05:00
Oleg Zaytsev ccd0308abc
Don't do anything if MemPostings are empty
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 15:00:10 +02:00
Oleg Zaytsev 9c417aa710
Fix deadlock with empty MemPostings
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 14:08:50 +02:00
Bryan Boreham 5d8f0ef0c2
Merge pull request #14721 from bboreham/exp-grow-postings
[PERF] TSDB: Grow postings by doubling
2024-09-25 10:47:55 +01:00
Oleg Zaytsev e196b977af
Process MemPostings.Delete() with GOMAXPROCS workers
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.

This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 10:38:47 +02:00
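
The idea in the commit above, spreading the work across a bounded number of workers, can be sketched generically. This is not the `MemPostings.Delete()` implementation: `filterAll`, the slice-of-slices layout, and the `deleted` set are hypothetical, and the real code also has to deal with the mutex, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// filterAll (hypothetical) removes every ID found in deleted from each list,
// processing the lists with at most GOMAXPROCS concurrent workers.
func filterAll(lists [][]uint64, deleted map[uint64]struct{}) {
	workers := runtime.GOMAXPROCS(0) // 0 queries the current setting
	if workers > len(lists) {
		workers = len(lists)
	}
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handed to exactly one worker, so writes to
				// lists[i] never race; deleted is only read.
				kept := lists[i][:0]
				for _, id := range lists[i] {
					if _, gone := deleted[id]; !gone {
						kept = append(kept, id)
					}
				}
				lists[i] = kept
			}
		}()
	}
	for i := range lists {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
}

func main() {
	lists := [][]uint64{{1, 2, 3}, {2, 4}, {5}}
	filterAll(lists, map[uint64]struct{}{2: {}})
	fmt.Println(lists)
}
```

With an empty input no workers are started and no jobs are sent, so the function returns immediately.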
Julius Volz 5037cf75f2
Merge pull request #14972 from prometheus/jvp/make-mantime-ui-assets-relative
UI: Make Mantine UI assets relative
2024-09-24 17:38:21 +02:00
Björn Rabenstein 67caa03dc1
Merge pull request #14970 from prometheus/beorn7/doc
docs: Improve, clarify, and fix documentation on scrape limits
2024-09-24 16:30:15 +02:00
Jesus Vazquez cb4bc5e786
UI: Make Mantine UI assets relative
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-09-24 15:30:54 +02:00
beorn7 a9243d4d2c docs: Improve, clarify, and fix documentation on scrape limits
In detail:

- Clarify that label name and value length limits are in bytes,
  not in UTF-8 code points.

- More consistent formatting to keep to the 80-character line limit.

- Clarify various misleading specifications around "per sample",
  "per scrape", "per scrape config", "per job"...

- Fix grammar.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-09-24 14:55:54 +02:00
Bryan Boreham a0f26febc2
Merge pull request #12180 from damnever/perf/relabel-add-label
Optimize constant label pair adding from relabeling.
2024-09-24 12:22:05 +01:00
Arve Knudsen c2bbabb4a7
promql.Engine: Refactor vector selector evaluation into a method (#14900)
* PromQL.Engine: Refactor Matrix expansion into a method

Add utility method promql.evaluator.expandSeriesToMatrix, for expanding a slice
of storage.Series into a promql.Matrix.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Rename to generateMatrix

Rename evaluator.expandSeriesToMatrix into generateMatrix, while also dropping
the start, end, interval arguments since they are evaluator fields.
Write more extensive method documentation.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Rename to evalVectorSelector

Rename to evalVectorSelector after discussing with @michahoffmann.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-09-24 11:03:56 +01:00
Bryan Boreham faf5ba29ba
Merge pull request #14959 from prometheus/merge-2.55-into-main
Merge 2.55 into main
2024-09-23 18:39:37 +01:00
Arve Knudsen 3f9b869fb5 Fix react-app (old UI) package-lock.json
cd web/ui/react-app
npm install

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-09-23 16:34:37 +01:00
George Krajcsovits f179cb948b
chore: bump client_golang from 1.20.3 to 1.20.4 (#14963)
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-23 13:46:51 +02:00
Julien 919648cafc
Merge pull request #14947 from roidelapluie/reloadinvalidyaml
fix(autoreload): Reload invalid yaml files
2024-09-23 10:03:23 +02:00
Bryan Boreham 4c90118361 Remove CHANGELOG duplicate line
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

#14402 is the issue and #14403 is the fix.
2024-09-22 17:53:41 +01:00
Bryan Boreham ca673eb749 Merge remote-tracking branch 'origin/release-2.55' into merge-2.55-into-main
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-22 17:49:34 +01:00
Bryan Boreham e3f5c7c2a0 [Release 2.55] Update CHANGELOG
Make text more consistent with 3.0 branch

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-22 17:42:04 +01:00
Bryan Boreham 31c5760551
Neater string vs byte-slice conversions (#14425)
unsafe.StringData was added in Go 1.20 (unsafe.Slice dates from Go 1.17)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-21 12:19:21 +02:00
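
The conversions that the commit above refers to can be written with the `unsafe` helpers roughly as follows. This is a sketch, not the Prometheus code: the function names `bytesToString` and `stringToBytes` are hypothetical, and the block needs Go 1.20 or later because it relies on `unsafe.String`, `unsafe.StringData`, and `unsafe.SliceData`.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString returns a string that shares memory with b (no copy).
// The caller must not modify b while the string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes returns a byte slice that shares memory with s (no copy).
// The result must never be written to, because strings are immutable.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	fmt.Println(bytesToString([]byte("label_name")))
	fmt.Println(len(stringToBytes("value")))
}
```

Both helpers trade safety for the avoided allocation, so they only make sense where the aliasing rules in the comments can be guaranteed.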
Arthur Silva Sens 6bcb064d93
Merge pull request #14950 from Maniktherana/fuzz-om-minor-change
chore: remove unused code
2024-09-21 09:22:17 +01:00
Julius Volz 52fe4cc4ee
Merge pull request #14944 from roidelapluie/copy
Mantine UI: removed unused file
2024-09-20 21:28:09 +02:00
Julius Volz dfc6f4b5bc
Merge pull request #14945 from roidelapluie/submillis
fix(web): properly format sub-millisecond durations in target status page
2024-09-20 21:27:16 +02:00
Bryan Boreham e0260930d6
Merge pull request #14951 from prometheus/update-rel-2.55
[release-2.55] Add #14948 to rc0
2024-09-20 18:42:51 +01:00
Bryan Boreham d42232e178
Merge pull request #14932 from bboreham/chunk-xor-combine-writebits
[PERF] TSDB: Chunk encoding: shorten some write sequences
2024-09-20 17:53:54 +01:00
Bryan Boreham e3617cbd2c Add #14948 to CHANGELOG
Also update the date of the RC which hasn't gone out yet.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-20 17:48:04 +01:00
Bryan Boreham 6f0d6038b7 [BUGFIX] TSDB: Only query chunks up to truncation time (#14948)
If the query overlaps the range currently undergoing compaction, we
should only fetch chunks up to that time. Need to store that min time
in `HeadAndOOOIndexReader`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-20 17:44:04 +01:00
Arthur Silva Sens ca18f298e1
Merge pull request #14949 from Maniktherana/minor-fixes-ct
refac: make typeRequiresCT private
2024-09-20 17:41:06 +01:00
Bryan Boreham 9215252221
[BUGFIX] TSDB: Only query chunks up to truncation time (#14948)
If the query overlaps the range currently undergoing compaction, we
should only fetch chunks up to that time. Need to store that min time
in `HeadAndOOOIndexReader`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-20 18:40:17 +02:00