Commit Graph

14655 Commits (6a6630d2a7db01c7968f5bb77cea6b0c264acbfd)

Author SHA1 Message Date
Jan Fajerski 24a10528ac
Merge pull request #15205 from tjhop/chore/slog-fixes
slog: various fixes
2024-10-25 11:11:46 +02:00
Jan Fajerski 07d01a9e0c
Merge pull request #15219 from jan--f/rw-default-http2-off
[CHANGE] Remote-write: default enable_http2 to false
2024-10-25 11:07:55 +02:00
TJ Hoplock 4f9e4dc016 ref: remove unused deduper log wrapper methods
I used these wrapper methods during initial development of the custom
handler that the deduper now implements. Since the deduper implements
slog.Handler and can be used directly as a logger, these wrapper methods
are no longer needed.

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-24 22:31:37 -04:00
TJ Hoplock b602393473 fix: avoid data race in log deduper
This change should have been included in the initial prometheus slog
conversion, but I must've lost track of it in all the rebases involved
in that PR.

This changes the dedupe logger so that the only method that needs to use
the lock is the `Handle()` method that actually interacts with the
deduplication map.

Ex:
```
==================
WARNING: DATA RACE
Write at 0x00c000518bc0 by goroutine 29481:
  github.com/prometheus/prometheus/util/logging.(*Deduper).WithAttrs()
      /home/tjhop/go/src/github.com/prometheus/prometheus/util/logging/dedupe.go:89 +0xef
  log/slog.(*Logger).With()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/log/slog/logger.go:132 +0x106
  github.com/prometheus/prometheus/storage/remote.NewQueueManager()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/queue_manager.go:483 +0x7a9
  github.com/prometheus/prometheus/storage/remote.(*WriteStorage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/write.go:201 +0x102c
  github.com/prometheus/prometheus/storage/remote.(*Storage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage.go:92 +0xfd
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.func1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:172 +0x3e4
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.gowrap1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:174 +0x41

Previous read at 0x00c000518bc0 by goroutine 31261:
  github.com/prometheus/prometheus/util/logging.(*Deduper).Handle()
      /home/tjhop/go/src/github.com/prometheus/prometheus/util/logging/dedupe.go:82 +0x2b1
  log/slog.(*Logger).log()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/log/slog/logger.go:257 +0x228
  log/slog.(*Logger).Error()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/log/slog/logger.go:230 +0x3d4
  github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).loop()
      /home/tjhop/go/src/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:254 +0x2db
  github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).Start.gowrap1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:227 +0x33

Goroutine 29481 (running) created at:
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:164 +0xe4
  testing.tRunner()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/testing/testing.go:1743 +0x44

Goroutine 31261 (running) created at:
  github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).Start()
      /home/tjhop/go/src/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:227 +0x177
  github.com/prometheus/prometheus/storage/remote.(*QueueManager).Start()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/queue_manager.go:934 +0x304
  github.com/prometheus/prometheus/storage/remote.(*WriteStorage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/write.go:232 +0x151b
  github.com/prometheus/prometheus/storage/remote.(*Storage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage.go:92 +0xfd
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.func1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:172 +0x3e4
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.gowrap1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:174 +0x41
==================
--- FAIL: TestWriteStorageApplyConfigsDuringCommit (2.26s)
    testing.go:1399: race detected during execution of test
FAIL
FAIL    github.com/prometheus/prometheus/storage/remote 68.321s
```

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-24 22:30:38 -04:00
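The locking approach described in b602393473 can be sketched roughly as follows. This is a minimal illustration, not the code in util/logging/dedupe.go: the type `dedupeHandler`, its fields, the helper `demo`, and the choice to deduplicate on the message text alone are all assumptions made for the example. The point it tries to show is that `WithAttrs()` and `WithGroup()` return a new handler without writing to the receiver, so only `Handle()` needs the lock.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
	"sync"
)

// dedupeHandler is a hypothetical, simplified deduplicating slog.Handler.
// Derived handlers share the same mutex and the same "seen" map.
type dedupeHandler struct {
	next slog.Handler
	mu   *sync.Mutex         // guards seen
	seen map[string]struct{} // assumption: keyed on the message text only
}

func newDedupeHandler(next slog.Handler) *dedupeHandler {
	return &dedupeHandler{next: next, mu: &sync.Mutex{}, seen: make(map[string]struct{})}
}

func (h *dedupeHandler) Enabled(ctx context.Context, level slog.Level) bool {
	return h.next.Enabled(ctx, level)
}

// Handle is the only method that touches the shared map, so it is the
// only method that takes the lock.
func (h *dedupeHandler) Handle(ctx context.Context, r slog.Record) error {
	h.mu.Lock()
	_, dup := h.seen[r.Message]
	if !dup {
		h.seen[r.Message] = struct{}{}
	}
	h.mu.Unlock()
	if dup {
		return nil // drop the repeated message
	}
	return h.next.Handle(ctx, r)
}

// WithAttrs builds a new handler and never writes to the receiver, so it
// cannot race with a concurrent Handle call on the original handler.
func (h *dedupeHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
	return &dedupeHandler{next: h.next.WithAttrs(attrs), mu: h.mu, seen: h.seen}
}

// WithGroup follows the same copy-instead-of-mutate pattern.
func (h *dedupeHandler) WithGroup(name string) slog.Handler {
	return &dedupeHandler{next: h.next.WithGroup(name), mu: h.mu, seen: h.seen}
}

// demo logs the same message from several goroutines, each through its own
// derived logger, and returns how many lines reached the output.
func demo() int {
	var buf bytes.Buffer
	logger := slog.New(newDedupeHandler(slog.NewTextHandler(&buf, nil)))
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			logger.With("worker", id).Error("remote write failed")
		}(i)
	}
	wg.Wait()
	return bytes.Count(buf.Bytes(), []byte("\n"))
}

func main() {
	fmt.Println(demo()) // prints 1
}
```

Running a sketch like this under `go run -race` is one way to check that concurrent `With()` and `Error()` calls no longer trip the race detector.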
Jan Fajerski 7939eab77a remote-write: change test default expected to http2 disabled
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-10-24 22:32:08 +02:00
Bryan Boreham 20fdc8f541 [CHANGE] Remote-write: default enable_http2 to false
Remote-write creates several shards to parallelise sending, each with
its own http connection. We do not want them all combined onto one
socket by http2.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-10-24 22:27:06 +02:00
Ben Ye 99882eec3b log last series labelset when hitting OOO series labels during compaction
Signed-off-by: Ben Ye <benye@amazon.com>
2024-10-24 09:27:15 -07:00
Jesus Vazquez 3cb09acb21
Docs: Remove experimental note on out of order feature (#15215)
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-10-24 18:18:21 +02:00
George Krajcsovits 469573b13b
fix(nhcb): do not return nhcb from parse if exponential is present (#15209)
From: https://github.com/prometheus/prometheus/pull/14978#discussion_r1800755481
Also encode the requirement table set in #13532

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-24 18:14:05 +02:00
Jonathan Ballet 7ca90e5729 doc: fix formatting
Signed-off-by: Jonathan Ballet <jon@multani.info>
2024-10-24 08:55:23 +02:00
George Krajcsovits 2182b83271
feat(nhcb): implement created timestamp handling (#15198)
Call through to the underlying parser if we are not in a histogram
and the entry is a series or exponential native histogram. Otherwise store
and retrieve CT for NHCB.

* fix(omparser): losing exemplars when CT is parsed

Fixes: #15137
Ignore exemplars while peeking ahead during CT parsing.
Simplify state reset with defer().

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-24 07:38:58 +02:00
Vanshika cccbe72514
TSDB: Fix some edge cases when OOO is enabled (#14710)
Fix some edge cases when OOO is enabled

Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Signed-off-by: Vanshika <102902652+Vanshikav123@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-10-23 17:34:28 +02:00
Björn Rabenstein 7c7116fea8
Merge pull request #15176 from jhesketh/jhesketh/round
Round function should ignore native histograms
2024-10-22 19:14:16 +02:00
George Krajcsovits aa81210c8b
NHCB scrape: refactor state handling and speed up scrape test (#15193)
* NHCB: scrape use state field and not booleans

From comment https://github.com/prometheus/prometheus/pull/14978#discussion_r1800898724

Also make compareLabels read only and move storeLabels to the first
processed classic histogram series.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Speed up TestConvertClassicHistogramsToNHCB 3x

Reduce the startup time and timeouts

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* lint fix

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-22 17:49:25 +01:00
Björn Rabenstein 3bb5e28c6b
Merge pull request #15197 from prometheus/alexg/docs-issue-11570
docs: add keep_firing_for in alerting rules
2024-10-22 15:35:36 +02:00
George Krajcsovits 1b4e7f74e6
feat(tools): add debug printouts to rules unit testing (#15196)
* promtool: Add debug flag for rule tests

This makes it print out the tsdb state (both input_series and rules that
are run) at the end of a test, making reasoning about tests much easier.

Signed-off-by: David Leadbeater <dgl@dgl.cx>

* Reuse generated test name from junit testing

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: David Leadbeater <dgl@dgl.cx>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: David Leadbeater <dgl@dgl.cx>
2024-10-22 15:24:36 +02:00
alexgreenbank 3afcda82be docs: add keep_firing_for in alerting rules
Signed-off-by: alexgreenbank <alex.greenbank@grafana.com>
2024-10-22 14:19:01 +01:00
machine424 cebcdce78a
fix(storage/mergeQuerier): copy the matchers slice before passing it to queriers as
some of them may alter it.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-10-22 14:08:47 +02:00
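The idea behind cebcdce78a, handing each querier its own copy of the slice, might look like the sketch below. The `matcher` and `querier` types and the functions `selectFromAll`, `reorderingQuerier`, and `demo` are hypothetical stand-ins, not the types used in the storage package. Note that `slices.Clone` makes a shallow copy: it protects the caller's slice against reordering or element replacement, but not against changes to the values the pointers refer to.

```go
package main

import (
	"fmt"
	"slices"
)

// matcher is a hypothetical stand-in for a label matcher.
type matcher struct{ Name, Value string }

// querier is a hypothetical stand-in for a querier's select-style method.
type querier func(ms []*matcher)

// selectFromAll gives every querier its own shallow copy of ms, so that a
// querier which alters its argument cannot affect the caller or the other
// queriers.
func selectFromAll(qs []querier, ms []*matcher) {
	for _, q := range qs {
		q(slices.Clone(ms))
	}
}

// reorderingQuerier simulates a querier that alters the slice it receives.
func reorderingQuerier(ms []*matcher) {
	slices.Reverse(ms)
}

// demo returns the matcher names in the caller's slice after the queriers
// have run.
func demo() string {
	ms := []*matcher{{"a", "1"}, {"b", "2"}}
	selectFromAll([]querier{reorderingQuerier, reorderingQuerier}, ms)
	return ms[0].Name + "," + ms[1].Name
}

func main() {
	fmt.Println(demo()) // prints a,b
}
```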
machine424 eb523a6b29
fix(storage/mergeQuerier): add a reproducer for data race that occurs when one of the queriers alters the passed matchers and propose a fix
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-10-22 14:08:46 +02:00
Bryan Boreham 91d80252c3
Merge pull request #15194 from prometheus/make-release-2.55
Create release 2.55.0
2024-10-22 11:53:52 +01:00
Bryan Boreham bb27c6b896 Create release 2.55.0
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-10-22 09:31:02 +01:00
George Krajcsovits ad4857de52
Merge pull request #14978 from prometheus/nhcb-scrape-impl
feat: NHCB: convert classic histograms to nhcb in scrape MVP
2024-10-22 07:55:58 +02:00
Yijie Qin d2802c6fac
api: Add rule group pagination to list rules api (#14017)
* Add paginated feature to list rules api

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* Refactor to simplify code:

* Reduce number of variables
* Reduce type conversion

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Simplify paginated implementation

* Remove maxAlerts parameter.
* Reuse existing API responses by using omitempty in some fields

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Simplify pagination implementation

* Eliminate the need to sort the rule groups.

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Fix linting error

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Add more unit tests

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Update pagination parameters to be consistent with existing parameters

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Rename max_rule_groups to max_groups

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Refactor to simplify code

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Refactor to simplify the calculation of next token

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Handle corner case in pagination request

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Handle corner cases for pagination of list rules

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Update documentation for list rules parameters

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Refactor comments

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Simplify pagination implementation

* Eliminate need for extra structs to store pagination parameters

Signed-off-by: Raphael Silva <rapphil@gmail.com>

* Update docs/querying/api.md

Co-authored-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Raphael Philipe Mendes da Silva <rapphil@gmail.com>

* Update web/api/v1/api.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Raphael Philipe Mendes da Silva <rapphil@gmail.com>

* Update comment describing the need for next token

Signed-off-by: Raphael Silva <rapphil@gmail.com>

---------

Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Philipe Mendes da Silva <rapphil@gmail.com>
Co-authored-by: Raphael Silva <rapphil@gmail.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2024-10-22 00:04:40 +01:00
George Krajcsovits 877fd2a60e
Update scrape/scrape.go
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2024-10-21 16:01:34 +02:00
Bryan Boreham 70e2d23027
Merge pull request #11474 from clwluvw/group-label
[FEATURE] rules: add labels at group level
2024-10-21 14:47:12 +01:00
György Krajcsovits 25ef4d3483 benchmark, rename parser omtext_with_nhcb
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 15:40:48 +02:00
György Krajcsovits bee1eb7720 goimports run
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 14:02:32 +02:00
György Krajcsovits 555bd6292a Better docstring on test
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 13:48:21 +02:00
György Krajcsovits a6947e1e6d Remove omcounterdata.txt as redundant
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 13:45:33 +02:00
György Krajcsovits eaee6bacc7 Fix failing benchmarks
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 13:40:16 +02:00
György Krajcsovits 5ee0980cd1 Add unit test to show that current wrapper is sub-optimal
https://github.com/prometheus/prometheus/pull/14978#discussion_r1800755481

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 13:35:33 +02:00
György Krajcsovits 4283ae73dc Rename convert_classic_histograms to convert_classic_histograms_to_nhcb
On reviewer request.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 13:22:58 +02:00
György Krajcsovits a23aed5634 More followup to #15164
Scrape test for NHCB modified.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 11:10:50 +02:00
György Krajcsovits 70742a64aa Follow up #15178
Renaming

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 11:03:47 +02:00
György Krajcsovits 482bb453c6 Followup to #15164
Update test cases

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-21 11:03:07 +02:00
György Krajcsovits 8c1b5a6251 Merge branch 'main' into nhcb-scrape-impl 2024-10-21 11:00:41 +02:00
Bryan Boreham 6b36a5592a
Merge pull request #14618 from machine424/para
test(cmd/prometheus): speed up test execution by t.Parallel() when possible
2024-10-20 18:19:51 +01:00
Ayoub Mrini d8c1605930
Merge pull request #15164 from machine424/quantile
feat: normalize "le" and "quantile" labels values upon ingestion
2024-10-19 21:13:03 +02:00
machine424 cf128a0472
test(cmd/prometheus): speed up test execution by t.Parallel() when possible
turn some loops into subtests to make use of t.Parallel()

requires Go 1.22 to make use of https://go.dev/blog/loopvar-preview

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-10-18 17:52:59 +02:00
machine424 8bcb4d865d
feat: normalize "le" and "quantile" labels values upon ingestion
Signed-off-by: machine424 <ayoubmrini424@gmail.com>

Co-authored-by: beorn7 <beorn@grafana.com>
2024-10-18 17:37:29 +02:00
Ayoub Mrini 98dcd28b1a
Merge pull request #15170 from machine424/awldi
fix(discovery): Handle cache.DeletedFinalStateUnknown in node informers' DeleteFunc
2024-10-18 17:33:08 +02:00
Alex Greenbank 421a3c22ea
scrape: provide a fallback format (#15136)
scrape: Remove implicit fallback to the Prometheus text format

Remove implicit fallback to the Prometheus text format in case of invalid/missing Content-Type and fail the scrape instead. Add ability to specify a `fallback_scrape_protocol` in the scrape config.

---------

Signed-off-by: alexgreenbank <alex.greenbank@grafana.com>
Signed-off-by: Alex Greenbank <alex.greenbank@grafana.com>
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
2024-10-18 17:12:31 +02:00
Ayoub Mrini 5505c83a4d
Merge pull request #15167 from machine424/impor
feat: ProtobufParse.formatOpenMetricsFloat: improve float formatting by using strconv.AppendFloat instead of fmt.Sprint
2024-10-18 15:35:21 +02:00
Alan Protasio c78d5b94af
Disallowing configure AM with the v1 api (#13883)
* Stop supporting Alertmanager v1

* Disallowing configure AM with the v1 api

Signed-off-by: alanprot <alanprot@gmail.com>

* Update config/config_test.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update config/config.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Addressing comments

Signed-off-by: alanprot <alanprot@gmail.com>

* Update notifier/notifier.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update config/config_test.go

Co-authored-by: Jan Fajerski <jan--f@users.noreply.github.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Co-authored-by: Jan Fajerski <jan--f@users.noreply.github.com>
2024-10-18 15:23:14 +02:00
machine424 18b81ad79d
feat: ProtobufParse.formatOpenMetricsFloat: improve float formatting by using strconv.AppendFloat instead of fmt.Sprint
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-10-18 15:19:55 +02:00
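As a rough illustration of the substitution named in 18b81ad79d, the sketch below formats a float with `strconv.AppendFloat` into a small fixed-size buffer instead of going through `fmt.Sprint`. The function `formatFloat` is a hypothetical helper written for this example. It does not reproduce `formatOpenMetricsFloat`, and the `'g'` format with precision `-1` is an assumption chosen because it yields the shortest representation that round-trips.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatFloat is a hypothetical helper: it appends the shortest
// round-trippable representation of f to a fixed-size buffer and returns
// it as a string, bypassing fmt's generic formatting path.
func formatFloat(f float64) string {
	var buf [32]byte
	return string(strconv.AppendFloat(buf[:0], f, 'g', -1, 64))
}

func main() {
	// Print both forms side by side for a few sample values.
	for _, f := range []float64{0.5, 42, 1e6} {
		fmt.Println(formatFloat(f), fmt.Sprint(f))
	}
}
```

When the caller already holds a byte slice, appending directly to it with `strconv.AppendFloat` also avoids the intermediate string.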
Bryan Boreham 754c104a3e
Merge pull request #15173 from prometheus/merge-2.55-into-main-3
Merge release-2.55 into main (interim)
2024-10-18 10:28:20 +01:00
George Krajcsovits 763cbdf35f
Merge pull request #15180 from prometheus/ooo-nh-corrupt-chunk
fix(tsdb): populateWithDelChunkSeriesIterator corrupting chunk meta
2024-10-18 10:49:02 +02:00
György Krajcsovits a4083f14e8 Fix populateWithDelChunkSeriesIterator corrupting chunk meta
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only update when the chunk is completely new.
Otherwise the ongoing chunk's meta will be later than the previously
written samples in it.

Same bug as https://github.com/prometheus/prometheus/pull/14629

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-18 10:34:22 +02:00
György Krajcsovits e6a682f046 Reproduce populateWithDelChunkSeriesIterator corrupting chunk meta
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only update when the chunk is completely
new.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-18 10:34:22 +02:00
Bartlomiej Plotka efc43d0714
s/scrape_classic_histograms/always_scrape_classic_histograms (3.0 breaking change) (#15178)
This is for readability, especially now that there is also an option for converting to NHCB.

See discussion https://cloud-native.slack.com/archives/C077Z4V13AM/p1729155873397889

Signed-off-by: bwplotka <bwplotka@gmail.com>
2024-10-18 08:32:15 +01:00