prometheus

Commit Graph

Author	SHA1	Message	Date
Björn Rabenstein	342f970e05	Merge pull request #2413 from prometheus/beorn7/storage storage: Fix offset returned by dropAndPersistChunks	8 years ago
beorn7	46a0837816	storage: Fix offset returned by dropAndPersistChunks This is another corner-case that was previously never exercised because the rewriting of a series file was never prevented by the shrink ratio. Scenario: There is an existing series on disk, which is archived. If a new sample comes in for that file, a new chunk in memory is created, and the chunkDescsOffset is set to -1. If series maintenance happens before the series has at least one chunk to persist _and_ an insufficient chunks on disk is old enough for purging (so that the shrink ratio kicks in), dropAndPersistChunks would return 0, but it should return the chunk length of the series file.	8 years ago
Björn Rabenstein	0c688ab339	Merge pull request #2412 from prometheus/beorn7/storage storage: One more persist error code path discovered	8 years ago
beorn7	bed4934224	storage: One more persist error code path discovered Also, in that code path, set chunkDescsOffset to 0 rather than -1 in case of "dropped more chunks from persistence than from memory" so that no other weird things happen before the series is quarantined for good.	8 years ago
Björn Rabenstein	eac9696a36	Merge pull request #2410 from prometheus/beorn7/storage storage: writeMemorySeries needs to return true for quarantined series	8 years ago
beorn7	8c8baaa558	storage: writeMemorySeries needs to return true for quarantined series This is another fallout of my bug hunt.	8 years ago
Björn Rabenstein	c4686f7915	Merge pull request #2403 from prometheus/beorn7/release Cut release 1.5.1	8 years ago
beorn7	eb6b95ac2e	Cut release 1.5.1 Sadly, this is urgently required.	8 years ago
Björn Rabenstein	3e133a9312	Merge pull request #2400 from prometheus/beorn7/storage2 storage: Fix checkpointing of fully persisted memory series.	8 years ago
beorn7	2363a90adc	storage: Do not throw away fully persisted memory series in checkpointing	8 years ago
beorn7	244a65fb29	storage: Increase persist watermark before calling append The append call may reuse cds, and thus change its len. (In practice, this wouldn't happen as cds should have len==cap. Still, the previous order of lines was problematic.)	8 years ago
beorn7	75282b27ba	storage: Added checks for invariants	8 years ago
beorn7	31e9db7f0c	storage: Simplify evictChunkDesc method	8 years ago
beorn7	65dc8f44d3	storage: Test for errors returned by MaybePopulateLastTime	8 years ago
beorn7	752fac60ae	storage: Remove race condition from TestLoop	8 years ago
Brian Brazil	34767c2221	Clone lset before relabelling. (#2386 ) We need to not change the lset passed into populateLabels, as that is kept around by the SDs. Fixes 2377	8 years ago
Björn Rabenstein	7db4447390	Merge pull request #2385 from prometheus/beorn7/storage Fix embarrassing bug of not setting the shrink ratio	8 years ago
beorn7	4ccfc93dcf	storage: Set shrink ratio in the constructor.	8 years ago
beorn7	b2f086c6c4	storage: Expose bug of not setting the shrink ratio in the contstructor	8 years ago
Frederic Branczyk	d840f2c400	Merge pull request #2359 from brancz/cut-1.5.0 *: cut 1.5.0	8 years ago
Frederic Branczyk	fb17493f66	*: cut 1.5.0	8 years ago
Björn Rabenstein	9688a312ed	Merge pull request #2355 from prometheus/beorn7/lint Remove auto-generated protobuf code from codeclimate	8 years ago
beorn7	4392aa43d4	Remove auto-generated protobuf code from codeclimate	8 years ago
Björn Rabenstein	d717175104	Merge pull request #2354 from prometheus/beorn7/lint Documentation: Add Code Climate badges to README.md	8 years ago
beorn7	0c8b753f6e	Documentation: Add Code Climate badges to README.md	8 years ago
Scott Larkin	e5a75b2b30	Code Climate config (#2351 ) Created a Code Climate config with gofmt, golint, and govet enabled	8 years ago
Alex Somesan	b22eb65d0f	Cleaner separation between ServiceAccount and custom authentication in K8S SD (#2348 ) * Canonical usage of cluster service-account in K8S SD * Early validation for opt-in custom auth in K8S SD * Fix typo in condition	8 years ago
Fabian Reinartz	7eb849e6a8	Merge pull request #2307 from joyent/triton_discovery Add Joyent Triton discovery	8 years ago
Richard Kiene	f3d9692d09	Add Joyent Triton discovery	8 years ago
Brian Brazil	c1b547a90e	Only checkpoint chunkdescs and series that need persisting. (#2340 ) This decreases checkpoint size by not checkpointing things that don't actually need checkpointing. This is fully compatible with the v2 checkpoint format, as it makes series appear as though the only chunksdescs in memory are those that need persisting.	8 years ago
Fabian Reinartz	5418a42965	Merge pull request #2345 from Bplotka/fixed-alertmanager-flag-auth Fixed regression in `-alertmanager.url flag`. Basic auth was ignored.	8 years ago
Bartek Plotka	579e33f19a	Fixed style issues.	8 years ago
Bartek Plotka	d7febe97fa	Fixed regression in -alertmanager.url flag. Basic auth was ignored. - Included basic auth parsing while parsing to AlertmanagerConfig - Added test case Signed-off-by: Bartek Plotka <bwplotka@gmail.com>	8 years ago
Fabian Reinartz	990e40c959	Merge pull request #2338 from brancz/alertmanager-api web/api: add alertmanager api	8 years ago
Frederic Branczyk	bd92571bdd	web/api: make target and alertmanager api responses consistent	8 years ago
Fabian Reinartz	022714b60a	Merge pull request #2341 from mattbostock/patch-1 Correct notifications_dropped description	8 years ago
Matt Bostock	4160892109	Correct notifications_dropped description The current description does not accurately describe when the metric is incremented. Aside from Alertmanger missing from the configuration, `prometheus_notifications_dropped_total` is incremented when errors occur while sending alert notifications to Alertmanager, or because the notifications queue is full, or because the number of notifications to be sent exceeds the queue capacity. I think calling these cases 'errors' in a generic sense is more useful than the current description.	8 years ago
Brian Brazil	f64c231dad	Allow checkpoints and maintenance to happen concurrently. (#2321 ) This is essential on larger Prometheus servers, as otherwise checkpoints prevent sufficient persisting of chunks to disk.	8 years ago
Frederic Branczyk	389c6d0043	web/api: add alertmanager api	8 years ago
Brian Brazil	1dcb7637f5	Add various persistence related metrics (#2333 ) Add metrics around checkpointing and persistence * Add a metric to say if checkpointing is happening, and another to track total checkpoint time and count. This breaks the existing prometheus_local_storage_checkpoint_duration_seconds by renaming it to prometheus_local_storage_checkpoint_last_duration_seconds as the former name is more appropriate for a summary. * Add metric for last checkpoint size. * Add metric for series/chunks processed by checkpoints. For long checkpoints it'd be useful to see how they're progressing. * Add metric for dirty series * Add metric for number of chunks persisted per series. You can get the number of chunks from chunk_ops, but not the matching number of series. This helps determine the size of the writes being made. * Add metric for chunks queued for persistence Chunks created includes both chunks that'll need persistence and chunks read in for queries. This only includes chunks created for persistence. * Code review comments on new persistence metrics.	8 years ago
Björn Rabenstein	6ce97837ab	Merge pull request #2327 from prometheus/beorn7/vendoring vendoring: Update prometheus/common to pull in bug fixes	8 years ago
beorn7	86ec87b78f	vendoring: Update prometheus/common to pull in bug fixes In particular the one for https://github.com/prometheus/common/issues/72.	8 years ago
Fabian Reinartz	3302bb1eb1	Merge pull request #2323 from prometheus/beorn7/retrieval Retrieval: Avoid copying Target	8 years ago
Björn Rabenstein	ad40d0abbc	Merge pull request #2288 from prometheus/limit-scrape Add ability to limit scrape samples, and related metrics	8 years ago
beorn7	5dc01202d7	Retrieval: Remove some test lines that fail on Travis only These lines exercise an append in TestScrapeLoopWrapSampleAppender. Arguably, append shouldn't be tested there in the first place. Still it's weird why this fails on Travis: ``` --- FAIL: TestScrapeLoopWrapSampleAppender (0.00s) scrape_test.go:259: Expected count of 1, got 0 scrape_test.go:290: Expected count of 1, got 0 2017/01/07 22:48:26 http: TLS handshake error from 127.0.0.1:50716: read tcp 127.0.0.1:40265->127.0.0.1:50716: read: connection reset by peer FAIL FAIL github.com/prometheus/prometheus/retrieval 3.603s ``` Should anybody ever find out why, please revert this commit accordingly.	8 years ago
beorn7	3610331eeb	Retrieval: Do not buffer the samples if no sample limit configured Also, simplify and streamline the code a bit.	8 years ago
André Carvalho	c43dfaba1c	Add max concurrent and current queries engine metrics (#2326 ) * Add max concurrent and current queries engine metrics This commit adds two metrics to the promql/engine: the number of max concurrent queries, as configured by the flag, and the number of current queries being served+blocked in the engine.	8 years ago
beorn7	767c0709b1	Retrieval: Avoid copying Target retreival.Target contains a mutex. It was copied in the Targets() call. This potentially can wreak a lot of havoc. It might even have caused the issues reported as #2266 and #2262 .	8 years ago
Brian Brazil	f9e581907a	Make index queue bigger. (#2322 ) When a large Prometheus starts up fresh it can take many minutes to warmup and clear out the index queue. A larger queue means less blocking, bigger batches and cuts down startup time by ~50%.	8 years ago
Fabian Reinartz	c9f4aea8e2	Merge pull request #2305 from alicebob/favicon Add a favicon to the web GUI	8 years ago

3626 Commits (342f970e05688c4885b97bbcaa975aa9631c1d2b)