prometheus

Commit Graph

Author	SHA1	Message	Date
Brian Brazil	a8c22c85cc	Correctly handle pruning wraparound after ring expansion (#3942) Fixes #3939	7 years ago
Tom Wilkie	f8c9d375b6	Correctly stop the timer used in the remote write path.	7 years ago
ferhat elmas	ffa673f7d8	General simplifications (#3887) Another try as in #1516	7 years ago
Fabian Reinartz	7ccd4b39b8	*: implement query params This adds a parameter to the storage selection interface which allows query engine(s) to pass information about the operations surrounding a data selection. This can for example be used by remote storage backends to infer the correct downsampling aggregates that need to be provided.	7 years ago
Tom Wilkie	a730083cbf	Merge pull request #3731 from bboreham/reuse-timer Re-use timer in remote storage queue	7 years ago
Krasi Georgiev	b75428ec19	rename package retrieve to scrape no functional changes just renaming retrieval to scrape	7 years ago
Tom Wilkie	3dc5b8eef5	Use sub benchmarks.	7 years ago
Tom Wilkie	da29c09dca	Some benchmarks for the mergeSeries set.	7 years ago
Tom Wilkie	749781edf3	Also, don't make a mergeSeriesSet if there is only one SeriesSet.	7 years ago
Tom Wilkie	48e39068bd	Don't allocate a mergeSeries if there is only one series to merge.	7 years ago
Bryan Boreham	8a4535e6ad	Re-use timer instead of creating new ones on every sample The docs for `time.After()` note that "The underlying Timer is not recovered by the garbage collector until the timer fires".	7 years ago
Tom Wilkie	f2c5399e39	Merge pull request #3561 from twiedenbein/master fixed bug with initialization of queueconfig	7 years ago
Shubheksha Jalan	0471e64ad1	Use shared types from the `common` repo (#3674) * refactor: use shared types from common repo, remove util/config * vendor: add common/config * fix nit	7 years ago
Shubheksha Jalan	ec94df49d4	Refactor SD configuration to remove `config` dependency (#3629) * refactor: move targetGroup struct and CheckOverflow() to their own package * refactor: move auth and security related structs to a utility package, fix import error in utility package * refactor: Azure SD, remove SD struct from config * refactor: DNS SD, remove SD struct from config into dns package * refactor: ec2 SD, move SD struct from config into the ec2 package * refactor: file SD, move SD struct from config to file discovery package * refactor: gce, move SD struct from config to gce discovery package * refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil * refactor: consul, move SD struct from config into consul discovery package * refactor: marathon, move SD struct from config into marathon discovery package * refactor: triton, move SD struct from config to triton discovery package, fix test * refactor: zookeeper, move SD structs from config to zookeeper discovery package * refactor: openstack, remove SD struct from config, move into openstack discovery package * refactor: kubernetes, move SD struct from config into kubernetes discovery package * refactor: notifier, use targetgroup package instead of config * refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup * refactor: retrieval, use targetgroup package instead of config.TargetGroup * refactor: storage, use config util package * refactor: discovery manager, use targetgroup package instead of config.TargetGroup * refactor: use HTTPClient and TLS config from configUtil instead of config * refactor: tests, use targetgroup package instead of config.TargetGroup * refactor: fix targetgroup.Group pointers that were removed by mistake * refactor: openstack, kubernetes: drop prefixes * refactor: remove import aliases forced due to vscode bug * refactor: move main SD struct out of config into discovery/config * refactor: rename configUtil to config_util * refactor: rename yamlUtil to yaml_config * refactor: kubernetes, remove prefixes * refactor: move the TargetGroup package to discovery/ * refactor: fix order of imports	7 years ago
Ed Schouten	bb724f1bef	Deprecate DeduplicateSeriesSet() in favor of NewMergeSeriesSet(). Federation makes use of dedupedSeriesSet to merge SeriesSets for every query into one output stream. If many match[] arguments are provided, many dedupedSeriesSet objects will get chained. This has the downside of causing a potential O(nk) running time, where n is the number of series and k the number of match[] arguments. In the mean time, the storage package provides a mergeSeriesSet that accomplishes the same with an O(nlog(k)) running time by making use of a binary heap. Let's just get rid of dedupedSeriesSet and change all existing callers to use mergeSeriesSet.	7 years ago
Tom Wiedenbein	937ac8c060	fixed bug with initialization of queueconfig QueueConfigs would only ever initialize to the default settings, and would not pick up their respective values from YAML.	7 years ago
Fabian Reinartz	83cd270ea4	*: adapt to storage interface changes	7 years ago
Tobias Schmidt	7098c56474	Add remote read filter option For special remote read endpoints which have only data for specific queries, it is desired to limit the number of queries sent to the configured remote read endpoint to reduce latency and performance overhead.	7 years ago
Tobias Schmidt	434f0374f7	Refactor remote storage querier handling * Decouple remote client from ReadRecent feature. * Separate remote read filter into a small, testable function. * Use storage.Queryable interface to compose independent functionalities.	7 years ago
Tobias Schmidt	9b0091d487	Add storage.Queryable and storage.QueryableFunc In order to compose different querier implementations more easily, this change introduces a separate storage.Queryable interface grouping the query (Querier) function of the storage. Furthermore, it adds a QueryableFunc type to ease writing very simple queryable implementations.	7 years ago
Julius Volz	9f10c63cff	Fix remote read labelset corruption (#3456) The labelsets returned from remote read are mutated in higher levels (like seriesFilter.Labels()) and since the concreteSeriesSet didn't return a copy, the external mutation affected the labelset in the concreteSeries itself. This resulted in bizarre bugs where local and remote series would show with identical label sets in the UI, but not be deduplicated, since internally, a series might come to look like: {__name__="node_load5", instance="192.168.1.202:12090", job="node_exporter", node="odroid", node="odroid"} (note the repetition of the last label)	7 years ago
Krasi Georgiev	5d8f93a22a	now using only github.com/gogo/protobuf bumped all grpc-gateway packages to v1.2.2 updated and run the denproto.sh script	7 years ago
Fabian Reinartz	30e777d10d	tsdb: default too small max block duration	7 years ago
Tom Wilkie	48a7a00a38	Fast path the merge querier (#3358) * Fast path the merge querier such that it is completely removed from query path when there is no remote storage. * Add NoopQuerier * Add copyright notice. * Avoid global, use a function.	7 years ago
Tom Wilkie	0e572686db	Revert "Bypass the fanout storage merging if no remote storage is configured."	7 years ago
Tom Wilkie	1af3ef431d	s/TestRemoveLabels/TestSeriesSetFilter/	7 years ago
Tom Wilkie	9c3c98e8de	Revert "Port 'Don't disable HTTP keep-alives for remote storage connections.' to 2.0 (see #3173)" This reverts commit `0997191b18`.	7 years ago
Tom Wilkie	746752b946	Merge external labels in order.	7 years ago
Tom Wilkie	6e4d4ea402	Initialise some counters in remote storage API.	7 years ago
Tom Wilkie	2ae04d0e79	Add license header.	7 years ago
Tom Wilkie	e8c264e47a	Add comment.	7 years ago
Tom Wilkie	ee011d906d	Port remote read server to 2.0.	7 years ago
Bryan Boreham	0997191b18	Port 'Don't disable HTTP keep-alives for remote storage connections.' to 2.0 (see #3173) Removes configurability introduced in #3160 in favour of hard-coding, per advice from @brian-brazil.	7 years ago
Tom Wilkie	56820726fa	Move a couple of the encoding/decoding functions into codec.go	7 years ago
Conor Broderick	08b7328669	Port Metric name validation to 2.0 (see #2975)	7 years ago
Tom Wilkie	8fe0212ff7	Port 'Make queue manager configurable.' to 2.0, see #2991	7 years ago
Tom Wilkie	3760f56c0c	remote: Expose ClientConfig type (see #3165 )	7 years ago
Tom Wilkie	16f71a7723	Port codec.go over from 1.8 branch.	7 years ago
Fabian Reinartz	e53040e2ac	Merge pull request #3339 from tomwilkie/3065-remote-read-bypass Bypass the fanout storage merging if no remote storage is configured.	7 years ago
Fabian Reinartz	bf56ad4233	Merge branch 'master' into master	7 years ago
Paul Gier	c4c3205d76	storage/tsdb: check that max block duration is larger than min If the user accidentally sets the max block duration smaller than the min, the current error is not informative. This change just performs the check earlier and improves the error message.	7 years ago
Fabian Reinartz	ce63a5a855	Merge pull request #3352 from prometheus/rc2 Cut v2.0.0-rc.2	7 years ago
Thibault Chataigner	fc4406201e	Tsdb StartTime: Use a simpler way to compute StartTime	7 years ago
Julius Volz	099df0c5f0	Migrate "golang.org/x/net/context" -> "context" (#3333) In some places, where ctxhttp or gRPC are concerned, we still need to use the old contexts.	7 years ago
Tom Wilkie	4bbef0ec30	Bypass the fanout storage merging if no remote storage is configured.	7 years ago
Fabian Reinartz	a57ea79660	Close index reader properly	7 years ago
Julius Volz	c3d6abc8e6	Fix some lint errors (#3334) I left the promql ones and some others untouched as I remember that @fabxc prefers them that way.	7 years ago
Julius Volz	2846d62573	Fix staticcheck issue in test (#3331) staticcheck fails with: storage/remote/read_test.go:199:27: do not pass a nil Context, even if a function permits it; pass context.TODO if you are unsure about which Context to use (SA1012)	7 years ago
Brian Brazil	4a50f547c8	removeLabels needs a pointer to work. (#3326)	7 years ago
Thibault Chataigner	bf4a279a91	Remote storage reads based on oldest timestamp in primary storage (#3129) Currently all read queries are simply pushed to remote read clients. This is fine, except for remote storage for which it is inefficient and makes queries slower even if remote read is unnecessary. So we need instead to compare the oldest timestamp in primary/local storage with the query range lower boundary. If the oldest timestamp is older than the mint parameter, then there is no need for remote read. This is an optional behavior per remote read client. Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>	7 years ago
Julius Volz	9ef8518b37	Remove "package remote" garbage from license headers (#3304)	7 years ago
Tobias Schmidt	721050c6cb	Update prometheus/tsdb dependency	7 years ago
Julius Volz	33c1171b9c	Don't add anchoring to exported `Value` matcher field Instead, just make the anchoring part of the internal regex. This helps because some users will want to read back the `Value` field and expect it to be the same as the input value (e.g. some tests in Cortex), or use the value in another context which is already expected to add its own anchoring, leading to superfluous double anchoring (such as when we translate matchers into remote read request matchers).	7 years ago
Brian Brazil	73dc96e7f5	Fix leak of ticker in remote storage queue manager.	7 years ago
Brian Brazil	ee88f0d222	Ensure all values are used or _	7 years ago
Brian Brazil	37ec2d5283	Fix off by one error in concreteSeriesSet (#3262)	7 years ago
Marc Sluiter	6a633eece1	Added go-conntrack for monitoring http connections (#3241) Added metrics for in- and outgoing traffic with go-conntrack.	7 years ago
Julius Volz	f7e8348a88	Re-add contexts to storage.Storage.Querier() (#3230) * Re-add contexts to storage.Storage.Querier() These are needed when replacing the storage by a multi-tenant implementation where the tenant is stored in the context. The 1.x query interfaces already had contexts, but they got lost in 2.x. * Convert promql.Engine to use native contexts	7 years ago
Fabian Reinartz	7b02bfee0a	web: start web handler while TSDB is starting up	7 years ago
Fabian Reinartz	d21f149745	*: migrate to go-kit/log	7 years ago
Fabian Reinartz	0efecea6d4	Adapt storage APIs to uint64 references	7 years ago
Fabian Reinartz	0c81d5f719	storage: instantiate correct block ranges	7 years ago
Fabian Reinartz	2037778d14	vendor: update TSDB	7 years ago
Tom Wilkie	b11bc8ae24	Fix some comments.	7 years ago
Tom Wilkie	ec999ff397	Prevent number of remote write shards from going negative. This can happen in the situation where the system scales up the number of shards massively (to deal with some backlog), then scales it down again as the number of samples sent during the time period is less than the number received.	7 years ago
Tom Wilkie	a09acdcc5b	Make concreteSeriersIterator behave.	7 years ago
Tom Wilkie	994a7f27d6	Propagate errors through mergeSeriesSet correctly.	7 years ago
Tom Wilkie	2e0d8487e3	Return zeros if At() is called after Next() returns false.	7 years ago
Tom Wilkie	014bd31a86	Remove unnecessary whitespace changes, add comment.	7 years ago
Tom Wilkie	98ac07f86a	Add unit test for the merging on the read path.	7 years ago
Tom Wilkie	b568ace7ce	Move protos to ./prompb	8 years ago
Tom Wilkie	96e25adc8d	Introduce 'primary' storage in fanout, and have Add return the ref from the primary. Also, ensure all append batches are rolled back when a commit or rollback fails.	8 years ago
Tom Wilkie	db8128ceeb	Add label set as first parameter to AddFast, ignored by TSDB adapter.	8 years ago
Tom Wilkie	2dda5775e3	Initial port of remote storage to v2.	8 years ago
Fabian Reinartz	16464c3a33	Merge pull request #2910 from prometheus/adminapi Admin API	8 years ago
Fabian Reinartz	ccf9e62972	*: add admin grpc API	8 years ago
Goutham Veeramachaneni	243419c007	Return tsdb.ErrOutOfBounds as storage.ErrOutOfBounds Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>	8 years ago
Goutham Veeramachaneni	3069bd3996	Handle scrapes with OutOfBounds metrics better fixes #2894 Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>	8 years ago
Goutham Veeramachaneni	d407bd150c	Consolidate the duration params in CLI * All CLI params moved to model.Duration Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>	8 years ago
Goutham Veeramachaneni	baf5b0f0fc	Fix error where we look into the future. (#2829) * Fix error where we look into the future. So currently we are adding values that are in the future for an older timestamp. For example, if we have [(1, 1), (150, 2)] we will end up showing [(1, 1), (2,2)]. Further it is not advisable to call .At() after Next() returns false. Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in> * Return early if done Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in> * Handle Seek() where we reach the end of iterator Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in> * Simplify code Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>	8 years ago
Brian Brazil	c02c25d5ba	Allow peeking back further in buffer.	8 years ago
Fabian Reinartz	d289dc55c3	storage: update TSDB	8 years ago
Fabian Reinartz	9b175d48cb	Add flag to disable TSDB lock file	8 years ago
Fabian Reinartz	0f3110487d	Merge remote-tracking branch 'origin/dev-2.0' into dev-2.0	8 years ago
Fabian Reinartz	37deb21c45	vendor: remove unused dependency and last ref to fabxc/tsdb	8 years ago
Brian Brazil	5c9a6ce747	Add license to files. This should fix CI for dev-2.0.	8 years ago
Fabian Reinartz	8ffc851147	Merge branch 'master' into dev-2.0	8 years ago
Fabian Reinartz	cfb2a7f1d5	vendor: sync organisation migration of tsdb	8 years ago
Fabian Reinartz	bbcf20ba01	web: deduplicate series in federation	8 years ago
Fabian Reinartz	4e41987bcb	storage: add deduplication function This adds a function to deduplicate two series sets given that duplicate series have equivalent data points.	8 years ago
Björn Rabenstein	50e4f49b7e	Merge pull request #2561 from prometheus/beorn7/storage2 storage: Evict unused chunk.Descs in crash recovery	8 years ago
beorn7	08fc6cbd39	storage: Evict unused chunk.Descs in crash recovery This is in line with the v1.5 change in paradigm to not keep chunk.Descs without chunks around after a series maintenance. It's mainly motivated by avoiding excessive amounts of RAM usage during crash recovery. The code avoids to create memory time series with zero chunk.Descs as that is prone to trigger weird effects. (Series maintenance would archive series with zero chunk.Descs, but we cannot do that here because the archive indices still have to be checked.)	8 years ago
Björn Rabenstein	1c6240fc40	Merge pull request #2559 from prometheus/beorn7/storage storage: Replace fpIter by sortedFPs	8 years ago
beorn7	d284ffab03	storage: Replace fpIter by sortedFPs The fpIter was kind of cumbersome to use and required a lock for each iteration (which wasn't even needed for the iteration at startup after loading the checkpoint). The new implementation here has an obvious penalty in memory, but it's only 8 byte per series, so 80MiB for a beefy server with 10M memory time series (which would probably need ~100GiB RAM, so the memory penalty is only 0.1% of the total memory need). The big advantage is that now series maintenance happens in order, which leads to the time between two maintenances of the same series being less random. Ideally, after each maintenance, the next maintenance would tackle the series with the largest number of non-persisted chunks. That would be quite an effort to find out or track, but with the approach here, the next maintenance will tackle the series whose previous maintenance is longest ago, which is a good approximation. While this commit won't change the _average_ number of chunks persisted per maintenance, it will reduce the mean time a given chunk has to wait for its persistence and thus reduce the steady-state number of chunks waiting for persistence. Also, the map iteration in Go is non-deterministic but not truly random. In practice, the iteration appears to be somewhat "bucketed". You can often observe a bunch of series with similar duration since their last maintenance, i.e. you see batches of series with similar number of chunks persisted per maintenance. If that batch is relatively young, a whole lot of series are maintained with very few chunks to persist. (See screenshot in PR for a better explanation.)	8 years ago
Tobias Schmidt	eac36d123e	Fix unstable fanin test (#2558)	8 years ago
Julius Volz	5a896033e3	Add remote read external label handling (#2555) * Add remote read external label handling This implements rule 1 and 2 from https://docs.google.com/document/d/188YauRgfF0J4CYMigLsVNN34V_kUwKnApBs2dQMfBbs/edit * Use more descriptive example labels in read test * Add comment for querier.addExternalLabels() * Make argument naming in removeLabels() more generic	8 years ago
Björn Rabenstein	e63d079b59	Merge pull request #2527 from prometheus/beorn7/storage storage: Evict chunks and calculate persistence pressure...	8 years ago
Julius Volz	b5b0e00923	Merge pull request #2499 from prometheus/remote-read Remote Read	8 years ago
beorn7	434ab2a6a3	storage: Evict chunks and calculate persistence pressure based on target heap size This is a fairly easy attempt to dynamically evict chunks based on the heap size. A target heap size has to be set as a command line flag, so that users can essentially say "utilize 4GiB of RAM, and please don't OOM". The -storage.local.max-chunks-to-persist and -storage.local.memory-chunks flags are deprecated by this change. Backwards compatibility is provided by ignoring -storage.local.max-chunks-to-persist and using -storage.local.memory-chunks to set the new -storage.local.target-heap-size to a reasonable (and conservative) value (both with a warning). This also makes the metrics instrumentation more consistent (in naming and implementation) and cleans up a few quirks in the tests. Answers to anticipated comments: There is a chance that Go 1.9 will allow programs better control over the Go memory management. I don't expect those changes to be in contradiction with the approach here, but I do expect them to complement them and allow them to be more precise and controlled. In any case, once those Go changes are available, this code has to be revisited. One might be tempted to let the user specify an estimated value for the RSS usage, and then internally set a target heap size of a certain fraction of that. (In my experience, 2/3 is a fairly safe bet.) However, investigations have shown that RSS size and its relation to the heap size is really really complicated. It depends on so many factors that I wouldn't even start listing them in a commit description. It depends on many circumstances and not least on the risk trade-off of each individual user between RAM utilization and probability of OOMing during a RAM usage peak. To not add even more to the confusion, we need to stick to the well-defined number we also use in the targeting here, the sum of the sizes of heap objects.	8 years ago
beorn7	96a303b348	storage: Use staleness delta as head chunk timeout Currently, if a series ceases to exist, its head chunk will be kept open for an hour. That prevents it from being persisted. Which prevents it from being evicted. Which prevents the series from being archived. Most of the time, once no sample has been added to a series within the staleness limit, we can be pretty confident that this series will not receive samples anymore. The whole chain as described above can be started after 5m instead of 1h. In the relaxed case, this doesn't change a lot as the head chunk timeout is only checked during series maintenance, and usually, a series is only maintained every six hours. However, there is the typical scenario where a large service is deployed, the deploy turns out to be bad, and then it is deployed again within minutes, and quite quickly the number of time series has tripled. That's the point where the Prometheus server is stressed and switches (rightfully) into rushed mode. In that mode, time series are processed as quickly as possible, but all of that is in vain if all of those recently ended time series cannot be persisted yet for another hour. In that scenario, this change will help most, and it's exactly the scenario where help is most desperately needed.	8 years ago
Julius Volz	3f23aa2cc7	Add headers to indicate remote read/write version Also add Content-Type header.	8 years ago
Julius Volz	8fda83ea12	Make rules only read local data	8 years ago
Julius Volz	94acd3f1d8	Add fanin tests and fix uncovered bugs	8 years ago
Julius Volz	9b33cfc457	Fix/unify context-based remote storage timeouts	8 years ago
Julius Volz	815762a4ad	Move retrieval.NewHTTPClient -> httputil.NewClientFromConfig	8 years ago
Fabian Reinartz	397f001ac5	Merge branch 'master' into dev-2.0	8 years ago
Julius Volz	eb14678a25	Make remote read/write use config.HTTPClientConfig	8 years ago
Julius Volz	406b65d0dc	Rename remote.Storage to remote.Writer	8 years ago
Julius Volz	02395a224d	[WIP] Remote Read	8 years ago
Julius Volz	40e41a4776	Merge pull request #2494 from tomwilkie/remote-write-sharding Dynamically reshard the QueueManager based on observed load.	8 years ago
Fabian Reinartz	b586781283	*: update tsdb vendoring and add retention flag	8 years ago
beorn7	48d221c11e	storage: Fix typo in comment	8 years ago
Fabian Reinartz	0ecd205794	promql: Use buffer pool for matrix allocations	8 years ago
Tom Wilkie	75bb0f3253	Review feedback	8 years ago
Tom Wilkie	77cce900b8	Fix tests	8 years ago
Tom Wilkie	b48799a01e	Add license stanza	8 years ago
Tom Wilkie	9d22f030cf	Dynamically reshard the QueueManager based on observed load.	8 years ago
Fabian Reinartz	8a8eb12985	storage/tsdb: don't use partitioned DB.	8 years ago
Fabian Reinartz	9eb1d6c927	remote: take code from master	8 years ago
Fabian Reinartz	9304179ef7	Merge branch 'master' into dev-2.0	8 years ago
Fabian Reinartz	4397b4d508	*: pass Prometheus registry into storage	8 years ago
Tom Wilkie	1ab893c6ec	Limit 'discarding sample' logs to 1 every 10s (#2446) * Limit 'discarding sample' logs to 1 every 10s * Include the vendored library * Review feedback	8 years ago
Julius Volz	2f39dbc8b3	Rename StorageQueueManager -> QueueManager	8 years ago
Julius Volz	e9476b35d5	Re-add multiple remote writers Each remote write endpoint gets its own set of relabeling rules. This is based on the (yet-to-be-merged) https://github.com/prometheus/prometheus/pull/2419, which removes legacy remote write implementations.	8 years ago
Björn Rabenstein	089dc1076b	Merge pull request #2435 from jmeulemans/open-chunks-gauge Adding gauge for number of open head chunks.	8 years ago
Jeremy Meulemans	025c828976	Changed to open_head_chunks to address review. Now incrementing numHeadChunks directly.	8 years ago
Jeremy Meulemans	074050b8c0	Updating for failed codeclimate check.	8 years ago
Jeremy Meulemans	f70b52d0b6	Adding gauge for number of open head chunks. Fixes #1710	8 years ago
Julius Volz	beb3c4b389	Remove legacy remote storage implementations This removes legacy support for specific remote storage systems in favor of only offering the generic remote write protocol. An example bridge application that translates from the generic protocol to each of those legacy backends is still provided at: documentation/examples/remote_storage/remote_storage_bridge See also https://github.com/prometheus/prometheus/issues/10 The next step in the plan is to re-add support for multiple remote storages.	8 years ago
beorn7	d771185a43	storage: Fix chunkIndexToStartSeek calculation With a high enough shrink ratio and enough chunks to persist, the cutoff point could be _outside_ of the file, which wreaks havoc in the storage.	8 years ago
beorn7	73bd5e4dff	Merge branch 'beorn7/storage' into beorn7/storage3	8 years ago
beorn7	46a0837816	storage: Fix offset returned by dropAndPersistChunks This is another corner-case that was previously never exercised because the rewriting of a series file was never prevented by the shrink ratio. Scenario: There is an existing series on disk, which is archived. If a new sample comes in for that file, a new chunk in memory is created, and the chunkDescsOffset is set to -1. If series maintenance happens before the series has at least one chunk to persist _and_ an insufficient number of chunks on disk is old enough for purging (so that the shrink ratio kicks in), dropAndPersistChunks would return 0, but it should return the chunk length of the series file.	8 years ago
beorn7	9d12204da5	Merge branch 'release-1.5'	8 years ago
beorn7	bed4934224	storage: One more persist error code path discovered Also, in that code path, set chunkDescsOffset to 0 rather than -1 in case of "dropped more chunks from persistence than from memory" so that no other weird things happen before the series is quarantined for good.	8 years ago
beorn7	242d8edcb5	Merge branch 'release-1.5'	8 years ago
beorn7	8c8baaa558	storage: writeMemorySeries needs to return true for quarantined series This is another fallout of my bug hunt.	8 years ago
Mitsuhiro Tanda	be8b1eb656	storage: optimize dropping chunks by using minShrinkRatio (#2397) storage: prevent unnecessary chunk header reading if minShrinkRatio > 0	8 years ago
beorn7	2363a90adc	storage: Do not throw away fully persisted memory series in checkpointing	8 years ago
Fabian Reinartz	ea3ba338dd	main: add flags for new storage	8 years ago
beorn7	244a65fb29	storage: Increase persist watermark before calling append The append call may reuse cds, and thus change its len. (In practice, this wouldn't happen as cds should have len==cap. Still, the previous order of lines was problematic.)	8 years ago
beorn7	75282b27ba	storage: Added checks for invariants	8 years ago
beorn7	31e9db7f0c	storage: Simplify evictChunkDesc method	8 years ago
Fabian Reinartz	5772f1a7ba	retrieval/storage: adapt to new interface This simplifies the interface to two add methods for appends with labels or faster reference numbers.	8 years ago
beorn7	65dc8f44d3	storage: Test for errors returned by MaybePopulateLastTime	8 years ago
beorn7	752fac60ae	storage: Remove race condition from TestLoop	8 years ago
beorn7	4ccfc93dcf	storage: Set shrink ratio in the constructor.	8 years ago
beorn7	b2f086c6c4	storage: Expose bug of not setting the shrink ratio in the constructor	8 years ago
Brian Brazil	c1b547a90e	Only checkpoint chunkdescs and series that need persisting. (#2340) This decreases checkpoint size by not checkpointing things that don't actually need checkpointing. This is fully compatible with the v2 checkpoint format, as it makes series appear as though the only chunksdescs in memory are those that need persisting.	8 years ago
Fabian Reinartz	c691895a0f	retrieval: cache series references, use pkg/textparse With this change the scraping caches series references and only allocates label sets if it has to retrieve a new reference. pkg/textparse is used to do the conditional parsing and reduce allocations from 900B/sample to 0 in the general case.	8 years ago
Brian Brazil	f64c231dad	Allow checkpoints and maintenance to happen concurrently. (#2321) This is essential on larger Prometheus servers, as otherwise checkpoints prevent sufficient persisting of chunks to disk.	8 years ago
Fabian Reinartz	ad9bc62e4c	storage: extend appender and adapt it	8 years ago
Brian Brazil	1dcb7637f5	Add various persistence related metrics (#2333) Add metrics around checkpointing and persistence * Add a metric to say if checkpointing is happening, and another to track total checkpoint time and count. This breaks the existing prometheus_local_storage_checkpoint_duration_seconds by renaming it to prometheus_local_storage_checkpoint_last_duration_seconds as the former name is more appropriate for a summary. * Add metric for last checkpoint size. * Add metric for series/chunks processed by checkpoints. For long checkpoints it'd be useful to see how they're progressing. * Add metric for dirty series * Add metric for number of chunks persisted per series. You can get the number of chunks from chunk_ops, but not the matching number of series. This helps determine the size of the writes being made. * Add metric for chunks queued for persistence Chunks created includes both chunks that'll need persistence and chunks read in for queries. This only includes chunks created for persistence. * Code review comments on new persistence metrics.	8 years ago
Fabian Reinartz	304cae9928	tsdb: Use PartitionedDB constructor	8 years ago
Brian Brazil	f9e581907a	Make index queue bigger. (#2322) When a large Prometheus starts up fresh it can take many minutes to warmup and clear out the index queue. A larger queue means less blocking, bigger batches and cuts down startup time by ~50%.	8 years ago
Fabian Reinartz	bc20d93f0a	storage: rename iterator value getters to At()	8 years ago
Fabian Reinartz	7322c46b8e	storage: add mock iterator for test	8 years ago
Fabian Reinartz	f8fc1f5bb2	*: migrate ingestion to new batch Appender	8 years ago
Fabian Reinartz	71fe0c58a8	promql: misc fixes	8 years ago
Mitsuhiro Tanda	7e369b9318	expose max memory chunks metrics (#2303) * expose max memory chunks metrics	8 years ago
Fabian Reinartz	fecf9532b9	*: fix misc compile errors	8 years ago
Fabian Reinartz	622ece6273	*: fix recording tests, migrate matcher types	8 years ago
Fabian Reinartz	0492ddbd4d	*: fully decouple tsdb, add new storage interfaces	8 years ago
Fabian Reinartz	d17b5be48a	storage/metric: remove package	8 years ago
Fabian Reinartz	8b84ee5ee6	storage: remove old storage This removes all old storage files and only keeps interfaces to still allow the code to compile.	8 years ago
Fabian Reinartz	11a731ba82	remote: remove hard-coded remote storages This commit removes the flag-configured remote storage integrations in favor of the generic remote write path.	8 years ago
Brian Brazil	93b70ee4ea	Evict chunk descs of all unloaded chunks during maintenance. (#2297) Keeping these around has two problems: 1) Each desc takes 64 bytes, 10 of them is 640B. This is a lot of overhead on a 1024 byte chunk. 2) It can take well over a week to reach a point where this and thus Prometheus memory usage as a whole enters steady state. This makes RAM estimation very hard for users, and makes it difficult to investigate things like memory fragmentation. Instead we'll wipe them during each memory series maintenance cycle, and if a query pulls them in they'll hang around as cache until the next cycle.	8 years ago
Brian Brazil	1b8a474612	Don't clone the metric if there's no remote writes. The metric clone can't be further optimised, and is a non-trivial memory allocation cost so fast path it if there's no remote writes configured.	8 years ago
Tristan Colgate	30be8e0b8a	ignore dotfiles in data directory	8 years ago
Björn Rabenstein	45570e5972	Merge pull request #2277 from prometheus/beorn7/storage2 storage: Sanity-check number of loaded chunk descs	8 years ago
beorn7	253be23c00	storage: Sanity-check number of loaded chunk descs Two cases: - An unarchived metric must have at least one chunk desc loaded upon unarchival. Otherwise, the file is gone or has size 0, which is an inconsistency (because the series is still indexed in the archive index). Hence, quarantining is triggered. - If loading the chunk descs of a series with a known chunkDescsOffset (i.e. != -1), the number of chunks loaded must be equal to chunkDescsOffset. If not, there is a data corruption. An error is returned, which leads to quarantining. In any case, there is a guard added to not access the 1st element of an empty chunkDescs slice. (That's what triggered the crashes in issue 2249.) A time series with unknown chunkDescsOffset and no chunks in memory and no chunks on disk either could trigger that case. I would assume such a "null series" doesn't exist, but it's not entirely unthinkable and unreasonable to happen (perhaps in future uses of the storage). (Create a series, and then something tries to preload chunks before the first sample is added.)	8 years ago
Björn Rabenstein	5f0c0e43cf	Merge pull request #2276 from prometheus/beorn7/storage storage: Catch data corruption that leads to division by zero	8 years ago
beorn7	837c029b16	storage: Fix linter issue Go style tries to avoid indented `else` blocks.	8 years ago
beorn7	4719482f5f	storage: Make tests go-vet and golint clean	8 years ago
beorn7	485ac8dff7	storage: Verify validity of byte length when unmarshalling (double)delta chunks This makes sure a division-by-zero crash cannot happen in the Len() method. Fixes #2773	8 years ago
tattsun	e714079cf2	storage: fix error message (#2270) * storage: add error message	8 years ago
Christopher M. Luciano	148b006e25	Clarify error message when Prometheus data dir finds unexpected files	8 years ago
Julius Volz	127332c56f	Merge pull request #2168 from tomwilkie/chunk-len Add call to estimate number of samples in a chunk to the API	8 years ago
Tom Wilkie	585878cdb2	Add call to estimate number of samples in a chunk to the API	8 years ago
Björn Rabenstein	036715370f	Merge pull request #2184 from huydx/master Fix possible memory leak by defer inside loop	8 years ago
huydx	c999902761	Fix possible memory leak by defer inside loop	8 years ago
Fabian Reinartz	856de30c09	Check error before defer closing If an error is returned the file might be nil and a Close call would cause a panic.	8 years ago
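The ordering described in the commit above (check the error first, then schedule the close) can be sketched as follows. This is a minimal illustration, not the code from the commit; the function name `readAll` and the example path are assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readAll returns the contents of the file at path.
// readAll is a hypothetical helper used only for this illustration.
func readAll(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		// On error the returned handle may be nil, so no Close is scheduled.
		return nil, err
	}
	// Only defer the Close once we know the handle is valid.
	defer f.Close()
	return io.ReadAll(f)
}

func main() {
	if _, err := readAll("/path/that/does/not/exist"); err != nil {
		fmt.Println("open failed:", err)
	}
}
```

Placing the `defer` after the error check keeps the failure path free of any call on a handle that was never opened.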
Fabian Reinartz	6703404cb4	Merge remote-tracking branch 'origin/release-1.2'	8 years ago
beorn7	c5bd178b93	Protect exported Querier interface method against negative time ranges	8 years ago
beorn7	5b16d6bd6e	Merge branch 'release-1.2'	8 years ago
beorn7	876e5da4f8	Add guard against non-monotonic samples in series This can only happen due to data corruption.	8 years ago
Dominik Schulz	182e17958a	Trivial spelling corrections and a small comment.	8 years ago
Fabian Reinartz	8fa18d564a	storage: enhance Querier interface usage This extracts Querier as an instantiateable and closeable object rather than just defining extending methods of the storage interface. This improves composability and allows abstracting query transactions, which can be useful for transaction-level caches, consistent data views, and encapsulating teardown.	8 years ago
beorn7	719508752b	Re-add counting of evict chunk ops and decrementing NumMemChunks Also, modify test to expose the regression.	8 years ago
Julius Volz	cb02f017ee	Clean up some doc comments	8 years ago
Julius Volz	c212ef0326	Add Chunk.Utilization() methods When using the chunking code in other projects (both Weave Prism and ChronixDB ingester), you sometimes want to know how well you are utilizing your chunks when closing/storing them.	8 years ago
Julius Volz	c7932aa009	Remove gRPC leftovers in protobuf definitions	8 years ago
Björn Rabenstein	1e2f03f668	Merge pull request #2005 from redbaron/microoptimise-matching Microoptimise matching	8 years ago
Maxim Ivanov	e6db9f8159	New fpsForLabelMatchers and seriesForLabelMatchers methods These more specific methods have replaced `metricForLabelMatchers` in cases where its `map[fingerprint]metric` result type was not necessary or was used as an intermediate step. Avoids duplicated calls to `seriesForRange` from `QueryRange` and `QueryInstant` methods.	8 years ago
Brian Brazil	6e8f87a37f	Merge pull request #2047 from prometheus/write-relabel Add support for remote write relabelling.	8 years ago
Brian Brazil	77605649a9	Add support for remote write relabelling. Switch back to a single remote writer, as we were only ever meant to have one and the relabel semantics are clearer that way.	8 years ago
Julius Volz	c9d4526428	Unpublish accidentally published series methods There were some more accidentally published methods of the memorySeries type which I didn't notice when reviewing https://github.com/prometheus/prometheus/pull/2011	8 years ago
Maxim Ivanov	4978a65495	Extract initial FP candidate build logic into candidateFPsForLabelMatchers method No functional changes otherwise	8 years ago
Maxim Ivanov	c048a0cde8	Add metrics to result after checking all matchers Should be marginally faster and somewhat more GC friendly	8 years ago
Maxim Ivanov	bedc0eda1f	Added BenchmarkQueryRange	8 years ago
Julius Volz	c25f0de5ae	Remove local.ZeroSample{,Pair}, use model definitions	8 years ago
Julius Volz	044ebce779	Review fixups.	8 years ago
Julius Volz	d30a3c7c0f	Fix accidental publishing of memorySeries.firstTime()	8 years ago
Julius Volz	ab80ced756	storage: separate chunk package, publish more names This is a followup to https://github.com/prometheus/prometheus/pull/2011. This publishes more of the methods and other names of the chunk code and moves the chunk code to its own package. There's some unavoidable ugliness: the chunk and chunkDesc metrics are used by both packages, so I had to move them to the chunk package. That isn't great, but I don't see how to do it better without a larger redesign of everything. Same for the evict requests and some other types.	8 years ago
Julius Volz	42c05dd3a2	Merge pull request #2011 from mattkanwisher/chuck-public Make Chunk and ChunkIterator public for reuse	8 years ago
beorn7	ca98382943	Avoid `defer` in seriesMap.get This is related to https://github.com/golang/go/issues/14939 . It's probably the only occurrence where it matters.	8 years ago
Matthew Campbell	67d76e3a5d	timeseries: store varbit encoded data into cassandra	8 years ago
Tom Wilkie	4520e12440	Add HTTP Basic Auth & TLS support to the generic write path. (#1957) * Add config, HTTP Basic Auth and TLS support to the generic write path. - Move generic write path configuration to the config file - Factor out config.TLSConfig -> tls.Config translation - Support TLSConfig for generic remote storage - Rename Run to Start, and make it non-blocking. - Dedupe code in httputil for TLS config. - Make remote queue metrics global.	8 years ago
Julius Volz	c187308366	storage: Contextify storage interfaces. This is based on https://github.com/prometheus/prometheus/pull/1997. This adds contexts to the relevant Storage methods and already passes PromQL's new per-query context into the storage's query methods. The immediate motivation is supporting multi-tenancy in Frankenstein, but this could also be used by Prometheus's normal local storage to support cancellations and timeouts at some point.	8 years ago
Maxim Ivanov	bdc53098fc	Avoid having contended mutexes on same cacheline CPUs have to serialise write access to a single cache line, effectively reducing the level of possible parallelism. Placing mutexes on different cache lines avoids this problem. Most gains will be seen on NUMA servers, where CPU interconnect traffic is especially expensive. Before: go test . -run none -bench BenchmarkFingerprintLocker BenchmarkFingerprintLockerParallel-4 2000000 932 ns/op BenchmarkFingerprintLockerSerial-4 30000000 49.6 ns/op After: go test . -run none -bench BenchmarkFingerprintLocker BenchmarkFingerprintLockerParallel-4 3000000 569 ns/op BenchmarkFingerprintLockerSerial-4 30000000 51.0 ns/op	8 years ago
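One way to keep contended mutexes on separate cache lines, as the commit above describes, is to pad each mutex up to the cache line size. The sketch below assumes a 64-byte cache line, which is common but not universal; the type and function names are illustrative and not necessarily those used in the commit.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// cacheLineSize is an assumption; 64 bytes is typical on x86-64.
const cacheLineSize = 64

// paddedMutex occupies a full (assumed) cache line, so two adjacent
// mutexes in a slice never share one.
type paddedMutex struct {
	sync.Mutex
	_ [cacheLineSize - unsafe.Sizeof(sync.Mutex{})]byte
}

// fingerprintLocker spreads lock requests over a fixed set of padded mutexes.
type fingerprintLocker struct {
	mtxs []paddedMutex
}

func newFingerprintLocker(n int) *fingerprintLocker {
	if n < 1 {
		n = 1
	}
	return &fingerprintLocker{mtxs: make([]paddedMutex, n)}
}

// Lock and Unlock pick the mutex by taking the fingerprint modulo the pool size.
func (l *fingerprintLocker) Lock(fp uint64)   { l.mtxs[fp%uint64(len(l.mtxs))].Lock() }
func (l *fingerprintLocker) Unlock(fp uint64) { l.mtxs[fp%uint64(len(l.mtxs))].Unlock() }

// paddedMutexSize reports the size of one padded mutex in bytes.
func paddedMutexSize() uintptr { return unsafe.Sizeof(paddedMutex{}) }

func main() {
	l := newFingerprintLocker(4)
	l.Lock(42)
	l.Unlock(42)
	fmt.Println("bytes per mutex:", paddedMutexSize())
}
```

The padding trades memory for less cache line contention, which matches the parallel benchmark improving while the serial one stays roughly flat.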
Julius Volz	5f5a78e807	Merge pull request #1974 from prometheus/disable-local-storage Allow disabling local storage.	8 years ago
Tom Wilkie	d83879210c	Switch back to protos over HTTP, instead of GRPC. My aim is to support the new grpc generic write path in Frankenstein. On the surface this seems easy - however I've hit a number of problems that make me think it might be better to not use grpc just yet. The explanation of the problems requires a little background. At weave, traffic to frankenstein needs to go through a couple of services first, for SSL and to be authenticated. So traffic goes: internet -> frontend -> authfe -> frankenstein - The frontend is Nginx, and adds/removes SSL. It's done this way for legacy reasons, so the certs can be managed in one place, although eventually we imagine we'll merge it with authfe. All traffic from frontend is sent to authfe. - Authfe checks the auth tokens / cookie etc and then picks the service to forward the RPC to. - Frankenstein accepts the reads and does the right thing with them. First problem I hit was Nginx won't proxy http2 requests - it can accept them, but all calls downstream are http1 (see https://trac.nginx.org/nginx/ticket/923). This wasn't such a big deal, so it now looks like: internet --(grpc/http2)--> frontend --(grpc/http1)--> authfe --(grpc/http1)--> frankenstein Next problem was golang grpc server won't accept http1 requests (see https://groups.google.com/forum/#!topic/grpc-io/JnjCYGPMUms). It is possible to link a grpc server in with a normal go http mux, as long as the mux server is serving over SSL, as the golang http client & server won't do http2 over anything other than an SSL connection. This would require making all our service to service comms SSL. So I had a go at writing a grpc http1 server, and got pretty far. But it was a bit of a mess. So finally I thought I'd make a separate grpc frontend for this, running in parallel with the frontend/authfe combo on a different port - and first up I'd need a grpc reverse proxy.
Ideally we'd have some nice, generic reverse proxy that only knew about a map from service names -> downstream service, and didn't need to decode & re-encode every request as it went through. It seems like this can't be done with golang's grpc library - see https://github.com/mwitkow/grpc-proxy/issues/1. And then I was surprised to find you can't do grpc from browsers! See http://www.grpc.io/faq/ - not important to us, but I'm starting to question why we decided to use grpc in the first place? It would seem we could have most of the benefits of grpc with protos over HTTP, and this wouldn't preclude moving to grpc when it's a bit more mature? In fact, the grpc FAQ even admits as much: > Why is gRPC better than any binary blob over HTTP/2? > This is largely what gRPC is on the wire.	8 years ago
Tobias Schmidt	29ced0090f	Fix common English misspellings	8 years ago
Tobias Schmidt	e2c12dcdb5	Add missing error check in persistence test	8 years ago
Tobias Schmidt	8f3b62bfe4	Simplify struct initialization	8 years ago
Julius Volz	b24e5d63bc	Add noop local storage engine. This adds a flag -storage.local.engine which allows turning off local storage in Prometheus. Instead of adding if-conditions and nil checks to all parts of Prometheus that deal with Prometheus's local storage (including the web interface), disabling local storage simply means replacing the normal local storage with a noop version that throws samples away and returns empty query results. We also don't add the noop storage to the fanout appender to decrease internal overhead. Instead of returning empty results, an alternate behavior could be to return errors on any query that point out that the local storage is disabled. Not sure which one is more preferable, so I went with the empty result option for now.	8 years ago
Fabian Reinartz	22296dcb85	storage: clarify sample/matcher relation in docs	8 years ago
Fabian Reinartz	cc6f988a5e	storage: fix Querier interface documentation	8 years ago
Fabian Reinartz	7bd7e63f97	storage: fix struct alignment issue in test The uint64 `numCalls` ends up being not word-aligned on certain architectures, which makes atomic reads/writes panic.	8 years ago
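The alignment issue in the commit above comes from the `sync/atomic` requirement that 64-bit values accessed atomically be 64-bit aligned on some 32-bit platforms, where only the first word of an allocated struct is guaranteed to be so aligned. A minimal sketch of the usual remedy, keeping the counter as the first field, follows; the struct and its `name` field are hypothetical and only `numCalls` comes from the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// callCounter keeps the atomically accessed uint64 as its first field.
// The first word of an allocated struct can be relied upon to be
// 64-bit aligned, which the atomic functions need on 32-bit platforms.
type callCounter struct {
	numCalls uint64 // must stay first
	name     string // hypothetical extra field
}

// inc atomically increments the counter and returns the new value.
func (c *callCounter) inc() uint64 { return atomic.AddUint64(&c.numCalls, 1) }

// load atomically reads the counter.
func (c *callCounter) load() uint64 { return atomic.LoadUint64(&c.numCalls) }

func main() {
	c := &callCounter{name: "demo"}
	c.inc()
	c.inc()
	fmt.Println(c.name, c.load())
}
```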
nghialv	7655038184	fix typo	8 years ago
Tom Wilkie	d41d91388f	Update for new generic remote storage.	8 years ago
Tom Wilkie	a6931b71e8	Rationalise retrieval metrics so we have the state (success/failed) on both samples and batches, in a consistent fashion. Also, report total queue capacity of all queues, i.e. capacity * shards.	8 years ago
Tom Wilkie	ece12bff93	Shard/parallelise samples by fingerprint in StorageQueueManager By splitting the single queue into multiple queues and flushing each individual queue serially (and all queues in parallel), we can guarantee to preserve the order of timestamps in samples sent to downstream systems.	8 years ago
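The ordering guarantee in the commit above follows from routing every sample of a series to the same queue. The sketch below illustrates that routing with a plain modulo on the fingerprint; the `sample` type, the function names, and the modulo choice are assumptions for illustration, not the StorageQueueManager code.

```go
package main

import "fmt"

// sample is a simplified stand-in for a scraped sample.
type sample struct {
	fingerprint uint64
	timestamp   int64
	value       float64
}

// shardFor maps a series fingerprint to one of n queues.
// All samples of one series therefore land in the same queue.
func shardFor(fp uint64, n int) int { return int(fp % uint64(n)) }

// distribute appends each sample to its shard, keeping arrival order per shard.
func distribute(samples []sample, n int) [][]sample {
	shards := make([][]sample, n)
	for _, s := range samples {
		i := shardFor(s.fingerprint, n)
		shards[i] = append(shards[i], s)
	}
	return shards
}

func main() {
	in := []sample{{7, 1, 0.5}, {8, 1, 1.5}, {7, 2, 0.6}, {7, 3, 0.7}}
	for i, shard := range distribute(in, 4) {
		fmt.Println("shard", i, "holds", len(shard), "samples")
	}
}
```

If each shard is then flushed serially, samples of one series reach the downstream system in timestamp order even though different shards are flushed in parallel.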
Julius Volz	fe29e87824	Merge pull request #1930 from prometheus/generic-write-grpc Generic write via gRPC	8 years ago
Julius Volz	aa3f2b7216	Generic write cleanups and changes. - fold metric name into labels - return initialization errors back to main - add snappy compression - better context handling - pre-allocation of labels - remove generic naming - other cleanups	8 years ago
Brian Brazil	36d2c4bd0b	Add generic write path using grpc. This uses a new proto format, with scope for multiple samples per timeseries in future. This will allow users to pump samples out to whatever they like without having to change the core Prometheus code. There's also an example receiver to save users figuring out the boilerplate themselves.	8 years ago
Dan Milstein	ec064c96f6	Add field names to table-driven test fixtures	8 years ago
Dan Milstein	ac8788aca6	Convert to table-driven test and inline helper func	8 years ago
Dan Milstein	f50f656a66	Fix double-delta unmarshaling to respect actual min header size Turns out it's valid to have an overall chunk which is smaller than the full doubleDeltaHeaderBytes size -- if it has a single sample, it doesn't fill the whole header. Updated unmarshalling check to respect this.	8 years ago
Dan Milstein	b815956341	Catch errors when unmarshalling delta/doubleDelta encoded chunks This is (hopefully) a fix for #1653. Specifically, this makes it so that if the length for the stored delta/doubleDelta is somehow corrupted to be too small, the attempt to unmarshal will return an error. The current (broken) behavior is to return a malformed chunk, which can then lead to a panic when there is an attempt to read header values. The referenced issue proposed creating chunks with a minimum length -- I instead opted to just error on the attempt to unmarshal, since I'm not clear on how it could be safe to proceed when the length is incorrect/unknown. The issue also talked about possibly "quarantining series", but I don't know the surrounding code well enough to understand how to make that happen.	8 years ago
Matt Bostock	e618af5d0b	Storage: Add crash recovery metric 'started_dirty' ...to indicate when crash recovery was invoked during Prometheus startup. Fixes #1918.	8 years ago
Dan Milstein	764ceaa939	Add timeout to test, cap waiting at 1 second	8 years ago
Dan Milstein	007907b410	Fix one of the tests for a remote storage QueueManager Specifically, the TestSpawnNotMoreThanMaxConcurrentSendsGoroutines was failing on a fresh checkout of master. The test had a race condition -- it would only pass if one of the spawned goroutines happened to very quickly pull a set of samples off an internal queue. This patch rewrites the test so that it deterministically waits until all samples have been pulled off that queue. In case of errors, it also now reports on the difference between what it expected and what it found. I verified that, if the code under test is deliberately broken, the test successfully reports on that.	8 years ago
Julius Volz	3bfec97d46	Make the storage interface higher-level. See discussion in https://groups.google.com/forum/#!topic/prometheus-developers/bkuGbVlvQ9g The main idea is that the user of a storage shouldn't have to deal with fingerprints anymore, and should not need to do an individual preload call for each metric. The storage interface needs to be made more high-level to not expose these details. This also makes it easier to reuse the same storage interface for remote storages later, as fewer roundtrips are required and the fingerprint concept doesn't work well across the network. NOTE: this deliberately gets rid of a small optimization in the old query Analyzer, where we dedupe instants and ranges for the same series. This should have a minor impact, as most queries do not have multiple selectors loading the same series (and at the same offset).	8 years ago
beorn7	fc6737b7fb	storage: improve index lookups tl;dr: This is not a fundamental solution to the indexing problem (like tindex is) but it at least avoids utilizing the intersection problem to the greatest possible amount. In more detail: Imagine the following query: nicely:aggregating:rule{job="foo",env="prod"} While it uses a nicely aggregating recording rule (which might have a very low cardinality), Prometheus still intersects the low number of fingerprints for `{__name__="nicely:aggregating:rule"}` with the many thousands of fingerprints matching `{job="foo"}` and with the millions of fingerprints matching `{env="prod"}`. This totally innocuous query is dead slow if the Prometheus server has a lot of time series with the `{env="prod"}` label. Ironically, if you make the query more complicated, it becomes blazingly fast: nicely:aggregating:rule{job=~"foo",env=~"prod"} Why so? Because Prometheus only intersects with non-Equal matchers if there are no Equal matchers. That's good in this case because it retrieves the few fingerprints for `{__name__="nicely:aggregating:rule"}` and then starts right ahead to retrieve the metric for those FPs and checking individually if they match the other matchers. This change is generalizing the idea of when to stop intersecting FPs and go into "retrieve metrics and check them individually against remaining matchers" mode: - First, sort all matchers by "expected cardinality". Matchers matching the empty string are always worst (and never used for intersections). Equal matchers are in general considered best, but by using some crude heuristics, we declare some better than others (instance labels or anything that looks like a recording rule). - Then go through the matchers until we hit a threshold of remaining FPs in the intersection. This threshold is higher if we are already in the non-Equal matcher area as intersection is even more expensive here.
- Once the threshold has been reached (or we have run out of matchers that do not match the empty string), start with "retrieve metrics and check them individually against remaining matchers". A beefy server at SoundCloud was spending 67% of its CPU time in index lookups (fingerprintsForLabelPairs), serving mostly a dashboard that is exclusively built with recording rules. With this change, it spends only 35% in fingerprintsForLabelPairs. The CPU usage dropped from 26 cores to 18 cores. The median latency for query_range dropped from 14s to 50ms(!). As expected, higher percentile latency didn't improve that much because the new approach is _occasionally_ running into the worst case while the old one was _systematically_ doing so. The 99th percentile latency is now about as high as the median before (14s) while it was almost twice as high before (26s).	8 years ago
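The "intersect until a threshold, then check the rest individually" idea in the commit above can be sketched roughly as below. This is a simplification under stated assumptions: the expected-cardinality rank is taken as given rather than derived by heuristics, a single threshold is used, and the fingerprint sets are precomputed, whereas in a real index the lookup itself is the expensive step and would be done lazily. All type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// candidateSet is a hypothetical stand-in for the fingerprints that an
// index lookup returns for one matcher, plus an assumed selectivity rank
// (lower means lower expected cardinality).
type candidateSet struct {
	matcher string
	rank    int
	fps     map[uint64]struct{}
}

// setOf builds a fingerprint set from the given values.
func setOf(fps ...uint64) map[uint64]struct{} {
	s := make(map[uint64]struct{}, len(fps))
	for _, fp := range fps {
		s[fp] = struct{}{}
	}
	return s
}

// narrow intersects the sets in rank order and stops as soon as at most
// threshold fingerprints remain. It returns the remaining candidates and
// the sets it did not use; a caller would check those matchers against
// each remaining candidate's metric individually.
func narrow(sets []candidateSet, threshold int) (map[uint64]struct{}, []candidateSet) {
	if len(sets) == 0 {
		return nil, nil
	}
	sorted := append([]candidateSet(nil), sets...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].rank < sorted[j].rank })

	// Start from a copy of the most selective set so the input is not modified.
	remaining := make(map[uint64]struct{}, len(sorted[0].fps))
	for fp := range sorted[0].fps {
		remaining[fp] = struct{}{}
	}
	next := 1
	for next < len(sorted) && len(remaining) > threshold {
		for fp := range remaining {
			if _, ok := sorted[next].fps[fp]; !ok {
				delete(remaining, fp)
			}
		}
		next++
	}
	return remaining, sorted[next:]
}

func main() {
	sets := []candidateSet{
		{matcher: `env="prod"`, rank: 2, fps: setOf(1, 2, 3, 4, 5, 6)},
		{matcher: `__name__="nicely:aggregating:rule"`, rank: 0, fps: setOf(1, 2)},
		{matcher: `job="foo"`, rank: 1, fps: setOf(1, 3, 5)},
	}
	fps, unchecked := narrow(sets, 5)
	fmt.Println("candidates:", len(fps), "matchers left to check individually:", len(unchecked))
}
```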
Dmitry Vorobev	273e457da4	web: return status code and error message for config resource	9 years ago
Björn Rabenstein	0622304244	Merge pull request #1798 from prometheus/beorn7/storage2 Crash recovery: Fix an edge case.	9 years ago
beorn7	2a75b15328	Crash recovery: Fix an edge case. If the chunks of a series in the checkpoint are all older than the latest chunk on disk, the head chunk is persisted and therefore has to be declared closed. It would be great to have a test for this, but that would require more plumbing, subject of #447.	9 years ago
beorn7	064b57858e	Consistently use the `Seconds()` method for conversion of durations This also fixes one remaining case of recording integral numbers of seconds only for a metric, i.e. this will probably fix #1796.	9 years ago
Julius Volz	91401794fa	storage: Make MemorySeriesStorage a public type See https://twitter.com/fabxc/status/748032597876482048	9 years ago
Fabian Reinartz	425736a377	*: remove last remainders of non-second metrics	9 years ago
Julius Volz	b7b6717438	Separate query interface out of local.Storage. PromQL only requires a much narrower interface than local.Storage in order to run queries. Narrower interfaces are easier to replace and test, too. We could also change the web interface to use local.Querier, except that we'll probably use appending functions from there in the future.	9 years ago
Jan van Valburg	68f3df49d0	storage: fix 'access denied' error on Windows On Windows, it is not possible to rename or delete a file that is currently open. This change closes the file in dropAndPersistChunks before it tries to delete it, or rename the temporary file to it.	9 years ago
beorn7	b274c7aaa7	Update doc comments	9 years ago
beorn7	99881ded63	Make the number of fingerprint mutexes configurable With a lot of series accessed in a short timeframe (by a query, a large scrape, checkpointing, ...), there is actually quite a significant amount of lock contention if something similar is running at the same time. In those cases, the number of locks needs to be increased. On the same front, as our fingerprints don't have a lot of entropy, I introduced some additional shuffling. With the current state, only changes in the least significant bits of a FP would matter.	9 years ago
Tobias Schmidt	4c439b4b45	Merge pull request #1646 from prometheus/beorn7/valuecomparison Correctly identify no-op appends if the value is NaN	9 years ago
beorn7	a308c76292	Improve TestAppendOutOfOrder It did not test the returned error so far. Also, add tests for the NaN case broken before https://github.com/prometheus/common/pull/40	9 years ago
beorn7	b2ef4dc52d	Correctly identify no-op appends if the value is NaN This requires updating the vendored common.model package, which I will do once https://github.com/prometheus/common/pull/40 is merged.	9 years ago
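The NaN case in the commit above is tricky because, under IEEE 754, NaN never compares equal to itself, so a plain `==` cannot detect that the same NaN value is being appended again. A minimal sketch of a comparison that treats two NaNs as equal follows; the function name `sameValue` is an assumption and this is not necessarily how the referenced pull request implements it.

```go
package main

import (
	"fmt"
	"math"
)

// sameValue reports whether two sample values should count as identical
// for the purpose of detecting a no-op append. Unlike ==, it treats
// NaN as equal to NaN.
func sameValue(a, b float64) bool {
	if a == b {
		return true
	}
	return math.IsNaN(a) && math.IsNaN(b)
}

func main() {
	nan := math.NaN()
	fmt.Println("plain comparison:", nan == nan)
	fmt.Println("sameValue:", sameValue(nan, nan))
}
```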
Dmitry Vorobev	bd2a770015	storage/remote: Spawn not more than "maxConcurrentSends" goroutines.	9 years ago
Dmitry Savintsev	7fdb62c253	fix several minor golint style issues	9 years ago
Steve Durrheimer	399d5c6375	Make version information consistent between prometheus components	9 years ago
beorn7	07a294ac15	Doc comment fixes	9 years ago
beorn7	20cba1ed8f	Initialize metric vectors in memorySeriesStorage	9 years ago
beorn7	d566808d40	Bring back logging of discarded samples But only on DEBUG level. Also, count and report the two cases of out-of-order timestamps on the one hand and same timestamp but different value on the other hand separately.	9 years ago
beorn7	db16acd7fb	Never drop a still open head chunk.	9 years ago
beorn7	a90d645378	Checkpoint fingerprint mappings only upon shutdown Before, we checkpointed after every newly detected fingerprint collision, which is not a problem as long as collisions are rare. However, with a sufficient number of metrics or particular nature of the data set, there might be a lot of collisions, all to be detected upon the first set of scrapes, and then the checkpointing after each detection will take a quite long time (it's O(n²), essentially). Since we are rebuilding the fingerprint mapping during crash recovery, the previous, very conservative approach didn't even buy us anything. We only ever read from the checkpoint file after a clean shutdown, so the only time we need to write the checkpoint file is during a clean shutdown.	9 years ago
Jonathan Boulle	38098f8c95	Add missing license headers Prometheus is Apache 2 licensed, and most source files have the appropriate copyright license header, but some were missing it without apparent reason. Correct that by adding it.	9 years ago
Fabian Reinartz	a18639dc2d	Merge pull request #1454 from prometheus/beorn7/fix-test Give TestEvictAndLoadChunkDescs more time to actually evict	9 years ago
Tobias Schmidt	e82ef154ee	Remove unused code leftovers	9 years ago
beorn7	d09ca03e10	Work around compiler bug Benchmarks don't show any significant changes.	9 years ago
beorn7	507f550cd4	Merge branch 'master' into beorn7/storage7	9 years ago
beorn7	865d16f870	Rename Gorilla into varbit	9 years ago
Julius Volz	d3b53bd7f0	Fix comment about Graphite mapping of dimensions.	9 years ago
beorn7	4b574e8a61	Switch chunk encoding to type 2 where it was hardcoded type 1 before The chunk encoding was hardcoded there because it mostly doesn't matter what encoding is chosen in that test. Since type 1 is battle-hardened enough, I'm switching to type 2 here so that we can catch unexpected problems as a byproduct. My expectation is that the chunk encoding doesn't matter anyway, as said, but then "unexpected problems" contains the word "unexpected".	9 years ago
beorn7	c72979e3ed	Remove a redundancy from Gorilla-style chunks So far, the last sample in a chunk was saved twice. That's required for adding more samples as we need to know the last sample added to add more samples without iterating through the whole chunk. However, once the last sample was added to the chunk before it's full, there is no need to save it twice. Thus, the very last sample added to a chunk can _only_ be saved in the header fields for the last sample. The chunk has to be identifiable as closed, then. This information has been added to the flags byte.	9 years ago
beorn7	b6dbb826ae	Improve fuzz testing and fix a bug exposed This improves fuzz testing in two ways: (1) More realistic time stamps. So far, the most common case in practice was very rare in the test: Completely regular increases of the timestamp. (2) Verify samples by scanning through the whole relevant section of the series. For Gorilla-like chunks, this showed two things: (1) With more regularly increasing time stamps, BenchmarkFuzz is essentially as fast as with the traditional chunks: ``` BenchmarkFuzzChunkType0-8 2 972514684 ns/op 83426196 B/op 2500044 allocs/op BenchmarkFuzzChunkType1-8 2 971478001 ns/op 82874660 B/op 2512364 allocs/op BenchmarkFuzzChunkType2-8 2 999339453 ns/op 76670636 B/op 2366116 allocs/op ``` (2) There was a bug related to when and how the chunk footer is overwritten to make use for the last sample. This wasn't exposed by random access as the last sample of a chunk is retrieved from the values in the header in that case.	9 years ago
beorn7	9d8fbbe822	Review improvements	9 years ago
beorn7	8cdced3850	Implement Gorilla-inspired chunk encoding This is not a verbatim implementation of the Gorilla encoding. First of all, it could not, even if we wanted, because Prometheus has a different chunking model (constant size, not constant time). Second, this adds a number of changes that improve the encoding in general or at least for the specific use case of Prometheus (and are partially only possible in the context of Prometheus). See comments in the code for details.	9 years ago
beorn7	8e64e8dfca	Fix return statement.	9 years ago
Björn Rabenstein	98c8560851	Merge pull request #1477 from prometheus/beorn7/storage7 Solve the series churn problem...	9 years ago
beorn7	e7ac9c6863	Improvements based on review - Moved returns into the default section of switch statement that can only happen then. - Fix typo.	9 years ago
beorn7	199f309a39	Resurrect and rename invalid preload requests count metric. It is now also used in label matching, so the name of the metric changed from `prometheus_local_storage_invalid_preload_requests_total` to `non_existent_series_matches_total`.	9 years ago
beorn7	e8c1f30ab2	Merge the parallel logic of getSeriesForRange and metricForFingerprint	9 years ago
beorn7	9445c7053d	Add tests for range-limited label matching While doing so, improve getSeriesForRange.	9 years ago
beorn7	47e3c90f9b	Clean up error propagation Only return an error where callers are doing something with it except simply logging and ignoring. All the errors touched in this commit flag the storage as dirty anyway, and that fact is logged anyway. So most of what is being removed here is just log spam. As discussed earlier, the class of errors that flags the storage as dirty signals fundamental corruption, so even bubbling up a one-time warning to the user (e.g. about incomplete results) isn't helping much because _anything_ happening in the storage has to be doubted from that point on (and in fact retroactively into the past, too). Flagging the storage dirty, and alerting on it (plus marking the state in the web UI) is the only way I can see right now. As a byproduct, I cleaned up the setDirty method a bit and improved the logged errors.	9 years ago
beorn7	99854a84d7	Merge branch 'beorn7/storage6' into beorn7/storage7	9 years ago
beorn7	5e4fa96719	Merge branch 'beorn7/storage5' into beorn7/storage6	9 years ago
beorn7	b343e65907	Merge branch 'beorn7/storage4' into beorn7/storage5	9 years ago
beorn7	d0a4477446	Merge branch 'beorn7/storage3' into beorn7/storage4 Conflicts: storage/local/preload.go storage/local/storage.go storage/local/storage_test.go	9 years ago
beorn7	55eddab25f	Merge branch 'beorn7/storage2' into beorn7/storage3	9 years ago
beorn7	161eada3ad	Make chunkIterator even leaner.	9 years ago
beorn7	beb36df4bb	De-flag preloadChunksForRange Now there is preloadChunksForRange and preloadChunksForInstant in both the series and the storage.	9 years ago
beorn7	836f1db04c	Improve MetricsForLabelMatchers WIP: This needs more tests. It now gets a from and through value, which it may opportunistically use to optimize the retrieval. With possible future range indices, this could be used in a very efficient way. This change merely applies some easy checks, which should nevertheless solve the use case of heavy rule evaluations on servers with a lot of series churn. Idea is the following: - Only archive series that are at least as old as the headChunkTimeout (which was already extremely unlikely to happen). - Then maintain a high watermark for the last archival, i.e. no archived series has a sample more recent than that watermark. - Any query that doesn't reach to a time before that watermark doesn't have to touch the archive index at all. (A production server at SoundCloud with the aforementioned series churn and heavy rule evaluations spends 50% of its CPU time in archive index lookups. Since rule evaluations usually only touch very recent values, most of those lookups should disappear with this change.) - Federation with a very broad label matcher will profit from this, too. As a byproduct, the un-needed MetricForFingerprint method was removed from the Storage interface.	9 years ago
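The high-watermark check described in the commit above can be sketched as follows. The `archive` type, its field, and the method names are hypothetical, and timestamps are plain integers for simplicity; the sketch only shows how a query's `from` value decides whether the archive index needs to be consulted.

```go
package main

import "fmt"

// archive tracks a high watermark: by construction, no archived series
// has a sample more recent than highWatermark. Names are hypothetical.
type archive struct {
	highWatermark int64
}

// noteArchived raises the watermark when a series whose last sample is
// at lastSampleTime gets archived.
func (a *archive) noteArchived(lastSampleTime int64) {
	if lastSampleTime > a.highWatermark {
		a.highWatermark = lastSampleTime
	}
}

// needsArchiveLookup reports whether a query starting at from could match
// archived series. Queries that start after the watermark can skip the
// archive index entirely.
func (a *archive) needsArchiveLookup(from int64) bool {
	return from <= a.highWatermark
}

func main() {
	a := &archive{}
	a.noteArchived(100)
	fmt.Println("query from 150 needs archive:", a.needsArchiveLookup(150))
	fmt.Println("query from 50 needs archive:", a.needsArchiveLookup(50))
}
```

Rule evaluations, which mostly read very recent samples, would fall on the "skip" side of this check most of the time.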
beorn7	167b83695c	Merge branch 'beorn7/storage5' into beorn7/storage6	9 years ago
beorn7	01795382c9	Merge branch 'beorn7/storage4' into beorn7/storage5	9 years ago
beorn7	f7fc542db6	Merge branch 'master' into beorn7/storage4 Conflicts: storage/local/persistence.go	9 years ago
beorn7	3d86130d8c	Merge branch 'master' into beorn7/storage3	9 years ago
beorn7	1f30c8de8d	Merge branch 'master' into beorn7/storage2	9 years ago
beorn7	c13b1ecfe9	Make chunk iterators more DRY This finally extracts all the common code of the two chunk iterators into one. Any future chunk encodings with fast access by index can use the same iterator by simply providing an indexAccessor. Other future chunk encodings without fast index access (like Gorilla-style) can still implement the chunkIterator interface as usual.	9 years ago
beorn7	32f280a3cd	Slim down the chunkIterator interface For one, remove unneeded methods. Then, instead of using a channel for all values, use a bufio.Scanner-like interface. This removes the need for creating a goroutine and avoids the (unnecessary) locking performed by channel sending and receiving. This will make it much easier to write new chunk implementations (like Gorilla-style encoding).	9 years ago
beorn7	b6fdb355d7	Move dump-heads into its own tool	9 years ago
beorn7	f193f2b8ef	Add a command to promtool that dumps metadata of heads.db I needed this today for debugging. It can certainly be improved, but it's already quite helpful. I refactored the reading of heads.db files out of persistence, which is an improvement, too. I made minor changes to the cli package to allow outputting via the io.Writer interface.	9 years ago
beorn7	75a6b460ef	Give TestEvictAndLoadChunkDescs more time to actually evict Obviously, it's really bad to depend on timing here. The proper fix would be to have something like WaitForIndexing for other things to wait for, too. For now, let's see if the wait time increase fixes the issue.	9 years ago
beorn7	fc7de5374a	Quarantine series upon problem writing to the series file This fixes https://github.com/prometheus/prometheus/issues/1059 , but not in the obvious way (simply not updating the persist watermark, because that's actually not that simple - we don't really know what has gone wrong exactly). As any errors relevant here are most likely caused by severe and unrecoverable problems with the series file, using the new quarantine feature is the right step. We don't really have to be worried about any inconsistent state of the series because it will be removed for good ASAP. Another plus is that we don't have to declare the whole storage dirty anymore.	9 years ago
beorn7	0ea5801e47	Handle errors caused by data corruption more gracefully This requires all the panic calls upon unexpected data to be converted into errors returned. This pollutes the function signatures quite a lot. Well, this is Go... The ideas behind this are the following: - panic only if it's a programming error. Data corruptions happen, and they are not programming errors. - If we detect a data corruption, we "quarantine" the series, essentially removing it from the database and putting its data into a separate directory for forensics. - Failure during writing to a series file is not considered corruption automatically. It will call setDirty, though, so that a crashrecovery upon the next restart will commence and check for that. - Series quarantining and setDirty calls are logged and counted in metrics, but are hidden from the user of the interfaces in interface.go, with the notable exception of Append(). The reasoning is that we treat corruption by removing the corrupted series, i.e. a query for it will return no results on its next call anyway, so return no results right now. In the case of Append(), we want to tell the user that no data has been appended, though. Minor side effects: - Now consistently using filepath.* instead of path.*. - Introduced structured logging where I touched it. This makes things less consistent, but a complete change to structured logging would be out of scope for this PR.	9 years ago
beorn7	b6840997a7	Merge branch 'beorn7/storage2' into beorn7/storage3	9 years ago
beorn7	ce58fd357b	Merge branch 'beorn7/storage' into beorn7/storage2 Conflicts: storage/local/chunk.go storage/local/interface.go	9 years ago
beorn7	2581648f70	Separate iterators by offset Add test that exposes the problem.	9 years ago
beorn7	c740789ce3	Improve predict_linear Fixes https://github.com/prometheus/prometheus/issues/1401 This removes the last (and in fact bogus) use of BoundaryValues. Thus, a whole lot of unused (and arguably sub-optimal / ugly) code can be removed here, too.	9 years ago
beorn7	4b503ed9a5	Merge branch 'master' into beorn7/storage2	9 years ago
beorn7	059295332f	Merge remote-tracking branch 'origin/master' into beorn7/storage	9 years ago
beorn7	53005c3085	Merge branch 'beorn7/storage' into beorn7/storage2	9 years ago
beorn7	28e9bbc15f	Populate chunkDesc.chunkLastTime during checkpoint loading, too	9 years ago
Björn Rabenstein	a8c79f0a0c	Merge pull request #1422 from prometheus/release-0.17 Merge more commits from 0.17.	9 years ago
beorn7	8fa1560e48	Fix a very special case of handling the checkpoint timer	9 years ago
beorn7	41e44f6ab9	Merge branch 'master' into beorn7/storage2	9 years ago
Björn Rabenstein	d9eb624322	Merge pull request #1415 from prometheus/release-0.17 Forward-merge release-0.17 into master	9 years ago
beorn7	4d1f7b49b6	Fix a race condition in calculatePersistenceUrgencyScore	9 years ago
beorn7	454ecf3f52	Rework the way ranges and instants are handled In a way, our instants were also ranges, just with the staleness delta as range length. They are now treated equally, just that in one case, the range length is set as range, in the other the staleness delta. However, there are "real" instants where start and end time of a query are the same. In those cases, we only want to return a single value (the one closest before or at the equal start and end time). If that value is the last sample in the series, odds are we have it already in the series object. In that case, there is no need to pin or load any chunks. A special singleSampleSeriesIterator is created for that. This should greatly speed up instant queries as they happen frequently for rule evaluations.	9 years ago
beorn7	b876f8e6a5	Move lastSamplePair method up to memorySeries This implies a slight change of behavior as only samples added to the respective instance of a memorySeries are returned. However, this is most likely anyway what we want. Following cases: - Server has been restarted: Given the time it takes to cleanly shutdown and start up a server, the series are now stale anyway. An improved staleness handling (still to be implemented) will be based on tracking if a given target is continuing to expose samples for a given time series. In that case, we need a full scrape cycle to decide about staleness. So again, it makes sense to consider everything stale directly after a server restart. - Series unarchived due to a read request: The series is definitely stale so we don't want to return anything anyway. - Freshly created time series or series unarchived because of a sample append: That happens because appending a sample is imminent. Before the fingerprint lock is released, the series will have received a sample, and lastSamplePair will always return the expected value.	9 years ago
beorn7	1e13f89039	Return SamplePair instead of *SamplePair consistently Formalize ZeroSamplePair as return value for non-existing samples. Change LastSamplePairForFingerprint to return a SamplePair (and not a pointer to it), which saves allocations in a potentially extremely frequent call.	9 years ago
beorn7	d290340367	Fix and improve chunkDesc locking	9 years ago
beorn7	0e202dacb4	Streamline series iterator creation This will fix issue #1035 and will also help to make issue #1264 less bad. The fundamental problem in the current code: In the preload phase, we quite accurately determine which chunks will be used for the query being executed. However, in the subsequent step of creating series iterators, the created iterators are referencing _all_ in-memory chunks in their series, even the un-pinned ones. In iterator creation, we copy a pointer to each in-memory chunk of a series into the iterator. While this creates a certain amount of allocation churn, the worst thing about it is that copying the chunk pointer out of the chunkDesc requires a mutex acquisition. (Remember that the iterator will also reference un-pinned chunks, so we need to acquire the mutex to protect against concurrent eviction.) The worst case happens if a series doesn't even contain any relevant samples for the query time range. We notice that during preloading but then we will still create a series iterator for it. But even for series that do contain relevant samples, the overhead is quite bad for instant queries that retrieve a single sample from each series, but still go through all the effort of series iterator creation. All of that is particularly bad if a series has many in-memory chunks. This commit addresses the problem from two sides: First, it merges preloading and iterator creation into one step, i.e. the preload call returns an iterator for exactly the preloaded chunks. Second, the required mutex acquisition in chunkDesc has been greatly reduced. That was enabled by a side effect of the first step, which is that the iterator is only referencing pinned chunks, so there is no risk of concurrent eviction anymore, and chunks can be accessed without mutex acquisition. To simplify the code changes for the above, the long-planned change of ValueAtTime to ValueAtOrBefore time was performed at the same time. 
(It should have been done first, but it kind of accidentally happened while I was in the middle of writing the series iterator changes. Sorry for that.) So far, we actively filtered the up to two values that were returned by ValueAtTime, i.e. we invested work to retrieve up to two values, and then we invested more work to throw one of them away. The SeriesIterator.BoundaryValues method can be removed once #1401 is fixed. But I really didn't want to load even more changes into this PR. Benchmarks: The BenchmarkFuzz.* benchmarks run 83% faster (i.e. about six times faster) and allocate 95% fewer bytes. The reason for that is that the benchmark reads one sample after another from the time series and creates a new series iterator for each sample read. To find out how much these improvements matter in practice, I have mirrored a beefy Prometheus server at SoundCloud that suffers from both issues #1035 and #1264. To reach steady state that would be comparable, the server needs to run for 15d. So far, it has run for 1d. The test server currently has only half as many memory time series and 60% of the memory chunks the main server has. The 90th percentile rule evaluation cycle time is ~11s on the main server and only ~3s on the test server. However, these numbers might get much closer over time. In addition to performance improvements, this commit removes about 150 LOC.	9 years ago
beorn7	ef3ab96111	Populate first and last time in the chunk descriptor earlier The first time is kind of trivial as we always know it when we create a new chunkDesc. The last time is only known when the chunk is closed, so we have to set it at that time. The change saves a lot of digging down into the chunk itself. Especially the last time is relatively expensive as it involves the creation of an iterator. The first time access now doesn't require locking, which is also a nice gain.	9 years ago
beorn7	9a3edea477	Remove race condition from TestRetentionCutoff	9 years ago
Julius Volz	9b6d69610a	Fix various typos in comments. Helpfully reported by https://goreportcard.com/report/github.com/prometheus/prometheus :)	9 years ago
Fabian Reinartz	1f877f3d2a	Fix deadlock, structure target logging	9 years ago
Fabian Reinartz	59f1e722df	Return error on sample appending	9 years ago
beorn7	ec08c9a391	Rework the way to communicate backpressure (AKA suspended ingestion) This gives up on the idea to communicate through the Append() call (by either not returning as it is now or returning an error as suggested/explored elsewhere). Here I have added a Throttled() call, which has the advantage that it can be called before a whole _batch_ of Append()'s. Scrapes will happen completely or not at all. Same for rule group evaluations. That's a highly desired behavior (as discussed elsewhere). The code is even simpler now as the whole ingestion buffer could be removed. Logging of throttled mode has been streamlined and will create at most one message per minute.	9 years ago
beorn7	87ef24cd25	Add instrumentation and refactor things around "rushed mode"	9 years ago
beorn7	a2cd479058	Fix calculation of chunks to persist after restart Since we are not overestimating the number of chunks to persist anymore, this commit also adjusts the default value for -storage.local.memory-chunks. Update of documentation will follow.	9 years ago
beorn7	972d94433a	Introduce a hysteresis for "rushed mode" "Rushed mode" is formerly known as "degraded mode", which is changed with this commit, too. The name "degraded" was very misleading. Also, switch into rushed mode if we have too many chunks in memory and an at least reasonable amount of chunks to persist so that speeding up persisting chunks can help.	9 years ago
beorn7	14796bdb60	Improve chunkMaxBatchSize doc comment	9 years ago
beorn7	582af1618c	Streamline chunk writing This helps to avoid allocations in the same way we were already doing it during reading.	9 years ago
beorn7	99b9611351	Remove a race condition from TestRetentionCutoff	9 years ago
beorn7	3f4d22e4c7	Update doc comment This should have gone into a previous commit, but I forgot to save this particular file.	9 years ago
beorn7	add2ebdd56	Tolerate the lost+found directory in the data directory	9 years ago
Björn Rabenstein	6293f3a374	Merge pull request #1304 from prometheus/beorn7/storage Improve handling of series file truncation	9 years ago
beorn7	cb117d8346	Add a series ops metric "purge_on_request" It counts series deletions triggered via the API.	9 years ago
beorn7	4221c7de5c	Improve handling of series file truncation If only very few chunks are to be truncated from a very large series file, the rewrite of the file is a large overhead. With this change, a certain ratio of the file has to be dropped to make it happen. While only causing disk overhead at about the same ratio (by default 10%), it will cut down I/O by a lot in above scenario.	9 years ago
Corentin Chary	7b6c3e556c	Use '.' instead of '=' to separate labels from their values in Graphite Using .label=value. was weird to use in Graphite and didn't bring much value.	9 years ago
Julius Volz	75fdcf5698	Merge pull request #1197 from iksaif/master Add support for remote storage on Graphite	9 years ago
Corentin Chary	a2e4439086	Add support for remote storage on Graphite Allows to use graphite over tcp or udp. Metrics labels and values are used to construct a valid Graphite path in a way that will allow us to eventually read them back and reconstruct the metrics. For example, this metric: model.Metric{ model.MetricNameLabel: "test:metric", "testlabel": "test:value", "testlabel2": "test:value", } Will become: test:metric.testlabel=test:value.testlabel2=test:value escape.go takes care of escaping values to match Graphite character set, it basically uses percent-encoding as a fallback which will work pretty well in the graphite/grafana world. The remote storage module also has an optional 'prefix' parameter to prefix all metrics with a path (for example, 'prometheus.'). Graphite URLs are simply in the form tcp://host:port or udp://host:port.	9 years ago
Fabian Reinartz	33aab4169c	Anchor regexes in vector matching This commit makes the regex behavior of vector matching consistent with configuration and label_replace() by anchoring it. Fixes #1200	9 years ago
Fabian Reinartz	e3b6ec9784	Switch to common/log	9 years ago
Julius Volz	dac26cef71	Rename global "labels" config option to "external_labels".	9 years ago
Julius Volz	eeb1da36ac	Fix InfluxDB write support to work with InfluxDB 0.9.x. Because the InfluxDB client library currently pulls in multiple MBs of unnecessary dependencies, I have modified and cut up the vendored version to only pull in the few pieces that are actually needed. On InfluxDB's side, this dependency issue is tracked in: https://github.com/influxdb/influxdb/issues/3447 Hopefully, it will be resolved soon. If a password is needed for InfluxDB, it may be supplied via the INFLUXDB_PW environment variable.	9 years ago
Julius Volz	5f77fce578	Improve remote storage queue manager metrics.	9 years ago
beorn7	22d3a4311a	Increase waiting time in TestEvictAndLoadChunkDescs The test had become flaky with Go1.5. Theory here is that with Go1.5.x, sleeping for 10ms might not be enough to wake up another goroutine, possibly because it is used for GC. 50ms should always be enough due to GC pause guarantees with the new GC.	9 years ago
Julius Volz	af513468eb	Fix some dead code, missing error checks, shadowings. I applied https://medium.com/@jgautheron/quality-pipeline-for-go-projects-497e34d6567 and was greeted with a deluge of warnings, most of which were not applicable or really fixable realistically. These are some of the first ones I decided to fix.	9 years ago
beorn7	daeccdd0e9	Fix DropMetricsForFingerprints It now deletes the series file also for archived series. Also, fix a naming error in a doc comment.	9 years ago
Julius Volz	ffc5142c54	Merge pull request #1058 from prometheus/check-errors Fix error checking and logging around checkpointing.	9 years ago
Julius Volz	6774a73878	Fix error checking and logging around checkpointing.	9 years ago
Julius Volz	011faf9057	Fix typo in comment.	9 years ago
Fabian Reinartz	8fa719f778	Attach global labels to remote storage samples	9 years ago
Dieter Plaetinck	e1dacc56e6	fix comment. the sample doesn't get appended to the list of sampleappenders.	9 years ago
Julius Volz	744d5d5a7a	Merge pull request #1029 from prometheus/vet-fixes Fix "go vet" errors.	9 years ago
Julius Volz	995d3b831d	Fix most golint warnings. This is with `golint -min_confidence=0.5`. I left several lint warnings untouched because they were either incorrect or I felt it was better not to change them at the moment.	9 years ago
Julius Volz	963ad82dcb	Fix "go vet" errors. I ignored all errors of the type "composite literal uses unkeyed fields". Most of them are wrong because of https://github.com/golang/go/issues/9171.	9 years ago
Fabian Reinartz	d6b8da8d43	Switch promql types to common/model	9 years ago
Fabian Reinartz	e061595352	Move COWMetric into storage/metric package	9 years ago

1173 Commits (87c5e9bc37c443570728abf883fcb30c9b368ed4)