prometheus

Commit Graph

Author	SHA1	Message	Date
Miek Gieben	4b43e825f4	Rename block to hupReady. Remove the write to the channel as per comments.	2015-06-12 14:45:02 +01:00
Miek Gieben	d8651302fc	Start HUP signal handler earlier. When prometheus starts up and is recovering its state it will not handle SIGHUPs. If it receives those during this phase it will exit. The change here makes prometheus ignore SIGHUPs until it is ready to handle them. Note this is only done for SIGHUP because that signal is used to trigger a config reload and as such something could already be sending these signals as part of a config update.	2015-06-12 14:30:14 +01:00
Julius Volz	39aa66e46e	Place storage under working directory by default.	2015-06-11 15:31:50 +02:00
Fabian Reinartz	5b713911e3	web/api: enable running API legacy and v1 in parallel	2015-06-08 19:11:48 +02:00
Fabian Reinartz	78047326b4	web: cleanup initialization of web service.	2015-06-03 08:45:43 +02:00
Fabian Reinartz	280d11dca8	main: exit on invalid rule files on startup.	2015-06-02 18:44:41 +02:00
Julius Volz	cf64bbe1ce	Add links to configuration change notice.	2015-06-01 18:36:11 +02:00
Julius Volz	d7c015c149	Convert pathPrefix to not have trailing slash.	2015-06-01 12:43:17 +02:00
Fabian Reinartz	6e319532cf	Read from indexing queue during crash recovery. Change #704 introduced a regression that started reading the queue only after potential crash recovery. When more than the queue capacity was indexed, Prometheus deadlocked.	2015-05-23 15:32:35 +02:00
Julius Volz	d4bd3397ae	Merge pull request #712 from prometheus/fabxc/def_cfg_file Change default config file name	2015-05-20 23:02:14 +02:00
Fabian Reinartz	223eaf2ca3	Change default config file name	2015-05-20 19:24:27 +02:00
Julius Volz	267fd34156	Switch Prometheus to use github.com/prometheus/log. This change is conceptually very simple, although the diff is large. It switches logging from "github.com/golang/glog" to "github.com/prometheus/log", while not actually changing any log messages. V(1)-style logging has been changed to be log.Debug*().	2015-05-20 18:19:32 +02:00
Fabian Reinartz	5d3024fd3e	Restructure component initialization	2015-05-19 14:41:47 +02:00
Fabian Reinartz	d8440d75f1	Do not start storage processing before Start() is called.	2015-05-19 13:51:45 +02:00
Fabian Reinartz	bb540fd9fd	Implement config reloading on SIGHUP. With this commit, sending SIGHUP to the Prometheus process will reload and apply the configuration file. The different components attempt to handle failing changes gracefully.	2015-05-13 16:49:46 +02:00
Fabian Reinartz	3b0777ff84	Merge branch 'master' into fabxc/servdisc	2015-05-12 15:46:16 +02:00
Fabian Reinartz	1f2209b159	Merge pull request #680 from prometheus/fabxc/sd_yamlcfg Switch config to YAML format.	2015-05-11 18:20:29 +02:00
Fabian Reinartz	5fbde88919	Switch config to YAML format.	2015-05-07 16:52:14 +02:00
Fabian Reinartz	8f75ff0513	Add warning about config changes.	2015-05-05 15:17:55 +02:00
Fabian Reinartz	fe935179cd	Stop routing rule statements through the engine.	2015-04-29 18:01:43 +02:00
Fabian Reinartz	479891c9be	Rename RuleManager to Manager, remove interface. This commit renames the RuleManager to Manager as the package name is 'rules' now. The unused layer of abstraction of the RuleManager interface is removed.	2015-04-29 16:42:10 +02:00
Fabian Reinartz	3ca11bcaf5	Switch Prometheus to promql package. This commit removes all functionality from rules/ that is now handled in promql/. All parts of Prometheus are changed to use the promql/ package.	2015-04-28 16:19:23 +02:00
Fabian Reinartz	5015c2a0e8	Make target manager source based. This commit shifts responsibility for maintaining targets from providers and pools to the target manager. Target groups have a source name that identifies them for updates.	2015-04-24 15:49:35 +02:00
Tobias Schmidt	7d71d354fd	Remove special listing of config.file in usage. The -config.file parameter isn't required or any more special than the other flags. In order to avoid confusion, this change removes the special mention again. Instead, the error message if a config file couldn't be loaded is changed to mention the flag name.	2015-04-08 17:36:15 -04:00
Tobias Schmidt	35a44509fb	Improve readability of usage text. Separates flag and description by a newline to make it easier to read the flags with long descriptions.	2015-04-08 17:33:25 -04:00
Fabian Reinartz	c012ca6039	Make help output readable. This commit increases the usability by grouping flags based on their first dot-separated group. Long flag descriptions are broken into lines printed with indentation.	2015-04-08 12:41:49 +02:00
Björn Rabenstein	d8e515e9cb	Merge pull request #617 from prometheus/influxdb-write-support Add experimental InfluxDB write support.	2015-04-07 13:23:06 +02:00
Ceesjan Luiten	0e18784c64	Make all paths absolute to support proxies	2015-04-02 20:36:47 +02:00
Julius Volz	593e565688	Allow writing to InfluxDB/OpenTSDB at the same time.	2015-04-02 20:24:38 +02:00
Julius Volz	61fb688dd9	Add experimental InfluxDB write support.	2015-04-01 02:03:16 +02:00
Julius Volz	33702da8a8	Use simple Now() func in API instead of utility.Time.	2015-03-27 23:43:47 +01:00
Julius Volz	3f2686d0b3	Remove unused fields from MetricsService.	2015-03-27 18:51:13 +01:00
beorn7	12ae6e9203	Increase resilience of the storage against data corruption - step 4. Step 4: Add a configurable sync'ing of series files after modification.	2015-03-19 15:58:02 +01:00
beorn7	11bd9ce1bd	Increase resilience of the storage against data corruption - step 3. Step 3: Remember the mtime of series files and make use of it to detect series files that are not the one the checkpoint thinks they are.	2015-03-19 15:44:11 +01:00
beorn7	e25cca823c	Increase resilience of the storage against data corruption - step 2. Step 2: Add a flag -storage.local.pedantic-checks to check every series file. Also, remove countPersistedHeadChunks channel, which is unused.	2015-03-19 12:06:15 +01:00
beorn7	da7c0461c6	Rename persist queue len/cap to num/max chunks to persist. Remove deprecated flag storage.incoming-samples-queue-capacity.	2015-03-18 19:36:41 +01:00
beorn7	be11cb2b07	Remove the sample ingestion channel. The one central sample ingestion channel has caused a variety of trouble. This commit removes it. Targets and rule evaluation call an Append method directly now. To incorporate multiple storage backends (like OpenTSDB), storage.Tee forks the Append into two different appenders. Note that the tsdb queue manager had its own queue anyway. It was a queue after a queue... Much queue, so overhead... Targets have their own little buffer (implemented as a channel) to avoid stalling during an http scrape. But a new scrape will only be started once the old one is fully ingested. The contraption of three pipelined ingesters was removed. A Target is an ingester itself now. Despite more logic in Target, things should be less confusing now. Also, remove lint and vet warnings in ast.go.	2015-03-15 14:08:22 +01:00
beorn7	0056eaeb4f	Redesign series maintenance and chunk persistence.	2015-03-14 22:05:23 +01:00
beorn7	5bea942d8e	Improve various things around chunk encoding. A number of mostly minor things: - Rename chunk type -> chunk encoding. - After all, do not carry around the chunk encoding to all parts of the system, but just have one place where the encoding for new chunks is set based on the flag. The new approach has caveats as well, but the pollution of so many method signatures is worse. - Use the default chunk encoding for new chunks of existing series. (Previously, only new _series_ would get chunks with the default encoding.) - Use an enum for chunk encoding. (But keep the version number for the flag, for reasons discussed previously.) - Add encoding() to the chunk interface (so that a chunk knows its own encoding - no need to have that in a different top-level function). - Got rid of newFollowUpChunk (which would keep the existing encoding for all chunks of a time series). Now only use newChunk(), which will create a chunk encoding according to the flag. - Simplified transcodeAndAdd. - Reordered methods of deltaEncodedChunk and doubleDeltaEncoded chunk to match the order in the chunk interface. - Only transcode if the chunk is not yet half full. If more than half full, add a new chunk instead.	2015-03-14 19:03:20 +01:00
beorn7	66e768f05e	Improve docstring for chunk type flag.	2015-03-06 17:04:07 +01:00
beorn7	13fcf1ddbc	Implement double-delta encoded chunks.	2015-03-05 20:33:26 +01:00
beorn7	af91fb8e31	Improve persisting chunks to disk. This is done by bucketing chunks by fingerprint. If the persisting to disk falls behind, more and more chunks are in the queue. As soon as there are "double hits", we will now persist both chunks in one go, doubling the disk throughput (assuming it is limited by disk seeks). Should even more pile up so that we end up with "triple hits", we will persist those first, and so on. Even if we have millions of time series, this will still help, assuming not all of them are growing with the same speed. Series that get many samples and/or are not very compressible will accumulate chunks faster, and they will soon get double- or triple-writes. To improve the chance of double writes, -storage.local.persistence-queue-capacity could be set to a higher value. However, that will slow down shutdown a lot (as the queue has to be worked through). So we leave it to the user to set it to a really high value. A more fundamental solution would be to checkpoint not only head chunks, but also chunks still in the persist queue. That would be quite complicated for a rather limited use-case (running many time series with high ingestion rate on slow spinning disks).	2015-02-17 16:02:09 +01:00
beorn7	8a1c195b54	Move emptiness check to the receivers.	2015-02-12 19:47:24 +01:00
beorn7	11b3c2387c	Improvements after review. - Increase samplesQueueCapacity. - Improve docstring for the above. - Accept a short waiting period for the ingest channel to become ready. This should depend on the http timeout, but 100ms is probably good enough to cushion bursts bigger than samplesQueueCapacity, while it is unlikely that anybody ever will set an HTTP timeout similarly short.	2015-02-10 14:58:46 +01:00
beorn7	d2ab49c396	Make the persist queue length configurable. Also, set a much higher default value. Chunk persist requests can be quite spiky. If you collect a large number of time series that are very similar, they will tend to finish up a chunk at about the same time. There is no reason we need to back up scraping just because of that. The rationale of the new default value is "1/8 of the chunks in memory".	2015-02-06 14:54:53 +01:00
Bjoern Rabenstein	5859b74f1b	Clean up license issues. - Move CONTRIBUTORS.md to the more common AUTHORS. - Added the required NOTICE file. - Changed "Prometheus Team" to "The Prometheus Authors". - Reverted the erroneous changes to the Apache License.	2015-01-21 20:07:45 +01:00
juliusv	cca2e58f20	Merge pull request #442 from prometheus/beorn7/fix-crash-recovery Fix ALL the crash-recovery related problems.	2015-01-09 10:56:02 +01:00
Bjoern Rabenstein	0851945054	Add a heuristic to checkpoint early if there are many "dirty" series.	2015-01-08 20:15:58 +01:00
Julius Volz	d6b9e97655	Remove extraction.Result type, simplify code.	2015-01-08 16:34:01 +01:00
Bjoern Rabenstein	b1e4956142	Apply a giant code cleanup. Essentially: - Remove unused code. - Make it 'go vet' clean. The only remaining warnings are in generated code. - Make it 'golint' clean. The only remaining warnings are in generated code. - Smoothed out some minor things. Change-Id: I3fe5c1fbead27b0e7a9c247fee2f5a45bc2d42c6	2014-12-10 16:16:49 +01:00

165 Commits (b105e26f4d1bb64e2025a211c1251782f6790aa8)