Commit Graph

311 Commits (3215e8535ae8afcf850fbaac9df7ff9abe42f9e0)

Author SHA1 Message Date
Wojciech Tyczynski 33e612e101 Revert "cacher.go: embed storage.Interface into cacher" 2016-07-22 07:28:45 +02:00
k8s-merge-robot 42a1deeff3 Merge pull request #26861 from xiang90/embed
Automatic merge from submit-queue

cacher.go: embed storage.Interface into cacher

Continuous effort to simplify cacher implementation.
2016-07-21 09:50:19 -07:00
Xiang Li 44c0a1190c cacher.go: embed storage.Interface into cacher 2016-07-16 23:25:48 -07:00
Davanum Srinivas 2b0ed014b7 Use Go canonical import paths
Add canonical imports only in existing doc.go files.
https://golang.org/doc/go1.4#canonicalimports

Fixes #29014
2016-07-16 13:48:21 -04:00
Hongchao Deng 54025ce8b3 etcd3/store: Add test for compact conflict 2016-07-15 10:24:50 -07:00
Hongchao Deng 186b4858b4 better handle etcd compaction in multi-apiserver 2016-07-15 10:24:49 -07:00
Jordan Liggitt 4fcd999c25
Fix watch cache filtering 2016-07-14 13:13:17 -04:00
joe2far 5ead89b5bb Fixed several typos 2016-07-13 15:06:24 +01:00
Wojciech Tyczynski 1d9bc58328 Extend Filter interface with Trigger() and use it for pods and nodes 2016-07-13 08:45:18 +02:00
Wojciech Tyczynski 7f7ef0879f Change filter to interface in storage.Interface 2016-07-13 08:44:22 +02:00
Jordan Liggitt d4d6a0ee4c
Allow starting test etcd with http 2016-07-10 22:28:01 -04:00
Jordan Liggitt e43e58c787
Allow specifying base location for test etcd data 2016-07-08 16:24:41 -04:00
Xiang Li aa472ff734 cacher: replace usable lock with conditional variable 2016-07-04 08:57:59 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
k8s-merge-robot 00b5b548d6 Merge pull request #26854 from xiang90/cacher
Automatic merge from submit-queue

cacher.go: remove NewCacher func

NewCacher is a wrapper of NewCacherFromConfig. NewCacher understands
how to create a key func from scopeStrategy. However, it is not the
responsibility of cacher. So we should remove this function, and
construct the config in its caller, which should understand scopeStrategy.
2016-06-25 11:10:06 -07:00
k8s-merge-robot b71e499c92 Merge pull request #26502 from gyuho/remove_name_field
Automatic merge from submit-queue

pkg/storage/etcd3: remove name field in test

Current test gets the name with its test table index,
so there seems to be no reason to have name field in test table.
2016-06-25 07:09:36 -07:00
k8s-merge-robot e65965fca6 Merge pull request #26855 from xiang90/cacher_rm
Automatic merge from submit-queue

cacher: remove unnecessary initialzation
2016-06-24 13:56:47 -07:00
Wojciech Tyczynski 394db3b407 Fix traces 2016-06-06 10:49:46 +02:00
Xiang Li c530a5810a cacher: remove unnecessary initialzation 2016-06-04 22:49:45 -07:00
Xiang Li e2aab093aa cacher.go: remove NewCacher func
NewCacher is a wrapper of NewCacherFromConfig. NewCacher understands
how to create a key func from scopeStrategy. However, it is not the
responsibility of cacher. So we should remove this function, and
construct the config in its caller, which should understand scopeStrategy.
2016-06-04 22:46:58 -07:00
Gyu-Ho Lee b8740c2c67 pkg/storage/etcd3: remove name field in test
Current test gets the name with its test table index,
so there seems to be no reason to have name field in test table.
2016-05-29 02:17:35 -07:00
k8s-merge-robot ae28564526 Merge pull request #26386 from janetkuo/etcd-test-flake
Automatic merge from submit-queue

Increase etcd test server up timeout and disallow returning nil server 

Fixes #25047

[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
2016-05-27 05:49:51 -07:00
k8s-merge-robot 8bfaec4b59 Merge pull request #26334 from fejta/own
Automatic merge from submit-queue

Create pkg/storage/OWNERS

Fixes #26118
2016-05-26 16:10:03 -07:00
Janet Kuo 672cd64035 Disallow returning nil server in NewEtcdTestClientServer 2016-05-26 14:37:12 -07:00
Janet Kuo 3cc2311c54 Increase etcd test server up timeout to wait.ForeverTestTimeout 2016-05-26 14:35:57 -07:00
Alex Mohr edda837142 Merge pull request #25599 from caesarxuchao/orphaning-finalizer
Add orphaning finalizer logic to GC
2016-05-26 13:19:19 -07:00
Erick Fejta f4d07ddf7d Create pkg/storage/OWNERS 2016-05-25 21:13:10 -07:00
Chao Xu 1665546d2d add finalizer logics to the API server and the garbage collector; handling DeleteOptions.OrphanDependents in the API server 2016-05-24 13:07:28 -07:00
deads2k 02c0181f26 reduce conflict retries 2016-05-23 13:09:37 -04:00
Hongchao Deng ae6166f97d etcd3/compactor: fix logging endpoints 2016-05-20 14:35:42 -07:00
Wojciech Tyczynski 9784dff94e Merge pull request #25871 from smarterclayton/retry_on_error
Fix the Retry-After code path to work for clients, and send correct bodies
2016-05-19 16:18:18 +02:00
k8s-merge-robot 044d55ed7d Merge pull request #24142 from rrati/controller-sync-interval-23394
Automatic merge from submit-queue

Separated resync and relist functionality in reflector #23394

controller-manager #23394
2016-05-19 07:10:00 -07:00
k8s-merge-robot d89d45a861 Merge pull request #25266 from smarterclayton/common_storage
Automatic merge from submit-queue

kube-apiserver options should be decoupled from impls

A few months ago we refactored options to keep it independent of the
implementations, so that it could be used in CLI tools to validate
config or to generate config, without pulling in the full dependency
tree of the master.  This change restores that by separating
server_run_options.go back to its own package.

Also, options structs should never contain non-serializable types, which
storagebackend.Config was doing with runtime.Codec. Split the codec out.

Fix a typo on the name of the etcd2.go storage backend.

Finally, move DefaultStorageMediaType to server_run_options.

@nikhiljindal as per my comment in #24454, @liggitt because you and I
discussed this last time
2016-05-19 06:13:38 -07:00
Clayton Coleman e5fbf86157
Allow StatusErrors to be modified after creation 2016-05-19 09:08:53 -04:00
Robert Rati e388c137bb Separate sync and list functionality in the reflector. #23394 2016-05-19 07:41:24 -04:00
Clayton Coleman 633683c08d
kube-apiserver options should be decoupled from impls
A few months ago we refactored options to keep it independent of the
implementations, so that it could be used in CLI tools to validate
config or to generate config, without pulling in the full dependency
tree of the master.  This change restores that by separating
server_run_options.go back to its own package.

Also, options structs should never contain non-serializable types, which
storagebackend.Config was doing with runtime.Codec. Split the codec out.

Fix a typo on the name of the etcd2.go storage backend.

Finally, move DefaultStorageMediaType to server_run_options.
2016-05-18 10:39:21 -04:00
Tim Hockin 152c86ab06 Make name validators return string slices 2016-05-18 00:48:01 -07:00
Wojciech Tyczynski 7452bc3f5b Tune traces thresholds in etcdHelper 2016-05-16 11:10:31 +02:00
k8s-merge-robot be104a8271 Merge pull request #25415 from hongchaodeng/e2
Automatic merge from submit-queue

etcd_watcher: make Deleted Event.Object's version consistent

### What's the problem?

In [sendDelete()](995f022808/pkg/storage/etcd/etcd_watcher.go (L437-L442)), Deleted Event.Object's resource version will be set to the latest resource version. This is actually an assumption made by cacher that all later events should have objects with larger verions; See [here](995f022808/pkg/storage/cacher.go (L579-L581)).

In sendModify(), it could also return Deleted event. However, the resource version is still the old version, which is inconsistent behavior with sendDelete().

### What's this PR?
This PR sets the version of oldObj in sendModify() to be the latest version.
Provided unit test for it.
2016-05-14 20:46:20 -07:00
Matt Liggett 2bc46d5085 It's 2016, yo. 2016-05-13 12:41:40 -07:00
Hongchao Deng 5337bc220a etcd_watcher: test for ensuring delete event have latest index 2016-05-12 11:01:26 -07:00
Hongchao Deng fcf63a6c4b etcd_watcher: make Deleted Event.Object's version consistent 2016-05-12 11:01:26 -07:00
k8s-merge-robot aa8fddb7d9 Merge pull request #25369 from liggitt/cached-watch
Automatic merge from submit-queue

Return 'too old' errors from watch cache via watch stream

Fixes #25151

This PR updates the API server to produce the same results when a watch is attempted with a resourceVersion that is too old, regardless of whether the etcd watch cache is enabled. The expected result is a `200` http status, with a single watch event of type `ERROR`. Previously, the watch cache would deliver a `410` http response.

This is the uncached watch impl:
```
// Implements storage.Interface.
func (h *etcdHelper) WatchList(ctx context.Context, key string, resourceVersion string, filter storage.FilterFunc) (watch.Interface, error) {
	if ctx == nil {
		glog.Errorf("Context is nil")
	}
	watchRV, err := storage.ParseWatchResourceVersion(resourceVersion)
	if err != nil {
		return nil, err
	}
	key = h.prefixEtcdKey(key)
	w := newEtcdWatcher(true, h.quorum, exceptKey(key), filter, h.codec, h.versioner, nil, h)
	go w.etcdWatch(ctx, h.etcdKeysAPI, key, watchRV)
	return w, nil
}
```

once the resourceVersion parses, there is no path that returns a direct error, so all errors would have to be returned as an `ERROR` event via the ResultChan().
2016-05-10 18:15:15 -07:00
Hongchao Deng cd3f7f41c1 etcd3/watcher: refactor test 2016-05-10 12:37:31 -07:00
Hongchao Deng da7e9783e8 etcd3/watcher: Event.Object should have the same rev as etcd delete
instead of previous object's revision.
2016-05-10 10:53:53 -07:00
Jordan Liggitt f80b59ba87 Return 'too old' errors from watch cache via watch stream 2016-05-10 10:59:53 -04:00
Hongchao Deng 97f4647dc3 etcd3/watcher: fix goroutine leak if ctx is canceled
In reflector.go, it could probably call Stop() without retrieving all results
from ResultChan().
A potential leak is that when an error has happened, it could block on resultChan,
and then cancelling context in Stop() wouldn't unblock it.
This fixes the problem by making it also select ctx.Done and cancel context
afterwards if error happened.
2016-05-09 09:06:11 -07:00
Robert Bailey 24c69f0b14 Merge pull request #25211 from lavalamp/leak
Never leak the etcd watcher's translate goroutine
2016-05-06 15:03:21 -07:00
Daniel Smith 995f022808 Never leak the etcd watcher's translate goroutine 2016-05-05 11:19:12 -07:00
Hongchao Deng 3144ebc7fc start etcd compactor in background 2016-05-04 16:01:03 -07:00
Hongchao Deng 84c07b0bbf etcd3/store: userUpdate error should be returned 2016-05-03 14:42:29 +08:00
Clayton Coleman fdb110c859
Fix the rest of the code 2016-04-29 17:12:10 -04:00
k8s-merge-robot 11298d02e0 Merge pull request #24455 from hongchaodeng/fl
Automatic merge from submit-queue

Provide flags to use etcd3 backed storage

ref: #24405

What's in this PR?
- Add a new flag "storage-backend" to choose "etcd2" or "etcd3". By default (i.e. empty), it's "etcd2".
- Take out etcd config code into a standalone package and let it create etcd2 or etcd3 storage backend given user input.
2016-04-29 08:49:04 -07:00
k8s-merge-robot d0b887e4e0 Merge pull request #24595 from zhouhaibing089/httpserverclose
Automatic merge from submit-queue

Uncomment the code that caused by #19254

Fix https://github.com/kubernetes/kubernetes/issues/24546.

@lavalamp
2016-04-28 01:41:16 -07:00
Hongchao Deng c0071a1595 add flags to enable etcd3 2016-04-28 09:48:16 +08:00
k8s-merge-robot 28bc4b32c2 Merge pull request #24532 from rsc/master
Automatic merge from submit-queue

apiserver latency reductions

Combined effect of these two commits on the latency observed by the 1000-node kubemark benchmark:

```
name                               old ms/op  new ms/op   delta
LIST_nodes_p50                      127 ±16%    121 ± 9%   -4.58%  (p=0.000 n=29+27)
LIST_nodes_p90                      326 ±12%    266 ±12%  -18.48%  (p=0.000 n=29+27)
LIST_nodes_p99                      453 ±11%    400 ±14%  -11.79%  (p=0.000 n=29+28)
LIST_replicationcontrollers_p50    29.4 ±49%   26.2 ±54%     ~     (p=0.085 n=30+29)
LIST_replicationcontrollers_p90    83.0 ±78%   68.6 ±59%  -17.33%  (p=0.013 n=30+28)
LIST_replicationcontrollers_p99     216 ±43%    177 ±49%  -17.68%  (p=0.000 n=29+29)
DELETE_pods_p50                    24.5 ±14%   24.3 ±13%     ~     (p=0.562 n=30+29)
DELETE_pods_p90                    30.7 ± 1%   30.7 ± 1%   -0.30%  (p=0.011 n=29+29)
DELETE_pods_p99                    77.2 ±34%   54.2 ±23%  -29.76%  (p=0.000 n=30+27)
PUT_replicationcontrollers_p50     5.86 ±26%   5.94 ±32%     ~     (p=0.734 n=29+29)
PUT_replicationcontrollers_p90     15.8 ± 7%   15.5 ± 6%   -2.06%  (p=0.010 n=29+29)
PUT_replicationcontrollers_p99     57.8 ±35%   39.5 ±55%  -31.60%  (p=0.000 n=29+29)
PUT_nodes_p50                      14.9 ± 2%   14.8 ± 2%   -0.68%  (p=0.012 n=30+27)
PUT_nodes_p90                      16.5 ± 1%   16.3 ± 2%   -0.90%  (p=0.000 n=27+28)
PUT_nodes_p99                      57.9 ±47%   41.3 ±35%  -28.61%  (p=0.000 n=30+28)
POST_replicationcontrollers_p50    6.35 ±29%   6.34 ±20%     ~     (p=0.944 n=30+28)
POST_replicationcontrollers_p90    15.4 ± 5%   15.0 ± 5%   -2.18%  (p=0.001 n=29+29)
POST_replicationcontrollers_p99    52.2 ±71%   32.9 ±46%  -36.99%  (p=0.000 n=29+27)
POST_pods_p50                      8.99 ±13%   8.95 ±16%     ~     (p=0.903 n=30+29)
POST_pods_p90                      16.2 ± 4%   16.1 ± 4%     ~     (p=0.287 n=29+29)
POST_pods_p99                      30.9 ±21%   26.4 ±12%  -14.73%  (p=0.000 n=28+28)
POST_bindings_p50                  9.34 ±12%   8.92 ±15%   -4.54%  (p=0.013 n=30+28)
POST_bindings_p90                  16.6 ± 1%   16.5 ± 3%   -0.73%  (p=0.017 n=28+29)
POST_bindings_p99                  23.5 ± 9%   21.1 ± 4%  -10.09%  (p=0.000 n=27+28)
PUT_pods_p50                       10.8 ±11%   10.2 ± 5%   -5.47%  (p=0.000 n=30+27)
PUT_pods_p90                       16.1 ± 1%   16.0 ± 1%   -0.64%  (p=0.000 n=29+28)
PUT_pods_p99                       23.4 ± 9%   20.9 ± 9%  -10.93%  (p=0.000 n=28+27)
DELETE_replicationcontrollers_p50  2.42 ±16%   2.50 ±13%     ~     (p=0.054 n=29+28)
DELETE_replicationcontrollers_p90  11.5 ±12%   11.8 ±13%     ~     (p=0.141 n=30+28)
DELETE_replicationcontrollers_p99  19.5 ±21%   19.1 ±21%     ~     (p=0.397 n=29+29)
GET_nodes_p50                      0.77 ±10%   0.76 ±10%     ~     (p=0.317 n=28+28)
GET_nodes_p90                      1.20 ±16%   1.14 ±24%   -4.66%  (p=0.036 n=28+29)
GET_nodes_p99                      11.4 ±48%    7.5 ±46%  -34.28%  (p=0.000 n=28+29)
GET_replicationcontrollers_p50     0.74 ±17%   0.73 ±17%     ~     (p=0.222 n=30+28)
GET_replicationcontrollers_p90     1.04 ±25%   1.01 ±27%     ~     (p=0.231 n=30+29)
GET_replicationcontrollers_p99     12.1 ±81%  10.0 ±145%     ~     (p=0.063 n=28+29)
GET_pods_p50                       0.78 ±12%   0.77 ±10%     ~     (p=0.178 n=30+28)
GET_pods_p90                       1.06 ±19%   1.02 ±19%     ~     (p=0.120 n=29+28)
GET_pods_p99                       3.92 ±43%   2.45 ±38%  -37.55%  (p=0.000 n=27+25)
LIST_services_p50                  0.20 ±13%   0.20 ±16%     ~     (p=0.854 n=28+29)
LIST_services_p90                  0.28 ±15%   0.27 ±14%     ~     (p=0.219 n=29+28)
LIST_services_p99                  0.49 ±20%   0.47 ±24%     ~     (p=0.140 n=29+29)
LIST_endpoints_p50                 0.19 ±14%   0.19 ±15%     ~     (p=0.709 n=29+29)
LIST_endpoints_p90                 0.26 ±16%   0.26 ±13%     ~     (p=0.274 n=29+28)
LIST_endpoints_p99                 0.46 ±24%   0.44 ±21%     ~     (p=0.111 n=29+29)
LIST_horizontalpodautoscalers_p50  0.16 ±15%   0.15 ±13%     ~     (p=0.253 n=30+27)
LIST_horizontalpodautoscalers_p90  0.22 ±24%   0.21 ±16%     ~     (p=0.152 n=30+28)
LIST_horizontalpodautoscalers_p99  0.31 ±33%   0.31 ±38%     ~     (p=0.817 n=28+29)
LIST_daemonsets_p50                0.16 ±20%   0.15 ±11%     ~     (p=0.135 n=30+27)
LIST_daemonsets_p90                0.22 ±18%   0.21 ±25%     ~     (p=0.135 n=29+28)
LIST_daemonsets_p99                0.29 ±28%   0.29 ±32%     ~     (p=0.606 n=28+28)
LIST_jobs_p50                      0.16 ±16%   0.15 ±12%     ~     (p=0.375 n=29+28)
LIST_jobs_p90                      0.22 ±18%   0.21 ±16%     ~     (p=0.090 n=29+26)
LIST_jobs_p99                      0.31 ±28%   0.28 ±35%  -10.29%  (p=0.005 n=29+27)
LIST_deployments_p50               0.15 ±16%   0.15 ±13%     ~     (p=0.565 n=29+28)
LIST_deployments_p90               0.22 ±22%   0.21 ±19%     ~     (p=0.107 n=30+28)
LIST_deployments_p99               0.31 ±27%   0.29 ±34%     ~     (p=0.068 n=29+28)
LIST_namespaces_p50                0.21 ±25%   0.21 ±26%     ~     (p=0.768 n=29+27)
LIST_namespaces_p90                0.28 ±29%   0.26 ±25%     ~     (p=0.101 n=30+28)
LIST_namespaces_p99                0.30 ±48%   0.29 ±42%     ~     (p=0.339 n=30+29)
LIST_replicasets_p50               0.15 ±18%   0.15 ±16%     ~     (p=0.612 n=30+28)
LIST_replicasets_p90               0.22 ±19%   0.21 ±18%   -5.13%  (p=0.011 n=28+27)
LIST_replicasets_p99               0.31 ±39%   0.28 ±29%     ~     (p=0.066 n=29+28)
LIST_persistentvolumes_p50         0.16 ±23%   0.15 ±21%     ~     (p=0.124 n=30+29)
LIST_persistentvolumes_p90         0.21 ±23%   0.20 ±23%     ~     (p=0.092 n=30+25)
LIST_persistentvolumes_p99         0.21 ±24%   0.20 ±23%     ~     (p=0.053 n=30+25)
LIST_resourcequotas_p50            0.16 ±12%   0.16 ±13%     ~     (p=0.175 n=27+28)
LIST_resourcequotas_p90            0.20 ±22%   0.20 ±24%     ~     (p=0.388 n=30+28)
LIST_resourcequotas_p99            0.22 ±24%   0.22 ±23%     ~     (p=0.575 n=30+28)
LIST_persistentvolumeclaims_p50    0.15 ±21%   0.15 ±29%     ~     (p=0.079 n=30+28)
LIST_persistentvolumeclaims_p90    0.19 ±26%   0.18 ±34%     ~     (p=0.446 n=29+29)
LIST_persistentvolumeclaims_p99    0.19 ±26%   0.18 ±34%     ~     (p=0.446 n=29+29)
LIST_pods_p50                      68.0 ±16%   56.3 ± 9%  -17.19%  (p=0.000 n=29+28)
LIST_pods_p90                       119 ±19%     93 ± 8%  -21.88%  (p=0.000 n=28+28)
LIST_pods_p99                       230 ±18%    202 ±14%  -12.13%  (p=0.000 n=27+28)
```
2016-04-27 08:32:18 -07:00
zhouhaibing089 bf1a3f99c0 Uncomment the code that cause by #19254 2016-04-25 23:21:31 +08:00
Hongchao Deng b0f4517e65 etcd3/watcher: cancelling context shouldn't return error 2016-04-22 12:23:04 +08:00
Russ Cox 6a19e46ed6 pkg/storage: cache timers
A previous change here replaced time.After with an explicit
timer that can be stopped, to avoid filling up the active timer list
with timers that are no longer needed. But an even better fix is to
reuse the timers across calls, to avoid filling the allocated heap
with work for the garbage collector. On top of that, try a quick
non-blocking send to avoid the timer entirely.

For the e2e 1000-node kubemark test, basically everything gets faster,
some things significantly so. The 90th and 99th percentile for LIST nodes
in particular are the worst case that has caused SLO/SLA problems
in the past, and this reduces 99th percentile by 10%.

name                               old ms/op  new ms/op   delta
LIST_nodes_p50                      127 ±16%    124 ±13%     ~     (p=0.136 n=29+29)
LIST_nodes_p90                      326 ±12%    278 ±15%  -14.85%  (p=0.000 n=29+29)
LIST_nodes_p99                      453 ±11%    405 ±19%  -10.70%  (p=0.000 n=29+28)
LIST_replicationcontrollers_p50    29.4 ±49%   26.6 ±43%     ~     (p=0.176 n=30+29)
LIST_replicationcontrollers_p90    83.0 ±78%   68.7 ±63%  -17.30%  (p=0.020 n=30+29)
LIST_replicationcontrollers_p99     216 ±43%    173 ±41%  -19.53%  (p=0.000 n=29+28)
DELETE_pods_p50                    24.5 ±14%   24.3 ±17%     ~     (p=0.562 n=30+28)
DELETE_pods_p90                    30.7 ± 1%   30.6 ± 0%   -0.44%  (p=0.000 n=29+28)
DELETE_pods_p99                    77.2 ±34%   56.3 ±27%  -26.99%  (p=0.000 n=30+28)
PUT_replicationcontrollers_p50     5.86 ±26%   5.83 ±36%     ~     (p=1.000 n=29+28)
PUT_replicationcontrollers_p90     15.8 ± 7%   15.9 ± 6%     ~     (p=0.936 n=29+28)
PUT_replicationcontrollers_p99     57.8 ±35%   56.7 ±41%     ~     (p=0.725 n=29+28)
PUT_nodes_p50                      14.9 ± 2%   14.9 ± 1%   -0.55%  (p=0.020 n=30+28)
PUT_nodes_p90                      16.5 ± 1%   16.4 ± 2%   -0.60%  (p=0.040 n=27+28)
PUT_nodes_p99                      57.9 ±47%   44.6 ±42%  -23.02%  (p=0.000 n=30+29)
POST_replicationcontrollers_p50    6.35 ±29%   6.33 ±23%     ~     (p=0.957 n=30+28)
POST_replicationcontrollers_p90    15.4 ± 5%   15.2 ± 6%   -1.14%  (p=0.034 n=29+28)
POST_replicationcontrollers_p99    52.2 ±71%   53.4 ±52%     ~     (p=0.720 n=29+27)
POST_pods_p50                      8.99 ±13%   9.33 ±13%   +3.79%  (p=0.023 n=30+29)
POST_pods_p90                      16.2 ± 4%   16.3 ± 4%     ~     (p=0.113 n=29+29)
POST_pods_p99                      30.9 ±21%   28.4 ±23%   -8.26%  (p=0.001 n=28+29)
POST_bindings_p50                  9.34 ±12%   8.98 ±17%     ~     (p=0.083 n=30+29)
POST_bindings_p90                  16.6 ± 1%   16.5 ± 2%   -0.76%  (p=0.000 n=28+26)
POST_bindings_p99                  23.5 ± 9%   21.4 ± 5%   -8.98%  (p=0.000 n=27+27)
PUT_pods_p50                       10.8 ±11%   10.3 ± 5%   -4.67%  (p=0.000 n=30+28)
PUT_pods_p90                       16.1 ± 1%   16.0 ± 1%   -0.55%  (p=0.003 n=29+29)
PUT_pods_p99                       23.4 ± 9%   21.6 ±14%   -8.03%  (p=0.000 n=28+28)
DELETE_replicationcontrollers_p50  2.42 ±16%   2.50 ±13%     ~     (p=0.072 n=29+29)
DELETE_replicationcontrollers_p90  11.5 ±12%   11.7 ±10%     ~     (p=0.190 n=30+28)
DELETE_replicationcontrollers_p99  19.5 ±21%   19.0 ±22%     ~     (p=0.298 n=29+28)
GET_nodes_p90                      1.20 ±16%   1.18 ±19%     ~     (p=0.626 n=28+29)
GET_nodes_p99                      11.4 ±48%    8.3 ±40%  -27.31%  (p=0.000 n=28+28)
GET_replicationcontrollers_p90     1.04 ±25%   1.03 ±21%     ~     (p=0.682 n=30+29)
GET_replicationcontrollers_p99     12.1 ±81%  10.0 ±123%     ~     (p=0.135 n=28+28)
GET_pods_p90                       1.06 ±19%   1.08 ±21%     ~     (p=0.597 n=29+29)
GET_pods_p99                       3.92 ±43%   2.81 ±39%  -28.39%  (p=0.000 n=27+28)
LIST_pods_p50                      68.0 ±16%   65.3 ±13%     ~     (p=0.066 n=29+29)
LIST_pods_p90                       119 ±19%    115 ±12%     ~     (p=0.091 n=28+27)
LIST_pods_p99                       230 ±18%    226 ±21%     ~     (p=0.251 n=27+28)
2016-04-21 15:53:47 -04:00
k8s-merge-robot 85de6acadc Merge pull request #23208 from deads2k/fix-version-override
Automatic merge from submit-queue

make storage enablement, serialization, and location orthogonal

This allows a caller (command-line, config, code) to specify multiple separate pieces of config information regarding storage and have them properly composed at runtime.  The information provided is exposed through interfaces to allow alternate implementations, which allows us to change the expression of the config moving forward.  I also fixed up the types to be correct as I moved through.

The same options still exist, but they're composed slightly differently
 1. specify target etcd servers per Group or per GroupResource
 1. specify storage GroupVersions per Groups or per GroupResource
 1. specify etcd prefixes per GroupVersion or per GroupResource
 1. specify that multiple GroupResources share the same location in etcd
 1. enable GroupResources by GroupVersion or by GroupResource whitelist or GroupResource blacklist

The `storage.Interface` is built per GroupResource by:
 1. find the set of possible storage GroupResource based on the priority list of cohabitators
 1. choose a GroupResource from the set by looking at which Groups have the resource enabled
 1. find the target etcd server, etcd prefix, and storage encoding based on the GroupResource

The API server can have its resources separately enabled, but for now I've kept them linked.

@liggitt I think we need this (or something like it) to be able to go from config to these interfaces.  Given another round of refactoring, we may be able to reshape these to be more forward driving.

@smarterclayton this is important for rebasing and for a seamless 1.2 to 1.3 migration for us.
2016-04-21 08:24:29 -07:00
k8s-merge-robot 0a5d57a383 Merge pull request #24079 from hongchaodeng/comp
Automatic merge from submit-queue

etcd3 store: provide compactor util

What's this PR?
- Provides a util to compact keys in etcd.

Reason:
We want to save the most recent 10 minutes event history. It should be more than enough for slow watchers. It is not number based, so it can tolerate event bursts too. We do not want to save longer since the current storage API cannot take advantage of the multi-version key yet. We might keep a longer history in the future.
2016-04-21 05:19:54 -07:00
deads2k 6670b73b18 make storage enablement, serialization, and location orthogonal 2016-04-21 08:18:55 -04:00
Hongchao Deng 2bc022aad4 watcher test: print more info for debugging 2016-04-21 06:56:50 +08:00
Hongchao Deng 46214c60bb etcd3/store: support TTL in Create, Update 2016-04-19 07:35:59 +08:00
Hongchao Deng e18b4e67be etcd3/store: watcher implementation 2016-04-18 21:41:53 +08:00
k8s-merge-robot 5f3f06f0b1 Merge pull request #24022 from hongchaodeng/dep
Automatic merge from submit-queue

Bump up etcd dependency to fix data race

ref: https://github.com/kubernetes/kubernetes/pull/23694

What this PR does
- Bumping up the godep of etcd to fix data race in etcd watcher. Without this change, watcher PR builds will fail in race detection.
- Small changes to fix builds after upgrade
2016-04-17 12:01:32 -07:00
k8s-merge-robot 2bf52175f9 Merge pull request #23923 from hongchaodeng/exp
Automatic merge from submit-queue

Decouple etcd node.expiration logic from DeleitonTimestamp

ref: https://github.com/kubernetes/kubernetes/issues/23902
2016-04-17 04:12:26 -07:00
k8s-merge-robot a275a045d1 Merge pull request #23914 from sky-uk/make-etcd-cache-size-configurable
Automatic merge from submit-queue

Make etcd cache size configurable

Instead of the prior 50K limit, allow users to specify a more sensible size for their cluster.

I'm not sure what a sensible default is here. I'm still experimenting on my own clusters. 50 gives me a 270MB max footprint. 50K caused my apiserver to run out of memory as it exceeded >2GB. I believe that number is far too large for most people's use cases.

There are some other fundamental issues that I'm not addressing here:
- Old etcd items are cached and potentially never removed (it stores using modifiedIndex, and doesn't remove the old object when it gets updated)
- Cache isn't LRU, so there's no guarantee the cache remains hot. This makes its performance difficult to predict. More of an issue with a smaller cache size.
- 1.2 etcd entries seem to have a larger memory footprint (I never had an issue in 1.1, even though this cache existed there). I suspect that's due to image lists on the node status.

This is provided as a fix for #23323
2016-04-17 00:06:31 -07:00
Hongchao Deng 9f43a110d9 file rename; refactor 2016-04-16 01:51:29 +08:00
Andy Goldstein 049e63d253 Honor starting resourceVersion in watch cache
Compare the requested resourceVersion to each event's resourceVersion to ensure events that occurred
in the past are not sent to the client.
2016-04-14 09:37:22 -04:00
Hongchao Deng b9745999c9 Decouple etcd node.expiration logic from DeleitonTimestamp 2016-04-13 15:11:53 -07:00
Daniel Smith 4c539bf082 Merge pull request #23490 from wojtek-t/remove_set_from_storage_interface
Remove Set() from storage.Interface.
2016-04-13 14:22:05 -07:00
Jordan Liggitt ada60236f7 Make watch cache behave like uncached watch 2016-04-12 10:14:07 -04:00
James Ravn 5bb0595260 Make deserialization cache size configurable
Instead of the default 50K entries, allow users to specify more sensible
sizes for their cluster.
2016-04-12 13:42:27 +01:00
Hongchao Deng ab9ac70e56 etcd3 store: provide compactor util 2016-04-09 11:01:27 -07:00
Hongchao Deng 71b46f3f57 fix build 2016-04-07 19:22:28 -07:00
Wojciech Tyczynski 53f433f019 Remove Set() from storage.Interface. 2016-04-04 17:54:18 +02:00
k8s-merge-robot f5c93c8ddc Merge pull request #23472 from wojtek-t/fix_object_meta_for
Automatic merge from submit-queue

Switch from api.ObjectMetaFor to meta.Accessor in most of places

Fix #23278

@smarterclayton @lavalamp
2016-04-02 02:33:40 -07:00
Wojciech Tyczynski 2699be2e7e Switch api.ObjetaMetaFor to meta.Accessor 2016-03-31 17:52:31 +02:00
Hongchao Deng 00ddf0671d etcd (v3) store: implements KV methods of storage.Interface
This implements Get(), Create(), Delete(), GetToList(),
List(), GuaranteedUpdate().
2016-03-30 10:20:39 -07:00
Chao Xu 31b425b3a1 add delete precondition 2016-03-25 11:21:39 -07:00
k8s-merge-robot 4e4ad61260 Merge pull request #23366 from goltermann/vet
Auto commit by PR queue bot
2016-03-24 21:50:56 -07:00
k8s-merge-robot 2777cd7e75 Merge pull request #23295 from hongchaodeng/error
Auto commit by PR queue bot
2016-03-23 02:27:36 -07:00
k8s-merge-robot 4af38b52b9 Merge pull request #22736 from resouer/fix-util-dev
Auto commit by PR queue bot
2016-03-22 19:54:58 -07:00
goltermann 34d4eaea08 Fixing several (but not all) go vet errors. Most are around string formatting, or unreachable code. 2016-03-22 17:26:50 -07:00
Hongchao Deng 189ce6e397 storage: add custom storage error 2016-03-22 08:19:16 -07:00
k8s-merge-robot 2bb6f74bf9 Merge pull request #23099 from shawnps/patch-12
Auto commit by PR queue bot
2016-03-21 09:19:21 -07:00
harry 26dad2c428 Update generated docs 2016-03-21 15:36:24 +08:00
harry b6924a322a Refactor cache into util sub pkg 2016-03-21 14:50:57 +08:00
k8s-merge-robot 08c706a8ab Merge pull request #23194 from hongchaodeng/dep
Auto commit by PR queue bot
2016-03-19 06:35:17 -07:00
Hongchao Deng 0a1ff0bb0b fix EtcdTestServer 2016-03-18 23:39:48 -07:00
Russ Cox e4b369e1d7 storage: clean up timer in cacheWatcher.add
In the e2e benchmarks, this timer is a significant source of garbage
and stale timers. Because the timer is not stopped after its use
in the select, it stays in the timer heap until it eventually fires
(5 seconds later). Under load, a lot of 5-second timers can pile up
before any start going away. The timer heap being large makes timer
operations take longer; the operations are O(log N) but N is still big.

The way to fix this in current versions of Go is to stop the underlying
timer explicitly, which this CL does for this one case.

There are many other places in the code that use the same idiom,
but those do not show up on profiles of the e2e server.
I am investigating changes for Go 1.7's runtime that would make
the old code behave like this new code transparently, so I don't
think it's worth updating any uses of the idiom that are not in
hot spots found with profiling.

Measuring 'LIST nodes' latency in milliseconds during e2e test
shows the benefit of this change.

Using Go 1.4.2:

BEFORE  p50: 148±7   p90: 328±19  p99: 513±29  n: 10
AFTER   p50: 151±8   p90: 339±19  p99: 479±20  n: 9

Using Go 1.6.0:

BEFORE  p50: 141±9   p90: 383±32  p99: 604±44  n: 11
AFTER   p50: 140±14  p90: 360±31  p99: 483±39  n: 10
2016-03-18 15:58:34 -04:00
Shawn Smith 0ea3e43f1c use Fatalf 2016-03-17 15:18:04 +09:00
deads2k ab03317d96 support CIDRs in NO_PROXY 2016-03-16 16:22:54 -04:00
Jordan Liggitt a1c2267f20 Decrease parallelism in deletecollection test, lengthen test etcd certs 2016-03-12 18:30:12 -05:00
AdoHe 7228b9b987 restore ability to run against secured etcd 2016-03-11 11:21:16 -05:00
Fabio Yeon d25449d58e Merge pull request #21310 from wojtek-t/require_versioner
Require versioner in etcdHelper to be non-null.
2016-02-26 15:44:59 -08:00
k8s-merge-robot 90d1276507 Merge pull request #21223 from hongchaodeng/fix
Auto commit by PR queue bot
2016-02-22 07:41:45 -08:00
Daniel Smith 3fb020b28d Fix a locking bug in the cacher. 2016-02-19 17:45:02 -08:00
k8s-merge-robot 17325ef6ef Merge pull request #20501 from piosz/hpa-ga
Auto commit by PR queue bot
2016-02-18 06:52:39 -08:00