Commit Graph

45545 Commits (b3be5774c92fa3cec8697f8afe07e8a31ff7559b)

Author SHA1 Message Date
Saad Ali b3be5774c9 Update Changelog for 1.5.5 2017-03-21 18:59:30 -07:00
saadali 321acf00e5 Update CHANGELOG.md for v1.5.5. 2017-03-21 18:17:11 -07:00
Kubernetes Submit Queue b54077b0c5 Merge pull request #43472 from piosz/annotation-rename
Automatic merge from submit-queue

Renamed fluentd-ds-ready annotation

We don't want to ship an alpha annotation as a production-ready solution.

Low risk change, only rename.
2017-03-21 17:27:06 -07:00
Kubernetes Submit Queue 2b6e318ea0 Merge pull request #38937 from nak3/reserved-example
Automatic merge from submit-queue

Use realistic value for the memory example of kube-reserved and system-reserved

Currently, the kubelet help shows 150G as the memory example for
kube-reserved and system-reserved. 150G is not a realistic
value, and it leads to misconfiguration or confusion. This patch changes
the example value to 500Mi.

Before (same for system-reserved):
```
      --kube-reserved value                                A set of ResourceName=ResourceQuantity (e.g. cpu=200m,memory=150G) pairs that describe resources reserved for kubernetes system components. Currently only cpu and memory are supported. See http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md for more detail. [default=none]
```

After (same for system-reserved):
```
      --kube-reserved value                                A set of ResourceName=ResourceQuantity (e.g. cpu=200m,memory=500Mi) pairs that describe resources reserved for kubernetes system components. Currently only cpu and memory are supported. See http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md for more detail. [default=none]
```
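
For illustration, a flag value in the `ResourceName=ResourceQuantity` form shown above could be split into pairs roughly as follows. This is a minimal sketch, not the kubelet's own parser: `parseReserved` is a hypothetical helper, and it does not validate the quantity strings.

```go
package main

import (
	"fmt"
	"strings"
)

// parseReserved splits a value such as "cpu=200m,memory=500Mi" into a map of
// resource name to quantity string. Hypothetical helper for illustration; the
// quantity itself (e.g. "500Mi") is not validated here.
func parseReserved(value string) (map[string]string, error) {
	out := map[string]string{}
	if value == "" {
		return out, nil
	}
	for _, pair := range strings.Split(value, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 || kv[0] == "" || kv[1] == "" {
			return nil, fmt.Errorf("malformed pair %q", pair)
		}
		out[kv[0]] = kv[1]
	}
	return out, nil
}

func main() {
	reserved, err := parseReserved("cpu=200m,memory=500Mi")
	fmt.Println(reserved, err)
}
```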
2017-03-21 16:39:19 -07:00
Kubernetes Submit Queue eb77144474 Merge pull request #42715 from DirectXMan12/bug/infinite-hpa
Automatic merge from submit-queue

Rate limit HPA controller to sync period

Since the HPA controller pulls information from an external source that
makes no guarantees about consistency, it's possible for the HPA
to get into an infinite update loop -- if the metrics change with
every query, the HPA controller will run its normal reconciliation,
post a status update, see that status update itself, fetch new metrics,
and if those metrics are different, post another status update, and
repeat.  This can lead to continuously updating a single HPA.
    
By rate-limiting each HPA to once per sync interval, we prevent this
from happening.
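
The idea of limiting each HPA to one sync per interval can be sketched as below. This is an illustrative sketch only, not the controller's actual code: `syncLimiter`, `allow`, and the key `default/web` are assumed names, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// syncLimiter remembers when each key was last processed and refuses to
// process it again until a full interval has elapsed. (Hypothetical type.)
type syncLimiter struct {
	interval time.Duration
	last     map[string]time.Time
}

func newSyncLimiter(interval time.Duration) *syncLimiter {
	return &syncLimiter{interval: interval, last: map[string]time.Time{}}
}

// allow reports whether key may be reconciled at time now, and records the
// attempt when it is allowed.
func (l *syncLimiter) allow(key string, now time.Time) bool {
	if prev, ok := l.last[key]; ok && now.Sub(prev) < l.interval {
		return false
	}
	l.last[key] = now
	return true
}

func main() {
	l := newSyncLimiter(30 * time.Second)
	start := time.Now()
	fmt.Println(l.allow("default/web", start))                     // first sync
	fmt.Println(l.allow("default/web", start.Add(time.Second)))    // re-triggered by its own status update
	fmt.Println(l.allow("default/web", start.Add(31*time.Second))) // next interval
}
```

With a limiter like this, a status update that re-enqueues the same HPA within the interval is skipped, which breaks the update loop described above.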

**Release note**:
```release-note
NONE
```
2017-03-21 14:26:16 -07:00
Kubernetes Submit Queue 827591cc6d Merge pull request #43462 from csbell/federation-up-timeout
Automatic merge from submit-queue

[Federation][e2e] Ensure kubefed times out in federation-up.sh

Although this should eventually be moved into kubefed itself, monitor kubefed from federation-up.sh and force it to time out after being unable to initialize. The motivating factor here is to ensure that CI can time out after a reasonable attempt at trying to initialize the FCP.
2017-03-21 13:33:44 -07:00
Piotr Szczesniak 8968ac5c36 Renamed fluentd-ds-ready annotation 2017-03-21 20:48:13 +01:00
Kubernetes Submit Queue 63d8e244b6 Merge pull request #43458 from mwielgus/ca-0.5.0
Automatic merge from submit-queue (batch tested with PRs 43422, 43458)

Bump Cluster Autoscaler version to 0.5.0

**What this PR does / why we need it**:

This PR bumps Cluster Autoscaler version to 0.5.0. The version is the same as 0.5.0-beta2 (from the code perspective). We are just removing the -beta2 tag from the image. 

**Release note**:
None.

cc: @MaciekPytel @fgrzadkowski @wojtek-t
2017-03-21 12:24:17 -07:00
Kubernetes Submit Queue 51beb16ede Merge pull request #43422 from liggitt/convert-null-list
Automatic merge from submit-queue

Ensure slices are serialized as zero-length, not null

Fixes https://github.com/kubernetes/kubernetes/issues/43203: avoids serializing slices as null, to prevent NPE errors in clients that store and expect to receive non-null JSON values in these fields.

Ensures that when we convert to an external slice field that will be serialized even if empty (has a `json` tag that does not include `omitempty`), we populate it with `[]`, not `nil`.

Other places I considered putting this logic instead:

* When unmarshaling
  * Would have to be done for both protobuf and ugorji
  * Would still have to be done here (or on marshal) to handle cases where we construct objects to return
* When marshaling
  * Would have to switch to use custom json marshaler (currently we use stdlib)
* When defaulting
  * Defaulting isn't run on some fields, notably, pod template in rc/deployment spec
  * Would still have to be done here (or on marshal) to handle cases where we construct objects to return
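
The difference between a `nil` slice and an empty slice under the standard library JSON encoder can be seen in a small sketch. `podList` and `ensureNonNil` are hypothetical names used only for illustration; they are not the conversion code this PR touches.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podList is a hypothetical external type whose Items field has a json tag
// without omitempty, so the field is always serialized.
type podList struct {
	Items []string `json:"items"`
}

// ensureNonNil replaces a nil Items slice with an empty one so that it
// marshals as [] rather than null.
func ensureNonNil(l *podList) {
	if l.Items == nil {
		l.Items = []string{}
	}
}

// mustMarshal encodes v with encoding/json and panics on error.
func mustMarshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	var l podList
	fmt.Println(mustMarshal(l)) // {"items":null}
	ensureNonNil(&l)
	fmt.Println(mustMarshal(l)) // {"items":[]}
}
```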

```release-note
API fields that previously serialized null arrays as `null` and empty arrays as `[]` no longer distinguish between those values and always output `[]` when serializing to JSON.
```
2017-03-21 11:46:19 -07:00
Kubernetes Submit Queue edbc9f9c43 Merge pull request #43427 from liggitt/default-toleration
Automatic merge from submit-queue

Keep ResourceQuota admission at the end of the chain

Fixes #43426 

Moves DefaultTolerationSeconds admission prior to ResourceQuota to keep it at the end of the chain
2017-03-21 11:01:25 -07:00
Kubernetes Submit Queue 3d3062c17f Merge pull request #43441 from crassirostris/bump-fluentd-gcp-memory-limit
Automatic merge from submit-queue

Increase memory limit for fluentd-gcp

This PR increases the fluentd memory limit in the fluentd-gcp addon to avoid OOMs. The request is left intact.
2017-03-21 10:14:29 -07:00
Christian Bell 699c3ad299 [Federation][e2e] Ensure kubefed times out in federation-up.sh
Although this should eventually be moved into kubefed itself, monitor kubefed from federation-up.sh and force it to time out after being unable to initialize.
2017-03-21 09:20:19 -07:00
Marcin Wielgus a3b268d659 Bump cluster autoscaler version to 0.5.0 2017-03-21 16:16:34 +01:00
Kubernetes Submit Queue 0a6d82d8e7 Merge pull request #43446 from wojtek-t/fix_restore_from_backup_script
Automatic merge from submit-queue

Fix restore-from-backup.sh script
2017-03-21 06:36:25 -07:00
Kubernetes Submit Queue 681ba3a6d3 Merge pull request #43309 from MaciekPytel/ca_drain_e2e
Automatic merge from submit-queue

e2e test for cluster-autoscaler draining node

**What this PR does / why we need it**:
Adds an e2e test for Cluster-Autoscaler removing a node with a pod running (by rescheduling the pod).

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
@mwielgus can you take a look?

**Release note**:

```release-note
```
2017-03-21 05:48:38 -07:00
Wojciech Tyczynski 41b3db0fcc Fix restore-from-backup.sh script 2017-03-21 11:58:13 +01:00
Mik Vyatskov 48e750f5a0 Increase memory limit for fluentd-gcp 2017-03-21 10:44:01 +01:00
Kubernetes Submit Queue 1e5fa8fed5 Merge pull request #43415 from thockin/fix-nodeport-close-wait
Automatic merge from submit-queue

Install a REJECT rule for nodeport with no backend

Rather than actually accepting the connection, REJECT.  This will avoid
CLOSE_WAIT.

Fixes #43212


@justinsb @felipejfc @spiddy
2017-03-20 23:22:21 -07:00
Jordan Liggitt bc539151f3 Keep ResourceQuota admission at the end of the chain 2017-03-21 01:53:11 -04:00
Tim Hockin 2ec87999a9 Install a REJECT rule for nodeport with no backend
Rather than actually accepting the connection, REJECT.  This will avoid
CLOSE_WAIT.
2017-03-20 21:37:00 -07:00
Jordan Liggitt 7ceeee8665 Update client-go 2017-03-20 23:57:38 -04:00
Jordan Liggitt 939ca532aa generated files 2017-03-20 23:57:38 -04:00
Jordan Liggitt 0e2f1b535d Ensure empty serialized slices are zero-length, not null 2017-03-20 23:56:39 -04:00
Kubernetes Submit Queue e3f6f14bf0 Merge pull request #43411 from csbell/test-timeouts-cleanup
Automatic merge from submit-queue

Unify test timeouts under a common name.

Some timeouts were too aggressive and since we've slowly been moving every controller to 5 minutes, consolidate everyone under ``federatedDefaultTestTimeout``. To aid in debugging some service-related issues, if a service cannot be deleted, we issue a kubectl describe on it prior to failing.
2017-03-20 19:07:26 -07:00
Kubernetes Submit Queue e09fda63be Merge pull request #43414 from Random-Liu/use-uid-in-config
Automatic merge from submit-queue

Use uid in config.go instead of pod full name.

For https://github.com/kubernetes/kubernetes/issues/43397.

In config.go, use pod uid in pod cache.

Previously, when a static pod was updated, file.go generated a new UID, but config.go referenced the pod only by its full name and never updated the pod UID in the internal cache. This caused:
1) If the container spec changed, kubelet restarted the corresponding container because the container hash changed.
2) If the pod spec changed, kubelet did nothing.

With this fix, kubelet always restarts the pod whenever the pod spec (including the container spec) changes.
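
As a rough illustration of why keying by UID matters: when pods are compared by UID, an updated static pod (same full name, new UID) shows up as one removal plus one addition rather than as an unchanged entry. The sketch below is a simplified model, not the code in config.go; `pod` and `diffByUID` are hypothetical names.

```go
package main

import "fmt"

// pod is a simplified, hypothetical stand-in for a pod object.
type pod struct {
	UID      string
	FullName string
	Spec     string
}

// diffByUID compares two pod sets by UID and returns the UIDs that were
// added and the UIDs that were removed.
func diffByUID(old, updated []pod) (added, removed []string) {
	oldUIDs := map[string]bool{}
	for _, p := range old {
		oldUIDs[p.UID] = true
	}
	newUIDs := map[string]bool{}
	for _, p := range updated {
		newUIDs[p.UID] = true
		if !oldUIDs[p.UID] {
			added = append(added, p.UID)
		}
	}
	for _, p := range old {
		if !newUIDs[p.UID] {
			removed = append(removed, p.UID)
		}
	}
	return added, removed
}

func main() {
	// Same full name, but the updated manifest produced a new UID.
	old := []pod{{UID: "uid-1", FullName: "web_default", Spec: "v1"}}
	updated := []pod{{UID: "uid-2", FullName: "web_default", Spec: "v2"}}
	added, removed := diffByUID(old, updated)
	fmt.Println("added:", added, "removed:", removed)
}
```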

@yujuhong @bowei @dchen1107 
/cc @kubernetes/sig-node-bugs
2017-03-20 18:18:36 -07:00
Kubernetes Submit Queue 4974a0589b Merge pull request #43337 from janetkuo/ds-template-semantic-deepequal
Automatic merge from submit-queue

Use Semantic.DeepEqual to compare DaemonSet template on updates

Switch to `Semantic.DeepEqual` when comparing templates on DaemonSet updates, since we can't distinguish between `null` and `[]` in protobuf. This avoids unnecessary DaemonSet pod restarts.

I didn't touch `reflect.DeepEqual` used in the controller because it's close to the release date, and the DeepEqual in the controller doesn't cause serious issues (except for maybe causing more enqueues than needed).
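
The distinction can be illustrated with a small sketch: `reflect.DeepEqual` treats a `nil` slice and an empty slice as different, while a semantic comparison treats them as the same. The `template` type and `semanticEqual` function are simplified stand-ins written for this example, not the apimachinery implementation.

```go
package main

import (
	"fmt"
	"reflect"
)

// template is a hypothetical, heavily simplified pod template.
type template struct {
	Args []string
}

// semanticEqual treats nil and empty Args as equal and otherwise falls back
// to an element-wise comparison.
func semanticEqual(a, b template) bool {
	if len(a.Args) == 0 && len(b.Args) == 0 {
		return true
	}
	return reflect.DeepEqual(a.Args, b.Args)
}

func main() {
	a := template{Args: nil}
	b := template{Args: []string{}}
	fmt.Println(reflect.DeepEqual(a, b)) // false: nil and empty slices differ
	fmt.Println(semanticEqual(a, b))     // true: both are "no args"
}
```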

Fixes #43218 

@liggitt @kargakis @lukaszo @kubernetes/sig-apps-pr-reviews
2017-03-20 17:24:18 -07:00
Random-Liu fbc320af28 Use uid in config.go instead of pod full name. 2017-03-20 15:52:29 -07:00
Kubernetes Submit Queue e4536e85b8 Merge pull request #43399 from derekwaynecarr/fix_node_e2e
Automatic merge from submit-queue (batch tested with PRs 42452, 43399)

Fix faulty assumptions in summary API testing

**What this PR does / why we need it**:
1. On systemd, launch the kubelet in a dedicated part of the cgroup hierarchy.
1. Bump the allowable memory usage for busybox containers, as my own local testing often showed values > 1mb that were valid per the memory limit settings we impose.
1. There is a logic flaw today in how we report node.memory.stats that needs to be fixed in a follow-on.

For the last issue, we look at the `/sys/fs/cgroup/memory.stat[rss]` value which, if you have global accounting enabled on systemd machines (as expected), will report 0 because nothing runs local to the root cgroup. We really want to show the total_rss value for non-leaf cgroups so we get the full hierarchy of usage.
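
To make the `rss` versus `total_rss` point concrete, here is a minimal sketch that reads both keys from `memory.stat`-style content. `statValue` is a hypothetical helper, and the sample numbers are made up; a real caller would read the file from the cgroup filesystem instead.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// statValue returns the value for key in memory.stat-style content, where
// each line has the form "<key> <value>". Hypothetical helper.
func statValue(content, key string) (uint64, bool) {
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 2 && fields[0] == key {
			v, err := strconv.ParseUint(fields[1], 10, 64)
			if err != nil {
				return 0, false
			}
			return v, true
		}
	}
	return 0, false
}

func main() {
	// Made-up sample: nothing runs directly in this cgroup, so rss is 0,
	// while total_rss includes usage from child cgroups.
	sample := "cache 0\nrss 0\ntotal_cache 1048576\ntotal_rss 2097152\n"
	rss, _ := statValue(sample, "rss")
	total, _ := statValue(sample, "total_rss")
	fmt.Println("rss:", rss, "total_rss:", total)
}
```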
2017-03-20 15:23:35 -07:00
Kubernetes Submit Queue a2d74cda38 Merge pull request #42452 from jingxu97/Mar/nodeNamePrefix
Automatic merge from submit-queue (batch tested with PRs 42452, 43399)

Modify getInstanceByName to avoid calling getInstancesByNames

This PR modifies getInstanceByName to loop through all management zones
directly instead of calling getInstancesByNames. Currently,
getInstancesByNames uses a node name prefix as a filter to list the
instances. If the prefix does not match, it returns all instances,
which is very wasteful since getInstanceByName only queries one instance
with a specific name.

Partially fixes issue #42445
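
The shape of a direct per-zone lookup can be sketched as follows. This is an illustration under stated assumptions rather than the cloud provider code: `zoneLookup` is a hypothetical stand-in for a per-zone "get one instance" call, and the zone and instance names are invented.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("instance not found")

// zoneLookup is a hypothetical stand-in for a call that fetches a single
// instance by name from one zone.
type zoneLookup func(zone, name string) (string, bool)

// getInstanceByName asks each zone for exactly one instance and stops at the
// first hit, instead of listing instances with a prefix filter.
func getInstanceByName(zones []string, name string, lookup zoneLookup) (string, error) {
	for _, zone := range zones {
		if id, ok := lookup(zone, name); ok {
			return id, nil
		}
	}
	return "", errNotFound
}

func main() {
	instances := map[string]map[string]string{
		"zone-a": {"node-1": "id-1"},
		"zone-b": {"node-2": "id-2"},
	}
	lookup := func(zone, name string) (string, bool) {
		id, ok := instances[zone][name]
		return id, ok
	}
	fmt.Println(getInstanceByName([]string{"zone-a", "zone-b"}, "node-2", lookup))
}
```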
2017-03-20 15:23:33 -07:00
Christian Bell 273eb6b9b5 Unify test timeouts under a common name. 2017-03-20 14:53:40 -07:00
Janet Kuo f780f32c1e Use Semantic.DeepEqual to compare DaemonSet template on updates 2017-03-20 13:58:49 -07:00
Kubernetes Submit Queue 948e3754f8 Merge pull request #43368 from feiskyer/dns-policy
Automatic merge from submit-queue (batch tested with PRs 43398, 43368)

CRI: add support for dns cluster first policy

**What this PR does / why we need it**:

PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools was updated to support the feature. 

This PR updates kuberuntime to support it for all runtimes.


**Which issue this PR fixes** 

fixes #43352

**Special notes for your reviewer**:

Candidate for v1.6.

**Release note**:

```release-note
NONE
```

cc @thockin @luxas @vefimova @Random-Liu
2017-03-20 13:54:33 -07:00
Kubernetes Submit Queue bc82d87f0a Merge pull request #43398 from enisoc/deletion-race-flake
Automatic merge from submit-queue

Deflake TestSyncDeploymentDeletionRace

**What this PR does / why we need it**:

The cache was sometimes catching up while we were testing the case
where the cache is not yet caught up.

Before this fix, I could reproduce the failure with the following
command. After the fix, it passes.

```
go test -count 100000 -run TestSyncDeploymentDeletionRace
```

I checked the other controllers, and none of them start informers for the deletion race test. I also checked that the deletion race tests for other controllers all pass with `-count 100000`.

**Which issue this PR fixes**:

Fixes #43390

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-03-20 13:26:03 -07:00
Kubernetes Submit Queue e668ee1182 Merge pull request #43370 from feiskyer/port-mapping
Automatic merge from submit-queue (batch tested with PRs 42659, 43370)

dockershim: process protocol correctly for port mapping

**What this PR does / why we need it**:

dockershim: process protocol correctly for port mapping.

**Which issue this PR fixes** 

Fixes #43365.

**Special notes for your reviewer**:

Should be included in v1.6.

**Release note**:

```release-note
NONE
```

cc/ @Random-Liu @justinsb @kubernetes/sig-node-pr-reviews
2017-03-20 12:40:40 -07:00
Kubernetes Submit Queue e1b4d03499 Merge pull request #42659 from enisoc/controller-ref-rc-rs
Automatic merge from submit-queue (batch tested with PRs 42659, 43370)

RC/RS: Fixes for ControllerRef.

**What this PR does / why we need it**:

This fixes some issues with RC/RS ControllerRef handling that were brought up in reviews for other controller types, after #41984 was merged. See the individual commit messages for details.

**Which issue this PR fixes**:

**Special notes for your reviewer**:

**Release note**:
```release-note
```
2017-03-20 12:40:39 -07:00
Derek Carr 5c8b957779 Fix faulty assumptions in summary API testing 2017-03-20 14:56:11 -04:00
Anthony Yeh 0b9233648e Deflake TestSyncDeploymentDeletionRace
The cache was sometimes catching up while we were testing the case
where the cache is not yet caught up.

Before this fix, I could reproduce the failure with the following
command. After the fix, it passes.

```
go test -count 100000 -run TestSyncDeploymentDeletionRace
```
2017-03-20 11:13:26 -07:00
Anthony Yeh c74aab649f RC/RS: Mark lookup-cache-size flags as deprecated. 2017-03-20 09:10:12 -07:00
Anthony Yeh f4ee44eb39 RC/RS: Check that ControllerRef UID matches found controller.
Otherwise, we may confuse a former controller by that name with a new
one that has the same name.
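
A minimal sketch of the check described above, assuming simplified types: after looking a controller up by name, the result is accepted only if its UID matches the one recorded in the ControllerRef. `controller`, `controllerRef`, and `resolve` are hypothetical names for illustration.

```go
package main

import "fmt"

// controller and controllerRef are simplified, hypothetical stand-ins.
type controller struct {
	Name string
	UID  string
}

type controllerRef struct {
	Name string
	UID  string
}

// resolve returns the controller that ref points to, or nil when no
// controller has that name or the one found has a different UID (for
// example, a new controller created with the same name as a deleted one).
func resolve(ref controllerRef, byName map[string]*controller) *controller {
	c, ok := byName[ref.Name]
	if !ok || c.UID != ref.UID {
		return nil
	}
	return c
}

func main() {
	byName := map[string]*controller{"web": {Name: "web", UID: "uid-new"}}
	stale := controllerRef{Name: "web", UID: "uid-old"}
	current := controllerRef{Name: "web", UID: "uid-new"}
	fmt.Println(resolve(stale, byName) == nil)   // true: name matches, UID does not
	fmt.Println(resolve(current, byName) != nil) // true: name and UID match
}
```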
2017-03-20 08:57:42 -07:00
Maciej Pytel 7f9b3b6358 e2e test for cluster-autoscaler draining node 2017-03-20 16:46:43 +01:00
Kubernetes Submit Queue 38055983e0 Merge pull request #43294 from crassirostris/cluster-logging-test-simplifying
Automatic merge from submit-queue

Loosen requirements of cluster logging e2e tests, make them more stable

There should be an e2e test for cloud logging in the main test suite, because this is an important piece of functionality and it can be broken by different components.

However, the existing cluster logging e2e tests were too strict for the current solution, which may lose some log entries, resulting in flakes. There's no way to fix this problem in 1.6, so this PR makes basic cluster logging e2e tests less strict.
2017-03-20 06:04:59 -07:00
Pengfei Ni 95c3782043 Rewrite resolv.conf for dockershim
PR #29378 introduces ClusterFirstWithHostNet, but docker doesn't support
setting DNS options together with hostnetwork. This commit rewrites
resolv.conf the same way dockertools does.
2017-03-20 18:45:39 +08:00
Pengfei Ni 079158fa08 CRI: add support for dns cluster first policy
PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools
was updated to support the feature. This PR updates kuberuntime to
support it for all runtimes.

Also fixes #43352.
2017-03-20 17:50:38 +08:00
Pengfei Ni 99ed3202f3 Run hack/update-bazel.sh 2017-03-20 17:48:36 +08:00
Pengfei Ni 53b5f2df48 Add unit test for MakePortsAndBindings 2017-03-20 17:47:38 +08:00
Pengfei Ni 2ddaaec199 dockershim: process protocol correctly for port mapping 2017-03-20 16:52:24 +08:00
Kubernetes Submit Queue 47320fd3f0 Merge pull request #42938 from enisoc/orphan-race
Automatic merge from submit-queue

GC: Fix re-adoption race when orphaning dependents.

**What this PR does / why we need it**:

The GC expects that once it sees a controller with a non-nil
DeletionTimestamp, that controller will not attempt any adoption.
There was a known race condition that could cause a controller to
re-adopt something orphaned by the GC, because the controller is using a
cached value of its own spec from before DeletionTimestamp was set.

This fixes that race by doing an uncached quorum read of the controller
spec just before the first adoption attempt. It's important that this
read occurs after listing potential orphans. Note that this uncached
read is skipped if no adoptions are attempted (i.e. at steady state).
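
The recheck described above can be sketched roughly as follows, under simplifying assumptions: `freshGetter` is a hypothetical stand-in for the uncached read, and `Deleting` stands in for a non-nil DeletionTimestamp. The UID comparison is an extra guard added in this sketch, not something stated in the description.

```go
package main

import (
	"errors"
	"fmt"
)

// controller is a simplified, hypothetical stand-in for a controller object.
type controller struct {
	Name     string
	UID      string
	Deleting bool // true once DeletionTimestamp is set
}

// freshGetter stands in for an uncached read of the controller.
type freshGetter func(name string) (controller, error)

// canAdopt re-reads the controller just before adoption and allows it only
// if the fresh copy is the same object and is not being deleted.
func canAdopt(cached controller, get freshGetter) (bool, error) {
	fresh, err := get(cached.Name)
	if err != nil {
		return false, err
	}
	if fresh.UID != cached.UID {
		return false, errors.New("controller was replaced by a new object with the same name")
	}
	return !fresh.Deleting, nil
}

func main() {
	// The cached copy is stale: it has not yet observed the deletion.
	cached := controller{Name: "web", UID: "uid-1"}
	get := func(name string) (controller, error) {
		return controller{Name: name, UID: "uid-1", Deleting: true}, nil
	}
	ok, err := canAdopt(cached, get)
	fmt.Println(ok, err)
}
```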

**Which issue this PR fixes**:

Fixes #42639

**Special notes for your reviewer**:

**Release note**:
```release-note
```

cc @kubernetes/sig-apps-pr-reviews
2017-03-20 01:30:11 -07:00
Kubernetes Submit Queue f880340314 Merge pull request #43231 from csbell/service-race
Automatic merge from submit-queue

[Federation] Fix deletion logic in service controller

This is a regression from 1.5 exposed by cascading deletions. In order to apply updates, the service controller locks access to a cached service and spawns goroutines without waiting for them. When updates and deletions arrive in quick succession, previous goroutines remain active and race with the deletion logic. Coupled with this, the service_helper was not re-evaluating the value of the DeletionTimestamp.

Without this patch, federation leaks resources at destruction time about half the time.

In e2e land, about 4-5 test runs cause service tests to eat up all global fwd-ing rules and in turn, every subsequent ingress test will fail until we manually clean up leaked resources. No possibility to go green in fed e2e until this is merged.
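
One general way to avoid this class of race is to wait for in-flight update goroutines before running deletion. The sketch below shows that pattern with `sync.WaitGroup`; it is not necessarily what this patch does, and `service`, `update`, and `deleteAll` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// service is a hypothetical cached service with per-cluster resources.
type service struct {
	mu       sync.Mutex
	wg       sync.WaitGroup
	clusters map[string]bool
}

// update creates the resource for one cluster in a background goroutine and
// registers that goroutine so deletion can wait for it.
func (s *service) update(cluster string) {
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		s.mu.Lock()
		defer s.mu.Unlock()
		s.clusters[cluster] = true
	}()
}

// deleteAll waits for all in-flight updates, then removes every resource, so
// no late update can re-create a resource after deletion.
func (s *service) deleteAll() {
	s.wg.Wait()
	s.mu.Lock()
	defer s.mu.Unlock()
	s.clusters = map[string]bool{}
}

// remaining returns how many resources still exist.
func (s *service) remaining() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.clusters)
}

func main() {
	s := &service{clusters: map[string]bool{}}
	for _, c := range []string{"us-east", "us-west", "eu-west"} {
		s.update(c)
	}
	s.deleteAll()
	fmt.Println("resources left after delete:", s.remaining())
}
```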
2017-03-20 00:19:23 -07:00
Christian Bell 3769435a45 Fix deletion logic in service controller.
This is a regression from 1.5 exposed by cascading deletions. In order to apply updates, the service controller locks access to a cached service and spawns goroutines without waiting for them. When updates and deletions arrive in quick succession, previous goroutines remain active and race with the deletion logic. Coupled with this, the service_helper was not re-evaluating the value of the DeletionTimestamp.

Without this patch, federation will sometimes leak resources at destruction time.
2017-03-19 22:49:21 -07:00
Kubernetes Submit Queue ae6a5d2bf3 Merge pull request #42827 from madhusudancs/fed-rs-e2e-update-withname
Automatic merge from submit-queue (batch tested with PRs 43355, 42827)

[Federation] Rewrite ReplicaSet CRUD and Preferences tests.

I think the `should create replicasets and rebalance them` test is still flaky. I still don't know the source of this flakiness. I will continue hunting. But it is a lot less flaky than before (or perhaps it never even passed before?). This PR could be merged now and flake hunting can happen in parallel.

```release-note
NONE
```
2017-03-19 10:49:46 -07:00