Automatic merge from submit-queue (batch tested with PRs 47065, 47157, 47143)
Use actual hostname when creating network e2e test pod
**What this PR does / why we need it**:
This changes a e2e framework network test Pod use the actual hostname value to match the `kubernetes.io/hostname` label in it's `NodeSelector`. Currently it assumes the Node name will match that hostname label which is not true in all environments.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Fixescoreos/tectonic-installer#1018
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47065, 47157, 47143)
Removed a race condition from ResourceConsumer
**What this PR does / why we need it**: Without this PR there is a race condition in ResourceConsumer that sometimes results in communication to pods that might not exist anymore.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47127
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Audit webhook config for GCE
Add a `ADVANCED_AUDIT_BACKEND` (comma delimited list) environment variable to the GCE cluster config to select the audit backend, and add configuration for the webhook backend.
~~Based on the first commit from https://github.com/kubernetes/kubernetes/pull/46557~~
For kubernetes/features#22
Since this is GCE-only configuration plumbing, I think this should be exempt from code-freeze.
Automatic merge from submit-queue
Remove Initializers from admission-control in kubernetes-master charm for pre-1.7
**What this PR does / why we need it**:
This fixes a problem with the kubernetes-master charm where kube-apiserver never comes up:
```
failed to initialize admission: Unknown admission plugin: Initializers
```
The Initializers plugin does not exist before Kubernetes 1.7. The charm needs to support 1.6 as well.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47062
**Special notes for your reviewer**:
This fixes a problem introduced by https://github.com/kubernetes/kubernetes/pull/36721
**Release note**:
```release-note
Remove Initializers from admission-control in kubernetes-master charm for pre-1.7
```
Automatic merge from submit-queue
Fixes 47182
**What this PR does / why we need it**: Adds some state guards to the idle_status message to speed up the deployment
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47182
**Special notes for your reviewer**:
This adds additional state guards of the idle_status method, which will
prevent it from being run until a worker has joined the relationship.
Previous invocations may have some messaging inconsistencies but will reach
eventual consistency once a worker has joined.
This prevents the polling loop from executing too soon, bloating the
installation time by bare-minimum an additional 10 minutes.
**Release note**:
```release-note
Added state guards to the idle_status messaging in the kubernetes-master charm to make deployment faster on initial deployment.
```
Automatic merge from submit-queue
Bump up npd version to v0.4.0
Fixes#47070.
Bump up npd version to [v0.4.0](https://github.com/kubernetes/node-problem-detector/releases/tag/v0.4.0).
```release-note
Bump up Node Problem Detector version to v0.4.0, which added support of parsing log from /dev/kmsg and ABRT.
```
/cc @dchen1107 @ajitak
This adds additional state guardsof the idle_status method, which will
prevent it from being run until a worker has joined the relationship.
Previous invocations may have some message artifacting, but will reach
eventual consistency once a worker has joined.
This prevents the polling loop from executing too soon, bloating the
installation time by bare-minimum an additional 10 minutes.
Automatic merge from submit-queue
kubeadm: Enable the Node Authorizer/Admission plugin in v1.7
**What this PR does / why we need it**:
This is similar to https://github.com/kubernetes/kubernetes/pull/46796, but for kubeadm.
Basically it was a part of https://github.com/kubernetes/kubernetes/pull/46796, but there were some other upgradability and compability concerns for kubeadm I took care of while working today.
Example:
```console
$ kubeadm init --kubernetes-version v1.7.0-beta.0
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.0-beta.0
[init] Using Authorization mode: [RBAC Node]
...
$ sudo kubectl --kubeconfig=/etc/kubernetes/kubelet.conf get secret foo
Error from server (Forbidden): User "system:node:thegopher" cannot get secrets in the namespace "default".: "no path found to object" (get secrets foo)
$ echo '{"apiVersion":"v1","kind":"Node","metadata":{"name":"foo"}}' | sudo kubectl create -f - --kubeconfig=/etc/kubernetes/kubelet.conf
Error from server (Forbidden): error when creating "STDIN": nodes "foo" is forbidden: node thegopher cannot modify node foo
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Depends on https://github.com/kubernetes/kubernetes/pull/46864 (uses that PR as a base, will rebase once it's merged)
Please only review the second commit. Will also fix tests in a minute.
**Release note**:
```release-note
kubeadm: Enable the Node Authorizer/Admission plugin in v1.7
```
@mikedanese @liggitt @pipejakob @roberthbailey @jbeda @timothysc
Automatic merge from submit-queue
Bump external provisioner image to smaller version
The image is roughly half as big so this should improve speed/flakiness maybe
-->
```release-note
NONE
```
Automatic merge from submit-queue
Deprecated binding for 1.7
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#10043
```release-note
Deprecated Binding objects in 1.7.
```
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)
Add a secretbox and AES-CBC path for encrypt at rest
Add a secretbox and AES-CBC encrypt at rest provider and alter the config, based on feedback from security review. AES-CBC is more well reviewed and generally fits better with common criteria and FIPS, secretbox is newer and faster than CBC.
```release-note
Add secretbox and AES-CBC encryption modes to at rest encryption. AES-CBC is considered superior to AES-GCM because it is resistant to nonce-reuse attacks, and secretbox uses Poly1305 and XSalsa20.
```
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)
DeleteCollection should include uninitialized resources
Users who delete a collection expect all resources to be deleted, and
users can also delete an uninitialized resource. To preserve this
expectation, DeleteCollection selects all resources regardless of
initialization.
The namespace controller should list uninitialized resources in order to
gate cleanup of a namespace.
Fixes#47137
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)
HPA: only send updates when the status has changed
This commit only sends updates if the status has actually changed.
Since the HPA runs at a regular interval, this should reduce the volume
of writes, especially on short HPA intervals with relatively constant
metrics.
Fixes#47077
**Release note**:
```release-note
The HorizontalPodAutoscaler controller will now only send updates when it has new status information, reducing the number of writes caused by the controller.
```
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)
[federation][e2e] Fix cleanupServiceShardLoadBalancer
**What this PR does / why we need it**:
Fixes the issue mentioned in #46976
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46976
**Special notes for your reviewer**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
Remove duplicate errors from an aggregate error input.
This PR, in general, removes duplicate errors from an aggregate error input, and returns unique errors with their occurrence count. Specifically, this PR helps with some scheduler errors that fill the log enormously. For example, see the following `truncated` output from a 300-plus nodes cluster, as there was a same error from almost all nodes.
[SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found.........
After this PR, the output looks like (on a 2-node cluster):
SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected.(Count=2)
@derekwaynecarr @smarterclayton @kubernetes/sig-scheduling-pr-reviews
Fixes https://github.com/kubernetes/kubernetes/issues/47145
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
gpusInUse info error when kubelet restarts
**What this PR does / why we need it**:
In my test, I found 2 errors in the nvidia_gpu_manager.go.
1. the number of activePods in gpusInUse() equals to 0 when kubelet restarts. It seems the Start() method was called before pods recovery which caused this error. So I decide not to call gpusInUse() in the Start() function, just let it happen when new pod needs to be created.
2. the container.ContainerID in line 242 returns the id in format of "docker://<container_id>", this will make the client failed to inspect the container by id. We have to erase the prefix of "docker://".
**Special notes for your reviewer**:
**Release note**:
```
Avoid assigning the same GPU to multiple containers.
```
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
update NetworkPolicy e2e test for v1 semantics
This makes the NetworkPolicy test at least correct for v1, although ideally we'll eventually add a few more tests... (So this covers about half of #46625.)
I've tested that this compiles, but not that it passes, since I don't have a v1-compatible NetworkPolicy implementation yet...
@caseydavenport @ozdanborne, maybe you're closer to having a testable plugin than I am?
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
func parseEndpointWithFallbackProtocol should check if protocol of endpoint is empty
**What this PR does / why we need it**:
func parseEndpointWithFallbackProtocol should check if protocol of endpoint is empty
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #45927
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
add Unit Test for PodList Printer
Signed-off-by: zhangxiaoyu-zidif <zhang.xiaoyu33@zte.com.cn>
**What this PR does / why we need it**:
add Unit Test for PodList Printer
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47024, 47050, 47086, 47081, 47013)
Wrap HumanReadablePrinter in tab output unless explicitly asked not to
`kubectl get` was not properly aligning its output due to #40848
Fixes an accidental regression. In general, we should not accept an incoming tabwriter and instead manage at a higher level. Fix the bug and add a comment re: future refactoring.
Automatic merge from submit-queue (batch tested with PRs 47024, 47050, 47086, 47081, 47013)
kubeadm: Make the creation of the RBAC rules phase idempotent
**What this PR does / why we need it**:
Bugfix: Currently kubeadm fails with a non-zero code if resources it's trying to create already exist. This PR fixes that by making kubeadm try to Update resources that already exist.
After this PR, https://github.com/kubernetes/kubernetes/pull/46879 and a beta.1 release, kubeadm will be fully upgradeable from v1.6 to v1.7 using only kubeadm init.
Last piece of https://github.com/kubernetes/kubeadm/issues/288
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes: https://github.com/kubernetes/kubeadm/issues/288
**Special notes for your reviewer**:
**Release note**:
```release-note
kubeadm: Modifications to cluster-internal resources installed by kubeadm will be overwritten when upgrading from v1.6 to v1.7.
```
@pipejakob @mikedanese @timothysc
Automatic merge from submit-queue (batch tested with PRs 47024, 47050, 47086, 47081, 47013)
apiextensions-apiserver: Fix decoding of DeleteOptions.
Fixes#47072 by making apiextensions-apiserver capable of decoding unversioned DeleteOptions, rather than only handling Unstructured objects (i.e. Custom Resources).
This also closes#46736 and #37554 since the added regression test works for TPR as well.
Automatic merge from submit-queue (batch tested with PRs 47024, 47050, 47086, 47081, 47013)
client-go: deprecate TPR example and add CRD example
/cc @nilebox
Part of https://github.com/kubernetes/kubernetes/issues/46702
Users who delete a collection expect all resources to be deleted, and
users can also delete an uninitialized resource. To preserve this
expectation, DeleteCollection selects all resources regardless of
initialization.
The namespace controller should list uninitialized resources in order to
gate cleanup of a namespace.
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
Ignore `daemonset-controller-hash` label key in federation before comparing the federated object with its cluster equivalent.
Kubernetes daemonset controller writes a daemonset's hash to the object label as an optimization to avoid recomputing it every time. Adding a new label to the object that the federation is unaware of causes problems because federated controllers compare the objects in federation and their equivalents in clusters and try to reconcile them. This leads to a constant fight between the federated daemonset
controller and the cluster controllers, and they never reach a stable state.
Ideally, cluster components should not update an object's spec or metadata in a way federation cannot replicate. They can update an object's status though. Therefore, this daemonset hash should be a
field in daemonset's status, not a label in object meta. @janetkuo says that this label is only a short term solution. In the near future, they are going to replace it with revision numbers in daemonset status. We
can then rip this bandaid out.
Fixes#46925
**Release note**:
```release-note
NONE
```
/assign @csbell
/cc @shashidharatd @marun @nikhiljindal @perotinus
/sig federation
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
[gke-slow always fails] Defer DeleteGCEStaticIP before asserting error
From https://github.com/kubernetes/kubernetes/issues/46918.
I'm getting close to the root cause: During tests, CreateGCEStaticIP() in fact successfully created static IP, but the parser we wrote in test mistakenly think we failed, probably because the gcloud output format was changed recently (or not). I'm still looking into fixing that.
This PR defer the delete function before asserting the error so that we can stop consistently leaking static IP in every run.
/assign @krzyzacy @dchen1107
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
add e2e node test for Pod hostAliases feature
**What this PR does / why we need it**: adds node e2e test for #45148
tests requested in https://github.com/kubernetes/kubernetes/issues/43632#issuecomment-298434125
**Release note**:
```release-note
NONE
```
@yujuhong @thockin
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
Azure cloudprovider retry using flowcontrol
An initial attempt at engaging exponential backoff for API error responses.
Addresses #47048
Uses k8s.io/client-go/util/flowcontrol; implementation inspired by GCE
cloudprovider backoff.
**What this PR does / why we need it**:
The existing azure cloudprovider implementation has no guard rails in place to adapt to unexpected underlying operational conditions (i.e., clogs in resource plumbing between k8s runtime and the cloud API). The purpose of these changes is to support exponential backoff wrapping around API calls; and to support targeted rate limiting. Both of these options are configurable via `--cloud-config`.
Implementation inspired by the GCE's use of `k8s.io/client-go/util/flowcontrol` and `k8s.io/apimachinery/pkg/util/wait`, this PR likewise uses `flowcontrol` for rate limiting; and `wait` to thinly wrap backoff retry attempts to the API.
**Special notes for your reviewer**:
Pay especial note to the declaration of retry-able conditions from an unsuccessful HTTP request:
- all `4xx` and `5xx` HTTP responses
- non-nil error responses
And the declaration of retry success conditions:
- `2xx` HTTP responses
Tests updated to include additions to `Config`.
Those may be incomplete, or in other ways non-representative.
**Release note**:
```release-note
Added exponential backoff to Azure cloudprovider
```
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
Consolidate sysctl commands for kubelet
**What this PR does / why we need it**:
These commands are important enough to be in the Kubelet itself.
By default, Ubuntu 14.04 and Debian Jessie have these set to 200 and
20000. Without this setting, nodes are limited in the number of
containers that they can start.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#26005
**Special notes for your reviewer**:
I had a difficult time writing tests for this. It is trivial to create a fake sysctl for testing, but the Kubelet does not have any tests for the prior settings.
**Release note**:
```release-note
```