Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add note on upgrading cluster by kubeadm.
**What this PR does / why we need it**:
Upgrading cluster from 1.9.x to 1.10.0 by kubeadm fails due to type change of `featureGates` in `KubeProxyConfiguration` done in #57962. This PR add a note on what should be done before upgrading.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #61764
**Special notes for your reviewer**:
cc @kubernetes/sig-cluster-lifecycle-bugs @liggitt @fgbreel
Thanks @berlinsaint for the workaround!
We may need a patch (if possible) for kubeadm to do this automatically as suggested by @danderson .
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62951, 57460, 63118). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix device plugin re-registration
**What this PR does / why we need it**:
While registering a new endpoint, device manager copies all the devices from the old endpoint for the same resource and then it stops the old endpoint and starts the new endpoint.
There is no sync between stopping the old and starting the new. While stopping the old, manager marks devices(which are copied to new endpoint as well) as "Unhealthy".
In the endpoint.go, when after restart, plugin reports devices healthy, same health state (healthy) is found in the endpoint database and endpoint module does not update manager database.
Solution in the PR is to mark devices as unhealthy before copying to new endpoint.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#62773
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
/cc @jiayingz @vishh @RenaudWasTaken @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 62951, 57460, 63118). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix hostport checking for initContainers since they run in sequential order
**What this PR does / why we need it**:
Fix hostport checking for initContainers since they run in sequential order
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
partial Fixes#57443
**Special notes for your reviewer**:
/assign @CaoShuFeng @dims
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support nsenter in non-systemd environments
**What this PR does / why we need it**:
In our CI, we run kubekins image for most of the jobs. This is a
debian image with upstart and does not enable systemd. So we should
* Bailout if any binary is missing other than systemd-run.
* SupportsSystemd should check the binary path to correctly
identify if the systemd-run is present or not
* Pass the errors back to the callers so kubelet is forced to
fail early when there is a problem. We currently assume
that all binaries are in the root directory by default which
is wrong.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add contribex to github template owners
**What this PR does / why we need it**:
The `.github/` folder doesn't have an OWNERS file. These templates fall under contribex, so this would add an OWNERS file to this folder.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Tag pkg_rpm rules as manual
**What this PR does / why we need it**: Mitigation step for #63108. Nothing depends on these rules yet, and tagging as manual prevents `make bazel-build` from building them, too.
**Release note**:
```release-note
NONE
```
/assign @BenTheElder
/cc @sigma
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
dockershim/sandbox: clean up pod network even if SetUpPod() failed
If the CNI network plugin completes successfully, but something fails
between that success and dockerhsim's sandbox setup code, plugin resources
may not be cleaned up. A non-trivial amount of code runs after the
plugin itself exits and the CNI driver's SetUpPod() returns, and any error
condition recognized by that code would cause this leakage.
The Kubernetes CRI RunPodSandbox() request does not attempt to clean
up on errors, since it cannot know how much (if any) networking
was actually set up. It depends on the CRI implementation to do
that cleanup for it.
In the dockershim case, a SetUpPod() failure means networkReady is
FALSE for the sandbox, and TearDownPod() will not be called later by
garbage collection even though networking was configured, because
dockershim can't know how far SetUpPod() got.
Concrete examples include if the sandbox's container is somehow
removed during during that time, or another OS error is encountered,
or the plugin returns a malformed result to the CNI driver.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1532965
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix hpa-use-rest-clients help text
The help text erroneously says "WARNING: alpha feature" when it
shouldn't have. When we moved to beta, this should have been removed.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix qosReserved json tag (lowercase qos, instead of uppercase QOS)
The API conventions specify that json keys should start with a lowercase
character, and if the key starts with an initialism, all characters in
the initialism should be lowercase. See `tlsCipherSuites` as an example.
API Conventions:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md
>All letters in the acronym should have the same case, using the
>appropriate case for the situation. For example, at the beginning
>of a field name, the acronym should be all lowercase, such as "httpGet".
Follow up to: https://github.com/kubernetes/kubernetes/pull/62925
```release-note
NONE
```
@sjenning @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix typo: mutating validating admission should be distinguished
1. fix typo: mutating validating admission should be distinguished
1. avoid calling admit.Handles twice in delete
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: add better test coverage to token.go
**What this PR does / why we need it**:
a PR for adding some more tests in `kubeadm/cmd` for `token.go`.
some areas of the `token.go` like listing, creating and deleting tokens can present challenges.
coverage was increased to around 87%.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
please, link issue # if you know of such.
**Special notes for your reviewer**:
none
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove confusing flexibility for metadata interpretation
Metadata accessors are coded in. This means that we don't need to inject flexibility, the flexibility is already present based on what your code relies up. This removes the per-individual resource injection which simplifies all calling code.
intersection of @kubernetes/sig-api-machinery-pr-reviews @kubernetes/sig-cli-maintainers
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove METADATA_AGENT_VERSION config option
**What this PR does / why we need it**:
Remove METADATA_AGENT_VERSION configuration option. To keep Metadata Agent version consistent across Kubernetes deployments.
**Release note**:
```release-note
Remove METADATA_AGENT_VERSION configuration option.
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use BootID instead of ExternalID to check for new instance
PR #60692 changed the way that ExternalID is reported on GCE. Its value
is no longer the GCE instance ID. It is the instance name. So it
cannot be used to determine VM uniqueness across time. Instead,
upgrade will check that the boot ID changed.
**What this PR does / why we need it**:
Node upgrades stall out because the external ID remains the same across upgrades now.
**Which issue(s) this PR fixes**:
Fixes#62713
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gce: move etcd dir cleanup to manifests
we deploy it as a manifest, not an addon so locate it with the other
master manifests.
This is the last "bare pod addon", which needs to be remove to improve the situation in https://github.com/kubernetes/kubernetes/issues/62808.
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
authz: nodes should not be able to delete themselves
@kubernetes/sig-auth-pr-reviews
```release-note
kubelets are no longer allowed to delete their own Node API object. Prior to 1.11, in rare circumstances related to cloudprovider node ID changes, kubelets would attempt to delete/recreate their Node object at startup. If a legacy kubelet encounters this situation, a cluster admin can remove the Node object:
* `kubectl delete node/<nodeName>`
or grant self-deletion permission explicitly:
* `kubectl create clusterrole self-deleting-nodes --verb=delete --resource=nodes`
* `kubectl create clusterrolebinding self-deleting-nodes --clusterrole=self-deleting-nodes --group=system:nodes`
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix some bugs inside CSI volume plugin unit test TestAttacherMountDevice
**What this PR does / why we need it**:
Fix two bugs inside CSI volume plugin unit test `TestAttacherMountDevice`.
**Release note**:
```release-note
None
```
PR opencontainers/runc#1754 works around an issue in manager.Apply(-1) that
makes Kubelet startup hang when using systemd cgroup driver (by adding a
timeout) and further PR opencontainers/runc#1772 fixes that bug by
checking the proper error status before waiting on the channel.
PR opencontainers/runc#1776 checks whether Delegate works in slices,
which keeps libcontainer systemd cgroup driver working on systemd v237+.
PR opencontainers/runc#1781 makes the channel buffered, so if we time
out waiting on the channel, the updater will not block trying to it
since there are no longer any consumers.
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: Make kube-proxy tolerate all taints
**What this PR does / why we need it**:
As a essential core component, kube-proxy should generally run on all
nodes even if the cluster operator taints nodes for special purposes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixeskubernetes/kubeadm#699
**Release note**:
```release-note
kubeadm creates kube-proxy with a toleration to run on all nodes, no matter the taint.
```
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
reset resultRun to 0 on pod restart
**What this PR does / why we need it**:
The resultRun should be reset to 0 on pod restart, so that resultRun on the first failure of the new container will be 1, which is correct. Otherwise, the actual FailureThreshold after restarting will be `FailureThreshold - 1`.
**Which issue(s) this PR fixes**:
This PR is related to https://github.com/kubernetes/kubernetes/issues/53530. https://github.com/kubernetes/kubernetes/pull/46371 fixed that issue but there's still a little problem like what I said above.
**Special notes for your reviewer**:
**Release note**:
```release-note
fix resultRun by resetting it to 0 on pod restart
```
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: Mount additional paths inside apiserver/controller-manager for working CA root
This is required for a working CA root, as /etc/ssl/certs on a few
Linux distributions just contains a bunch of symlinks.
Container Linux and Debian have symlinks pointing to
/usr/share/ca-certificates, ArchLinux has symlinks pointing
to /etc/ca-certificates.
On Debian /etc/ssl/certs can also include symlinks pointing
to /usr/local/share/ca-certificates for local CA certificates.
Fix: kubeadm/#671
---
**What this PR does / why we need it**:
Without this PR, `controller-manager` and `apiserver` would lack a CA root on some Linux distro (ex: Container Linux) which for example break flexplugins which require a CA root [[1]](https://github.com/kubernetes-incubator/external-storage/issues/571#issuecomment-360155462).
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubeadm/issues/671
**Special notes for your reviewer**:
**Release note**:
```release-note
Mount additional paths required for a working CA root, for setups where /etc/ssl/certs doesn't contains certificates but just symlink.
```
/sig sig-kubeadm
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Implemented truncating audit backend
Fixes https://github.com/kubernetes/kubernetes/issues/60432
Introduces an optional truncating backend, disabled by default, that estimates the size of audit events and truncates events/split batches based on the configuration.
/cc @sttts @tallclair @CaoShuFeng @ericchiang
```release-note
Introduce truncating audit backend that can be enabled for existing backend to limit the size of individual audit events and batches of events.
```
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Modify the kubeadm upgrade DAG for the TLS Upgrade
**What this PR does / why we need it**:
This adds the necessary utilities to detect Etcd TLS on static pods from the file system and query Etcd.
It modifies the upgrade logic to make it support the APIServer downtime.
Tests are included and should be passing.
```bash
bazel test //cmd/kubeadm/... \
&& bazel build //cmd/kubeadm --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 \
&& issue=TLSUpgrade ~/Repos/vagrant-kubeadm-testing/copy_kubeadm_bin.sh
```
These cases are working consistently for me
```bash
kubeadm-1.9.6 reset \
&& kubeadm-1.9.6 init --kubernetes-version 1.9.1 \
&& kubectl apply -f https://git.io/weave-kube-1.6
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.6 # non-TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.0 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.1 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.1 # TLS to TLS /w major version downgrade
```
This branch is based on top of #61942, as resolving the hash race condition is necessary for consistent behavior.
It looks to fit in pretty well with @craigtracey's PR: #62141
The interfaces are pretty similar
/assign @detiber @timothysc
**Which issue(s) this PR fixes**
Helps with https://github.com/kubernetes/kubeadm/issues/740
**Special notes for your reviewer**:
278b322a1c
[kubeadm] Implement ReadStaticPodFromDisk
c74b56372d
Implement etcdutils with Cluster.HasTLS()
- Test HasTLS()
- Instrument throughout upgrade plan and apply
- Update plan_test and apply_test to use new fake Cluster interfaces
- Add descriptions to upgrade range test
- Support KubernetesDir and EtcdDataDir in upgrade tests
- Cover etcdUpgrade in upgrade tests
- Cover upcoming TLSUpgrade in upgrade tests
8d8e5fe33b
Update test-case, fix nil-pointer bug, and improve error message
97117fa873
Modify the kubeadm upgrade DAG for the TLS Upgrade
- Calculate `beforePodHashMap` before the etcd upgrade in anticipation of
KubeAPIServer downtime
- Detect if pre-upgrade etcd static pod cluster `HasTLS()==false` to switch
on the Etcd TLS Upgrade if TLS Upgrade:
- Skip L7 Etcd check (could implement a waiter for this)
- Skip data rollback on etcd upgrade failure due to lack of L7 check
(APIServer is already down unable to serve new requests)
- On APIServer upgrade failure, also rollback the etcd manifest to
maintain protocol compatibility
- Add logging
**Release note**:
```release-note
kubeadm upgrade no longer races leading to unexpected upgrade behavior on pod restarts
kubeadm upgrade now successfully upgrades etcd and the controlplane to use TLS
kubeadm upgrade now supports external etcd setups
kubeadm upgrade can now rollback and restore etcd after an upgrade failure
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
unpack dynamic kubelet config payloads to files
This PR unpacks the downloaded ConfigMap to a set of files on the node.
This enables other config files to ride alongside the
KubeletConfiguration, and the KubeletConfiguration to refer to these
cohabitants with relative paths.
This PR also stops storing dynamic config metadata (e.g. current,
last-known-good config records) in the same directory as config
checkpoints. Instead, it splits the storage into `meta` and
`checkpoints` dirs.
The current store dir structure is as follows:
```
- dir named by --dynamic-config-dir (root for managing dynamic config)
| - meta (dir for metadata, e.g. which config source is currently assigned, last-known-good)
| - current (a serialized v1 NodeConfigSource object, indicating the assigned config)
| - last-known-good (a serialized v1 NodeConfigSource object, indicating the last-known-good config)
| - checkpoints (dir for config checkpoints)
| - uid1 (dir for unpacked config, identified by uid1)
| - file1
| - file2
| - ...
| - uid2
| - ...
```
There are some likely changes to the above structure before dynamic config goes beta, such as renaming "current" to "assigned" for clarity, and extending the checkpoint identifier to include a resource version, as part of resolving #61643.
```release-note
NONE
```
/cc @luxas @smarterclayton
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add --ipvs-exclude-cidrs flag to kube-proxy.
**What this PR does / why we need it**:
Add a flag to kube-proxy called --ipvs-exclude-cidrs. This flag allows a user to specify a list of CIDR ranges that should not be included in the cleanup of IPVS rules.
Fixes: #59507
**Release note**:
```
Use --ipvs-exclude-cidrs to specify a list of CIDR's which the IPVS proxier should not touch when cleaning up IPVS rules.
```
/assign @m1093782566
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Make integration test etcd store unique
Ensures each test gets a distinct spot in etcd to avoid cross-test pollution
Fixes flakes seen in the etcd_storage_path_test
```release-note
NONE
```
If the CNI network plugin completes successfully, but something fails
between that success and dockerhsim's sandbox setup code, plugin resources
may not be cleaned up. A non-trivial amount of code runs after the
plugin itself exits and the CNI driver's SetUpPod() returns, and any error
condition recognized by that code would cause this leakage.
The Kubernetes CRI RunPodSandbox() request does not attempt to clean
up on errors, since it cannot know how much (if any) networking
was actually set up. It depends on the CRI implementation to do
that cleanup for it.
In the dockershim case, a SetUpPod() failure means networkReady is
FALSE for the sandbox, and TearDownPod() will not be called later by
garbage collection even though networking was configured, because
dockershim can't know how far SetUpPod() got.
Concrete examples include if the sandbox's container is somehow
removed during during that time, or another OS error is encountered,
or the plugin returns a malformed result to the CNI driver.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1532965
The API conventions specify that json keys should start with a lowercase
character, and if the key starts with an initialism, all characters in
the initialism should be lowercase. See `tlsCipherSuites` as an example.
API Conventions:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md
>All letters in the acronym should have the same case, using the
>appropriate case for the situation. For example, at the beginning
>of a field name, the acronym should be all lowercase, such as "httpGet".
Fix `rollbackEtcdData()` to return error=nil on success
`rollbackEtcdData()` used to always return an error making the rest of the
upgrade code completely unreachable.
Ignore errors from `rollbackOldManifests()` during the rollback since it
always returns an error.
Success of the rollback is gated with etcd L7 healthchecks.
Remove logic implying the etcd manifest should be rolled back when
`upgradeComponent()` fails
- Calculate `beforePodHashMap` before the etcd upgrade in anticipation of KubeAPIServer downtime
- Detect if pre-upgrade etcd static pod cluster `HasTLS()==false` to switch on the Etcd TLS Upgrade
if TLS Upgrade:
- Skip L7 Etcd check (could implement a waiter for this)
- Skip data rollback on etcd upgrade failure due to lack of L7 check (APIServer is already down unable to serve new requests)
- On APIServer upgrade failure, also rollback the etcd manifest to maintain protocol compatibility
- Add logging