Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
apiserver: master count and lease endpoint test
**What this PR does / why we need it**: Adds a test to make sure master count and lease endpoint reconcilers work well together, so we can bump LeaseEndpoint to beta. Based on Jordan's comment https://github.com/kubernetes/kubernetes/pull/58474#issuecomment-369954890.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Issue: #57617
Followup PR: #58474
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/cc @kubernetes/sig-cluster-lifecycle-api-reviews @kubernetes/sig-cluster-lifecycle-api-reviews
Automatic merge from submit-queue (batch tested with PRs 63251, 59166, 63250, 63180, 63169). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Break a generic package dependency to core/api/v1
It is not necessary for this package to depend on core/v1.
https://github.com/kubernetes/kubernetes/pull/62913 switched from using a client pool, where each groupVersionResource got its own rest client, to a single client.
This increases the QPS to account for increased requests using a single rest client rate limiter.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove KUBE_API_VERSIONS
Fixes https://github.com/kubernetes/kubernetes/issues/63102
KUBE_API_VERSIONS is an attempt to control the available serialization of types. It pre-dates the idea that we'll have separate schemes, so it's not a thing that makes sense anymore.
Server-side we've had a very clear message about breaks in the logs for a year "KUBE_API_VERSIONS is only for testing. Things will break.".
Client-side it became progressively more broken as we moved to generic types for CRUD more than a year ago. What is registered doesn't matter when everything is unstructured.
We should remove this piece of legacy since it doesn't behave predictable server-side or client-side.
@smarterclayton @lavalamp
@kubernetes/sig-api-machinery-bugs
```release-note
KUBE_API_VERSIONS is no longer respected. It was used for testing, but runtime-config is the proper flag to set.
```
Automatic merge from submit-queue (batch tested with PRs 59965, 59115, 63076, 63059). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Prepull etcd before an upgrade
If kubeadm ever has to upgrade etcd it should prepull the image so
there is less downtime during the upgrade when etcd versions change.
Fixeskubernetes/kubeadm#669
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
**What this PR does / why we need it**:
This PR Prepulls the etcd image during a `kubeadm upgrade apply`.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixeskubernetes/kubeadm#669
**Special notes for your reviewer**:
constants.MasterComponents was not changed because it is used in many places where etcd does not need to be nor should it be a part of this slice.
**Release note**:
```release-note
NONE
```
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 59965, 59115, 63076, 63059). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: prompt for confirmation when resetting a master
Signed-off-by: Alexander Brand <alexbrand09@gmail.com>
**What this PR does / why we need it**:
This PR implements a confirmation prompt when running `kubeadm reset` on a master node. This is to prevent users from mistakenly resetting a master node.
**Which issue(s) this PR fixes**:
Fixes https://github.com/kubernetes/kubeadm/issues/673
**Special notes for your reviewer**:
I am somewhat torn on the approach on how to detect that kubeadm is running on a master node. I went with checking for the apiserver manfiest file on the local filesystem, as it seems like a simpler approach when compared to getting a k8s client, getting a list of nodes, finding the current node, and checking if it has the master taint. I am happy to rework if the latter is more desirable.
Sample runs:
```
# ./kubeadm reset
[Warning] Are you sure you want to reset this master node? Type the word "confirm" to continue: no
Aborted reset operation on master node
# ./kubeadm reset
[Warning] Are you sure you want to reset this master node? Type the word "confirm" to continue: confirm
[preflight] Running pre-flight checks.
[reset] Stopping the kubelet service.
[reset] WARNING: The kubelet service could not be stopped by kubeadm: [exit status 1]
[reset] WARNING: Please ensure kubelet is stopped manually.
[reset] Unmounting mounted directories in "/var/lib/kubelet"
........
# ./kubeadm reset
[Warning] Are you sure you want to reset this master node? Type the word "confirm" to continue:
Aborted reset operation on master node
# ./kubeadm reset --confirm
[preflight] Running pre-flight checks.
[reset] Stopping the kubelet service.
[reset] WARNING: The kubelet service could not be stopped by kubeadm: [exit status 1]
[reset] WARNING: Please ensure kubelet is stopped manually.
[reset] Unmounting mounted directories in "/var/lib/kubelet"
........
```
**Release note**:
```release-note
kubeadm: prompt the user for confirmation when resetting a master node
```
Automatic merge from submit-queue (batch tested with PRs 59965, 59115, 63076, 63059). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix help message of kubeconfig-dir option(kubeadm alpha phase kubeconfig all)
**What this PR does / why we need it**:
This patch will fix wrong help message.
The command is kubeadm alpha phase kubeconfig [command]
The help message is for --kubeconfig-dir option.
kubeconfig-dir is not port.(It is directory)
So, I fixed the message.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
`NONE`
/sig cluster-lifecycle
Automatic merge from submit-queue (batch tested with PRs 61601, 62881, 63159). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
support simultaneous kubeadm --v and --config
**What this PR does / why we need it**:
Providing -v and --config parameters to increase verbosity while providing a kubeadm.config results in an error rather than providing the requested verbosity.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubeadm/issues/765
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 61601, 62881, 63159). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: add test coverage to join.go
**What this PR does / why we need it**:
Add test coverage to `join.go`.
A separate commit exports the function `ValidateJoinCommandLine()` from `join.go` so that testing this file is more flexible.
Test coverage is at 76%. One untested part is successfully running `Join.Run()` without errors, but that requires a valid HTTPS API server running and a valid config. i got this partially working but gave up because i faced some cert / config blockers. suggestions on how to get that to work easily are welcome.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62982, 63075, 63067, 62877, 63141). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
eliminate indirection from type registration
Some years back there was a partial attempt to revamp api type registration, but the effort was never completed and this was before we started splitting schemes. With separate schemes, the idea of partial registration no longer makes sense. This pull starts removing cruft from the registration process and pulls out a layer of indirection that isn't needed.
@kubernetes/sig-api-machinery-pr-reviews
@lavalamp @cheftako @sttts @smarterclayton
Rebase cost is fairly high, so I'd like to avoid this lingering.
/assign @sttts
/assign @cheftako
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62982, 63075, 63067, 62877, 63141). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add warnings on using pod-infra-container-image for remote container runtime
**What this PR does / why we need it**:
We should warn on using `--pod-infra-container-image` to avoid confusions, when users are using remote container runtime.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55676,#62388,#62732
**Special notes for your reviewer**:
/cc @kubernetes/sig-node-pr-reviews
**Release note**:
```release-note
add warnings on using pod-infra-container-image for remote container runtime
```
Instead of using kubeadmutil.CheckErr() in every single
phase of cmd.Run(), use a new helper function
NewValidJoin() that returns a single error.
This would improve the unit testing options for this file.
Otherwise any error in cmd.Run() will trigger an os.Exit()
as kubeadmutil.CheckErr() does that.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support nsenter in non-systemd environments
**What this PR does / why we need it**:
In our CI, we run kubekins image for most of the jobs. This is a
debian image with upstart and does not enable systemd. So we should
* Bailout if any binary is missing other than systemd-run.
* SupportsSystemd should check the binary path to correctly
identify if the systemd-run is present or not
* Pass the errors back to the callers so kubelet is forced to
fail early when there is a problem. We currently assume
that all binaries are in the root directory by default which
is wrong.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix hpa-use-rest-clients help text
The help text erroneously says "WARNING: alpha feature" when it
shouldn't have. When we moved to beta, this should have been removed.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: add better test coverage to token.go
**What this PR does / why we need it**:
a PR for adding some more tests in `kubeadm/cmd` for `token.go`.
some areas of the `token.go` like listing, creating and deleting tokens can present challenges.
coverage was increased to around 87%.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
please, link issue # if you know of such.
**Special notes for your reviewer**:
none
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: Make kube-proxy tolerate all taints
**What this PR does / why we need it**:
As a essential core component, kube-proxy should generally run on all
nodes even if the cluster operator taints nodes for special purposes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixeskubernetes/kubeadm#699
**Release note**:
```release-note
kubeadm creates kube-proxy with a toleration to run on all nodes, no matter the taint.
```
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: Mount additional paths inside apiserver/controller-manager for working CA root
This is required for a working CA root, as /etc/ssl/certs on a few
Linux distributions just contains a bunch of symlinks.
Container Linux and Debian have symlinks pointing to
/usr/share/ca-certificates, ArchLinux has symlinks pointing
to /etc/ca-certificates.
On Debian /etc/ssl/certs can also include symlinks pointing
to /usr/local/share/ca-certificates for local CA certificates.
Fix: kubeadm/#671
---
**What this PR does / why we need it**:
Without this PR, `controller-manager` and `apiserver` would lack a CA root on some Linux distro (ex: Container Linux) which for example break flexplugins which require a CA root [[1]](https://github.com/kubernetes-incubator/external-storage/issues/571#issuecomment-360155462).
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubeadm/issues/671
**Special notes for your reviewer**:
**Release note**:
```release-note
Mount additional paths required for a working CA root, for setups where /etc/ssl/certs doesn't contains certificates but just symlink.
```
/sig sig-kubeadm
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Implemented truncating audit backend
Fixes https://github.com/kubernetes/kubernetes/issues/60432
Introduces an optional truncating backend, disabled by default, that estimates the size of audit events and truncates events/split batches based on the configuration.
/cc @sttts @tallclair @CaoShuFeng @ericchiang
```release-note
Introduce truncating audit backend that can be enabled for existing backend to limit the size of individual audit events and batches of events.
```
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Modify the kubeadm upgrade DAG for the TLS Upgrade
**What this PR does / why we need it**:
This adds the necessary utilities to detect Etcd TLS on static pods from the file system and query Etcd.
It modifies the upgrade logic to make it support the APIServer downtime.
Tests are included and should be passing.
```bash
bazel test //cmd/kubeadm/... \
&& bazel build //cmd/kubeadm --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 \
&& issue=TLSUpgrade ~/Repos/vagrant-kubeadm-testing/copy_kubeadm_bin.sh
```
These cases are working consistently for me
```bash
kubeadm-1.9.6 reset \
&& kubeadm-1.9.6 init --kubernetes-version 1.9.1 \
&& kubectl apply -f https://git.io/weave-kube-1.6
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.6 # non-TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.0 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.1 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.1 # TLS to TLS /w major version downgrade
```
This branch is based on top of #61942, as resolving the hash race condition is necessary for consistent behavior.
It looks to fit in pretty well with @craigtracey's PR: #62141
The interfaces are pretty similar
/assign @detiber @timothysc
**Which issue(s) this PR fixes**
Helps with https://github.com/kubernetes/kubeadm/issues/740
**Special notes for your reviewer**:
278b322a1c
[kubeadm] Implement ReadStaticPodFromDisk
c74b56372d
Implement etcdutils with Cluster.HasTLS()
- Test HasTLS()
- Instrument throughout upgrade plan and apply
- Update plan_test and apply_test to use new fake Cluster interfaces
- Add descriptions to upgrade range test
- Support KubernetesDir and EtcdDataDir in upgrade tests
- Cover etcdUpgrade in upgrade tests
- Cover upcoming TLSUpgrade in upgrade tests
8d8e5fe33b
Update test-case, fix nil-pointer bug, and improve error message
97117fa873
Modify the kubeadm upgrade DAG for the TLS Upgrade
- Calculate `beforePodHashMap` before the etcd upgrade in anticipation of
KubeAPIServer downtime
- Detect if pre-upgrade etcd static pod cluster `HasTLS()==false` to switch
on the Etcd TLS Upgrade if TLS Upgrade:
- Skip L7 Etcd check (could implement a waiter for this)
- Skip data rollback on etcd upgrade failure due to lack of L7 check
(APIServer is already down unable to serve new requests)
- On APIServer upgrade failure, also rollback the etcd manifest to
maintain protocol compatibility
- Add logging
**Release note**:
```release-note
kubeadm upgrade no longer races leading to unexpected upgrade behavior on pod restarts
kubeadm upgrade now successfully upgrades etcd and the controlplane to use TLS
kubeadm upgrade now supports external etcd setups
kubeadm upgrade can now rollback and restore etcd after an upgrade failure
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add --ipvs-exclude-cidrs flag to kube-proxy.
**What this PR does / why we need it**:
Add a flag to kube-proxy called --ipvs-exclude-cidrs. This flag allows a user to specify a list of CIDR ranges that should not be included in the cleanup of IPVS rules.
Fixes: #59507
**Release note**:
```
Use --ipvs-exclude-cidrs to specify a list of CIDR's which the IPVS proxier should not touch when cleaning up IPVS rules.
```
/assign @m1093782566
Fix `rollbackEtcdData()` to return error=nil on success
`rollbackEtcdData()` used to always return an error making the rest of the
upgrade code completely unreachable.
Ignore errors from `rollbackOldManifests()` during the rollback since it
always returns an error.
Success of the rollback is gated with etcd L7 healthchecks.
Remove logic implying the etcd manifest should be rolled back when
`upgradeComponent()` fails
- Calculate `beforePodHashMap` before the etcd upgrade in anticipation of KubeAPIServer downtime
- Detect if pre-upgrade etcd static pod cluster `HasTLS()==false` to switch on the Etcd TLS Upgrade
if TLS Upgrade:
- Skip L7 Etcd check (could implement a waiter for this)
- Skip data rollback on etcd upgrade failure due to lack of L7 check (APIServer is already down unable to serve new requests)
- On APIServer upgrade failure, also rollback the etcd manifest to maintain protocol compatibility
- Add logging
- Test HasTLS()
- Instrument throughout upgrade plan and apply
- Update plan_test and apply_test to use new fake Cluster interfaces
- Add descriptions to upgrade range test
- Support KubernetesDir and EtcdDataDir in upgrade tests
- Cover etcdUpgrade in upgrade tests
- Cover upcoming TLSUpgrade in upgrade tests
If kubeadm ever has to upgrade etcd it should prepull the image so
there is less downtime during the upgrade when etcd versions change.
Fixeskubernetes/kubeadm#669
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
Automatic merge from submit-queue (batch tested with PRs 62495, 63003, 62829, 62151, 62002). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix scheduler informers to receive events for all the pods in the cluster
**What this PR does / why we need it**:
This PR has an important change to fix scheduler informers. More information in #63002.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#63002
**Special notes for your reviewer**:
This should be back-ported to 1.10 and 1.9.
**Release note**:
```release-note
Fix scheduler informers to receive events for all the pods in the cluster.
```
Automatic merge from submit-queue (batch tested with PRs 62464, 62947). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
make API.ControlPlaneEndpoint accept IP
**What this PR does / why we need it**:
This PR implements one of the actions defined by https://github.com/kubernetes/kubeadm/issues/751 (checklist form implementing HA in kubeadm).
With this PR, the `API.ControlPlaneEndpoint` value in the kubeadm MasterConfiguration file now accepts both DNS and IP.
The `API.ControlPlaneEndpoint` should be used to set a stable IP address for the control plane; in an HA configuration, this should be the load balancer address (no matter if identified by a DNS name or by a stable IP).
**Special notes for your reviewer**:
/CC @timothysc
This PR is the same of https://github.com/kubernetes/kubernetes/pull/62667, that I closed by error 😥
**Release note**:
```release-note
NONE
```
Nb. first https://github.com/kubernetes/kubernetes/pull/62667 already has the release note
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove request context mapper
http.Request now allows setting/retrieving a per-request context, which removes the need for plumbing a request-context mapper throughout the stack
In addition to being way simpler, this has the benefit of removing a potentially contentious lock object from the handling path
This PR:
* removes RequestContextMapper
* converts context fetchers to use `req.Context()`
* converts context setters to use `req = req.WithContext(...)`
* updates filter plumbing in two places (audit and timeout) to properly return the request with modified context
* updates tests that used a fake context mapper to set the context in the request instead
Fixes https://github.com/kubernetes/kubernetes/issues/62796
```release-note
NONE
```
In our CI, we run kubekins image for most of the jobs. This is a
debian image with upstart and does not enable systemd. So we should:
* Bailout if any binary is missing other than systemd-run.
* SupportsSystemd should check the binary path to correctly
identify if the systemd-run is present or not
* Pass the errors back to the callers so kubelet is forced to
fail early when there is a problem. We currently assume
that all binaries are in the root directory by default which
is wrong.