Automatic merge from submit-queue (batch tested with PRs 52376, 52439, 52382, 52358, 52372)
Pass correct clientbuilder to cloudproviders
Fixes https://github.com/kubernetes/kubeadm/issues/425 by moving the Initialize call to after the start of the token controller and passing `clientBuilder` instead of `rootClientBuilder` to the cloudproviders.
/assign @bowei
**Release note**:
```release-note
NONE
```
Should fix in 1.8 and cherrypick to 1.7
Automatic merge from submit-queue (batch tested with PRs 51601, 52153, 52364, 52362, 52342)
fix kubeadm token create error
**What this PR does / why we need it**:
fix kubeadm token create error
**Which issue this PR fixes**
[#436](https://github.com/kubernetes/kubeadm/issues/436)
**Special notes for your reviewer**:
CC @luxas
Automatic merge from submit-queue (batch tested with PRs 51601, 52153, 52364, 52362, 52342)
fix Kubeadm phase addon error
What this PR does / why we need it:
fix Kubeadm phase addon error
Which issue this PR fixes
[#437](https://github.com/kubernetes/kubeadm/issues/437)
Special notes for your reviewer:
CC @luxas @andrewrynhard
Automatic merge from submit-queue (batch tested with PRs 51601, 52153, 52364, 52362, 52342)
Improve kubeadm help text
* Replace 'misc' with more specific at-mentions bugs and feature-requests.
* Replace ReplicaSets with Deployments as example, because ReplicaSets are dated.
* Generalize join example.
Before:
```
┌──────────────────────────────────────────────────────────┐
│ KUBEADM IS BETA, DO NOT USE IT FOR PRODUCTION CLUSTERS! │
│ │
│ But, please try it out! Give us feedback at: │
│ https://github.com/kubernetes/kubeadm/issues │
│ and at-mention @kubernetes/sig-cluster-lifecycle-misc │
└──────────────────────────────────────────────────────────┘
Example usage:
Create a two-machine cluster with one master (which controls the cluster),
and one node (where your workloads, like Pods and ReplicaSets run).
┌──────────────────────────────────────────────────────────┐
│ On the first machine │
├──────────────────────────────────────────────────────────┤
│ master# kubeadm init │
└──────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────┐
│ On the second machine │
├──────────────────────────────────────────────────────────┤
│ node# kubeadm join --token=<token> <ip-of-master>:<port> │
└──────────────────────────────────────────────────────────┘
You can then repeat the second step on as many other machines as you like.
```
After (changes highlighted with `<--`):
```
┌──────────────────────────────────────────────────────────┐
│ KUBEADM IS BETA, DO NOT USE IT FOR PRODUCTION CLUSTERS! │
│ │
│ But, please try it out! Give us feedback at: │
│ https://github.com/kubernetes/kubeadm/issues │
│ and at-mention @kubernetes/sig-cluster-lifecycle-bugs │ <--
│ or @kubernetes/sig-cluster-lifecycle-feature-requests │ <--
└──────────────────────────────────────────────────────────┘
Example usage:
Create a two-machine cluster with one master (which controls the cluster),
and one node (where your workloads, like Pods and Deployments run). <--
┌──────────────────────────────────────────────────────────┐
│ On the first machine │
├──────────────────────────────────────────────────────────┤
│ master# kubeadm init │
└──────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────┐
│ On the second machine │
├──────────────────────────────────────────────────────────┤
│ node# kubeadm join <arguments-returned-from-init> │ <--
└──────────────────────────────────────────────────────────┘
You can then repeat the second step on as many other machines as you like.
```
cc @luxas
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)
Fixed CCM service controller start jitter
**What this PR does / why we need it**: The start jitter for the service controller was running regardless if the service controller was being ran. This should help startup time for CCM's without the service controller implementation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/cc @wlan0 @andrewsykim @luxas @jhorwit2
/area cloudprovider
/sig cluster-lifecycle
Automatic merge from submit-queue (batch tested with PRs 52119, 52306)
kubeadm: Mark self-hosting alpha in v1.8
**What this PR does / why we need it**:
Self-hosting is alpha in v1.8, not beta. We targeted it to be beta, hence the initial add of this feature gates' value, but now changing back to alpha.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
@kubernetes/sig-cluster-lifecycle-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50289, 52106)
Honor --use-service-account-credentials in cloud-controller-manager
If --use-service-account-credentials is specified, the cloud controller manager should honor it
The distinction between the rootclientbuilder and the clientbuilder came from kube-controller-manager, which is responsible for running the very controllers that enable service accounts. That two-layer approach is not needed in the cloud-controller-manager.
```release-note
The `kube-cloud-controller-manager` flag `--service-account-private-key-file` was non-functional and is now deprecated.
The `kube-cloud-controller-manager` flag `--use-service-account-credentials` is now honored consistently, regardless of whether `--service-account-private-key-file` was specified.
```
Automatic merge from submit-queue (batch tested with PRs 50949, 52155, 52175, 52112, 52188)
kubeadm: Perform TLS Bootstrapping in kubeadm join for v1.7 kubelets
**What this PR does / why we need it**:
Partially reverts 9dc3a661d7
Performs the TLS Bootstrap if `kubeadm join` v1.8 is executed on a node with a kubelet v1.7.
Since the kubelet arguments for v1.7 (from the kubeadm dropin) expects a TLS bootstrapped kubeconfig, we still have to provide this functionality in kubeadm CLI v1.8 (as we support one minor version down)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes: https://github.com/kubernetes/kubeadm/issues/429
**Special notes for your reviewer**:
This is a required bug fix for v1.8
**Release note**:
```release-note
NONE
```
@kubernetes/sig-cluster-lifecycle-pr-reviews
Currently setting watch cache size for a given resource does not disable
the watch cache. This commit adds a new `default-watch-cache-size` flag
to map to the existing field, and refactors how watch cache sizes are
calculated to bring all of the code into one place. It also adds debug
logging to startup to allow us to verify watch cache enablement in
production.
Automatic merge from submit-queue
kubeadm: add `kubeadm phase addons` command
**What this PR does / why we need it**:
Adds the `addons` phase command to `kubeadm`
fixes: https://github.com/kubernetes/kubeadm/issues/418
/cc @luxas
Automatic merge from submit-queue (batch tested with PRs 51728, 49202)
Enable CRI-O stats from cAdvisor
**What this PR does / why we need it**:
cAdvisor may support multiple container runtimes (docker, rkt, cri-o, systemd, etc.)
As long as the kubelet continues to run cAdvisor, runtimes with native cAdvisor support may not want to run multiple monitoring agents to avoid performance regression in production. Pending kubelet running a more light-weight monitoring solution, this PR allows remote runtimes to have their stats pulled from cAdvisor when cAdvisor is registered stats provider by introspection of the runtime endpoint.
See issue https://github.com/kubernetes/kubernetes/issues/51798
**Special notes for your reviewer**:
cAdvisor will be bumped to pick up https://github.com/google/cadvisor/pull/1741
At that time, CRI-O will support fetching stats from cAdvisor.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51956, 50708)
Move autoscaling/v2 from alpha1 to beta1
This graduates autoscaling/v2alpha1 to autoscaling/v2beta1. The move is more-or-less just a straightforward rename.
Part of kubernetes/features#117
```release-note
v2 of the autoscaling API group, including improvements to the HorizontalPodAutoscaler, has moved from alpha1 to beta1.
```
Automatic merge from submit-queue (batch tested with PRs 51956, 50708)
kubeadm: Upgrade Bootstrap Tokens to beta when upgrading to v1.8
**What this PR does / why we need it**:
Makes sure the v1.7 -> v1.8 upgrade works regarding the Bootstrap Token alpha -> beta graduation.
Not much have to be done, but some LoC are needed to preserve the behaivor
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
@kubernetes/sig-cluster-lifecycle-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 51603, 51653)
Graduate metrics/v1alpha1 to v1beta1
This introduces v1beta1 of the resource metrics API, previously in alpha.
The v1alpha1 version remains for compatibility with the Heapster legacy version
of the resource metrics API, which is compatible with the v1alpha1 version. It also
renames the v1beta1 version to `resource-metrics.metrics.k8s.io`.
The HPA controller's REST clients (but not the legacy client) have been migrated as well.
Part of kubernetes/features#118.
```release-note
Migrate the metrics/v1alpha1 API to metrics/v1beta1. The HorizontalPodAutoscaler
controller REST client now uses that version. For v1beta1, the API is now known as
resource-metrics.metrics.k8s.io.
```
Automatic merge from submit-queue
Improve APIService auto-registration for HA/upgrade scenarios
Fixes#51912
Required for 1.8 due to impact on HA upgrades.
/assign @deads2k
cc @kubernetes/sig-api-machinery-bugs
```release-note
Fixes an issue with APIService auto-registration affecting rolling HA apiserver restarts that add or remove API groups being served.
```
Automatic merge from submit-queue (batch tested with PRs 51984, 51351, 51873, 51795, 51634)
Revert to using isolated PID namespaces in Docker
**What this PR does / why we need it**: Reverts to the previous docker default of using isolated PID namespaces for containers in a pod. There exist container images that expect always to be PID 1 which we want to support unmodified in 1.8.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48937
**Special notes for your reviewer**:
**Release note**:
```release-note
Sharing a PID namespace between containers in a pod is disabled by default in 1.8. To enable for a node, use the --docker-disable-shared-pid=false kubelet flag. Note that PID namespace sharing requires docker >= 1.13.1.
```
Introduce feature gate for expanding PVs
Add a field to SC
Add new Conditions and feature tag pvc update
Add tests for size update via feature gate
register the resize admission plugin
Update golint failures
Automatic merge from submit-queue
Enable batch/v1beta1.CronJobs by default
This PR re-applies the cronjobs->beta back (https://github.com/kubernetes/kubernetes/pull/51720) with the fix from @shyamjvs.
Fixes#51692
@apelisse @dchen1107 @smarterclayton ptal
@janetkuo @erictune fyi
Automatic merge from submit-queue (batch tested with PRs 51682, 51546, 51369, 50924, 51827)
kubeadm: Detect kubelet readiness and error out if the kubelet is unhealthy
**What this PR does / why we need it**:
In order to improve the UX when the kubelet is unhealthy or stopped, or whatever, kubeadm now polls the kubelet's API after 40 and 60 seconds, and then performs an exponential backoff for a total of 155 seconds.
If the kubelet endpoint is not returning `ok` by then, kubeadm gives up and exits.
This will miligate at least 60% of our "[apiclient] Created API client, waiting for control plane to come up" issues in the kubeadm issue tracker 🎉, as kubeadm now informs the user what's wrong and also doesn't deadlock like before.
Demo:
```
lucas@THEGOPHER:~/luxas/kubernetes$ sudo ./kubeadm init --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.4
[init] Using Authorization modes: [Node RBAC]
[preflight] Skipping pre-flight checks
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [thegopher kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.115]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] This often takes around a minute; or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 40.502199 seconds
[markmaster] Will mark node thegopher as master by adding a label and a taint
[markmaster] Master thegopher tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: 5776d5.91e7ed14f9e274df
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token 5776d5.91e7ed14f9e274df 192.168.1.115:6443 --discovery-token-ca-cert-hash sha256:6f301ce8c3f5f6558090b2c3599d26d6fc94ffa3c3565ffac952f4f0c7a9b2a9
lucas@THEGOPHER:~/luxas/kubernetes$ sudo ./kubeadm reset
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes /var/lib/etcd]
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
lucas@THEGOPHER:~/luxas/kubernetes$ sudo systemctl stop kubelet
lucas@THEGOPHER:~/luxas/kubernetes$ sudo ./kubeadm init --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.4
[init] Using Authorization modes: [Node RBAC]
[preflight] Skipping pre-flight checks
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [thegopher kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.115]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] This often takes around a minute; or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by that:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
- There is no internet connection; so the kubelet can't pull the following control plane images:
- gcr.io/google_containers/kube-apiserver-amd64:v1.7.4
- gcr.io/google_containers/kube-controller-manager-amd64:v1.7.4
- gcr.io/google_containers/kube-scheduler-amd64:v1.7.4
You can troubleshoot this for example with the following commands if you're on a systemd-powered system:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
couldn't initialize a Kubernetes cluster
```
In this demo, I'm first starting kubeadm normally and everything works as usual.
In the second case, I'm explicitely stopping the kubelet so it doesn't run, and skipping preflight checks, so that kubeadm doesn't even try to exec `systemctl start kubelet` like it does usually.
That obviously results in a non-working system, but now kubeadm tells the user what's the problem instead of waiting forever.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes: https://github.com/kubernetes/kubeadm/issues/377
**Special notes for your reviewer**:
**Release note**:
```release-note
kubeadm: Detect kubelet readiness and error out if the kubelet is unhealthy
```
@kubernetes/sig-cluster-lifecycle-pr-reviews @pipejakob
cc @justinsb @kris-nova @lukemarsden as well as you wanted this feature :)
Automatic merge from submit-queue (batch tested with PRs 51805, 51725, 50925, 51474, 51638)
Limit events accepted by API Server
**What this PR does / why we need it**:
This PR adds the ability to limit events processed by an API server. Limits can be set globally on a server, per-namespace, per-user, and per-source+object. This is needed to prevent badly-configured or misbehaving players from making a cluster unstable.
Please see https://github.com/kubernetes/community/pull/945.
**Release Note:**
```release-note
Adds a new alpha EventRateLimit admission control that is used to limit the number of event queries that are accepted by the API Server.
```