Automatic merge from submit-queue (batch tested with PRs 52259, 53951, 54385, 54805, 55145). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
COS: Keep the docker network checkpoint
This is necessary for enabling the live-restore feature.
**What this PR does / why we need it**:
This is necessary for enabling the live-restore feature on COS.
**Release note**:
```release-note
COS: Keep the docker network checkpoint
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Override recycler pod in GCE
**What this PR does / why we need it**:
Override the default nfs and hostpath recycler pod with the busybox image from gcr.io/google-containers. It does this by:
* writing out the new recycler pod spec to /home/kubernetes
* specifying recycler pod arguments to kube-controller-manager
* adding a hostpath volume to the recycler pod spec in the kube-controller-manager manifest
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix configuration of Metadata Agent daemon set
**What this PR does / why we need it**:
Fixes small errors in Stackdriver Metadata Agent configuration: port number and default version.
**Release note**:
```release-note
Fix port number and default version of Stackdriver Metadata Agent in daemon set configuration.
```
Automatic merge from submit-queue (batch tested with PRs 55360, 56444, 56687, 56791, 56802). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Configure metadata concealment iptables rules in node startup.
**What this PR does / why we need it**: Configure iptables rule for metadata concealment at startup so the pod doesn't have to, to reduce memory consumption.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add checking HPA_USE_REST_CLIENTS
Also check HPA_USE_REST_CLIENTS, in addition to ENABLE_METRICS_SERVER, when disabling the use of REST clients for HPA.
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add CoreDNS as an optional addon in kube-up
**What this PR does / why we need it**:
This PR adds the option of installing CoreDNS as an addon instead of kube-dns in kube-up.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56439
**Release note**:
```release-note
kube-up: Add optional addon CoreDNS.
Install CoreDNS instead of kube-dns by setting CLUSTER_DNS_CORE_DNS value to 'true'.
```
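As a rough illustration of the toggle described in the release note, the sketch below picks an addon name based on `CLUSTER_DNS_CORE_DNS`. Only that variable name comes from the release note; the function name `select_dns_addon` and the strings it prints are hypothetical and are not taken from the kube-up scripts.

```shell
# Hypothetical helper: decide which DNS addon to install.
# Only CLUSTER_DNS_CORE_DNS is taken from the release note; everything else
# here is an assumption made for illustration.
select_dns_addon() {
  if [ "${CLUSTER_DNS_CORE_DNS:-false}" = "true" ]; then
    echo "coredns"
  else
    echo "kube-dns"
  fi
}

echo "DNS addon: $(select_dns_addon)"
```

Leaving the variable unset keeps kube-dns, which matches the "optional addon" wording above.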
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add deployment for Stackdriver Metadata Agent with version and resource requirement controlled by env variable
**What this PR does / why we need it**:
Introduces Stackdriver Metadata Agent - a daemon set providing metadata for kubernetes objects connected to the same node.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55952, 49112, 55450, 56178, 56151). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add environment variable to enable support for new Stackdriver resource model
To be merged after #56211
**What this PR does / why we need it**:
This PR adds an env variable to control the Stackdriver sink in Heapster: whether it exports metrics for the new resource model or the old resource model.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 56207, 55950). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix setting resources in fluentd-gcp plugin
Currently, if some of the variables are not set, the script prints an error. The error is not critical, since the function is executed in a separate process, but it leads to wrong resulting values.
```release-note
NONE
```
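The failure mode described above can be reproduced in a few lines. This is a minimal sketch and not the actual fluentd-gcp code: the variable `FLUENTD_GCP_MEMORY_LIMIT`, the `300Mi` fallback, and both function names are hypothetical. It shows a subshell that fails on an unset variable while the caller carries on with an empty value, next to a variant that falls back to a default.

```shell
# Hypothetical names throughout; only the failure mode comes from the
# description above.

# Fails inside the subshell when the variable is unset and nounset is on.
render_limit_strict() {
  ( set -o nounset; echo "memory=${FLUENTD_GCP_MEMORY_LIMIT}" ) 2>/dev/null
}

# Falls back to an assumed default, so the caller always gets a usable value.
render_limit_safe() {
  ( set -o nounset; echo "memory=${FLUENTD_GCP_MEMORY_LIMIT:-300Mi}" )
}

unset FLUENTD_GCP_MEMORY_LIMIT
strict="$(render_limit_strict)" || true   # the parent keeps going with an empty value
safe="$(render_limit_safe)"
echo "strict='${strict}' safe='${safe}'"
```

Because the error is confined to the subshell, nothing stops the parent script, which is why the problem shows up only as a wrong value later on.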
/cc @piosz @x13n
/assign @roberthbailey @mikedanese
Could you please approve?
Automatic merge from submit-queue (batch tested with PRs 54316, 53400, 55933, 55786, 55794). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Be less aggressive and more patient when creating a large master.
**What this PR does / why we need it**:
Workaround for #55777
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54824, 55911, 55730, 55979, 55961). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add options for mounting SCSI or NVMe local SSD through Block or Filesystem and do all of that with UUID
Fixes: #51431
Fixed version of: #53466
Mount SCSI local SSD by UUID in /mnt/disks/by-uuid/, and also allow users to request and mount NVMe disks. Both types of disks will be accessible through either block or filesystem.
I have confirmed that it is no longer crashing when nodes are initialized on GKE.
Lack of this flag sometimes causes iptables to return error code 4 (if
another process holds the xtables lock). As a result, because of `set -o errexit`,
the whole startup script fails, leaving the master in an incorrect state.
This is another occurrence of (already closed) https://github.com/kubernetes/kubernetes/issues/7370
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add options for mounting SCSI or NVMe local SSD through Block or Filesystem and do all of that with UUID
Fixes: #51431
Mount SCSI local SSD by UUID in /mnt/disks/by-uuid/, and also allow users to request and mount NVMe disks. Both types of disks will be accessible through either block or filesystem.
To see code in progress for NVMe and block support see working branch: https://github.com/davidz627/kubernetes/tree/localExt
Automatic merge from submit-queue (batch tested with PRs 54602, 54877, 55243, 55509, 55128). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
PodSecurityPolicies for addons
**What this PR does / why we need it**:
1. Colocate addon PodSecurityPolicy config with the addons (in a `podsecuritypolicies` subdirectory).
2. Add policies for addons that are currently missing policies (not in the default GCE suite)
3. Remove HostPath SSL certs from several heapster deployments, so that heapster doesn't require a special PSP
**Which issue(s) this PR fixes**:
#43538
**Release note**:
```release-note
- Add PodSecurityPolicies for cluster addons
- Remove SSL cert HostPath volumes from heapster addons
```
Automatic merge from submit-queue (batch tested with PRs 54826, 53576, 55591, 54946, 54825). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Run nvidia-gpu device-plugin daemonset as an addon on GCE nodes that have nvidia GPUs attached
- Instead of the old `Accelerators` feature that added `alpha.kubernetes.io/nvidia-gpu` resource, use the new `DevicePlugins` feature that adds vendor specific resources. (In case of nvidia GPUs it will add `nvidia.com/gpu` resource.)
- Add node label to GCE nodes with accelerators attached. This node label is the same as what GKE attaches to node pools with accelerators attached. (For example, for nvidia-tesla-p100 GPU, the label would be `cloud.google.com/gke-accelerator=nvidia-tesla-p100`) This will help us target accelerator specific daemonsets etc. to these nodes.
- Run nvidia-gpu device-plugin daemonset as an addon on GCE nodes that have nvidia GPUs attached.
- Some minor documentation improvements in addon manager.
**Release note**:
```release-note
GCE nodes with NVIDIA GPUs attached now expose `nvidia.com/gpu` as a resource instead of `alpha.kubernetes.io/nvidia-gpu`.
```
/sig cluster-lifecycle
/sig scheduling
/area hw-accelerators
https://github.com/kubernetes/features/issues/368
Automatic merge from submit-queue (batch tested with PRs 53337, 55465, 55512, 55522, 54554). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Allow configuring docker storage driver in GCE
**What this PR does / why we need it**:
For GCE, allow configuring of the docker storage driver.
**Release note**:
```release-note
GCE: Provide an option to configure the docker storage driver.
```
Automatic merge from submit-queue (batch tested with PRs 54987, 55221, 54099, 55144, 54215). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Increase waiting time (120s) for docker startup in health-monitor.sh
Fix the issue of docker being killed again when its startup takes longer on overloaded nodes.
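The idea of giving docker a grace period before acting can be sketched as a polling loop with a timeout. This is not the code from health-monitor.sh: the function name `wait_for_healthy`, its arguments, and the one-second poll interval are assumptions, and the health check is stubbed so the sketch runs anywhere.

```shell
# Hypothetical helper: poll a health check until it succeeds or the timeout
# (in seconds) expires. In the commit above the grace period is 120 seconds;
# the demo below uses a short value so it finishes quickly.
wait_for_healthy() {
  timeout_secs="$1"
  shift
  waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout_secs" ]; then
      return 1          # still unhealthy after the grace period
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Stubbed check: `true` stands in for a real docker health probe.
if wait_for_healthy 2 true; then
  echo "docker is healthy"
else
  echo "docker did not come up in time"
fi
```

Only when the helper returns non-zero would a monitor go on to restart the daemon, which avoids killing a process that is merely slow to start.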
Automatic merge from submit-queue (batch tested with PRs 55265, 54092, 55353, 53733, 55385). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Symbolic links of key and cert are no longer used.
**What this PR does / why we need it**:
This is unused for current cycle.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #42660
**Special notes for your reviewer**:
/cc @mikedanese
**Release note**:
```release-note
NONE
```
Fixed a typo that was causing the script to break while creating a GKE cluster.
Specifically, the line `setup-addon-manifests "addons" "rbac/legacy-kubelet-user-disabled"` was meant to refer to the directory cluster/addons/rbac/legacy-kubelet-user-disable. The extra "d" at the end of "disable" was causing the script to break.
Automatic merge from submit-queue (batch tested with PRs 54773, 52523, 47497, 55356, 49429). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
don't check in mounter binary
```release-note
GCI mounter is moved from the manifests tarball to the server tarball.
```
Automatic merge from submit-queue (batch tested with PRs 54177, 55203, 55120, 55275, 55260). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: provide an option to disable docker's live-restore
**What this PR does / why we need it**:
Provide an option to disable docker's live-restore for COS/ubuntu images on GCE. Some newer COS images have live-restore enabled by default. This allows users to override the option if needed.
**Release note**:
```release-note
GCE: provide an option to disable docker's live-restore on COS/ubuntu
```
Automatic merge from submit-queue (batch tested with PRs 53866, 54852, 55178, 55185, 55130). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Set the NON_MASQUERADE_CIDR to 0/0 by default in GCE/GKE
This disables masquerade rules set up by the kubelet. Additionally, this adds masquerade rules based on NON_MASQUERADE_CIDR being set to 0/0.
**Release note**:
```release-note
Add masquerading rules by default to GCE/GKE
```
Automatic merge from submit-queue (batch tested with PRs 51001, 55181). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added logic for custom kube proxy yaml for GKE
Added yaml-replacement logic for custom kube-proxy daemon set on GKE.
Release Note:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove Google Cloud KMS's in-tree integration
Removes the following introduced by #48574 and others:
* `kms.go` which contained the cloudkms-specific code for Google Cloud KMS service.
* Registering the Google Cloud KMS in the KMS plugin registry.
* Google's `cloudkms` API package from `vendor` folder.
The following changes are upcoming:
* Removal of KMSPluginRegistry. This would not be needed anymore, since KMS providers will be out-of-tree from now on (so no need of registering them, an address of the process would be enough).
* A service which allows encrypt/decrypt functionality (satisfies `envelope.Service` interface) if initialized with an IP/Port of an out-of-tree process serving KMS requests. Will tentatively use gRPC requests to talk to this external service.
Reference: https://github.com/kubernetes/kubernetes/pull/54439#issuecomment-340062801 and https://github.com/kubernetes/kubernetes/issues/51965#issuecomment-339333937.
```release-note
Google KMS integration was removed from in-tree in favor of an out-of-process extension point that will be used for all KMS providers.
```
Automatic merge from submit-queue (batch tested with PRs 54488, 54838, 54964). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add support for alternative container runtime in `kube-up.sh`
For https://github.com/kubernetes/features/issues/286.
This PR added 4 new environment variables in `kube-up.sh` to support alternative container runtime:
1) `KUBE_MASTER_EXTRA_METADATA` and `KUBE_NODE_EXTRA_METADATA`. Add extra metadata on master and node instance. With this we could specify different cloud-init for a different container runtime, and also add extra metadata for the new cloud-init, e.g. [master.yaml](7d73966214/test/e2e/master.yaml)
2) `KUBE_CONTAINER_RUNTIME_ENDPOINT`. Specify different sock for different container runtime. It's only used when it's not empty.
3) `KUBE_LOAD_IMAGE_COMMAND`. Specify different load image command for different container runtime.
An example for cri-containerd:
```
export KUBE_MASTER_EXTRA_METADATA="user-data=${GOPATH}/src/github.com/kubernetes-incubator/cri-containerd/test/e2e/master.yaml,cri-containerd-configure-sh=${GOPATH}/src/github.com/kubernetes-incubator/cri-containerd/test/configure.sh"
export KUBE_NODE_EXTRA_METADATA="user-data=${GOPATH}/src/github.com/kubernetes-incubator/cri-containerd/test/e2e/node.yaml,cri-containerd-configure-sh=${GOPATH}/src/github.com/kubernetes-incubator/cri-containerd/test/configure.sh"
export KUBE_CONTAINER_RUNTIME="remote"
export KUBE_CONTAINER_RUNTIME_ENDPOINT="/var/run/cri-containerd.sock"
export KUBE_LOAD_IMAGE_COMMAND="/home/cri-containerd/usr/local/bin/cri-containerd load"
export NETWORK_POLICY_PROVIDER="calico"
```
Signed-off-by: Lantao Liu <lantaol@google.com>
```release-note
none
```
/cc @yujuhong @dchen1107 @feiskyer @mikebrow @abhi @mrunalp @runcom
/cc @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 52367, 53363, 54989, 54872, 54643). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Basic GCE PodSecurityPolicy Config
**What this PR does / why we need it**:
This PR lays the foundation for enabling PodSecurityPolicy in GCE and other default deployments. The 3 commits are:
1. Add policies, roles & bindings for the default addons on GCE.
2. Enable the PSP admission controller & load the addon policies when the `ENABLE_POD_SECURITY_POLICY=true` environment variable is set.
3. Support the PodSecurityPolicy in the E2E environment & add PSP tests.
NOTES:
- ~~Depends on https://github.com/kubernetes/kubernetes/pull/52301 for privileged capabilities~~
- ~~Depends on https://github.com/kubernetes/kubernetes/pull/52849 for sane mutations~~
- ~~Depends on https://github.com/kubernetes/kubernetes/pull/53479 for aggregator tests to pass~~
- ~~Depends on https://github.com/kubernetes/kubernetes/pull/54175 for dedicated fluentd service account~~
- This PR is a fork of https://github.com/kubernetes/kubernetes/pull/46064, credit to @Q-Lee
**Which issue this PR fixes**: #43538
**Release note**:
```release-note
Add support for PodSecurityPolicy on GCE: `ENABLE_POD_SECURITY_POLICY=true` enables the admission controller, and installs policies for default addons.
```
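To make the effect of the environment variable concrete, here is a small sketch of how a startup script might extend a comma-separated admission controller list. The helper name `admission_control_list` and the example base list are hypothetical; only `ENABLE_POD_SECURITY_POLICY` and the `PodSecurityPolicy` admission controller are taken from the description above, and the real script may order or assemble the list differently.

```shell
# Hypothetical helper: append PodSecurityPolicy to an admission controller
# list when ENABLE_POD_SECURITY_POLICY is "true". The base list passed in
# below is an illustrative value, not the GCE default.
admission_control_list() {
  base="$1"
  if [ "${ENABLE_POD_SECURITY_POLICY:-false}" = "true" ]; then
    echo "${base},PodSecurityPolicy"
  else
    echo "${base}"
  fi
}

ENABLE_POD_SECURITY_POLICY=true
echo "--admission-control=$(admission_control_list "NamespaceLifecycle,ServiceAccount")"
```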
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move hardcoded constants to the beginning of configure.sh script.
**What this PR does / why we need it**:
Move hardcoded constants of component version and sha1 to the beginning of configure.sh to make it easier for GKE image preloader to parse.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54112, 54150, 53816, 54321, 54338). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Enable metadata concealment for tests
**What this PR does / why we need it**: Metadata concealment is going to beta for v1.9; enable it by default in tests. Also, just use `ENABLE_METADATA_CONCEALMENT` instead of two different vars. Work toward #8867.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: none
**Release note**:
```release-note
Metadata concealment on GCE is now controlled by the `ENABLE_METADATA_CONCEALMENT` env var. See cluster/gce/config-default.sh for more info.
```
Automatic merge from submit-queue (batch tested with PRs 52003, 54559, 54518). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added functionality to replace default kube-dns deployment with a GKE specific one
**What this PR does / why we need it**:
In GKE, we need to use custom kube-dns deployments, which means replacing the default one with the custom. This PR adds the replacement functionality into the relevant configuration scripts.
Release Note:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54400, 54403). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Allow for configuring etcd hostname in the manifest
```release-note
Allow for configuring etcd hostname in the manifest
```
Automatic merge from submit-queue (batch tested with PRs 53106, 52193, 51250, 52449, 53861). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
bump CNI to v0.6.0
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #49480
**Special notes for your reviewer**:
/assign @luxas @bboreham @feiskyer
**Release note**:
```release-note
bump CNI to v0.6.0
```
Automatic merge from submit-queue (batch tested with PRs 52883, 52183, 53915, 53848). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[GCE kube-up] Don't provision kubeconfig file for kube-proxy service account
**What this PR does / why we need it**:
Offload the burden of provisioning the kubeconfig file for the kube-proxy service account from the GCE startup scripts. This also helps us decouple kube-proxy daemonset upgrades from node upgrades.
Previous attempt on https://github.com/kubernetes/kubernetes/pull/51172, using InClusterConfig for kube-proxy based on discussions on https://github.com/kubernetes/client-go/issues/281.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE
**Special notes for your reviewer**:
/assign @bowei @thockin
cc @luxas @murali-reddy
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50223, 53205). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create e2e tests for Custom Metrics - Stackdriver Adapter and HPA based on custom metrics from Stackdriver
**What this PR does / why we need it**:
- Add e2e test for Custom Metrics - Stackdriver Adapter
- Add e2e test for HPA based on custom metrics from Stackdriver
- Enable HorizontalPodAutoscalerUseRESTClients option
**Release note**:
```release-note
Horizontal pod autoscaler uses REST clients through the kube-aggregator instead of the legacy client through the API server proxy.
```
This allows the etcd docker registry that is currently hard coded to
`gcr.io/google_containers/etcd` in the `etcd.manifest` template to be
overridden. This can be used to test new versions of etcd with
kubernetes that have not yet been published to
`gcr.io/google_containers/etcd` and also enables cluster operators to
manage the etcd images used by their cluster in an internal
repository.
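One way such an override could work is a default-with-fallback variable substituted into the manifest template. The sketch below is an assumption-laden illustration: the variable name `ETCD_DOCKER_REPOSITORY`, the `{{ etcd_docker_repository }}` placeholder, and the `TAG` suffix are hypothetical; only the default registry `gcr.io/google_containers/etcd` comes from the text above.

```shell
# Hypothetical variable and placeholder names; only the default registry is
# taken from the commit message.
render_etcd_image() {
  repo="${ETCD_DOCKER_REPOSITORY:-gcr.io/google_containers/etcd}"
  # Use '|' as the sed delimiter because the repository contains '/'.
  sed "s|{{ *etcd_docker_repository *}}|${repo}|g"
}

# With the variable unset the default registry is used.
echo 'image: {{ etcd_docker_repository }}:TAG' | render_etcd_image

# An operator could point at an internal mirror instead (example value).
ETCD_DOCKER_REPOSITORY="registry.example.com/etcd"
echo 'image: {{ etcd_docker_repository }}:TAG' | render_etcd_image
```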
Automatic merge from submit-queue (batch tested with PRs 53044, 52956, 53512, 53028). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add ipvs sync period parameters - align to iptables proxier
**What this PR does / why we need it**:
Add ipvs sync period parameters - align to iptables proxier
**Which issue this PR fixes**:
fixes #52957
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Merge kube-dns templates into a single file
**What this PR does / why we need it**: Merge all of the kube-dns cluster yamls into a single file.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42832
**Special notes for your reviewer**:
/assign @bowei @shashidharatd
cc @kevin-wangzefeng @euank @lhuard1A
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52488, 52548). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Enable overriding Heapster resource requirements in GCP
This PR enables overriding Heapster resource requirements in GCP.
**Release note:**
```release-note
```
Add CLUSTER_SIGNING_DURATION environment variable to cluster
configuration scripts to allow configuration of signing duration of
certificates issued via the Certificate Signing Request API.
Automatic merge from submit-queue (batch tested with PRs 52452, 52115, 52260, 52290)
Add env var to enable kubelet rotation in kube-up.sh.
Fixes https://github.com/kubernetes/kubernetes/issues/52114
```release-note
Adds ROTATE_CERTIFICATES environment variable to kube-up.sh script for GCE
clusters. When that var is set to true, the command line flag enabling kubelet
client certificate rotation will be added to the kubelet command line.
```
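The behaviour in the release note amounts to conditionally appending one flag to the kubelet command line. The following is a sketch under assumptions: the helper `kubelet_flags` and the example base flag are hypothetical, and `--rotate-certificates=true` is assumed to be the kubelet flag the release note refers to.

```shell
# Hypothetical helper: extend a kubelet flag string when ROTATE_CERTIFICATES
# is "true". The exact flag spelling is an assumption; the release note only
# says a client certificate rotation flag is added.
kubelet_flags() {
  flags="$1"
  if [ "${ROTATE_CERTIFICATES:-}" = "true" ]; then
    flags="${flags} --rotate-certificates=true"
  fi
  echo "${flags}"
}

ROTATE_CERTIFICATES=true
echo "kubelet $(kubelet_flags "--v=2")"
```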
Automatic merge from submit-queue (batch tested with PRs 52376, 52439, 52382, 52358, 52372)
Add new api groups to the GCE advanced audit policy
Fixes https://github.com/kubernetes/kubernetes/issues/52265
It adds the missing API groups that were introduced in the 1.8 release.
@piosz there's also the 'metrics' api group, should we audit it?
Automatic merge from submit-queue (batch tested with PRs 51601, 52153, 52364, 52362, 52342)
Make advanced audit policy on GCP configurable
Related to https://github.com/kubernetes/kubernetes/issues/52265
Make GCP audit policy configurable
/cc @tallclair
Automatic merge from submit-queue
Add cluster up configuration for certificate signing duration.
```release-note
Add CLUSTER_SIGNING_DURATION environment variable to cluster configuration scripts
to allow configuration of signing duration of certificates issued via the Certificate
Signing Request API.
```
Automatic merge from submit-queue (batch tested with PRs 51921, 51829, 51968, 51988, 51986)
COS/GCE: bump the max pids for the docker service
**What this PR does / why we need it**:
TasksMax limits how many threads/processes docker can create. Insufficient limit affects container starts.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes #51977
**Release note**:
```release-note
Ensure TasksMax is sufficient for docker
```
Automatic merge from submit-queue (batch tested with PRs 51921, 51829, 51968, 51988, 51986)
Fix unbound variable in configure-helper.sh
This isn't plumbed yet on GKE, so results in an unbound variable.
```release-note
NONE
```
During an NFS/GlusterFS mount, a DNS server is required to resolve
the service name. This PR gets the DNS server IP from the kubelet and
adds it to the containerized mounter path, so that when the containerized
mounter is used, the service name can be resolved during mount.
Automatic merge from submit-queue (batch tested with PRs 49727, 51792)
Introducing metrics-server
ref https://github.com/kubernetes/features/issues/271
There is still some work blocked on problems with repo synchronization:
- migrate to `v1beta1` introduced in #51653
- bump deps to HEAD
Will do it in follow-up PRs once the issue is resolved.
```release-note
Introduced Metrics Server
```
Automatic merge from submit-queue
Add RBAC, healthchecks, autoscalers and update Calico to v2.5.1
**What this PR does / why we need it**:
- Updates Calico to `v2.5`
- Calico/node to `v2.5.1`
- Calico CNI to `v1.10.0`
- Typha to `v0.4.1`
- Enable health check endpoints
- Add Readiness probe for calico-node and Typha
- Add Liveness probe for calico-node and Typha
- Add RBAC manifest
- With calico ClusterRole, ServiceAccount and ClusterRoleBinding
- Add Calico CRDs in the Calico manifest (only works for k8s v1.7+)
- Add vertical autoscaler for calico-node and Typha
- Add horizontal autoscaler for Typha
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51553, 51538, 51663, 51069, 51737)
Allow enabling the pod priority feature gate for GCE and configure priority for kube-proxy
**What this PR does / why we need it**:
From #23225, this PR adds an option for users to enable the pod priority feature gate using GCE startup scripts, and configures pod priority for kube-proxy when it is enabled.
The setup `priorityClassName: system` derives from: ce1485c626/staging/src/k8s.io/api/core/v1/types.go (L2536-L2542)
The plan is to configure pod priority for kube-proxy daemonset (https://github.com/kubernetes/kubernetes/pull/50705) in the same way.
**Special notes for your reviewer**:
cc @bsalamat @davidopp @thockin
**Release note**:
```release-note
When using kube-up.sh on GCE, users can set env `ENABLE_POD_PRIORITY=true` to enable the pod priority feature gate.
```
Automatic merge from submit-queue (batch tested with PRs 51590, 48217, 51209, 51575, 48627)
FlexVolume setup script for COS instance using mounting utility image in GCR.
**What this PR does / why we need it**: This script automates FlexVolume installation for a single COS instance. Users need to pre-pack their drivers and mount utilities in a Docker image and upload it to GCR.
For each FlexVolume plugin, the script places a driver wrapper in a writable and executable location. The wrapper calls commands from the actual driver but in a chroot environment, so that mount utilities from the image can be used.
I'm working on a script that automatically executes this on all instances. It will be in a separate PR.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48626
```release-note
NONE
```
/cc @gmarek @chakri-nelluri
/assign @saad-ali @msau42
/sig storage
/release-note-none
Automatic merge from submit-queue
Adding Flexvolume plugin dir piping for controller manager on COS
**What this PR does / why we need it**: Sets the default Flexvolume plugin directory correctly for the controller manager running on COS images.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51563
```release-note
NONE
```
/release-note-none
/sig storage
/assign @msau42
/cc @wongma7
Automatic merge from submit-queue
Retry master instance creation in case of retriable error (with sleep)
This helps with our 5k-node CI tests, which are failing to start up the cluster.
It also works towards the greater goal: https://github.com/kubernetes/kubernetes/issues/43140
cc @kubernetes/sig-scalability-misc @kubernetes/sig-cluster-lifecycle-misc
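The retry-with-sleep idea can be sketched roughly as follows. This is a minimal illustration in Python, not the script's actual code: `RetriableError`, `create_with_retries`, the attempt count and the delay are all hypothetical names and values chosen for the example.

```python
import time


class RetriableError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


def create_with_retries(create, attempts=5, delay_seconds=10, sleep=time.sleep):
    """Call create() until it succeeds or the attempts run out.

    Only RetriableError triggers a retry; any other exception propagates
    immediately. The last RetriableError is re-raised once attempts are spent.
    """
    for attempt in range(1, attempts + 1):
        try:
            return create()
        except RetriableError:
            if attempt == attempts:
                raise
            sleep(delay_seconds)  # back off before the next attempt
```

Passing `sleep` as a parameter is only a convenience for testing the sketch without real delays.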
Automatic merge from submit-queue
Add Google cloud KMS service for envelope encryption transformer
This adds the required pieces which will allow addition of KMS based encryption providers (envelope transformer).
For now, we will be implementing it using Google Cloud KMS, but the code should make it easy to add support for any other such provider which can expose Decrypt and Encrypt calls.
Writing tests for Google Cloud KMS Service may cause a significant overhead to the testing framework. It has been tested locally and on GKE though.
Upcoming after this PR:
* Complete implementation of the envelope transformer, which uses LRU cache to maintain decrypted DEKs in memory.
* Track key version to assist in data re-encryption after a KEK rotation.
Development branch containing the changes described above: https://github.com/sakshamsharma/kubernetes/pull/4
Envelope transformer used by this PR was merged in #49350
Concerns #48522
Planned configuration:
```
kind: EncryptionConfig
apiVersion: v1
resources:
  - resources:
    - secrets
    providers:
    - kms:
        cachesize: 100
        configfile: gcp-cloudkms.conf
        name: gcp-cloudkms
    - identity: {}
```
gcp-cloudkms.conf:
```
[GoogleCloudKMS]
kms-location: global
kms-keyring: google-container-engine
kms-cryptokey: example-key
```
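To illustrate the planned LRU cache of decrypted DEKs (the `cachesize` setting above), here is a rough Python sketch. It is not the transformer's implementation: `DEKCache` and the `decrypt` callable are hypothetical, and `decrypt` merely stands in for whatever Decrypt call the KMS provider exposes.

```python
from collections import OrderedDict


class DEKCache:
    """LRU cache mapping an encrypted DEK to its decrypted form."""

    def __init__(self, decrypt, cachesize=100):
        self._decrypt = decrypt        # hypothetical stand-in for the KMS Decrypt call
        self._cachesize = cachesize
        self._entries = OrderedDict()  # oldest entry first

    def get(self, encrypted_dek):
        if encrypted_dek in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(encrypted_dek)
            return self._entries[encrypted_dek]
        # Cache miss: ask the KMS, then remember the result.
        plain = self._decrypt(encrypted_dek)
        self._entries[encrypted_dek] = plain
        if len(self._entries) > self._cachesize:
            self._entries.popitem(last=False)  # evict the least recently used DEK
        return plain
```

The point of the cache is that repeated reads of data encrypted under the same DEK avoid a round trip to the KMS.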
Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)
Allow running kube-proxy as a DaemonSet when using kube-up.sh on GCE
**What this PR does / why we need it**:
From #23225, this PR adds an option for users to run kube-proxy as a DaemonSet instead of static pods using GCE startup scripts. By default, kube-proxy will run as static pods.
This is the first step for moving kube-proxy into a DaemonSet in GCE; remaining tasks will be tracked in #23225.
**Special notes for your reviewer**:
The last commit is purely for testing out kube-proxy as a DaemonSet via CI.
cc @kubernetes/sig-network-misc @kubernetes/sig-cluster-lifecycle-misc
**Release note**:
```release-note
When using kube-up.sh on GCE, users can set env `KUBE_PROXY_DAEMONSET=true` to run kube-proxy as a DaemonSet. kube-proxy runs as static pods by default.
```
Automatic merge from submit-queue (batch tested with PRs 51038, 50063, 51257, 47171, 51143)
update related manifest files to use hostpath type
**What this PR does / why we need it**:
Per [discussion in #46597](https://github.com/kubernetes/kubernetes/pull/46597#pullrequestreview-53568947)
Depends on #46597
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes: https://github.com/kubernetes/kubeadm/issues/298
**Special notes for your reviewer**:
/cc @euank @thockin @tallclair @Random-Liu
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 51108, 51035, 50539, 51160, 50947)
Set GCE_ALPHA_FEATURES environment variable in gce.conf
This allows us to gate alpha features in the pkg/cloudprovider/providers/gce.
Automatic merge from submit-queue (batch tested with PRs 50386, 50374, 50444, 50382)
Add explicit API kind and version to the audit policy file on GCE
Adds an explicit API version and kind to the audit policy file in GCE configuration scripts. It's a prerequisite for https://github.com/kubernetes/kubernetes/pull/49115
/cc @tallclair @piosz
Automatic merge from submit-queue (batch tested with PRs 50300, 50328, 50368, 50370, 50372)
Bugfix: set resources only for fluentd-gcp container.
There is more than one container in the fluentd-gcp deployment. The previous
implementation was setting resources for all containers, not just
the fluentd-gcp one.
**What this PR does / why we need it**:
Bugfix: without this change, https://github.com/kubernetes/kubernetes/pull/49009 consumes more resources than intended.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50366
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
There is more than one container in the fluentd-gcp deployment. The previous
implementation was setting resources for multiple containers, not just
the fluentd-gcp one.
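The fix described above amounts to selecting the container by name before applying resources. A small Python sketch of that idea follows; `set_container_resources`, the dict layout, and the second container name `sidecar` are assumptions made for illustration, not the script's actual code.

```python
import copy


def set_container_resources(pod_spec, container_name, resources):
    """Return a copy of pod_spec with resources set on one named container only."""
    updated = copy.deepcopy(pod_spec)
    for container in updated["containers"]:
        if container["name"] == container_name:
            container["resources"] = resources
    return updated


# Hypothetical two-container spec; only fluentd-gcp should receive the override.
spec = {"containers": [{"name": "fluentd-gcp"}, {"name": "sidecar"}]}
override = {"limits": {"memory": "300Mi"}}
patched = set_container_resources(spec, "fluentd-gcp", override)
```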
Automatic merge from submit-queue (batch tested with PRs 48487, 49009, 49862, 49843, 49700)
Enable overriding fluentd resources in GCP
**What this PR does / why we need it**: This enables overriding fluentd resources in GCP, when there is a need for custom ones.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49989, 49806, 49649, 49412, 49512)
Use existing k8s binaries and images on disk when they are preloaded to gce cos image.
**What this PR does / why we need it**:
This change accelerates K8s startup time on GCE when the k8s tarballs and images are already preloaded in the VM image, by skipping the download, extraction and file transfer steps.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49898, 49897, 49919, 48860, 49491)
gce: make append_or_replace.. atomic
Before this change,
* the final echo is not atomically written to the target file
* two concurrent callers will use the same tempfile
Helps with https://github.com/kubernetes/kubernetes/issues/49895
cc @miekg
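The two problems listed above (a non-atomic final write and a shared temp file) are usually addressed by writing to a unique temp file and renaming it over the target. The Python sketch below shows that pattern under stated assumptions: `replace_or_append_line` is a hypothetical name, and the real helper is a shell function whose details are not shown here.

```python
import os
import tempfile


def replace_or_append_line(path, prefix, suffix):
    """Drop lines starting with prefix, append prefix+suffix, and swap the file in atomically."""
    try:
        with open(path) as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        lines = []
    kept = [line for line in lines if not line.startswith(prefix)]
    kept.append(prefix + suffix)

    # A unique temp file per caller, created next to the target so the
    # rename below stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write("\n".join(kept) + "\n")
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Readers never observe a half-written file with this approach. Note that two concurrent callers can still overwrite each other's update, since the read-modify-write sequence as a whole is not locked in this sketch.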
Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
Automatic merge from submit-queue
Auto-calculate master disk and root disk sizes in GCE
@gmarek PR https://github.com/kubernetes/kubernetes/pull/49282 didn't fix the issue because MASTER_DISK_SIZE was defaulting to 20GB in config-test.sh before being calculated inside get-master-disk-size(), which uses the pre-existing value if there is one.
It should be fixed by this now.
Automatic merge from submit-queue (batch tested with PRs 49222, 49333, 48708, 49337)
Fix issue in installing containerized mounter
Fix PR #49335
PR #49157 causes a failure when installing the containerized mounter. This
PR is a fix for it.
Automatic merge from submit-queue (batch tested with PRs 49120, 46755, 49157, 49165, 48950)
gce: don't print every file in mounter to stdout
This is printing ~3000 lines.
Automatic merge from submit-queue
Pass cluster name to Heapster with Stackdriver sink.
**What this PR does / why we need it**:
Passes cluster name as argument to Heapster when it's used with Stackdriver sink to allow setting resource label 'cluster_name' in exported metrics.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
Add ApiEndpoint support to GCE config.
**What this PR does / why we need it**:
Add the ability to change ApiEndpoint for GCE.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 43558, 48261, 42376, 46803, 47058)
Add bind mount /etc/resolv.conf from host to containerized mounter
Currently, there is no DNS setup in the containerized mounter rootfs. If a client
tries to set up a volume with a host name instead of an IP address, it fails to resolve
the host name.
By bind-mounting the host's /etc/resolv.conf into the mounter rootfs, host names
can be resolved when they are used during mount.
```release-note
Fixes issue where you could not mount NFS or glusterFS volumes using hostnames on GCI/GKE with COS images.
```
Automatic merge from submit-queue (batch tested with PRs 48004, 48205, 48130, 48207)
Do not set CNI in cases where there is a private master and network policy provider is set.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
In GCE and in a "private master" setup, do not set the network-plugin provider to CNI by default if a network policy provider is given.
```
Automatic merge from submit-queue (batch tested with PRs 47993, 47892, 47591, 47469, 47845)
Bump up npd version to v0.4.1
```
Bump up npd version to v0.4.1
```
Fixes #47219
Automatic merge from submit-queue (batch tested with PRs 47993, 47892, 47591, 47469, 47845)
Use a different env var to enable the ip-masq-agent addon.
We shouldn't mix setting the non-masq-cidr with enabling the addon.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
https://github.com/kubernetes/kubernetes/issues/47865
Automatic merge from submit-queue (batch tested with PRs 45268, 47573, 47632, 47818)
NODE_TAINTS in gce startup scripts
Currently there is no way to pass a list of taints that should be added on node registration (at least not in GCE or other salt-based deployments). This PR adds the necessary plumbing to pass the taints from the user or the instance group template to the kubelet startup flags.
```release-note
Taints support in gce/salt startup scripts.
```
The PR was manually tested.
```
NODE_TAINTS: 'dedicated=ml:NoSchedule'
```
in kube-env results in
```
spec:
  [...]
  taints:
  - effect: NoSchedule
    key: dedicated
    timeAdded: null
    value: ml
```
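The mapping from the `NODE_TAINTS` string to the taint fields shown above can be sketched as below. This is an illustrative Python parser, not the startup script's code: `parse_node_taints` is a hypothetical name, and support for a comma-separated list of taints is an assumption.

```python
def parse_node_taints(node_taints):
    """Parse 'key=value:effect[,key=value:effect...]' into a list of taint dicts."""
    taints = []
    for item in node_taints.split(","):
        item = item.strip()
        if not item:
            continue
        # The effect follows the last colon; key and value are split on the first '='.
        key_value, _, effect = item.rpartition(":")
        key, _, value = key_value.partition("=")
        if not key or not effect:
            raise ValueError(f"malformed taint: {item!r}")
        taints.append({"key": key, "value": value, "effect": effect})
    return taints


print(parse_node_taints("dedicated=ml:NoSchedule"))
# [{'key': 'dedicated', 'value': 'ml', 'effect': 'NoSchedule'}]
```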
cc: @davidopp @gmarek @dchen1107 @MaciekPytel
Automatic merge from submit-queue (batch tested with PRs 46604, 47634)
Set price expander in Cluster Autoscaler for GCE
With CA 0.6 we will make the price-preferred node expander the default one for GCE. For other cloud providers we will stick to the default one (random) until the community implements the required interfaces in the CA repo.
https://github.com/kubernetes/autoscaler/issues/82
cc: @MaciekPytel @aleksandra-malinowska
Automatic merge from submit-queue (batch tested with PRs 46327, 47166)
mark --network-plugin-dir deprecated for kubelet
**What this PR does / why we need it**:
**Which issue this PR fixes** : fixes #43967
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47530, 47679)
Use cos-stable-59-9460-64-0 instead of cos-beta-59-9460-20-0.
Remove dead code that has now moved to another repo as part of #47467
**Release note**:
```release-note
NONE
```
/sig node
Automatic merge from submit-queue (batch tested with PRs 47626, 47674, 47683, 47290, 47688)
The KUBE-METADATA-SERVER firewall must be applied before the universal tcp ACCEPT
**What this PR does / why we need it**: the metadata firewall rule was broken by being appended after the universal tcp accept.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
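Why rule order matters here: a firewall chain is evaluated top to bottom and the first matching rule decides, so a rule appended after a universal tcp accept is never reached. The Python model below is a deliberate simplification, not real iptables syntax; the rule predicates and the `first_match` helper are illustrative assumptions.

```python
# Simplified model of first-match rule evaluation; not real iptables syntax.
METADATA_IP = "169.254.169.254"  # GCE metadata server address


def accept_all_tcp(packet):
    return packet["proto"] == "tcp"


def to_metadata_server(packet):
    return packet["proto"] == "tcp" and packet["dst"] == METADATA_IP


def first_match(rules, packet):
    """Return the target of the first rule whose predicate matches the packet."""
    for matches, target in rules:
        if matches(packet):
            return target
    return "POLICY"  # fell through to the chain's default policy


appended = [(accept_all_tcp, "ACCEPT"), (to_metadata_server, "KUBE-METADATA-SERVER")]
inserted_first = [(to_metadata_server, "KUBE-METADATA-SERVER"), (accept_all_tcp, "ACCEPT")]
packet = {"proto": "tcp", "dst": METADATA_IP}

print(first_match(appended, packet))        # ACCEPT
print(first_match(inserted_first, packet))  # KUBE-METADATA-SERVER
```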
Automatic merge from submit-queue (batch tested with PRs 38751, 44282, 46382, 47603, 47606)
Working on fixing #43716.
This will create the necessary certificates.
On GCE it will upload those certificates to Metadata.
They are then pulled down on to the kube-apiserver.
They are written to the /etc/src/kubernetes/pki directory.
Finally they are loaded via the appropriate command line flags.
The requestheader-client-ca-file can be seen by running the following:
kubectl get ConfigMap extension-apiserver-authentication --namespace=kube-system -o yaml
Minor bug fixes.
Made sure AGGR_MASTER_NAME is set up in all configs.
Clean up variable names.
Added additional requestheader configuration parameters.
Added a check so that if there are no Aggregator CA contents we won't start
the aggregator with the relevant flags.
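The check described above can be pictured as follows. This is a hedged Python sketch rather than the startup script itself: `aggregator_flags` and the certificate file names are hypothetical, while the three flag names are existing kube-apiserver options.

```python
def aggregator_flags(ca_contents, pki_dir):
    """Return request-header and proxy-client flags, or [] if no aggregator CA was provided."""
    if not ca_contents:
        # No Aggregator CA contents: start the apiserver without these flags.
        return []
    return [
        f"--requestheader-client-ca-file={pki_dir}/aggr_ca.crt",    # file name is hypothetical
        f"--proxy-client-cert-file={pki_dir}/proxy_client.crt",     # file name is hypothetical
        f"--proxy-client-key-file={pki_dir}/proxy_client.key",      # file name is hypothetical
    ]
```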
**What this PR does / why we need it**:
This PR creates a request header CA. It also creates a proxy client cert/key pair.
It causes these files to end up on kube-apiserver and sets the CLI flags so they are properly loaded.
Without it, the customer either has to set them up themselves or re-use the master CA, which is a security vulnerability.
Currently this creates everything on GCE.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43716
**Special notes for your reviewer**:
This is a reapply of pull/47094 with the GKE issue resolved.
**Release note**: None
- It contains a fix for ipaliasing.
- It contains a fix which decouples GPU driver installation from kernel
version.
Remove dead code that has now moved to another repo as part of #47467
Automatic merge from submit-queue
Don't start any Typha instances if not using Calico
**What this PR does / why we need it**:
Don't start any Typha instances if Calico isn't being used. A recent change now includes all add-ons on the master, but we don't always want a Typha replica.
**Which issue this PR fixes**
Fixes https://github.com/kubernetes/kubernetes/issues/47622
**Release note**:
```release-note
NONE
```
cc @dnardo
Automatic merge from submit-queue (batch tested with PRs 47562, 47605)
Adding option in node start script to add "volume-plugin-dir" flag to kubelet.
**What this PR does / why we need it**: Adds a variable to allow specifying the FlexVolume driver directory through cluster/kube-up.sh. Without this, the process of setting up FlexVolume in a non-default directory is very manual.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47561
Automatic merge from submit-queue
Add encryption provider support via environment variables
These changes are needed to allow cloud providers to use the encryption providers as an alpha feature. The version checks can be done in the respective cloud providers.
Context: #46460 and #46916
@destijl @jcbsmpsn @smarterclayton
Automatic merge from submit-queue
Fix dangling reference to gcloud alpha API for GCI (should be beta)
This reference to the alpha API was missed (fixed in GCE, but not GCI)
Fixes #47494
```release-note
none
```
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
Set up proxy certs for Aggregator.
Working on fixing https://github.com/kubernetes/kubernetes/issues/43716.
This will create the necessary certificates.
On GCE it will upload those certificates to Metadata.
They are then pulled down on to the kube-apiserver.
They are written to the /etc/src/kubernetes/pki directory.
Finally they are loaded via the appropriate command line flags.
The requestheader-client-ca-file can be seen by running the following:
kubectl get ConfigMap extension-apiserver-authentication --namespace=kube-system -o yaml
**What this PR does / why we need it**:
This PR creates a request header CA. It also creates a proxy client cert/key pair.
It causes these files to end up on kube-apiserver and sets the CLI flags so they are properly loaded.
Without it, the customer either has to set them up themselves or re-use the master CA, which is a security vulnerability.
Currently this creates everything on GCE.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43716
**Special notes for your reviewer**: