Automatic merge from submit-queue (batch tested with PRs 67950, 68195). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Remove e2e-image-puller
**What this PR does / why we need it**:
A long time ago, We added the image prepulling as a workaround due to
the overwhelming amount of flake caused by pulling during the tests.
This functionality has been broken for a while now when we switched to a
COS image where mounting `docker` binary into `busybox` stopped working.
So we just have dead code we should clean up.
Change-Id: I538171a5c1d9361eee7f9e0a99655b88b1721e3e
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#63355
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 68087, 68256, 64621, 68299, 68296). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
gce: use getrandom instead of urandom for on node rng
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Register RuntimeClass CRD as an addon
**What this PR does / why we need it**:
Register the RuntimeClass CRD when the RuntimeClass feature gate is enabled. This is done in through the addon manager.
This is an alternative approach to https://github.com/kubernetes/kubernetes/pull/67924
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
For https://github.com/kubernetes/features/issues/585
**Release note**:
Covered by #67737
```release-note
NONE
```
/sig node
/kind feature
/priority important-soon
/milestone v1.12
In the context, our urandoms where generally safe, however getrandom has
built in invariants around entropy pool initialization, making getrandom
safe in all contexts. This should protect us from cryptopasta errors or
weird entropy issues.
Automatic merge from submit-queue (batch tested with PRs 67736, 68123, 68138). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Added support to get clusters in gce cloud provider.
**What this PR does / why we need it**:
Implemented the call to get all cluster objects in a zone for a project.
Also added code to allow the container api to be set in the gce.conf
file.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
A long time ago, We added the image prepulling as a workaround due to
the overwhelming amount of flake caused by pulling during the tests.
This functionality has been broken for a while now when we switched to a
COS image where mounting `docker` binary into `busybox` stopped working.
So we just have dead code we should clean up.
Change-Id: I538171a5c1d9361eee7f9e0a99655b88b1721e3e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Support extra prune resources in kube-addon-manager.
The default prune whitelist resources in https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/apply.go#L531 are sometimes not enough.
One example is that when we remove an admission webhook running as an addon pod, after we remove the addon yaml file, the admission webhook pod will be pruned, but the `MutatingWebhookConfiguration`/`ValidationWebhookConfiguration` won't... If the webhook failure policy is `Fail`, this will break the cluster, and users can't create new pods anymore.
It would be good to at least make this configurable, so that users and vendors can configure it based on their requirement.
This PR keeps the default prune resource list exactly the same with before, just makes it possible to add extra ones.
@dchen1107 @MrHohn @kubernetes/sig-cluster-lifecycle-pr-reviews @kubernetes/sig-gcp-pr-reviews
Signed-off-by: Lantao Liu <lantaol@google.com>
**Release note**:
```release-note
Support extra `--prune-whitelist` resources in kube-addon-manager.
```
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Adding GCE node termination handler as an optional addon.
This step is a pre-requisite for auto-deploying that addon in GKE
cc @mikedanese
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Promote AdvancedAuditing to GA
**What this PR does / why we need it**:
Removes deprecated legacy code used for basic audit logging in favor of advanced audit logging.
```release-note
Promote AdvancedAuditing to GA, replacing the previous (legacy) audit logging mechanisms.
```
Automatic merge from submit-queue (batch tested with PRs 67745, 67432, 67569, 67825, 67943). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Add flag for disabling prometheus-to-sd only for daemon sets
```release-note
NONE
```
The requested Service Protocol is checked against the supported protocols of GCE Internal LB. The supported protocols are TCP and UDP.
SCTP is not supported by OpenStack LBaaS. If SCTP is requested in a Service with type=LoadBalancer, the request is rejected. Comment style is also corrected.
SCTP is not allowed for LoadBalancer Service and for HostPort. Kube-proxy can be configured not to start listening on the host port for SCTP: see the new SCTPUserSpaceNode parameter
changed the vendor github.com/nokia/sctp to github.com/ishidawataru/sctp. I.e. from now on we use the upstream version.
netexec.go compilation fixed. Various test cases fixed
SCTP related conformance tests removed. Netexec's pod definition and Dockerfile are updated to expose the new SCTP port(8082)
SCTP related e2e test cases are removed as the e2e test systems do not support SCTP
sctp related firewall config is removed from cluster/gce/util.sh. Variable name sctp_addr is corrected to sctpAddr in pkg/proxy/ipvs/proxier.go
cluster/gce/util.sh is copied from master
Implemented the call to get all cluster objects in a zone for a project.
Also added code to allow the container api to be set in the gce.conf
file.
Requested fix for @lavalamp. Fixed GetClusters to be GetManagedClusters.
Leaving ListClusters as ListClusters as it is part of the Cloud Clusters
interface, despite also being a "managed" call.
Remove copy pasta :D
Fixed method variable name.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove rescheduler since scheduling DS pods by default scheduler is moving to beta
**What this PR does / why we need it**:
remove rescheduler since scheduling DS pods by default scheduler is moving to beta
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#64725
**Special notes for your reviewer**:
**Release note**:
```release-note
Remove rescheduler since scheduling DS pods by default scheduler is moving to beta.
```
Automatic merge from submit-queue (batch tested with PRs 66177, 66185, 67136, 67157, 65065). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update configure-helper.sh to support heapster resource optimizations
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66177, 66185, 67136, 67157, 65065). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Templatize the scaling policy for metrics-server
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Add OWNERS for image (gci) configuration
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59030, 64666, 66251, 66485, 66813). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
A large set of improvements to the Stackdriver components.
**What this PR does / why we need it**:
This PR delivers a large set of improvements for both the Stackdriver Logging agent and the Stackdriver Metadata agent.
**Release note**:
```release-note
Metadata Agent Improvements
Bump metadata agent version to 0.2-0.0.21-1.
Expand the metadata agent's access to all API groups.
Remove metadata agent config maps in favor of command line flags.
Update the metadata agent's liveness probe to a new /healthz handler.
Logging Agent Improvements
Bump logging agent version to 0.2-1.5.33-1-k8s-1.
Appropriately set log severity for k8s_container.
Fix detect exceptions plugin to analyze message field instead of log field.
Fix detect exceptions plugin to analyze streams based on local resource id.
Disable the metadata agent for monitored resource construction in logging.
Disable timestamp adjustment in logs to optimize performance.
Reduce logging agent buffer chunk limit to 512k to optimize performance.
```
Metadata Agent Improvements
Bump metadata agent version to 0.2-0.0.21-1.
Expand the metadata agent's access to all API groups.
Remove metadata agent config maps in favor of command line flags.
Update the metadata agent's liveness probe to a new /healthz handler.
Logging Agent Improvements
Bump logging agent version to 0.2-1.5.33-1-k8s-1.
Appropriately set log severity for k8s_container.
Fix detect exceptions plugin to analyze message field instead of log field.
Fix detect exceptions plugin to analyze streams based on local resource id.
Disable the metadata agent for monitored resource construction in logging.
Disable timestamp adjustment in logs to optimize performance.
Reduce logging agent buffer chunk limit to 512k to optimize performance.
In addition to the shell script changes the heapster yaml has been
updated to use addon resizer 1.8.3 for the heapster-nanny. Addon resizer 1.8.3
is being used to take advantage of the new minClusterSize flag. Note this is a
no-op change. The values specified for heapster-nanny reflect the current
configuration used with version 1.8.2.
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update crictl to v1.11.1.
Update `crictl` to v1.11.1 to fix several bugs. Release note: https://github.com/kubernetes-incubator/cri-tools/releases/tag/v1.11.1
@kubernetes/sig-node-pr-reviews @kubernetes/sig-cluster-lifecycle-pr-reviews
@kubernetes/sig-gcp-pr-reviews
Signed-off-by: Lantao Liu <lantaol@google.com>
```release-note
Update crictl to v1.11.1.
```
The container used by build/run.sh doesn't necessarily have an entry in
/etc/passwd for the host user's uid, and this missing data causes
`whoami` to fail.
Switch `whoami` to `id -un` to fall back to the uid if the /etc/passwd
entry is missing.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
stop using deprecated --etcd-quorum-read
etcd-quorum-read was deprecated, but it is still used.
This pr stops using it.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65319, 64513, 65474, 65601, 65634). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Allow custom manifests in GCP master setup
Add a hook in GCE setup script to allow using custom manifests on master, so we can decouple some GKE changes from k8s. Note that this PR just adds a hook there is no change in default behavior.
```release-note
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Set pod priority on kube-proxy manifest by default
**What this PR does / why we need it**:
Follow up of https://github.com/kubernetes/kubernetes/pull/59237, set pod priority on kube-proxy by default and remove the unneeded logic in startup script.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
/assign @bsalamat @bowei
cc @tanshanshan
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add /home/kubernetes/bin into sudoers path, so that `sudo crictl` works.
Add `/home/kubernetes/bin` to sudoers path, so that user can call `sudo crictl` directly.
Without this fix, user has to either use the full path `sudo /home/kubernetes/bin/crictl` or switch to root, which is not a good user experience.
/cc @yujuhong @feiskyer @filbranden @kubernetes/sig-node-pr-reviews @kubernetes/sig-gcp-pr-reviews
**Release note**:
```release-note
User can now use `sudo crictl` on GCE cluster.
```
Automatic merge from submit-queue (batch tested with PRs 65024, 65287, 65345, 64693, 64941). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add a helper function to customize K8s addon yamls and use it to customize Calico addons on GKE
**What this PR does / why we need it**:
Allow customizing Calico addon in GCP. With #65022, this allows us to do a couple of things:, e.g., run Calico 3.0+ on GCP, use a non-default MTU etc.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#65045, #65067
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65123, 65176, 65139, 65084, 65056). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Pass cluster_location argument to Heapster
**What this PR does / why we need it**:
Fixes Stackdriver monitoring on GCE clusters where cluster location is not a single zone, for example regional clusters.
**Release note**:
```release-note
Pass cluster_location argument to Heapster
```
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create system:cluster-autoscaler account & role and introduce it to C…
**What this PR does / why we need it**:
This PR adds cluster-autoscaler ClusterRole & binding, to be used by the Cluster Autoscaler (kubernetes/autoscaler repository).
It also updates GCE scripts to make CA use the cluster-autoscaler user account.
User account instead of Service account is chosen to be more in line with kube-scheduler.
**Which issue(s) this PR fixes**:
Fixes [issue 383](https://github.com/kubernetes/autoscaler/issues/383) from kubernetes/autoscaler.
**Special notes for your reviewer**:
This PR might be treated as a security fix since prior to it CA on GCE was using system:cluster-admin account, assumed due to default handling of unsecured & unauthenticated traffic over plain HTTP.
**Release note**:
```release-note
A cluster-autoscaler ClusterRole is added to cover only the functionality required by Cluster Autoscaler and avoid abusing system:cluster-admin role.
action required: Cloud providers other than GCE might want to update their deployments or sample yaml files to reuse the role created via add-on.
```
Automatic merge from submit-queue (batch tested with PRs 63453, 64592, 64482, 64618, 64661). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert "Remove rescheduler and corresponding tests from master"
Reverts kubernetes/kubernetes#64364
After discussing with @bsalamat on how DS controllers(ref: https://github.com/kubernetes/kubernetes/pull/63223#discussion_r192277527) cannot create pods if the cluster is at capacity and they have to rely on rescheduler for making some space, we thought it is better to
- Bring rescheduler back.
- Make rescheduler priority aware.
- If cluster is full and if **only** DS controller is not able to create pods, let rescheduler be run and let it evict some pods which have less priority.
- The DS controller pods will be scheduled now.
So, I am reverting this PR now. Step 2, 3 above are going to be in rescheduler.
/cc @bsalamat @aveshagarwal @k82cn
Please let me know your thoughts on this.
```release-note
Revert #64364 to resurrect rescheduler. More info https://github.com/kubernetes/kubernetes/issues/64725 :)
```
Automatic merge from submit-queue (batch tested with PRs 61610, 64591, 58143, 63929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add netd as an addon for GCP
**What this PR does / why we need it**:
Add netd as an addon for GKE.
The PR will add setup functions and var to help deploy netd daemon on GKE.
Please checkout more detail for netd at https://github.com/GoogleCloudPlatform/netd
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 61610, 64591, 58143, 63929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create CoreDNS and kube-dns folders
**What this PR does / why we need it**:
Separate the CoreDNS and kube-dns manifests by creating their own folders (dns/coredns and dns/kube-dns)
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61435
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
cc @MrHohn
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add ipvs module loading logic to gce scripts
**What this PR does / why we need it**:
Add ipvs module loading logic to gce scripts.
Fixes a part of #59402.
/cc @Lion-Wei
/assign @roberthbailey @m1093782566
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 63434, 64172, 63975, 64180, 63755). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Dump Stack when docker fails on healthcheck
Save stack dump of docker daemon in order to be able to
investigate why docker daemon was unresposive to `docker ps`
See https://github.com/moby/moby/blob/master/daemon/daemon.go on
how docker sets up a trap for SIGUSR1 with `setupDumpStackTrap()`
**What this PR does / why we need it**:
This allows us to investigate why docker daemon was unresponsive to "docker ps" command.
**Special notes for your reviewer**:
Manually tested on Ubuntu and COS.
**Release note**:
```release-note
NONE
```
Send SIGUSR1 to dockerd to save stack dump of docker daemon
in order to be able to investigate why docker daemon was
unresposive to health check done by `docker ps`.
See https://github.com/moby/moby/blob/master/daemon/daemon.go on
how docker sets up a trap for SIGUSR1 with `setupDumpStackTrap()`
Automatic merge from submit-queue (batch tested with PRs 63969, 63902, 63689, 63973, 63978). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reuse existing CA cert path for kubelet certs
**What this PR does / why we need it**: configure-helper.sh already knows the path to CA cert, re-use that to avoid typos.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63569, 63918, 63980, 63295, 63989). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
New event exporter config with support for new stackdriver resources
New event exporter, with support for use new and old stackdriver resource model.
This should also be cherry-picked to release-1.10 branch, as all fluentd-gcp components support new and stackdriver resource model.
```release-note
Update event-exporter to version v0.2.0 that supports old (gke_container/gce_instance) and new (k8s_container/k8s_node/k8s_pod) stackdriver resources.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gce: Prefer MASTER_ADVERTISE_ADDRESS in apiserver setup
MASTER_ADVERTISE_ADDRESS is used to set the --advertise-address flag
for the apiserver. It's useful for running the apiserver behind a load
balancer.
However, if PROJECT_ID, TOKEN_URL, TOKEN_BODY, and NODE_NETWORK are
all set, the GCE VM's external IP address will be fetched and used
instead and MASTER_ADVERTISE_ADDRESS will be ignored.
Change this behavior so that MASTER_ADVERTISE_ADDRESS takes precedence
because it's more specific. We still fall back to using the VM's
external IP address if the other variables are set.
Also: Move the setting of --ssh-user and --ssh-keyfile based on
PROXY_SSH_USER) to a top-level block because this is common to all
codepaths.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63167, 63357). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Install and use crictl in gce kube-up.sh
Download and use crictl in gce kube-up.sh.
This PR:
1. Downloads crictl `v1.0.0-beta.0` onto the node, which supports CRI v1alpha2. We'll upgrade it to `v1.0.0-beta.1` soon after the release is cut.
2. Change `kube-docker-monitor` to `kube-container-runtime-monitor`, and let it use `crictl` to do health monitoring.
3. Change `e2e-image-puller` to use `crictl`. Because of https://github.com/kubernetes/kubernetes/issues/63355, it doesn't work now. But in `crictl v1.0.0-beta.1`, we are going to statically link it, and the `e2e-image-puller` should work again.
4. Use `systemctl kill --kill-who=main` instead of `pkill`, the reason is that:
a. `pkill docker` will send `SIGTERM` to all processes including `dockerd`, `docker-containerd`, `docker-containerd-shim`. This is not a problem for Docker 17.03 CE, because `containerd-shim` in containerd 0.2.x doesn't exit with SIGERM (see [code](https://github.com/containerd/containerd/blob/v0.2.x/containerd-shim/main.go#L123)). However, `containerd-shim` in containerd 1.0+ does exit with SIGTERM (see [code](https://github.com/containerd/containerd/blob/master/cmd/containerd-shim/main_unix.go#L200)). This means that `pkill docker` and `pkill containerd` will kill all shim processes for Docker 17.11+ and containerd 1.0+.
b. We can use `pkill -x` instead. However, docker systemd service name is `docker`, but daemon process name is `dockerd`. We have to introduce another environment variable to specify "daemon process name". Given so, it seems easier to just use `systemctl kill` which only requires systemd service name. `systemctl kill --kill-who=main` will make sure only main process receives SIGTERM.
Signed-off-by: Lantao Liu <lantaol@google.com>
/cc @filbranden @yujuhong @feiskyer @mrunalp @kubernetes/sig-node-pr-reviews @kubernetes/sig-cluster-lifecycle-pr-reviews
**Release note**:
```release-note
Kubernetes cluster on GCE have crictl installed now. Users can use it to help debug their node. The documentation of crictl can be found https://github.com/kubernetes-incubator/cri-tools/blob/master/docs/crictl.md.
```
MASTER_ADVERTISE_ADDRESS is used to set the --advertise-address flag
for the apiserver. It's useful for running the apiserver behind a load
balancer.
However, if PROJECT_ID, TOKEN_URL, TOKEN_BODY, and NODE_NETWORK are
all set, the GCE VM's external IP address will be fetched and used
instead and MASTER_ADVERTISE_ADDRESS will be ignored.
Change this behavior so that MASTER_ADVERTISE_ADDRESS takes precedence
because it's more specific. We still fall back to using the VM's
external IP address if the other variables are set.
Also: Pass --ssh-user and --ssh-keyfile flags if both PROXY_SSH_USER
and MASTER_ADVERTISE_ADDRESS is set.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use the logging agent's node name as the metadata agent URL.
The Stackdriver Logging agent should use the node's hostname when it constructs the Stackdriver Metadata Agent's URL, currently, it's using the GKE Master's hostname, which is a bug.
**Release note:**
```release-note
[fluentd-gcp addon] Use the logging agent's node name as the metadata agent URL.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gce: plumb --kubelet-certificate-authority flag to apiserver
**What this PR does / why we need it**:
We want to start signing kubelets' serving certs with cluster CA. This
flag is required to enforce that on apiserver side.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63340, 63266). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gcp: allow non-bootstrap kubeconfig
**What this PR does / why we need it**:
Needed for https://github.com/kubernetes/community/pull/2022
This change lets us generate a non-bootstrap kubeconfig with exec plugin for authn.
The plugin does TLS bootstrapping internally.
**Special notes for your reviewer**:
Defaults when no new env vars are set will behave same as before this change.
`KUBELET_AUTH_TYPE` should never be `tls-auth` in practice, but leaving it there just in case.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update all script shebangs to use /usr/bin/env interpreter instead of /bin/interpreter
This is required to support systems where bash doesn't reside in /bin (such as NixOS, or the *BSD family) and allow users to specify a different interpreter version through $PATH manipulation.
https://www.cyberciti.biz/tips/finding-bash-perl-python-portably-using-env.html
```release-note
Use /usr/bin/env in all script shebangs to increase portability.
```
The regular kubeconfig is fetched from metadata when
CREATE_BOOTSTRAP_KUBECONFIG==false.
We will experiment with an exec plugin that does TLS bootstrapping
internally: #61803
Automatic merge from submit-queue (batch tested with PRs 62718, 62863). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gcp: add env var to configure enabled controllers in controller-manager
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove METADATA_AGENT_VERSION config option
**What this PR does / why we need it**:
Remove METADATA_AGENT_VERSION configuration option. To keep Metadata Agent version consistent across Kubernetes deployments.
**Release note**:
```release-note
Remove METADATA_AGENT_VERSION configuration option.
```
Automatic merge from submit-queue (batch tested with PRs 63007, 62919, 62669, 62860). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add unit test for configure-helper.sh.
**What this PR does / why we need it**:
Add a framework for unit-testing configure-helper.sh.
configure-helper.sh plays a critical role in initializing clusters both on GCE and GKE. It is currently, over 2K lines of code, yet it has no unit test coverage.
This PR proposes a framework/approach on how to provide test coverage for this component.
Notes:
1. Changes to configure-helper.sh itself were necessary to enable sourcing of this script for the purposes of testing.
2. As POC api_manifest_test.go covers the logic related to the initialization of apiserver when integration with KMS was requested. The hope is that the same approach could be extended to the rest of the script.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62409, 62856). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
DNS-Autoscaler support for CoreDNS
**What this PR does / why we need it**:
This PR provides the dns-horizontal autoscaler for CoreDNS in kube-up, enabling the tests to pass once CoreDNS is the default.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61176
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add prometheus cluster monitoring addon.
This PR adds new cluster monitoring addon based on prometheus.
It adds prometheus deployment with e2e tests.
Additional components will be added iterativly in future.
Manifests based on current Helm chart.
At current state it's not intended for production use.
cc @piosz @kawych @miekg
```release-note
Add prometheus cluster monitoring addon to kube-up
```
/sig instrumentation
/kind feature
/priority important-soon
This PR extends the client-side startup scripts to provision a Kubelet
config file instead of legacy flags. This PR also extends the
master/node init scripts to install this config file from the GCE
metadata server, and provide the --config argument to the Kubelet.
Automatic merge from submit-queue (batch tested with PRs 59636, 62429, 61862). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Inject CloudKMS Plugin container into Kube-APIServer pod.
**What this PR does / why we need it**:
Inject CloudKMS Plugin container into Kube-APIServer pod when etcd level encryption via CloudKMS Plugin is requested.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add support to ingest log entries to Stackdriver against new "k8s_container" and "k8s_node" resources.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes**
Fluentd 0.14 has some memory leak issues that caused the e2e tests to be flaky. Downgrading to v0.12.
**Special notes for your reviewer**:
We never released any previous version with Fluentd v0.14. Only upgraded it very recently. So this downgrading is not visible to users.
**Release note**:
```release-note
Add support to ingest log entries to Stackdriver against new "k8s_container" and "k8s_node" resources.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes permissions error for Local SSD when created with NODE_LOCAL_SSDS flag
**What this PR does / why we need it**:
The PR fixes a permissions error introduced in 1.9 whereby users are unable to write to their Local SSD if it is created with the `NODE_LOCAL_SSDS` flag.
This will need to be cherrypicked to 1.9 and 1.10.
/sig storage
/kind bug
/assign @msau42
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update GCP fluentd configmap for COS audit logging on GKE node
**What this PR does / why we need it**:
This PR adds a placeholder in fluentd configmap for COS audit logging on GKE node.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
In order to use -n, the value needs either be quoted or [[ .. ]] block
has to be used. Fix the comparisons that way.
To verify, consider this (analogous) script:
#!/bin/bash
subnetwork_url=""
if [ -n ${subnetwork_url} ]; then
echo "foo"
fi
if [[ -n ${subnetwork_url} ]]; then
echo "bar"
fi
Here "foo" is echoed by the script, even though the variable
subnetwork_url has a zero-length value.