MASTER_ADVERTISE_ADDRESS is used to set the --advertise-address flag
for the apiserver. It's useful for running the apiserver behind a load
balancer.
However, if PROJECT_ID, TOKEN_URL, TOKEN_BODY, and NODE_NETWORK are
all set, the GCE VM's external IP address will be fetched and used
instead and MASTER_ADVERTISE_ADDRESS will be ignored.
Change this behavior so that MASTER_ADVERTISE_ADDRESS takes precedence
because it's more specific. We still fall back to using the VM's
external IP address if the other variables are set.
Also: Pass --ssh-user and --ssh-keyfile flags if both PROXY_SSH_USER
and MASTER_ADVERTISE_ADDRESS is set.
Automatic merge from submit-queue (batch tested with PRs 63492, 62379, 61984, 63805, 63807). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove PodPreset and enable resources for Priority admission plugins in e2e-gce
**What this PR does / why we need it**:
e2e-gce start kube-apiserver without admission PodPreset and enable resources for Priority
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#62377
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
It is possible that the node API object doesn't exist in a brief
window between recreation and registering. By moving the uncordon
until after the node is ready, we can be sure the API object exists.
Automatic merge from submit-queue (batch tested with PRs 62665, 62194, 63616, 63672, 63450). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Bump down to cos-stable-65 in config-test
Until https://github.com/kubernetes/kubernetes/issues/62456 is fixed (and we have a good patched version of cos-66), we probably should not be using the current version for testing which we anyway know we wouldn't be using for prod due to the bug.
/cc @yujuhong @filbranden @wojtek-t
Wdyt?
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62665, 62194, 63616, 63672, 63450). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Uncordon the node after upgrade
**What this PR does / why we need it**:
Previous logic was relying on the node to recreate the node API object
and, as a side-effect, uncordon itself. A change went in that no
longer ensures the node recreates itself, so the bug in this logic was exposed.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#63506
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
cc @dchen1107 @AishSundar
Previous logic was relying on the node to recreate the node API object
and, as a side-effect, uncordon itself. A change went in that no
longer ensures the node recreates itself, so the bug in this logic was exposed.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use the logging agent's node name as the metadata agent URL.
The Stackdriver Logging agent should use the node's hostname when it constructs the Stackdriver Metadata Agent's URL, currently, it's using the GKE Master's hostname, which is a bug.
**Release note:**
```release-note
[fluentd-gcp addon] Use the logging agent's node name as the metadata agent URL.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gce: plumb --kubelet-certificate-authority flag to apiserver
**What this PR does / why we need it**:
We want to start signing kubelets' serving certs with cluster CA. This
flag is required to enforce that on apiserver side.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63340, 63266). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gcp: allow non-bootstrap kubeconfig
**What this PR does / why we need it**:
Needed for https://github.com/kubernetes/community/pull/2022
This change lets us generate a non-bootstrap kubeconfig with exec plugin for authn.
The plugin does TLS bootstrapping internally.
**Special notes for your reviewer**:
Defaults when no new env vars are set will behave same as before this change.
`KUBELET_AUTH_TYPE` should never be `tls-auth` in practice, but leaving it there just in case.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update all script shebangs to use /usr/bin/env interpreter instead of /bin/interpreter
This is required to support systems where bash doesn't reside in /bin (such as NixOS, or the *BSD family) and allow users to specify a different interpreter version through $PATH manipulation.
https://www.cyberciti.biz/tips/finding-bash-perl-python-portably-using-env.html
```release-note
Use /usr/bin/env in all script shebangs to increase portability.
```
The regular kubeconfig is fetched from metadata when
CREATE_BOOTSTRAP_KUBECONFIG==false.
We will experiment with an exec plugin that does TLS bootstrapping
internally: #61803
Automatic merge from submit-queue (batch tested with PRs 62718, 62863). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gcp: add env var to configure enabled controllers in controller-manager
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove METADATA_AGENT_VERSION config option
**What this PR does / why we need it**:
Remove METADATA_AGENT_VERSION configuration option. To keep Metadata Agent version consistent across Kubernetes deployments.
**Release note**:
```release-note
Remove METADATA_AGENT_VERSION configuration option.
```
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use BootID instead of ExternalID to check for new instance
PR #60692 changed the way that ExternalID is reported on GCE. Its value
is no longer the GCE instance ID. It is the instance name. So it
cannot be used to determine VM uniqueness across time. Instead,
upgrade will check that the boot ID changed.
**What this PR does / why we need it**:
Node upgrades stall out because the external ID remains the same across upgrades now.
**Which issue(s) this PR fixes**:
Fixes#62713
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63007, 62919, 62669, 62860). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add unit test for configure-helper.sh.
**What this PR does / why we need it**:
Add a framework for unit-testing configure-helper.sh.
configure-helper.sh plays a critical role in initializing clusters both on GCE and GKE. It is currently, over 2K lines of code, yet it has no unit test coverage.
This PR proposes a framework/approach on how to provide test coverage for this component.
Notes:
1. Changes to configure-helper.sh itself were necessary to enable sourcing of this script for the purposes of testing.
2. As POC api_manifest_test.go covers the logic related to the initialization of apiserver when integration with KMS was requested. The hope is that the same approach could be extended to the rest of the script.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update addon manifests to use policy/v1beta1
**What this PR does / why we need it:**
This is a part of the PSP migration from extensions to policy API group. This PR updates addon manifests to use policy/v1beta1 and grant permissions in policy API group.
**Which issue(s) this PR fixes:**
Addressed to https://github.com/kubernetes/features/issues/5
PR 60692 changed the way that ExternalID is reported on GCE. Its value
is no longer the GCE instance ID. It is the instance name. So it
cannot be used to determine VM uniqueness across time. Instead,
upgrade will check that the boot ID changed.
Automatic merge from submit-queue (batch tested with PRs 62409, 62856). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
DNS-Autoscaler support for CoreDNS
**What this PR does / why we need it**:
This PR provides the dns-horizontal autoscaler for CoreDNS in kube-up, enabling the tests to pass once CoreDNS is the default.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61176
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add prometheus cluster monitoring addon.
This PR adds new cluster monitoring addon based on prometheus.
It adds prometheus deployment with e2e tests.
Additional components will be added iterativly in future.
Manifests based on current Helm chart.
At current state it's not intended for production use.
cc @piosz @kawych @miekg
```release-note
Add prometheus cluster monitoring addon to kube-up
```
/sig instrumentation
/kind feature
/priority important-soon
Automatic merge from submit-queue (batch tested with PRs 62568, 62220, 62743, 62751, 62753). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Bump GLBC manifest to v1.1.1
**Special notes for your reviewer**:
/assign bowei
/cc bowei
/cc rramkumar1
**Release note**:
```release-note
GCE: Bump GLBC version to 1.1.1 - fixing an issue of handling multiple certs with identical certificates
```
Automatic merge from submit-queue (batch tested with PRs 62486, 62471, 62183). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
provision kubelet config file for GCE instead of deprecated flags
Many Kubelet flags are now deprecated in favor of the versioned config file format. This PR adopts the versioned config file format in our cluster turn-up scripts.
```release-note
cluster/kube-up.sh now provisions a Kubelet config file for GCE via the metadata server. This file is installed by the corresponding GCE init scripts.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
cluster/gce/list-resources.sh: also list stackdriver logging sinks
**What this PR does / why we need it**: we seem to be logging stackdriver logging sinks on GCE, likely because we're not keeping track of them. (ref https://github.com/kubernetes/test-infra/issues/7295)
This doesn't fix the leaks, but it'll hopefully help us detect when that happens.
**Release note**:
```release-note
NONE
```
cc @krzyzacy @crassirostris @summit
This PR extends the client-side startup scripts to provision a Kubelet
config file instead of legacy flags. This PR also extends the
master/node init scripts to install this config file from the GCE
metadata server, and provide the --config argument to the Kubelet.
Automatic merge from submit-queue (batch tested with PRs 62455, 62465, 62427, 62416, 62411). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Bump GLBC version and remove Unreleased tag from tests
/assign rramkumar1
/cc mrhohn
**Release note**:
```release-note
GCE: Bump GLBC version to 1.1.0 - supporting multiple certificates and HTTP2
```
Automatic merge from submit-queue (batch tested with PRs 59636, 62429, 61862). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Inject CloudKMS Plugin container into Kube-APIServer pod.
**What this PR does / why we need it**:
Inject CloudKMS Plugin container into Kube-APIServer pod when etcd level encryption via CloudKMS Plugin is requested.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reimplement migrate-if-needed.sh in go
The `migrate-if-needed.sh` script was already partially implemented in go (see the attachlease and rollback sub-dirs), but was still unnecessarily difficult to understand and test. This closely reimplements the original logic but with improved code structure, error handling and testing.
Where possible, go code that was previously executed as separate binaries is now statically linked into a single 'migrate' go cobra CLI app, which is then thinly wrapped by`migrate-if-needed.sh`.
There are numerous additional improvements that need to be made, but will be submitted in future PRs. This PR is focused on achieving parity with the pre-existing functionality and introducing some much needed test coverage, in particular HA cluster upgrade test coverage.
It appears that the `attachlease` and `rollback` go binaries are no longer needed as standalones and so I have consolidated them into the new `migrate` go binary. Other than that, this change aims to be 100% backward compatible.
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add support to ingest log entries to Stackdriver against new "k8s_container" and "k8s_node" resources.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes**
Fluentd 0.14 has some memory leak issues that caused the e2e tests to be flaky. Downgrading to v0.12.
**Special notes for your reviewer**:
We never released any previous version with Fluentd v0.14. Only upgraded it very recently. So this downgrading is not visible to users.
**Release note**:
```release-note
Add support to ingest log entries to Stackdriver against new "k8s_container" and "k8s_node" resources.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Migrating test images to gcr.io/kubernetes-e2e-test-images
**What this PR does / why we need it**:
Currently e2e test images are distributed between 2 different registry locations, k8s.gcr.io and gcr.io/kubernetes-e2e-test-images. This is part of a multi-step initiative to house all the images in gcr.io/kubernetes-e2e-test-images.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#62131
**Special notes for your reviewer**:
1. I am starting off with migrating images under k8s/test/images/volumes-tester.
2. I did not move ceph and nfs images since they are marked for [deprecation and removal](https://github.com/kubernetes/kubernetes/tree/master/test/images/volumes-tester). Let me know if we want them moved as well.
3. I have made a copy of the images in gcr.io/kubernetes-e2e-test-images so the references are not broken post the PR merge. Will work on removing the images from k8s.gcr.io once this change sticks.
Automatic merge from submit-queue (batch tested with PRs 60102, 59970, 60021, 62011, 62080). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[GCE] Ingress HTTP2 e2e test
**What this PR does / why we need it**:
- Adds e2e test for bringing up an HTTP2 Ingress, converting it to HTTPS, then back to HTTP2
- Update echoserver image to 1.10
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fixes#54017, remove deprecated --mode flag
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#54017
**Special notes for your reviewer**:
**Release note**:
```release-note
remove deprecated --mode flag in check-network-mode
```
Automatic merge from submit-queue (batch tested with PRs 62162, 60628, 62172). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
When using custom network with IP-alias, use the former's subnet for the latter too
Currently, when we're using custom subnet and ip-alias simultaneously, the cluster fails to come up.
The reason is because we're creating a subnet in the former with one name, but expecting a differently named subnet for the latter.
This is causing [continuous failures in our 100-node job](https://k8s-testgrid.appspot.com/sig-scalability-gce#gce) where I recently turned both of them on.
cc @kubernetes/sig-network-bugs
```release-note
NONE
```
Currently all our e2e test images are distributed between 2 registry locations (i) google-containers (k8s.gcr.io) and (ii) gcr.io/kubernetes-e2e-test-images. This PR is part of the initiative to house all test images at gcr.io/kubernetes-e2e-test-images eventually.
Set the default to cos-stable-65 (which is what we are using on GKE for
latest 1.9 and 1.8) and set config-test to use cos-beta-66, so that we
can get more exposure to it.
The testgrid seems to be fairly happy with these images. (both
e2e-gce-cosdev-k8sdev-default and e2e-gce-cosbeta-k8sdev-default are
generally green.)
Automatic merge from submit-queue (batch tested with PRs 61904, 61565, 61401, 61432, 61772). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove rktnetes code
**What this PR does / why we need it**:
rktnetes is scheduled to be deprecated in 1.10 (#53601). According to the deprecation policy for beta CLI and flags, we can remove the feature in 1.11.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58721
**Special notes for your reviewer**:
**Release note**:
```release-note
Removed rknetes code, which was deprecated in 1.10.
```
/assign @yujuhong
/hold
Hold until the end of the freeze.
Automatic merge from submit-queue (batch tested with PRs 60420, 60590). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Enable AESGCM encryption of secrets in etcd by default.
**What this PR does / why we need it**:
Enable encryption of secrets in etcd via AESGCM transform (as described here https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/) during kube-up.sh build of a cluster.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60166, 61706, 61769). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Change HAIRPIN_MODE to hairpin-veth as default
**What this PR does / why we need it**:
Change the default HAIRPIN_MODE back to "hairpin-veth".
It was previously "promiscuous-bridge" in order to workaround a kernel bug which deadlocked the machine when hairpin-veth was used. (#27498)
After some thorough manual testing on ubuntu clusters, we feel confident now that the kernel bug is fixed so we should switch back to using hairpin-veth. This will allow us to clean up some ebtables rules that were put in place to make "promiscuous-bridge" work properly.
Once this change goes in, we need to carefully monitor our e2e tests to make sure the bug has not resurfaced.
**Release note**:
```release-note
In a GCE cluster, the default HAIRPIN_MODE is now "hairpin-veth".
```
/cc @freehan @prameshj
/assign @roberthbailey
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes permissions error for Local SSD when created with NODE_LOCAL_SSDS flag
**What this PR does / why we need it**:
The PR fixes a permissions error introduced in 1.9 whereby users are unable to write to their Local SSD if it is created with the `NODE_LOCAL_SSDS` flag.
This will need to be cherrypicked to 1.9 and 1.10.
/sig storage
/kind bug
/assign @msau42
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Suppress error message from grep when checking whether a subnet has a secondary range or not.
**What this PR does / why we need it**:
Get rid of stdrr caused by grep command when running cluster/kube-up.sh for GCE.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
N/A
**Special notes for your reviewer**:
No behavior change.
**Release note**:
```release-note
"NONE"
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update GCP fluentd configmap for COS audit logging on GKE node
**What this PR does / why we need it**:
This PR adds a placeholder in fluentd configmap for COS audit logging on GKE node.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
In order to use -n, the value needs either be quoted or [[ .. ]] block
has to be used. Fix the comparisons that way.
To verify, consider this (analogous) script:
#!/bin/bash
subnetwork_url=""
if [ -n ${subnetwork_url} ]; then
echo "foo"
fi
if [[ -n ${subnetwork_url} ]]; then
echo "bar"
fi
Here "foo" is echoed by the script, even though the variable
subnetwork_url has a zero-length value.
In shell scripts inside [[ .. ]] blocks, ">" is a string comparison operator.
The "attempt" number comparison works (most likely by accident) because the max
number of attempts is below 10. Change to -gt operator.