Automatic merge from submit-queue
Make node-instance-group base names unique to prevent collisions
We create multiple IGMs for >1000 Node clusters. When we have a conflict on base name IGMs will fight over ownership of the VM that happen to have the name belonging to multiple IGMs.
This change will increase reliability of starting big clusters.
cc @wojtek-t @alex-mohr @roberthbailey @mikedanese
Automatic merge from submit-queue
Add node problem detector as an addon pod.
```release-note
Introduce a new add-on pod NodeProblemDetector.
NodeProblemDetector is a DaemonSet running on each node, monitoring node health and reporting
node problems as NodeCondition and Event. Currently it already supports kernel log monitoring, and
will support more problem detection in the future. It is enabled by default on gce now.
```
This PR enables NodeProblemDetector as an add-on pod.
/cc @mikedanese @kubernetes/sig-node
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Configuration for GCP webhook authentication and authorization
This PR adds configuration for GCP webhook authentication and authorization in ContainerVM and GCI. The change of configure-vm.sh and kube-apiserver.manifest is directly copied from @cjcullen's PR #25380 and #25296. The change in GCI script configure-helper.sh includes the support for webhook authentication and authorization, and also some code refactor to improve readability.
@cjcullen @roberthbailey @zmerlynn please review it. The original PRs are P1, please mark this as P1.
cc/ @fabioy @kubernetes/goog-image FYI.
I verified it by running e2e tests on GCI cluster. Without the GCI side change, cluster creation fails as being capture by GKE Jenkins tests. I don't test when the two env GCP_AUTHN_URL and GCP_AUTHZ_URL are set, because they are only set in GKE. After this PR is merged, @cjcullen will test in GKE.
Automatic merge from submit-queue
cluster/gce/coreos: Set service-cluster-ip-range
Broken by #19242
See also #26002
This is necessary to kube-up for me, but depending on how #26002 plays out, this PR might not be necessary. Happy to close this or merge or whatever depending on what's best.
cc @yifan-gu @sjpotter @mikedanese
Automatic merge from submit-queue
GCI: Fix the condition for using the default image
This PR revises the condition for using the default GCI image. The old logic is not convenient for manually run e2e tests in some cases (mainly for GCI team to test custom images). The new logic by this PR is very similar to the logic in using ContainerVM. When setting distro to "gci", if master or node image is unset, we use gci-dev for it. If either is set, we respect it.
@roberthbailey @zmerlynn @dchen1107 please review it, and we should cherry pick it in release-1.2 branch. Thanks!
cc/ @kubernetes/goog-image @adityakali FYI
Automatic merge from submit-queue
GCI/Trusty: Fix an issue in using 'find' commands
This PR makes the logic of 'find' command consistent with the 'cp' command afterwards, i.e., only check one layer of a given dir. Without this fix, we have seen a recent breakage after PR #25309 added the file cluster/addons/fluentd-elasticsearch/es-image/template-k8s-logstash.json. The 'find' command discovers this json file, but the 'cp' command fails.
@roberthbailey @dchen1107 @zmerlynn please review this fix, and mark it as a cherry pick candidate. I already verified this fix can resolve the breakage.
cc/ @wonderfly @fabioy @kubernetes/goog-image FYI
Automatic merge from submit-queue
GCI: Enable the log of upstart jobs
This PR enables the log of upstart jobs in master.yaml and node.yaml. By default, log of upstart jobs are enabled in Trusty and placed in /var/log/upstart, but not enabled in GCI. This change explicitly directs the log to the system logger. For trusty, they are in /var/log/syslog file. In GCI, we can check it using "journalctl". This change will be useful for debugging if cluster initialization fails.
@roberthbailey @maisem @dchen1107 please review it. This will be useful for issues like #23634. We should also cherry pick it in release-1.2
cc/ @fabioy @zmerlynn @wonderfly FYI.
Automatic merge from submit-queue
Salt configuration for the new Cluster Autoscaler for GCE
Adds support for cloud autoscaler from contrib/cloud-autoscaler in kube-up.sh GCE script.
cc: @fgrzadkowski @piosz
Automatic merge from submit-queue
Use --format='value(name)' with gcloud instead of grep/awk/cut
Fixing our fragile parsing of `gcloud` is getting old (#24746, #25159, maybe others?).
Instead, let's just get the proper output out of `gcloud` in the first place.
Automatic merge from submit-queue
Change default clusterCIDRs from /16 to /14 in GCE configs allowing 1000 Node clusters by default.
cc @thockin @roberthbailey @wojtek-t @zmerlynn @davidopp