Automatic merge from submit-queue (batch tested with PRs 50386, 50374, 50444, 50382)
Add explicit API kind and version to the audit policy file on GCE
Adds an explicit API version and kind to the audit policy file in GCE configuration scripts. It's a prerequisite for https://github.com/kubernetes/kubernetes/pull/49115
/cc @tallclair @piosz
Automatic merge from submit-queue (batch tested with PRs 49725, 50367, 50391, 48857, 50181)
New get-kube.sh option: KUBERNETES_SKIP_RELEASE_VALIDATION
**What this PR does / why we need it**:
This is an alternative solution to https://github.com/kubernetes/kubernetes/pull/49884. The goal is to be able to pull releases that were built by bazel jobs (both presubmit and postsubmit builds), which currently fail our regex validation against the version string.
This implementation is a simple "I know what I'm doing" breakglass option to turn regex validation off, whereas https://github.com/kubernetes/kubernetes/pull/49884 was to extend our validation to support the new formats of bazel build jobs. I'm testing the waters to see if this is a more palatable solution.
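The breakglass behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual get-kube.sh code: the helper name `validate_release` and the simplified `RELEASE_REGEX` are hypothetical, and only the `KUBERNETES_SKIP_RELEASE_VALIDATION` variable name comes from this PR.

```shell
# Hypothetical, simplified release pattern (an assumption for this sketch;
# the regexes used by the real script are more involved).
RELEASE_REGEX='^v[0-9]+\.[0-9]+\.[0-9]+$'

# validate_release is an illustrative helper, not a function from get-kube.sh.
# Returns 0 if the version string is acceptable, non-zero otherwise.
validate_release() {
  local version="$1"
  # Breakglass: any non-empty value switches validation off entirely.
  if [ -n "${KUBERNETES_SKIP_RELEASE_VALIDATION:-}" ]; then
    return 0
  fi
  printf '%s\n' "${version}" | grep -Eq "${RELEASE_REGEX}"
}
```

With the variable unset, only strings matching the pattern pass; with it set, any version string is accepted and the caller takes responsibility for its correctness.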
**Release note**:
```release-note
New get-kube.sh option: KUBERNETES_SKIP_RELEASE_VALIDATION
```
CC @BenTheElder @fejta @ixdy
Automatic merge from submit-queue (batch tested with PRs 50300, 50328, 50368, 50370, 50372)
Bugfix: set resources only for fluentd-gcp container.
There is more than one container in the fluentd-gcp deployment. The previous
implementation was setting resources for all containers, not just
the fluentd-gcp one.
**What this PR does / why we need it**:
Bugfix: without this change, https://github.com/kubernetes/kubernetes/pull/49009 consumes more resources than intended.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50366
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
GKE deployment: Kill cluster/gke
kubernetes/test-infra#3983 migrated the remaining GKE jobs using the bash deployment (cluster/gke). All jobs are now on the gke `deployer` in `kubetest`.
Fixes kubernetes/test-infra#3307
```release-note
`cluster/gke` has been removed. GKE end-to-end testing should be done using `kubetest --deployment=gke`
```
Automatic merge from submit-queue
Ensure that pricing expander is used by default in Cluster Autoscaler
The pricing expander was set as the default for GCP; however, on some occasions the AUTOSCALER_EXPANDER_CONFIG variable could be left unset, resulting in the random expander being used.
Automatic merge from submit-queue (batch tested with PRs 48532, 50054, 50082)
Refactored the fluentd-es addon
Refactor fluentd-elasticsearch addon:
- Decrease the number of files by moving RBAC-related objects into the same files where they're used
- Move the fluentd configuration out of the image
- Don't use PVC to avoid leaking resources in e2e tests
- Fluentd now ingests docker and kubelet logs that are written to journald
- Disable X-Pack, because it's not free
Fixes https://github.com/kubernetes/kubernetes/issues/41462
Fixes https://github.com/kubernetes/kubernetes/issues/49816
Fixes https://github.com/kubernetes/kubernetes/issues/48973
Fixes https://github.com/kubernetes/kubernetes/issues/49450
@aknuds1 @coffeepac Could you please take a look?
```release-note
Fluentd DaemonSet in the fluentd-elasticsearch addon is configured via ConfigMap and includes journald plugin
Elasticsearch StatefulSet in the fluentd-elasticsearch addon uses local storage instead of PVC by default
```
Automatic merge from submit-queue (batch tested with PRs 48487, 49009, 49862, 49843, 49700)
Enable overriding fluentd resources in GCP
**What this PR does / why we need it**: This enables overriding fluentd resources in GCP, when there is a need for custom ones.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)
Add basic install and mount flexvolumes e2e tests
fixes https://github.com/kubernetes/kubernetes/issues/47010
These two tests install a skeleton "dummy" flex driver, attachable and non-attachable respectively, then test that a pod can successfully use the flex driver. They are labeled disruptive because kubelet and controller-manager get restarted as part of the flex install. IMO it's important to keep this install procedure as part of the test to isolate any bugs with the startup plugin probe code.
There is a bit of an ugly dependency on cluster/gce/config-test.sh because --flex-volume-plugin-dir must be set to a dir that's readable from controller-manager container and writable by the flex e2e test. The default path is not writable on GCE masters with read-only root so I picked a location that looks okay.
In the "dummy" drivers I trick kubelet into thinking there is a mount point by doing "mount -t tmpfs none ${MNTPATH} >/dev/null 2>&1", hope that is okay.
I have only tested on GCE and theoretically they may work on AWS but I don't think there is a need to test on multiple cloudproviders.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46685, 49863, 50098, 50070, 50096)
GCE: Bump GLBC version to 0.9.6
Closes #50095
**Release note**:
```release-note
GCE: Bump GLBC version to 0.9.6
```
Automatic merge from submit-queue (batch tested with PRs 50103, 49677, 49449, 43586, 48969)
Run kazel on the entire tree
**What this PR does / why we need it**: part of #47558: auto-generate `BUILD` files on the entire tree, since this is what `gazelle` does, and it'll make subsequent reviews easier if less is changing.
**Release note**:
```release-note
NONE
```
/assign
/release-note-none
Automatic merge from submit-queue (batch tested with PRs 48365, 49902, 49808, 48722, 47045)
Upgrade fluentd-elasticsearch addon to Elasticsearch/Kibana 5.5
This is a patch to upgrade the fluentd-elasticsearch addon to Elasticsearch/Kibana 5.5. Please provide feedback!
```release-notes
* Upgrade Elasticsearch/Kibana to 5.5.1 in fluentd-elasticsearch addon
* Switch to basing our image of Elasticsearch in fluentd-elasticsearch addon off the official one
* Switch to the official image of Kibana in fluentd-elasticsearch addon
* Use StatefulSet for Elasticsearch instead of ReplicationController, with persistent volume claims
* Require authenticating towards Elasticsearch, as Elasticsearch 5.5 by default requires basic authentication
```
Automatic merge from submit-queue (batch tested with PRs 48365, 49902, 49808, 48722, 47045)
Rebase hyperkube image on debian-hyperkube-base, based on debian-base.
**What this PR does / why we need it**: saves all of the hyperkube image dependencies in a cacheable base image, rather than downloading them for every build (which is slow and flaky).
This way, at build time, we only need to pull down the hyperkube base image and add the hyperkube binary.
I've additionally based the base image on `debian-base` instead of `debian`, though we amusingly end up reinstalling a bunch of the things we removed in `debian-base`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #35058, at least partially
**Special notes for your reviewer**: I'm increasingly convinced that the hyperkube image is a bad pattern, as this image carries the superset of dependencies anyone might need, rather than the limited set of dependencies one needs. hyperkube really needs a proper owner.
**Release note**:
```release-note
```
/assign @timstclair @luxas @philips @nikhiljindal
cc @kubernetes/sig-release-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 49989, 49806, 49649, 49412, 49512)
Use existing k8s binaries and images on disk when they are preloaded to gce cos image.
**What this PR does / why we need it**:
This change accelerates Kubernetes startup time on GCE when the k8s tarballs and images are already preloaded in the VM image, by skipping the downloading, extracting, and file transfer steps.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50029, 48517, 49739, 49866, 49782)
fix spelling
Automatic merge from submit-queue
Add parallelism to GCE cluster upgrade
Fixes https://github.com/kubernetes/kubernetes/issues/48373
Should allow upgrading a 500-node cluster (1.6->1.7) in < 1 hr. It currently takes ~1.5 days.
It remains the duty of the upgrader, though, to choose the right parallelism in order to avoid disrupting too many pods.
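The idea of bounded parallelism can be sketched as below. This is a minimal illustration, not the upgrade script's actual implementation: `upgrade_node` is a hypothetical stand-in for the real per-node upgrade step, and `upgrade_nodes_in_parallel` is an assumed helper name.

```shell
# upgrade_node is a hypothetical placeholder for the real per-node upgrade.
upgrade_node() {
  echo "upgraded $1"
}

# Upgrade the given nodes in batches of at most PARALLELISM concurrent jobs.
#   $1: parallelism, remaining arguments: node names
# Note: this sketch does not collect per-node exit codes.
upgrade_nodes_in_parallel() {
  local parallelism="$1"
  shift
  local running=0 node
  for node in "$@"; do
    upgrade_node "${node}" &
    running=$((running + 1))
    if [ "${running}" -ge "${parallelism}" ]; then
      wait          # block until the whole batch has finished
      running=0
    fi
  done
  wait              # wait for the final, possibly partial, batch
}
```

For example, `upgrade_nodes_in_parallel 2 node-a node-b node-c` would process two nodes at a time. A smaller value disrupts fewer pods at once at the cost of a longer upgrade.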
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews @kubernetes/sig-scalability-misc @mikedanese @gmarek
Automatic merge from submit-queue
[addon-manager] Remove unneeded annotation code
**What this PR does / why we need it**:
Clean up the addon-manager code to make it less confusing. The annotation logic is only needed for the 1.4->1.5 upgrade.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49898, 49897, 49919, 48860, 49491)
gce: make append_or_replace.. atomic
Before this change,
* the final echo is not atomically written to the target file
* two concurrent callers will use the same tempfile
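Both problems listed above are commonly addressed with a per-caller temp file followed by a rename. The sketch below illustrates that pattern only; the helper name `append_or_replace_line` and its argument order are assumptions, not the script's real signature.

```shell
# Illustrative helper (name and arguments are assumptions for this sketch).
# Replaces every line of FILE that starts with PREFIX by one PREFIX+SUFFIX line.
append_or_replace_line() {
  local file="$1" prefix="$2" suffix="$3"
  local tmpfile
  # A unique temp file per caller, created in the target's directory so the
  # final mv is a rename within the same filesystem.
  tmpfile="$(mktemp "$(dirname "${file}")/filtered.XXXXXX")"
  touch "${file}"
  # Keep every line that does not start with the prefix.
  awk -v p="${prefix}" 'index($0, p) != 1 { print }' "${file}" > "${tmpfile}"
  echo "${prefix}${suffix}" >> "${tmpfile}"
  # The rename swaps the complete new file into place in one step.
  mv "${tmpfile}" "${file}"
}
```

Readers of the file see either the old or the new content, never a half-written one. Concurrent writers can still overwrite each other's updates, and the file takes on the temp file's permissions, so this is a sketch of the atomic-write idea rather than a full locking scheme.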
Helps with https://github.com/kubernetes/kubernetes/issues/49895
cc @miekg
Automatic merge from submit-queue (batch tested with PRs 49898, 49897, 49919, 48860, 49491)
gce: extend CLOBBER_CONFIG to support known_tokens.csv
Helps with #49895
Automatic merge from submit-queue
Reduce kubectl calls from O(#nodes) to O(1) in cluster logdump
Ref https://github.com/kubernetes/kubernetes/issues/48513
Each node's logexporter is made to write a file to a GCS directory on success (https://github.com/kubernetes/test-infra/pull/3782).
We now use that directory as a registry of successful nodes and get it through a single "gsutil ls" call. This:
- reduces the current waiting time for logexporter in a 5k-node cluster from >1hr to <10s.
- eliminates the dependency on `kubectl logs` calls, which seem to be unreliable sometimes (e.g. when the kubelet or apiserver is down)
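The registry approach can be sketched as a purely local comparison between the expected nodes and a single directory listing. This is an illustration under assumptions: `failed_nodes` is a hypothetical helper, and the marker layout (one entry per node, with the node name as the last path component) is assumed rather than taken from the logexporter code.

```shell
# failed_nodes prints every expected node that has no success marker.
#   $1: file with the registry listing, e.g. the saved output of one
#       "gsutil ls" call (assumed layout: .../<node-name> per line)
#   $2: file with the expected node names, one per line
failed_nodes() {
  awk -F/ '
    # First file: remember the last path component of each marker.
    FILENAME == ARGV[1] { if (NF) succeeded[$NF] = 1; next }
    # Second file: report nodes that never wrote a marker.
    NF && !($0 in succeeded) { print }
  ' "$1" "$2"
}
```

The number of remote calls stays constant regardless of cluster size; only the local text comparison grows with the number of nodes.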
/cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek @fejta
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Add support for GKE regional clusters in e2e tests.
**What this PR does / why we need it**:
Add support for GKE regional clusters in e2e tests.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49712, 49694, 49714, 49670, 49717)
set juju master charm state to blocked if the services appear to be failing
**What this PR does / why we need it**: set the juju master charm state to blocked if the services appear to be failing
**Release note**:
```release-note
set the juju master charm state to blocked if the services appear to be failing
```
Automatic merge from submit-queue (batch tested with PRs 49712, 49694, 49714, 49670, 49717)
Adding old Juju charm maintainers
**What this PR does / why we need it**: Update email addresses of past Juju charm maintainers
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47738, 49196, 48907, 48533, 48822)
Fix a dead link in cluster/update-storage-objects.sh
**What this PR does / why we need it**: This PR fixes a dead link in cluster/update-storage-objects.sh.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48360, 48469, 49576, 49516, 49558)
Update maintainers for Juju charm layers
**What this PR does / why we need it**: Update maintainers of charm layers to reflect ... reality
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48360, 48469, 49576, 49516, 49558)
Rev Calico's Typha daemon to v0.2.3 in add-on deployment.
**What this PR does / why we need it**:
This PR revs the version of Calico's Typha daemon used in the calico-policy-controller add-on to the latest bug-fix release, which incorporates a [critical bug fix](https://github.com/projectcalico/typha/issues/28).
**Which issue this PR fixes**
fixes #49473
**Release note**:
```release-note
Rev version of Calico's Typha daemon used in add-on to v0.2.3 to pull in bug-fixes.
```
Automatic merge from submit-queue
Set snat to false
**What this PR does / why we need it**:
- The [version](e8bea554c5) of the portmap plugin included with Calico CNI version `v1.9.1` doesn't have the `noSnat` config option. It has `snat` instead, which, when not specified (as is the case without this PR), [is set to true by default](https://github.com/containernetworking/plugins/tree/master/plugins/meta/portmap#usage), so we need to explicitly set it to `false`.
CC @caseydavenport
Automatic merge from submit-queue (batch tested with PRs 45040, 48960)
Add ceph-common to hyperkube image
**What this PR does / why we need it**:
Adds the ceph-common package to the hyperkube image
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Fix bug in cluster/log-dump
log-dump breaks when KUBECTL is set to "./cluster/kubectl.sh --match-server-version". Moreover, we always use cluster/kubectl.sh as the default and don't want --match-server-version for the purpose of logexporter.
Also adding an OWNERS file so I'm not blocked on approvals while making fixes in log-dump. Besides, I'll be able to review fixes sent by others.
/cc @wojtek-t
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Use presence of kubeconfig file to toggle standalone mode
Fixes #40049
```release-note
The deprecated --api-servers flag has been removed. Use --kubeconfig to provide API server connection information instead. The --require-kubeconfig flag is now deprecated. The default kubeconfig path is also deprecated. Both --require-kubeconfig and the default kubeconfig path will be removed in Kubernetes v1.10.0.
```
/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-misc
Automatic merge from submit-queue
Remove flags low-diskspace-threshold-mb and outofdisk-transition-frequency
issue: #48843
This removes two flags replaced by the eviction manager. These have been deprecated for two releases, which I believe correctly follows the Kubernetes deprecation guidelines.
```release-note
Remove deprecated flags: --low-diskspace-threshold-mb and --outofdisk-transition-frequency, which are replaced by --eviction-hard
```
cc @mtaufen since I am changing kubelet flags
cc @vishh @derekwaynecarr
/sig node
Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
Automatic merge from submit-queue (batch tested with PRs 49326, 49394, 49346, 49379, 49399)
more robust stat handling from ceph df output in the kubernetes-master charm create-rbd-pv action
**What this PR does / why we need it**: more robust stat handling from ceph df output in the kubernetes-master charm create-rbd-pv action
**Release note**:
```release-note
more robust stat handling from ceph df output in the kubernetes-master charm create-rbd-pv action
```
Automatic merge from submit-queue (batch tested with PRs 49420, 49296, 49299, 49371, 46514)
Fix: PV metric is not namespaced
**What this PR does / why we need it**: The PV metric of juju deployments is not namespaced. This PR fixes this bug.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/348
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49420, 49296, 49299, 49371, 46514)
Update status to show failing services.
**What this PR does / why we need it**: Report on charm status any services that are not running.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/341
**Release note**:
```release-note
Report failing services in Juju deployed clusters.
```
Automatic merge from submit-queue
Auto-calculate master disk and root disk sizes in GCE
@gmarek PR https://github.com/kubernetes/kubernetes/pull/49282 didn't fix the issue because MASTER_DISK_SIZE was defaulting to 20GB in config-test.sh before being calculated inside get-master-disk-size(), which uses the pre-existing value if there is one.
This PR should fix it.
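The ordering problem can be illustrated with a small sketch. The sizing rule and the helper names `get_master_disk_size` and `resolve_master_disk_size` are assumptions made for illustration; they are not the values or functions from the GCE scripts.

```shell
# Illustrative sizing rule only; the threshold and sizes are made up.
get_master_disk_size() {
  if [ "${NUM_NODES:-0}" -gt 1000 ]; then
    echo "100GB"
  else
    echo "20GB"
  fi
}

# A pre-existing MASTER_DISK_SIZE wins; the size is calculated only
# when the variable is unset or empty.
resolve_master_disk_size() {
  echo "${MASTER_DISK_SIZE:-$(get_master_disk_size)}"
}
```

If a config file assigns a default to `MASTER_DISK_SIZE` before this point, the calculated branch is never reached, which matches the symptom described above. Leaving the variable unset until the calculation runs lets the node count decide.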
Automatic merge from submit-queue (batch tested with PRs 48565, 49172)
On GCE check whether NODE_LOCAL_SSDS=0 and handle this case appropriately
**What this PR does / why we need it**: Presently if you are using a mac and GCE and specify NODE_LOCAL_SSDS=0, or use the default, you end up with 2 local SSDs.
**Which issue this PR fixes** : fixes https://github.com/kubernetes/kubernetes/issues/49171
**Special notes for your reviewer**:
I've discovered that this issue is due to b353792f9c/cluster/gce/util.sh (L579)
If NODE_LOCAL_SSDS=0, this evaluates to $(seq 0)
```
$ for i in $(seq 0); do echo $i; done
1
0
```
From man seq on mac osx
```
The seq utility prints a sequence of numbers, one per line (default), from first (default 1),
to near last as possible, in increments of incr (default 1). When first is larger than last the
default incr is -1.
```
This was run on mac with the seq manpage indicating it comes from BSD Feb 19 2010.
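One way to avoid depending on how the local `seq` treats `seq 0` is to count explicitly. A minimal sketch, where `local_ssd_indices` is a hypothetical helper name and not the code used in cluster/gce/util.sh:

```shell
# Print 1..COUNT, one number per line. Prints nothing when COUNT is 0,
# independent of the platform's seq implementation.
local_ssd_indices() {
  local count="$1"
  local i=1
  while [ "${i}" -le "${count}" ]; do
    echo "${i}"
    i=$((i + 1))
  done
}

# The loop body never runs for a count of 0.
for i in $(local_ssd_indices 0); do
  echo "would attach local SSD ${i}"
done
```

An alternative with the same effect is to guard the existing loop with an explicit check that the count is greater than zero.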
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49222, 49333, 48708, 49337)
Fix issue in installing containerized mounter
Fix PR #49335
PR #49157 causes a failure when installing the containerized mounter. This
PR is a fix for it.
Automatic merge from submit-queue (batch tested with PRs 49222, 49333, 48708, 49337)
glbc: change the label of the l7-lb-controller pod
This ensures that the default http backend service doesn't include this
pod as its endpoint. This fixes #49159
Automatic merge from submit-queue (batch tested with PRs 49330, 49252, 49262, 49278, 49334)
Simplify master-worker relation missing message
**What this PR does / why we need it**: Simplify messaging of missing relation in Juju deployments
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/309
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Use custom port for node-problem-detector
It fixes https://github.com/kubernetes/kubernetes/issues/49263
```release-note
Use port 20256 for node-problem-detector in standalone mode.
```
Automatic merge from submit-queue
Fix the typo in Kubernetes Worker
**What this PR does / why we need it**:
Fixes a typo in Kubernetes Worker where "Kubernetes" was misspelled.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Bump rescheduler version to v0.3.1
**What this PR does / why we need it**:
Bump Rescheduler version to v0.3.1 to log to STDERR.
**Which issue this PR fixes**
Fixes https://github.com/kubernetes/contrib/issues/2518
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48377, 48940, 49144, 49062, 49148)
add some more deprecation warnings to cluster
Part of https://github.com/kubernetes/kubernetes/issues/49213
@kubernetes/sig-cluster-lifecycle-misc
Automatic merge from submit-queue
Bump e2e mounttest image version to 0.8
Reduce the number of image files required for e2e test run
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49058, 49072, 49137, 49182, 49045)
Set default CIDR to /16 for Juju deployments
**What this PR does / why we need it**: Increase the number of IPs on a deployment
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/272
**Release note**:
```release-note
Set default CIDR to /16 for Juju deployments
```
Automatic merge from submit-queue (batch tested with PRs 49120, 46755, 49157, 49165, 48950)
gce: don't print every file in mounter to stdout
This is printing ~3000 lines.
Automatic merge from submit-queue (batch tested with PRs 48914, 48535, 49099, 48935, 48871)
Log error when failing to execute command in with-retry()
**What this PR does / why we need it**: Enhance gke/util.sh logging.
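A retry wrapper that logs each failure might look like the sketch below. It is an illustration only and not the code from gke/util.sh: the name `with_retry`, its argument order, and the log wording are assumptions, and it omits any delay between attempts.

```shell
# Run a command up to LIMIT times, logging every failed attempt to stderr.
#   $1: retry limit, remaining arguments: the command to run
with_retry() {
  local limit="$1"
  shift
  local attempt=1 status
  while [ "${attempt}" -le "${limit}" ]; do
    "$@" && return 0
    status=$?
    echo "Attempt ${attempt}/${limit} failed with exit code ${status}: $*" >&2
    attempt=$((attempt + 1))
  done
  echo "Giving up after ${limit} attempts: $*" >&2
  return 1
}
```

Logging the command and its exit code on every attempt makes it possible to tell from CI logs which call was flaky, instead of only seeing the final failure.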
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48913
**Special notes for your reviewer**:
/cc @krzyzacy
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49019, 48919, 49040, 49018, 48874)
Set default snap channel on charms to 1.7 stable
**What this PR does / why we need it**: This PR sets the default snap channel on charms to 1.7/stable.
This addresses problems where the user might want to deploy the charm and get the same Kubernetes version found on the bundles.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/305
**Release note**:
```release-note
Set default snap channel on charms to 1.7/stable
```
Automatic merge from submit-queue (batch tested with PRs 48231, 47377, 48797, 49020, 49033)
prevent unsetting of nonexistent previous port in kubeapi-load-balancer charm
**What this PR does / why we need it**: prevent unsetting of nonexistent previous port in kubeapi-load-balancer charm
**Release note**:
```release-note
prevent unsetting of nonexistent previous port in kubeapi-load-balancer charm
```
Automatic merge from submit-queue (batch tested with PRs 48578, 48895, 48958)
use port configuration
**What this PR does / why we need it**: Uses the `port` config option in the kubeapi-load-balancer charm.
**Release note**:
```release-note
Uses the port config option in the kubeapi-load-balancer charm.
```
Automatic merge from submit-queue
remove some people from OWNERS so they don't get reviews anymore
These are googlers who don't work on the project anymore but are still
getting reviews assigned to them:
- @bprashanth
- @rjnagal
- @vmarmol
Automatic merge from submit-queue (batch tested with PRs 48812, 48276)
Change fluentd-gcp monitoring to use metrics exposed by SD plugin
Following https://github.com/GoogleCloudPlatform/fluent-plugin-google-cloud/pull/135, make fluentd-gcp expose metrics in Prometheus registry and use them instead of counting records in the pipeline.
/cc @piosz @igorpeshansky
```release-note
Fluentd-gcp DaemonSet exposes different set of metrics.
```
Automatic merge from submit-queue (batch tested with PRs 48864, 48651, 47703)
Enable logexporter mechanism to dump logs from k8s nodes to GCS directly
Ref https://github.com/kubernetes/kubernetes/issues/48513
This adds support for logexporter from k8s side. Next I'll send a PR adding support from test-infra side.
/cc @kubernetes/sig-scalability-misc @kubernetes/test-infra-maintainers @fejta @wojtek-t @gmarek
Automatic merge from submit-queue
Fixed cluster validation for multizonal clusters.
Fixed cluster validation for multizonal clusters.
This should fix HA master e2e tests.
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46748, 48826)
Added `CriticalAddonsOnly` toleration for npd.
**What this PR does / why we need it**:
We should add the `CriticalAddonsOnly` toleration to make sure the daemonset can be scheduled on a node even if that node is already planned to run a critical pod.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47015
**Release note**:
```release-note
none
```
Automatic merge from submit-queue
Properly nest code blocks
**What this PR does / why we need it**:
Markdown code blocks are adjusted to better display on GitHub. See [rendered](c3fbec7663/cluster/addons/cluster-loadbalancing/glbc/README.md) version.
**Release note**:
```release-note
Adjust markdown code block in README for Google Load Balancer addon.
```
Automatic merge from submit-queue
Update docs for user-guide
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48781, 48817, 48830, 48829, 48053)
Fix yaml-quote typo
Caught this looking through CI logs.
/assign wojtek-t
Automatic merge from submit-queue (batch tested with PRs 48279, 48566, 48319, 48794, 47952)
Add prometheus plugin on fluentd image.
**What this PR does / why we need it**:
This PR adds the prometheus plugin on Fluentd.
**Special notes for your reviewer**:
The plugin used was: https://github.com/kazegusuri/fluent-plugin-prometheus, on the latest stable version.
All configs used are default.
**Release note**:
```release-note
Fluentd-es addon now exposes a /metrics endpoint for monitoring on port 24231.
```
Automatic merge from submit-queue
Use Container-Optimized OS images for nodes by default
Part of the deprecation of the debian-based ContainerVM images.
```release-note
kube-up and kubemark will default to using cos (GCI) images for nodes.
The previous default was container-vm (CVM, "debian"), which is deprecated.
If you need to explicitly use container-vm for some reason, you should set
KUBE_NODE_OS_DISTRIBUTION=debian
```
Automatic merge from submit-queue
Pass cluster name to Heapster with Stackdriver sink.
**What this PR does / why we need it**:
Passes cluster name as argument to Heapster when it's used with Stackdriver sink to allow setting resource label 'cluster_name' in exported metrics.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48405, 48742, 48748, 48571, 48482)
Setting default FlexVolume driver directory on COS images.
**What this PR does / why we need it**: The original default FlexVolume driver directory is not writable on COS. A new location is necessary to make FlexVolume work.
This directory doesn't exist by default. FlexVolume users need to create this directory, bind mount it, and remount with the executable permission. The other candidate is /home/kubernetes/bin, but the directory is already getting cluttered. I will submit a different PR for a script that automates this step.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48570
Automatic merge from submit-queue (batch tested with PRs 48698, 48712, 48516, 48734, 48735)
GCE: Allow empty NETWORK_PROJECT_ID env var
Changes:
1. Adds `GCE_API_ENDPOINT` logic to container-linux as it was added to GCI in #47881.
1. Apply `NETWORK_PROJECT_ID` value to gce.conf only if the env var is set.
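The second change amounts to emitting the setting conditionally. A minimal sketch, assuming the gce.conf key is named `network-project-id` and using a hypothetical helper `network_project_conf` (neither is confirmed by this PR description):

```shell
# Print the optional gce.conf line only when NETWORK_PROJECT_ID is set and
# non-empty. The key name "network-project-id" is an assumption here.
network_project_conf() {
  if [ -n "${NETWORK_PROJECT_ID:-}" ]; then
    echo "network-project-id = ${NETWORK_PROJECT_ID}"
  fi
}
```

With the variable empty or unset the function prints nothing, so the generated config simply omits the entry instead of carrying an empty value.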
/sig network
/area platform/gce
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Launch kubemark with an existing Kubemark master
In order to expand the use of kubemark, allow developers to use kubemark with a pre-existing Kubernetes cluster.
Ref issue #44393
Automatic merge from submit-queue (batch tested with PRs 48399, 48450, 48144)
Skip errors when unregistering juju kubernetes-workers
**What this PR does / why we need it**: When removing a Kubernetes node using Juju, if the Kubernetes master fails for some reason, we should not put the node into an error state. Instead, we should proceed with the removal of the node; the master will recognise that node as unavailable because it will fail heartbeats.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/300
**Release note**:
```
Clean decommission of Juju kubernetes worker units
```
Automatic merge from submit-queue (batch tested with PRs 48399, 48450, 48144)
configure kube-proxy to run with unset conntrack param when in lxc
**What this PR does / why we need it**: Configures the Juju Charm code to run kube-proxy with `conntrack-max-per-core` set to `0` when in an lxc as a workaround for issues when mounting `/sys/module/nf_conntrack/parameters/hashsize`
**Release note**:
```release-note
Configures the Juju Charm code to run kube-proxy with conntrack-max-per-core set to 0 when in an lxc as a workaround for issues when mounting /sys/module/nf_conntrack/parameters/hashsize
```
Automatic merge from submit-queue (batch tested with PRs 47043, 48448, 47515, 48446)
Fix charms leaving services running after remove-unit
**What this PR does / why we need it**:
This fixes a case where removed charm units can sometimes leave behind running services that interfere with the rest of the cluster.
**Release note**:
```release-note
Fix charms leaving services running after remove-unit
```
Automatic merge from submit-queue (batch tested with PRs 48439, 48440, 48394)
Fix kubernetes charms not restarting services after snap upgrades
**What this PR does / why we need it**:
This fixes a problem where the Kubernetes charms don't restart services after upgrading snaps. This can cause certain fixes not to be picked up (for example https://github.com/juju-solutions/release/pull/10)
**Release note**:
```release-note
Fixed kubernetes charms not restarting services after snap upgrades
```
Automatic merge from submit-queue (batch tested with PRs 48439, 48440, 48394)
Fix: namespace-create have kubectl in path
**What this PR does / why we need it**: In Juju-deployed clusters, the namespace-create action is failing.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/326
**Special notes for your reviewer**:
**Release note**:
```
Fix: namespace-create action on Juju deployed clusters
```
Automatic merge from submit-queue
Add configuration for swift container name
**What this PR does / why we need it:**
This review updates the OpenStack Heat provider to allow for configuring the name of the Swift object store.
**Which issue this PR fixes:**
fixes #47966
**Special notes for your reviewer**:
Note that the terminology for OpenStack Swift conflicts with K8S terminology. In this instance, container is referring to the organization structure of Swift storage objects.
**Release note**:
```release-note
Adds configuration option for Swift object store container name to OpenStack Heat provider.
```
Automatic merge from submit-queue (batch tested with PRs 48317, 48313, 48351, 48357, 48115)
Ensure get_password is accessing a file that exists.
**What this PR does / why we need it**: get_password will throw an exception instead of returning None in case the basic_auth.csv file is missing but /root/cdk/ is there in a juju deployment.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/324
**Special notes for your reviewer**:
**Release note**:
```
Fix race condition where /root/cdk is not yet initialised in kubernetes-master setup by Juju
```
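The guard described above could look roughly like the following. This is a sketch under assumptions, not the charm's implementation: the signature is hypothetical, and the column order (password, user, uid) is assumed to follow the Kubernetes basic-auth file layout.

```python
import csv
import os


def get_password(csv_path, user):
    """Return the password for `user`, or None if the file or user is missing."""
    if not os.path.isfile(csv_path):
        # The parent directory may already exist before the file is written.
        return None
    with open(csv_path, newline="") as stream:
        for row in csv.reader(stream):
            # Assumed column order: password, user, uid.
            if len(row) >= 2 and row[1] == user:
                return row[0]
    return None
```

Checking the file itself, rather than its directory, is what lets the caller get `None` back while `/root/cdk/` exists but `basic_auth.csv` does not.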
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
Add ApiEndpoint support to GCE config.
**What this PR does / why we need it**:
Add the ability to change ApiEndpoint for GCE.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 43558, 48261, 42376, 46803, 47058)
Add bind mount /etc/resolv.conf from host to containerized mounter
Currently, the containerized mounter rootfs has no DNS setup. If a client
tries to set up a volume with a host name instead of an IP address, the mounter
fails to resolve the host name.
Bind mounting the host's /etc/resolv.conf into the mounter rootfs lets host
names be resolved when they are used during mount.
```release-note
Fixes issue where you could not mount NFS or glusterFS volumes using hostnames on GCI/GKE with COS images.
```
Automatic merge from submit-queue (batch tested with PRs 47850, 47835, 46197, 47250, 48284)
Securing the cluster created by Juju
**What this PR does / why we need it**: This PR secures the deployments done with Juju master. Works around certain security issues inherent to kubernetes (see for example dashboard access)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
Securing Juju kubernetes dashboard
```
Automatic merge from submit-queue (batch tested with PRs 46850, 47984)
Update addon-resizer version
Update addon-resizer version and remove the flags that have been deprecated in the new version.
**What this PR does / why we need it**:
ref kubernetes/contrib#2623
**Special notes for your reviewer**:
Need to wait for merging kubernetes/contrib#2623 first.
**Release note**:
```release-note
addon-resizer flapping behavior was removed.
```
Automatic merge from submit-queue
Allow log-dumping only N randomly-chosen nodes in the cluster
This should save us roughly 3-4 hours in our 5000-node cluster scale tests, where we currently copy logs from every node to the Jenkins worker and then upload all of them to GCS, even though we don't need that many.
It will also keep the Jenkins container from hitting the "No space left on device" error while dumping logs, which we saw in runs 12-13 of gce-enormous-cluster.
The longterm fix will be to enable [logexporter](https://github.com/kubernetes/test-infra/tree/master/logexporter) for our tests.
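As an illustration of the idea of dumping logs from only N randomly chosen nodes, here is a minimal sketch. The function name and parameters are hypothetical and are not taken from the log-dump script.

```python
import random


def pick_nodes_to_dump(nodes, max_nodes=None, rng=random):
    """Return at most `max_nodes` randomly chosen nodes; all nodes if unset."""
    nodes = list(nodes)
    if max_nodes is None or max_nodes >= len(nodes):
        # No limit requested, or the limit covers the whole cluster.
        return nodes
    # Sample without replacement so that no node is picked twice.
    return rng.sample(nodes, max_nodes)
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which can help when comparing runs.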
cc @kubernetes/sig-scalability-misc @kubernetes/test-infra-maintainers @gmarek @fejta
Automatic merge from submit-queue (batch tested with PRs 48004, 48205, 48130, 48207)
Bumped Heapster to v1.4.0
``` release-note
Bumped Heapster to v1.4.0.
More details about the release https://github.com/kubernetes/heapster/releases/tag/v1.4.0
```
follow up #47961
The release candidate `v1.4.0-beta.0` turned out to be stable.
Automatic merge from submit-queue (batch tested with PRs 48004, 48205, 48130, 48207)
Do not set CNI in cases where there is a private master and network policy provider is set.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
In GCE and in a "private master" setup, do not set the network-plugin provider to CNI by default if a network policy provider is given.
```
Automatic merge from submit-queue (batch tested with PRs 48192, 48182)
Add generic NoSchedule toleration to fluentd in gcp config as a quick-fix for #44445
Automatic merge from submit-queue (batch tested with PRs 48139, 48042, 47645, 48054, 48003)
Add a failsafe for etcd not returning a connection string
**What this PR does / why we need it**: Removing a kubernetes-master will fail as described on this issue: https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/311
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/311
**Special notes for your reviewer**: This is a two-line defensive change. I am not totally sold on this patch; it might not be the right place to address the above issue. However, solving the problem on the etcd side and updating the interface scope to be unit (as suggested) seems much more involved.
**Release note**:
```
Fix error when removing juju kubernetes-master unit
```
Automatic merge from submit-queue
Make big clusters work again after introduction of subnets
This PR does two things:
- make IP aliases automatically pick Node IP Range based on number of Nodes,
- fix logic for starting clusters >4095 Nodes that was broken by introduction of subnets,
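One way to size a node IP range from the node count is sketched below. This is an illustration only, not the logic in kube-up.sh: the function name and the `reserved` allowance of 4 addresses per subnet are assumptions.

```python
import ipaddress


def node_range_prefix(num_nodes, reserved=4):
    """Return the longest IPv4 prefix whose subnet fits num_nodes addresses.

    `reserved` is an assumed allowance for addresses that the platform
    keeps for itself inside the subnet.
    """
    needed = num_nodes + reserved
    # Number of host bits required to address `needed` hosts.
    host_bits = max(needed - 1, 0).bit_length()
    return 32 - host_bits


# Example with a hypothetical base address for a 5000-node cluster.
prefix = node_range_prefix(5000)
network = ipaddress.ip_network("10.40.0.0/{}".format(prefix))
```

Under these assumptions a 5000-node cluster needs a range wider than a /20, which holds only 4096 addresses.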
cc @wojtek-t @shyamjvs
```release-note
Setting env var ENABLE_BIG_CLUSTER_SUBNETS=true will allow kube-up.sh to start clusters bigger than 4095 Nodes on GCE.
```
Ref https://github.com/kubernetes/kubernetes/issues/47344
Automatic merge from submit-queue
Insert Cynerva and Kjackal to approvers list
**What this PR does / why we need it**:
Per the membership reviews, we're looking to promote Konstantinos and
George to approvers to help distribute the review/bug load for the `cluster/juju` code
tree.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**:
cc @marcoceppi and @tvansteenburgh
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48092, 47894, 47983)
fix systemd service file for custom args.
`KUBE_SCHEDULER_ARGS` and `KUBELET_ARGS` let users pass custom arguments to the scheduler and the kubelet.
But if `KUBELET_ARGS` holds more than one parameter, for example `KUBELET_ARGS="--cgroups-per-qos=false --enforce-node-allocatable="`, the kubelet treats `false --enforce-node-allocatable=` as the value of `cgroups-per-qos`, because `${KUBELET_ARGS}` in kubelet.service expands the variable into a single word. With `$KUBELET_ARGS` instead, the kubelet works correctly.
For more information, see [EnvironmentFiles and support for /etc/sysconfig files](http://fedoraproject.org/wiki/Packaging:Systemd#EnvironmentFiles_and_support_for_.2Fetc.2Fsysconfig_files). This bug was reported by @huanxingyouyoutoo, and I am making this PR on her behalf to fix it.
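The difference between the two forms can be modeled with a small sketch. This is a simplified model of the expansion behavior described above, not systemd's parser: it ignores quoting and only handles tokens that consist entirely of a variable reference.

```python
def expand_exec_arg(token, env):
    """Expand one command-line token according to the simplified model."""
    if token.startswith("${") and token.endswith("}"):
        # ${VAR}: the whole value stays a single argument.
        return [env.get(token[2:-1], "")]
    if token.startswith("$"):
        # $VAR: the value is split on whitespace into separate arguments.
        return env.get(token[1:], "").split()
    return [token]


def expand_exec_start(line, env):
    """Expand every token of an ExecStart-style command line."""
    args = []
    for token in line.split():
        args.extend(expand_exec_arg(token, env))
    return args
```

With `KUBELET_ARGS` set as in the example above, `${KUBELET_ARGS}` yields one argument containing a space, while `$KUBELET_ARGS` yields two separate flags.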
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48012, 47443, 47702, 47178)
Fix setting juju worker labels during deployment
**What this PR does / why we need it**: Allows for setting the labels of juju workers during deployment (eg inside a bundle)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47176
**Special notes for your reviewer**:
**Release note**:
```
Fix bug in setting Juju kubernetes-worker labels in bundle.yaml files.
```
Automatic merge from submit-queue (batch tested with PRs 47860, 47170)
Fix restart action on juju kubernetes-master
**What this PR does / why we need it**: Restart action of kubernetes-master of Juju is not functioning.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/299
**Special notes for your reviewer**:
**Release note**:
```
Fix: Restart action of juju's kubernetes-master restarts the respective snap based services
```
Automatic merge from submit-queue (batch tested with PRs 47860, 47170)
Make fluentd log to stdio instead of a dedicated file
Lower verbosity also, to reduce volume of system logs exported to the backend.
Fix https://github.com/kubernetes/kubernetes/issues/43772
/cc @piosz
Automatic merge from submit-queue (batch tested with PRs 47961, 46276)
Bumped Heapster to v1.4.0-beta.0
Heapster release candidate for Kubernetes 1.7
cc @dchen1107 @caesarxuchao
Automatic merge from submit-queue
Parameterize the binary path and host arch for the hyperkube image
Following the [cluster/images/hyperkube/README.md](https://github.com/kubernetes/kubernetes/tree/master/cluster/images/hyperkube), I ran `make build VERSION=test ARCH=ppc64le` and got the errors below. This PR fixes them.
```
ARCH=ppc64le
cp -r ./* /tmp/hyperkubeTFbYrI
mkdir -p /tmp/hyperkubeTFbYrI/cni-bin
cp ../../../_output/dockerized/bin/linux/ppc64le/hyperkube /tmp/hyperkubeTFbYrI
cp: cannot stat '../../../_output/dockerized/bin/linux/ppc64le/hyperkube': No such file or directory
Makefile:62: recipe for target 'build' failed
make: *** [build] Error 1
```
Automatic merge from submit-queue (batch tested with PRs 47993, 47892, 47591, 47469, 47845)
Bump up npd version to v0.4.1
```
Bump up npd version to v0.4.1
```
Fixes #47219
Automatic merge from submit-queue (batch tested with PRs 47993, 47892, 47591, 47469, 47845)
Use a different env var to enable the ip-masq-agent addon.
We shouldn't mix setting the non-masq-cidr with enabling the addon.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
https://github.com/kubernetes/kubernetes/issues/47865
Automatic merge from submit-queue (batch tested with PRs 47883, 47179, 46966, 47982, 47945)
Strip versions from known api groups in audit policy
Props to @CaoShuFeng for catching this.
Issue: kubernetes/features#22
/cc @ericchiang
Automatic merge from submit-queue (batch tested with PRs 46151, 47602, 47507, 46203, 47471)
Add RBAC support to fluentd-elasticsearch cluster addon
**What this PR does / why we need it**:
Adds rbac support to the fluentd-elasticsearch addon
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #46023
**Special notes for your reviewer**:
**Release note**:
```release-note
Add RBAC support to fluentd-elasticsearch cluster addon
```
Automatic merge from submit-queue (batch tested with PRs 46151, 47602, 47507, 46203, 47471)
es discovery support args apiserver-host and kubeconfig
Elasticsearch discovery currently goes through the Kubernetes client, but it does not support specifying the apiserver-host or kubeconfig used to create that client.