Automatic merge from submit-queue
Update dashboard-controller.yaml
**What this PR does / why we need it**: Updates Dashboard addon to newest version. Changelog can be found at https://github.com/kubernetes/dashboard/releases/tag/v1.6.1.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Update Dashboard version to 1.6.1
```
Automatic merge from submit-queue
fix: required openstack heat version for conditions is 2016-10-14 / newton
This fix sets the required heat version to 2016-10-14.
In OpenStack heat the conditions statement was introduced in version 2016-10-14 | newton, accourding to:
https://docs.openstack.org/releasenotes/heat/newton.html
and more specific:
https://docs.openstack.org/developer/heat/template_guide/hot_spec.html
The conditions are used to make the assignment of public ips / floating ips optional (added in commit 4eef540876). However this template is not compatible with OpenStack heat releases prior newton and produces the following error:
```
ERROR: Failed to validate: : resources.kube_minions: : "condition" is not a valid keyword inside a output definition
```
PR without a release note:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45860, 45119, 44525, 45625, 44403)
Support running StatefulSetBasic e2e tests with local-up-cluster
**What this PR does / why we need it**:
Currently StatefulSet(s) fail when you use local-up-cluster without
setting a cloud provider. In this PR, we use set the
kubernetes.io/host-path provisioner as the default provisioner when
there CLOUD_PROVIDER is not specified. This enables e2e test(s)
(specifically StatefulSetBasic) to work.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 44337, 45775, 45832, 45574, 45758)
Fix lint failures on kubernetes-e2e charm
**What this PR does / why we need it**:
This fixes a test failure on the kubernetes-e2e charm relating to tox and flake8:
```DEBUG🏃/bin/sh: 1: flake8: not found```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
This is a follow-up to https://github.com/kubernetes/kubernetes/pull/45494 where the same thing was done for kubernetes-master.
**Release note**:
```release-note
Fix lint failures on kubernetes-e2e charm
```
Automatic merge from submit-queue (batch tested with PRs 44337, 45775, 45832, 45574, 45758)
Refactor gcr.io/google_containers/elasticsearch to alpine
**What this PR does / why we need it**:
This reduces the image size of the gcr.io/google_containers/elasticsearch image.
Before:
```
REPOSITORY TAG IMAGE ID CREATED SIZE
gcr.io/google_containers/elasticsearch v2.4.1-2 6941e43df81a 4 weeks ago 419MB
```
After:
```
REPOSITORY TAG IMAGE ID CREATED SIZE
gcr.io/google_containers/elasticsearch v2.4.1-2 24ad40c21a52 About an hour ago 178MB
```
**Special notes for your reviewer**:
I used a workaround to make the elasticsearch_logging_discovery binary work with alpine. (See [stackoverflow](https://stackoverflow.com/questions/34729748/installed-go-binary-not-found-in-path-on-alpine-linux-docker/35613430#35613430)). Alternatively this can be solved by setting ```CGO_ENABLED=0```when compiling the binary. I didn't feel comfortable chaing the Makefile though, since I'm no golang expert. Feedback wanted!
Automatic merge from submit-queue (batch tested with PRs 45070, 45821, 45732, 45494, 45789)
Fix lint errors in juju kubernetes master and e2e charms
**What this PR does / why we need it**: Fixes style error in the Juju charms
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
Code style fixes in Juju charms
```
Automatic merge from submit-queue (batch tested with PRs 45653, 45719, 45729, 45730, 44250)
Remove kubemark.sh as we don't use pod IP from it anymore
This has been pending for sometime now. We no longer seem to actually depend on the downwarp api for the pod IP (hollow-proxy for example now gets it using api call).
cc @wojtek-t @gmarek
Automatic merge from submit-queue
Remove unused file cluster/images/kubemark/build-kubemark.sh
It's irrelevant and we don't seem to use/need it anymore.
cc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 45691, 45667, 45698, 45715)
Add general NoExecute Toleration to fluentd in gcp configuration
Ref #44445
Once merged I'll create a cherry-pick that will be picked up in GKE together with the next patch release.
cc @JorritSalverda @davidopp @aveshagarwal @nimeshksingh @piosz
```release-note
fluentd will tolerate all NoExecute Taints when run in gcp configuration.
```
Automatic merge from submit-queue
Update kube-dns version to 1.14.2
```release-note
Updates kube-dns to 1.14.2
- Support kube-master-url flag without kubeconfig
- Fix concurrent R/Ws in dns.go
- Fix confusing logging when initialize server
- Fix printf in cmd/kube-dns/app/server.go
- Fix version on startup and --version flag
- Support specifying port number for nameserver in stubDomains
```
Changes:
- Support kube-master-url flag without kubeconfig
- Fix concurrent R/Ws in dns.go
- Fix confusing logging when initialize server
- Fix printf in cmd/kube-dns/app/server.go
- Fix version on startup and --version flag
- Support specifying port number for nameserver in stubDomains
Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)
Don't append :443 to registry domain in the kubernetes-worker layer registry action
**What this PR does / why we need it**: Fixes#45547
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#45547
**Special notes for your reviewer**:
**Release note**:
```
Fix#45547 - don't append :443 to juju created docker registry config
```
Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)
Enable kernel memcg notification for node and cluster GCI/COS testing.
Sets --experimental-kernel-memcg-notification=true when running on the GCI/COS image. It sets this for master and nodes for cluster e2e tests, and for the node in node e2e tests.
Issue #42676
cc @dchen1107 @Random-Liu
IP aliases are an alpha feature, and node accelerators are a beta
feature. $gcloud determines which is appropriate.
Before, this would try to run "gcloud alpha beta", which is incoherent.
Automatic merge from submit-queue
util/iptables: grab iptables locks if iptables-restore doesn't support --wait
When iptables-restore doesn't support --wait (which < 1.6.2 don't), it may
conflict with other iptables users on the system, like docker, because it
doesn't acquire the iptables lock before changing iptables rules. This causes
sporadic docker failures when starting containers.
To ensure those don't happen, essentially duplicate the iptables locking
logic inside util/iptables when we know iptables-restore doesn't support
the --wait option.
Unfortunately iptables uses two different locking mechanisms, one until
1.4.x (abstract socket based) and another from 1.6.x (/run/xtables.lock
flock() based). We have to grab both locks, because we don't know what
version of iptables-restore exists since iptables-restore doesn't have
a --version option before 1.6.2. Plus, distros (like RHEL) backport the
/run/xtables.lock patch to 1.4.x versions.
Related: https://github.com/kubernetes/kubernetes/pull/43575
See also: https://github.com/openshift/origin/pull/13845
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1417234
@kubernetes/rh-networking @kubernetes/sig-network-misc @eparis @knobunc @danwinship @thockin @freehan
Automatic merge from submit-queue (batch tested with PRs 44830, 45130)
Adding support for Accelerators to GCE clusters.
```release-note
Create clusters with GPUs in GKE by specifying "type=<gpu-type>,count=<gpu-count>" to NODE_ACCELERATORS env var.
List of available GPUs - https://cloud.google.com/compute/docs/gpus/#introduction
```
Automatic merge from submit-queue (batch tested with PRs 44590, 44969, 45325, 45208, 44714)
Enable basic auth username rotation for GCI
When changing basic auth creds, just delete the whole file, in order to be able to rotate username in addition to password.
Automatic merge from submit-queue (batch tested with PRs 45309, 45376)
Allow passing --enable-kubernetes-alpha to GKE e2e tests
**What this PR does / why we need it**:
This allows us to pass --enable-kubernetes-alpha when running GKE e2e tests.
**Release note**:
```
NONE
```
@dchen1107
Automatic merge from submit-queue (batch tested with PRs 45285, 45162)
mounter.go: format return err.
**What this PR does / why we need it**:
when an error returned is nil, it's preferred to explicitly return nil.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45283, 45289, 45248, 44295)
Use docker_bundle rule from new rules_docker repo
**What this PR does / why we need it**: switched to using the new `docker_bundle` rule from `rules_docker` instead of my patched `docker_build` rule. This also brings in some fixes for the docker rules that were missing from my fork.
Additionally, I switched out the `git_repository` rules for `http_archive` rules, since that seems to be recommended by the bazel docs (and might be faster).
Lastly, I updated the `pkg_tar` rules to use my patch, which doesn't prepend `./` to files inside the tarballs.
This one should likely be merged upstream in the near future.
I think this is the last of the changes necessary to have `bazel run //:ci-artifacts` working properly to support using bazel for e2e in CI.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45283, 45289, 45248, 44295)
Remove offending code due to bad rebase
**What this PR does / why we need it**: Fix bug introduced by bad rebasing
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43884, 44712, 45124, 43883)
Increase Dashboard memory limits
**What this PR does / why we need it**: Increases memory requests and limits for Dashboard.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/dashboard/issues/1431
**Special notes for your reviewer**: Dashboard crashes on large clusters, this change should fix that problem.
**Release note**:
```release-note
Increase Dashboard's memory requests and limits
```
Automatic merge from submit-queue
Support arbitrary alphanumeric strings as prerelease identifiers
**What this PR does / why we need it**: this is basically an extension of #43642, but supports more general prerelease identifiers, per the spec at http://semver.org/#spec-item-9.
These regular expressions are still a bit more restrictive than the SemVer spec allows (we disallow hyphens, and we require the format `-foo.N` instead of arbitrary `-foo.X.bar.Y.bazZ`), but this should support our needs without changing too much more logic or breaking other assumptions.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
README.md: Update outdated links
**What this PR does / why we need it**:
the PR aims to update some links.
Some links with "#" would not redirect to right point of pages.
Other links without "#" can work, but they are outdated. I change them by the way.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
none
Automatic merge from submit-queue
Retry calls we report config changes quickly.
**What this PR does / why we need it**: In Juju deployments of Kubernetes the status of the charms is updated when a status-update is triggered periodically. As a result changes in config variables may take up to 10 minutes to be reflected on the charms status. See bug below.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/263
**Special notes for your reviewer**:
**Release note**:
```
Kubernetes clusters deployed with Juju pick up config changes faster.
```
Automatic merge from submit-queue (batch tested with PRs 41583, 45117, 45123)
Adds the cifs-common package
**What this PR does / why we need it**: Enables mounting of CIFS volumes. Required for Azure.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/227
**Release note**:
```release-note
Added CIFS PV support for Juju Charms
```
Automatic merge from submit-queue
Remove the Rackspace provider
**What this PR does / why we need it**:
To aid the effort of moving providers out of the cluster dir, I'm
removing Rackspace and leaving behind a README.md simply as a
placeholder until the entire dir is deleted.
**Which issue this PR fixes**
Fixes#6962
**Release note**:
```release-note
Deployment of Kubernetes clusters on Rackspace using the in-tree bash deployment (i.e. cluster/kube-up.sh or get-kube.sh) is obsolete and support has been removed.```
Currently StatefulSet(s) fail when you use local-up-cluster without
setting a cloud provider. In this PR, we use set the
kubernetes.io/host-path provisioner as the default provisioner when
there CLOUD_PROVIDER is not specified. This enables e2e test(s)
(specifically StatefulSetBasic) to work.
Automatic merge from submit-queue (batch tested with PRs 41530, 44814, 43620, 41985)
Fixes juju kubernetes master: 1. Get certs from a dead leader. 2. Append tokens.
**What this PR does / why we need it**:
Fixes two issues with the Juju kubernetes master.
1. Grab certificates from a leader that is already removed.
2. Append (not truncate) auth tokens
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes#43563fixes#43519
**Special notes for your reviewer**:
**Release note**:
```
Recover certificates from leadership context in case all masters die in a Juju deployment
```
Automatic merge from submit-queue
Add metrics exporter to the fluentd-gcp deployment
Metrics exporter container reads metrics from the `/metrics` endpoint in fluentd and exports them directly to the Stackdriver. It assumes that Stackdriver Monitoring API is enabled.
/cc @fgrzadkowski
Automatic merge from submit-queue (batch tested with PRs 42432, 44628, 45101, 44921)
Use correct option name in the kubernetes-worker layer registry action
**What this PR does / why we need it**: It fixes#44920
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#44920
**Special notes for your reviewer**:
**Release note**:
```
Ensure kubernetes-worker juju layer registry action uses correct ingress controller option name
```
Automatic merge from submit-queue
Bump GLBC version to 0.9.3
**What this PR does / why we need it**:
Bumps version of GLBC shipped with K8s
https://github.com/kubernetes/ingress/releases/tag/0.9.3
```
Major Changelog:
Bug fix: adding backends to existing backend-services #652
Bug fix: handling of secret-based SSL Certs #639
Add second LB healthcheck/proxy traffic source CIDR #574#479
Support backside re-encryption (HTTPS) #519
```
The two noted bugs are common occurrences for GKE users
**Release note**:
```release-note
Bump GLBC version to 0.9.3
```
Fixes#6962
To aid the effort of moving providers out of the cluster dir, I'm
removing Rackspace and leaving behind a README.md simply as a
placeholder until the entire dir is deleted.
Automatic merge from submit-queue (batch tested with PRs 42740, 44980, 45039, 41627, 45044)
Update kubernetes-e2e charm to use snaps
**What this PR does / why we need it**:
This updates the kubernetes-e2e charm to use snaps instead of Juju resources for payload delivery.
The main advantage of this is that it decouples the charm from the e2e payload, allowing us to support multiple versions of Kubernetes with a single release of the charm.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Update kubernetes-e2e charm to use snaps
```
Automatic merge from submit-queue (batch tested with PRs 44591, 44549)
Update repo-infra bazel dependency and use new gcs_upload rule
This PR provides similar functionality to push-build.sh entirely within Bazel rules (though it relies on gsutil).
It's an alternative to #44306.
Depends on https://github.com/kubernetes/repo-infra/pull/13.
**Release note**:
```release-note
NONE
```
The master-components started state triggers a daemon recycle. The guard
was to prevent the daemons from being cycled too often and interrupting
normal workflow. This additional state check is guarded against the etcd
connection string from changing, allowing the current behavior but
triggers a re-configure and recycle of the api-control plane when etcd
units are scaling up and down.
Automatic merge from submit-queue
Support running Ubuntu image on GCE
**What this PR does / why we need it**:
This PR (on top of #44629) contains the script changes for running Ubuntu image on GCE.
**Special notes for your reviewer**:
We made change in `gci/node.yaml` and `gci/master.yaml` to ensure that Kubernetes jobs can start automatically after reboot. This is not needed for GCI but required by Ubuntu. See https://github.com/kubernetes/kubernetes/pull/44744#discussion_r113105970 for details. With this change, Ubuntu could use the same provisioning scripts as GCI's.
Ran e2e tests using the following command and all tests passed.
```
KUBE_GCE_NODE_IMAGE=ubuntu-gke-1604-xenial-v20170420-1 KUBE_GCE_NODE_PROJECT=ubuntu-os-gke-cloud KUBE_NODE_OS_DISTRIBUTION=ubuntu GINKGO_PARALLEL=y GINKGO_PARALLEL_NODES=30 go run hack/e2e.go -- -v --build --up --test --test_args="--ginkgo.skip=\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]" --down
```
Also tested manually for both GCI and Ubuntu images.
**Release note**:
`Support Ubuntu 16.04 image on GCE`
Automatic merge from submit-queue
Send dns details only after cdk-addons are configured
**What this PR does / why we need it**: This is a bugfix on the deployment of Kubernetes via Juju. See issue below.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#40386 and
https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/262
**Special notes for your reviewer**:
**Release note**:
```
Fix KubeDNS issue in Juju deployments.
```
Automatic merge from submit-queue
Allow disabling log dump for nodes (in preparation for using logexporter)
This is, in part, a change required for allowing usage of [logexporter](https://github.com/kubernetes/test-infra/tree/master/logexporter) for dumping node logs to GCS directly, instead of doing it through log-dump.sh.
cc @kubernetes/test-infra-maintainers @wojtek-t @gmarek @fejta
Automatic merge from submit-queue (batch tested with PRs 44931, 44808)
Closes#44392
**What this PR does / why we need it**:
Fix the pause action with regard to the new behavior where
--delete-local-data=false by default. Historically --force was all that
was required, this flag has changed to be more descriptive of the
actions it's taking.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#44392
**Release note**:
```release-note
Added support to the pause action in the kubernetes-worker charm for new flag --delete-local-data
```
Fix the pause action with regard to the new behavior where
--delete-local-data=false by default. Historically --force was all that
was required, this flag has changed to be more descriptive of the
actions it's taking.
Automatic merge from submit-queue
Add namespace-{list, create, delete} actions to the kubernetes-master layer
**What this PR does / why we need it**:
This PR adds namespace-{list,create,delete} actions to the juju kubernetes-master layer.
**Which issue this PR fixes**: fixes#43712
**Special notes for your reviewer**:
Original PR https://github.com/juju-solutions/kubernetes/pull/109
**Release note**:
```
Add namespace-{list,create,delete} actions to the juju kubernetes-master layer
```
Automatic merge from submit-queue (batch tested with PRs 42486, 44780)
Hostname patch for vsphere provider limitations with juju
**What this PR does / why we need it**:
The Juju VSphere provider doesn't set a unique hostname which causes issues when scaling worker-pools and they all have the hostname `ubuntuguest`. Instead we assign the JUJU_UNIT_NAME to that hostname to prevent the collision which allows the master to sort out that there are multiple units and not one attempting re-registration.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/237
**Special notes for your reviewer**:
The charm-pre-exec runs before it installs the charm software so the validation can happen quickly. Check hostname output, as well as kubectl get no post deployment.
```release-note
Resolves juju vsphere hostname bug showing only a single node in a scaled node-pool.
```
This patch sets the hostname to a unique identifier (the juju unit name)
during pre-deployment of the charm. This may not be a FQDN resolveable
hostname but will prevent hostname collision.
Using Ubuntu on GCE to run cluster e2e tests requires slightly different
node.yaml and master.yaml files than GCI, because Ubuntu uses systemd as
PID 1, wheras GCI uses upstart with a systemd delegate. Therefore the
e2e tests fail using those files since the kubernetes services are not
brought back up after a node/master reboot.
Automatic merge from submit-queue (batch tested with PRs 44722, 44704, 44681, 44494, 39732)
prevent installation of docker from upstream
**What this PR does / why we need it**: Disallows installation of upstream docker from PPA in the Juju kubernetes-worker charm.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Disallows installation of upstream docker from PPA in the Juju kubernetes-worker charm.
```
Automatic merge from submit-queue (batch tested with PRs 42177, 42176, 44721)
Removed fluentd-gcp manifest pod
```release-note
Fluentd manifest pod is no longer created on non-registered master when creating clusters using kube-up.sh.
```
Automatic merge from submit-queue
Remove the old docker-multinode files that were built into the hyperkube image
**What this PR does / why we need it**:
ref: https://goo.gl/VxSaKx
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
The hyperkube image has been slimmed down and no longer includes addon manifests and other various scripts. These were introduced for the now removed docker-multinode setup system.
```
cc @jbeda @brendandburns @bgrant0607 @justinsb @mikedanese
Automatic merge from submit-queue (batch tested with PRs 44687, 44689, 44661)
Retry in get-kube.sh to avoid download flakes.
GCS has up to 2% 5xx rates, so retrying is critical.
This is currently failing about 8 times per day [according to the dashboard](https://storage.googleapis.com/k8s-gubernator/triage/index.html?test=Extract#be2f33fb1e6dd2389d12). It could be backported to reduce the flake rate.
Relase note:
```release-note
NONE
```
Automatic merge from submit-queue
select one api endpoint at random when deploying kubernetes-core charm
**What this PR does / why we need it**: Fixes a bug in the kubernetes-worker Juju charm code that attempted to give kube-proxy more than one api endpoint.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/255
**Release note**:
```release-note
Fixes a bug in the kubernetes-worker Juju charm code that attempted to give kube-proxy more than one api endpoint.
```
Automatic merge from submit-queue
Update gcr.io/google_containers/porter image to 4524579c0e
**What this PR does / why we need it**: updates the porter image to one built at 4524579c0e using go1.8.1.
This incorporates #44638, which has a new dummy certificate that is compliant with go1.8+.
Image has already been pushed.
**Release note**:
```release-note
NONE
```
/assign @liggitt
/cc @luxas @lavalamp
Automatic merge from submit-queue
Fix ceph-secret type to kubernetes.io/rbd in kubernetes-master charm
**What this PR does / why we need it**:
This fixes the type of the ceph-secret secret that's created by the kubernetes-master charm.
Without the `kubernetes.io/rbd` type, automatic provisioning of PVCs doesn't work.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Fix ceph-secret type to kubernetes.io/rbd in kubernetes-master charm
```
Automatic merge from submit-queue
issue_43986: fix docu with non-functional proxy
The documentation defines a couple of replication-controller and service
to provision a docker-registry somewhere on the cluster and have it
available by the name viz. A record of
kube-registry.default.svc.<clustername>.
On each node, http-proxies are placed as daemon-set with the
kube-registry DNS name set as upstream, so that the registry is
available on each host under endpoint localhost:5000
Because in the documentation, selector-identifiers are the same for
"upstream" registry and proxies, the proxies themselves register under
the service intended for the upstream and now have themselves as
upstream under a different port, where connection attempts result in
"connection refused".
Adapting selectors to be unique as in this patch fixes the problem.
**What this PR does / why we need it**:
Patch fixes (cf. above) erroneous documentation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#43986
**Special notes for your reviewer**:
Thank you for your consideration.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 43000, 44500, 44457, 44553, 44267)
Add Kubernetes 1.6 support to Juju charms
**What this PR does / why we need it**:
This adds Kubernetes 1.6 support to Juju charms.
This includes some large architectural changes in order to support multiple versions of Kubernetes with a single release of the charms. There are a few bug fixes in here as well, for issues that we discovered during testing.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Thanks to @marcoceppi, @ktsakalozos, @jacekn, @mbruzek, @tvansteenburgh for their work in this feature branch as well!
**Release note**:
```release-note
Add Kubernetes 1.6 support to Juju charms
Add metric collection to charms for autoscaling
Update kubernetes-e2e charm to fail when test suite fails
Update Juju charms to use snaps
Add registry action to the kubernetes-worker charm
Add support for kube-proxy cluster-cidr option to kubernetes-worker charm
Fix kubernetes-master charm starting services before TLS certs are saved
Fix kubernetes-worker charm failures in LXD
Fix stop hook failure on kubernetes-worker charm
Fix handling of juju kubernetes-worker.restart-needed state
Fix nagios checks in charms
```
The documentation defines a couple of replication-controller and service
to provision a docker-registry somewhere on the cluster and have it
available by the name viz. A record of
kube-registry.default.svc.<clustername>.
On each node, http-proxies are placed as daemon-set with the
kube-registry DNS name set as upstream, so that the registry is
available on each host under endpoint localhost:5000
Because in the documentation, selector-identifiers are the same for
"upstream" registry and proxies, the proxies themselves register under
the service intended for the upstream and now have themselves as
upstream under a different port, where connection attempts result in
"connection refused".
Adapting selectors to be unique as in this patch fixes the problem.
modified: cluster/addons/registry/README.md
modified: cluster/addons/registry/registry-rc.yaml
modified: cluster/addons/registry/registry-svc.yaml
Automatic merge from submit-queue (batch tested with PRs 40055, 42085, 44509, 44568, 43956)
Change the default CLUSTER_IP_RANGE used by e2e
The existing choice intersects with the range reserved for auto
subnets and cannot be used with some GCP features.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44406, 41543, 44071, 44374, 44299)
Enable service account token lookup by default
Fixes#24167
```release-note
kube-apiserver: --service-account-lookup now defaults to true, requiring the Secret API object containing the token to exist in order for a service account token to be valid. This enables service account tokens to be revoked by deleting the Secret object containing the token.
```
Automatic merge from submit-queue
Make get-kube.sh work properly the "ci/latest" pointer
**What this PR does / why we need it**: this is a (late) followup from #36419, fixing a bug discovered in https://github.com/kubernetes/kubernetes/pull/36419#issuecomment-265679578.
Basically, `get-kube-binaries.sh` looks at `$KUBERNETES_RELEASE_URL`, but we weren't properly overriding it in `get-kube.sh` when downloading binaries from the CI release bucket. With this change, we set the variable correctly, and everything works:
```console
$ KUBERNETES_RELEASE=ci/latest ~/code/kubernetes/src/k8s.io/kubernetes/cluster/get-kube.sh
Downloading kubernetes release v1.7.0-alpha.0.2068+3a3dc827e45426
from https://dl.k8s.io/ci/v1.7.0-alpha.0.2068+3a3dc827e45426/kubernetes.tar.gz
to /tmp/foo/kubernetes.tar.gz
Is this ok? [Y]/n
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 161 100 161 0 0 1004 0 --:--:-- --:--:-- --:--:-- 1006
100 6023k 100 6023k 0 0 10.9M 0 --:--:-- --:--:-- --:--:-- 10.9M
Unpacking kubernetes release v1.7.0-alpha.0.2068+3a3dc827e45426
Kubernetes release: v1.7.0-alpha.0.2068+3a3dc827e45426
Server: linux/amd64 (to override, set KUBERNETES_SERVER_ARCH)
Client: linux/amd64 (autodetected)
Will download kubernetes-server-linux-amd64.tar.gz from https://dl.k8s.io/ci/v1.7.0-alpha.0.2068+3a3dc827e45426
Will download and extract kubernetes-client-linux-amd64.tar.gz from https://dl.k8s.io/ci/v1.7.0-alpha.0.2068+3a3dc827e45426
Is this ok? [Y]/n
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 161 100 161 0 0 991 0 --:--:-- --:--:-- --:--:-- 987
100 348M 100 348M 0 0 39.1M 0 0:00:08 0:00:08 --:--:-- 34.2M
md5sum(kubernetes-server-linux-amd64.tar.gz)=e71c9b48f6551797a74de2b83b501c44
sha1sum(kubernetes-server-linux-amd64.tar.gz)=688dcf567b60e27e3d9bf97436154543432768cf
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 161 100 161 0 0 1019 0 --:--:-- --:--:-- --:--:-- 1025
100 29.0M 100 29.0M 0 0 32.2M 0 --:--:-- --:--:-- --:--:-- 95.4M
md5sum(kubernetes-client-linux-amd64.tar.gz)=8e6a90298411ae5a0e943b1c0e182b1d
sha1sum(kubernetes-client-linux-amd64.tar.gz)=187a2d2c1c6ae1ead32ec4c1fa51f695223edaae
Extracting /tmp/foo/kubernetes/client/kubernetes-client-linux-amd64.tar.gz into /tmp/foo/kubernetes/platforms/linux/amd64
Add '/tmp/foo/kubernetes/client/bin' to your PATH to use newly-installed binaries.
Creating a kubernetes on gce...
...
```
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Ensure only 1 Swift URL is used in cluster operations
**What this PR does / why we need it**:
Extracts only 1 Swift URL if multiple are returned from Keystone.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
https://github.com/kubernetes/kubernetes/issues/34930
**Special notes for your reviewer**:
**Release note**:
```release-note
Heat cluster operations now support environments that have multiple Swift URLs
```
Automatic merge from submit-queue
Use auto mode networks instead of legacy networks in GCP
Use of the --range flag creates legacy networks in GCP.
Legacy networks will not support new GCP features.
```release-note
NONE
```
Automatic merge from submit-queue
Add support for IP aliases for pod IPs (GCP alpha feature)
```release-note
Adds support for allocation of pod IPs via IP aliases.
# Adds KUBE_GCE_ENABLE_IP_ALIASES flag to the cluster up scripts (`kube-{up,down}.sh`).
KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes. This feature is currently
only available on GCE.
## Usage
$ CLUSTER_IP_RANGE=10.100.0.0/16 KUBE_GCE_ENABLE_IP_ALIASES=true bash -x cluster/kube-up.sh
# Adds CloudAllocator to the node CIDR allocator (kubernetes-controller manager).
If CIDRAllocatorType is set to `CloudCIDRAllocator`, then allocation
of CIDR allocation instead is done by the external cloud provider and
the node controller is only responsible for reflecting the allocation
into the node spec.
- Splits off the rangeAllocator from the cidr_allocator.go file.
- Adds cloudCIDRAllocator, which is used when the cloud provider allocates
the CIDR ranges externally. (GCE support only)
- Updates RBAC permission for node controller to include PATCH
```
KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes.
NODE_IP_RANGE will control the node instance IP cidr
KUBE_GCE_IP_ALIAS_SIZE controls the size of each podCIDR
IP_ALIAS_SUBNETWORK controls the name of the subnet created for the cluster
Automatic merge from submit-queue (batch tested with PRs 43866, 42748)
hack/cluster: download cfssl if not present
hack/local-up-cluster.sh uses cfssl to generate certificates and
will exit it cfssl is not already installed. But other cluster-up
mechanisms (GCE) that generate certs just download cfssl if not
present. Make local-up-cluster.sh do that too so users don't have
to bother installing it from somewhere.
Automatic merge from submit-queue
fix kubedns-sa.yaml missing "namespace: kube-system" value
The file kubedns-sa.yaml missing `namespace: kube-system`, so it will create ServiceAccount kube-dns in default namespace, this will cause kube-dns deployment's pods be blocked forever;
Some logs as following:
> - lastTransitionTime: 2017-04-06T19:02:12Z
> lastUpdateTime: 2017-04-06T19:02:12Z
> message: 'unable to create pods: pods "kube-dns-699984412-" is forbidden: service
> account kube-system/kube-dns was not found, retry after the service account
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42025, 44169, 43940)
if we have a dedicated serviceaccount keypair, use it to verify serviceaccounts
Automatic merge from submit-queue
[Federation] Remove FEDERATIONS_DOMAIN_MAP references
Remove all references to FEDERATIONS_DOMAIN_MAP as this method is no longer is used and is replaced by adding federation domain map to kube-dns configmap.
cc @madhusudancs @kubernetes/sig-federation-pr-reviews
**Release note**:
```
[Federation] Mechanism of adding `federation domain maps` to kube-dns deployment via `--federations` flag is superseded by adding/updating `federations` key in `kube-system/kube-dns` configmap. If user is using kubefed tool to join cluster federation, adding federation domain maps to kube-dns is already taken care by `kubefed join` and does not need further action.
```
Automatic merge from submit-queue
Use shared informers for proxy endpoints and service configs
Use shared informers instead of creating local controllers/reflectors
for the proxy's endpoints and service configs. This allows downstream
integrators to pass in preexisting shared informers to save on memory &
cpu usage.
This also enables the cache mutation detector for kube-proxy for those
presubmit jobs that already turn it on.
Follow-up to #43295 cc @wojtek-t
Will race with #43937 for conflicting changes 😄 cc @thockin
cc @smarterclayton @sttts @liggitt @deads2k @derekwaynecarr @eparis @kubernetes/rh-cluster-infra
```release-note
kube-apiserver: --service-account-lookup now defaults to true. This enables service account tokens to be revoked by deleting the Secret object containing the token.
```
Automatic merge from submit-queue (batch tested with PRs 44047, 43514, 44037, 43467)
Juju: Enable GPU mode if GPU hardware detected
**What this PR does / why we need it**:
Automatically configures kubernetes-worker node to utilize GPU hardware when such hardware is detected.
layer-nvidia-cuda does the hardware detection, installs CUDA and Nvidia
drivers, and sets a state that the k8s-worker can react to.
When gpu is available, worker updates config and restarts kubelet to
enable gpu mode. Worker then notifies master that it's in gpu mode via
the kube-control relation.
When master sees that a worker is in gpu mode, it updates to privileged
mode and restarts kube-apiserver.
The kube-control interface has subsumed the kube-dns interface
functionality.
An 'allow-privileged' config option has been added to both worker and
master charms. The gpu enablement respects the value of this option;
i.e., we can't enable gpu mode if the operator has set
allow-privileged="false".
**Special notes for your reviewer**:
Quickest test setup is as follows:
```bash
# Bootstrap. If your aws account doesn't have a default vpc, you'll need to
# specify one at bootstrap time so that juju can provision a p2.xlarge.
# Otherwise you can leave out the --config "vpc-id=vpc-xxxxxxxx" bit.
juju bootstrap --config "vpc-id=vpc-xxxxxxxx" --constraints "cores=4 mem=16G root-disk=64G" aws/us-east-1 k8s
# Deploy the bundle containing master and worker charms built from
# https://github.com/tvansteenburgh/kubernetes/tree/gpu-support/cluster/juju/layers
juju deploy cs:~tvansteenburgh/bundle/kubernetes-gpu-support-3
# Setup kubectl locally
mkdir -p ~/.kube
juju scp kubernetes-master/0:config ~/.kube/config
juju scp kubernetes-master/0:kubectl ./kubectl
# Download a gpu-dependent job spec
wget -O /tmp/nvidia-smi.yaml https://raw.githubusercontent.com/madeden/blogposts/master/k8s-gpu-cloud/src/nvidia-smi.yaml
# Create the job
kubectl create -f /tmp/nvidia-smi.yaml
# You should see a new nvidia-smi-xxxxx pod created
kubectl get pods
# Wait a bit for the job to run, then view logs; you should see the
# nvidia-smi table output
kubectl logs $(kubectl get pods -l name=nvidia-smi -o=name -a)
```
kube-control interface: https://github.com/juju-solutions/interface-kube-control
nvidia-cuda layer: https://github.com/juju-solutions/layer-nvidia-cuda
(Both are registered on http://interfaces.juju.solutions/)
**Release note**:
```release-note
Juju: Enable GPU mode if GPU hardware detected
```
Automatic merge from submit-queue
get-kube-local.sh checks pods with option "--namespace=kube-system"
**What this PR does / why we need it**:
Local cluster creation using get-kube-local.sh is never finished.
The get-kube-local.sh monitors running_count of pods such as etcd,
master and kube-proxy, but these pods are created under the namespace
kube-system. Therefore kubectl can't find these pods then cluster
creation isn't completed.
The get-kube-local.sh should monitor created pods with option
"--namespace=kube-system".
**Which issue this PR fixes** : fixes#42517
**Release note**:
```
`NONE`
```
Automatic merge from submit-queue
cluster/log-dump - chmod files before dumping
We make the files world-readable, so that installation techniques that
lock down the logfiles can still be dumped.
Issue https://github.com/kubernetes/test-infra/issues/2397
```release-note
NONE
```
Use shared informers instead of creating local controllers/reflectors
for the proxy's endpoints and service configs. This allows downstream
integrators to pass in preexisting shared informers to save on memory &
cpu usage.
This also enables the cache mutation detector for kube-proxy for those
presubmit jobs that already turn it on.
Automatic merge from submit-queue
Adding more proxy options and header to nginx load-balancer.
**What this PR does / why we need it**: The kubeapi-load-balancer uses nginx to proxy commands to the kube-apiserver. It currently does not support SPDY and therefore the `kubectl exec` command is broken.
**Which issue this PR fixes** :
fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/226
fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/201
**Special notes for your reviewer**: This only changes the nginx configuration no code change was required.
**Release note**:
```release-note
Using http2 in kubeapi-load-balancer to fix kubectl exec uses
```
hack/local-up-cluster.sh uses cfssl to generate certificates and
will exit it cfssl is not already installed. But other cluster-up
mechanisms (GCE) that generate certs just download cfssl if not
present. Make local-up-cluster.sh do that too.
Automatic merge from submit-queue (batch tested with PRs 42379, 42668, 42876, 41473, 43260)
Silence error messages from the docker rmi call we expect to fail
**What this PR does / why we need it**: when we removed `docker tag -f` in #34361 we added a bunch of `docker rmi` calls to preserve behavior for older docker versions. That step is usually a no-op, however, and results in confusing messages like
```
Tagging docker image gcr.io/google_containers/kube-proxy:c8d0b2e7a06b451117a8ac58fc3bb3d3 as gcr.io/kubernetes-release-test/kube-proxy-amd64:v1.5.4
Error response from daemon: No such image: gcr.io/kubernetes-release-test/kube-proxy-amd64:v1.5.4
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#42665
**Special notes for your reviewer**: I could probably remove the `docker rmi` calls entirely, though I don't know if folks are still using docker < 1.10. (I think Jenkins still has 1.9.1.)
**Release note**:
```release-note
NONE
```
cc @jessfraz
Per Clayton's suggestion, move stuff from cluster/lib/util.sh to
hack/lib/util.sh. Also consolidate ensure-temp-dir and use the
hack/lib/util.sh implementation rather than cluster/common.sh.
Automatic merge from submit-queue (batch tested with PRs 43726, 43643)
Make a smaller redis image for testing, based on Alpine.
**What this PR does / why we need it**:
This shrinks gcr.io/google_containers/redis from 400MB to 5MB, which should reduce flakes.
**Which issue this PR fixes**:
fixes#43631
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Moves dns-horizontal-autoscaler to a separate service account
Similar to #38816.
As one of the cluster add-ons, dns-horizontal-autoscaler is now using the default service account in kube-system namespace, which is introduced by https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/e2e-rbac-bindings/random-addon-grabbag.yaml for the ease of transition. This default service account will be removed in the future.
This PR subdivides dns-horizontal-autoscaler to a separate service account and setup the necessary permissions.
@bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43518, 42467)
install/kube-up: fix some errors while install k8s through kube-up/down.sh
What this PR does / why we need it:
etcd2.3.1 will be installed follow this scripts, but k8s use etcd3 as default storage backend, so the next error will always be apprear:
API server: rpc error: code = 13 desc = transport is closing
so i think we should change the version of etcd
thank you!
Automatic merge from submit-queue
Centos provider: generate SSL certificates for etcd cluster.
**What this PR does / why we need it**:
Support secure etcd cluster for centos provider, generate SSL certificates for etcd in default. Running it w/o SSL is exposing cluster data to everyone and is not recommended. [#39462](https://github.com/kubernetes/kubernetes/pull/39462#issuecomment-271601547)
/cc @jszczepkowski @zmerlynn
**Release note**:
```release-note
Support secure etcd cluster for centos provider.
```
Automatic merge from submit-queue
added prompt warning if etcd3 media type isn't set during upgrade
**What this PR does / why we need it**:
This adds a prompt confirming the upgrade when `STORAGE_MEDIA_TYPE` is not explicitly set. This is to prevent users from accidentally upgrading to protobuf.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Alongs with docs, addresses #43669
**Special notes for your reviewer**:
Should be cherrypicked onto `release-1.6`
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41297, 42638, 42666, 43039, 42567)
Allow minion floating IPs to be optional
**What this PR does / why we need it**:
Makes the generation of floating IPs for worker nodes optional, based on an env var. To quote the original issue:
> Currently, the OpenStack installation method assigns a floating IP to every single worker node. While this is fine for smaller clusters with a good sized IP pool, it can cause issues in environments with high node counts or less IPs available.
**Which issue this PR fixes**:
https://github.com/kubernetes/kubernetes/issues/40737
**Special notes for your reviewer**:
I used the conditions section of the Heat spec: https://docs.openstack.org/developer/heat/template_guide/hot_spec.html#conditions-section
**Release note**:
```release-note
OpenStack clusters can now specify whether worker nodes are assigned a floating IP
```
Automatic merge from submit-queue (batch tested with PRs 43048, 43624, 43649)
Remove E2E_UPGRADE_TEST check in config-test.sh
Once https://github.com/kubernetes/test-infra/pull/2330 merges, the upgrade tests will drive the exact behavior they want, and we can remove the check for envvars leaked from the job env
Automatic merge from submit-queue
Update NPD rbac.
I recently enabled NPD in gke.
However, I found that in gke e2e test (https://k8s-testgrid.appspot.com/google-gke#gci-gke), npd on the node could not talk with apiserver, and reported full of following errors:
```
E0324 05:08:26.745545 1328 manager.go:160] failed to update node conditions: the server does not allow access to the requested resource (patch nodes gke-bootstrap-e2e-default-pool-fd91d792-mqh4)
E0324 05:08:37.719423 1328 manager.go:160] failed to update node conditions: the server does not allow access to the requested resource (patch nodes gke-bootstrap-e2e-default-pool-fd91d792-mqh4)
E0324 05:08:47.719694 1328 manager.go:160] failed to update node conditions: the server does not allow access to the requested resource (patch nodes gke-bootstrap-e2e-default-pool-fd91d792-mqh4)
```
I created a GKE cluster (v1.7.0-alpha.0.1483+1e879c69ecf09e) myself, and found that addon manager could not create npd binding with the following error:
```
error: error validating "/etc/kubernetes/addons/node-problem-detector/standalone/npd-binding.yaml": error validating data: couldn't find type: v1alpha1.ClusterRoleBinding; if you choose to ignore these errors, turn validation off with --validate=false
```
I found that rbac was updated to beta, but npd was missed because it was merged after 9e6a3496b4 (diff-b05c70853d9a772b310db71a61297841).
I updated rbac to beta in the master manifest and npd on the node could talk with apiserver immediately.
We must get this in 1.6 to make NPD working. @dchen1107
@dchen1107 @fabioy @liggitt
Automatic merge from submit-queue (batch tested with PRs 43546, 43544)
Default to enabling legacy ABAC policy in non-test kube-up.sh environments
Fixes https://github.com/kubernetes/kubernetes/issues/43541
In 1.5, we unconditionally stomped the abac policy file if KUBE_USER was set, and unconditionally used ABAC mode pointing to that file.
In 1.6, unless the user opts out (via `ENABLE_LEGACY_ABAC=false`), we want the same legacy policy included as a fallback to RBAC.
This PR:
* defaults legacy ABAC **on** in normal deployments
* defaults legacy ABAC **on** in upgrade E2Es (ensures combination of ABAC and RBAC works properly for upgraded clusters)
* defaults legacy ABAC **off** in non-upgrade E2Es (ensures e2e tests 1.6+ run with tightened permissions, and that default RBAC roles cover the required core components)
GKE changes to drive the `ENABLE_LEGACY_ABAC` envvar were made by @cjcullen out of band
```release-note
`kube-up.sh` using the `gce` provider enables both RBAC authorization and the permissive legacy ABAC policy that makes all service accounts superusers. To opt out of the permissive ABAC policy, export the environment variable `ENABLE_LEGACY_ABAC=false` before running `cluster/kube-up.sh`.
```
Automatic merge from submit-queue
Bump CNI consumers to v0.5.1
**What this PR does / why we need it**:
- vendored CNI plugins properly handle `DEL` on missing resources
- update CNI version refs
**Which issue this PR fixes**
fixes#43488
**Release note**:
`bumps CNI to version v0.5.1 where plugins properly handle DEL on non existent resources`
layer-nvidia-cuda does the hardware detection and sets a state that the
worker can react to.
When gpu is available, worker updates config and restarts kubelet to
enable gpu mode. Worker then notifies master that it's in gpu mode via
the kube-control relation.
When master sees that a worker is in gpu mode, it updates to privileged
mode and restarts kube-apiserver.
The kube-control interface has subsumed the kube-dns interface
functionality.
An 'allow-privileged' config option has been added to both worker and
master charms. The gpu enablement respects the value of this option;
i.e., we can't enable gpu mode if the operator has set
allow-privileged="false".
Automatic merge from submit-queue
Add an env KUBE_ENABLE_MASTER_NOSCHEDULE_TAINT and disable it by default
This PR changed master `NoSchedule` taint to opt-in.
As is discussed with @bgrant0607 @janetkuo, `NoSchedule` master taint breaks existing user workload, we should not enable it by default.
Previously, NPD required the taint because it can only support one OS distro with a specific configuration. If master and node are using different OS distros, NPD will not work either on master or node. However, we've already fixed this in https://github.com/kubernetes/kubernetes/pull/40206, so for NPD it's fine to disable the taint.
This should work, but I'll still try it in my cluster to confirm.
@kubernetes/sig-scheduling-misc @dchen1107 @mikedanese
Automatic merge from submit-queue (batch tested with PRs 43422, 43458)
Bump Cluster Autoscaler version to 0.5.0
**What this PR does / why we need it**:
This PR bumps Cluster Autoscaler version to 0.5.0. The version is the same as 0.5.0-beta2 (from the code perspective). We are just removing the -beta2 tag from the image.
**Release note**:
None.
cc: @MaciekPytel @fgrzadkowski @wojtek-t
Automatic merge from submit-queue
Keep ResourceQuota admission at the end of the chain
Fixes#43426
Moves DefaultTolerationSeconds admission prior to ResourceQuota to keep it at the end of the chain
Automatic merge from submit-queue
Increase memory limit for fluentd-gcp
This PR increases fluentd memory limit in fluentd-gcp addon to avoid OOMs. Request is left intact
Automatic merge from submit-queue
Export KUBE_VERSION for consumption by get-kube-binaries.sh
/assign @ixdy
https://github.com/kubernetes/kubernetes/pull/43331 will not have any effect until we update get-kube.sh to export KUBE_VERSION
Automatic merge from submit-queue (batch tested with PRs 43331, 43336)
Do not override KUBERNETES_RELEASE if already set
/assign @ixdy
If the user calls `get-kube.sh` with `KUBERNETES_RELEASE` and `KUBERNETES_RELEASE_URL` already set, continue to use these values.
Automatic merge from submit-queue
Update Dashboard version to v1.6.0
**What this PR does / why we need it**:
Updates dashboard addon to latest version. Changelog can be found [here](https://github.com/kubernetes/dashboard/releases/tag/v1.6.0).
**Release note**:
```release-note
Update dashboard version to v1.6.0
```
Automatic merge from submit-queue (batch tested with PRs 43254, 43255, 43184, 42509)
Symlink cluster/gce/cos to cluster/gce/gci
Fixes: #43139
As I just unfortunately found out after spending an hour getting to the point where I could test this, upgrade.sh does not support upgrading nodes to local binaries. So someone will have to cut a release to test whether this change actually works.
Automatic merge from submit-queue (batch tested with PRs 43254, 43255, 43184, 42509)
Re-add kube_proxy to the abac file (Match what we had in 1.5).
**What this PR does / why we need it**:
Make the ABAC file match what it was in 1.5. GKE rewrites the ABAC file every time, so we were clobbering the kube_proxy entry that used to exist. This would have gone unnoticed, but a separate bug in GKE is causing the token file rewrites to fail on GKE (meaning group used in RBAC aren't there).
**Which issue this PR fixes**
fixes#42746
@liggitt @krousey
Automatic merge from submit-queue
Allow ABAC to be disabled easily on upgrades
**What this PR does / why we need it**:
Adds a local variable to the configure-helper script so that ABAC_AUTHZ_FILE can be set to a nonexistent file in kube-env to disable ABAC on a cluster that previously was using ABAC.
@liggitt @Q-Lee
Automatic merge from submit-queue
Update npd to the official v0.3.0 release.
Update npd to the official release v0.3.0.
This also fixes a npd bug https://github.com/kubernetes/node-problem-detector/pull/98.
@dchen1107 @kubernetes/node-problem-detector-reviewers
Automatic merge from submit-queue (batch tested with PRs 43177, 43202)
Rename default storageclasses
From UX perspective, 'default' is a bad name for the default storage class:
```
$ kubectl get storageclass
NAME TYPE
default (default) kubernetes.io/aws-ebs
```
This is sort of OK, it gets more confusing when user is not happy with the
preinstalled default storage class and creates its own and makes it default:
```
NAME TYPE
default kubernetes.io/aws-ebs
iops (default) kubernetes.io/aws-ebs
```
This PR uses name of the underlying storage as name of the default storage class:
```
NAME TYPE
gp2 (default) kubernetes.io/aws-ebs
```
On GCE (and many others):
```
NAME TYPE
standard (default) kubernetes.io/gce-pd
```
Detailed list of names of new default storage classes:
* AWS: `gp2`
* GCE: `standard` (from pd-standard)
* vSphere: `thin`
* Cinder does not have a default - it's up to OpenStack admin to set some default and it can change at any time, using `standard` as the class name.
* I was not able to find details about Azure, using `standard` too.
@justinsb @jingxu97 @kerneltime @colemickens, PTAL quickly so we can catch 1.6.
```release-note
NONE
```
For 1.6 release manager, this PR just renames objects in addon manager.
From UX perspective, 'default' is a bad name for the default storage class:
$ kubectl get storageclass
NAME TYPE
default (default) kubernetes.io/aws-ebs
This is sort of OK, it gets more confusing when user is not happy with the
preinstalled default storage class and creates its own and makes it default:
NAME TYPE
default kubernetes.io/aws-ebs
iops (default) kubernetes.io/aws-ebs
Automatic merge from submit-queue (batch tested with PRs 43106, 43110)
Bumped rescheduler version to 0.3.0
fix#32531https://github.com/kubernetes/contrib/pull/2474 needs to be merged first
cc @ethernetdan @marun @k82cn @aveshagarwal
Automatic merge from submit-queue (batch tested with PRs 43018, 42713, 42819)
Update startup scripts for kube-dns ConfigMap and ServiceAccount
Follow up PR of #42757. This PR changes all existing startup scripts to support default kube-dns ConfigMap and ServiceAccount.
@bowei
cc @liggitt
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42802, 42927, 42669, 42988, 43012)
Update Cluster Autoscaler entrypoint
**What this PR does / why we need it**:
Update Cluster Autoscaler manifest file to use new shell wrapper instead of directly calling CA binary (the wrapper is already included in current CA image).
Add params to improve logging.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 42940, 42906, 42970, 42848)
Enable RollingUpdates for the fluentd daemonset addon
In anticipation of needing to rev fluentd-gcp image versions in patch releases, we should enable rolling update so the new versions get rolled out in a timely manner.
/cc @ixdy
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
[Federation][e2e] Add framework for upgrade test in federation
Adding framework for federation upgrade tests. please refer to #41791
cc @madhusudancs @nikhiljindal @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41830, 42630)
Arrange for elasticsearch to shutdown cleanly
Kubernetes initiates "graceful shutdown" by sending SIGTERM to pid 1, which
is exactly what elasticsearch is expecting (good!)
The way the existing startup scripts worked however, this signal arrived at
the shell wrapper, not elasticsearch, and the shell wrapper exited,
killing the container immediately (bad!)
Before this change:
```
1 ? Ss 0:00 /bin/sh -c /run.sh
6 ? S 0:00 /bin/bash /run.sh
13 ? S 0:00 \_ /bin/su -c /elasticsearch/bin/elasticsearch elasticsearch
14 ? Ss 0:00 \_ sh -c /elasticsearch/bin/elasticsearch
15 ? Sl 19:18 \_ /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
```
After this change:
```
1 ? Ssl 0:29 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
```
Automatic merge from submit-queue (batch tested with PRs 42877, 42853)
discriminate more when parsing kube-env :(
Exactly match the key. Right now CA_KEY matches ETCD_CA_KEY and we just pick the first because fml.
I HATE BASH
more fixes for kubelet rbac enablement upgrades.
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)
Handle NPD during cluster upgrade.
Generate NPD token during upgrade.
I could not fully verify this change because of https://github.com/kubernetes/kubernetes/issues/42199. However, at least I tried upgrade master, and the corresponding environment variables are correctly generated.
```
...
ENABLE_NODE_PROBLEM_DETECTOR: 'standalone'
...
KUBELET_TOKEN: 'PKNgAaVXeL3VojND2s0KMleELjzGK0oW'
```
@maisem @dchen1107
Automatic merge from submit-queue (batch tested with PRs 42768, 42760, 42771, 42767)
Create EnsureExists class addons before Reconcile class addons
From #42757.
The addon-manager creates "Reconcile" class addons before creates "EnsureExists" class addons, which is not the best order. The "EnsureExists" class addons tend to be some default configurations like `default-storage-class` and `default kube-dns ConfigMap` (being added in #42757), and we would like to have these default configurations created before other addons are created.
@mikedanese @bowei
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42211, 38691, 42737, 42757, 42754)
Adds default kube-dns configmap
From #42746.
Using 1.4 kubelet with 1.6 master is causing upgrade test failures. Because kubelet doesn't understand optional configmaps and there is no kube-dns configmap exist, kubelet will crash.
This PR adds an empty kube-dns configmap as an "EnsureExists" class addon for fixing that.
Note: The addon-manager creates "Reconcile" class addons before creates "EnsureExists" class addons, which is not the order we want. I will soon have another PR to reverse this order.
@bowei @krousey @skriss
```release-note
none
```
Kubernetes initiates "graceful shutdown" by sending SIGTERM to pid 1.
The way the existing startup scripts worked, this signal arrived at
the shell wrapper, not elasticsearch, and the shell wrapper exited,
killing the container immediately.
Before this change:
1 ? Ss 0:00 /bin/sh -c /run.sh
6 ? S 0:00 /bin/bash /run.sh
13 ? S 0:00 \_ /bin/su -c /elasticsearch/bin/elasticsearch elasticsearch
14 ? Ss 0:00 \_ sh -c /elasticsearch/bin/elasticsearch
15 ? Sl 19:18 \_ /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
After this change:
1 ? Ssl 0:29 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
Automatic merge from submit-queue (batch tested with PRs 41826, 42405)
Add stubDomains and upstreamNameservers configuration to kube-dns
```release-note
Updates the dnsmasq cache/mux layer to be managed by dnsmasq-nanny.
dnsmasq-nanny manages dnsmasq based on values from the
kube-system:kube-dns configmap:
"stubDomains": {
"acme.local": ["1.2.3.4"]
},
is a map of domain to list of nameservers for the domain. This is used
to inject private DNS domains into the kube-dns namespace. In the above
example, any DNS requests for *.acme.local will be served by the
nameserver 1.2.3.4.
"upstreamNameservers": ["8.8.8.8", "8.8.4.4"]
is a list of upstreamNameservers to use, overriding the configuration
specified in /etc/resolv.conf.
```
Automatic merge from submit-queue
Fixing unbound bash variable.
**What this PR does / why we need it**: this fixes a bug introduced in 1.6 for ABAC.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: without this, we hit an unbound variable and fail to bring up the kube-apiserver with ABAC enabled.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 42070, 42127)
Remove fluentd-gcp image sources
This PR removes fluentd-gcp image sources from the main kubernetes repo to move it the `contrib`: https://github.com/kubernetes/contrib/pull/2426
Once image is moved, it will be maintained by Stackdriver team (@igorpeshansky, @qingling128 and @dhrupadb)
CC @ixdy @timstclair
Automatic merge from submit-queue
Remove the kube-discovery binary from the tree
**What this PR does / why we need it**:
kube-discovery was a temporary solution to implementing proposal: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/bootstrap-discovery.md
However, this functionality is now gonna be implemented in the core for v1.6 and will fully replace kube-discovery:
- https://github.com/kubernetes/kubernetes/pull/36101
- https://github.com/kubernetes/kubernetes/pull/41281
- https://github.com/kubernetes/kubernetes/pull/41417
So due to that `kube-discovery` isn't used in any v1.6 code, it should be removed.
The image `gcr.io/google_containers/kube-discovery-${ARCH}:1.0` should and will continue to exist so kubeadm <= v1.5 continues to work.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Remove cmd/kube-discovery from the tree since it's not necessary anymore
```
@jbeda @dgoodwin @mikedanese @dmmcquay @lukemarsden @errordeveloper @pires
Local cluster creation using get-kube-local.sh is never finished.
The get-kube-local.sh monitors running_count of pods such as etcd,
master and kube-proxy, but these pods are created under the namespace
kube-system. Therefore kubectl can't find these pods then cluster
creation isn't completed.
The get-kube-local.sh should monitor created pods with option
"--namespace=kube-system".
Fix#42517
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
Juju: Disable anonymous auth on kubelet
**What this PR does / why we need it**:
This disables anonymous authentication on kubelet when deployed via Juju.
I've also adjusted a few other TLS options for kubelet and kube-apiserver. The end result is that:
1. kube-apiserver can now authenticate with kubelet
2. kube-apiserver now verifies the integrity of kubelet
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/219
**Special notes for your reviewer**:
This is dependent on PR #41251, where the tactics changes are being merged in separately.
Some useful pages from the documentation:
* [apiserver -> kubelet](https://kubernetes.io/docs/admin/master-node-communication/#apiserver---kubelet)
* [Kubelet authentication/authorization](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/)
**Release note**:
```release-note
Juju: Disable anonymous auth on kubelet
```
Automatic merge from submit-queue (batch tested with PRs 41306, 42187, 41666, 42275, 42266)
remove support for debian masters in GCE
Asked about this on the mailing list and no one objects.
@zmerlynn @roberthbailey
```release-note
Remove support for debian masters in GCE kube-up.
```
Automatic merge from submit-queue (batch tested with PRs 41672, 42084, 42233, 42165, 42273)
remove azure getting kube-ups.
Haven't been touched in > 7 months.
@colemickens , i"m going to send out an email about this.
```release-note
Remove Azure kube-up as the Azure community has focused efforts elsewhere.
```
Automatic merge from submit-queue (batch tested with PRs 42126, 42130, 42232, 42245, 41932)
Update fluentd-gcp configuration for hosted masters
This PR makes use of the new fluentd-gcp image, which is not configured per se, for the hosted masters, which cannot use configmaps.
Mirroring https://github.com/kubernetes/kubernetes/pull/42126
Automatic merge from submit-queue
Move fluentd DS config to configmap
This is the logical continuation of https://github.com/kubernetes/kubernetes/pull/41998. This PR makes fluentd-gcp DaemonSet use the new image configured using ConfigMap.
This PR doesn't change the way fluentd-gcp works in case master is not registered, that'll be fixed in a separate PR
CC @ixdy @timstclair @igorpeshansky @qingling128 @dhrupadb
**Release note:**
```release-note
Fluentd-gcp containers spawned by DaemonSet are now configured using ConfigMap
```
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Adding legacy ABAC for 1.6
This is a fork of a previous [pull request](https://github.com/kubernetes/kubernetes/pull/42014) to include feedback as the original author is unavailable.
Adds a mechanism to optionally enable legacy abac for 1.6 to provide a migration path for existing users.
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Admission Controller: Add Pod Preset
Based off the proposal in https://github.com/kubernetes/community/pull/254
cc @pmorie @pwittrock
TODO:
- [ ] tests
**What this PR does / why we need it**: Implements the Pod Injection Policy admission controller
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Added new Api `PodPreset` to enable defining cross-cutting injection of Volumes and Environment into Pods.
```
Currently, in containerized mounter rootfs, there is no DNS setup. By
bind mount the host's /etc/resolv.conf to mounter rootfs, vm hosts name
could be resolved when using host name during mount.
Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212)
Update defaultbackend image to 1.3
Update `gcr.io/google-containers/defaultbackend` to the latest version.
See https://github.com/kubernetes/contrib/pull/2386
/cc @ixdy
export functions from pkg/api/validation
add settings API
add settings to pkg/registry
add settings api to pkg/master/master.go
add admission control plugin for pod preset
add new admission control plugin to kube-apiserver
add settings to import_known_versions.go
add settings to codegen
add validation tests
add settings to client generation
add protobufs generation for settings api
update linted packages
add settings to testapi
add settings install to clientset
add start of e2e
add pod preset plugin to config-test.sh
Signed-off-by: Jess Frazelle <acidburn@google.com>
Automatic merge from submit-queue (batch tested with PRs 38676, 41765, 42103, 41833, 41702)
Update addons yaml files for converting tolerations to api fields.
Automatic merge from submit-queue (batch tested with PRs 42162, 41973, 42015, 42115, 41923)
add nrpe-external-master relation to kubernetes-master and kubernetes-worker
**What this PR does / why we need it**:
This PR adds an an nrpe-external-master relation to the kubernetes-worker, kubernetes-master and kubeapi-load-balancer charms. This is needed to monitor the state of the workers, the masters and the load-balancers via Nagios.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/165
**Special notes for your reviewer**:
Original work by @axinojolais in PR #40897. All I've done is squash commits on his behalf.
**Release note**:
```release-note
The kubernetes-master, kubernetes-worker and kubeapi-load-balancer charms have gained an nrpe-external-master relation, allowing the integration of their monitoring in an external Nagios server.
```
Automatic merge from submit-queue
remove trusty GCE kube-up.sh
Asked on the mailing list. No one objected. Lot's of people were in favor.
cc @roberthbailey
```release-note
Remove support for trusty in GCE kube-up.
```
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)
Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Automatic merge from submit-queue (batch tested with PRs 40746, 41699, 42108, 42174, 42093)
Avoid fake node names in user info
Node usernames should follow the format `system:node:<node-name>`,
but if we don't know the node name, it's worse to put a fake one in.
In the future, we plan to have a dedicated node authorizer, which would
start rejecting requests from a user with a bogus node name like this.
The right approach is to either mint correct credentials per node, or use node bootstrapping so it requests a correct client certificate itself.
Automatic merge from submit-queue (batch tested with PRs 41937, 41151, 42092, 40269, 42135)
GCE will properly regenerate basic_auth.csv on kube-apiserver start.
**What this PR does / why we need it**:
If basic_auth.csv does not exist we will generate it as normal.
If basic_auth.csv exists we will remove the old admin password before adding the "new" one. (Turns in to a no-op if the password exists).
This did not work properly before because we were replacing by key, where the key was the password. New password would not match and so not replace the old password.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#41935
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Updates the dnsmasq cache/mux layer to be managed by dnsmasq-nanny.
dnsmasq-nanny manages dnsmasq based on values from the
kube-system:kube-dns configmap:
"stubDomains": {
"acme.local": ["1.2.3.4"]
},
is a map of domain to list of nameservers for the domain. This is used
to inject private DNS domains into the kube-dns namespace. In the above
example, any DNS requests for *.acme.local will be served by the
nameserver 1.2.3.4.
"upstreamNameservers": ["8.8.8.8", "8.8.4.4"]
is a list of upstreamNameservers to use, overriding the configuration
specified in /etc/resolv.conf.
Automatic merge from submit-queue (batch tested with PRs 42058, 41160, 42065, 42076, 39338)
Bump up dns-horizontal-autoscaler to 1.1.1
cluster-proportional-autoscaler 1.1.1 is releasing by kubernetes-incubator/cluster-proportional-autoscaler#26, also bump it up for dns-horizontal-autoscaler to introduce below features:
- Add PreventSinglePointFailure option in linear mode.
- Use protobufs for communication with apiserver.
- Support switching control mode on-the-fly.
Note:
The new entry `"preventSinglePointFailure":true` ensures kube-dns to have at least 2 replicas when there is more than one node. Mitigate the issue mentioned in #40063.
@bowei @thockin
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42058, 41160, 42065, 42076, 39338)
Juju: Fix shebangs in charm actions to use python3
**What this PR does / why we need it**:
This fixes the microbot and create-rbd-pv actions by reverting them back to python3. We accidentally switched them to python2 to match the boilerplate checker's expectations for python files.
It looks like hack/verify-boilerplate.sh does not check these since they don't have the .py extension, so we should be good with no changes there.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/212
**Special notes for your reviewer**:
**Release note**:
```release-note
Juju: Fix shebangs in charm actions to use python3
```
Automatic merge from submit-queue
Use docker log rotation mechanism instead of logrotate
This is a solution for https://github.com/kubernetes/kubernetes/issues/38495.
Instead of rotating logs using logrotate tool, which is configured quite rigidly, this PR makes docker responsible for the rotation and makes it possible to configure docker logging parameters. It solves the following problems:
* Logging agent will stop loosing lines upon rotation
* Container's logs size will be more strictly constrained. Instead of checking the size hourly, size will be checked upon write, preventing https://github.com/kubernetes/kubernetes/issues/27754
It's still far from ideal, for example setting logging options per pod, as suggested in https://github.com/kubernetes/kubernetes/issues/15478 would be much more flexible, but latter approach requires deep changes, including changes in API, which may be in vain because of CRI and long-term vision for logging.
Changes include:
* Change in salt. It's possible to configure docker log parameters, using variables in pillar. They're exported from env variables on `gce`, but for different cloud provider they have to be exported first.
* Change in `configure-helper.sh` scripts for those os on `gce` that don't use salt + default values exposed via env variables
This change may be problematic for kubelet logs functionality with CRI enabled, that will be tackled in the follow-up PR, if confirmed.
CC @piosz @Random-Liu @yujuhong @dashpole @dchen1107 @vishh @kubernetes/sig-node-pr-reviews
```release-note
On GCI by default logrotate is disabled for application containers in favor of rotation mechanism provided by docker logging driver.
```
Automatic merge from submit-queue
Cleanup fluentd-gcp image, rebase on debian-base
**Why we need this PR**:
There are several problems with our current fluentd-gcp image:
- It pulls in lots of unused packages, which expose unnecessary risk and create noise in CVE scans (and scare customers). The most notable example is the fluent-ui, which pulls in rails.
- `curl | sh ` is not a good practice for a Dockerfile. First, the script is not checked in the same source control branch, so builds are not reproducible. Second, the actions it is taking are opaque. Third, in this case, using non-standard packages means they're harder to manage with CVE scans & upstream fixes.
**What is changed by this PR?**
- Rather than relying on td-agent (which includes fluent-ui), use standard upstream packages. This is largely based off the [official fluentd debian-based image](https://github.com/fluent/fluentd-docker-image/blob/master/v0.12/debian/Dockerfile).
- Rebases the image on debian-base (depends on https://github.com/kubernetes/kubernetes/pull/41915). We would like to move towards a single full-distro base image we can maintain. This change should be relatively minor.
As a result of these changes, the image size is reduced from 360.6 MB to 185.8 MB (nearly half). Many packages were removed, and the full diff (focus on the unversioned files) is listed here: 3fb704f977
**Which issue this PR fixes** https://github.com/kubernetes/kubernetes/issues/40248
**Special notes for your reviewer**:
This change both addresses security concerns, and is expected to greatly reduce the maintenance burden of the fluentd-gcp image. I'd *really* like to get this into 1.6, so please prioritize this review if possible.
I tested this by running the default e2e suite on a private e2e cluster using the new image. If there are other tests you'd like me to run, please let me know ASAP.
**Release note**:
```release-note
Cleanup fluentd-gcp image: rebase on debian-base, switch to upstream packages, remove fluent-ui & rails
```
Automatic merge from submit-queue (batch tested with PRs 35408, 41915, 41992, 41964, 41925)
bump version numbers for heapster/influxdb/grafana images
**What this PR does / why we need it**:
It updates version of monitoring components (heapster/influxdb/grafana) to the latest one used by heapster
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
[e2e/monitoring.go](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/monitoring.go) test seems to be passing without modifications
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)
enable DefaultTolerationSeconds admission controller by default
**What this PR does / why we need it**:
Continuation of PR #41414, enable DefaultTolerationSeconds admission controller by default.
**Which issue this PR fixes**:
fixes: #41860
related Issue: #1574, #25320
related PRs: #34825, #41133, #41414
**Special notes for your reviewer**:
**Release note**:
```release-note
enable DefaultTolerationSeconds admission controller by default
```
If the file does not exist we will generate it as normal.
If the file exists we will remove the old admin password before adding
the "new" one. (Turns in to a no-op if the password exists).
This did not work properly before because we were replacing by key,
where the key was the password. New password would not match and so
not replace the old password.
Added a METADATA_CLOBBERS_CONFIG flag
METADATA_CLOBBERS_CONFIG controls if we consider the values on disk or in
metadata to be the canonical source of truth. Currently defaulting to
false for GCE and forcing to true for GKE.
Added handling for older forms of the basic_auth.csv file.
Fixed comment to reflect new METADATA_CLOBBERS_CONFIG var.
Automatic merge from submit-queue
Protect kubeproxy deployed via kube-up from system OOMs
This change is necessary until it can be moved to Guaranteed QoS Class.
For #40573
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)
Bump gcr.io/google-containers/rescheduler to v0.2.2
**What this PR does / why we need it**: updates the rescheduler image to one based on busybox instead of ubuntu-slim. Changes for the image were in https://github.com/kubernetes/contrib/pull/2390.
Do you think this merits a release note? I'm leaning towards no.
**Release note**:
```release-note
Update gcr.io/google-containers/rescheduler to v0.2.2, which uses busybox as a base image instead of ubuntu.
```
cc @timstclair
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)
Default storage class for vSphere Fixes#40070
**What this PR does / why we need it**:
Create default storage class for vSphere. This is part of the storage class GA effort https://github.com/kubernetes/features/issues/36
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes#40070
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40665, 41094, 41351, 41721, 41843)
Multi master patch
**What this PR does / why we need it**: Corrects a sync files issue present when running in a HA Master configuration. This PR adds logic to syncronize on first deployment for `/etc/kubernetes/serviceaccount.key` which will cause cypto verification failure if not 1:1 on each master unit. Additionally syncs basic_auth and additional files in /srv/kubernetes.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#41019
**Special notes for your reviewer**: This requires PR #41251 as a dependency before merging.
**Release note**:
```release-note
Juju - K8s master charm now properly keeps distributed master files in sync for an HA control plane.
```
Automatic merge from submit-queue
Supports 'ensure exist' class addon in Addon-manager
Fixes#39561, fixes#37047 and fixes#36411. Depends on #40057.
This PR splits cluster addons into two categories:
- Reconcile: Addons that need to be reconciled (`kube-dns` for instance).
- EnsureExists: Addons that need to be exist but changeable (`default-storage-class`).
The behavior for the 'EnsureExists' class addon would be:
- Create it if not exist.
- Users could do any modification they want, addon-manager will not reconcile it.
- If it is deleted, addon-manager will recreate it with the given template.
- It will not be updated/clobbered during upgrade.
As Brian pointed out in [#37048/comment](https://github.com/kubernetes/kubernetes/issues/37048#issuecomment-272510835), this may not be the best solution for addon-manager. Though #39561 needs to be fixed in 1.6 and we might not have enough bandwidth to do a big surgery.
@mikedanese @thockin
cc @kubernetes/sig-cluster-lifecycle-misc
---
Tasks for this PR:
- [x] Supports 'ensure exist' class addon and switch to use new labels in addon-manager.
- [x] Updates READMEs regarding the new behavior of addon-manager.
- [x] Updated `test/e2e/addon_update.go` to match the new behavior.
- [x] Go through all current addons and apply the new labels on them regarding what they need.
- [x] Bump addon-manager and update its template files.
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Automatic merge from submit-queue
Remove ivan4th from reviewers
**What this PR does / why we need it**:
Per @ivan4th request in #41351 he would like to be removed from the
reviewers list in this directory tree. This commit addresses that
request.
**Special notes for your reviewer**:
As Ivan has already investigated the PR in question under 41351 I would like to see that driven to landing before landing this OWNERS file change, unless another reviewer would like to step in and help land that open PR.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Base etcd-empty-dir-cleanup on busybox, run as nobody, and update to etcdctl 3.0.14
**What this PR does / why we need it**: since the `etcd-empty-dir-cleanup` image just uses a simple shell script and `etcdctl`, we can base it on busybox, which is a smaller target than alpine.
I've also updated this to use an `etcdctl` from etcd 3.0.14, which matches the version of etcd we're running in 1.6 clusters (I believe), and changed the tag to match the `etcdctl` version.
Tested in my own e2e cluster, where it seems to work.
I haven't pushed the image yet, so e2e tests *may* fail. Tagging `do-not-merge`; if you think this looks good, I'll push the image and retest.
**Release note**:
```release-note
```
cc @timstclair @mml @wojtek-t
Automatic merge from submit-queue
move kube-dns to a separate service account
Switches the kubedns addon to run as a separate service account so that we can subdivide RBAC permission for it. The RBAC permissions will need a little more refinement which I'm expecting to find in https://github.com/kubernetes/kubernetes/pull/38626 .
@cjcullen @kubernetes/sig-auth since this is directly related to enabling RBAC with subdivided permissions
@thockin @kubernetes/sig-network since this directly affects now kubedns is added.
```release-note
`kube-dns` now runs using a separate `system:serviceaccount:kube-system:kube-dns` service account which is automatically bound to the correct RBAC permissions.
```
Automatic merge from submit-queue (batch tested with PRs 39855, 41433, 41567, 41887, 41652)
Add fluentd monitoring to fluentd-gcp image
Right now we are not able to monitor the state of fluentd in cluster, which may result in logging subsystem quietly failing. This PR tries to address that problem by introducing the fluentd container monitoring:
* fluentd internal metrics, like number of buffers and number of data in buffers
* `logging_line_count`, number of lines, read by fluentd from application containers' logs
* Has `tag` label, corresponding to the fluentd tag of the entry
* `logging_entry_count`, number of entries, emitted to the output plugin
* With label `component` set to `container`, generated by application containers
* With label `component` set to `system`, generated by system components like kubelet, docker, scheduler, etc.
* Has `tag` label, corresponding to the fluentd tag of the entry
CC @fabxc @igorpeshansky @edsiper
Automatic merge from submit-queue (batch tested with PRs 41812, 41665, 40007, 41281, 41771)
Bump golang versions to 1.7.5
**What this PR does / why we need it**: While #41636 might not make it in until 1.7, this would bump current golang versions from 1.7.4 to 1.7.5 to integrate the fixes from that patch version. This would include, among other things, a fix to ensure cross-built binaries for darwin don't have certificate validation errors (golang/go#18688)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: none
**Special notes for your reviewer**:
**Release note**:
```release-note
Upgrade golang versions to 1.7.5
```
Automatic merge from submit-queue (batch tested with PRs 41797, 41793, 41795, 41807, 41781)
Remove unnecessary metrics (http/process/go) from being exposed by etcd-version-monitor
Unregister metrics we do not want from the etcd version metrics handler.
cc @wojtek-t @piosz
Automatic merge from submit-queue (batch tested with PRs 41797, 41793, 41795, 41807, 41781)
Turn fluentd supervisor off for fluentd-gcp
By default, turn fluentd supervisor off so that when fluentd process fails, for example due to OOM, container fails completely and it would be easy to detect.
CC @igorpeshansky @qingling128
This branch sync's the crypto keys, and flat-files used for auth with
all the masters when scaling the kubernetes-master units. This should
fix the mis-matched crypto keys seen when rebooting units after first
deploy.
Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657)
Update kubectl in addon-manager to use HPA in autoscaling/v1
Addon-manager is broken since HPA objects were removed from extensions api group.
Came across the logs from [the latest addon-manager on Jenkins](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/4290/artifacts/bootstrap-e2e-master/kube-addon-manager.log):
```
INFO: == Entering periodical apply loop at 2017-02-16T17:33:37+0000 ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:38+0000. 2 tries remaining. ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:46+0000. 1 tries remaining. ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:53+0000. 0 tries remaining. ==
WRN: == Kubernetes addon update completed with errors at 2017-02-16T17:33:58+0000 ==
```
And notice this commit (f66679a4e9) came in two weeks ago, which removed HorizontalPodAutoscaler from extensions/v1beta1.
Addon-manager is now partially functioning that it could successfully create and update addons, but will fail to prune objects, which means upgrade tests may mostly fail.
Pushed another version of addon-manager with kubectl v1.6.0-alpha.2 ([release 2 days ago](https://github.com/kubernetes/kubernetes/releases/tag/v1.6.0-alpha.2)) for fixing, including below images:
- gcr.io/google-containers/kube-addon-manager:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-amd64:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-arm:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-arm64:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-ppc64le:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-s390x:v6.4-alpha.2
@mikedanese
cc @wojtek-t @shyamjvs
Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657)
Lint fixes for the master and worker Python code.
**What this PR does / why we need it**: lint fixes for the python code.
**Which issue this PR fixes** none
**Special notes for your reviewer**: This is lint fixes for the Juju python code.
**Release note**:
```release-note
NONE
```
Please consider these changes so we can pass flake8 lint tests in our build process.
Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782)
Add ability to enable cache mutation detector in GCE
Add the ability to enable the cache mutation detector in GCE. The current default behavior (disabled) is retained.
When paired with https://github.com/kubernetes/test-infra/pull/1901, we'll be able to detect shared informer cache mutations in gce e2e PR jobs.
Automatic merge from submit-queue (batch tested with PRs 41706, 39063, 41330, 41739, 41576)
[Kubemark] Add option to log hollow-node logs
Ref https://github.com/kubernetes/kubernetes/issues/41613
Added an option to log kubemark hollow-node logs which includes kubelet, kubeproxy and npd logs for each hollow-node.
Setting the env var `ENABLE_HOLLOW_NODE_LOGS=true` should now enable logging for tests.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek @yujuhong @Random-Liu
Automatic merge from submit-queue
Add standalone npd on GCI.
This PR added standalone NPD in GCE GCI cluster. I already verified the PR, and it should work.
/cc @dchen1107 @fabioy @andyxning @kubernetes/sig-node-misc
Automatic merge from submit-queue
Fix the output of health-mointor.sh
The script show prints the errors/response of the health check, but not
show the progress of `curl`.
Automatic merge from submit-queue
Added a basic monitor for providing etcd version related info
Fixes#41071
This tool scrapes metrics partly from etcd's /version and /metrics endpoints and partly using etcdctl and exposes them as prometheus metrics at `http://localhost:9101/metrics` endpoint on the master. Here is a summary of the metrics it exposes (self-explanatory from the code):
- etcdVersionFetchCount = prometheus.NewCounterVec(
prometheus.CounterOpts{
Namespace: "etcd",
Name: "version_info_fetch_count",
Help: "Number of times etcd's version info was fetched, labeled by etcd's server binary and cluster version",
},
[]string{"serverversion", "clusterversion"})
- etcdGRPCRequestsTotal = prometheus.NewCounterVec(
prometheus.CounterOpts{
Namespace: namespace,
Name: "grpc_requests_total",
Help: "Counter of received grpc requests, labeled by grpc method and grpc service names",
},
[]string{"grpc_method", "grpc_service"})
For further info on how to run this as a binary/docker-container/kubernetes-pod and checking the metrics, have a look at the README.md file.
cc @fgrzadkowski @wojtek-t @piosz
Allow cache mutation detector enablement by PRs in an attempt to find
mutations before they're merged in to the code base. It's just for the
apiserver and controller-manager for now. If/when the other components
start using a SharedInformerFactory, we should set them up just like
this as well.
Automatic merge from submit-queue
Reduce default value of kubemark's NUM_NODES to 10
Changing the default value of kubemark's NUM_NODES from 100 to 10, as it would then be possible to start kubemark on gce clusters that have been started using kube-up that uses the default config of three n1-standard-2 nodes. I've already been asked by a couple of people about why kubemark is not starting on their cluster because of this. More people shouldn't be facing this issue in future.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Bump fluentd-gcp google_cloud plugin version
Bump the version of `fluent-plugin-google-cloud` in fluentd-gcp image, because it's broken for version `0.5.2`.
Recently, gem `google-api-client` was updated to version `0.10.0`. The new version broke `fluent-plugin-google-cloud` which doesn't specify the upper version of `google-api-client` gem. I'm bumping the version used in our image to allow future changes in this release to be run and tested.
This PR doesn't bump the version, since no effective changes has happened, leaving this for the next PR to do.
CC @igorpeshansky
Automatic merge from submit-queue (batch tested with PRs 40000, 41508, 41489)
Add toleration to fluentd daemonset to make it run on master
Because of https://github.com/kubernetes/kubernetes/pull/41172 fluentd pods stopped being allocated on master node.
This PR introduces toleration for master taint for fluentd.
CC @davidopp @janetkuo @kubernetes/sig-scheduling-bugs
Unfortunately, we don't have e2e tests to ensure that master logs are being ingested. This problem is a great signal to work on https://github.com/kubernetes/kubernetes/issues/41411
Automatic merge from submit-queue (batch tested with PRs 40000, 41508, 41489)
Make fluentd use default dns instead of cluster dns to make it work o…
Fix https://github.com/kubernetes/kubernetes/issues/41415
Fluentd for Stackdriver requires external urls (e.g. `logging.googleapis.com`) to be available in order to work. If fluentd runs on master, it cannot access the service endpoint of cluster DNS. This change makes fluentd use default dns to fix this problem.
CC @thockin @bowei
Automatic merge from submit-queue (batch tested with PRs 41104, 41245, 40722, 41439, 41502)
openstack-heat: do not daemonize salt-minion
_openstack-heat_ does currently not setup a _salt-master_, so it is not necessary to daemonize it.
**What this PR does / why we need it**:
as stated in #40721:
> The _openstack-heat_ provider only installs _salt-minions_, no _salt-master_. The configuration does not take this into account which causes the following issues:
>
> - the _salt minion_ is not able to DNS resolve `salt` (see fist part of error log below)
> - the _salt-minion_ is daemonized and fails finding the master (second part of error log below). From my understanding is not required when there is no salt-master, as the setup uses `salt-call`
> anyway (see [gce provider](https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/configure-vm.sh#L328-L339) as reference).
>
> ```
> Jan 31 03:00:04 kube-stack-master salt-minion[9795]: [ERROR ] DNS lookup of 'salt' failed.
> Jan 31 03:00:04 kube-stack-master salt-minion[9795]: [ERROR ] Master hostname: 'salt' not found. Retrying in 30 seconds
> ...
> Jan 31 02:35:30 kube-stack-master salt-minion[9690]: [ERROR ] Error while bringing up minion for multi-master. Is master at salt responding?
> ```
>
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#40721
**Release note**:
```release-note
Do not daemonize `salt-minion` for the openstack-heat provider.
```
Automatic merge from submit-queue (batch tested with PRs 41104, 41245, 40722, 41439, 41502)
Change the etcd rollback tool to do rollback to 2.2.1 version.
I did some tests of it and for my 3-node cluster with 1 deployment it worked fine.
But before merging this, we should probably do way more testing (we should rerun tests that @mml was doing for the previous script).
@lavalamp @xiang90
Automatic merge from submit-queue
Added configurable etcd initial-cluster-state to kube-up script.
Added configurable etcd initial-cluster-state to kube-up script. This
allows creation of multi-master cluster from scratch. This is a
cherry-pick of #41320 from 1.5 branch.
```release-note
Added configurable etcd initial-cluster-state to kube-up script.
```
Automatic merge from submit-queue (batch tested with PRs 41134, 41410, 40177, 41049, 41313)
Refactored kubemark code into provider-specific and provider-independent parts [Part-3]
Fixes#38967
Applying final part of the changes in PR #39033 (which refactored kubemark code completely). The changes included in this PR are:
- Removed `test/kubemark/common.sh` and moved relevant parts of its code to the right places in start-kubemark/stop-kubemark scripts.
- Added DOCKER_REGISTRY, PROJECT, KUBEMARK_IMAGE_MAKE_TARGET variables to `/test/kubemark/cloud-provider-config.sh` to make the kubemark image push location variable wrt provider.
- Removed get-real-pod-for-hollow-node.sh as it doesn't seem to do anything useful.
@kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
fluentd-gcp: Add kube-apiserver-audit.log.
**What this PR does / why we need it**:
Add `kube-apiserver-audit.log` from https://github.com/kubernetes/kubernetes/pull/41211 to fluentd config, so the audit log gets sent to the same place as `kube-apiserver.log`.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
We would like to backport this to release-1.5 also.
**Release note**:
```release-note
The apiserver audit log (`/var/log/kube-apiserver-audit.log`) will be sent through fluentd if enabled.
```
Automatic merge from submit-queue (batch tested with PRs 41196, 41252, 41300, 39179, 41449)
Bump GCE ContainerVM to container-vm-v20170214
`container-vm-v20170214` is a re-build of the `docker-runc` in `container-vm-v20170201`, and should clear the GCE slow tests.
c.f. #40828
```release-note
Bump GCE ContainerVM to container-vm-v20170214 to address CVE-2016-9962.
```
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)
cluster/gce: Add env var to enable apiserver basic audit log.
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled the same as for `kube-apiserver.log`.
**What this PR does / why we need it**:
Add a knob to enable [basic audit logging](https://kubernetes.io/docs/admin/audit/) in GCE.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
We would like to cherrypick/port this to release-1.5 also.
**Release note**:
```release-note
The kube-apiserver [basic audit log](https://kubernetes.io/docs/admin/audit/) can be enabled in GCE by exporting the environment variable `ENABLE_APISERVER_BASIC_AUDIT=true` before running `cluster/kube-up.sh`. This will log to `/var/log/kube-apiserver-audit.log` and use the same `logrotate` settings as `/var/log/kube-apiserver.log`.
```
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)
Secure kube-scheduler
This PR:
* Adds a bootstrap `system:kube-scheduler` clusterrole
* Adds a bootstrap clusterrolebinding to the `system:kube-scheduler` user
* Sets up a kubeconfig for kube-scheduler on GCE (following the controller-manager pattern)
* Switches kube-scheduler to running with kubeconfig against secured port (salt changes, beware)
* Removes superuser permissions from kube-scheduler in local-up-cluster.sh
* Adds detailed RBAC deny logging
```release-note
On kube-up.sh clusters on GCE, kube-scheduler now contacts the API on the secured port.
```
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled externally by the wildcard /var/log/*.log
already configured in configure-helper.sh.
Automatic merge from submit-queue (batch tested with PRs 41299, 41325, 41386, 41329, 41418)
Migrate etcd data using correct etcd version in case of previous crash
Fix#41324Fix#41323
@mml