Automatic merge from submit-queue
Add standalone npd on GCI.
This PR adds standalone NPD (node-problem-detector) to GCE GCI clusters. I have verified the PR, and it should work.
/cc @dchen1107 @fabioy @andyxning @kubernetes/sig-node-misc
Automatic merge from submit-queue
Fix the output of health-monitor.sh
The script should print the errors/response of the health check, but not
show the progress of `curl`.
Automatic merge from submit-queue
Added a basic monitor for providing etcd version related info
Fixes #41071
This tool scrapes metrics partly from etcd's /version and /metrics endpoints and partly using etcdctl and exposes them as prometheus metrics at `http://localhost:9101/metrics` endpoint on the master. Here is a summary of the metrics it exposes (self-explanatory from the code):
- etcdVersionFetchCount = prometheus.NewCounterVec(
      prometheus.CounterOpts{
          Namespace: "etcd",
          Name:      "version_info_fetch_count",
          Help:      "Number of times etcd's version info was fetched, labeled by etcd's server binary and cluster version",
      },
      []string{"serverversion", "clusterversion"})
- etcdGRPCRequestsTotal = prometheus.NewCounterVec(
      prometheus.CounterOpts{
          Namespace: namespace,
          Name:      "grpc_requests_total",
          Help:      "Counter of received grpc requests, labeled by grpc method and grpc service names",
      },
      []string{"grpc_method", "grpc_service"})
For further info on how to run this as a binary/docker-container/kubernetes-pod and checking the metrics, have a look at the README.md file.
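As a rough illustration of checking these metrics, the sketch below parses Prometheus text-format output of the kind the `/metrics` endpoint would be expected to serve. The sample payload, its label values, and the helper name `parse_metrics` are assumptions made up for this example; only the metric name (namespace `etcd` joined with `version_info_fetch_count`) and the label names come from the definitions above.

```python
import re

# Hypothetical sample of what the endpoint might return; the label values
# and the count are invented for illustration.
SAMPLE = """\
# HELP etcd_version_info_fetch_count Number of times etcd's version info was fetched
# TYPE etcd_version_info_fetch_count counter
etcd_version_info_fetch_count{clusterversion="3.0.0",serverversion="3.0.17"} 12
"""

# One sample line: metric name, optional {labels}, then a value.
# Lines carrying an optional trailing timestamp are not handled by this sketch.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Against a running monitor, the same function could be fed the body fetched from `http://localhost:9101/metrics` instead of `SAMPLE`.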
cc @fgrzadkowski @wojtek-t @piosz
Allow cache mutation detector enablement by PRs in an attempt to find
mutations before they're merged in to the code base. It's just for the
apiserver and controller-manager for now. If/when the other components
start using a SharedInformerFactory, we should set them up just like
this as well.
Automatic merge from submit-queue
Reduce default value of kubemark's NUM_NODES to 10
Change the default value of kubemark's NUM_NODES from 100 to 10, so that kubemark can start on GCE clusters created by kube-up with the default config of three n1-standard-2 nodes. A couple of people have already asked me why kubemark does not start on their cluster because of this; with this change, others should not hit the same issue.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Bump fluentd-gcp google_cloud plugin version
Bump the version of `fluent-plugin-google-cloud` in fluentd-gcp image, because it's broken for version `0.5.2`.
Recently, gem `google-api-client` was updated to version `0.10.0`. The new version broke `fluent-plugin-google-cloud` which doesn't specify the upper version of `google-api-client` gem. I'm bumping the version used in our image to allow future changes in this release to be run and tested.
This PR doesn't bump the image version, since no effective changes have happened; that is left for the next PR.
CC @igorpeshansky
Automatic merge from submit-queue (batch tested with PRs 40000, 41508, 41489)
Add toleration to fluentd daemonset to make it run on master
Because of https://github.com/kubernetes/kubernetes/pull/41172 fluentd pods stopped being allocated on master node.
This PR introduces toleration for master taint for fluentd.
CC @davidopp @janetkuo @kubernetes/sig-scheduling-bugs
Unfortunately, we don't have e2e tests to ensure that master logs are being ingested. This problem is a great signal to work on https://github.com/kubernetes/kubernetes/issues/41411
Automatic merge from submit-queue (batch tested with PRs 40000, 41508, 41489)
Make fluentd use default dns instead of cluster dns to make it work o…
Fix https://github.com/kubernetes/kubernetes/issues/41415
Fluentd for Stackdriver requires external urls (e.g. `logging.googleapis.com`) to be available in order to work. If fluentd runs on master, it cannot access the service endpoint of cluster DNS. This change makes fluentd use default dns to fix this problem.
CC @thockin @bowei
Automatic merge from submit-queue (batch tested with PRs 41104, 41245, 40722, 41439, 41502)
openstack-heat: do not daemonize salt-minion
_openstack-heat_ currently does not set up a _salt-master_, so it is not necessary to daemonize the _salt-minion_.
**What this PR does / why we need it**:
as stated in #40721:
> The _openstack-heat_ provider only installs _salt-minions_, no _salt-master_. The configuration does not take this into account which causes the following issues:
>
> - the _salt-minion_ is not able to DNS resolve `salt` (see first part of error log below)
> - the _salt-minion_ is daemonized and fails to find the master (second part of error log below). From my understanding, daemonizing is not required when there is no salt-master, as the setup uses `salt-call` anyway (see [gce provider](https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/configure-vm.sh#L328-L339) as reference).
>
> ```
> Jan 31 03:00:04 kube-stack-master salt-minion[9795]: [ERROR ] DNS lookup of 'salt' failed.
> Jan 31 03:00:04 kube-stack-master salt-minion[9795]: [ERROR ] Master hostname: 'salt' not found. Retrying in 30 seconds
> ...
> Jan 31 02:35:30 kube-stack-master salt-minion[9690]: [ERROR ] Error while bringing up minion for multi-master. Is master at salt responding?
> ```
>
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #40721
**Release note**:
```release-note
Do not daemonize `salt-minion` for the openstack-heat provider.
```
Automatic merge from submit-queue (batch tested with PRs 41104, 41245, 40722, 41439, 41502)
Change the etcd rollback tool to do rollback to 2.2.1 version.
I did some tests of it and for my 3-node cluster with 1 deployment it worked fine.
But before merging this, we should probably do way more testing (we should rerun tests that @mml was doing for the previous script).
@lavalamp @xiang90
Automatic merge from submit-queue
Added configurable etcd initial-cluster-state to kube-up script.
Added configurable etcd initial-cluster-state to kube-up script. This
allows creation of multi-master cluster from scratch. This is a
cherry-pick of #41320 from 1.5 branch.
```release-note
Added configurable etcd initial-cluster-state to kube-up script.
```
Automatic merge from submit-queue (batch tested with PRs 41134, 41410, 40177, 41049, 41313)
Refactored kubemark code into provider-specific and provider-independent parts [Part-3]
Fixes #38967
Applying final part of the changes in PR #39033 (which refactored kubemark code completely). The changes included in this PR are:
- Removed `test/kubemark/common.sh` and moved relevant parts of its code to the right places in start-kubemark/stop-kubemark scripts.
- Added DOCKER_REGISTRY, PROJECT, KUBEMARK_IMAGE_MAKE_TARGET variables to `/test/kubemark/cloud-provider-config.sh` to make the kubemark image push location variable wrt provider.
- Removed get-real-pod-for-hollow-node.sh as it doesn't seem to do anything useful.
@kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
fluentd-gcp: Add kube-apiserver-audit.log.
**What this PR does / why we need it**:
Add `kube-apiserver-audit.log` from https://github.com/kubernetes/kubernetes/pull/41211 to fluentd config, so the audit log gets sent to the same place as `kube-apiserver.log`.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
We would like to backport this to release-1.5 also.
**Release note**:
```release-note
The apiserver audit log (`/var/log/kube-apiserver-audit.log`) will be sent through fluentd if enabled.
```
Automatic merge from submit-queue (batch tested with PRs 41196, 41252, 41300, 39179, 41449)
Bump GCE ContainerVM to container-vm-v20170214
`container-vm-v20170214` is a re-build of the `docker-runc` in `container-vm-v20170201`, and should clear the GCE slow tests.
c.f. #40828
```release-note
Bump GCE ContainerVM to container-vm-v20170214 to address CVE-2016-9962.
```
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)
cluster/gce: Add env var to enable apiserver basic audit log.
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled the same as for `kube-apiserver.log`.
**What this PR does / why we need it**:
Add a knob to enable [basic audit logging](https://kubernetes.io/docs/admin/audit/) in GCE.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
We would like to cherrypick/port this to release-1.5 also.
**Release note**:
```release-note
The kube-apiserver [basic audit log](https://kubernetes.io/docs/admin/audit/) can be enabled in GCE by exporting the environment variable `ENABLE_APISERVER_BASIC_AUDIT=true` before running `cluster/kube-up.sh`. This will log to `/var/log/kube-apiserver-audit.log` and use the same `logrotate` settings as `/var/log/kube-apiserver.log`.
```
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)
Secure kube-scheduler
This PR:
* Adds a bootstrap `system:kube-scheduler` clusterrole
* Adds a bootstrap clusterrolebinding to the `system:kube-scheduler` user
* Sets up a kubeconfig for kube-scheduler on GCE (following the controller-manager pattern)
* Switches kube-scheduler to running with kubeconfig against secured port (salt changes, beware)
* Removes superuser permissions from kube-scheduler in local-up-cluster.sh
* Adds detailed RBAC deny logging
```release-note
On kube-up.sh clusters on GCE, kube-scheduler now contacts the API on the secured port.
```
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled externally by the wildcard /var/log/*.log
already configured in configure-helper.sh.
Automatic merge from submit-queue (batch tested with PRs 41299, 41325, 41386, 41329, 41418)
Migrate etcd data using correct etcd version in case of previous crash
Fix #41324
Fix #41323
@mml
Automatic merge from submit-queue (batch tested with PRs 41357, 41178, 41280, 41184, 41278)
Switch RBAC subject apiVersion to apiGroup in v1beta1
When referencing a subject from an RBAC role binding, the API group and kind of the subject are needed to fully qualify the reference.
The version is not needed, and it adds complexity around rewriting the reference when returning the binding from different versions of the API and when reconciling subjects.
This PR:
* v1beta1: change the subject `apiVersion` field to `apiGroup` (to match roleRef)
* v1alpha1: convert apiVersion to apiGroup for backwards compatibility
* all versions: add defaulting for the three allowed subject kinds
* all versions: add validation to the field so we can count on the data in etcd being good until we decide to relax the apiGroup restriction
```release-note
RBAC `v1beta1` RoleBinding/ClusterRoleBinding subjects changed `apiVersion` to `apiGroup` to fully-qualify a subject. ServiceAccount subjects default to an apiGroup of `""`, User and Group subjects default to an apiGroup of `"rbac.authorization.k8s.io"`.
```
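The defaulting rule in the release note above can be sketched as follows. This is an illustrative Python model only, not the actual Go defaulting code; the function name `default_subject` and the dict representation of a subject are assumptions made for the example.

```python
RBAC_API_GROUP = "rbac.authorization.k8s.io"

# Defaults as stated in the release note: ServiceAccount subjects default
# to the empty (core) group, User and Group subjects to the RBAC group.
_DEFAULT_API_GROUP = {
    "ServiceAccount": "",
    "User": RBAC_API_GROUP,
    "Group": RBAC_API_GROUP,
}


def default_subject(subject):
    """Return a copy of `subject` with apiGroup defaulted from its kind.

    Hypothetical helper: an explicitly set apiGroup is left untouched, and
    kinds other than the three allowed ones are rejected.
    """
    kind = subject.get("kind")
    if kind not in _DEFAULT_API_GROUP:
        raise ValueError(f"unsupported subject kind: {kind!r}")
    result = dict(subject)
    result.setdefault("apiGroup", _DEFAULT_API_GROUP[kind])
    return result


print(default_subject({"kind": "User", "name": "alice"}))
print(default_subject({"kind": "ServiceAccount", "name": "builder", "namespace": "ci"}))
```

The subject names `alice` and `builder` are placeholders; any real binding would carry its own names and namespaces.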
@deads2k @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41182, 41290)
Add a default storage class for Azure Disk
Part of https://github.com/kubernetes/kubernetes/issues/40071
@jsafrane @colemickens @codablock @rootfs
Automatic merge from submit-queue (batch tested with PRs 38252, 41122, 36101, 41017, 41264)
Add alternative names for the server binaries to hyperkube
**What this PR does / why we need it**:
Right now one can't swap a server image for the hyperkube image without touching the `command` field in the yaml spec, which is daunting and leads to extra, unnecessary logic, for example in kubeadm.
This makes the hyperkube image directly swappable, so now `/usr/local/bin/kube-*` is a portable first argument (or simply `kube-*` if there's a shell).
**Special notes for your reviewer**:
**Release note**:
```release-note
Align the hyperkube image to support running binaries at /usr/local/bin/ like the other server images
```
@jessfraz @thockin @ixdy
Automatic merge from submit-queue (batch tested with PRs 41223, 40892, 41220, 41207, 41242)
Backup etcd only before migration
There is currently a bug that triggers a backup on every run of the script (when we are running version 2.2.1).
@mml
Automatic merge from submit-queue (batch tested with PRs 41037, 40118, 40959, 41084, 41092)
Bump up GLBC version from 0.9.0-beta to 0.9.1
Tests have been green, moving the beta to a release.
Automatic merge from submit-queue (batch tested with PRs 41121, 40048, 40502, 41136, 40759)
Remove deprecated kubelet flags that look safe to remove
Removes:
```
--config
--auth-path
--resource-container
--system-container
```
which have all been marked deprecated since at least 1.4 and look safe to remove.
```release-note
The deprecated flags --config, --auth-path, --resource-container, and --system-container were removed.
```
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)
Bump GCI to gci-beta-56-9000-80-0
cc/ @Random-Liu @adityakali
Changelogs since gci-dev-56-8977-0-0 (currently used in Kubernetes):
- "net.ipv4.conf.eth0.forwarding" and "net.ipv4.ip_forward" may get reset to 0
- Track CVE-2016-9962 in Docker in GCI
- Linux kernel CVE-2016-7097
- Linux kernel CVE-2015-8964
- Linux kernel CVE-2016-6828
- Linux kernel CVE-2016-7917
- Linux kernel CVE-2016-7042
- Linux kernel CVE-2016-9793
- Linux kernel CVE-2016-7039 and CVE-2016-8666
- Linux kernel CVE-2016-8655
- Toolbox: allow docker image to be loaded from local tarball
- Update compute-image-package in GCI
- Change the product name on /etc/os-release (to COS)
- Remove 'dogfood' from HWID_OVERRIDE in /etc/lsb-release
- Include Google NVME extensions to optimize LocalSSD performance.
- /proc/<pid>/io missing on GCI (enables process stats accounting)
- Enable BLK_DEV_THROTTLING
cc/ @roberthbailey @fabioy for GKE cluster update
Automatic merge from submit-queue (batch tested with PRs 39169, 40719, 38954, 40808, 40689)
Exports KUBE_TEMP for use in Vagrantfile
In #40147, the logic for setting `KUBE_TEMP` was refactored into `common.sh`. However, it was overlooked that `KUBE_TEMP` [needs to be exported for vagrant to work properly](https://github.com/kubernetes/kubernetes/pull/40147/files#diff-b19d3d93456020e2168c7f304f722969).
This PR restores the `export` so that `Vagrantfile` can use `ENV["KUBE_TEMP"]` properly.
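The difference that `export` makes can be reproduced with a small sketch: a shell variable that is only assigned stays local to the shell, while an exported one reaches child processes (which is how a `Vagrantfile` would see it through `ENV`). The helper name `child_sees` and the path `/tmp/kube-demo` are made up for this illustration, and the example assumes a POSIX `sh` is available.

```python
import os
import subprocess


def child_sees(assignment):
    """Run `assignment` in a shell, then report what a child process sees.

    The inner `sh -c` stands in for any child process (for example vagrant)
    that reads KUBE_TEMP from its environment.
    """
    script = assignment + "; sh -c 'echo ${KUBE_TEMP:-unset}'"
    # Start from an environment without KUBE_TEMP so the result only
    # depends on the assignment under test.
    env = {k: v for k, v in os.environ.items() if k != "KUBE_TEMP"}
    done = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True, env=env
    )
    return done.stdout.strip()


# Plain assignment: the variable stays in the parent shell only.
print(child_sees("KUBE_TEMP=/tmp/kube-demo"))
# Exported: the child process inherits it.
print(child_sees("export KUBE_TEMP=/tmp/kube-demo"))
```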
👀 @rthallisey @shyamjvs @timothysc
Automatic merge from submit-queue
Add websocket support for port forwarding
#32880
**Release note**:
```release-note
Port forwarding can forward over websockets or SPDY.
```
Automatic merge from submit-queue (batch tested with PRs 40758, 39145, 40776)
Bumps addon-manager to v6.4-alpha.1 for supporting optional ConfigMap
From #40382. Bumps up addon-manager to use v1.6.0-alpha.1 kubectl for the optional ConfigMap feature. Below images have been pushed:
- gcr.io/google-containers/kube-addon-manager:v6.4-alpha.1
- gcr.io/google-containers/kube-addon-manager-amd64:v6.4-alpha.1
- gcr.io/google-containers/kube-addon-manager-arm:v6.4-alpha.1
- gcr.io/google-containers/kube-addon-manager-arm64:v6.4-alpha.1
- gcr.io/google-containers/kube-addon-manager-ppc64le:v6.4-alpha.1
- gcr.io/google-containers/kube-addon-manager-s390x:v6.4-alpha.1
@liggitt @bowei
- split out port forwarding into its own package
Allow multiple port forwarding ports
- Make it easy to determine which port is tied to which channel
- odd channels are for data
- even channels are for errors
- allow comma separated ports to specify multiple ports
Add portforwardtester 1.2 to whitelist
Automatic merge from submit-queue (batch tested with PRs 40111, 40368, 40342, 40274, 39443)
Libvirt-coreos - Add execute permissions to kubernetes/bin
**What this PR does / why we need it**:
The master node was failing to start for me due to the permission errors on the kubernetes server binaries.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40111, 40368, 40342, 40274, 39443)
Change OPENSTACK_IMAGE_NAME to be more specific
There may already be other images in the cloud named CentOS7,
but since we are fetching a very specific version (1604) we should
go ahead and make the image name very specific as well.
**What this PR does / why we need it**:
Some clouds already have `Centos7` as an image that is available, however it may not be the *specific* version that openstack-heat looks for and downloads from CentOS.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
OpenStack-Heat will now look for an image named "CentOS-7-x86_64-GenericCloud-1604". To restore the previous behavior set OPENSTACK_IMAGE_NAME="CentOS7"
```
Automatic merge from submit-queue (batch tested with PRs 38772, 38797, 40732, 40740)
AWS: Deprecate the bash deployment
**What this PR does / why we need it**: Add a strong deprecation warning to the `kube-up.sh` AWS deployment.
**Release note**:
```release-note
The bash AWS deployment via kube-up.sh has been deprecated. See http://kubernetes.io/docs/getting-started-guides/aws/ for alternatives.
```
Automatic merge from submit-queue (batch tested with PRs 40392, 39242, 40579, 40628, 40713)
Add --force-new-cluster when running etcd for migrations.
This is required to avoid etcd trying to create quorum during
migrations.
Might fix #40110
Automatic merge from submit-queue (batch tested with PRs 40691, 40551, 40683, 40700, 40702)
Juju kubernetes-master charm: improve status messages
**What this PR does / why we need it**:
This update to the kubernetes-master charm does the following:
1. Remove "Kubernetes master services ready" status which was occurring too early
2. Add "Waiting for kube-system pods to start" status
3. Replace "Rendering the Kubernetes DNS files." status with "Deploying KubeDNS"
4. Add "Waiting to retry KubeDNS deployment" status
The purpose of this is to give better feedback to the operator during cluster deployment.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/143, which we are tracking in a separate repository
**Special notes for your reviewer**:
This is a rebase of https://github.com/juju-solutions/kubernetes/pull/103, where prior review was done, though it was targeted against a fork.
**Release note**:
```release-note
Juju kubernetes-master charm: improve status messages
```
Automatic merge from submit-queue (batch tested with PRs 40703, 40093, 40618, 40659, 39810)
Change kubemark Makefile to be provider independent
Ref issue #38967
The Kubemark Makefile is defaulted to gcr.io. Instead, make it
provider independent.
The kubemark makefile is set to push the kubemark image to the gcr.io registry. In order to make kubemark not as provider specific, allow the developer to choose a registry.
Automatic merge from submit-queue (batch tested with PRs 40549, 40339)
Invalid node names when deploying with Heat
OpenStack Heat templates create Kubernetes nodes with invalid
hostnames. Capital letters are not allowed in the hostnames:
Unable to register node "kubernetes-node-6s8OizYe" with API server: Node "kubernetes-node-6s8OizYe" is invalid: metadata.name: Invalid value: "kubernetes-node-6s8OizYe": must match the regex [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)* (e.g. 'example.com')
This patch prevents Heat generating hostnames that contain
capital letters.
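To see why the name in the error above is rejected, the regex quoted in the message can be applied directly. This is a minimal sketch: the helper name `is_valid_node_name` is an assumption, and lowercasing the name is shown only to illustrate the constraint, not as the way the patch changes the Heat templates.

```python
import re

# Regex copied from the API server error message above.
NODE_NAME_RE = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_valid_node_name(name):
    """Return True if the whole name matches the regex from the error."""
    return NODE_NAME_RE.fullmatch(name) is not None


bad = "kubernetes-node-6s8OizYe"
print(bad, is_valid_node_name(bad))
print(bad.lower(), is_valid_node_name(bad.lower()))
```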
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Improve the multiarch situation; armel => armhf; reenable ppc64le; remove the patched golang
**What this PR does / why we need it**:
- Improves the multiarch situation as described in #38067
- Tries to bump to go1.8 for arm (and later enable ppc64le)
- GOARM 6 => GOARM 7
- Remove the golang 1.7 patch
- armel => armhf
- Bump QEMU version to v2.7.0
**Release note**:
```release-note
Improve the ARM builds and make hyperkube on ARM work again by upgrading the Go version for ARM to go1.8beta2
```
@kubernetes/sig-testing-misc @jessfraz @ixdy @jbeda @david-mcmahon @pwittrock
Automatic merge from submit-queue
Use a wrapper script to locate kubefed and kubectl binaries instead of directly constructing their paths.
This fixes the failures in the e2e tests, which have now switched to using kubefed.
cc @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 40497, 39769, 40554, 40569, 40597)
When calling chown, use : instead of . to separate the user and group for cross platform compatibility.
**What this PR does / why we need it**: Makes it possible to build on a Mac, which was broken by #39515.
**Special notes for your reviewer**:
**Release note**:
`NONE`
Automatic merge from submit-queue (batch tested with PRs 39469, 40557)
Refactored kubemark code into provider-specific and provider-independent parts [Part-1]
Applying part of the changes of PR https://github.com/kubernetes/kubernetes/pull/39033 (which refactored kubemark code completely). The major changes included in this PR are:
- Moved cluster-kubemark/config-default.sh -> cluster-kubemark/gce/config-default.sh (as the config is gce-specific)
- Changed kubernetes/cluster/kubemark/util.sh to source the right scripts based on the cloud-provider
- Added the file test/kubemark/cloud-provider-config.sh which sets the variable CLOUD_PROVIDER that is later picked up by various scripts (run-e2e-tests.sh, common.sh)
- Removed useless code and restructured start-kubemark.sh and stop-kubemark.sh scripts.
@kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 40126, 40565, 38777, 40564, 40572)
Bump up glbc version to 0.9.0-beta.1
I plan to bump up the version to 0.9.0 proper in time for the next 1.5.x release, and cherry-pick both this and the future pr.
Previously we were just using a single version, but the "-beta/alpha" is consistent with how we release kube and gives us a convenient revert target. It also forces us to remove the "beta" tag before code freeze, and track the kubernetes release cycle.
Automatic merge from submit-queue (batch tested with PRs 38739, 40480, 40495, 40172, 40393)
Use existing ABAC policy file when upgrading GCE cluster
When upgrading, continue loading an existing ABAC policy file so that existing system components continue working as-is
```
When upgrading an existing 1.5 GCE cluster using `cluster/gce/upgrade.sh`, an existing ABAC policy file located at /etc/srv/kubernetes/abac-authz-policy.jsonl (the default location in 1.5) will enable the ABAC authorizer in addition to the RBAC authorizer. To switch an upgraded 1.5 cluster completely to RBAC, ensure the control plane components and your superuser have been granted sufficient RBAC permissions, move the legacy ABAC policy file to a backup location, and restart the apiserver.
```
Automatic merge from submit-queue
Add support of Keystone v3 'domain-name' to 'openstack-heat' cluster setup
**What this PR does / why we need it**:
Keystone v3 authentication by user name [requires the domain (name or ID)](http://developer.openstack.org/api-ref/identity/v3/index.html?expanded=password-authentication-with-scoped-authorization-detail). If `domain-name` is not provided kubelet fails as seen below:
```
kubelet: error: failed to run Kubelet: could not init cloud provider "openstack": You must provide exactly one of DomainID or DomainName to authenticate by Username
systemd: kubelet.service: main process exited, code=exited, status=1/FAILURE
systemd: Unit kubelet.service entered failed state.
systemd: kubelet.service failed.
```
To solve this I pass a new `OS_USER_DOMAIN_NAME` environment variable through openstack-heat's heat-templates to write it as `domain-name` in `/srv/kubernetes/openstack.conf`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #39783
**Special notes for your reviewer**:
**Release note**:
```
domain-name support for Keystone v3 added to openstack-heat cluster setup
```
Automatic merge from submit-queue
Support quickly creating an HA cluster with the kube-up.sh centos provider
Make the `kube-up.sh` `centos provider` support quickly creating an HA cluster. As I said in [#39430](https://github.com/kubernetes/kubernetes/issues/39430), it's more flexible than `kops` or `kubeadm` for some people in a limited network region.
I'm new to k8s dev, so if this pull request needs changes, please let me know.
```release-note
Added support for creating HA clusters for centos using kube-up.sh.
```
Automatic merge from submit-queue
[OpenStack-Heat] Add link to the OpenStack CLI install documentation
**What this PR does / why we need it**:
More helpful diagnostic text
**Special notes for your reviewer**:
Trivial patch
**Release note**:
```release-note
NONE
```
Fix: cannot get default master advertise address correctly
Set default value of NUM_MASTERS and NUM_NODES by MASTERS and NODES themselves
Code cleanup and documentation
Using runtime reconfiguration for etcd cluster instead of etcd discovery
Add exceptions for verify-flags
Automatic merge from submit-queue (batch tested with PRs 40335, 40320, 40324, 39103, 40315)
Use the e2e zone name as the cluster name.
This needs a revamp, but for now e2e zone name is used as the
unique cluster identifier in our e2e tests and we will continue
to use that pattern.
This is a follow up to PR #38638.
cc @kubernetes/sig-federation-pr-reviews @nikhiljindal
Automatic merge from submit-queue (batch tested with PRs 40335, 40320, 40324, 39103, 40315)
Splitting master/node services into separate charm layers
**What this PR does / why we need it**:
This branch includes a roll-up series of commits from a fork of the
Kubernetes repository pre 1.5 release because we didn't make the code freeze.
This additional effort has been fully tested and has results submit into
the gubernator to enhance confidence in this code quality vs. the single
layer, posing as both master/node.
To reference the gubernator results, please see:
https://k8s-gubernator.appspot.com/builds/canonical-kubernetes-tests/logs/kubernetes-gce-e2e-node/
Apologies in advance for the large commit; however, we did not want to
submit without having successful upstream automated testing results.
This commit includes:
- Support for CNI networking plugins
- Support for durable storage provided by Ceph
- Building from upstream templates (read: kubedns - no more template
drift!)
- An e2e charm-layer to make running validation tests much simpler/repeatable
- Changes to support the 1.5.x series of Kubernetes
**Special notes for your reviewer**:
Additional note: We will be targeting -all- future work against upstream
so large pull requests of this magnitude will not occur again.
**Release note**:
```release-note
- Splits Juju Charm layers into master/worker roles
- Adds support for 1.5.x series of Kubernetes
- Introduces a tactic for keeping templates in sync with upstream eliminating template drift
- Adds CNI support to the Juju Charms
- Adds durable storage support to the Juju Charms
- Introduces an e2e Charm layer for repeatable testing efforts and validation of clusters
```
Automatic merge from submit-queue (batch tested with PRs 39260, 40216, 40213, 40325, 40333)
Fixed propagation of kube master certs during master replication.
Fixed propagation of kube-master-certs during master replication.
Automatic merge from submit-queue (batch tested with PRs 39275, 40327, 37264)
Fix invalid node name in openstack-heat provider
Cluster node name must follow name syntax in RFC 1123.
But currently, the openstack-heat provider generates invalid
node names which contain upper-case characters.
This patch fixes it.
Automatic merge from submit-queue (batch tested with PRs 40299, 40311)
cluster: update default rkt version to 1.23.0
This updates cluster configurations to current stable rkt version.
Automatic merge from submit-queue
OWNERS: Update latest OWNERS files
These files have been created lately, so we don't have much information
about them anyway, so let's just:
- Remove assignees and make them approvers
- Copy approvers as reviewers
Automatic merge from submit-queue (batch tested with PRs 40251, 40171)
Only run gcloud as root if we plan to change something.
Only run gcloud as root if we plan to change something.
Fixes bug introduced in #36292 @jlowdermilk @ixdy
Automatic merge from submit-queue (batch tested with PRs 40011, 40159)
Make CACHEBUST for hyperkube build optional
**What this PR does / why we need it**: It makes CACHEBUST for the hyperkube build optional. Currently, building the hyperkube image always results in a full rebuild, including retrieving and installing all debian packages. This is a good thing for releases, but makes life as a dev hard.
This allows doing something like this:
```
$ REGISTRY=<registry> VERSION=<version> CACHEBUST=0 ./hack/dev-push-hyperkube.sh
```
Probably we should even make CACHEBUST=0 the default when calling dev-xxx.sh scripts.
CC: @aaronlevy
Automatic merge from submit-queue (batch tested with PRs 36693, 40154, 40170, 39033)
Refactored kubemark into cloud-provider independent code and GCE specific code
Ref issue #38967
The following are the major changes as part of this refactoring:
- Moved cluster-kubemark/config-default.sh -> cluster-kubemark/gce/config-default.sh (as the config is gce-specific)
- Changed kubernetes/cluster/kubemark/util.sh to source the right scripts based on the cloud-provider
- Added test/kubemark/skeleton/util.sh which defines a well-commented interface that any cloud-provider should implement to run kubemark. (We have this interface defined only for gce currently)
This includes functions like creating the master machine instance along with its resources, executing a given command on the master (like ssh), scp, deleting the master instance and its resources.
All these functions have to be overridden by each cloud provider inside the file /test/kubemark/$CLOUD_PROVIDER/util.sh
- Added the file test/kubemark/cloud-provider-config.sh which sets the variable CLOUD_PROVIDER that is later picked up by various scripts (start-kubemark.sh, stop-kubemark.sh, run-e2e-tests.sh)
- Removed test/kubemark/common.sh and moved whatever provider-independent code it had into start-kubemark.sh (the only place where the script is called), and moved the little gce-specific code into test/kubemark/gce/util.sh.
- Finally, removed useless code and restructured start-kubemark.sh and stop-kubemark.sh scripts.
@kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 36693, 40154, 40170, 39033)
make client-go authoritative for pkg/client/restclient
Moves client/restclient to client-go, along with util/certs and util/testing as transitive dependencies.
Automatic merge from submit-queue (batch tested with PRs 40168, 40165, 39158, 39966, 40190)
Include system:masters group in the bootstrap admin client certificate
Sets up the bootstrap admin client certificate for new clusters to be in the system:masters group
Removes the need for an explicit grant to the kubecfg user in e2e-bindings
```release-note
The default client certificate generated by kube-up now contains the superuser `system:masters` group
```
Automatic merge from submit-queue
Update root approvers files
Replaces #40040
Update top level OWNERS files mostly to set assignees to approvers. Also remove @bgrant0607 from everywhere but the very top level OWNERS file.
Automatic merge from submit-queue
Use ensure-temp-dir in the common.sh script
Ref issue #38967
Instead of having an ensure-temp-dir function in multiple
places, add it to the common.sh script which is sourced by
all the providers.
Automatic merge from submit-queue (batch tested with PRs 40003, 40017)
Remove library copying from fluentd image
It seems that fluentd can no longer copy systemd libraries from host to be able to read journals.
Automatic merge from submit-queue
Adding cos as an alias for gci.
**What this PR does / why we need it**: Adding COS as an alias for GCI.
cc: @adityakali @wonderfly
Automatic merge from submit-queue (batch tested with PRs 40105, 40095)
[OpenStack-Heat] Fix regex used to get object-store URL
**Release note**:
```release-note
Fixes a bug in the OpenStack-Heat kubernetes provider, in the handling of differences between the Identity v2 and Identity v3 APIs
```
Automatic merge from submit-queue
Build release tars using bazel
**What this PR does / why we need it**: builds equivalents of the various kubernetes release tarballs, solely using bazel.
For example, you can now do
```console
$ make bazel-release
$ hack/e2e.go -v -up -test -down
```
**Special notes for your reviewer**: this is currently dependent on 3b29803eb5, which I have yet to turn into a pull request, since I'm still trying to figure out if this is the best approach.
Basically, the issue comes up with the way we generate the various server docker image tarfiles and load them on nodes:
* we `md5sum` the binary being encapsulated (e.g. kube-proxy) and save that to `$binary.docker_tag` in the server tarball
* we then build the docker image and tag using that md5sum (e.g. `gcr.io/google_containers/kube-proxy:$MD5SUM`)
* we `docker save` this image, which embeds the full tag in the `$binary.tar` file.
* on cluster startup, we `docker load` these tarballs, which are loaded with the tag that we'd created at build time. The nodes then use the `$binary.docker_tag` file to find the right image.
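The tagging scheme in the steps above can be sketched in a few lines of Python. This is only an illustration of the idea (hash the binary, use the digest as the image tag); the function names are hypothetical and the real build uses shell tooling rather than this code.

```python
import hashlib


def docker_tag_for(binary_bytes: bytes) -> str:
    """Return the md5 hex digest of the binary, used as the image tag."""
    return hashlib.md5(binary_bytes).hexdigest()


def image_ref(registry: str, binary: str, tag: str) -> str:
    """Build the full image reference, e.g. registry/kube-proxy:<md5>."""
    return f"{registry}/{binary}:{tag}"


# Placeholder bytes stand in for the real kube-proxy binary.
tag = docker_tag_for(b"placeholder kube-proxy binary contents")
print(image_ref("gcr.io/google_containers", "kube-proxy", tag))
```

Because the tag is derived from the binary contents, a node can only find the image if the same tag is embedded in the tarball that it later loads with `docker load`.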
With the current bazel `docker_build` rule, the tag isn't saved in the docker image tar, so the node is unable to find the image after `docker load`ing it.
My changes to the rule save the tag in the docker image tar, though I don't know if there are subtle issues with it. (Maybe we want to only tag when `--stamp` is given?)
Also, the docker images produced by bazel have the timestamp set to the unix epoch, which is not great for debugging. Might be another thing to change with a `--stamp`.
Long story short, we probably need to follow up with bazel folks on the best way to solve this problem.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 36467, 36528, 39568, 40094, 39042)
Bump GCE to container-vm-v20170117
Base image update only, no kubelet or Docker updates.
```release-note
Update GCE ContainerVM deployment to container-vm-v20170117 to pick up CVE fixes in base image.
```
Automatic merge from submit-queue
Enable lazy initialization of ext3/ext4 filesystems
**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #30752, fixes #30240
**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```
**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.
These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and was later ported to Go so that the dependency on the script could be removed. After some searching, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount
Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```
The problem is that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so it may be different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.
When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system.
As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory.
I checked the kernel code for lazy inode table initialization. The nice thing is that the kernel also performs the guaranteed-zeroing-on-discard check. If zeroing is guaranteed, the kernel uses discard for the lazy initialization, which should finish in just a few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is that free memory is not required and not touched, so performance is maxed and the system does not suffer.
As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest completely removing these flags and relying on the kernel to do it in the best way.
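To make the change concrete, here is a hedged Python sketch of how the format command differs with and without the extended options. The option string `lazy_itable_init=0,lazy_journal_init=0` is an assumption based on the extended options documented for mke2fs; the PR text does not quote the exact flags, and the helper name `mkfs_command` is hypothetical.

```python
from typing import List


def mkfs_command(fstype: str, device: str, disable_lazy_init: bool) -> List[str]:
    """Build an mkfs argument list for ext3/ext4.

    When disable_lazy_init is True, the (assumed) extended options that turn
    off lazy inode table and journal initialization are added. When False,
    the mkfs defaults apply and lazy initialization stays enabled.
    """
    cmd = [f"mkfs.{fstype}"]
    if disable_lazy_init:
        # Assumed flags; see the mke2fs documentation for -E.
        cmd += ["-E", "lazy_itable_init=0,lazy_journal_init=0"]
    cmd.append(device)
    return cmd


# Old behaviour (eager initialization) versus the behaviour proposed here.
print(mkfs_command("ext4", "/dev/sdb", disable_lazy_init=True))
print(mkfs_command("ext4", "/dev/sdb", disable_lazy_init=False))
```

The device path `/dev/sdb` is a placeholder. The sketch only builds the argument list and never runs mkfs.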
Automatic merge from submit-queue (batch tested with PRs 39911, 40002, 39969, 40012, 40009)
Sync fluentd daemonset liveness probe with static pod liveness probe
Syncing change from https://github.com/kubernetes/kubernetes/pull/39949
Should also be cherry-picked
Automatic merge from submit-queue
Use $HOSTNAME as node.name by default
**What this PR does / why we need it**:
Makes it easier to identify elasticsearch instances.
As $HOSTNAME of a pod is unique, this should be no problem.
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
Remove fluentd buffers if fluentd is stuck
Fluentd now stores its buffers on disk for resiliency. However, if a buffer is corrupted, fluentd will keep restarting forever.
This change makes the fluentd liveness probe delete the buffers if fluentd has been stuck for more than X minutes (15 by default).
Automatic merge from submit-queue
Update images that use ubuntu-slim base image to :0.6
**What this PR does / why we need it**: `ubuntu-slim:0.4` is somewhat old, being based on Ubuntu 16.04, whereas `ubuntu-slim:0.6` is based on Ubuntu 16.04.1.
**Special notes for your reviewer**: I haven't pushed any of these images yet, so I expect all of the e2e builds to fail. If we're happy with the changes, I can push the images and then re-trigger tests.
**Release note**:
```release-note
NONE
```
cc @aledbf as FYI
Automatic merge from submit-queue
Update kubectl to stable version for Addon Manager
Bumps Addon Manager to v6.2. The images below have been pushed:
- gcr.io/google-containers/kube-addon-manager:v6.2
- gcr.io/google-containers/kube-addon-manager-amd64:v6.2
- gcr.io/google-containers/kube-addon-manager-arm:v6.2
- gcr.io/google-containers/kube-addon-manager-arm64:v6.2
- gcr.io/google-containers/kube-addon-manager-ppc64le:v6.2
- gcr.io/google-containers/kube-addon-manager-s390x:v6.2
@mikedanese
cc @ixdy
Automatic merge from submit-queue (batch tested with PRs 39803, 39698, 39537, 39478)
include bootstrap admin in super-user group, ensure tokens file is correct on upgrades
Fixes https://github.com/kubernetes/kubernetes/issues/39532
Possible issues with cluster bring-up scripts:
- [x] known_tokens.csv and basic_auth.csv are not rewritten if the files already exist
* new users (like the controller manager) are not available on upgrade
* changed users (like the kubelet username change) are not reflected
* group additions (like the addition of admin to the superuser group) don't take effect on upgrade
* this PR updates the token and basicauth files line-by-line to preserve user additions, but also ensure new data is persisted
- [x] existing 1.5 clusters may depend on more permissive ABAC permissions (or customized ABAC policies). This PR adds an option to enable existing ABAC policy files for clusters that are upgrading
Follow-ups:
- [ ] both scripts are loading e2e role-bindings, which should only be loaded in e2e tests, not in normal kube-up scenarios
- [ ] when upgrading, set the option to use existing ABAC policy files
- [ ] update bootstrap superuser client certs to add superuser group? ("We also have a certificate that "used to be" a super-user. On GCE, it has CN "kubecfg", on GKE it's "client"")
- [ ] define (but do not load by default) a relaxed set of RBAC roles/rolebindings matching legacy ABAC, and document how to load that for new clusters that do not want to isolate user permissions
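The line-by-line update of the token and basic-auth files mentioned above can be illustrated with a small Python sketch. It is not the PR's implementation (which lives in the cluster bring-up shell scripts); the helper name `upsert_prefixed_line` and the sample tokens are hypothetical.

```python
from typing import List


def upsert_prefixed_line(lines: List[str], prefix: str, suffix: str) -> List[str]:
    """Replace any line starting with prefix by prefix + suffix.

    Lines that do not start with the prefix (for example, users added by
    hand) are preserved as-is. The managed line is appended at the end.
    """
    kept = [line for line in lines if not line.startswith(prefix)]
    kept.append(prefix + suffix)
    return kept


# An existing file with a managed admin entry and a manually added user.
existing = [
    "tok-admin,admin,admin",
    "tok-custom,alice,alice",
]
updated = upsert_prefixed_line(existing, "tok-admin,", "admin,admin,system:masters")
print(updated)
```

Keying on the line prefix means that group additions reach the managed entry on upgrade, while entries the scripts do not know about survive untouched.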
Automatic merge from submit-queue
Enable kubernetes_metadata by default for ELK stack
Looks like it was accidentally removed and was not restored in this PR: https://github.com/kubernetes/kubernetes/pull/29883
The plugin still exists in the image, but the new ELK deployment doesn't allow you to index namespaces, pod names, etc.
Automatic merge from submit-queue (batch tested with PRs 39731, 39662, 39721)
container-linux: restart rkt-api on failure
This works around a flake I saw which had the same root cause as
https://github.com/coreos/rkt/issues/3513.
This will potentially help reduce the impact of such future problems as
well.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 39731, 39662, 39721)
Update dashboard version to v1.5.1
**What this PR does / why we need it**:
Latest Dashboard developments, including a fix for a CSRF issue in the dashboard POST handlers
**Release note**:
```
Set Dashboard UI version to v1.5.1
```
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)
Bump container-linux and gci timeout for docker health check
The command `docker ps` can take longer to respond under heavy load or
when encountering some known issues. In these cases, the containers are running
fine, so aggressive health check could cause serious disruption. Bump the
timeout to 60s to be consistent with the debian-based containerVM.
This addresses #38588
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)
Allow rolebinding/clusterrolebinding with explicit bind permission check
Fixes https://github.com/kubernetes/kubernetes/issues/39176
Fixes https://github.com/kubernetes/kubernetes/issues/39258
Allows creating/updating a rolebinding/clusterrolebinding if the user has explicitly been granted permission to perform the "bind" verb against the referenced role/clusterrole (previously, they could only bind if they already had all the permissions in the referenced role via an RBAC role themselves)
```release-note
To create or update an RBAC RoleBinding or ClusterRoleBinding object, a user must:
1. Be authorized to make the create or update API request
2. Be allowed to bind the referenced role, either by already having all of the permissions contained in the referenced role, or by having the "bind" permission on the referenced role.
```
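The second rule in the release note can be modeled as a simple set check. The sketch below is a deliberately simplified model, assuming permissions are plain (verb, resource) pairs; real RBAC rules also carry API groups, resource names, and non-resource URLs, and the first rule (being authorized to make the create or update request at all) is not modeled. The name `may_bind_role` is hypothetical.

```python
from typing import FrozenSet, Tuple

# Simplified permission model: a (verb, resource) pair.
Permission = Tuple[str, str]


def may_bind_role(
    user_perms: FrozenSet[Permission],
    role_name: str,
    role_perms: FrozenSet[Permission],
) -> bool:
    """Return True if the user may reference the role in a binding."""
    # Path 1: the user already holds every permission the role contains.
    if role_perms <= user_perms:
        return True
    # Path 2: the user was explicitly granted "bind" on this role.
    return ("bind", f"roles/{role_name}") in user_perms


pod_reader = frozenset({("get", "pods"), ("list", "pods")})
alice = frozenset({("get", "pods"), ("list", "pods"), ("get", "secrets")})
bob = frozenset({("bind", "roles/pod-reader")})
carol = frozenset({("get", "pods")})

for name, perms in (("alice", alice), ("bob", bob), ("carol", carol)):
    print(name, may_bind_role(perms, "pod-reader", pod_reader))
```

In this model, alice qualifies through the first path, bob through the explicit "bind" grant, and carol through neither.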
Automatic merge from submit-queue (batch tested with PRs 38212, 38792, 39641, 36390, 39005)
Generate a kubelet CA and kube-apiserver cert-pair for kubelet auth.
cc @cjcullen
Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)
fix e2e kubelet binding
Fixes #39543
This limits scope of the kubelet. It was an oversight before. Hopefully we won't end up chasing permissions again.
Automatic merge from submit-queue
cluster/cl: move abac to rbac
See #39092
We based off of GCI in the brief time where it was using abac.
fixes #39395
cc @yifan-gu
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 36229, 39450)
Bump etcd to 3.0.14 and switch to v3 API in etcd.
Ref #20504
**Release note**:
```release-note
Switch default etcd version to 3.0.14.
Switch default storage backend flag in apiserver to `etcd3` mode.
```
Automatic merge from submit-queue (batch tested with PRs 38433, 36245)
Remove needless env var in OpenStack provider
**What this PR does / why we need it**:
If we use the OpenStack provider to set up a k8s cluster with the kube-up script,
the `TENANT_ID` environment variable is required.
But configuring `TENANT_ID` is very annoying because its value differs between environments.
This patch uses `TENANT_NAME` instead of `TENANT_ID`.
`TENANT_NAME` is unique when the keystone v2 API is used,
so `TENANT_ID` is not needed if `TENANT_NAME` is provided
to configure the OpenStack provider.
Setting `TENANT_ID` is also annoying during development, because
`TENANT_ID` differs between environments.
This patch removes the dependency on `TENANT_ID` and simply uses
`TENANT_NAME`.
Automatic merge from submit-queue
Try parse golang logs by default
Glog logs to stderr by default, so Stackdriver Logging shows all of its messages as errors. This PR makes fluentd try to parse messages using the glog format and, if parsing succeeds, set the timestamp and severity accordingly.
CC @piosz @fgrzadkowski
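As an illustration of the parsing described in this PR, the following Python sketch matches the standard glog line header (`Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg`) and derives a severity from the leading letter. It is not the fluentd configuration from the PR; the severity names in `SEVERITY` and the function name `parse_glog` are assumptions made for the example.

```python
import re
from typing import Dict, Optional

# glog header: severity letter, month/day, time, thread id, source location.
GLOG_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^ \]]+)\] (?P<message>.*)$"
)

# Assumed mapping from glog letters to severity names.
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "CRITICAL"}


def parse_glog(line: str) -> Optional[Dict[str, str]]:
    """Parse one glog line; return None if the line is not in glog format."""
    m = GLOG_RE.match(line)
    if m is None:
        return None
    return {
        "severity": SEVERITY[m.group("level")],
        "timestamp": f"{m.group('month')}-{m.group('day')} {m.group('time')}",
        "source": m.group("source"),
        "message": m.group("message"),
    }


print(parse_glog("I0117 12:34:56.789012    1234 server.go:42] Started kubelet"))
print(parse_glog("plain text that is not glog"))
```

Returning `None` for lines that do not match mirrors the "try to parse" behaviour: unparsed messages can be passed through unchanged.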
Automatic merge from submit-queue
Remove all MAINTAINER statements in the codebase as they are deprecated
**What this PR does / why we need it**:
ref: https://github.com/docker/docker/pull/25466
**Release note**:
```release-note
Remove all MAINTAINER statements in Dockerfiles in the codebase as they are deprecated by docker
```
@ixdy @thockin (who else should be notified?)
Automatic merge from submit-queue
Adds assignees for kube-dns
Adds assignees for auto-assigning. Does not add assignees for pkg/dns folder as we are moving it out.
@thockin
Automatic merge from submit-queue
Adds kubernetes.io link for dns autoscaler addon
The [official page for DNS Horizontal Autoscaling](http://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/) is available on kubernetes.io after 1.5 release. Putting the link into this dns autoscaler addon folder as well.
@bowei
Automatic merge from submit-queue (batch tested with PRs 39146, 39094)
cleanup last e2e authorization failures
Builds on https://github.com/kubernetes/kubernetes/pull/39080. This adds RBAC role bindings during e2e tests for tests that use SA permissions to loop back to the API server.
Assigned to me until it's ready.
Automatic merge from submit-queue
Make fluentd pods critical
Related to https://github.com/kubernetes/kubernetes/issues/38322
Make fluentd critical so that it is less likely to be evicted.
CC @piosz @fgrzadkowski
Automatic merge from submit-queue
Add liveness probe for fluentd-gcp
It's known that fluentd can hang during execution until it is manually restarted.
Liveness probe fixes this problem in the following way: if no buffer chunks were sent or created in the last 5 minutes, fluentd is hanging and should be restarted.
CC @piosz
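The liveness rule described above amounts to a staleness check on the buffer directory. Below is a minimal Python sketch of that idea, assuming that buffer activity can be approximated by the modification time of the buffer directory (which changes when chunk files are created or removed). The actual probe is not shown in this text, and both function names are hypothetical.

```python
import os
import tempfile
import time


def seconds_since_buffer_activity(buffer_dir: str, now: float) -> float:
    """Seconds since the buffer directory was last modified.

    A directory's mtime changes when entries are created or removed.
    """
    return now - os.stat(buffer_dir).st_mtime


def fluentd_looks_stuck(idle_seconds: float, threshold_seconds: float = 300.0) -> bool:
    """Treat fluentd as hanging if there was no activity for over 5 minutes."""
    return idle_seconds > threshold_seconds


# A temporary directory stands in for the real fluentd buffer directory.
with tempfile.TemporaryDirectory() as buffer_dir:
    idle = seconds_since_buffer_activity(buffer_dir, time.time())
    print(fluentd_looks_stuck(idle))
```

A liveness probe built on this check would exit non-zero when `fluentd_looks_stuck` returns true, so that the kubelet restarts the container.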
Automatic merge from submit-queue (batch tested with PRs 39061, 39079)
Fixed cluster validation: added -q flag to gcloud.
Fixed cluster validation in multi-zone mode: added -q flag to gcloud.
Automatic merge from submit-queue
Fix typo for federation/*
**What this PR does / why we need it**:
Increase code readability for this new member in v1.5
**Special notes for your reviewer**:
Could we develop a typo-fix bot along with a k8s terminology dictionary?
**Release note**:
```release-note
```
Automatic merge from submit-queue
To add local registry to libvirt_coreos
`libvirt_coreos` is a quick way to have a multi-node cluster on a Linux laptop for development purposes.
This PR adds a local registry to the libvirt_coreos cluster. Mind having a look?
@roberthbailey since you reviewed my last PR on dns for libvirt_coreos
Automatic merge from submit-queue
Update reference to dns sidecar (was dnsmasq-metrics); remove exec-healthz
-The image path is wrong -- I am waiting for the CI to pass here before pushing to google_containers-
Automatic merge from submit-queue
Coreos kube-up now with less cloud init
This update includes significant refactoring. It moves almost all of the
logic into bash scripts, modeled after the `gci` cluster scripts.
The reason to do this is:
1. Avoid duplicating the saltbase manifests by reusing gci's parsing logic (easier maintenance)
2. Take an incremental step towards sharing more code between gci/trusty/coreos, again for better maintenance
3. Pave the way for making future changes (e.g. improved rkt support, kubelet support) easier to share
The primary differences from the gci scripts are the following:
1. Use of the `/opt/kubernetes` directory over `/home/kubernetes`
2. Support for rkt as a runtime
3. No use of logrotate
4. No use of `/etc/default/`
5. No logic related to noexec mounts or gci-specific firewall-stuff
It will make sense to move 2 over to gci, as well as perhaps a few other small improvements. That will be a separate PR for ease of review.
Ref #29720, this is a part of that because it removes a copy of them.
Fixes #24165
cc @yifan-gu
Since this logic largely duplicates logic from the gci folder, it would be nice if someone closely familiar with that gave an OK or made sure I didn't fall into any gotchas related to that, so cc @andyzheng0831
Automatic merge from submit-queue
Moved kubemark master from Debian to GCI
This PR fixes issue #37484
Kubemark master now runs on GCI instead of Debian, taking it one step closer to a real cluster master.
Primary changes:
1. changing master VM image/OS in kubemark's config-default.sh from Debian to GCI
2. moving kubelet to systemd from supervisord
3. changing directory for cert/key/csv files from /srv/kubernetes to /etc/srv/kubernetes
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Curating Owners: cluster/juju
cc @castrojo @mbruzek @chuckbutler @marcoceppi
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g. change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable; please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the
future in the **reviewers** section. You probably do NOT need to modify
the **approvers** section. Names are sorted by relevance, using some
secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
Automatic merge from submit-queue
common.sh should load before kube-down/kube-up function called
#38921: common.sh should be loaded before the kube-down/kube-up functions are called. Currently it is loaded inside the kube-down/kube-up functions, so the verify-kube-binaries command can't be found.
Automatic merge from submit-queue (batch tested with PRs 38906, 38808)
change the version in the yaml file
change the version in heapster-controller.yaml with image version
Automatic merge from submit-queue
cluster/gce/coreos: add OWNERS
See #33965 for context.
The code in `cluster/gce/coreos` has mostly been written/maintained by @yifan-gu and myself thus far, so I added our names to the owner list.
@ethernetdan has also volunteered (thanks!).
**Release note**:
```release-note
NONE
```
cc @roberthbailey
Automatic merge from submit-queue
Admit critical pods in the kubelet
Haven't verified in a live cluster yet, just unit-tested, so applying do-not-merge label.
Automatic merge from submit-queue
Use daemonset in docker registry add on
When using the registry add-on with a Kubernetes cluster, it makes sense to use a `daemonset` to bring up a pod on each node of the cluster. Right now the docs suggest bringing up a pod on each node manually by dropping the pod manifests into the directory `/etc/kubernetes/manifests`.
Automatic merge from submit-queue
Automatically download missing kube binaries in kube-up/kube-down.
**What this PR does / why we need it**: some users extract `kubernetes.tar.gz` and then immediately call `cluster/kube-up.sh` without first calling the new `cluster/get-kube-binaries.sh` script. As a result, the cluster fails to start, but it's not immediately clear why binaries are missing.
This PR streamlines this workflow by detecting this condition and prompting the user to download necessary binaries (using `cluster/get-kube-binaries.sh`).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #38725
cc @arun-gupta @christian-posta
Automatic merge from submit-queue
[Federation][init-11.2] use USE_KUBEFED env var to choose between old and new federation deployment
This is continuation of #35961
The USE_KUBEFED variable is used for deploying the federation control plane. If it is not defined, federation will be brought up using the old method, i.e. scripts.
I have verified that federation comes up using the old method, using the following steps:
```
$ export FEDERATION=true
$ export E2E_ZONES="asia-east1-c"
$ export FEDERATION_PUSH_REPO_BASE=gcr.io/<my-project>
$ KUBE_RELEASE_RUN_TESTS=n KUBE_FASTBUILD=true go run hack/e2e.go -v -build
$ build-tools/push-federation-images.sh
$ go run hack/e2e.go -v --up
```
Should merge #35961 before this PR
@madhusudancs
Automatic merge from submit-queue (batch tested with PRs 38760, 38213)
Avoid exporting fluentd-gcp own logs
To prevent fluentd from exporting its own logs, redirect the output to a file. Ability to read fluentd logs remains, but because these logs will not be exported, we can increase the verbosity of these logs.
Same change should be made for fluentd-es image.
CC @piosz
Use a daemonset to bring up a pod on each node of the cluster;
right now the docs suggest bringing up a pod on each node by
manually dropping the pod manifests into the directory /etc/kubernetes/manifests.
Automatic merge from submit-queue
Use the cluster name in the names of the firewall rules that allow cluster-internal traffic to disambiguate the rules belonging to different clusters.
Also dropping the network name from these firewall rule names.
Network name was used to disambiguate firewall rules in a given network.
However, since two clusters cannot share a name in a GCE project, this
sufficiently disambiguates the firewall rule names. A potential confusion
arises when someone tries to create a firewall rule with the same name
in a different network, but that's also an indication that they shouldn't
be doing that.
@jszczepkowski due to PR #33094
@ixdy for test-infra
cc @kubernetes/sig-federation @nikhiljindal
Currently, node names may only contain lower-case
characters, but the openstack-heat provider generates invalid
node names that contain upper-case characters. This issue
breaks building a Kubernetes cluster using the openstack-heat
provider.
This patch fixes it.
Automatic merge from submit-queue (batch tested with PRs 38727, 38726, 38347, 38348)
Second pass of renaming kube-dns configure files
Continue work of #38523.
Not sure why cluster/centos/deployAddons.sh was omitted in the previous PR. Also deletes the unused `DNS_REPLICAS` var and changes the `-rc` suffix in hack/local-up-cluster.sh.
@thockin @bowei @deads2k
Automatic merge from submit-queue
Allow GCI_VERSION to come from env
This is to facilitate GCI tip vs. K8s tip testing; we need to
dynamically set the version of GCI to stay current with their
latest canary (latest of the "gci-base" prefixed images).
Automatic merge from submit-queue
Fixed validation of multizone cluster for GCE.
```release-note
Fixed validation of multizone cluster for GCE
```
Fixed validation of multizone cluster for GCE: taking actual number of worker nodes.
Automatic merge from submit-queue
Fixed detection of master during creation of multizone nodes cluster.
```release-note
Fixed detection of master during creation of multizone nodes cluster by kube-up.
```
Automatic merge from submit-queue
Curating Owners: cluster/vagrant
cc @derekwaynecarr
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g. change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable; please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the
future in the **reviewers** section. You probably do NOT need to modify
the **approvers** section. Names are sorted by relevance, using some
secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
After adding the aws janitor, the thing we're consistently sweeping is
the DhcpOptionSets created by cluster/aws/util.sh (and there were
thousands on the first run). Fix it!
Automatic merge from submit-queue
Keeps addon manager yamls in sync
From #38437.
We should have kept all addon manager YAML files in sync. This does not fix the release scripts issue, but we should still have this.
@mikedanese @ixdy
Automatic merge from submit-queue (batch tested with PRs 38058, 38523)
Renames kube-dns configure files from skydns* to kubedns*
The `skydns-` prefix and `-rc` suffix are confusing and misleading. This PR renames them to `kubedns` in the existing YAML files and scripts.
@bowei @thockin
Automatic merge from submit-queue (batch tested with PRs 37860, 38429, 38451, 36050, 38463)
[Part 2] Adding s390x cross-compilation support for gcr.io images in this repo
**What this PR does / why we need it**: This PR enables s390x support for kube-dns, pause, addon-manager, etcd, hyperkube, kube-discovery, etc. It also includes the changes needed to cross-compile these images on an x86 host.
**Which issue this PR fixes**: #34328
**Special notes for your reviewer**: The repository referenced in the existing file "build-tools/build-image/cross/Dockerfile" for installing cross-build toolchains does not provide a toolchain for s390x, so this PR changes the repository to allow cross compilation for s390x.
**Release note**:
```
Allows cross compilation of Kubernetes on an x86 host for s390x, and enables s390x support for kube-dns, pause, addon-manager, etcd, hyperkube, kube-discovery, etc.
```
Automatic merge from submit-queue
Fix OSX hyperkube packaging with updated "mktemp -d" usage
**What this PR does / why we need it**:
Before this patch, the ```make release``` command did not finish successfully on OSX because ```kube::release::package_hyperkube``` failed: its ```mktemp -d``` invocation was not compatible with the OSX version of ```mktemp```.
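One portable way to create a temporary directory with both GNU and BSD (OSX) `mktemp` is to pass an explicit template path ending in `XXXXXX`. The sketch below shows that form; the helper name `make_temp_dir` and the `hyperkube-demo` prefix are hypothetical, and this is not necessarily the change this PR made.

```shell
# make_temp_dir PREFIX
# Create a private temporary directory and print its path. A full template
# path is passed so the call does not depend on how a given mktemp
# treats a missing template or the -t flag.
make_temp_dir() {
  base="${TMPDIR:-/tmp}"
  mktemp -d "${base%/}/$1.XXXXXX"
}

workdir="$(make_temp_dir hyperkube-demo)"
# Remove the directory when the script exits, whether it succeeded or not.
trap 'rm -rf "${workdir}"' EXIT
echo "staging in ${workdir}"
```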
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
didn't find any existing issues
**Special notes for your reviewer**:
**Release note**:
```release-note
```
The PR title sounds good enough for the release note
Automatic merge from submit-queue
Correct docs
**What this PR does / why we need it**:
There was a change to the registry-proxy, but the documentation wasn't completely updated to reflect the change.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Linked to [contribute deis/registry-proxy as a replacement for kube-registry-proxy](https://github.com/kubernetes/kubernetes/pull/35797)
**Special notes for your reviewer**:
First time contributing.
**Release note**:
```release-note
Updated the kube-registry-proxy readme example.
```
Automatic merge from submit-queue (batch tested with PRs 36736, 35956, 35655, 37713, 38316)
Ae/fix2
**What this PR does / why we need it**: Fixes some kubelet typos
**Release note**:
`None`
`TENANT_NAME` is unique when the keystone v2 API is used, so
`TENANT_ID` is not needed to configure the OpenStack provider
if `TENANT_NAME` is provided.
Setting `TENANT_ID` is also annoying during development, because
`TENANT_ID` differs between environments.
This patch removes the dependency on `TENANT_ID` and simply uses
`TENANT_NAME`.
Automatic merge from submit-queue (batch tested with PRs 36419, 38330, 37718, 38244, 38375)
Translate a published version like 'release/stable' to version number
This PR adds new functionality to the `cluster/get-kube.sh` script: it translates a published version like 'release/stable' to a version number.
Fixes: https://github.com/kubernetes/kubernetes/issues/35351
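The translation can be sketched as follows, assuming a published version is a marker path containing a `/` whose `.txt` file holds the version number. The function names, the `KUBE_RELEASE_URL` variable, and its default value are assumptions made for illustration and may not match what `cluster/get-kube.sh` actually does.

```shell
# Base URL under which version marker files are assumed to be published.
KUBE_RELEASE_URL="${KUBE_RELEASE_URL:-https://dl.k8s.io}"

# fetch_marker MARKER
# Download the marker file (e.g. release/stable.txt) and print its contents.
fetch_marker() {
  curl -fsSL --retry 5 "${KUBE_RELEASE_URL}/$1.txt"
}

# resolve_kube_version REQUESTED
# Print a version number: markers such as release/stable are looked up,
# anything without a slash (e.g. v1.5.1) is passed through unchanged.
resolve_kube_version() {
  case "$1" in
    */*) fetch_marker "$1" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}
```

Calling `resolve_kube_version release/stable` would then print whatever version the marker currently points to, while `resolve_kube_version v1.5.1` prints `v1.5.1` without touching the network.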
Automatic merge from submit-queue
kubelet-run-parameter: change config to pod-manifest-path
What this PR does / why we need it:
"--config" will be removed in a future version of kubelet. To prevent failures with the new version, use "pod-manifest-path" instead.
Automatic merge from submit-queue (batch tested with PRs 32663, 35797)
contribute deis/registry-proxy as a replacement for kube-registry-proxy
This PR is a proposal to replace the `kube-registry-proxy` addon code with [deis/registry-proxy](https://github.com/deis/registry-proxy). We have been running this component in production for several months ([since Workflow v2.3.0](15d4c1c298/workflow-v2.3.0/tpl/deis-registry-proxy-daemon.yaml)) without any issues.
There are several benefits that this proxy provides over the current implementation:
- it's the same code that is provided in [docker/distribution's contrib dir](https://github.com/docker/distribution/tree/master/contrib/compose) which I have personally used for both Docker v1 and v2 engine deployments without any issues
- the ability to [disable old Docker clients](https://github.com/deis/registry-proxy/blob/master/rootfs/etc/nginx/conf.d/default.conf.in#L19-L23) that are incompatible with the v2 registry
- better default connection timeouts, using best practices from the Docker community as a whole
- workarounds for bugs like https://github.com/docker/docker/issues/1486 (see https://github.com/deis/registry-proxy/blob/master/rootfs/etc/nginx/conf.d/default.conf.in#L15-L16)
Ways in which this PR differs from the current implementation:
- it's not HAProxy.
I'm not sure how the release process goes for this component, but I bumped the version to v0.4 and changed the maintainer to myself considering this is a massive overhaul. Please let me know if this is acceptable as a replacement or if we should perhaps consider this as an alternative implementation.
Happy Friday!
Automatic merge from submit-queue
openstack: Implement the `Routes` provider API
``` release-note
Implement the Routes provider API for OpenStack using Neutron extraroute extension. This removes the need for flannel/etc where supported. To use, ensure all your nodes are on the same Neutron (private) network and specify the router ID in new `[Route]` section of provider config:
[Route]
router-id = <router UUID>
```