Kubernetes initiates "graceful shutdown" by sending SIGTERM to pid 1.
The way the existing startup scripts worked, this signal arrived at
the shell wrapper, not elasticsearch, and the shell wrapper exited,
killing the container immediately.
Before this change:
1 ? Ss 0:00 /bin/sh -c /run.sh
6 ? S 0:00 /bin/bash /run.sh
13 ? S 0:00 \_ /bin/su -c /elasticsearch/bin/elasticsearch elasticsearch
14 ? Ss 0:00 \_ sh -c /elasticsearch/bin/elasticsearch
15 ? Sl 19:18 \_ /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
After this change:
1 ? Ssl 0:29 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
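Why the flattened process tree matters can be shown with a small sketch. This is not the startup script from this change; it only demonstrates, using Python's `os.execv` on a POSIX system, that an exec'd program keeps the pid of the process that launched it, so a signal sent to that pid reaches the final program directly. The helper name `pid_after_exec` is hypothetical.

```python
import subprocess
import sys

# Child program: replace itself (exec) with a second interpreter
# that prints its own pid.
CHILD = (
    "import os, sys; "
    "os.execv(sys.executable, "
    "[sys.executable, '-c', 'import os; print(os.getpid())'])"
)


def pid_after_exec():
    """Return (pid of the launched process, pid reported after exec)."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return proc.pid, int(out.decode().strip())
```

If the two values returned by `pid_after_exec()` are equal, no intermediate wrapper process is left between the signal sender and the program.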
Automatic merge from submit-queue (batch tested with PRs 41826, 42405)
Add stubDomains and upstreamNameservers configuration to kube-dns
```release-note
Updates the dnsmasq cache/mux layer to be managed by dnsmasq-nanny.
dnsmasq-nanny manages dnsmasq based on values from the
kube-system:kube-dns configmap:
"stubDomains": {
"acme.local": ["1.2.3.4"]
},
is a map of domain to list of nameservers for the domain. This is used
to inject private DNS domains into the kube-dns namespace. In the above
example, any DNS requests for *.acme.local will be served by the
nameserver 1.2.3.4.
"upstreamNameservers": ["8.8.8.8", "8.8.4.4"]
is a list of upstreamNameservers to use, overriding the configuration
specified in /etc/resolv.conf.
```
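The routing described in the release note can be sketched as follows. This is a minimal illustration of the semantics, not dnsmasq-nanny's code; the function name `pick_nameservers` and the longest-suffix-wins rule are assumptions.

```python
def pick_nameservers(query, stub_domains, upstream_nameservers):
    """Return the nameservers that would serve `query` under this config."""
    name = query.rstrip(".").lower()
    # Assumption: prefer the longest matching stub domain so nested
    # domains win over their parents.
    for domain in sorted(stub_domains, key=len, reverse=True):
        d = domain.rstrip(".").lower()
        if name == d or name.endswith("." + d):
            return stub_domains[domain]
    # No stub domain matched: fall back to the upstream list.
    return upstream_nameservers
```

With the configuration from the release note, a query for `db.acme.local` would go to `1.2.3.4`, and any other name would go to the upstream nameservers.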
Automatic merge from submit-queue (batch tested with PRs 42070, 42127)
Remove fluentd-gcp image sources
This PR removes the fluentd-gcp image sources from the main kubernetes repo in order to move them to `contrib`: https://github.com/kubernetes/contrib/pull/2426
Once the image is moved, it will be maintained by the Stackdriver team (@igorpeshansky, @qingling128 and @dhrupadb)
CC @ixdy @timstclair
Automatic merge from submit-queue (batch tested with PRs 42126, 42130, 42232, 42245, 41932)
Update fluentd-gcp configuration for hosted masters
This PR makes use of the new fluentd-gcp image, which is not configured per se, for the hosted masters, which cannot use configmaps.
Mirroring https://github.com/kubernetes/kubernetes/pull/42126
Automatic merge from submit-queue
Move fluentd DS config to configmap
This is the logical continuation of https://github.com/kubernetes/kubernetes/pull/41998. This PR makes fluentd-gcp DaemonSet use the new image configured using ConfigMap.
This PR doesn't change the way fluentd-gcp works when the master is not registered; that will be fixed in a separate PR.
CC @ixdy @timstclair @igorpeshansky @qingling128 @dhrupadb
**Release note:**
```release-note
Fluentd-gcp containers spawned by DaemonSet are now configured using ConfigMap
```
Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212)
Update defaultbackend image to 1.3
Update `gcr.io/google-containers/defaultbackend` to the latest version.
See https://github.com/kubernetes/contrib/pull/2386
/cc @ixdy
Automatic merge from submit-queue (batch tested with PRs 42058, 41160, 42065, 42076, 39338)
Bump up dns-horizontal-autoscaler to 1.1.1
cluster-proportional-autoscaler 1.1.1 is being released by kubernetes-incubator/cluster-proportional-autoscaler#26. This PR bumps dns-horizontal-autoscaler to the same version to pick up the following features:
- Add PreventSinglePointFailure option in linear mode.
- Use protobufs for communication with apiserver.
- Support switching control mode on-the-fly.
Note:
The new entry `"preventSinglePointFailure":true` ensures that kube-dns has at least 2 replicas when there is more than one node. This mitigates the issue mentioned in #40063.
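As a rough sketch of how this option interacts with linear mode, assuming the linear formula documented by cluster-proportional-autoscaler (the larger of the per-core and per-node ratios, rounded up). The function and parameter names are illustrative, not the autoscaler's API.

```python
import math


def linear_replicas(nodes, cores, nodes_per_replica, cores_per_replica,
                    prevent_single_point_failure=False, minimum=1):
    """Replica count in linear mode (sketch under the stated assumptions)."""
    replicas = max(math.ceil(cores / cores_per_replica),
                   math.ceil(nodes / nodes_per_replica),
                   minimum)
    # With the option enabled, never run a single replica on a
    # multi-node cluster.
    if prevent_single_point_failure and nodes > 1:
        replicas = max(replicas, 2)
    return replicas
```

On a two-node cluster where the ratios alone would yield one replica, the option raises the result to two; on a single-node cluster it has no effect.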
@bowei @thockin
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Cleanup fluentd-gcp image, rebase on debian-base
**Why we need this PR**:
There are several problems with our current fluentd-gcp image:
- It pulls in lots of unused packages, which expose unnecessary risk and create noise in CVE scans (and scare customers). The most notable example is the fluent-ui, which pulls in rails.
- `curl | sh ` is not a good practice for a Dockerfile. First, the script is not checked in the same source control branch, so builds are not reproducible. Second, the actions it is taking are opaque. Third, in this case, using non-standard packages means they're harder to manage with CVE scans & upstream fixes.
**What is changed by this PR?**
- Rather than relying on td-agent (which includes fluent-ui), use standard upstream packages. This is largely based off the [official fluentd debian-based image](https://github.com/fluent/fluentd-docker-image/blob/master/v0.12/debian/Dockerfile).
- Rebases the image on debian-base (depends on https://github.com/kubernetes/kubernetes/pull/41915). We would like to move towards a single full-distro base image we can maintain. This change should be relatively minor.
As a result of these changes, the image size is reduced from 360.6 MB to 185.8 MB (nearly half). Many packages were removed, and the full diff (focus on the unversioned files) is listed here: 3fb704f977
**Which issue this PR fixes** https://github.com/kubernetes/kubernetes/issues/40248
**Special notes for your reviewer**:
This change both addresses security concerns, and is expected to greatly reduce the maintenance burden of the fluentd-gcp image. I'd *really* like to get this into 1.6, so please prioritize this review if possible.
I tested this by running the default e2e suite on a private e2e cluster using the new image. If there are other tests you'd like me to run, please let me know ASAP.
**Release note**:
```release-note
Cleanup fluentd-gcp image: rebase on debian-base, switch to upstream packages, remove fluent-ui & rails
```
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)
Default storage class for vSphere. Fixes #40070
**What this PR does / why we need it**:
Create default storage class for vSphere. This is part of the storage class GA effort https://github.com/kubernetes/features/issues/36
**Which issue this PR fixes**: fixes #40070
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Base etcd-empty-dir-cleanup on busybox, run as nobody, and update to etcdctl 3.0.14
**What this PR does / why we need it**: since the `etcd-empty-dir-cleanup` image just uses a simple shell script and `etcdctl`, we can base it on busybox, which is a smaller target than alpine.
I've also updated this to use an `etcdctl` from etcd 3.0.14, which matches the version of etcd we're running in 1.6 clusters (I believe), and changed the tag to match the `etcdctl` version.
Tested in my own e2e cluster, where it seems to work.
I haven't pushed the image yet, so e2e tests *may* fail. Tagging `do-not-merge`; if you think this looks good, I'll push the image and retest.
**Release note**:
```release-note
```
cc @timstclair @mml @wojtek-t
Automatic merge from submit-queue
move kube-dns to a separate service account
Switches the kubedns addon to run as a separate service account so that we can subdivide RBAC permission for it. The RBAC permissions will need a little more refinement which I'm expecting to find in https://github.com/kubernetes/kubernetes/pull/38626 .
@cjcullen @kubernetes/sig-auth since this is directly related to enabling RBAC with subdivided permissions
@thockin @kubernetes/sig-network since this directly affects how kubedns is added.
```release-note
`kube-dns` now runs using a separate `system:serviceaccount:kube-system:kube-dns` service account which is automatically bound to the correct RBAC permissions.
```
Automatic merge from submit-queue (batch tested with PRs 39855, 41433, 41567, 41887, 41652)
Add fluentd monitoring to fluentd-gcp image
Right now we are not able to monitor the state of fluentd in cluster, which may result in logging subsystem quietly failing. This PR tries to address that problem by introducing the fluentd container monitoring:
* fluentd internal metrics, like number of buffers and number of data in buffers
* `logging_line_count`, number of lines, read by fluentd from application containers' logs
* Has `tag` label, corresponding to the fluentd tag of the entry
* `logging_entry_count`, number of entries, emitted to the output plugin
* With label `component` set to `container`, generated by application containers
* With label `component` set to `system`, generated by system components like kubelet, docker, scheduler, etc.
* Has `tag` label, corresponding to the fluentd tag of the entry
CC @fabxc @igorpeshansky @edsiper
Automatic merge from submit-queue (batch tested with PRs 41797, 41793, 41795, 41807, 41781)
Turn fluentd supervisor off for fluentd-gcp
By default, turn the fluentd supervisor off so that when the fluentd process fails, for example due to OOM, the container fails completely and the failure is easy to detect.
CC @igorpeshansky @qingling128
Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657)
Update kubectl in addon-manager to use HPA in autoscaling/v1
Addon-manager has been broken since HPA objects were removed from the extensions API group.
Came across the logs from [the latest addon-manager on Jenkins](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/4290/artifacts/bootstrap-e2e-master/kube-addon-manager.log):
```
INFO: == Entering periodical apply loop at 2017-02-16T17:33:37+0000 ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:38+0000. 2 tries remaining. ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:46+0000. 1 tries remaining. ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:53+0000. 0 tries remaining. ==
WRN: == Kubernetes addon update completed with errors at 2017-02-16T17:33:58+0000 ==
```
Note that commit f66679a4e9, which came in two weeks ago, removed HorizontalPodAutoscaler from extensions/v1beta1.
Addon-manager is now only partially functional: it can still create and update addons, but it fails to prune objects, which means upgrade tests may mostly fail.
Pushed another version of addon-manager with kubectl v1.6.0-alpha.2 ([released 2 days ago](https://github.com/kubernetes/kubernetes/releases/tag/v1.6.0-alpha.2)) to fix this, including the images below:
- gcr.io/google-containers/kube-addon-manager:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-amd64:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-arm:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-arm64:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-ppc64le:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-s390x:v6.4-alpha.2
@mikedanese
cc @wojtek-t @shyamjvs
Automatic merge from submit-queue
Bump fluentd-gcp google_cloud plugin version
Bump the version of `fluent-plugin-google-cloud` in fluentd-gcp image, because it's broken for version `0.5.2`.
Recently, gem `google-api-client` was updated to version `0.10.0`. The new version broke `fluent-plugin-google-cloud` which doesn't specify the upper version of `google-api-client` gem. I'm bumping the version used in our image to allow future changes in this release to be run and tested.
This PR doesn't bump the image version, since no effective changes have happened; that is left for the next PR.
CC @igorpeshansky
Automatic merge from submit-queue (batch tested with PRs 40000, 41508, 41489)
Add toleration to fluentd daemonset to make it run on master
Because of https://github.com/kubernetes/kubernetes/pull/41172, fluentd pods stopped being scheduled on the master node.
This PR introduces toleration for master taint for fluentd.
CC @davidopp @janetkuo @kubernetes/sig-scheduling-bugs
Unfortunately, we don't have e2e tests to ensure that master logs are being ingested. This problem is a great signal to work on https://github.com/kubernetes/kubernetes/issues/41411
Automatic merge from submit-queue
fluentd-gcp: Add kube-apiserver-audit.log.
**What this PR does / why we need it**:
Add `kube-apiserver-audit.log` from https://github.com/kubernetes/kubernetes/pull/41211 to fluentd config, so the audit log gets sent to the same place as `kube-apiserver.log`.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
We would like to backport this to release-1.5 also.
**Release note**:
```release-note
The apiserver audit log (`/var/log/kube-apiserver-audit.log`) will be sent through fluentd if enabled.
```
Automatic merge from submit-queue (batch tested with PRs 41357, 41178, 41280, 41184, 41278)
Switch RBAC subject apiVersion to apiGroup in v1beta1
When referencing a subject from an RBAC role binding, the API group and kind of the subject are needed to fully qualify the reference.
The version is not needed, and it adds complexity around rewriting the reference when returning the binding from different versions of the API, and when reconciling subjects.
This PR:
* v1beta1: change the subject `apiVersion` field to `apiGroup` (to match roleRef)
* v1alpha1: convert apiVersion to apiGroup for backwards compatibility
* all versions: add defaulting for the three allowed subject kinds
* all versions: add validation to the field so we can count on the data in etcd being good until we decide to relax the apiGroup restriction
```release-note
RBAC `v1beta1` RoleBinding/ClusterRoleBinding subjects changed `apiVersion` to `apiGroup` to fully-qualify a subject. ServiceAccount subjects default to an apiGroup of `""`, User and Group subjects default to an apiGroup of `"rbac.authorization.k8s.io"`.
```
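The defaulting rule from the release note can be sketched as below. This is an illustration over plain dictionaries, not the apiserver's defaulting code; the helper name `default_subject` is hypothetical.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"


def default_subject(subject):
    """Return a copy of an RBAC subject with apiGroup defaulted by kind."""
    defaults = {"ServiceAccount": "", "User": RBAC_GROUP, "Group": RBAC_GROUP}
    kind = subject.get("kind")
    if kind not in defaults:
        # Only the three allowed subject kinds are accepted.
        raise ValueError("unsupported subject kind: %r" % (kind,))
    result = dict(subject)
    # An explicitly set apiGroup is left untouched.
    result.setdefault("apiGroup", defaults[kind])
    return result
```

For example, a `User` subject without an `apiGroup` comes back with `rbac.authorization.k8s.io`, while a `ServiceAccount` subject comes back with the empty string.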
@deads2k @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41182, 41290)
Add a default storage class for Azure Disk
Part of https://github.com/kubernetes/kubernetes/issues/40071
@jsafrane @colemickens @codablock @rootfs
These files have been created lately, so we don't have much information
about them anyway, so let's just:
- Remove assignees and make them approvers
- Copy approvers as reviewers
Automatic merge from submit-queue (batch tested with PRs 36693, 40154, 40170, 39033)
make client-go authoritative for pkg/client/restclient
Moves client/restclient to client-go, along with util/certs and util/testing as transitive dependencies.
Automatic merge from submit-queue (batch tested with PRs 40168, 40165, 39158, 39966, 40190)
Include system:masters group in the bootstrap admin client certificate
Sets up the bootstrap admin client certificate for new clusters to be in the system:masters group
Removes the need for an explicit grant to the kubecfg user in e2e-bindings
```release-note
The default client certificate generated by kube-up now contains the superuser `system:masters` group
```
Automatic merge from submit-queue (batch tested with PRs 40003, 40017)
Remove library copying from fluentd image
It seems that fluentd can no longer copy systemd libraries from host to be able to read journals.
Automatic merge from submit-queue
Build release tars using bazel
**What this PR does / why we need it**: builds equivalents of the various kubernetes release tarballs, solely using bazel.
For example, you can now do
```console
$ make bazel-release
$ hack/e2e.go -v -up -test -down
```
**Special notes for your reviewer**: this is currently dependent on 3b29803eb5, which I have yet to turn into a pull request, since I'm still trying to figure out if this is the best approach.
Basically, the issue comes up with the way we generate the various server docker image tarfiles and load them on nodes:
* we `md5sum` the binary being encapsulated (e.g. kube-proxy) and save that to `$binary.docker_tag` in the server tarball
* we then build the docker image and tag using that md5sum (e.g. `gcr.io/google_containers/kube-proxy:$MD5SUM`)
* we `docker save` this image, which embeds the full tag in the `$binary.tar` file.
* on cluster startup, we `docker load` these tarballs, which are loaded with the tag that we'd created at build time. the nodes then use the `$binary.docker_tag` file to find the right image.
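The tagging steps above can be sketched as follows. This is only an illustration of the md5-based naming, not the build scripts themselves; the function name `docker_tag_for` and its default image name are assumptions taken from the example in the list.

```python
import hashlib


def docker_tag_for(binary_path, image="gcr.io/google_containers/kube-proxy"):
    """Return (md5 hex digest, full image tag) for the binary at binary_path."""
    digest = hashlib.md5()
    with open(binary_path, "rb") as f:
        # Read in 1 MiB chunks so large binaries are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    md5sum = digest.hexdigest()
    return md5sum, "%s:%s" % (image, md5sum)
```

The first value is what would be written to `$binary.docker_tag`; the second is the tag the node would later look up after `docker load`.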
With the current bazel `docker_build` rule, the tag isn't saved in the docker image tar, so the node is unable to find the image after `docker load`ing it.
My changes to the rule save the tag in the docker image tar, though I don't know if there are subtle issues with it. (Maybe we want to only tag when `--stamp` is given?)
Also, the docker images produced by bazel have the timestamp set to the unix epoch, which is not great for debugging. Might be another thing to change with a `--stamp`.
Long story short, we probably need to follow up with bazel folks on the best way to solve this problem.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Use $HOSTNAME as node.name by default
**What this PR does / why we need it**:
Makes it easier to identify elasticsearch instances.
As $HOSTNAME of a pod is unique, this should be no problem.
Automatic merge from submit-queue
Update images that use ubuntu-slim base image to :0.6
**What this PR does / why we need it**: `ubuntu-slim:0.4` is somewhat old, being based on Ubuntu 16.04, whereas `ubuntu-slim:0.6` is based on Ubuntu 16.04.1.
**Special notes for your reviewer**: I haven't pushed any of these images yet, so I expect all of the e2e builds to fail. If we're happy with the changes, I can push the images and then re-trigger tests.
**Release note**:
```release-note
NONE
```
cc @aledbf as FYI
Automatic merge from submit-queue
Update kubectl to stable version for Addon Manager
Bumps Addon Manager to v6.2. The following images have been pushed:
- gcr.io/google-containers/kube-addon-manager:v6.2
- gcr.io/google-containers/kube-addon-manager-amd64:v6.2
- gcr.io/google-containers/kube-addon-manager-arm:v6.2
- gcr.io/google-containers/kube-addon-manager-arm64:v6.2
- gcr.io/google-containers/kube-addon-manager-ppc64le:v6.2
- gcr.io/google-containers/kube-addon-manager-s390x:v6.2
@mikedanese
cc @ixdy
Automatic merge from submit-queue (batch tested with PRs 39803, 39698, 39537, 39478)
include bootstrap admin in super-user group, ensure tokens file is correct on upgrades
Fixes https://github.com/kubernetes/kubernetes/issues/39532
Possible issues with cluster bring-up scripts:
- [x] known_tokens.csv and basic_auth.csv are not rewritten if the files already exist
* new users (like the controller manager) are not available on upgrade
* changed users (like the kubelet username change) are not reflected
* group additions (like the addition of admin to the superuser group) don't take effect on upgrade
* this PR updates the token and basicauth files line-by-line to preserve user additions, but also ensure new data is persisted
- [x] existing 1.5 clusters may depend on more permissive ABAC permissions (or customized ABAC policies). This PR adds an option to enable existing ABAC policy files for clusters that are upgrading
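The line-by-line update of the token and basic-auth files can be sketched as below. The PR does this in the cluster bring-up scripts; this Python version is only an illustration and assumes rows of the form `secret,user,uid[,groups]`, keyed by the user name in the second column. The helper name `merge_token_lines` is hypothetical.

```python
import csv
import io


def merge_token_lines(existing_text, generated_text):
    """Merge CSV rows keyed by user name (2nd column); generated rows win."""
    def rows(text):
        # Skip blank or malformed rows that have no user column.
        return [r for r in csv.reader(io.StringIO(text)) if len(r) >= 2]

    merged = {}  # user -> row; dicts preserve insertion order
    for row in rows(existing_text) + rows(generated_text):
        merged[row[1]] = row
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(merged.values())
    return out.getvalue()
```

Rows for users that only exist in the old file are kept, rows for users present in both are replaced by the newly generated ones, and new users are appended.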
Follow-ups:
- [ ] both scripts are loading e2e role-bindings, which should only be loaded in e2e tests, not in normal kube-up scenarios
- [ ] when upgrading, set the option to use existing ABAC policy files
- [ ] update bootstrap superuser client certs to add superuser group? ("We also have a certificate that "used to be" a super-user. On GCE, it has CN "kubecfg", on GKE it's "client"")
- [ ] define (but do not load by default) a relaxed set of RBAC roles/rolebindings matching legacy ABAC, and document how to load that for new clusters that do not want to isolate user permissions
Automatic merge from submit-queue
Enable kubernetes_metadata by default for ELK stack
It looks like it was accidentally removed and was not restored in https://github.com/kubernetes/kubernetes/pull/29883
The plugin still exists in the image, but new ELK deployments don't let you index namespaces, pod names, etc.
Automatic merge from submit-queue (batch tested with PRs 39731, 39662, 39721)
Update dashboard version to v1.5.1
**What this PR does / why we need it**:
Latest Dashboard developments, including a fix for a CSRF issue in the dashboard POST handlers
**Release note**:
```
Set Dashboard UI version to v1.5.1
```
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)
Allow rolebinding/clusterrolebinding with explicit bind permission check
Fixes https://github.com/kubernetes/kubernetes/issues/39176
Fixes https://github.com/kubernetes/kubernetes/issues/39258
Allows creating/updating a rolebinding/clusterrolebinding if the user has explicitly been granted permission to perform the "bind" verb against the referenced role/clusterrole (previously, they could only bind if they already had all the permissions in the referenced role via an RBAC role themselves)
```release-note
To create or update an RBAC RoleBinding or ClusterRoleBinding object, a user must:
1. Be authorized to make the create or update API request
2. Be allowed to bind the referenced role, either by already having all of the permissions contained in the referenced role, or by having the "bind" permission on the referenced role.
```
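The two-step rule in the release note can be sketched as a small predicate. This is a simplified model, not the apiserver's authorizer: permissions are represented as sets of `(verb, resource)` pairs, and the function name `may_write_binding` is hypothetical.

```python
def may_write_binding(can_create_or_update, user_permissions,
                      role_permissions, has_bind):
    """Sketch of the check; permissions are sets of (verb, resource) pairs."""
    # Step 1: ordinary authorization for the create or update request.
    if not can_create_or_update:
        return False
    # Step 2: either the user already holds every permission in the role...
    covers_role = role_permissions <= user_permissions
    # ...or has been granted the "bind" verb on the referenced role.
    return covers_role or has_bind
```

A user who lacks some of the role's permissions is still allowed when `has_bind` is true, which is the new path this PR adds.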
Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)
fix e2e kubelet binding
Fixes #39543
This limits scope of the kubelet. It was an oversight before. Hopefully we won't end up chasing permissions again.
Automatic merge from submit-queue
Try parse golang logs by default
Glog logs to stderr by default, so Stackdriver Logging shows all of its messages as errors. This PR makes fluentd try to parse messages using the glog format and, if parsing succeeds, set the timestamp and severity accordingly.
CC @piosz @fgrzadkowski
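The parsing idea can be sketched in Python as below. The actual change lives in the fluentd configuration; this regex only mirrors the standard glog line header (`Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg`), and the names `GLOG_RE` and `parse_glog` are hypothetical.

```python
import re

GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<thread>\d+) "
    r"(?P<source>[^ \]]+)\] (?P<msg>.*)$"
)
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def parse_glog(line):
    """Return parsed fields, or None so the caller can keep the raw line."""
    m = GLOG_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Map the one-letter glog prefix to a severity name.
    fields["severity"] = SEVERITY[fields.pop("sev")]
    return fields
```

Lines that do not match are left for the caller to handle unchanged, which corresponds to the "try to parse" behavior described above. Note that the glog header carries no year, so a real implementation has to supply one.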
Automatic merge from submit-queue
Remove all MAINTAINER statements in the codebase as they are deprecated
**What this PR does / why we need it**:
ref: https://github.com/docker/docker/pull/25466
**Release note**:
```release-note
Remove all MAINTAINER statements in Dockerfiles in the codebase as they are deprecated by docker
```
@ixdy @thockin (who else should be notified?)
Automatic merge from submit-queue
Adds assignees for kube-dns
Adds assignees for auto-assigning. Does not add assignees for pkg/dns folder as we are moving it out.
@thockin
Automatic merge from submit-queue
Adds kubernetes.io link for dns autoscaler addon
The [official page for DNS Horizontal Autoscaling](http://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/) is available on kubernetes.io after 1.5 release. Putting the link into this dns autoscaler addon folder as well.
@bowei
Automatic merge from submit-queue (batch tested with PRs 39146, 39094)
cleanup last e2e authorization failures
Builds on https://github.com/kubernetes/kubernetes/pull/39080. This adds RBAC role bindings during e2e tests for tests that use SA permissions to loop back to the API server.
Assigned to me until it's ready.
Automatic merge from submit-queue
Make fluentd pods critical
Related to https://github.com/kubernetes/kubernetes/issues/38322
Make fluentd critical so that it is less likely to be evicted.
CC @piosz @fgrzadkowski
Automatic merge from submit-queue
Add liveness probe for fluentd-gcp
It's known that fluentd can hang during execution and stay hung until it is manually restarted.
The liveness probe addresses this problem in the following way: if no buffer chunks were sent or created in the last 5 minutes, fluentd is considered hung and should be restarted.
CC @piosz
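The probe's decision can be approximated with the sketch below. It is not the probe command from this PR: it assumes buffer chunks are files under a directory and uses their modification times as a stand-in for "sent or created", and it treats an empty directory as stuck. The function name `fluentd_looks_stuck` is hypothetical.

```python
import os
import time


def fluentd_looks_stuck(buffer_dir, max_idle_seconds=300, now=None):
    """True if no file under buffer_dir was modified within max_idle_seconds."""
    now = time.time() if now is None else now
    newest = None
    for root, _dirs, files in os.walk(buffer_dir):
        for name in files:
            try:
                mtime = os.path.getmtime(os.path.join(root, name))
            except OSError:
                continue  # chunk was flushed and removed during the scan
            newest = mtime if newest is None else max(newest, mtime)
    # Assumption: no buffer files at all also counts as stuck.
    return newest is None or now - newest > max_idle_seconds
```

A liveness probe built on this idea would fail (and trigger a restart) whenever the function returns true.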
Automatic merge from submit-queue
Coreos kube-up now with less cloud init
This update includes significant refactoring. It moves almost all of the
logic into bash scripts, modeled after the `gci` cluster scripts.
The reason to do this is:
1. Avoid duplicating the saltbase manifests by reusing gci's parsing logic (easier maintenance)
2. Take an incremental step towards sharing more code between gci/trusty/coreos, again for better maintenance
3. Pave the way for making future changes (e.g. improved rkt support, kubelet support) easier to share
The primary differences from the gci scripts are the following:
1. Use of the `/opt/kubernetes` directory over `/home/kubernetes`
2. Support for rkt as a runtime
3. No use of logrotate
4. No use of `/etc/default/`
5. No logic related to noexec mounts or gci-specific firewall-stuff
It will make sense to move 2 over to gci, as well as perhaps a few other small improvements. That will be a separate PR for ease of review.
Ref #29720, this is a part of that because it removes a copy of them.
Fixes #24165
cc @yifan-gu
Since this logic largely duplicates logic from the gci folder, it would be nice if someone closely familiar with that gave an OK or made sure I didn't fall into any gotchas related to that, so cc @andyzheng0831
Automatic merge from submit-queue
Use daemonset in docker registry add on
When using the registry add-on with a Kubernetes cluster, it makes sense to use a `daemonset` to bring up a pod on each node of the cluster. Right now the docs suggest bringing up a pod on each node manually by dropping the pod manifests into the directory `/etc/kubernetes/manifests`.
Automatic merge from submit-queue (batch tested with PRs 38760, 38213)
Avoid exporting fluentd-gcp own logs
To prevent fluentd from exporting its own logs, redirect the output to a file. Ability to read fluentd logs remains, but because these logs will not be exported, we can increase the verbosity of these logs.
Same change should be made for fluentd-es image.
CC @piosz
Use a daemonset to bring up a pod on each node of the cluster.
Right now the docs suggest bringing up a pod on each node by
manually dropping the pod manifests into the directory /etc/kubernetes/manifests.
Automatic merge from submit-queue (batch tested with PRs 38058, 38523)
Renames kube-dns configure files from skydns* to kubedns*
`skydns-` prefix and `-rc` suffix are confusing and misleading. Renaming it to `kubedns` in existing yaml files and scripts.
@bowei @thockin
Automatic merge from submit-queue (batch tested with PRs 37860, 38429, 38451, 36050, 38463)
[Part 2] Adding s390x cross-compilation support for gcr.io images in this repo
**What this PR does / why we need it**: This PR enables s390x support for kube-dns, pause, addon-manager, etcd, hyperkube, kube-discovery, etc. It also includes the changes needed to cross-compile on an x86 host.
**Which issue this PR fixes**: #34328
**Special notes for your reviewer**: In the existing file "build-tools/build-image/cross/Dockerfile", the repository used to install the cross-build toolchains for the supported architectures does not have a toolchain for s390x, so this PR changes the repository so that cross-compilation for s390x works.
**Release note**:
```
Allows cross compilation of Kubernetes for s390x on an x86 host; also enables s390x support for kube-dns, pause, addon-manager, etcd, hyperkube, kube-discovery, etc.
```
Automatic merge from submit-queue (batch tested with PRs 32663, 35797)
contribute deis/registry-proxy as a replacement for kube-registry-proxy
This PR is a proposal to replace the `kube-registry-proxy` addon code with [deis/registry-proxy](https://github.com/deis/registry-proxy). We have been running this component in production for several months ([since Workflow v2.3.0](15d4c1c298/workflow-v2.3.0/tpl/deis-registry-proxy-daemon.yaml)) without any issues.
There are several benefits that this proxy provides over the current implementation:
- it's the same code that is provided in [docker/distribution's contrib dir](https://github.com/docker/distribution/tree/master/contrib/compose) which I have personally used for both Docker v1 and v2 engine deployments without any issues
- the ability to [disable old Docker clients](https://github.com/deis/registry-proxy/blob/master/rootfs/etc/nginx/conf.d/default.conf.in#L19-L23) that are incompatible with the v2 registry
- better default connection timeouts, using best practices from the Docker community as a whole
- workarounds for bugs like https://github.com/docker/docker/issues/1486 (see https://github.com/deis/registry-proxy/blob/master/rootfs/etc/nginx/conf.d/default.conf.in#L15-L16)
Ways in which this PR differs from the current implementation:
- it's not HAProxy.
I'm not sure how the release process goes for this component, but I bumped the version to v0.4 and changed the maintainer to myself considering this is a massive overhaul. Please let me know if this is acceptable as a replacement or if we should perhaps consider this as an alternative implementation.
Happy Friday!
/etc/ssl/certs is currently mounted through in a number of places.
However, on Gentoo and CoreOS (and probably others), the files in
/etc/ssl/certs are just symlinks to files in /usr/share/ca-certificates.
For these components to work correctly, the target of the symlinks needs
to be available as well.
This is especially important for kube-controller-manager, where this
issue was noticed.
This change was originally part of #33965, but was split out for ease of
review.
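To see which additional directories a given host would need mounted, a sketch like the following could be used. It is not part of this change; it simply resolves the symlinks in a certificate directory and reports target directories outside it. The function name `extra_mount_dirs` is hypothetical.

```python
import os


def extra_mount_dirs(cert_dir):
    """Directories outside cert_dir that its symlinks resolve into."""
    cert_dir = os.path.realpath(cert_dir)
    extra = set()
    for name in os.listdir(cert_dir):
        path = os.path.join(cert_dir, name)
        if os.path.islink(path):
            # Follow the link fully and record where the real file lives.
            target_dir = os.path.dirname(os.path.realpath(path))
            if target_dir != cert_dir:
                extra.add(target_dir)
    return sorted(extra)
```

On a host laid out as described above, calling this on `/etc/ssl/certs` would be expected to report `/usr/share/ca-certificates` (or subdirectories of it).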
Automatic merge from submit-queue
Adds docs for dns-horizontal-autoscaler and kube-dns
Although we have separate docs on kubernetes.io, we should have a short description of the dns-horizontal-autoscaler addon in its folder.
Also updates kube-dns README with example command to scale kube-dns Deployment. This is needed because Addon Manager v6 has stricter reconcile behavior.
@bowei @bprashanth @thockin
Automatic merge from submit-queue
Unify fluentd-gcp configurations
There are two different configs and two different pod specs for the fluentd agent for GCL: one for GCI and one for CVM. This PR makes it possible to use only one config and only one pod spec.
CC @piosz
Automatic merge from submit-queue
Set strategy spec for kube-dns to support zero downtime rolling update
From #37728 and coreos/kube-aws#111.
Set `maxUnavailable` to 0 to prevent DNS service outage during update when the replica number is only 1.
Also keeps all kube-dns yaml files in sync.
@bowei @thockin
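The arithmetic behind the `maxUnavailable` setting can be sketched as follows. This is a simplified model of the rolling-update guarantee, with a hypothetical function name, and it ignores percentage values.

```python
def min_available_during_update(replicas, max_unavailable):
    """Lower bound on available pods that a rolling update must preserve."""
    return max(replicas - max_unavailable, 0)
```

With a single replica, `max_unavailable=1` allows the count of available pods to drop to 0 during an update, while `max_unavailable=0` keeps it at 1, so the old pod is only removed after its replacement is ready.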