Commit Graph

5673 Commits (578d9fcf638f33a043af79e6a0f7c9444f81a456)

Author SHA1 Message Date
Kubernetes Submit Queue 3ff99a8381 Merge pull request #46149 from cjcullen/logtoggle
Automatic merge from submit-queue

Allow the /logs handler on the apiserver to be toggled.

Adds a flag to kube-apiserver, and plumbs through en environment variable in configure-helper.sh
2017-05-23 15:19:08 -07:00
Kubernetes Submit Queue 1e2105808b Merge pull request #45136 from vishh/cos-nvidia-driver-install
Automatic merge from submit-queue

Enable "kick the tires" support for Nvidia GPUs in COS

This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User space libraries and debug utilities from the Nvidia driver installation are made available on the host in a special directory on the host -
* `/home/kubernetes/bin/nvidia/lib` for libraries
*  `/home/kubernetes/bin/nvidia/bin` for debug utilities

Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.

Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.

This PR updates the COS base image version to m59. This is coupled with this PR for the following reasons:
1. Driver installation requires disabling a kernel feature in COS. 
2. The kernel API for disabling this interface changed across COS versions
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post `3` PRs, one each for adding the basic installer, updating COS to m59, and then updating the installer again, this PR combines all the changes to reduce review overhead and latency, and additional noise that will be created when GPU tests break.

**Try out this PR**
1. Get Quota for GPUs in any region
2. `export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.

**Another option is to run a e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up` 
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.

TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
2017-05-23 10:46:10 -07:00
Kubernetes Submit Queue 4871f4a75b Merge pull request #45637 from xilabao/hide-api-version
Automatic merge from submit-queue

remove --api-version
2017-05-23 06:15:45 -07:00
Kubernetes Submit Queue 2718429e4f Merge pull request #45952 from harryge00/update-es-image
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

remove the elasticsearch template

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```
NONE
```
Loading file-based index template has been disabled since 2.0.0-beta1 version of Elasticsearch.  https://www.elastic.co/guide/en/elasticsearch/reference/2.0/breaking_20_index_api_changes.html#_file_based_index_templates 

So the `template-k8s-logstash.json` is not longer useful.

On the other hand, as https://github.com/kubernetes/kubernetes/issues/25127 indicated, we might better curl the elasticsearch API to load this template.
2017-05-22 20:58:01 -07:00
CJ Cullen 9dca164ddd Allow the /logs handler on the apiserver to be toggled.
Change-Id: Ibf173b7f85cf7fffe8482eaee74fb77da2b2588b
2017-05-22 14:37:24 -07:00
Kubernetes Submit Queue c4229be7ad Merge pull request #46035 from crassirostris/fluentd-config-version-bump
Automatic merge from submit-queue

Add version for fluentd-gcp config

Fluentd-gcp config should be versioned, because otherwise during the update race can happen and the new pod can mount the old config
2017-05-22 02:08:20 -07:00
Vishnu kannan 333e571bee update default project to cos-cloud in gce configs
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-05-20 21:21:23 -07:00
Vishnu kannan 86b5edb79a Update COS version to m59
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-05-20 21:17:19 -07:00
Vishnu kannan 1e77594958 Adding an installer script that installs Nvidia drivers in Container Optimized OS
Packaged the script as a docker container stored in gcr.io/google-containers
A daemonset deployment is included to make it easy to consume the installer
A cluster e2e has been added to test the installation daemonset along with verifying installation
by using a sample CUDA application.
Node e2e for GPUs updated to avoid running on nodes without GPU devices.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-05-20 21:17:19 -07:00
Kubernetes Submit Queue a9d0403858 Merge pull request #38169 from caseydavenport/calico-daemonset
Automatic merge from submit-queue

Update Calico add-on

**What this PR does / why we need it:**

Updates Calico to the latest version using self-hosted install as a DaemonSet, removes Calico's dependency on etcd.

- [x] Remove [last bits of Calico salt](175fe62720/cluster/saltbase/salt/calico/master.sls (L3))
- [x] Failing on the master since no kube-proxy to access API.
- [x] Fix outgoing NAT
- [x] Tweak to work on both debian / GCI (not just GCI)
- [x] Add the portmap plugin for host port support

Maybe:
- [ ] Add integration test

**Which issue this PR fixes:**

https://github.com/kubernetes/kubernetes/issues/32625

**Try it out**

Clone the PR, then:

```
make quick-release
export NETWORK_POLICY_PROVIDER=calico
export NODE_OS_DISTRIBUTION=gci
export MASTER_SIZE=n1-standard-4
./cluster/kube-up.sh 
```

**Release note:**

```release-note
The Calico version included in kube-up for GCE has been updated to v2.2.
```
2017-05-19 19:38:59 -07:00
Kubernetes Submit Queue d3aa925c01 Merge pull request #46038 from dnardo/ip-masq-agent
Automatic merge from submit-queue (batch tested with PRs 44606, 46038)

Add ip-masq-agent addon to the addons folder. 

This also ensures that under gce we add this DaemonSet if the non-masq-cidr
is set to 0/0.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
Add ip-masq-agent addon to the addons folder which is used in GCE if  --non-masquerade-cidr is set to 0/0
```
2017-05-19 11:52:09 -07:00
Daniel Nardo 96ae34685e Add ip-masq-agent addon to the addons folder. This also
ensures that under gce we add this daemonset if the non-masq-cidr
is set to 0/0.
2017-05-19 08:43:55 -07:00
Marcin Wielgus 2f4cb6bfe7 Use integer comparisons instead of string comparisons in autoscaler config validation 2017-05-19 14:50:55 +02:00
xilabao e0b4f3f73c remove --api-version 2017-05-19 10:56:35 +08:00
Mik Vyatskov a6ccc89541 Add version for fluentd-gcp config 2017-05-18 16:59:05 +02:00
Kubernetes Submit Queue a1c2db2fec Merge pull request #45950 from shyamjvs/revert-proxier
Automatic merge from submit-queue

Make real proxier in hollow-proxy optional (default=true)

Ref https://github.com/kubernetes/kubernetes/pull/45622
This allows using real proxier for hollow proxy, but we use the fake one by default.

cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
2017-05-18 07:55:09 -07:00
Shyam Jeedigunta 804a4f558c Make usage of real proxier in hollow-proxy optional (default=true) 2017-05-18 14:30:12 +02:00
Kubernetes Submit Queue 0765740eb9 Merge pull request #46002 from bowei/ip-alias-to-beta
Automatic merge from submit-queue

Update cluster startup scripts to use gcloud beta for alias IP support

The feature has gone from alpha to beta.

```release-note
NONE
```
2017-05-18 02:05:45 -07:00
Bowei Du 7febdde22a Update cluster startup scripts to use gcloud beta for alias IP support
The feature has gone from alpha to beta.
2017-05-17 16:26:48 -07:00
Casey Davenport 63744a819f Update Calico add-on 2017-05-17 15:04:08 -07:00
Kubernetes Submit Queue 0c25199117 Merge pull request #45953 from maciaszczykm/patch-2
Automatic merge from submit-queue

Update dashboard-controller.yaml

**What this PR does / why we need it**: Updates Dashboard addon to newest version. Changelog can be found at https://github.com/kubernetes/dashboard/releases/tag/v1.6.1.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
Update Dashboard version to 1.6.1
```
2017-05-17 13:19:32 -07:00
Michael Taufen 2ee2ec5e21 Remove the deprecated --babysit-daemons kubelet flag 2017-05-17 09:08:57 -07:00
Marcin Maciaszczyk 5a0aef05b8 Update dashboard-controller.yaml 2017-05-17 14:12:12 +02:00
haoyuan d3fd956dac remove the elasticsearch template 2017-05-17 19:20:14 +08:00
Kubernetes Submit Queue 35e563d70c Merge pull request #45771 from magreiner/fix-heatversion
Automatic merge from submit-queue

fix: required openstack heat version for conditions is 2016-10-14 / newton

This fix sets the required heat version to 2016-10-14.

In OpenStack heat the conditions statement was introduced in version 2016-10-14 | newton, accourding to: 
https://docs.openstack.org/releasenotes/heat/newton.html
and more specific:
https://docs.openstack.org/developer/heat/template_guide/hot_spec.html

The conditions are used to make the assignment of public ips / floating ips optional (added in commit 4eef540876). However this template is not compatible with OpenStack heat releases prior newton and produces the following error:

```
ERROR: Failed to validate: : resources.kube_minions: : "condition" is not a valid keyword inside a output definition
```

PR without a release note:
```release-note
NONE
```
2017-05-17 02:22:49 -07:00
Kubernetes Submit Queue ec415a12d2 Merge pull request #45119 from dims/set-default-host-path-as-provisioner
Automatic merge from submit-queue (batch tested with PRs 45860, 45119, 44525, 45625, 44403)

Support running StatefulSetBasic e2e tests with local-up-cluster

**What this PR does / why we need it**:

Currently StatefulSet(s) fail when you use local-up-cluster without
setting a cloud provider. In this PR, we use set the
kubernetes.io/host-path provisioner as the default provisioner when
there CLOUD_PROVIDER is not specified. This enables e2e test(s)
(specifically StatefulSetBasic) to work.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-05-16 16:14:51 -07:00
Kubernetes Submit Queue 0cf7fd85e4 Merge pull request #45860 from mml/instance-templates-delete
Automatic merge from submit-queue

Add --quiet to instance-templates delete.
2017-05-16 15:52:13 -07:00
Kubernetes Submit Queue 1e6061b9ec Merge pull request #45763 from piosz/es-owners
Automatic merge from submit-queue

Added coffeepac to ElasticSearch owners

@coffeepac

@fgrzadkowski, could you please add @coffeepac to Kubernetes org?
2017-05-16 12:22:59 -07:00
Kubernetes Submit Queue ba69aa9c09 Merge pull request #45832 from juju-solutions/gkk/fix-e2e-lint
Automatic merge from submit-queue (batch tested with PRs 44337, 45775, 45832, 45574, 45758)

Fix lint failures on kubernetes-e2e charm

**What this PR does / why we need it**:

This fixes a test failure on the kubernetes-e2e charm relating to tox and flake8:

```DEBUG🏃/bin/sh: 1: flake8: not found```

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

This is a follow-up to https://github.com/kubernetes/kubernetes/pull/45494 where the same thing was done for kubernetes-master.

**Release note**:

```release-note
Fix lint failures on kubernetes-e2e charm
```
2017-05-15 18:39:16 -07:00
Kubernetes Submit Queue eee8598ff9 Merge pull request #44337 from koep/master
Automatic merge from submit-queue (batch tested with PRs 44337, 45775, 45832, 45574, 45758)

Refactor gcr.io/google_containers/elasticsearch to alpine

**What this PR does / why we need it**:
This reduces the image size of the gcr.io/google_containers/elasticsearch image.

Before:
```
REPOSITORY                                                                       TAG                    IMAGE ID            CREATED             SIZE
gcr.io/google_containers/elasticsearch                                           v2.4.1-2               6941e43df81a        4 weeks ago         419MB
```
After:
```
REPOSITORY                                                                       TAG                    IMAGE ID            CREATED             SIZE
gcr.io/google_containers/elasticsearch                                           v2.4.1-2               24ad40c21a52        About an hour ago   178MB
```

**Special notes for your reviewer**:
I used a workaround to make the elasticsearch_logging_discovery binary work with alpine. (See [stackoverflow](https://stackoverflow.com/questions/34729748/installed-go-binary-not-found-in-path-on-alpine-linux-docker/35613430#35613430)). Alternatively this can be solved by setting ```CGO_ENABLED=0```when compiling the binary. I didn't feel comfortable chaing the Makefile though, since I'm no golang expert.  Feedback wanted!
2017-05-15 18:39:07 -07:00
Matt Liggett 5dd4a5d56b Add --quiet to instance-templates delete.
Otherwise it hangs waiting for confirmation.
2017-05-15 16:26:11 -07:00
George Kraft d50b69442e Fix lint failures on kubernetes-e2e charm 2017-05-15 13:22:55 -05:00
Christian Koep df80b76d1b
Refactor gcr.io/google_containers/elasticsearch to alpine
Signed-off-by: Christian Koep <christiankoep@gmail.com>
2017-05-15 17:52:39 +02:00
Kubernetes Submit Queue fd5146f193 Merge pull request #45494 from ktsakalozos/bug/fix-lint
Automatic merge from submit-queue (batch tested with PRs 45070, 45821, 45732, 45494, 45789)

Fix lint errors in juju kubernetes master and e2e charms

**What this PR does / why we need it**: Fixes style error in the Juju charms

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```
Code style fixes in Juju charms
```
2017-05-15 07:49:57 -07:00
Matthias Greiner 0688c3c6a9 fix: required openstack heat version for conditions is 2016-10-14 / newton 2017-05-13 17:12:45 +00:00
Piotr Szczesniak da8f82cbd0 Added coffeepac to ElasticSearch owners 2017-05-13 07:48:09 +02:00
Kubernetes Submit Queue cb9074c418 Merge pull request #45730 from shyamjvs/remove-kubemark-sh
Automatic merge from submit-queue (batch tested with PRs 45653, 45719, 45729, 45730, 44250)

Remove kubemark.sh as we don't use pod IP from it anymore

This has been pending for sometime now. We no longer seem to actually depend on the downwarp api for the pod IP (hollow-proxy for example now gets it using api call).

cc @wojtek-t @gmarek
2017-05-12 12:12:48 -07:00
Kubernetes Submit Queue fa30eb1dc2 Merge pull request #45734 from crassirostris/fluentd-gcp-export-process-start
Automatic merge from submit-queue

Export process start time metric in fluentd-gcp

For correct ingestion of cumulative metrics fluentd-gcp exposes.
2017-05-12 10:57:43 -07:00
Kubernetes Submit Queue f8d5c63eda Merge pull request #45720 from shyamjvs/remove-waste
Automatic merge from submit-queue

Remove unused file cluster/images/kubemark/build-kubemark.sh

It's irrelevant and we don't seem to use/need it anymore.

cc @wojtek-t @gmarek
2017-05-12 10:57:34 -07:00
Mik Vyatskov dcd3ce3bcb Export process start time metric in fluentd-gcp 2017-05-12 16:37:36 +02:00
Shyam Jeedigunta 0f1d5e6e36 Remove kubemark.sh as we don't use pod IP from it anymore 2017-05-12 13:47:13 +02:00
Shyam Jeedigunta f65c80cc8c Remove unused file cluster/images/kubemark/build-kubemark.sh 2017-05-12 13:14:38 +02:00
Kubernetes Submit Queue b58a1b5601 Merge pull request #45715 from gmarek/fluentd_toleration
Automatic merge from submit-queue (batch tested with PRs 45691, 45667, 45698, 45715)

Add general NoExecute Toleration to fluentd in gcp configuration

Ref #44445

Once merged I'll create a cherry-pick that will be picked up in GKE together with the next patch release.

cc @JorritSalverda @davidopp @aveshagarwal @nimeshksingh @piosz 

```release-note
fluentd will tolerate all NoExecute Taints when run in gcp configuration.
```
2017-05-12 04:09:45 -07:00
Kubernetes Submit Queue 3b9a90ae79 Merge pull request #45684 from bowei/kube-dns-update
Automatic merge from submit-queue

Update kube-dns version to 1.14.2

```release-note
Updates kube-dns to 1.14.2

- Support kube-master-url flag without kubeconfig
- Fix concurrent R/Ws in dns.go
- Fix confusing logging when initialize server
- Fix printf in cmd/kube-dns/app/server.go
- Fix version on startup and --version flag
- Support specifying port number for nameserver in stubDomains
```
2017-05-12 03:13:38 -07:00
gmarek 4d7d6b72b3 Add general NoExecute Toleration to fluentd in gcp configuration 2017-05-12 11:23:23 +02:00
Brandon Philips b9a96272f7 images: hyperkube: README: add a note about REGISTRY variable
The REGISTRY variable is pretty helpful for people who are hacking on hyperkube. Document it here instead of just in the Makefile.
2017-05-11 17:24:23 -07:00
Bowei Du 1c223c8e1b Update kube-dns version to 1.14.2
Changes:

- Support kube-master-url flag without kubeconfig
- Fix concurrent R/Ws in dns.go
- Fix confusing logging when initialize server
- Fix printf in cmd/kube-dns/app/server.go
- Fix version on startup and --version flag
- Support specifying port number for nameserver in stubDomains
2017-05-11 12:29:00 -07:00
Kubernetes Submit Queue 4b2ab4e116 Merge pull request #45550 from jacekn/fix45547
Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)

Don't append :443 to registry domain in the kubernetes-worker layer registry action

**What this PR does / why we need it**: Fixes #45547

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #45547

**Special notes for your reviewer**:

**Release note**:

```
Fix #45547 - don't append :443 to juju created docker registry config
```
2017-05-10 21:34:45 -07:00
Kubernetes Submit Queue a507d30833 Merge pull request #45602 from dashpole/enable_memcg_for_all_tests
Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)

Enable kernel memcg notification for node and cluster GCI/COS testing.

Sets --experimental-kernel-memcg-notification=true when running on the GCI/COS image.  It sets this for master and nodes for cluster e2e tests, and for the node in node e2e tests.

Issue #42676 

cc @dchen1107 @Random-Liu
2017-05-10 21:34:39 -07:00
Ryan Hitchman 62235c3bb8 Fix ip-alias testing.
IP aliases are an alpha feature, and node accelerators are a beta
feature. $gcloud determines which is appropriate.

Before, this would try to run "gcloud alpha beta", which is incoherent.
2017-05-10 12:10:17 -07:00