Commit Graph

229 Commits (55682f2a555981d81af4d2262da93d4cf712e9ea)

Author SHA1 Message Date
Shyam Jeedigunta 4a57d68da0 Reduce hollow-kubelet cpu request 2017-08-09 14:30:10 +02:00
Shyam Jeedigunta 55527036b4 Fix bug in command retrying in kubemark 2017-07-24 17:18:50 +02:00
Shyam JVS f81d9ed9b9 Reduce hollow proxy mem/node 2017-07-21 03:57:03 +02:00
Shyam Jeedigunta 5f8cb3d9ff Enable logexporter mechanism to dump logs from k8s nodes to GCS directly 2017-07-12 14:39:49 +02:00
Ryan Hallisey f9354bf6b7 Add a README to the pre-existing provdier 2017-07-05 09:14:58 -04:00
Ryan Hallisey 82e1d208f6 Launch kubemark with an existing Kubemark Master
In order to expand the use of kubemark, allow developers to
use kubemark with a pre-existing Kubemark master.
2017-07-05 09:14:53 -04:00
gmarek 64f6606833 Make big clusters work again after introduction of subnets 2017-06-26 21:27:04 +02:00
Ajit Kumar caff16c678 Bump up npd version to v0.4.1 2017-06-22 13:13:50 -07:00
Kubernetes Submit Queue 037330c365 Merge pull request #47470 from gyuho/kubemark-etcd
Automatic merge from submit-queue

test/kubemark/resources: configure custom etcd endpoints

We want to stress our own etcd cluster with Kubernetes
workloads, using kubemark e2e tests. This PR adds a new
environment variable 'ETCD_SERVERS' to configure custom
etcd endpoints.

/cc @xiang90 @hongchaodeng
2017-06-14 12:10:06 -07:00
Kubernetes Submit Queue d8983699e0 Merge pull request #47389 from ixdy/kube-addon-manager-update
Automatic merge from submit-queue (batch tested with PRs 47302, 47389, 47402, 47468, 47459)

Update to kube-addon-manager:v6.4-beta.2: kubectl v1.6.4 and refreshed base images

**What this PR does / why we need it**: refreshes base images for kube-addon-manager with fixes for CVE-2016-9841 and CVE-2016-9843.

x-ref https://github.com/kubernetes/kubernetes/issues/47386

**Special notes for your reviewer**: the updated images are not yet pushed, so tests will fail until that's done.

**Release note**:

```release-note
```

/assign @MrHohn
2017-06-13 23:37:43 -07:00
Gyu-Ho Lee ab1ebbc79c test/kubemark/resources: configure custom etcd endpoints
We want to stress our own etcd cluster with Kubernetes
workloads, using kubemark e2e tests. This PR adds a new
environment variable 'ETCD_SERVERS' to configure custom
etcd endpoints.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-13 19:08:53 -07:00
Jeff Grafton eddf98d2c8 Update to kube-addon-manager:v6.4-beta.2: new kubectl and base images 2017-06-12 19:28:23 -07:00
Jordan Liggitt 1d9855474d
Enable Node authorizer and NodeRestriction admission in kubemark 2017-06-09 10:17:08 -04:00
Random-Liu 1d3979190c Bump up npd version to v0.4.0 2017-06-06 16:30:02 -07:00
Kubernetes Submit Queue c0407972e9 Merge pull request #46938 from shyamjvs/no-reprinting-gcloud-result
Automatic merge from submit-queue

Avoid double printing output of gcloud commands in kubemark

Just noticed we were unnecessarily echoing the result again.

/cc @wojtek-t
2017-06-06 04:07:55 -07:00
Kubernetes Submit Queue f842bb9987 Merge pull request #46802 from shyamjvs/npd-kernel-config
Automatic merge from submit-queue (batch tested with PRs 46972, 42829, 46799, 46802, 46844)

Add KernelDeadlock condition to hollow NPD

Ref https://github.com/kubernetes/kubernetes/issues/44701

/cc @wojtek-t @gmarek
2017-06-05 17:46:55 -07:00
Shyam Jeedigunta 163f1de5ed Avoid double printing output of gcloud commands in kubemark 2017-06-04 20:07:36 +02:00
Shyam Jeedigunta b655953e21 Enable DefaultTolerationSeconds and PodPreset admission plugins for kubemark 2017-06-04 19:52:57 +02:00
Clayton Coleman 4ce3907639
Add Initializers to all admission control paths by default 2017-06-02 22:09:04 -04:00
Shyam Jeedigunta 23ef37d9ce Add KernelDeadlock condition to hollow NPD 2017-06-01 22:23:28 +02:00
Sen Lu d237e54a24 Switch gcloud compute copy-files to scp 2017-05-30 10:19:33 -07:00
Shyam Jeedigunta 02092312bb Make kubemark scripts fail fast 2017-05-30 11:59:13 +02:00
Kubernetes Submit Queue 52337d5db6 Merge pull request #46502 from gmarek/run_kubemark_tests
Automatic merge from submit-queue

Fix kubemark/run-e2e-tests.sh

This should make most common arguments work.

cc @shyamjvs
2017-05-29 09:35:01 -07:00
gmarek 0ca6aeb95c Fix kubemark/run-e2e-tests.sh 2017-05-29 15:20:54 +02:00
Shyam Jeedigunta b72cbc074c chmod +x kubemark scripts 2017-05-26 22:03:12 +02:00
Kubernetes Submit Queue c60bc53921 Merge pull request #46434 from shyamjvs/kubemark-config-upload
Automatic merge from submit-queue (batch tested with PRs 46124, 46434, 46089, 45589, 46045)

Copy kubeconfig to kubemark master

This should save the effort of digging through jenkins agent and its container to get the kubeconfig.
Ideally we should have kubectl directly working on the kubemark master, but I'm facing some issues due to older version of kubectl present by default on the node.

cc @wojtek-t @gmarek
2017-05-25 21:39:59 -07:00
Shyam Jeedigunta 8f2b4c3b33 Copy kubeconfig to kubemark master 2017-05-25 14:55:28 +02:00
gmarek 2437cf4d59 fix type in start-kubemark 2017-05-25 11:48:01 +02:00
Kubernetes Submit Queue 1e2105808b Merge pull request #45136 from vishh/cos-nvidia-driver-install
Automatic merge from submit-queue

Enable "kick the tires" support for Nvidia GPUs in COS

This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User space libraries and debug utilities from the Nvidia driver installation are made available on the host in a special directory on the host -
* `/home/kubernetes/bin/nvidia/lib` for libraries
*  `/home/kubernetes/bin/nvidia/bin` for debug utilities

Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.

Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.

This PR updates the COS base image version to m59. This is coupled with this PR for the following reasons:
1. Driver installation requires disabling a kernel feature in COS. 
2. The kernel API for disabling this interface changed across COS versions
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post `3` PRs, one each for adding the basic installer, updating COS to m59, and then updating the installer again, this PR combines all the changes to reduce review overhead and latency, and additional noise that will be created when GPU tests break.

**Try out this PR**
1. Get Quota for GPUs in any region
2. `export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.

**Another option is to run a e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up` 
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.

TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
2017-05-23 10:46:10 -07:00
Kubernetes Submit Queue 03ba1324cf Merge pull request #46224 from gmarek/kubemark_heapster
Automatic merge from submit-queue (batch tested with PRs 46133, 46211, 46224, 46205, 45910)

Make CPU request for heapster in kubemark scale with the number of Nodes
2017-05-22 15:50:03 -07:00
gmarek 27fc7be396 Make CPU request for heapster in kubemark scale with the number of Nodes 2017-05-22 16:20:27 +02:00
Vishnu kannan 86b5edb79a Update COS version to m59
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-05-20 21:17:19 -07:00
Shyam Jeedigunta 360054a75f Add script to dump kubemark master logs 2017-05-20 13:12:38 +02:00
Kubernetes Submit Queue a1c2db2fec Merge pull request #45950 from shyamjvs/revert-proxier
Automatic merge from submit-queue

Make real proxier in hollow-proxy optional (default=true)

Ref https://github.com/kubernetes/kubernetes/pull/45622
This allows using real proxier for hollow proxy, but we use the fake one by default.

cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
2017-05-18 07:55:09 -07:00
Shyam Jeedigunta 804a4f558c Make usage of real proxier in hollow-proxy optional (default=true) 2017-05-18 14:30:12 +02:00
Michael Taufen 2ee2ec5e21 Remove the deprecated --babysit-daemons kubelet flag 2017-05-17 09:08:57 -07:00
Shyam Jeedigunta 87cde074f8 Minor fix in run-gcloud-compute-with-retries output piping 2017-05-15 13:39:10 +02:00
Shyam Jeedigunta 0f1d5e6e36 Remove kubemark.sh as we don't use pod IP from it anymore 2017-05-12 13:47:13 +02:00
Kubernetes Submit Queue e939019900 Merge pull request #45604 from shyamjvs/start-km-master-fix
Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)

Minor bug fix in start-kubemark-master script

cc @wojtek-t @gmarek
2017-05-10 21:34:41 -07:00
Shyam Jeedigunta 1078e9580c Minor bug fix in start-kubemark-master script 2017-05-10 19:51:14 +02:00
Shyam Jeedigunta 1fc831e0ec Fix bug in hollow-node deletion in stop-kubemark script 2017-05-10 12:57:43 +02:00
Shyam Jeedigunta 0759289dcf Stream output of run-gcloud-compute-with-retries to stdout in realtime 2017-05-09 13:44:48 +02:00
Shyam Jeedigunta 2e800eef20 Fix add-metadata command for kubemark master 2017-05-08 20:44:20 +02:00
Shyam Jeedigunta efc84378b8 Fix gcloud retries cmd to rightly capture return code 2017-05-08 19:34:26 +02:00
Shyam Jeedigunta 395d3bf3b4 Move hollow-node's initContainer from annotation to field 2017-05-04 11:41:33 +02:00
Dan Williams b3705b6e35 hack/cluster: consolidate cluster/ utils to hack/lib/util.sh
Per Clayton's suggestion, move stuff from cluster/lib/util.sh to
hack/lib/util.sh.  Also consolidate ensure-temp-dir and use the
hack/lib/util.sh implementation rather than cluster/common.sh.
2017-03-30 22:34:46 -05:00
Kubernetes Submit Queue 4f606b9c8d Merge pull request #42820 from MrHohn/addon-kubemark-v6.4-beta.1
Automatic merge from submit-queue (batch tested with PRs 42672, 42770, 42818, 42820, 40849)

kubemark test: Bump addon-manager to v6.4-beta.1

Follow up PR of #42760. This PR bumps addon-manager to v6.4-beta.1 for kubemark test.

**Release note**:

```release-note
NONE
```
2017-03-25 14:27:27 -07:00
Piotr Szczesniak 69fd7aafd0 Bumped Heapster to v1.3.0 2017-03-17 15:45:52 +01:00
Random-Liu c4b3fd4e63 Update npd to the official v0.3.0 release. 2017-03-15 14:26:12 -07:00
Zihong Zheng 34b8d008ec kubemark test: Bump addon-manager to v6.4-beta.1 2017-03-09 10:13:07 -08:00