Enabling full aggregator functionality in kubemark tests.
This includes configuring it to work on GCE (we seem to assume GCE in our kubemark tests).
It also includes setting up the relevant security and auth config.
Removing unneeded reference to CA key for MHBauer.
Fixed to pull the "parsed" values for the certs.
Fix from shyamjvs.
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)
implement proposal 34058: hostPath volume type
**What this PR does / why we need it**:
implement proposal #34058
**Which issue this PR fixes**: fixes #46549
**Special notes for your reviewer**:
cc @thockin @luxas @euank PTAL
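As a hedged illustration of the feature (the pod name, image, and paths here are placeholders, not from this PR), the new `type` field on a hostPath volume lets the kubelet validate the host path before mounting:
```
# Illustrative pod using the hostPath volume `type` field. "Directory"
# makes the kubelet check that /var/log exists and is a directory;
# other values include DirectoryOrCreate, File, FileOrCreate, Socket,
# CharDevice, and BlockDevice.
kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-type-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: logs
      mountPath: /host-logs
  volumes:
  - name: logs
    hostPath:
      path: /var/log
      type: Directory
EOF
```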
Automatic merge from submit-queue (batch tested with PRs 50485, 49951, 50508, 50511, 50506)
Pass config to external Kubemark cluster in e2e tests
When cluster autoscaler is used in kubemark tests,
pass the default kubeconfig as the external cluster config.
@shyamjvs @gmarek
**Release note**:
```
NONE
```
Automatic merge from submit-queue
test/kubemark/resources: configure custom etcd endpoints
We want to stress our own etcd cluster with Kubernetes
workloads, using kubemark e2e tests. This PR adds a new
environment variable 'ETCD_SERVERS' to configure custom
etcd endpoints.
/cc @xiang90 @hongchaodeng
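A minimal sketch of using the new variable (the endpoint URLs are placeholders):
```
# Point the kubemark master at an externally managed etcd cluster
# instead of the one started by the kubemark scripts.
export ETCD_SERVERS="http://10.0.0.10:2379,http://10.0.0.11:2379"
test/kubemark/start-kubemark.sh
```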
Automatic merge from submit-queue (batch tested with PRs 47302, 47389, 47402, 47468, 47459)
Update to kube-addon-manager:v6.4-beta.2: kubectl v1.6.4 and refreshed base images
**What this PR does / why we need it**: refreshes base images for kube-addon-manager with fixes for CVE-2016-9841 and CVE-2016-9843.
x-ref https://github.com/kubernetes/kubernetes/issues/47386
**Special notes for your reviewer**: the updated images are not yet pushed, so tests will fail until that's done.
**Release note**:
```release-note
```
/assign @MrHohn
Automatic merge from submit-queue
Avoid double printing output of gcloud commands in kubemark
Just noticed we were unnecessarily echoing the result again.
/cc @wojtek-t
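A hedged sketch of the pattern being fixed (the command shown is illustrative, not the actual script line):
```
# The result is captured and printed once here...
result=$(gcloud compute instances list --format="value(name)")
echo "${result}"
# ...and a second, redundant echo of the same result further down is
# what this PR removes:
# echo "${result}"
```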
Automatic merge from submit-queue (batch tested with PRs 46124, 46434, 46089, 45589, 46045)
Copy kubeconfig to kubemark master
This should save the effort of digging through the Jenkins agent and its container to get the kubeconfig.
Ideally we should have kubectl working directly on the kubemark master, but I'm facing some issues due to the older version of kubectl present by default on the node.
cc @wojtek-t @gmarek
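A hedged sketch of the copy step, assuming GCE (the instance name, zone, and destination path are placeholders):
```
# Copy the local kubeconfig onto the kubemark master so kubectl can be
# run there directly against the hollow cluster.
gcloud compute scp "${HOME}/.kube/config" \
  "kubemark-master:/home/kubernetes/kubeconfig" \
  --zone="us-central1-b" --project="${PROJECT}"
```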
Automatic merge from submit-queue
Enable "kick the tires" support for Nvidia GPUs in COS
This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User-space libraries and debug utilities from the Nvidia driver installation are made available in a special directory on the host:
* `/home/kubernetes/bin/nvidia/lib` for libraries
* `/home/kubernetes/bin/nvidia/bin` for debug utilities
Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.
Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.
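As a hedged illustration (the image, in-container mount path, and alpha GPU resource name are assumptions; the host paths come from the list above), a CUDA pod might consume the host-installed libraries like this:
```
# Illustrative CUDA pod mounting the Nvidia libraries installed on the
# COS host via a HostPath volume.
kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: cuda-demo
spec:
  containers:
  - name: cuda-app
    image: gcr.io/my-project/cuda-app:latest   # placeholder image
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1      # 1.6-era alpha resource
    volumeMounts:
    - name: nvidia-libs
      mountPath: /usr/local/nvidia/lib         # assumed mount point
  volumes:
  - name: nvidia-libs
    hostPath:
      path: /home/kubernetes/bin/nvidia/lib
EOF
```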
This PR also updates the COS base image version to m59, which is coupled with the GPU changes for the following reasons:
1. Driver installation requires disabling a kernel feature in COS.
2. The kernel API for disabling this interface changed across COS versions.
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post three PRs, one each for adding the basic installer, updating COS to m59, and then updating the installer again, this PR combines all the changes to reduce review overhead, latency, and the additional noise that would be created when GPU tests break.
**Try out this PR**
1. Get Quota for GPUs in any region
2. `export KUBE_GCE_ZONE=<zone-with-gpus> KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.
**Another option is to run an e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. `export KUBE_GCE_ZONE=<zone-with-gpus> KUBE_NODE_OS_DISTRIBUTION=gci`
3. `export NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up`
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.
TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
Automatic merge from submit-queue (batch tested with PRs 46133, 46211, 46224, 46205, 45910)
Make CPU request for heapster in kubemark scale with the number of Nodes
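A hedged sketch of the idea; the baseline and per-node increment below are invented for illustration and are not the values chosen in this PR:
```
# Derive heapster's CPU request from the number of hollow nodes instead
# of hard-coding it. Constants are illustrative only.
NUM_NODES="${NUM_NODES:-100}"
HEAPSTER_CPU_BASE_MILLI=80      # assumed baseline millicores
HEAPSTER_CPU_PER_NODE_MILLI=1   # assumed millicores per node
HEAPSTER_CPU_REQUEST="$((HEAPSTER_CPU_BASE_MILLI + HEAPSTER_CPU_PER_NODE_MILLI * NUM_NODES))m"
echo "heapster CPU request: ${HEAPSTER_CPU_REQUEST}"
```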
Automatic merge from submit-queue
Make real proxier in hollow-proxy optional (default=true)
Ref https://github.com/kubernetes/kubernetes/pull/45622
This allows using the real proxier for hollow-proxy, but we use the fake one by default.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
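A hedged sketch of toggling the behavior on a hollow node; the flag names are assumptions inferred from the PR title, not verified against the code:
```
# Hypothetical invocation: run the hollow node as a proxy with the real
# proxier enabled rather than the fake one.
./hollow-node \
  --morph=proxy \
  --use-real-proxier=true
```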
Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)
Minor bug fix in start-kubemark-master script
cc @wojtek-t @gmarek
Per Clayton's suggestion, move stuff from cluster/lib/util.sh to
hack/lib/util.sh. Also consolidate ensure-temp-dir and use the
hack/lib/util.sh implementation rather than cluster/common.sh.
Automatic merge from submit-queue (batch tested with PRs 42672, 42770, 42818, 42820, 40849)
kubemark test: Bump addon-manager to v6.4-beta.1
Follow-up to PR #42760. This PR bumps addon-manager to v6.4-beta.1 for the kubemark test.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42456, 42457, 42414, 42480, 42370)
Update npd in kubemark since #42201 is merged.
Revert https://github.com/kubernetes/kubernetes/pull/41716.
#42201 has been merged, and #41713 is fixed. Now we can retry updating npd in kubemark.
/cc @shyamjvs @wojtek-t @dchen1107
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)
Modified kubemark startup scripts to restore master on reboot
Fixes #41735
As discussed in the issue, this modifies the scripts to restore the master env, run non-idempotent operations only on first boot, and persist important data like the pki/auth files on a PD.
Also attached `start-kubemark-master.sh` as startup-script metadata to master instance (on GCE) so that it is called automatically on each boot.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
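A hedged sketch of attaching the script as startup-script metadata on GCE (the instance name and zone are placeholders):
```
# Make start-kubemark-master.sh run on every boot of the kubemark
# master by attaching it as GCE startup-script metadata.
gcloud compute instances add-metadata "kubemark-master" \
  --zone="us-central1-b" \
  --metadata-from-file startup-script=test/kubemark/resources/start-kubemark-master.sh
```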
Automatic merge from submit-queue (batch tested with PRs 42316, 41618, 42201, 42113, 42191)
[Kubemark] Add init container in hollow node for setting inotify limit of node to 200
Fixes #41713
Along with adding the init container, I also changed the manifest from JSON to YAML, since otherwise the entire init-container annotation would have to be on a single line (with escaped characters); JSON doesn't allow multi-line strings.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek @Random-Liu
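The init container itself is small; a sketch in spec form for readability (the PR adds it via the init-container annotation, as noted above):
```
# Fragment of the hollow-node manifest (YAML): raise the node's inotify
# limit before the hollow node starts; sysctl writes need privilege.
cat <<EOF
initContainers:
- name: init-inotify-limit
  image: busybox
  command: ["sysctl", "-w", "fs.inotify.max_user_instances=200"]
  securityContext:
    privileged: true
EOF
```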
Automatic merge from submit-queue (batch tested with PRs 40746, 41699, 42108, 42174, 42093)
Avoid fake node names in user info
Node usernames should follow the format `system:node:<node-name>`,
but if we don't know the node name, it's worse to put a fake one in.
In the future, we plan to have a dedicated node authorizer, which would
start rejecting requests from a user with a bogus node name like this.
The right approach is to either mint correct credentials per node, or use node bootstrapping so it requests a correct client certificate itself.
Automatic merge from submit-queue (batch tested with PRs 41962, 42055, 42062, 42019, 42054)
Update flag to --check-version-skew instead of --check_version_skew
https://github.com/kubernetes/test-infra/issues/2012
Also add a `--` to send the flags to kubetest without triggering a warning.
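For example, a hedged sketch of the resulting invocation:
```
# Flags after the bare `--` are passed through to kubetest, so the
# renamed flag reaches it without e2e.go warning about it.
go run hack/e2e.go -- --check-version-skew=false
```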
Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657)
Update kubectl in addon-manager to use HPA in autoscaling/v1
Addon-manager is broken since HPA objects were removed from the extensions API group.
Came across the logs from [the latest addon-manager on Jenkins](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/4290/artifacts/bootstrap-e2e-master/kube-addon-manager.log):
```
INFO: == Entering periodical apply loop at 2017-02-16T17:33:37+0000 ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:38+0000. 2 tries remaining. ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:46+0000. 1 tries remaining. ==
error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource
WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:53+0000. 0 tries remaining. ==
WRN: == Kubernetes addon update completed with errors at 2017-02-16T17:33:58+0000 ==
```
Notice that this commit (f66679a4e9), which removed HorizontalPodAutoscaler from extensions/v1beta1, came in two weeks ago.
Addon-manager is now only partially functional: it can successfully create and update addons, but fails to prune objects, which means upgrade tests may mostly fail.
Pushed another version of addon-manager with kubectl v1.6.0-alpha.2 ([released 2 days ago](https://github.com/kubernetes/kubernetes/releases/tag/v1.6.0-alpha.2)) to fix this, including the images below:
- gcr.io/google-containers/kube-addon-manager:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-amd64:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-arm:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-arm64:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-ppc64le:v6.4-alpha.2
- gcr.io/google-containers/kube-addon-manager-s390x:v6.4-alpha.2
@mikedanese
cc @wojtek-t @shyamjvs
Automatic merge from submit-queue (batch tested with PRs 41706, 39063, 41330, 41739, 41576)
[Kubemark] Add option to log hollow-node logs
Ref https://github.com/kubernetes/kubernetes/issues/41613
Added an option to log kubemark hollow-node logs, which include the kubelet, kube-proxy, and npd logs for each hollow node.
Setting the env var `ENABLE_HOLLOW_NODE_LOGS=true` should now enable logging for tests.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek @yujuhong @Random-Liu
Automatic merge from submit-queue
Fix kubemark default e2e test suite's name
Seems like the suite "[Feature:performance]" doesn't trigger tests anymore. Changed it to "[Feature:Performance]" in kubemark run-e2e-tests.sh.
cc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 41134, 41410, 40177, 41049, 41313)
Refactored kubemark code into provider-specific and provider-independent parts [Part-3]
Fixes#38967
Applying the final part of the changes in PR #39033 (which refactored kubemark code completely). The changes included in this PR are:
- Removed `test/kubemark/common.sh` and moved the relevant parts of its code to the right places in the start-kubemark/stop-kubemark scripts.
- Added DOCKER_REGISTRY, PROJECT, and KUBEMARK_IMAGE_MAKE_TARGET variables to `/test/kubemark/cloud-provider-config.sh` to make the kubemark image push location vary with the provider.
- Removed get-real-pod-for-hollow-node.sh as it doesn't seem to do anything useful.
@kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)
Secure kube-scheduler
This PR:
* Adds a bootstrap `system:kube-scheduler` clusterrole
* Adds a bootstrap clusterrolebinding to the `system:kube-scheduler` user (sketched after this list)
* Sets up a kubeconfig for kube-scheduler on GCE (following the controller-manager pattern)
* Switches kube-scheduler to running with kubeconfig against secured port (salt changes, beware)
* Removes superuser permissions from kube-scheduler in local-up-cluster.sh
* Adds detailed RBAC deny logging
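As a hedged illustration of the binding's shape (the actual bootstrap policy is defined in code, not applied by hand):
```
# Sketch of the bootstrap clusterrolebinding granting the
# system:kube-scheduler ClusterRole to the kube-scheduler user.
kubectl create -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: system:kube-scheduler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-scheduler
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:kube-scheduler
EOF
```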
```release-note
On kube-up.sh clusters on GCE, kube-scheduler now contacts the API on the secured port.
```