Commit Graph

48812 Commits (b4ddf4720d72952041392341950f6987e908d007)

Author SHA1 Message Date
Andy Goldstein 3b69884843 Use storage instead of REST for the CRD finalizer
Switch the custom resource definition finalizer controller to use
storage instead of a REST client, because a client could incorrectly try
to delete ThirdPartyResources whose names happen to collide with the
CustomResource instances.
2017-05-23 14:14:55 -04:00
Kubernetes Submit Queue 1e2105808b Merge pull request #45136 from vishh/cos-nvidia-driver-install
Automatic merge from submit-queue

Enable "kick the tires" support for Nvidia GPUs in COS

This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User space libraries and debug utilities from the Nvidia driver installation are made available in a special directory on the host:
* `/home/kubernetes/bin/nvidia/lib` for libraries
*  `/home/kubernetes/bin/nvidia/bin` for debug utilities

Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.
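
As a rough, hypothetical sketch (not part of this PR) of what that consumption could look like: the image name, in-container mount paths, and GPU resource request below are assumptions; only the host paths come from this PR.

```
# Hypothetical sketch: image, in-container mount paths, and the GPU resource
# request are assumptions; the hostPath directories are the ones populated by
# the installer daemonset in this PR.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-example
spec:
  containers:
  - name: cuda-app
    image: example.com/cuda-app:latest       # hypothetical CUDA application image
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1    # alpha GPU resource name of this era (assumption)
    volumeMounts:
    - name: nvidia-libraries
      mountPath: /usr/local/nvidia/lib       # hypothetical path inside the container
    - name: nvidia-debug-tools
      mountPath: /usr/local/nvidia/bin       # hypothetical path inside the container
  volumes:
  - name: nvidia-libraries
    hostPath:
      path: /home/kubernetes/bin/nvidia/lib
  - name: nvidia-debug-tools
    hostPath:
      path: /home/kubernetes/bin/nvidia/bin
```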

Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.

This PR updates the COS base image version to m59. This is coupled with this PR for the following reasons:
1. Driver installation requires disabling a kernel feature in COS. 
2. The kernel API for disabling this interface changed across COS versions
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post `3` PRs (one each for adding the basic installer, updating COS to m59, and then updating the installer again), this PR combines all the changes to reduce review overhead and latency, and to avoid the additional noise that would be created when GPU tests break.

**Try out this PR**
1. Get Quota for GPUs in any region
2. `export KUBE_GCE_ZONE=<zone-with-gpus> KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.

**Another option is to run an e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. `export KUBE_GCE_ZONE=<zone-with-gpus> KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up` 
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.

TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
2017-05-23 10:46:10 -07:00
Kubernetes Submit Queue 9ebfe9662f Merge pull request #46286 from zjj2wry/timstamps-timestamps
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)

fix typo in kubelet

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-05-23 10:29:58 -07:00
Kubernetes Submit Queue 1602e2a338 Merge pull request #45587 from foxish/pdb-maxunavailab
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)

PDB Max Unavailable Field

Completes https://github.com/kubernetes/features/issues/285

```release-note
Adds a MaxUnavailable field to PodDisruptionBudget
```
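
For illustration only, a minimal sketch of a `PodDisruptionBudget` using the new field might look like the following; the object name and selector are made up:

```
# Illustrative sketch: the name and selector are made up.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: example-pdb
spec:
  # The new field added by this PR; an absolute number or a percentage,
  # used instead of (not together with) minAvailable.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: example
```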


Individual commits are self-contained; the last commit can be ignored because it is autogenerated code.
cc @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews
2017-05-23 10:29:56 -07:00
Nick Sardo f40f45abc1 Defer test stop & cleanup 2017-05-23 10:11:46 -07:00
Tim St. Clair 4c98cab4db Update audit API with missing pieces 2017-05-23 09:55:00 -07:00
Random-Liu 5f0288e022 Double `StopContainer` request timeout. 2017-05-23 09:35:48 -07:00
Andy Goldstein d1a0384678 GC: allow ignored resources to be customized
Allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
2017-05-23 12:05:09 -04:00
Andy Goldstein d30fb0d9d5 GC: update required verbs for deletable resources
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
2017-05-23 12:00:10 -04:00
Humble Chirammal 8700776d26 Add CephFS volume source to describe printer.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2017-05-23 20:27:00 +05:30
Anirudh 48d76edc74 PDB MaxUnavailable: Generated 2017-05-23 07:42:24 -07:00
Kubernetes Submit Queue 8e07e61a43 Merge pull request #46223 from smarterclayton/scheduler_max
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)

Scheduler should use a shared informer, and fix broken watch behavior for cached watches

Can be used either from a true shared informer or a local shared
informer created just for the scheduler.

Fixes a bug in the cache watcher where we were returning the "current" object from a watch event, not the historic event. This means that we broke behavior when introducing the watch cache. This may have API implications for filtering watch consumers, but on the other hand it prevents filtering clients from incorrectly seeing objects outside of their watch, which can lead to other subtle bugs.

```release-note
The behavior of some watch calls to the server when filtering on fields was incorrect.  If watching objects with a filter, when an update was made that no longer matched the filter a DELETE event was correctly sent.  However, the object that was returned by that delete was not the (correct) version before the update, but instead, the newer version.  That meant the new object was not matched by the filter.  This was a regression from behavior between cached watches on the server side and uncached watches, and thus broke downstream API clients.
```
2017-05-23 07:42:00 -07:00
Kubernetes Submit Queue 1f45c4846b Merge pull request #45766 from sttts/sttts-audit-event-in-context
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)

Audit: fill audit.Event in handler chain

Related:
- external API types https://github.com/kubernetes/kubernetes/pull/45315
- policy checker https://github.com/kubernetes/kubernetes/pull/46009

Decisions:
- ~~[ ] decide whether we want to send an event before `WriteHeader` https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38664161~~ Follow-up described in https://github.com/kubernetes/kubernetes/pull/46065/files#r117438531
- [ ] decide how to handle `AuditID`s and the IP chain https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38659371. Is the variant in the proposal (https://github.com/kubernetes/community/pull/625) final? Then we need the API type update.
- ~~[ ] decide how to mark intermediate/incomplete events? set a special reason in `ResponseStatus.Reason` vs. having extra fields for that `Event.NonFinal` https://github.com/kubernetes/kubernetes/pull/45766#discussion_r116795888~~ Follow-up of #46065
- [ ] decide whether and how to protect the `Audit-Level` header https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38937691

TODOs:
- ~~[ ] move `AuditIDHeader`, `AuditLevelHeader` to types https://github.com/kubernetes/kubernetes/pull/45766#discussion_r117064094, @timstclair for the type PR~~ Follow-up of https://github.com/kubernetes/kubernetes/pull/46065
- [x] add SourceIP/ForwardedFor support https://github.com/kubernetes/kubernetes/pull/45766#discussion_r116778101
- [x] adapt ObjectReference.Resource to API PR https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38656828
2017-05-23 07:41:56 -07:00
Anirudh 63e51dc66e PDB MaxUnavailable: e2e tests 2017-05-23 07:18:44 -07:00
Anirudh 078f9566d9 PDB MaxUnavailable: kubectl changes 2017-05-23 07:18:44 -07:00
Anirudh ce48d4fb5c PDB MaxUnavailable: Disruption Controller Changes 2017-05-23 07:18:44 -07:00
Anirudh 2b0de599a7 PDB MaxUnavailable: API changes 2017-05-23 07:18:43 -07:00
Kubernetes Submit Queue 4a1483efda Merge pull request #46216 from deads2k/owners-02-tighten
Automatic merge from submit-queue

tighten and simplify owners in some staging repos

With the move to staging, we can have much cleaner owners across the related packages. This pares down the list of OWNERS to better match code and activity. It should help get PRs directed to people who are more active and familiar with the areas, for quicker review.

@kubernetes/sig-api-machinery-misc 
@lavalamp @smarterclayton ptal.
2017-05-23 06:15:54 -07:00
Kubernetes Submit Queue 4871f4a75b Merge pull request #45637 from xilabao/hide-api-version
Automatic merge from submit-queue

remove --api-version
2017-05-23 06:15:45 -07:00
zhengjiajin c79b0c797f fix typo in kubelet 2017-05-23 19:54:10 +08:00
Yassine TIJANI a348a4e881 removing this todo after discussion (#46027) 2017-05-23 13:34:14 +02:00
Matt Potter 743cc5d685 autogen BUILD file 2017-05-23 11:37:48 +01:00
Matt Potter ae102d64c4 refactor to use sets.String 2017-05-23 11:37:48 +01:00
Matt Potter b8c0314861 deduplicate endpoints before DNS registration 2017-05-23 11:37:48 +01:00
Dr. Stefan Schimanski 9fdc36a47a Update bazel 2017-05-23 11:20:14 +02:00
Dr. Stefan Schimanski ce942d19c3 audit: wire through non-nil context everywhere 2017-05-23 11:20:14 +02:00
Dr. Stefan Schimanski 0b5bcb0219 audit: add audit event to the context and fill in handlers 2017-05-23 11:20:14 +02:00
Dr. Stefan Schimanski c1bf6e832e apiserver: move LongRunningRequestCheck type into endpoints/request 2017-05-23 11:20:13 +02:00
Kubernetes Submit Queue 8bee44b65f Merge pull request #46234 from wojtek-t/faster_selflink
Automatic merge from submit-queue (batch tested with PRs 46060, 46234)

Speedup generating selflinks for list and watch requests

I've seen profiles where GenerateSelflink was 8-9% of the whole CPU usage of the apiserver (profiles over 30s). Most of this was spent in getting RequestInfo from the context and creating the context.

This PR changes the API of the GenerateLink method of the namer, which results in computing the context and requestInfo only once per LIST/WATCH request (instead of computing them for every single returned element of the LIST/WATCH).

@smarterclayton @deads2k - can one of you please take a look?
2017-05-23 01:41:57 -07:00
Kubernetes Submit Queue 7e75998233 Merge pull request #46060 from MrHohn/fix-serviceregistry-externaltraffic
Automatic merge from submit-queue (batch tested with PRs 46060, 46234)

Randomize test nodePort to prevent collision

Fix #37982.

/assign @bowei 

**Release note**:

```release-note
NONE
```
2017-05-23 01:41:55 -07:00
Kubernetes Submit Queue 286bcc6f5c Merge pull request #45995 from humblec/glusterfs-mount-3
Automatic merge from submit-queue

Add `auto_unmount` mount option for glusterfs fuse mount.

libfuse has an auto_unmount option which, if enabled, ensures that
the file system is unmounted at FUSE server termination by running a
separate monitor process that performs the unmount when that occurs.
(This feature would probably better be called "robust auto-unmount",
as FUSE servers usually do try to unmount their file systems upon
termination, it's just this mechanism is not crash resilient.)
This change implements that option and behavior for glusterfs.

This option will only be supported for clients with version >3.11.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2017-05-23 00:29:41 -07:00
Kubernetes Submit Queue 455e9fff09 Merge pull request #46176 from vmware/vSphereStoragePolicySupport
Automatic merge from submit-queue

vSphere storage policy support for dynamic volume provisioning

Until now, the vSphere cloud provider has provided support for configuring persistent volumes with VSAN storage capabilities (kubernetes#42974). Right now this only works with VSAN.

There might also be other use cases:

- The user might need a way to configure a policy on other datastores like VMFS, NFS etc.
- Use Storage IO control, VMCrypt policies for a persistent disk.

We can achieve the above 2 use cases by using existing storage policies which are already created on vCenter using the Storage Policy Based Management (SPBM) service. The user will specify the SPBM policy ID as part of dynamic provisioning:

- The resultant persistent volume will have the policy configured with it.
- The persistent volume will be created on the compatible datastore that satisfies the storage policy requirements. 
- If there are multiple compatible datastores, the datastore with the max free space would be chosen by default.
- If the user specifies the datastore along with the storage policy ID, the volume will be created on this datastore if it is compatible. If the user-specified datastore is incompatible, it will error out with the reasons for incompatibility to the user.
- Also, the user will be able to see the associations of persistent volume object with the policy on the vCenter once the volume is attached to the node.

For instance, in the below example, the volume will be created on a compatible datastore with the max free space that satisfies the "Gold" storage policy requirements.

```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
       name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
      diskformat: zeroedthick
      storagepolicyName: Gold
```

For instance, in the below example, the vSphere CP checks if "VSANDatastore" is compatible with the "Gold" storage policy requirements. If yes, the volume will be provisioned on "VSANDatastore"; otherwise it will error out that "VSANDatastore" is not compatible, with the exact reason for the failure.

```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
       name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
      diskformat: zeroedthick
      storagepolicyName: Gold
      datastore: VSANDatastore
```

As a part of this change, 4 commits have been added to this PR.

1. Vendor changes for vmware/govmomi
2. Changes to the VsphereVirtualDiskVolumeSource in the Kubernetes API, adding 2 additional fields: StoragePolicyName and StoragePolicyID.
3. Swagger and Open spec API changes.
4. vSphere Cloud Provider changes to implement the storage policy support.

**Release note**:


```release-note
vSphere cloud provider: vSphere Storage policy Support for dynamic volume provisioning
```
2017-05-22 23:41:10 -07:00
jianglingxia adc0faa2c4 fix the invalid link 2017-05-23 14:36:43 +08:00
Kubernetes Submit Queue 3bfae793f0 Merge pull request #46008 from NickrenREN/openstack-add-metric
Automatic merge from submit-queue

Recording openstack metrics

add openstack operation metrics


**Release note**:
```release-note
Add support for emitting metrics from openstack cloudprovider about storage operations.
```

/assign @gnufied
2017-05-22 21:54:02 -07:00
Kubernetes Submit Queue 644a544d62 Merge pull request #46062 from alexandercampbell/correct-deprecation-errors
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

kubectl: fix deprecation warning bug

**What this PR does / why we need it**:

Some kubectl commands were deprecated but would fail to print the
correct warning message when a flag was given before the command name.

	# Correctly prints the warning that "resize" is deprecated and
	# "scale" is now preferred.
	kubectl resize [...]

	# Should print the same warning but no warning is printed.
	kubectl --v=1 resize [...]

This was due to a fragile check on os.Args[1].

This commit implements a new function deprecatedCmd() that is used to
construct new "passthrough" commands which are marked as deprecated and
hidden.

Note that there is an existing "filters" system that may be preferable
to the system created in this commit. I'm not sure why the "filters"
array was not used for all deprecated commands in the first place.

**Release note**:

```release-note
NONE
```
2017-05-22 20:58:07 -07:00
Kubernetes Submit Queue 31bd852ec1 Merge pull request #46247 from marun/fed-override-etcd-default-image
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

[Federation][kubefed]: Add support for etcd image override

This PR adds support for overriding the default etcd image used by ``kubefed init`` by providing an argument to ``--etcd-image``.  This is primarily intended to allow consumers like openshift to provide a different default, but as a nice side-effect supports code-free validation of non-default etcd images. 

**Release note**:

```release-note
'kubefed init' now supports overriding the default etcd image name with the --etcd-image parameter.
```
cc: @kubernetes/sig-federation-pr-reviews
2017-05-22 20:58:05 -07:00
Kubernetes Submit Queue cc6e51c6e8 Merge pull request #45427 from ncdc/gc-shared-informers
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

Use shared informers in gc controller if possible

Modify the garbage collector controller to try to use shared informers for resources, if possible, to reduce the number of unique reflectors listing and watching the same thing.

cc @kubernetes/sig-api-machinery-pr-reviews @caesarxuchao @deads2k @liggitt @sttts @smarterclayton @timothysc @soltysh @kargakis @kubernetes/rh-cluster-infra @derekwaynecarr @wojtek-t @gmarek
2017-05-22 20:58:03 -07:00
Kubernetes Submit Queue 2718429e4f Merge pull request #45952 from harryge00/update-es-image
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

remove the elasticsearch template

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```
NONE
```
Loading file-based index templates has been disabled since version 2.0.0-beta1 of Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/2.0/breaking_20_index_api_changes.html#_file_based_index_templates

So `template-k8s-logstash.json` is no longer useful.

On the other hand, as https://github.com/kubernetes/kubernetes/issues/25127 indicated, it might be better to curl the Elasticsearch API to load this template.
2017-05-22 20:58:01 -07:00
Kubernetes Submit Queue 6f5193593d Merge pull request #46201 from wojtek-t/address_kubeproxy_todos
Automatic merge from submit-queue

Address remaining TODOs in kube-proxy.

Followup PR from the previous two.
2017-05-22 20:54:14 -07:00
Kubernetes Submit Queue 99a8f7c303 Merge pull request #43590 from dashpole/eviction_complete_deletion
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)

Eviction does not evict unless the previous pod has been cleaned up

Addresses #43166
This PR makes two main changes:
First, it makes the eviction loop re-trigger immediately if there may still be pressure. This way, if we already waited 10 seconds to delete a pod, we don't need to wait another 10 seconds for the next synchronize call.
Second, it waits for the pod to be cleaned up (including volumes, cgroups, etc), before moving on to the next synchronize call.  It has a timeout for this operation currently set to 30 seconds.
2017-05-22 20:00:03 -07:00
Kubernetes Submit Queue c586f36e55 Merge pull request #46209 from wojtek-t/remove_iptables_save
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)

Remove Save() from iptables interface

This is what @thockin requested in one of the reviews.
2017-05-22 20:00:00 -07:00
Kubernetes Submit Queue c6cf666fa1 Merge pull request #45308 from fabianofranz/more_cmd_sanity_checks
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)

More cli sanity verifications

Adds some more `kubectl` command sanity checks to improve consistency and avoid the need for code reviews for some of our CLI style and standards.

**Release note**:

```release-note
NONE
```
@kubernetes/sig-cli-pr-reviews
2017-05-22 19:59:59 -07:00
Kubernetes Submit Queue bb56937b92 Merge pull request #46055 from deads2k/crd-01-embed
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)

embed kube-apiextensions inside of kube-apiserver

To reduce operational complexity, we decided to include the kube-apiextensions-server inside of kube-apiserver (https://github.com/kubernetes/community/blob/master/sig-api-machinery/api-extensions-position-statement.md#q-should-kube-aggregator-be-a-separate-binaryprocess-than-kube-apiserver). With the API reasonably well established and a finalizer about to merge, I think it's time to add ourselves.

This pull wires kube-apiextensions-server ahead of the TPRs so that one will replace the other if both are added by accident (CRDs should have priority) and wires a controller for automatic aggregation.

WIP because I still need tests: unit test for controller, test-cmd test to mirror the TPR test.


```release-note
Adds the `CustomResourceDefinition` (crd) types to the `kube-apiserver`.  These are the successors to `ThirdPartyResource`.  See https://github.com/kubernetes/community/blob/master/contributors/design-proposals/thirdpartyresources.md for more details.
```
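
As a hedged illustration of the new types (the group, version, and names below are made up, not from this PR), registering a custom resource might look roughly like this:

```
# Illustrative sketch: group, version, and names are made up.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  # name must be <plural>.<group>
  name: crontabs.example.com
spec:
  group: example.com
  version: v1
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
```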
2017-05-22 19:59:57 -07:00
Kubernetes Submit Queue e823e60bbf Merge pull request #46022 from xilabao/add-rolebinding-to-describe-command
Automatic merge from submit-queue

add rolebinding/clusterrolebinding to describe.go

**What this PR does / why we need it**:

```
./cluster/kubectl.sh describe clusterrolebinding system:kube-dns
Name:		system:kube-dns
Labels:		kubernetes.io/bootstrapping=rbac-defaults
Annotations:	rbac.authorization.kubernetes.io/autoupdate=true
Role:
  Kind:	ClusterRole
  Name:	system:kube-dns
Subjects:
  Kind			Name		Namespace
  ----			----		---------
  ServiceAccount	kube-dns	kube-system
```

**Which issue this PR fixes**: 

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-05-22 19:59:20 -07:00
Balu Dontu eb3cf509e5 SPBM policy ID support in vsphere cloud provider 2017-05-22 19:45:17 -07:00
Balu Dontu 668fa94ccb Open API and swagger spec changes 2017-05-22 19:45:02 -07:00
System Administrator 83520a7470 Kubernetes core API changes for vSphere 2017-05-22 19:43:29 -07:00
Balu Dontu 23ee1745d3 PBM govmomi dependencies 2017-05-22 19:43:10 -07:00
xilabao d555b1e265 fix err message in storage extensions 2017-05-23 10:22:01 +08:00
Kubernetes Submit Queue 199465c3a5 Merge pull request #43663 from shiywang/quato
Automatic merge from submit-queue (batch tested with PRs 38990, 45781, 46225, 44899, 43663)

Fix command exec -- COMMAND can not contain spaces

Fixes https://github.com/kubernetes/kubernetes/issues/7688
the problem is that when you execute the command
`cluster/kubectl.sh exec -p client-blue-8yw37 -c client -i -t -- 'ls -t /usr'`
the args are
[`client-blue-8yw37`, `ls -t /usr`]
**instead of**
[`client-blue-8yw37`, `ls`, `-t`, `/usr`],
so I added a warning. @kubernetes/sig-cli-pr-reviews, wdyt?
cc @ymqytw @adohe @fabianofranz
2017-05-22 19:07:12 -07:00