Commit Graph

454 Commits (c582a37cae02b4d1a850e6668ea8ea0a81dcd204)

Author SHA1 Message Date
Justin Santa Barbara d420531f95 volumes: SetNodeStatusUpdateNeeded on error
If an error happened during the UpdateNodeStatuses loop, there were some
code paths where we would not call SetNodeStatusUpdateNeeded, leaking
the state.  Add it to all paths by adding a function.

Part of #40583
2017-06-01 00:32:20 -04:00
NickrenREN e62bdf82da Extract volume relevant events reason
Extract volume relevant events reason and make them const
2017-05-31 22:11:36 +08:00
Shyam Jeedigunta 4425864707 Migrate kubelet configmap management logic to an interface 2017-05-31 10:39:36 +02:00
Kubernetes Submit Queue 0aad9d30e3 Merge pull request #44897 from msau42/local-storage-plugin
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)

Local storage plugin

**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes.  Scheduler predicate will direct already-bound PVCs to the node that the local PV is at.  PVC binding still happens independently.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 
Part of #43640

**Release note**:

```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume.  These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
2017-05-30 23:20:02 -07:00
Kubernetes Submit Queue c77b74e328 Merge pull request #46486 from NickrenREN/pv-provisioner-check
Automatic merge from submit-queue

Optimize provisioner plugin result check logic

If err is not returned by findProvisionablePlugin(...), storageClass is certainly not nil


**Release note**:

```release-note
NONE
```
2017-05-29 05:31:38 -07:00
Kubernetes Submit Queue c34b359bd7 Merge pull request #45923 from verult/cxing/NodeStatusUpdaterFix
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)

Node status updater now deletes the node entry in attach updates...

… when node is missing in NodeInformer cache.

- Added RemoveNodeFromAttachUpdates as part of node status updater operations.



**What this PR does / why we need it**: Fixes issue of unnecessary node status updates when node is deleted.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42438

**Special notes for your reviewer**: Unit tested added, but a more comprehensive test involving the attach detach controller requires certain testing functionality that is currently absent, and will require larger effort. Will be added at a later time.

There is an edge case caused by the following steps:
1) A node is deleted and restarted. The node exists, but is not yet recognized by Kubernetes.
2) A pod requiring a volume attach with nodeName specifically set to this node.

This would make the pod stuck in ContainerCreating state. This is low-pri since it's a specific edge case that can be avoided.

**Release note**:

```release-note
NONE
```
2017-05-26 12:58:02 -07:00
NickrenREN 51e2336476 Optimize provisioner plugin result check logic
If err is not returned by findProvisionablePlugin(...), storageClass is certainly not nil
2017-05-26 13:57:40 +08:00
Cheng Xing f9dc2d5ca3 Node status updater now deletes the node entry in attach updates when node is missing in NodeInformer cache. Fixes #42438.
- Added RemoveNodeFromAttachUpdates as part of node status updater operations.
2017-05-24 18:31:47 -07:00
NickrenREN add091b1fb fix regression in UX experience for double attach volume
send event when volume is not allowed to multi-attach
2017-05-25 09:27:24 +08:00
Justin Santa Barbara 35be997c2f volumes: promote some logs from info -> warning
Part of #40583
2017-05-23 22:36:42 -04:00
Michelle Au 6ade5461ad Add GetNodeLabels to VolumeHost interface 2017-05-22 14:44:06 -07:00
Alexander Block 06baeb33b2 Don't try to attach volumes which are already attached to other nodes 2017-05-18 06:56:30 +02:00
Kubernetes Submit Queue 6dbe853e29 Merge pull request #45544 from ianchakeres/reconciler-err-cleanup
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)

Refactor reconciler volume log and error messages

**What this PR does / why we need it**:
Utilizes volume-specific error and log messages introduced in #44969, inside files that also log volume information. 

Specifically: 

- pkg/kubelet/volumemanager/reconciler/reconciler.go, 
- pkg/controller/volume/attachdetach/reconciler/reconciler.go, and
- pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go


**Which issue this PR fixes** : fixes #40905

**Special notes for your reviewer**:

**Release note**:

```release-note
```
NONE
2017-05-17 18:40:51 -07:00
Kubernetes Submit Queue 521d7d1ac5 Merge pull request #42472 from timchenxiaoyu/requesttypo
Automatic merge from submit-queue

fix request typo
2017-05-15 15:57:57 -07:00
Ian Chakeres b1315f4491 Refactor reconciler volume log and error messages 2017-05-11 22:33:17 -07:00
Hemant Kumar 951a36aac7 Add Keepterminatedpodvolumes as a annotation on node
and lets make sure that controller respects it
and doesn't detaches mounted volumes.
2017-05-11 22:31:14 -04:00
Hemant Kumar 9a1a9cbe08 detach the volume when pod is terminated
Make sure volume is detached when pod is terminated because
of any reason and not deleted from api server.
2017-05-11 22:18:22 -04:00
NickrenREN 0861688237 add and clear err message in RemoveVolumeFromReportAsAttached 2017-05-08 09:37:21 +08:00
Jan Safranek 9d0c47f1db Use storage.v1 instead of v1beta1
storage.v1beta1 was used to work around GKE which does not expose v1. Now that
GKE is updated, we can switch everything to v1.
2017-04-25 10:13:38 +02:00
Tomas Smetana 3064fe4d39 Fix issue #44757: Flaky Test_AttachDetachControllerRecovery 2017-04-21 12:43:54 +02:00
Kubernetes Submit Queue db53e80445 Merge pull request #39732 from tsmetana/issue_34242
Automatic merge from submit-queue (batch tested with PRs 44722, 44704, 44681, 44494, 39732)

Fix issue #34242: Attach/detach should recover from a crash

When the attach/detach controller crashes and a pod with attached PV is deleted afterwards the controller will never detach the pod's attached volumes. To prevent this the controller should try to recover the state from the nodes status and figure out which volumes to detach. This requires some changes in the volume providers too: the only information available from the nodes is the volume name and the device path. The controller needs to find the correct volume plugin and reconstruct the volume spec just from the name. This required a small change also in the volume plugin interface.

Fixes Issue #34242.
cc: @jsafrane @jingxu97
2017-04-20 16:01:04 -07:00
Tomas Smetana 852c44ae59 Fix issue #34242: Attach/detach should recover from a crash
When the attach/detach controller crashes and a pod with attached PV is deleted
afterwards the controller will never detach the pod's attached volumes. To
prevent this the controller should try to recover the state from the nodes
status.
2017-04-20 13:04:50 +02:00
NickrenREN eca490bbdd Remove claimClass check and upgradeVolumeFrom1_2() function 2017-04-19 19:12:32 +08:00
NickrenREN 5cafb9042b find and add active pods for dswp
loops through the list of active pods and ensures that each one exists in the desired state of the world cache
2017-04-18 11:21:37 +08:00
Kubernetes Submit Queue 08ea5387f2 Merge pull request #44566 from wongma7/adc-wait
Automatic merge from submit-queue (batch tested with PRs 44469, 44566, 44467, 44526)

WaitForCacheSync before running attachdetach controller

@gnufied you wrote the test and @ncdc the TODO comment. Let's just run the pv and pvc informers, we do not care about them in this test. But we want to be able to stop the pod Informer at will, hence not just using informers.Start, is my understanding.
```release-note
NONE
```
2017-04-17 20:06:58 -07:00
Chao Xu 4f9591b1de move pkg/api/v1/ref.go and pkg/api/v1/resource.go to subpackages. move some functions in resource.go to pkg/api/v1/node and pkg/api/v1/pod 2017-04-17 11:38:11 -07:00
Matthew Wong e1ce33d944 WaitForCacheSync before running attachdetach controller 2017-04-17 14:02:33 -04:00
Chao Xu d4850b6c2b move pkg/api/v1/helpers.go to subpackage 2017-04-14 14:25:11 -07:00
Mike Danese a05c3c0efd autogenerated 2017-04-14 10:40:57 -07:00
Andy Goldstein e63fcf708d Make controller Run methods consistent
- startup/shutdown logging
- wait for cache sync logging
- defer utilruntime.HandleCrash()
- wait for stop channel before exiting
2017-04-14 07:27:45 -04:00
NickrenREN e0ef5bfd40 Exit from NewController() for PersistentVolumeController when InitPlugins() failed just like NewAttachDetachController() does 2017-04-12 13:43:09 +08:00
Kubernetes Submit Queue e60cc6ee3d Merge pull request #44090 from NickrenREN/remove-alpha-pv
Automatic merge from submit-queue

Remove alphaProvisioner in PVController and AlphaStorageClassAnnotation

remove alpha annotation and alphaProvisioner 

**Release note**:

```release-note
NONE
```
2017-04-11 20:41:40 -07:00
Kubernetes Submit Queue 50b104b8e2 Merge pull request #42313 from timchenxiaoyu/completelytypo
Automatic merge from submit-queue

fix completely typo
2017-04-10 05:57:06 -07:00
Kubernetes Submit Queue 3572e5ca86 Merge pull request #43679 from xingzhou/kube-42121
Automatic merge from submit-queue

Improve event msg for PV controller when using external provisioner

Improve event msg for PV controller when using external provisioner

**Which issue this PR fixes** *:
Fixed part of #42121

**Special notes for your reviewer**:
@jsafrane, as many of our users are confused by the original message, can we fix the message first and then consider how to control the count of the events?
2017-04-10 05:07:52 -07:00
NickrenREN fa7bd44966 Remove alphaProvisioner in PVController and AlphaStorageClassAnnotation 2017-04-10 17:09:40 +08:00
Kubernetes Submit Queue 00743a4f2b Merge pull request #42629 from NickrenREN/pv-index
Automatic merge from submit-queue (batch tested with PRs 41775, 39678, 42629, 42524, 43028)

matchPredicate does not fit findByClaim()

matchPredicate has two args which are type of PV,and is not used in function findByClaim(),remove it


**Release note**:
```release-note
NONE
```
2017-04-07 17:44:16 -07:00
NickrenREN 67a55a4b02 matchPredicate does not fit findByClaim()
matchPredicate has two args which are type of PV,and is not used in function findByClaim(),remove it
2017-04-07 10:15:42 +08:00
Xing Zhou 5b29afb1ad Improve event msg for PV controller when using external provisioner
Improve event msg for PV controller when using external provisioner
2017-04-05 08:49:48 +08:00
Kubernetes Submit Queue 46d4c621a8 Merge pull request #42992 from NickrenREN/syncUnboundClaim
Automatic merge from submit-queue (batch tested with PRs 43453, 42992)

make sure that the volume satisfies the requirements of the claim before binding

check if the volume requested by the claim satisfies the requirements of the claim before binding when
syncUnboundClaim and claim.Spec.VolumeName is not set, although the volume is asked by user


**Release note**:
```release-note
NONE
```
2017-04-04 04:27:19 -07:00
Kubernetes Submit Queue 05c046f6d3 Merge pull request #43810 from gnufied/add-gnufied-vol-controller
Automatic merge from submit-queue

Adding gnufied as reviewer for volume controller

I have helped review several PRs and made new
PRs to this area.

cc @childsb @saad-ali
2017-04-03 12:46:25 -07:00
NickrenREN f922af5138 make sure that the volume satisfies the requirements of the claim before binding
check if the volume requested by the claim satisfies the requirements of the claim before binding when
syncUnboundClaim and claim.Spec.VolumeName is not set
2017-04-04 01:03:37 +08:00
Kubernetes Submit Queue 50763cb6be Merge pull request #43627 from pospispa/make-constants-public-so-that-they-can-be-used-in-an-external-provisioner
Automatic merge from submit-queue

Make Constants Public so that They Can Be Used in an Ext. Provisioner

Out-of-tree external provisioners have the same purpose as in-tree provisioners. As external provisioners work with PV and PVC datastructures it's an advantage to import certain Kubernetes packages instead of copy-pasting the Kubernetes code.

That's why the constants are made public so that they can be used in an external provisioner.

@jsafrane  @kubernetes/sig-storage-pr-reviews 

```
NONE
```
2017-04-03 06:43:03 -07:00
Kubernetes Submit Queue 4fa902a915 Merge pull request #43933 from childsb/add_approver
Automatic merge from submit-queue

update pkg/controller/volume/OWNER to add appropriate approvers for both volume controllers

Update pkg/controller/volume approvers so that the attach/detach and binding controllers have approvers.
2017-04-01 11:37:15 -07:00
Kubernetes Submit Queue fff5fae0a0 Merge pull request #43289 from tsmetana/adc-race-fix
Automatic merge from submit-queue

Attach/detach controller: fix potential race in constructor

**What this PR does / why we need it**:
There is a potential race condition in the Attach/detach controller: The "constructor" first installs informer event handlers and then creates and initializes the other data structures. However there is no guarantee an event cannot arrive before the data structures required by the event handlers are ready. This may result in nil pointer derefernces and potential crashes (e.g. the nodeAdd method calls adc.actualStateOfWorld.SetNodeStatusUpdateNeeded even though the actualStateOfWorld might be still nil).

It should be enough just to move the event handlers installation at the end of the constructor function.

**Release note**:

```release-note
NONE
```
2017-03-31 17:30:36 -07:00
childsb 308a3a8c45 Update OWNERS with approvers to cover both volume controllers. 2017-03-31 18:26:42 -05:00
Hemant Kumar 39ec7c7400 Adding myself as reviewer for volume controller
I have helped review several PRs and made new
PRs to this area.
2017-03-29 10:42:44 -04:00
Kubernetes Submit Queue 81d9e7f68a Merge pull request #42721 from NickrenREN/pv-provisionClaimOperation
Automatic merge from submit-queue

fix createProvisionedPV result err judgenment bug

When ctrl.kubeClient.Core().PersistentVolumes().Create(volume) returns no err, and storeVolumeUpdate() fails, we save PV sucessfully ,but here err is not nil,we should not run the codes next in block if err != nil {} to delete the storage asset.
same in the deletion retries below
change the err names to make it clear
**Release note**:
```release-note
NONE
```
@jsafrane @saad-ali  PTAL. Thanks
2017-03-26 23:30:34 -07:00
NickrenREN 0ab9f6fe55 fix createProvisionedPV result err judgenment bug
When ctrl.kubeClient.Core().PersistentVolumes().Create(volume) returns no err, but storeVolumeUpdate() failed, we save PV sucessfully ,but here err is not nil,
we should not run the codes next in block if err != nil {}
same in the deletion retries below
2017-03-25 10:48:16 +08:00
NickrenREN 5d1c7c77b4 Fix PVC Annotations
annDynamicallyProvisioned is added to PV, PVC has annStorageProvisioner
2017-03-25 10:42:04 +08:00
Kubernetes Submit Queue d585eb3035 Merge pull request #42631 from NickrenREN/pv-isVolumeBoundToClaim-1
Automatic merge from submit-queue (batch tested with PRs 42522, 42545, 42556, 42006, 42631)

optimize the binding logic of bindClaimToVolume

extract var shouldSetBoundByController and do not need to judge volumename twice
**Release note**:
```release-note
NONE
```
2017-03-24 15:10:35 -07:00
Kubernetes Submit Queue d2175bab9a Merge pull request #42522 from NickrenREN/pv-err-print
Automatic merge from submit-queue

print err message when update store failed

```release-note
NONE
```
2017-03-24 15:06:49 -07:00
Kubernetes Submit Queue 6ae0199eb0 Merge pull request #42402 from jorenhehe/pv-typo
Automatic merge from submit-queue

fix pv_controller typos

```release-note
NONE
```
2017-03-24 10:25:55 -07:00
pospispa 1fdd163dd1 Make Some Constants Public so that They Can Be Used in an External Provisioner
Out-of-tree external provisioners have the same purpose as in-tree provisioners. As external provisioners work with PV and PVC datastructures it's an advantage to import certain Kubernetes packages instead of copy-pasting the Kubernetes code.

That's why the constants are made public so that they can be used in an external provisioner.
2017-03-24 17:09:51 +01:00
Tomas Smetana 6898bc60ce Attach/detach controller: fix potential race in constructor 2017-03-17 13:34:53 +01:00
NickrenREN 10779c8bcc optimize the binding logic of bindClaimToVolume 2017-03-07 17:04:04 +08:00
NickrenREN 059ffbe9b9 print err message when update store failed 2017-03-04 15:49:25 +08:00
Kubernetes Submit Queue e9bbfb81c1 Merge pull request #41306 from gnufied/implement-interface-bulk-volume-poll
Automatic merge from submit-queue (batch tested with PRs 41306, 42187, 41666, 42275, 42266)

Implement bulk polling of volumes

This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564

But it changes the implementation to use an interface
and doesn't affect other implementations.

cc @justinsb
2017-03-03 10:54:38 -08:00
timchenxiaoyu 811a334be9 fix request typo 2017-03-03 17:22:38 +08:00
Hemant Kumar 786da1de12 Impement bulk polling of volumes
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564

But it changes the implementation to use an interface
and doesn't affect other implementations.
2017-03-02 14:59:59 -05:00
Jan Safranek 9487552e41 Regenerate everything 2017-03-02 10:23:58 +01:00
Jan Safranek 52adaa16e0 PV controller: use attributes instead of beta annotations in unit tests 2017-03-02 10:23:56 +01:00
Jan Safranek 0097adc1c5 PV controller: Set StorageClassName during provisioning 2017-03-02 10:23:56 +01:00
Jan Safranek 7ae4152712 Move PV/PVC annotations to PV/PVC types.
They aren't part of storage.k8s.io/v1 or v1beta1 API.
Also move associated *GetClass functions.
2017-03-02 10:23:55 +01:00
jorenhehe 42c39d6aaa fix pv_controller typos 2017-03-02 16:28:30 +08:00
Hemant Kumar 2d3008fc56 Implement support for mount options in PVs
Add support for mount options via annotations on PVs
2017-03-01 11:50:40 -05:00
timchenxiaoyu c1851649f3 fix completely typo 2017-03-01 12:55:31 +08:00
Justin Santa Barbara 1d357b334f volumes: simplify append-to-slice code 2017-02-28 10:37:28 -05:00
Justin Santa Barbara 0ee71ef214 volumes: add comment on getNodeAndVolume
Add comments on getNodeAndVolume to explain the code - it is a little
subtle, and it confused me on first reading.
2017-02-28 10:30:10 -05:00
Justin Santa Barbara b7edfda828 volumes: Add logging when removing node fails 2017-02-28 10:17:33 -05:00
Kubernetes Submit Queue 0b54264d3e Merge pull request #41406 from jsafrane/operation-backoff
Automatic merge from submit-queue (batch tested with PRs 41814, 41922, 41957, 41406, 41077)

pv_controller: Do not report exponential backoff as error.

It's not an error when recycle/delete/provision operation cannot be started
because it has failed recently. It will be restarted automatically when
backoff expires.

This just pollutes logs without any useful information:
```
E0214 08:00:30.428073   77288 pv_controller.go:1410] error scheduling operaion "delete-pvc-1fa0e8b4-f2b5-11e6-a8bb-fa163ecb84eb[1fbd52ee-f2b5-11e6-a8bb-fa163ecb84eb]": Failed to create operation with name "delete-pvc-1fa0e8b4-f2b5-11e6-a8bb-fa163ecb84eb[1fbd52ee-f2b5-11e6-a8bb-fa163ecb84eb]". An operation with that name failed at 2017-02-14 08:00:15.631133152 -0500 EST. No retries permitted until 2017-02-14 08:00:31.631133152 -0500 EST (16s). Last error: "Cannot delete the volume \"11a4faea-bfc7-4713-88b3-dec492480dba\", it's still attached to a node".
```

```release-note
NONE
```

@kubernetes/sig-storage-pr-reviews
2017-02-26 10:22:53 -08:00
Jordan Liggitt 41c88e0455
Revert "Merge pull request #40088 from jsafrane/storage-ga-v1"
This reverts commit 5984607cb9, reversing
changes made to 067f92e789.
2017-02-25 22:35:15 -05:00
Jan Safranek fa93f1c411 Update imports 2017-02-24 13:52:16 +01:00
Jan Safranek cea7a46de1 Regenerate everything 2017-02-24 13:34:18 +01:00
Jan Safranek 3f6caca97a Add storage.k8s.io/v1 2017-02-24 13:34:18 +01:00
Kubernetes Submit Queue a67e78e4fa Merge pull request #40317 from kpgriffith/recycle-vol-plug-cleanup
Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782)

changes to cleanup the volume plugin for recycle

**What this PR does / why we need it**:
Code cleanup. Changing from creating a new interface from the plugin, that then calls a function to recycle a volume, to adding the function to the plugin itself.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #26230

**Special notes for your reviewer**:
Took same approach from closed PR #28432.

Do you want the approach to be the same for NewDeleter(), NewMounter(), NewUnMounter() and should they be in this same PR or submit different PR's for those?

**Release note**:

```NONE
```
2017-02-21 07:45:40 -08:00
Matthew Wong 33f98d4db3 Switch pv controller to shared informers 2017-02-16 10:08:23 -05:00
Shyam JVS 2ed7acfbcc Revert "Remove alpha provisioning" 2017-02-16 13:53:55 +01:00
Kubernetes Submit Queue 8faa9b5d4e Merge pull request #40000 from jsafrane/storage-ga-remove-alpha
Automatic merge from submit-queue

Remove alpha provisioning

This is the first part of https://github.com/kubernetes/features/issues/36

@kubernetes/sig-storage-misc 

**Release note**:
```release-note
Alpha version of dynamic volume provisioning is removed in this release. Annotation
"volume.alpha.kubernetes.io/storage-class" does not have any special meaning. A default storage class
and  DefaultStorageClass admission plugin can be used to preserve similar behavior of Kubernetes cluster,
see https://kubernetes.io/docs/user-guide/persistent-volumes/#class-1 for details.
```
2017-02-16 01:02:06 -08:00
Jan Safranek 308c0ecde9 pv_controller: Do not report exponential backoff as error.
It's not an error when recycle/delete/provision operation cannot be started
because it has failed recently. It will be restarted automatically when
backoff expires.
2017-02-14 15:16:26 +01:00
deads2k fd34b11e13 react to informer updates 2017-02-13 09:18:32 -05:00
deads2k a86fabb9d2 regenerate informers 2017-02-13 07:59:34 -05:00
Kubernetes Submit Queue 9abfa6b446 Merge pull request #40385 from ncdc/shared-informers-02-swap-existing
Automatic merge from submit-queue

Replace hand-written informers with generated ones

Replace existing uses of hand-written informers with generated ones.
Follow-up commits will switch the use of one-off informers to shared
informers.

This is a precursor to #40097. That PR will switch one-off informers to shared informers for the majority of the code base (but not quite all of it...).

NOTE: this does create a second set of shared informers in the kube-controller-manager. This will be resolved back down to a single factory once #40097 is reviewed and merged.

There are a couple of places where I expanded the # of caches we wait for in the calls to `WaitForCacheSync` - please pay attention to those. I also added in a commented-out wait in the attach/detach controller. If @kubernetes/sig-storage-pr-reviews is ok with enabling the waiting, I'll do it (I'll just need to tweak an integration test slightly).

@deads2k @sttts @smarterclayton @liggitt @soltysh @timothysc @lavalamp @wojtek-t @gmarek @sjenning @derekwaynecarr @kubernetes/sig-scalability-pr-reviews
2017-02-06 16:25:42 -08:00
Andy Goldstein 70c6087600 Replace hand-written informers with generated ones
Replace existing uses of hand-written informers with generated ones.
 Follow-up commits will switch the use of one-off informers to shared
 informers.
2017-02-06 13:49:27 -05:00
Kevin Griffith 9448aa66ff cleanup the volume plugin for recycle
update commit to reflect changes
2017-02-06 10:38:49 -06:00
Kubernetes Submit Queue ab794c6128 Merge pull request #40918 from k82cn/pv_ctrl_typo
Automatic merge from submit-queue

Fixed typo in pv_controller.go

fixes #40916
2017-02-03 07:37:25 -08:00
Klaus Ma ef5f838c23 Fixed typo in pv_controller.go 2017-02-03 20:55:15 +08:00
Kubernetes Submit Queue 9e427c88c4 Merge pull request #40859 from jsafrane/ps-scheduler-event
Automatic merge from submit-queue (batch tested with PRs 40855, 40859)

PV binding: send an event when there are no PVs to bind

This is similar to scheduler that says "no nodes available to schedule pods"
when it can't schedule a pod.

@kubernetes/sig-storage-pr-reviews
2017-02-02 09:01:48 -08:00
Jan Safranek 13546e5ea4 PV binding: send an event when there are no PVs to bind
This is similar to scheduler that says "no nodes available to schedule pods"
when it can't schedule a pod.
2017-02-02 13:30:53 +01:00
Vladimir Vivien 8ebed57767 Prevent pv controller from forcefully overwrite provisioned volume name
This fix prevents the PV controller from forcefully overwriting the provisioned volume's name with the generated PV name.  Instead, it allows dynamic provisioner implementers to set the name of the volume to a value that they choose.
2017-02-01 12:19:20 -05:00
Jan Safranek 587eb199e0 Remove alpha provisioning 2017-02-01 14:51:54 +01:00
deads2k 8a12000402 move client/record 2017-01-31 19:14:13 -05:00
deads2k 2c1c0f3f72 move workqueue to client-go 2017-01-30 09:08:21 -05:00
Dr. Stefan Schimanski 44ea6b3f30 Update generated files 2017-01-29 21:41:45 +01:00
Dr. Stefan Schimanski bc6fdd925d pkg/api/resource: move to apimachinery 2017-01-29 21:41:44 +01:00
Kubernetes Submit Queue 88890f586c Merge pull request #40126 from resouer/return-value
Automatic merge from submit-queue (batch tested with PRs 40126, 40565, 38777, 40564, 40572)

Do not swallow error in asw.updateNodeStatusUpdateNeeded

Ref #39056

Bubble the error up to `SetNodeUpdateStatusNeeded` and log it out.

NOTE: This does not modify interface of `SetNodeUpdateStatusNeeded`
2017-01-27 01:34:16 -08:00
deads2k 9488e2ba30 move testing/core to client-go 2017-01-26 13:54:40 -05:00
Dr. Stefan Schimanski a0137e9b28 Update generated files 2017-01-25 19:49:45 +01:00
Dr. Stefan Schimanski d7eb3b6870 pkg/util: move uuid and strategicpatch into k8s.io/apimachinery 2017-01-25 19:45:09 +01:00
Harry Zhang 70941f65bf Do not swallow error in volume 2017-01-25 21:29:48 +08:00
deads2k b0b156b381 make tools/cache authoritative 2017-01-25 08:29:45 -05:00
Kubernetes Submit Queue 373e7ef0c0 Merge pull request #40294 from tsmetana/persistent-volume-test-refactor
Automatic merge from submit-queue (batch tested with PRs 39064, 40294)

Refactor persistent volume tests

This is an attempt to make the binder tests a bit more concise. The PVCs are being created by a "templating" function. There is also a handful of PVs in the tests but those vary quite more and I don't think similar approach would save us much code.

Reference:
https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29006#-KPJuVeDE0O6TvDP9jia

@jsafrane: I hope this is what you have on mind.
2017-01-25 02:05:57 -08:00
Clayton Coleman 469df12038
refactor: move ListOptions references to metav1 2017-01-23 17:52:46 -05:00
Wojciech Tyczynski bf7138652f SecretVolume using secret manager 2017-01-23 16:10:01 +01:00
Tomas Smetana 382f7bc9cc Refactor persistent volume tests 2017-01-23 10:43:44 +01:00
Wojciech Tyczynski d08abdb187 Allow for returning map[string]interface{} from patch. 2017-01-18 11:53:30 +01:00
Clayton Coleman 9a2a50cda7
refactor: use metav1.ObjectMeta in other types 2017-01-17 16:17:19 -05:00
Clayton Coleman 36acd90aba
Move APIs and core code to use metav1.ObjectMeta 2017-01-17 16:17:18 -05:00
Kubernetes Submit Queue f74b4bbbad Merge pull request #38094 from yarntime/fix_update_typo
Automatic merge from submit-queue

fix typos

fix typos.
2017-01-16 18:22:33 -08:00
deads2k 77b4d55982 mechanical 2017-01-16 09:35:12 -05:00
Klaus Ma 25fe1e0d82 Made cache.Controller to be interface. 2017-01-13 13:33:23 +08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
yarntime@163.com f7c737e8a9 fix typos 2017-01-11 16:08:20 +08:00
Kubernetes Submit Queue 7c3fff1a95 Merge pull request #39551 from chrislovecnm/reconciler-time-increases
Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)

Increasing times on reconciling volumes fixing impact to AWS.

#**What this PR does / why we need it**:

We are currently blocked by API timeouts with PV volumes.  See https://github.com/kubernetes/kubernetes/issues/39526.  This is a workaround, not a fix.

**Special notes for your reviewer**:

A second PR will be dropped with CLI cobra options in it, but we are starting with increasing the reconciliation periods.  I am dropping this without major testing and will test on our AWS account. Will be marked WIP until I run smoke tests.

**Release note**:

```release-note
Provide kubernetes-controller-manager flags to control volume attach/detach reconciler sync.  The duration of the syncs can be controlled, and the syncs can be shut off as well. 
```
2017-01-10 11:54:15 -08:00
chrislovecnm ac49139c9f updates from review 2017-01-09 17:20:19 -07:00
chrislovecnm a973c38c7d The capability to control duration via controller-manager flags,
and the option to shut off reconciliation.
2017-01-09 16:47:13 -07:00
NickrenREN 639572ac68 fix redundant alias and remove unused function 2017-01-09 17:13:09 +08:00
Jeff Grafton 20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Jan Safranek 0fd5f2028d Add work queues to PV controller
PV controller should not use Controller.Requeue, as as it is not available in
shared informers. We need to implement our own work queues instead where we
can enqueue volumes/claims as we want.
2017-01-02 15:17:24 +01:00
Mike Danese 161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
rkouj e7e3c55ad7 Add unit tests for MountVolume() of operation executor 2016-12-27 16:07:06 -08:00
rkouj d5f7610b82 Refactor operation_executor to make it unit testable 2016-12-27 15:12:16 -08:00
Hemant Kumar 7b423085fa Fix variable shadowing in exponential backoff when deleting volumes
Also fix pv_controller unit tests to behave more accurately
in light of exponential backoffs
2016-12-20 21:31:12 -05:00
Chao Xu 03d8820edc rename /release_1_5 to /clientset 2016-12-14 12:39:48 -08:00
Kubernetes Submit Queue 8abbedae54 Merge pull request #38315 from mikedanese/pin-gazel
Automatic merge from submit-queue

Pin gazel to a version and support cgo

This fixes the bazel build.

@krousey who is buildcop
2016-12-12 19:32:29 -08:00
Kubernetes Submit Queue f45e918b8b Merge pull request #35833 from apelisse/owners-pkg-controller
Automatic merge from submit-queue

Curating Owners: pkg/controller

cc @jsafrane @mikedanese @bprashanth @derekwaynecarr @thockin @saad-ali

In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone **lgtms** and then someone
experienced in the project **approves**), we are adding new reviewers to
existing owners files.
## If You Care About the Process:

We did this by algorithmically figuring out who’s contributed code to
the project and in what directories.  Unfortunately, that doesn’t work
perfectly: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.

Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).

At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
## TLDR:

As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the OWNERS file to add
   the names of people that should be reviewing code in the future in the **reviewers** section. You probably do NOT need to modify the **approvers** section.
3. Notify me if you want some OWNERS file to be removed.  Being an approver or reviewer
   of a parent directory makes you a reviewer/approver of the subdirectories too, so not all
   OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
   over again (don't hesitate to ask me for help, or use the pull-request
   above as an example)
2016-12-12 18:51:33 -08:00
Mike Danese c87de85347 autoupdate BUILD files 2016-12-12 13:30:07 -08:00
Kubernetes Submit Queue 43233caaf0 Merge pull request #37871 from Random-Liu/use-patch-in-kubelet
Automatic merge from submit-queue (batch tested with PRs 36692, 37871)

Use PatchStatus to update node status in kubelet.

Fixes https://github.com/kubernetes/kubernetes/issues/37771.

This PR changes kubelet to update node status with `PatchStatus`.

@caesarxuchao @ymqytw told me that there is a limitation of current `CreateTwoWayMergePatch`, it doesn't support primitive type slice which uses strategic merge.
* I checked the node status, the only primitive type slices in NodeStatus are as follows, they are not using strategic merge:
  * [`ContainerImage.Names`](https://github.com/kubernetes/kubernetes/blob/master/pkg/api/v1/types.go#L2963)
  * [`VolumesInUse`](https://github.com/kubernetes/kubernetes/blob/master/pkg/api/v1/types.go#L2909)
* Volume package is already [using `CreateStrategicMergePath` to generate node status update patch](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/statusupdater/node_status_updater.go#L111), and till now everything is fine. 

@yujuhong @dchen1107 
/cc @kubernetes/sig-node
2016-12-09 11:29:11 -08:00
Wojciech Tyczynski e8d1cba875 GetOptions in client calls 2016-12-09 09:42:01 +01:00
Random-Liu beba1ebbf8 Use PatchStatus to update node status in kubelet. 2016-12-08 17:13:59 -08:00
Jordan Liggitt 6819706adf
Pass addressable values to DeepCopy 2016-12-08 14:16:01 -05:00
Kubernetes Submit Queue 8f11cc78a8 Merge pull request #38339 from gnufied/backoff-on-volume-delete
Automatic merge from submit-queue (batch tested with PRs 38377, 36365, 36648, 37691, 38339)

Exponential back off when volume delete fails

**What this PR does / why we need it**:

This PR implements ability in pv_controller to back off when deleting a volume fails from plugin API. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

Partly fixes #38295 , but I think volume delete is most problematic thing happening in pv_controller without any sort of backoff. 

After this change the attempts of volume deletion look like:

```
controller : I1208 00:18:35.532061   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:20:50.578325   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:23:05.563488   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:25:20.599158   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:27:35.560009   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:29:50.594967   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:32:05.539168   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:34:20.581665   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
```
2016-12-08 10:52:03 -08:00
Hemant Kumar caf867a402 Exponential back off when volume delete fails
This implements pv_controller to exponentially backoff
when deleting a volume fails in Cloud API. It ensures that
we aren't making too many calls to Cloud API
2016-12-07 19:25:36 -05:00
Alejandro Escobar 759530536f type found with controller comment. 2016-12-07 10:55:02 -08:00
Hemant Kumar fcf5d79be7 Add integration tests for desire state of world populator
This adds tests for code introduced here :
https://github.com/kubernetes/kubernetes/issues/26994

Via integration test we can now verify that if pod delete
event is somehow missed by AttachDetach controller - it still
get cleaned up by Desired State of World populator.
2016-12-06 06:52:52 -05:00
Clayton Coleman 3454a8d52c
refactor: update bazel, codec, and gofmt 2016-12-03 19:10:53 -05:00
Clayton Coleman 5df8cc39c9
refactor: generated 2016-12-03 19:10:46 -05:00
Kubernetes Submit Queue 6fd00e9f56 Merge pull request #37678 from tsmetana/issue_37377
Automatic merge from submit-queue (batch tested with PRs 37608, 37103, 37320, 37607, 37678)

Fix issue #37377: Report an event on successful PVC provisioning

This is a simple patch to fix the issue #37377: On a successful PVC provisioning an event is emitted so it's clear the provisioning actually succeeded.

cc: @jsafrane
2016-12-02 23:32:50 -08:00
Kubernetes Submit Queue fb7e9d901d Merge pull request #37939 from yarntime/fix_typo_in_node_status_updater
Automatic merge from submit-queue (batch tested with PRs 37997, 37939, 37990, 36700, 37258)

fix typo in node_status_updater

fix typo.
2016-12-02 19:26:47 -08:00
Kubernetes Submit Queue c552f8918b Merge pull request #37727 from rkouj/bug-fix-upgrade-test
Automatic merge from submit-queue

SetNodeUpdateStatusNeeded whenever nodeAdd event is received

**What this PR does / why we need it**:
Bug fix and SetNodeStatusUpdateNeeded for a node whenever its api object is added. This is to ensure that we don't lose the attached list of volumes in the node when its api object is deleted and recreated.

fixes https://github.com/kubernetes/kubernetes/issues/37586
         https://github.com/kubernetes/kubernetes/issues/37585


**Special notes for your reviewer**:


<!--  Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access) 
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`. 
-->
2016-12-02 05:44:57 -08:00
yarntime@163.com df6e9db9d9 fix typo 2016-12-02 17:33:45 +08:00
rkouj 638ef1b977 SetNodeUpdateStatusNeeded whenever nodeAdd event is received 2016-11-30 21:12:34 -08:00
Kubernetes Submit Queue 66fe55f5ad Merge pull request #37238 from deads2k/controller-02-minor-fixes
Automatic merge from submit-queue

controller manager refactors

The controller manager needs some significant cleanup.  This starts us down the patch by respecting parameters like `stopCh`, simplifying discovery checks, removing unnecessary parameters, preventing unncessary fatals, and using our client builder.

@sttts @ncdc
2016-11-30 20:08:19 -08:00
Tomas Smetana a02ee64d00 Fix issue #37377: Report an event on successful PVC provisioning
cc: @jsafrane
2016-11-30 10:27:22 +01:00
Pengfei Ni f584ed4398 Fix package aliases to follow golang convention 2016-11-30 15:40:50 +08:00
deads2k d973158a4e make controller manager use specified stop channel 2016-11-28 15:02:21 -05:00
Chao Xu bcc783c594 run hack/update-all.sh 2016-11-23 15:53:09 -08:00
Chao Xu 7eeb71f698 cmd/kube-controller-manager 2016-11-23 15:53:09 -08:00
ymqytw 3cc294b1e0 Revert "support patch list of primitives"
This reverts commit 34891ad9f6.
2016-11-22 21:06:36 -08:00
ymqytw d248843b65 Revert "try old patch after new patch fails"
This reverts commit f32696e734.
2016-11-22 21:02:30 -08:00
Kubernetes Submit Queue b9d2d74a94 Merge pull request #37038 from ymqytw/retry_old_patch_after_new_patch_fail
Automatic merge from submit-queue

Fix kubectl Stratigic Merge Patch compatibility

As @smarterclayton pointed out in [comment1](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290820) and [comment2](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290847) in PR #35647,
we cannot assume the API servers publish version and they shares the same version.

This PR removes all the calls of GetServerSupportedSMPatchVersion().
Change the behavior of `apply` and `edit` to:
Retrying with the old patch version, if the new version fails.
Default other usage of SMPatch to the new version, since they don't update list of primitives.

fixes #36916

cc: @pwittrock @smarterclayton
2016-11-19 01:02:47 -08:00
ymqytw f32696e734 try old patch after new patch fails 2016-11-17 14:28:09 -08:00
Jing Xu 3d3e44e77e fix issue in converting aws volume id from mount paths
This PR is to fix the issue in converting aws volume id from mount
paths. Currently there are three aws volume id formats supported. The
following lists example of those three formats and their corresponding
global mount paths:
1. aws:///vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/vol-123456)
2. aws://us-east-1/vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
3. vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)

For the first two cases, we need to check the mount path and convert
them back to the original format.
2016-11-16 22:35:48 -08:00
Kubernetes Submit Queue 3e169be887 Merge pull request #35647 from ymqytw/patch_primitive_list
Automatic merge from submit-queue

Fix strategic patch for list of primitive type with merge sementic

Fix strategic patch for list of primitive type when the patch strategy is `merge`.
Before: we cannot replace or delete an item in a list of primitive, e.g. string, when the patch strategy is `merge`. It will always append new items to the list.
This patch will generate a map to update the list of primitive type.
The server with this patch will accept either a new patch or an old patch.
The client will found out the APIserver version before generate the patch.

Fixes #35163, #32398

cc: @pwittrock @fabianofranz 

``` release-note
Fix strategic patch for list of primitive type when patch strategy is `merge` to remove deleted objects.
```
2016-11-11 14:36:44 -08:00
Kubernetes Submit Queue 0f082c6663 Merge pull request #36280 from rkouj/better-mount-error
Automatic merge from submit-queue

Better messaging for missing volume binaries on host

**What this PR does / why we need it**:
When mount binaries are not present on a host, the error returned is a generic one.
This change is to check the mount binaries before the mount and return a user-friendly error message.

This change is specific to GCI and the flag is experimental now.

https://github.com/kubernetes/kubernetes/issues/36098

**Release note**:
Introduces a flag `check-node-capabilities-before-mount` which if set, enables a check (`CanMount()`) prior to mount operations to verify that the required components (binaries, etc.) to mount the volume are available on the underlying node. If the check is enabled and `CanMount()` returns an error, the mount operation fails. Implements the `CanMount()` check for NFS.















Sample output post change :


rkouj@rkouj0:~/go/src/k8s.io/kubernetes$ kubectl describe pods
Name:		sleepyrc-fzhyl
Namespace:	default
Node:		e2e-test-rkouj-minion-group-oxxa/10.240.0.3
Start Time:	Mon, 07 Nov 2016 21:28:36 -0800
Labels:		name=sleepy
Status:		Pending
IP:		
Controllers:	ReplicationController/sleepyrc
Containers:
  sleepycontainer1:
    Container ID:	
    Image:		gcr.io/google_containers/busybox
    Image ID:		
    Port:		
    Command:
      sleep
      6000
    QoS Tier:
      cpu:	Burstable
      memory:	BestEffort
    Requests:
      cpu:		100m
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Environment Variables:
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  data:
    Type:	NFS (an NFS mount that lasts the lifetime of a pod)
    Server:	127.0.0.1
    Path:	/export
    ReadOnly:	false
  default-token-d13tj:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-d13tj
Events:
  FirstSeen	LastSeen	Count	From						SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----						-------------	--------	------		-------
  7s		7s		1	{default-scheduler }						Normal		Scheduled	Successfully assigned sleepyrc-fzhyl to e2e-test-rkouj-minion-group-oxxa
  6s		3s		4	{kubelet e2e-test-rkouj-minion-group-oxxa}			Warning		FailedMount	Unable to mount volume kubernetes.io/nfs/32c7ef16-a574-11e6-813d-42010af00002-data (spec.Name: data) on pod sleepyrc-fzhyl (UID: 32c7ef16-a574-11e6-813d-42010af00002). Verify that your node machine has the required components before attempting to mount this volume type. Required binary /sbin/mount.nfs is missing
2016-11-09 18:51:00 -08:00
Kubernetes Submit Queue 6a8edf72e1 Merge pull request #35957 from jsafrane/implement-external-provisioner
Automatic merge from submit-queue

Implement external provisioning proposal

In other words, add "provisioned-by" annotation to all PVCs that should be provisioned dynamically.

Most of the changes are actually in tests.

@kubernetes/sig-storage
2016-11-09 18:12:56 -08:00
Rajat Ramesh Koujalagi d81e216fc6 Better messaging for missing volume components on host to perform mount 2016-11-09 15:16:11 -08:00
ymqytw 34891ad9f6 support patch list of primitives 2016-11-09 11:46:59 -08:00
Paul Morie 4722cb299b Remove GetRootContext from VolumeHost 2016-11-03 12:21:19 -04:00
Jan Safranek 2224e80dd7 Fix race when two provisioner create two PVs for a single claim. 2016-11-03 16:58:25 +01:00
Saad Ali eac6809845 Add reviewers for `controller/volume` dir 2016-11-02 16:19:19 -07:00
Antoine Pelisse db35acde19 Update OWNERS: Remove reviewers: pkg/controller 2016-11-02 16:19:19 -07:00
Antoine Pelisse c695a54c1c Update OWNERS approvers and reviewers: pkg/controller 2016-11-02 16:19:18 -07:00
Jan Safranek 18de83c641 Implement external provisioning proposal
In other words, add "provisioned-by" annotation to all PVCs
that should be provisioned dynamically.
2016-11-02 14:13:34 +01:00
Chao Xu 850729bfaf include multiple versions in clientset
update client-gen to use the term "internalversion" rather than "unversioned";
leave internal one unqualified;
cleanup client-gen
2016-10-29 13:30:47 -07:00
Jing Xu abbde43374 Add sync state loop in master's volume reconciler
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.

This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If the volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, reconciler will take actions if needed.

To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.

More details are explained in PR #33760
2016-10-28 09:24:53 -07:00
Kubernetes Submit Queue 453bfa1f0f Merge pull request #34368 from jingxu97/Oct/statusupdate-10-7
Automatic merge from submit-queue

Node status updater should SetNodeStatusUpdateNeeded if it fails to

update status

When volume controller tries to update the node status, if it fails to
update the nodes status, it should call SetNodeStatusUpdateNeeded so
that the volume list could be updated next time.
2016-10-26 11:09:16 -07:00
Jan Safranek ad946f4fcc Fixed mutation warning in Attach/Detach controller
Objects from shared informer must not be changed, they are shared among all
controllers.

This fixes CacheMutationDetector panic with this output:

CACHE *api.Node[5] ALTERED!
{"metadata":{"name":"ip-172-18-8-71.ec2.internal","selfLink":"/api/v1/nodes/ip-172-18-8-71.ec2.internal","uid":"73d07d16-976e-11e6-8225-0e2f14b56070","resourceVersion":"136","creationTimestamp":"2016-10-21T09:12:12Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/instance-type":"t2.medium","beta.kubernetes.io/os":"linux","failure-domain.beta.kubernetes.io/region":"us-east-1","failure-domain.beta.kubernetes.io/zone":"us-east-1d","kubernetes.io/hostname":"ip-172-18-8-71.ec2.internal"},"annotations":{"volumes.kubernetes.io/controller-managed-attach-detach":"true"}},"spec":{"externalID":"i-9cb6180f","providerID":"aws:///us-east-1d/i-9cb6180f"},"status":{"capacity":{"alpha.kubernetes.io/nvidia-gpu":"0","cpu":"2","memory":"4045568Ki","pods":"110"},"allocatable":{"alpha.kubernetes.io/nvidia-gpu":"0","cpu":"2","memory":"4045568Ki","pods":"110"},"conditions":[{"type":"OutOfDisk","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasSufficientDisk","message":"kubelet has sufficient disk space available"},{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasNoDiskPressure","message":"kubelet has no disk pressure"},{"type":"InodePressure","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasNoInodePressure","message":"kubelet has no inode pressure"},{"type":"Ready","status":"True","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:22Z","reason":"KubeletReady","message":"kubelet is posting ready status"}],"addresses":[{"type":"InternalIP","address":"172.18.8.71"},{"type":"LegacyHostIP","address":"172.18.8.71"},{"type":"ExternalIP","address":"54.85.104.236"}],"daemonEndpoints":{"kubeletEndpoint":{"Port":10250}},"nodeInfo":{"machineID":"78a79498db8e4fdc9ac24b5e436a982c","systemUUID":"EC2BB406-5467-4ABE-B54D-D9993C45714F","bootID":"2553d6b8-1ddb-4ef0-902a-d09a807b89ba","kernelVersion":"4.6.7-300.fc24.x86_64","osImage":"Fedora 24 (Cloud Edition)","containerRuntimeVersion":"docker://1.10.3","kubeletVersion":"v1.5.0-alpha.1.726+5aac5eddb809e4","kubeProxyVersion":"v1.5.0-alpha.1.726+5aac5eddb809e4","operatingSystem":"linux","architecture":"amd64"},"images":[{"names":["openshift/origin-release:latest"],"sizeBytes":714569002},{"names":["openshift/origin-haproxy-router-base:latest"],"sizeBytes":294417608},{"names":["openshift/origin-base:latest"],"sizeBytes":275310761},{"names":["docker.io/centos@sha256:2ae0d2c881c7123870114fb9cc7afabd1e31f9888dac8286884f6cf59373ed9b","docker.io/centos:centos7"],"sizeBytes":196744353},{"names":["gcr.io/google_containers/busybox@sha256:4bdd623e848417d96127e16037743f0cd8b528c026e9175e22a84f639eca58ff","gcr.io/google_containers/busybox:1.24"],"sizeBytes":1113554},{"names":["gcr.io/google_containers/pause-amd64@sha256:163ac025575b775d1c0f9bf0bdd0f086883171eb475b5068e7defa4ca9e76516","gcr.io/google_containers/pause-amd64:3.0"],"sizeBytes":746888}],"volumesInUse":["kubernetes.io/aws-ebs/aws://us-east-1d/vol-f4bd0352"]

A: ,"volumesAttached":[{"name":"kubernetes.io/aws-ebs/aws://us-east-1d/vol-f4bd0352","devicePath":"/dev/xvdba"}]}}

B: }}
2016-10-25 14:28:10 +02:00
Jing Xu 70efadc2f4 Node status updater should SetNodeStatusUpdateNeeded if it fails to
update status

When volume controller tries to update the node status, if it fails to
update the nodes status, it should call SetNodeStatusUpdateNeeded so
that the volume list could be updated next time.
2016-10-24 13:59:39 -07:00
Mike Danese 3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
Scott Creeley 86f1a94be5 Adding default StorageClass annotation printout for resource_printer 2016-10-19 10:59:07 -04:00
Jan Safranek 101602ab11 Pass whole PVC to provisioner plugin
Gluster provisioner is interested in pvc.Namespace and I don't want to add
at as a new field in VolumeOptions - it would contain almost whole PVC.

Let's pass direct reference to PVC instead and let the provisioner to pick
information it is interested in.
2016-10-12 12:22:01 +02:00
Jing Xu 9e8edf6baf Fix issue in updating device path when volume is attached multiple times
When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.

This PR partially fixes issue #29324
2016-10-03 17:14:23 -07:00
Kubernetes Submit Queue 1854bdcb0c Merge pull request #29048 from justinsb/volumes_nodename_not_hostname
Automatic merge from submit-queue

Use strongly-typed types.NodeName for a node name

We had another bug where we confused the hostname with the NodeName.

Also, if we want to use different values for the Node.Name (which is
an important step for making installation easier), we need to keep
better control over this.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName
2016-09-27 17:58:41 -07:00
Justin Santa Barbara 54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
Jan Safranek a54c9e2887 Refactor volume controller parameters into a structure
persistentvolumecontroller.NewPersistentVolumeController has 11 arguments now,
put them into a structure.

Also, rename NewPersistentVolumeController to NewController, persistentvolume
is already name of the package.

Fixes #30219
2016-09-26 14:15:25 +02:00
Jan Safranek 5ff1597cf9 Rename controller*.go to pv_controller*.go
To make log filtering easier. controller.go is used by several controllers and
matching logs for "pv_controller.*" is much better.
2016-09-26 12:26:58 +02:00
Kubernetes Submit Queue 4785f6f517 Merge pull request #31978 from jsafrane/detach-before-delete
Automatic merge from submit-queue

Do not report error when deleting an attached volume

Persistent volume controller should not send warning events to a PV and mark the PV as failed when the volume is still attached.

This happens when a user quickly deletes a pod and associated PVC - PV is slowly detaching, while the PVC is already deleted and the PV enters Failed phase.

`Deleter.Deleter` can now return `tryAgainError`, which is sent as INFO to the PV to let the user know we did not forget to delete the PV, however the PV stays in Released state. The controller tries again in the next sync (15 seconds by default).

Fixes #31511
2016-09-25 18:55:32 -07:00
Kubernetes Submit Queue 0a4316f11e Merge pull request #32807 from jingxu97/stateupdateNeeded-9-15
Automatic merge from submit-queue

Fix race condition in setting node statusUpdateNeeded flag

This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.
2016-09-23 11:25:16 -07:00
Jing Xu 14cad206f5 Fix race conditino in setting node statusUpdateNeeded flag
This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.

This PR also changes a unit test due to the workflow changes
2016-09-22 14:02:30 -07:00
Kubernetes Submit Queue e9f4db2748 Merge pull request #27714 from jsafrane/event-recycle
Automatic merge from submit-queue

Send recycle events from pod to pv.

This allows users to diagnose what's wrong with recycler. Recycler pods are started automatically with a cryptic name and they are deleted immediately when they finish.

e.g, `kubectl describe pv` could show that NFS cannot be mounted (and how many pods have tried it):

```
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason          Message
  ---------     --------        -----   ----                            -------------   --------        ------          -------
  59m           59m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(5421800e-347b-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  53m           53m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(3c9809e5-347c-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  46m           46m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(250dd2a2-347d-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  40m           40m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(0d84ea33-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  33m           33m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(f5fb63bf-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  27m           27m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(de7128fd-347f-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  1h            3m              75      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Successfully assigned recycler-for-nfs to 127.0.0.1
  1h            3m              76      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Pod was active on the node longer than specified deadline
  1h            1m              12      {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  20m           1m              4       {persistentvolume-controller }                  Warning         RecyclerPod     (events with common reason combined)
```

These steps were necessary:

- added event watcher to volume.RecycleVolumeByWatchingPodUntilCompletion
- pass all these events through volume plugins to volume controller
- rework volume.RecycleVolumeByWatchingPodUntilCompletion unit tests to a table (too much copy-paste)
- fix all unit tests along the way
2016-09-22 12:18:53 -07:00
Mike Danese a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Kubernetes Submit Queue 843d7cd24c Merge pull request #32576 from wongma7/revert-30825-pv-controller-informer
Automatic merge from submit-queue

Revert "Use PV shared informer in PV controller"

Fixes #32497 

Reverts kubernetes/kubernetes#30825
2016-09-15 04:37:29 -07:00
Jan Safranek 9903b389b3 Update cloud providers 2016-09-15 10:33:57 +02:00
Jan Safranek a24e6a90bd Add new error 2016-09-15 09:39:30 +02:00
Matthew Wong 25e9b9dcf9 Revert "Use PV shared informer in PV controller" 2016-09-13 10:12:34 -04:00
Jan Safranek 3eae8c9022 Do not report warning event when an unknown deleter is requested
When Kubernetes does not have a plugin to delete a PV it should wait for
either external deleter or storage admin to delete the volume instead of
throwing an error.

Related to #32077
2016-09-13 10:39:45 +02:00
Kubernetes Submit Queue 6a9a93d469 Merge pull request #32242 from jingxu97/bug-wrongvolume-9-2
Automatic merge from submit-queue

Fix race condition in updating attached volume between master and node

This PR tries to fix issue #29324. The cause of this issue is that a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 15:29:38 -07:00
Jing Xu efaceb28cc Fix race condition in updating attached volume between master and node
This PR tries to fix issue #29324. This cause of this issue is a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 13:51:08 -07:00
Kubernetes Submit Queue 17f82069bb Merge pull request #30825 from wongma7/pv-controller-informer
Automatic merge from submit-queue

Use PV shared informer in PV controller

Use the PV shared informer, addressing (partially) https://github.com/kubernetes/kubernetes/issues/26247 . Using the PVC shared informer is not so simple because sometimes the controller wants to `Requeue` and...
2016-09-10 12:40:30 -07:00
Jan Safranek d7111b282f Send recycle events from pod to pv.
This allows users to diagnose what's wrong with recycler. Recycler pods are
started automatically with a cryptic name and they are deleted immediately
when they finish.

kubectl describe pods will show:

  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason          Message
  ---------     --------        -----   ----                            -------------   --------        ------          -------
  59m           59m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(5421800e-347b-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  53m           53m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(3c9809e5-347c-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  46m           46m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(250dd2a2-347d-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  40m           40m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(0d84ea33-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  33m           33m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(f5fb63bf-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  27m           27m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(de7128fd-347f-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  1h            3m              75      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Successfully assigned recycler-for-nfs to 127.0.0.1
  1h            3m              76      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Pod was active on the node longer than specified deadline
  1h            1m              12      {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  20m           1m              4       {persistentvolume-controller }                  Warning         RecyclerPod     (events with common reason combined)


These steps were necessary:

- added event watcher to volume.RecycleVolumeByWatchingPodUntilCompletion

- pass all these events through volume plugins to volume controller

- rework volume.RecycleVolumeByWatchingPodUntilCompletion unit tests to a table
  (too much copy-paste)

- fix all unit tests along the way
2016-09-08 12:57:57 +02:00
Jan Safranek 3a2f4e52a8 Do not report warning event when an nknown provisioner is requested
with StorageClass.Provisioner == <unknown plugin>, we should wait for
either external provisioner or volume admin to provide a PV for a claim
instead of reporting an error.

Fixes #31723
2016-09-07 09:11:41 +02:00
deads2k cd5b6cc491 move StorageClass to its own group 2016-09-06 08:41:17 -04:00
Kubernetes Submit Queue d532bfc63c Merge pull request #31885 from better0332/master
Automatic merge from submit-queue

fix deleteClaim
2016-09-04 00:40:50 -07:00
Kubernetes Submit Queue aad5c66792 Merge pull request #31837 from jingxu97/recorder
Automatic merge from submit-queue

Post event message for volume attachment

This PR is to add event message when attaching volume fails to help
users to debug. For detach failure, may address in a different PR since
it requires more data structure change.
2016-09-01 23:30:57 -07:00
Jing Xu b9157b7524 Post event message for volume attachment
This PR is to add event message when attaching volume fails to help
users to debug. For detach failure, may address in a different PR since
it requires more data structure change.
2016-09-01 16:24:36 -07:00
better88 041beadcc8 fix deleteClaim
`ok` is not in same variable socpe
like https://github.com/kubernetes/kubernetes/pull/31416
2016-09-01 23:26:38 +08:00
Matthew Wong 1d6dbdd9d2 Use PV shared informer in PV controller 2016-08-25 21:55:23 -04:00
better0332 524f0da769 fix deleteVolume
`ok` is not in same variable socpe
2016-08-25 15:26:18 +08:00
Kubernetes Submit Queue 1def4a0458 Merge pull request #30690 from wongma7/claimref-capacity
Automatic merge from submit-queue

Don't bind pre-bound pvc & pv if size request not satisfied

as discussed briefly here https://github.com/kubernetes/kubernetes/pull/30522 , volume size ought to be verified before binding a pv & pvc regardless of what's in the pv's claimRef. @thockin
2016-08-21 16:02:14 -07:00
Jordan Liggitt 387f9ea952
Fix data race in PVC Run/Stop methods 2016-08-21 15:15:33 -04:00
Kubernetes Submit Queue 3d7a105d9b Merge pull request #30903 from jingxu97/cherrypick-8-19
Automatic merge from submit-queue

Avoid failure message flush log when node no longer exist

When node is deleted, attach-detach controller cache may contain stale
information of this node, and update node status fails in reconciler
loop. This message easily flush the log file. This PR is just a quick
fix of this issue. More complete fix including make controller cache
up to date will be addressed in another PR.
2016-08-19 15:45:58 -07:00
Kubernetes Submit Queue 6ce405c6ee Merge pull request #27778 from screeley44/k8-vol-executor
Automatic merge from submit-queue

Add Events for operation_executor to show status of mounts, failed/successful to show in describe events

Fixes #27590 
@saad-ali @pmorie @erinboyd

After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy.  The refactoring broke my previous UXP merged PR's that correctly showed failed mount errors in the describe events.  However, Not sure I implemented correctly, but it tested out and seems to be working, let me know what I missed or if this is not the correct approach.

```
Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  2m		2m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned nfs-bb-pod1 to 127.0.0.1
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedMount	Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  38s		38s		1	{kubelet }				Warning		FailedMount	Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32
Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs []
Output: mount.nfs: Connection timed out

Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable.
Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3).
Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity.
```
2016-08-19 08:27:48 -07:00
Jing Xu 70deeb0ae4 node not exist during node status update should not block others
When node is deleted, attach-detach controller cache may contain stale
information of this node, and update node status fails in reconciler
loop. But one node update failure should not block updating other nodes.
Also the warning message easily flush the log file. This PR is just a quick
fix of this issue. More complete fix including make sure controller cache
up to date will be addressed in another PR.
2016-08-18 13:51:30 -07:00
Kubernetes Submit Queue 9d2a5fe5e8 Merge pull request #29006 from jsafrane/dynprov2
Automatic merge from submit-queue

Implement dynamic provisioning (beta) of PersistentVolumes via StorageClass

Implemented according to PR #26908. There are several patches in this PR with one huge code regen inside.

* Please review the API changes (the first patch) carefully, sometimes I don't know what the code is doing...

* `PV.Spec.Class` and `PVC.Spec.Class` is not implemented, use annotation `volume.alpha.kubernetes.io/storage-class`

* See e2e test and integration test changes - Kubernetes won't provision a thing without explicit configuration of at least one `StorageClass` instance!

* Multiple provisioning volume plugins can coexist together, e.g. HostPath and AWS EBS. This is important for Gluster and RBD provisioners in #25026

* Contradicting the proposal, `claim.Selector` and `volume.alpha.kubernetes.io/storage-class` annotation are **not** mutually exclusive. They're both used for matching existing PVs. However, only `volume.alpha.kubernetes.io/storage-class` is used for provisioning, configuration of provisioning with `Selector` is left for (near) future.

* Documentation is missing. Can please someone write some while I am out?

For now, AWS volume plugin accepts classes with these parameters:

```
kind: StorageClass
metadata:
  name: slow
provisionerType: kubernetes.io/aws-ebs
provisionerParameters:
  type: io1
  zone: us-east-1d
  iopsPerGB: 10
```

* parameters are case-insensitive
* `type`: `io1`, `gp2`, `sc1`, `st1`. See AWS docs for details
* `iopsPerGB`: only for `io1` volumes. I/O operations per second per GiB. AWS volume plugin multiplies this with size of requested volume to compute IOPS of the volume and caps it at 20 000 IOPS (maximum supported by AWS, see AWS docs).
* of course, the plugin will use some defaults when a parameter is omitted in a `StorageClass` instance (`gp2` in the same zone as in 1.3).

GCE:

```
apiVersion: extensions/v1beta1
kind: StorageClass
metadata:
  name: slow
provisionerType: kubernetes.io/gce-pd
provisionerParameters:
  type: pd-standard
  zone: us-central1-a
```

* `type`: `pd-standard` or `pd-ssd`
* `zone`: GCE zone
* of course, the plugin will use some defaults when a parameter is omitted in a `StorageClass` instance (SSD in the same zone as in 1.3 ?).


No OpenStack/Cinder yet

@kubernetes/sig-storage
2016-08-18 09:56:16 -07:00
Kubernetes Submit Queue 9696a27aa0 Merge pull request #30737 from saad-ali/fix29358Round2
Automatic merge from submit-queue

Skip safe to detach check if node API object no longer exists

Fixes #29358
2016-08-18 04:00:05 -07:00
Jan Safranek bb5d562f37 Restore alpha behavior 2016-08-18 10:36:50 +02:00
Jan Safranek d8a95a3785 Update matching logic with storage class
- no default StorageClass
- PVC.Spec.Class == nil means the same as PVC.Spec.Class == ""
2016-08-18 10:36:50 +02:00
Jan Safranek 6e4d95f646 Dynamic provisioning V2 controller, provisioners, docs and tests. 2016-08-18 10:36:49 +02:00
Matthew Wong 6486576f56 continue searching on bad size and add tests for bad size&mode 2016-08-17 10:42:52 -04:00
Scott Creeley 782d7d9815 Add Events for operation_executor to show status of mounts, failed or successful 2016-08-17 09:53:47 -04:00
saadali 0c72568247 Skip safe to detach if node api obj doesn't exist 2016-08-16 21:30:51 -07:00
Avesh Agarwal 52a60fe3be Fix default resource limits (node capacities) for downward api volumes 2016-08-16 14:41:17 -04:00
Matthew Wong fe817674ab Don't bind pre-bound pvc & pv if size request not satisfied 2016-08-16 12:24:18 -04:00
Jing Xu f19a1148db This change supports robust kubelet volume cleanup
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
2016-08-15 11:29:15 -07:00
Jan Safranek 3c5364954b Fix PVC.Status.Capacity and AccessModes after binding
Also, fix unit tests to have the same claim and volume sizes in most of the
tests where we don't test matching based on size and test for a specific size
when we do actually test the matching.
2016-08-08 10:45:42 +02:00
Kubernetes Submit Queue 42a12a4cd6 Merge pull request #29978 from hodovska/sharedInformer-fixup
Automatic merge from submit-queue

SharedInformerFactory: usage and fixes

Follow-up for #26709
2016-08-04 09:00:23 -07:00
Dominika Hodovska 816f6d32ca Collapse duplicate informer creation paths 2016-08-04 09:02:13 +02:00
Kubernetes Submit Queue 48bd6368a7 Merge pull request #28777 from jsafrane/volume-unittest-waittest
Automatic merge from submit-queue

Stabilize volume unit tests by waiting for exact state

Wait for specific final state instead of waiting for specific number of
operations in controller unit tests. The tests are more readable and will survive
random goroutine ordering (PV and PVC controller have both their own
goroutine).

@kubernetes/sig-storage
2016-08-03 01:46:23 -07:00
Michal Rostecki 59ca5986dd Print/log pointers of structs with %#v instead of %+v
There are many places in k8s where %+v is used to format a pointer
to struct, which isn't working as expected.

Fixes #26591
2016-08-01 22:27:56 +02:00
Paul Morie de4d193d45 Add note about space-shuttle code style in controller/volume 2016-07-30 14:29:25 -04:00
Paul Morie 8a1baa4d64 Revert "controller/volume: simplify sync logic in syncUnboundClaim"
This reverts commit 9eb2831954.
2016-07-30 14:00:25 -04:00
Paul Morie a6d0dc0529 Revert "controller/volume: simplify sync logic in syncBoundClaim"
This reverts commit 67787caeeb.
2016-07-30 14:00:09 -04:00
k8s-merge-robot 5760acf603 Merge pull request #29596 from matttproud/fix/time-leaks/remainder
Automatic merge from submit-queue

pkg/various: plug leaky time.New{Timer,Ticker}s

According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are uncollectable by garbage collector finalizers.  They
leak until otherwise stopped.  This commit ensures that all remaining
instances are stopped upon departure from their relative scopes.

Similar efforts were incrementally done in #29439 and #29114.

```release-note
* pkg/various: plugged various time.Ticker and time.Timer leaks.
```
2016-07-29 14:06:47 -07:00
Paul Morie c884297990 Fix collisions issues / timeouts for mounts
For non-attachable volumes, do not call GetVolumeName on the plugin and instead
generate a unique name based on the identity of the pod and the name of the volume
within the pod.
2016-07-27 17:53:50 -04:00
Matt T. Proud 5c6292c074 pkg/various: plug leaky time.New{Timer,Ticker}s
According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are uncollectable by garbage collector finalizers.  They
leak until otherwise stopped.  This commit ensures that all remaining
instances are stopped upon departure from their relative scopes.
2016-07-26 06:20:31 +02:00
k8s-merge-robot 696cca21e2 Merge pull request #28813 from xiang90/pv_1
Automatic merge from submit-queue

controller/volume: simplify sync logic in syncBoundClaim

Remove all unnecessary branchings.
2016-07-23 00:51:49 -07:00
saadali 89fd358c52 Assume volume detached if node doesn't exist
Fixes #29358
2016-07-22 22:07:32 -07:00
k8s-merge-robot 99e24da2ff Merge pull request #29077 from saad-ali/fixIssue29051NamespaceDeletion
Automatic merge from submit-queue

Fix "PVC Volume not detached if pod deleted via namespace deletion" issue

Fixes #29051: "PVC Volume not detached if pod deleted via namespace deletion"

This PR:
* Fixes a bug in `desired_state_of_the_world_populator.go` to check the value of `exists` returned by the `podInformer` so that it can delete pods even if the delete event is missed (or fails).
* Reduces the desired state of the world populators sleep period from 5 min to 1 min (reducing the amount of time a volume would remain attached if a volume delete event is missed or fails).
2016-07-20 20:40:32 -07:00
saadali afd8a58e5c Reduce DSW populator sleep period from 5 min to 1 2016-07-20 01:03:04 -07:00
saadali d210c2231f Check pod exist in attach controller DSW populator
Fix bug in desired_state_of_the_world_populator.go to check exists so
that it can delete pods even if the delete event is missed (or fails)
2016-07-20 01:03:04 -07:00
saadali 88d495026d Allow mounts to run in parallel for non-attachable
Allow mount volume operations to run in parallel for non-attachable
volume plugins.

Allow unmount volume operations to run in parallel for all volume
plugins.
2016-07-19 21:54:26 -07:00
k8s-merge-robot 2125c0eb62 Merge pull request #28811 from xiang90/pv
Automatic merge from submit-queue

controller/volume: simplify sync logic in syncUnboundClaim

Remove all unnecessary branching logic. No actual logic changes. Code is more readable now.
2016-07-12 02:28:05 -07:00
Xiang Li 67787caeeb controller/volume: simplify sync logic in syncBoundClaim 2016-07-11 19:36:36 -07:00
Xiang Li 9eb2831954 controller/volume: simplify sync logic in syncUnboundClaim 2016-07-11 19:22:14 -07:00
k8s-merge-robot 7b067c859f Merge pull request #26387 from MHBauer/cleanupjitter
Automatic merge from submit-queue

close channel to prevent buildup of wait.JitterUntil()

<!--
Checklist for submitting a Pull Request

Please remove this comment block before submitting.

1. Please read our [contributor guidelines](https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md).
2. See our [developer guide](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md).
3. If you want this PR to automatically close an issue when it is merged,
   add `fixes #<issue number>` or `fixes #<issue number>, fixes #<issue number>`
   to close multiple issues (see: https://github.com/blog/1506-closing-issues-via-pull-requests).
4. Follow the instructions for [labeling and writing a release note for this PR](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes) in the block below.
-->

Trying to look at flake in #26377 by running the test with large counts of runs. It was timing out because a `wait.JitterUntil` goroutine builds up for each of the four tests. So if you ran it a thousand times, you would end up with 4k goroutines spinning in the background. Now I create a channel and close it at the end of each test to prevent a memory leak.
2016-07-11 18:53:39 -07:00
Jan Safranek 71b75d593e Stabilize volume unit tests by waiting for exact state
Wait for specific final state instead of waiting for specific number of
operations in volume unit tests. The tests are more readable and will survive
random goroutine ordering (PV and PVC controller have both their own
goroutine).
2016-07-11 15:35:01 +02:00
Morgan Bauer 69719167a3
close channel to prevent memory leak
- wait.JitterUntil goroutine is never cleaned up when used with wait.NeverStop
 - fixup comment
2016-07-06 09:34:20 -07:00
bin liu 426fdc431a Merge branch 'master' into fix-typos 2016-07-04 11:20:47 +08:00
saadali 0dd17fff22 Reorganize volume controllers and manager 2016-07-01 18:50:25 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
saadali e06b32b1ef Mark VolumeInUse before checking if it is Attached
Ensure that kublet marks VolumeInUse before checking if it is Attached.
Also ensures that the attach/detach controller always fetches a fresh
copy of the node object before detach (instead ofKubelet relying on node
informer cache).
2016-06-28 14:05:59 -07:00
saadali dfe8e606c1 Fix device path used by volume WaitForAttach 2016-06-22 12:56:58 -07:00
saadali 773ac20880 Prevent detach before node status update 2016-06-22 04:45:50 -07:00
Jing Xu 0fefb23f94 implement desiredWorld populator to sync up with informer
This change implements the desiredStateOfWorld populator to sync up with
the pod informer. It periodically check each pod in the
desiredStateOfworld and verify whether it is still in pod informer
cache. If it not, remove it from the desiredStateOfWorld
2016-06-21 17:09:35 -07:00
saadali e716ddc771 Controller wait for attach and exponential backoff
Modify attach/detach controller to keep track of volumes to report
attached in Node VolumeToAttach status.

Modify kubelet volume manager to wait for volume to show up in Node
VolumeToAttach status.

Implement exponential backoff for errors in volume manager and attach
detach controller
2016-06-20 18:19:55 -07:00
goltermann 218645b346 Fix several spelling errors in comments. 2016-06-17 10:41:18 -07:00
saadali 542f2dc708 Introduce new kubelet volume manager
This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if attach/detach controller
is not enabled).

This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume clean up never blocks the
`syncPod` loop.
2016-06-15 09:34:08 -07:00
saadali 9b6a505f8a Rename UniqueDeviceName to UniqueVolumeName
Rename UniqueDeviceName to UniqueVolumeName and move helper functions
from attacherdetacher to volumehelper package.
Introduce UniquePodName alias
2016-06-15 09:32:12 -07:00
Saad Ali 9dbe943491 Attach/Detach Controller Kubelet Changes
This PR contains Kubelet changes to enable attach/detach controller control.
* It introduces a new "enable-controller-attach-detach" kubelet flag to
  enable control by controller. Default enabled.
* It removes all references "SafeToDetach" annoation from controller.
* It adds the new VolumesInUse field to the Node Status API object.
* It modifies the controller to use VolumesInUse instead of SafeToDetach
  annotation to gate detachment.
* There is a bug in node-problem-detector that causes VolumesInUse to
  get reset every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9
  opened to fix that.
2016-06-02 16:47:11 -07:00
saadali 3c345abafd Fix DATA RACE in unit tests: reconciler_test.go 2016-05-27 01:19:25 -07:00