Commit Graph

177 Commits (6d5b2ef49e7f2f1dad51ec077a66b536e5329350)

Author SHA1 Message Date
Kubernetes Submit Queue 0a4316f11e Merge pull request #32807 from jingxu97/stateupdateNeeded-9-15
Automatic merge from submit-queue

Fix race condition in setting node statusUpdateNeeded flag

This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.
2016-09-23 11:25:16 -07:00
Jing Xu 14cad206f5 Fix race conditino in setting node statusUpdateNeeded flag
This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.

This PR also changes a unit test due to the workflow changes
2016-09-22 14:02:30 -07:00
Kubernetes Submit Queue e9f4db2748 Merge pull request #27714 from jsafrane/event-recycle
Automatic merge from submit-queue

Send recycle events from pod to pv.

This allows users to diagnose what's wrong with recycler. Recycler pods are started automatically with a cryptic name and they are deleted immediately when they finish.

e.g, `kubectl describe pv` could show that NFS cannot be mounted (and how many pods have tried it):

```
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason          Message
  ---------     --------        -----   ----                            -------------   --------        ------          -------
  59m           59m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(5421800e-347b-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  53m           53m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(3c9809e5-347c-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  46m           46m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(250dd2a2-347d-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  40m           40m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(0d84ea33-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  33m           33m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(f5fb63bf-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  27m           27m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(de7128fd-347f-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  1h            3m              75      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Successfully assigned recycler-for-nfs to 127.0.0.1
  1h            3m              76      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Pod was active on the node longer than specified deadline
  1h            1m              12      {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  20m           1m              4       {persistentvolume-controller }                  Warning         RecyclerPod     (events with common reason combined)
```

These steps were necessary:

- added event watcher to volume.RecycleVolumeByWatchingPodUntilCompletion
- pass all these events through volume plugins to volume controller
- rework volume.RecycleVolumeByWatchingPodUntilCompletion unit tests to a table (too much copy-paste)
- fix all unit tests along the way
2016-09-22 12:18:53 -07:00
Mike Danese a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Kubernetes Submit Queue 843d7cd24c Merge pull request #32576 from wongma7/revert-30825-pv-controller-informer
Automatic merge from submit-queue

Revert "Use PV shared informer in PV controller"

Fixes #32497 

Reverts kubernetes/kubernetes#30825
2016-09-15 04:37:29 -07:00
Jan Safranek 9903b389b3 Update cloud providers 2016-09-15 10:33:57 +02:00
Jan Safranek a24e6a90bd Add new error 2016-09-15 09:39:30 +02:00
Matthew Wong 25e9b9dcf9 Revert "Use PV shared informer in PV controller" 2016-09-13 10:12:34 -04:00
Jan Safranek 3eae8c9022 Do not report warning event when an unknown deleter is requested
When Kubernetes does not have a plugin to delete a PV it should wait for
either external deleter or storage admin to delete the volume instead of
throwing an error.

Related to #32077
2016-09-13 10:39:45 +02:00
Kubernetes Submit Queue 6a9a93d469 Merge pull request #32242 from jingxu97/bug-wrongvolume-9-2
Automatic merge from submit-queue

Fix race condition in updating attached volume between master and node

This PR tries to fix issue #29324. The cause of this issue is that a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 15:29:38 -07:00
Jing Xu efaceb28cc Fix race condition in updating attached volume between master and node
This PR tries to fix issue #29324. This cause of this issue is a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 13:51:08 -07:00
Kubernetes Submit Queue 17f82069bb Merge pull request #30825 from wongma7/pv-controller-informer
Automatic merge from submit-queue

Use PV shared informer in PV controller

Use the PV shared informer, addressing (partially) https://github.com/kubernetes/kubernetes/issues/26247 . Using the PVC shared informer is not so simple because sometimes the controller wants to `Requeue` and...
2016-09-10 12:40:30 -07:00
Jan Safranek d7111b282f Send recycle events from pod to pv.
This allows users to diagnose what's wrong with recycler. Recycler pods are
started automatically with a cryptic name and they are deleted immediately
when they finish.

kubectl describe pods will show:

  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason          Message
  ---------     --------        -----   ----                            -------------   --------        ------          -------
  59m           59m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(5421800e-347b-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  53m           53m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(3c9809e5-347c-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  46m           46m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(250dd2a2-347d-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  40m           40m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(0d84ea33-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  33m           33m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(f5fb63bf-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  27m           27m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(de7128fd-347f-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  1h            3m              75      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Successfully assigned recycler-for-nfs to 127.0.0.1
  1h            3m              76      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Pod was active on the node longer than specified deadline
  1h            1m              12      {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  20m           1m              4       {persistentvolume-controller }                  Warning         RecyclerPod     (events with common reason combined)


These steps were necessary:

- added event watcher to volume.RecycleVolumeByWatchingPodUntilCompletion

- pass all these events through volume plugins to volume controller

- rework volume.RecycleVolumeByWatchingPodUntilCompletion unit tests to a table
  (too much copy-paste)

- fix all unit tests along the way
2016-09-08 12:57:57 +02:00
Jan Safranek 3a2f4e52a8 Do not report warning event when an nknown provisioner is requested
with StorageClass.Provisioner == <unknown plugin>, we should wait for
either external provisioner or volume admin to provide a PV for a claim
instead of reporting an error.

Fixes #31723
2016-09-07 09:11:41 +02:00
deads2k cd5b6cc491 move StorageClass to its own group 2016-09-06 08:41:17 -04:00
Kubernetes Submit Queue d532bfc63c Merge pull request #31885 from better0332/master
Automatic merge from submit-queue

fix deleteClaim
2016-09-04 00:40:50 -07:00
Kubernetes Submit Queue aad5c66792 Merge pull request #31837 from jingxu97/recorder
Automatic merge from submit-queue

Post event message for volume attachment

This PR is to add event message when attaching volume fails to help
users to debug. For detach failure, may address in a different PR since
it requires more data structure change.
2016-09-01 23:30:57 -07:00
Jing Xu b9157b7524 Post event message for volume attachment
This PR is to add event message when attaching volume fails to help
users to debug. For detach failure, may address in a different PR since
it requires more data structure change.
2016-09-01 16:24:36 -07:00
better88 041beadcc8 fix deleteClaim
`ok` is not in same variable socpe
like https://github.com/kubernetes/kubernetes/pull/31416
2016-09-01 23:26:38 +08:00
Matthew Wong 1d6dbdd9d2 Use PV shared informer in PV controller 2016-08-25 21:55:23 -04:00
better0332 524f0da769 fix deleteVolume
`ok` is not in same variable socpe
2016-08-25 15:26:18 +08:00
Kubernetes Submit Queue 1def4a0458 Merge pull request #30690 from wongma7/claimref-capacity
Automatic merge from submit-queue

Don't bind pre-bound pvc & pv if size request not satisfied

as discussed briefly here https://github.com/kubernetes/kubernetes/pull/30522 , volume size ought to be verified before binding a pv & pvc regardless of what's in the pv's claimRef. @thockin
2016-08-21 16:02:14 -07:00
Jordan Liggitt 387f9ea952
Fix data race in PVC Run/Stop methods 2016-08-21 15:15:33 -04:00
Kubernetes Submit Queue 3d7a105d9b Merge pull request #30903 from jingxu97/cherrypick-8-19
Automatic merge from submit-queue

Avoid failure message flush log when node no longer exist

When node is deleted, attach-detach controller cache may contain stale
information of this node, and update node status fails in reconciler
loop. This message easily flush the log file. This PR is just a quick
fix of this issue. More complete fix including make controller cache
up to date will be addressed in another PR.
2016-08-19 15:45:58 -07:00
Kubernetes Submit Queue 6ce405c6ee Merge pull request #27778 from screeley44/k8-vol-executor
Automatic merge from submit-queue

Add Events for operation_executor to show status of mounts, failed/successful to show in describe events

Fixes #27590 
@saad-ali @pmorie @erinboyd

After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy.  The refactoring broke my previous UXP merged PR's that correctly showed failed mount errors in the describe events.  However, Not sure I implemented correctly, but it tested out and seems to be working, let me know what I missed or if this is not the correct approach.

```
Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  2m		2m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned nfs-bb-pod1 to 127.0.0.1
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedMount	Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  38s		38s		1	{kubelet }				Warning		FailedMount	Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32
Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs []
Output: mount.nfs: Connection timed out

Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable.
Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3).
Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity.
```
2016-08-19 08:27:48 -07:00
Jing Xu 70deeb0ae4 node not exist during node status update should not block others
When node is deleted, attach-detach controller cache may contain stale
information of this node, and update node status fails in reconciler
loop. But one node update failure should not block updating other nodes.
Also the warning message easily flush the log file. This PR is just a quick
fix of this issue. More complete fix including make sure controller cache
up to date will be addressed in another PR.
2016-08-18 13:51:30 -07:00
Kubernetes Submit Queue 9d2a5fe5e8 Merge pull request #29006 from jsafrane/dynprov2
Automatic merge from submit-queue

Implement dynamic provisioning (beta) of PersistentVolumes via StorageClass

Implemented according to PR #26908. There are several patches in this PR with one huge code regen inside.

* Please review the API changes (the first patch) carefully, sometimes I don't know what the code is doing...

* `PV.Spec.Class` and `PVC.Spec.Class` is not implemented, use annotation `volume.alpha.kubernetes.io/storage-class`

* See e2e test and integration test changes - Kubernetes won't provision a thing without explicit configuration of at least one `StorageClass` instance!

* Multiple provisioning volume plugins can coexist together, e.g. HostPath and AWS EBS. This is important for Gluster and RBD provisioners in #25026

* Contradicting the proposal, `claim.Selector` and `volume.alpha.kubernetes.io/storage-class` annotation are **not** mutually exclusive. They're both used for matching existing PVs. However, only `volume.alpha.kubernetes.io/storage-class` is used for provisioning, configuration of provisioning with `Selector` is left for (near) future.

* Documentation is missing. Can please someone write some while I am out?

For now, AWS volume plugin accepts classes with these parameters:

```
kind: StorageClass
metadata:
  name: slow
provisionerType: kubernetes.io/aws-ebs
provisionerParameters:
  type: io1
  zone: us-east-1d
  iopsPerGB: 10
```

* parameters are case-insensitive
* `type`: `io1`, `gp2`, `sc1`, `st1`. See AWS docs for details
* `iopsPerGB`: only for `io1` volumes. I/O operations per second per GiB. AWS volume plugin multiplies this with size of requested volume to compute IOPS of the volume and caps it at 20 000 IOPS (maximum supported by AWS, see AWS docs).
* of course, the plugin will use some defaults when a parameter is omitted in a `StorageClass` instance (`gp2` in the same zone as in 1.3).

GCE:

```
apiVersion: extensions/v1beta1
kind: StorageClass
metadata:
  name: slow
provisionerType: kubernetes.io/gce-pd
provisionerParameters:
  type: pd-standard
  zone: us-central1-a
```

* `type`: `pd-standard` or `pd-ssd`
* `zone`: GCE zone
* of course, the plugin will use some defaults when a parameter is omitted in a `StorageClass` instance (SSD in the same zone as in 1.3 ?).


No OpenStack/Cinder yet

@kubernetes/sig-storage
2016-08-18 09:56:16 -07:00
Kubernetes Submit Queue 9696a27aa0 Merge pull request #30737 from saad-ali/fix29358Round2
Automatic merge from submit-queue

Skip safe to detach check if node API object no longer exists

Fixes #29358
2016-08-18 04:00:05 -07:00
Jan Safranek bb5d562f37 Restore alpha behavior 2016-08-18 10:36:50 +02:00
Jan Safranek d8a95a3785 Update matching logic with storage class
- no default StorageClass
- PVC.Spec.Class == nil means the same as PVC.Spec.Class == ""
2016-08-18 10:36:50 +02:00
Jan Safranek 6e4d95f646 Dynamic provisioning V2 controller, provisioners, docs and tests. 2016-08-18 10:36:49 +02:00
Matthew Wong 6486576f56 continue searching on bad size and add tests for bad size&mode 2016-08-17 10:42:52 -04:00
Scott Creeley 782d7d9815 Add Events for operation_executor to show status of mounts, failed or successful 2016-08-17 09:53:47 -04:00
saadali 0c72568247 Skip safe to detach if node api obj doesn't exist 2016-08-16 21:30:51 -07:00
Avesh Agarwal 52a60fe3be Fix default resource limits (node capacities) for downward api volumes 2016-08-16 14:41:17 -04:00
Matthew Wong fe817674ab Don't bind pre-bound pvc & pv if size request not satisfied 2016-08-16 12:24:18 -04:00
Jing Xu f19a1148db This change supports robust kubelet volume cleanup
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
2016-08-15 11:29:15 -07:00
Jan Safranek 3c5364954b Fix PVC.Status.Capacity and AccessModes after binding
Also, fix unit tests to have the same claim and volume sizes in most of the
tests where we don't test matching based on size and test for a specific size
when we do actually test the matching.
2016-08-08 10:45:42 +02:00
Kubernetes Submit Queue 42a12a4cd6 Merge pull request #29978 from hodovska/sharedInformer-fixup
Automatic merge from submit-queue

SharedInformerFactory: usage and fixes

Follow-up for #26709
2016-08-04 09:00:23 -07:00
Dominika Hodovska 816f6d32ca Collapse duplicate informer creation paths 2016-08-04 09:02:13 +02:00
Kubernetes Submit Queue 48bd6368a7 Merge pull request #28777 from jsafrane/volume-unittest-waittest
Automatic merge from submit-queue

Stabilize volume unit tests by waiting for exact state

Wait for specific final state instead of waiting for specific number of
operations in controller unit tests. The tests are more readable and will survive
random goroutine ordering (PV and PVC controller have both their own
goroutine).

@kubernetes/sig-storage
2016-08-03 01:46:23 -07:00
Michal Rostecki 59ca5986dd Print/log pointers of structs with %#v instead of %+v
There are many places in k8s where %+v is used to format a pointer
to struct, which isn't working as expected.

Fixes #26591
2016-08-01 22:27:56 +02:00
Paul Morie de4d193d45 Add note about space-shuttle code style in controller/volume 2016-07-30 14:29:25 -04:00
Paul Morie 8a1baa4d64 Revert "controller/volume: simplify sync logic in syncUnboundClaim"
This reverts commit 9eb2831954.
2016-07-30 14:00:25 -04:00
Paul Morie a6d0dc0529 Revert "controller/volume: simplify sync logic in syncBoundClaim"
This reverts commit 67787caeeb.
2016-07-30 14:00:09 -04:00
k8s-merge-robot 5760acf603 Merge pull request #29596 from matttproud/fix/time-leaks/remainder
Automatic merge from submit-queue

pkg/various: plug leaky time.New{Timer,Ticker}s

According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are uncollectable by garbage collector finalizers.  They
leak until otherwise stopped.  This commit ensures that all remaining
instances are stopped upon departure from their relative scopes.

Similar efforts were incrementally done in #29439 and #29114.

```release-note
* pkg/various: plugged various time.Ticker and time.Timer leaks.
```
2016-07-29 14:06:47 -07:00
Paul Morie c884297990 Fix collisions issues / timeouts for mounts
For non-attachable volumes, do not call GetVolumeName on the plugin and instead
generate a unique name based on the identity of the pod and the name of the volume
within the pod.
2016-07-27 17:53:50 -04:00
Matt T. Proud 5c6292c074 pkg/various: plug leaky time.New{Timer,Ticker}s
According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are uncollectable by garbage collector finalizers.  They
leak until otherwise stopped.  This commit ensures that all remaining
instances are stopped upon departure from their relative scopes.
2016-07-26 06:20:31 +02:00
k8s-merge-robot 696cca21e2 Merge pull request #28813 from xiang90/pv_1
Automatic merge from submit-queue

controller/volume: simplify sync logic in syncBoundClaim

Remove all unnecessary branchings.
2016-07-23 00:51:49 -07:00
saadali 89fd358c52 Assume volume detached if node doesn't exist
Fixes #29358
2016-07-22 22:07:32 -07:00
k8s-merge-robot 99e24da2ff Merge pull request #29077 from saad-ali/fixIssue29051NamespaceDeletion
Automatic merge from submit-queue

Fix "PVC Volume not detached if pod deleted via namespace deletion" issue

Fixes #29051: "PVC Volume not detached if pod deleted via namespace deletion"

This PR:
* Fixes a bug in `desired_state_of_the_world_populator.go` to check the value of `exists` returned by the `podInformer` so that it can delete pods even if the delete event is missed (or fails).
* Reduces the desired state of the world populators sleep period from 5 min to 1 min (reducing the amount of time a volume would remain attached if a volume delete event is missed or fails).
2016-07-20 20:40:32 -07:00
saadali afd8a58e5c Reduce DSW populator sleep period from 5 min to 1 2016-07-20 01:03:04 -07:00
saadali d210c2231f Check pod exist in attach controller DSW populator
Fix bug in desired_state_of_the_world_populator.go to check exists so
that it can delete pods even if the delete event is missed (or fails)
2016-07-20 01:03:04 -07:00
saadali 88d495026d Allow mounts to run in parallel for non-attachable
Allow mount volume operations to run in parallel for non-attachable
volume plugins.

Allow unmount volume operations to run in parallel for all volume
plugins.
2016-07-19 21:54:26 -07:00
k8s-merge-robot 2125c0eb62 Merge pull request #28811 from xiang90/pv
Automatic merge from submit-queue

controller/volume: simplify sync logic in syncUnboundClaim

Remove all unnecessary branching logic. No actual logic changes. Code is more readable now.
2016-07-12 02:28:05 -07:00
Xiang Li 67787caeeb controller/volume: simplify sync logic in syncBoundClaim 2016-07-11 19:36:36 -07:00
Xiang Li 9eb2831954 controller/volume: simplify sync logic in syncUnboundClaim 2016-07-11 19:22:14 -07:00
k8s-merge-robot 7b067c859f Merge pull request #26387 from MHBauer/cleanupjitter
Automatic merge from submit-queue

close channel to prevent buildup of wait.JitterUntil()

<!--
Checklist for submitting a Pull Request

Please remove this comment block before submitting.

1. Please read our [contributor guidelines](https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md).
2. See our [developer guide](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md).
3. If you want this PR to automatically close an issue when it is merged,
   add `fixes #<issue number>` or `fixes #<issue number>, fixes #<issue number>`
   to close multiple issues (see: https://github.com/blog/1506-closing-issues-via-pull-requests).
4. Follow the instructions for [labeling and writing a release note for this PR](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes) in the block below.
-->

Trying to look at flake in #26377 by running the test with large counts of runs. It was timing out because a `wait.JitterUntil` goroutine builds up for each of the four tests. So if you ran it a thousand times, you would end up with 4k goroutines spinning in the background. Now I create a channel and close it at the end of each test to prevent a memory leak.
2016-07-11 18:53:39 -07:00
Jan Safranek 71b75d593e Stabilize volume unit tests by waiting for exact state
Wait for specific final state instead of waiting for specific number of
operations in volume unit tests. The tests are more readable and will survive
random goroutine ordering (PV and PVC controller have both their own
goroutine).
2016-07-11 15:35:01 +02:00
Morgan Bauer 69719167a3
close channel to prevent memory leak
- wait.JitterUntil goroutine is never cleaned up when used with wait.NeverStop
 - fixup comment
2016-07-06 09:34:20 -07:00
bin liu 426fdc431a Merge branch 'master' into fix-typos 2016-07-04 11:20:47 +08:00
saadali 0dd17fff22 Reorganize volume controllers and manager 2016-07-01 18:50:25 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
saadali e06b32b1ef Mark VolumeInUse before checking if it is Attached
Ensure that kublet marks VolumeInUse before checking if it is Attached.
Also ensures that the attach/detach controller always fetches a fresh
copy of the node object before detach (instead ofKubelet relying on node
informer cache).
2016-06-28 14:05:59 -07:00
saadali dfe8e606c1 Fix device path used by volume WaitForAttach 2016-06-22 12:56:58 -07:00
saadali 773ac20880 Prevent detach before node status update 2016-06-22 04:45:50 -07:00
Jing Xu 0fefb23f94 implement desiredWorld populator to sync up with informer
This change implements the desiredStateOfWorld populator to sync up with
the pod informer. It periodically check each pod in the
desiredStateOfworld and verify whether it is still in pod informer
cache. If it not, remove it from the desiredStateOfWorld
2016-06-21 17:09:35 -07:00
saadali e716ddc771 Controller wait for attach and exponential backoff
Modify attach/detach controller to keep track of volumes to report
attached in Node VolumeToAttach status.

Modify kubelet volume manager to wait for volume to show up in Node
VolumeToAttach status.

Implement exponential backoff for errors in volume manager and attach
detach controller
2016-06-20 18:19:55 -07:00
goltermann 218645b346 Fix several spelling errors in comments. 2016-06-17 10:41:18 -07:00
saadali 542f2dc708 Introduce new kubelet volume manager
This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if attach/detach controller
is not enabled).

This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume clean up never blocks the
`syncPod` loop.
2016-06-15 09:34:08 -07:00
saadali 9b6a505f8a Rename UniqueDeviceName to UniqueVolumeName
Rename UniqueDeviceName to UniqueVolumeName and move helper functions
from attacherdetacher to volumehelper package.
Introduce UniquePodName alias
2016-06-15 09:32:12 -07:00
Saad Ali 9dbe943491 Attach/Detach Controller Kubelet Changes
This PR contains Kubelet changes to enable attach/detach controller control.
* It introduces a new "enable-controller-attach-detach" kubelet flag to
  enable control by controller. Default enabled.
* It removes all references "SafeToDetach" annoation from controller.
* It adds the new VolumesInUse field to the Node Status API object.
* It modifies the controller to use VolumesInUse instead of SafeToDetach
  annotation to gate detachment.
* There is a bug in node-problem-detector that causes VolumesInUse to
  get reset every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9
  opened to fix that.
2016-06-02 16:47:11 -07:00
saadali 3c345abafd Fix DATA RACE in unit tests: reconciler_test.go 2016-05-27 01:19:25 -07:00
saadali 92500a20d7 Attach detach controller business logic added
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
2016-05-24 23:01:16 -07:00
Ed Robinson afdbad078a
Corrects some misspellings in comments
This should help to make
https://goreportcard.com/report/k8s.io/kubernetes#misspell
look a little nicer.
2016-05-11 08:16:13 +01:00
saadali 214b4c28bc Skeleton of new attach detach controller 2016-05-09 11:34:11 -07:00
saadali 71302d1163 Add data structure for storing attach detach controller state. 2016-05-03 14:11:10 -07:00