Commit Graph

57 Commits (52492e683f51685cb5a702ff0c4b8272b853f0a1)

Author SHA1 Message Date
Mike Danese a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Jing Xu efaceb28cc Fix race condition in updating attached volume between master and node
This PR tries to fix issue #29324. This cause of this issue is a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 13:51:08 -07:00
Kubernetes Submit Queue 6ce405c6ee Merge pull request #27778 from screeley44/k8-vol-executor
Automatic merge from submit-queue

Add Events for operation_executor to show status of mounts, failed/successful to show in describe events

Fixes #27590 
@saad-ali @pmorie @erinboyd

After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy.  The refactoring broke my previous UXP merged PR's that correctly showed failed mount errors in the describe events.  However, Not sure I implemented correctly, but it tested out and seems to be working, let me know what I missed or if this is not the correct approach.

```
Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  2m		2m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned nfs-bb-pod1 to 127.0.0.1
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedMount	Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  38s		38s		1	{kubelet }				Warning		FailedMount	Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32
Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs []
Output: mount.nfs: Connection timed out

Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable.
Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3).
Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity.
```
2016-08-19 08:27:48 -07:00
Scott Creeley 782d7d9815 Add Events for operation_executor to show status of mounts, failed or successful 2016-08-17 09:53:47 -04:00
Avesh Agarwal 52a60fe3be Fix default resource limits (node capacities) for downward api volumes 2016-08-16 14:41:17 -04:00
saadali afd8a58e5c Reduce DSW populator sleep period from 5 min to 1 2016-07-20 01:03:04 -07:00
saadali 0dd17fff22 Reorganize volume controllers and manager 2016-07-01 18:50:25 -07:00