github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Justin Santa Barbara	54195d590f	Use strongly-typed types.NodeName for a node name We had another bug where we confused the hostname with the NodeName. To avoid this happening again, and to make the code more self-documenting, we use types.NodeName (a typedef alias for string) whenever we are referring to the Node.Name. A tedious but mechanical commit therefore, to change all uses of the node name to use types.NodeName Also clean up some of the (many) places where the NodeName is referred to as a hostname (not true on AWS), or an instanceID (not true on GCE), etc.	2016-09-27 10:47:31 -04:00
Jing Xu	efaceb28cc	Fix race condition in updating attached volume between master and node This PR tries to fix issue #29324. This cause of this issue is a race condition happens when marking volumes as attached for node status. This PR tries to clean up the logic of when and where to mark volumes as attached/detached. Basically the workflow as follows, 1. When volume is attached sucessfully, the volume and node info is added into nodesToUpdateStatusFor to mark the volume as attached to the node. 2. When detach request comes in, it will check whether it is safe to detach now. If the check passes, remove the volume from volumesToReportAsAttached to indicate the volume is no longer considered as attached now. Afterwards, reconciler tries to update node status and trigger detach operation. If any of these operation fails, the volume is added back to the volumesToReportAsAttached list showing that it is still attached. These steps should make sure that kubelet get the right (might be outdated) information about which volume is attached or not. It also garantees that if detach operation is pending, kubelet should not trigger any mount operations.	2016-09-12 13:51:08 -07:00
Jing Xu	b9157b7524	Post event message for volume attachment This PR is to add event message when attaching volume fails to help users to debug. For detach failure, may address in a different PR since it requires more data structure change.	2016-09-01 16:24:36 -07:00
Kubernetes Submit Queue	6ce405c6ee	Merge pull request #27778 from screeley44/k8-vol-executor Automatic merge from submit-queue Add Events for operation_executor to show status of mounts, failed/successful to show in describe events Fixes #27590 @saad-ali @pmorie @erinboyd After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy. The refactoring broke my previous UXP merged PR's that correctly showed failed mount errors in the describe events. However, Not sure I implemented correctly, but it tested out and seems to be working, let me know what I missed or if this is not the correct approach. ``` Events: FirstSeen LastSeen Count From SubobjectPath Type Reason Message --------- -------- ----- ---- ------------- -------- ------ ------- 2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned nfs-bb-pod1 to 127.0.0.1 44s 44s 1 {kubelet 127.0.0.1} Warning FailedMount Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol] 44s 44s 1 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol] 38s 38s 1 {kubelet } Warning FailedMount Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32 Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs [] Output: mount.nfs: Connection timed out Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable. Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3). Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity. ```	2016-08-19 08:27:48 -07:00
Scott Creeley	782d7d9815	Add Events for operation_executor to show status of mounts, failed or successful	2016-08-17 09:53:47 -04:00
saadali	0c72568247	Skip safe to detach if node api obj doesn't exist	2016-08-16 21:30:51 -07:00
Jing Xu	f19a1148db	This change supports robust kubelet volume cleanup Currently kubelet volume management works on the concept of desired and actual world of states. The volume manager periodically compares the two worlds and perform volume mount/unmount and/or attach/detach operations. When kubelet restarts, the cache of those two worlds are gone. Although desired world can be recovered through apiserver, actual world can not be recovered which may cause some volumes cannot be cleaned up if their information is deleted by apiserver. This change adds the reconstruction of the actual world by reading the pod directories from disk. The reconstructed volume information is added to both desired world and actual world if it cannot be found in either world. The rest logic would be as same as before, desired world populator may clean up the volume entry if it is no longer in apiserver, and then volume manager should invoke unmount to clean it up.	2016-08-15 11:29:15 -07:00
Paul Morie	c884297990	Fix collisions issues / timeouts for mounts For non-attachable volumes, do not call GetVolumeName on the plugin and instead generate a unique name based on the identity of the pod and the name of the volume within the pod.	2016-07-27 17:53:50 -04:00
saadali	88d495026d	Allow mounts to run in parallel for non-attachable Allow mount volume operations to run in parallel for non-attachable volume plugins. Allow unmount volume operations to run in parallel for all volume plugins.	2016-07-19 21:54:26 -07:00
Cindy Wang	e13c678e3b	Make volume unmount more robust using exclusive mount w/ O_EXCL	2016-07-18 16:20:08 -07:00
saadali	0dd17fff22	Reorganize volume controllers and manager	2016-07-01 18:50:25 -07:00
David McMahon	ef0c9f0c5b	Remove "All rights reserved" from all the headers.	2016-06-29 17:47:36 -07:00
saadali	e06b32b1ef	Mark VolumeInUse before checking if it is Attached Ensure that kublet marks VolumeInUse before checking if it is Attached. Also ensures that the attach/detach controller always fetches a fresh copy of the node object before detach (instead ofKubelet relying on node informer cache).	2016-06-28 14:05:59 -07:00
saadali	dfe8e606c1	Fix device path used by volume WaitForAttach	2016-06-22 12:56:58 -07:00
saadali	e716ddc771	Controller wait for attach and exponential backoff Modify attach/detach controller to keep track of volumes to report attached in Node VolumeToAttach status. Modify kubelet volume manager to wait for volume to show up in Node VolumeToAttach status. Implement exponential backoff for errors in volume manager and attach detach controller	2016-06-20 18:19:55 -07:00
saadali	d72f88bf3a	Modify Attach method to return device path	2016-06-19 23:54:02 -07:00
saadali	542f2dc708	Introduce new kubelet volume manager This commit adds a new volume manager in kubelet that synchronizes volume mount/unmount (and attach/detach, if attach/detach controller is not enabled). This eliminates the race conditions between the pod creation loop and the orphaned volumes loops. It also removes the unmount/detach from the `syncPod()` path so volume clean up never blocks the `syncPod` loop.	2016-06-15 09:34:08 -07:00

1 2 3 4

167 Commits (750881c0ab62a9f59d53582bc33826caa167aa83)