This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
Automatic merge from submit-queue
Fix mount collision timeout issue
Short- or medium-term workaround for #29555. The root issue being fixed here is that the recent attach/detach work in the kubelet uses a unique volume name as a key that tracks the work that has to be done for each volume in a pod to attach/mount/umount/detach. However, the non-attachable volume plugins do not report unique names for themselves, which causes collisions when a single secret or configmap is mounted multiple times in a pod.
This is still a WIP -- I need to add a couple E2E tests that ensure that tests break in the future if there is a regression -- but posting for early review.
cc @kubernetes/sig-storage
Ultimately, I would like to refine this a bit further. A couple things I would like to change:
1. `GetUniqueVolumeName` should be a property ONLY of attachable volumes
2. I would like to see the kubelet apparatus for attach/mount/umount/detach handle non-attachable volumes specifically to avoid things like the `WaitForControllerAttach` call that has to be done for those volume types now
For non-attachable volumes, do not call GetVolumeName on the plugin and instead
generate a unique name based on the identity of the pod and the name of the volume
within the pod.
This fixes race conditions in configmap, secret, downwardapi & git_repo
volume plugins.
wrappedVolumeSpec vars used by volume mounters and unmounters contained
a pointer to api.Volume structs which were being patched by
NewWrapperMounter/NewWrapperUnmounter, causing race condition during
volume mounts.
This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if attach/detach controller
is not enabled).
This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume clean up never blocks the
`syncPod` loop.
- Add volume.MetricsProvider function to Volume interface.
- Add volume.MetricsDu for providing metrics via executing "du".
- Add volulme.MetricsNil for unsupported Volumes.