mirror of https://github.com/k3s-io/k3s
Automatic merge from submit-queue

kubelet: storage: don't hang kubelet on unresponsive nfs

Fixes #31272

Currently, due to the nature of NFS, an unresponsive NFS volume in a pod can wedge the kubelet so that no additional pods can be run.

The discussion of this issue so far has focused on wrapping `lstat`, the syscall that ends up hanging in uninterruptible sleep, in a goroutine and limiting the number of hung goroutines to one per pod per volume.

However, in my investigation I found that the call sites that request a listing of the volumes from a particular volume plugin directory do not use any of the properties returned by the `lstat` call. They only care whether or not a directory exists.

Given that constraint, this PR simply avoids the `lstat` call by using `Readdirnames()` instead of `ReadDir()` or `ReadDirNoExit()`.

### More detail for reviewers

Consider the pod-mounted NFS volume at `/var/lib/kubelet/pods/881341b5-9551-11e6-af4c-fa163e815edd/volumes/kubernetes.io~nfs/myvol`. The kubelet wedges because `ReadDir()` and `ReadDirNoExit()` call `syscall.Lstat` on `myvol`, which requires communication with the NFS server. If the NFS server is unreachable, this call hangs forever.

However, our code only cares about the names of the files and directories contained in the `kubernetes.io~nfs` directory, not any of the more detailed information the `Lstat` call provides. Getting the names can be done with `Readdirnames()`, which does not need to involve the NFS server.

@pmorie @eparis @ncdc @derekwaynecarr @saad-ali @thockin @vishh @kubernetes/rh-cluster-infra

Name | ||
---|---|---|
async | ||
bandwidth | ||
cache | ||
cert | ||
chmod | ||
chown | ||
clock | ||
codeinspector | ||
config | ||
configz | ||
crlf | ||
dbus | ||
diff | ||
ebtables | ||
env | ||
errors | ||
exec | ||
flag | ||
flock | ||
flowcontrol | ||
flushwriter | ||
framer | ||
goroutinemap | ||
hash | ||
homedir | ||
httpstream | ||
initsystem | ||
integer | ||
interrupt | ||
intstr | ||
io | ||
iptables | ||
json | ||
jsonpath | ||
keymutex | ||
labels | ||
limitwriter | ||
logs | ||
maps | ||
metrics | ||
mount | ||
net | ||
node | ||
oom | ||
parsers | ||
pod | ||
procfs | ||
proxy | ||
rand | ||
replicaset | ||
resourcecontainer | ||
rlimit | ||
runtime | ||
selinux | ||
sets | ||
slice | ||
strategicpatch | ||
strings | ||
sysctl | ||
system | ||
term | ||
testing | ||
threading | ||
uuid | ||
validation | ||
wait | ||
workqueue | ||
wsstream | ||
yaml | ||
doc.go | ||
template.go | ||
template_test.go | ||
trace.go | ||
trie.go | ||
umask.go | ||
umask_windows.go | ||
util.go | ||
util_test.go |