If the CNI network plugin completes successfully, but something fails
between that success and dockershim's sandbox setup code, plugin resources
may not be cleaned up. A non-trivial amount of code runs after the
plugin itself exits and the CNI driver's SetUpPod() returns, and any error
condition recognized by that code would cause this leakage.
The Kubernetes CRI RunPodSandbox() request does not attempt to clean
up on errors, since it cannot know how much (if any) networking
was actually set up; it relies on the CRI implementation to perform
that cleanup.
In the dockershim case, a SetUpPod() failure means networkReady is
FALSE for the sandbox, and TearDownPod() will not be called later by
garbage collection even though networking was configured, because
dockershim can't know how far SetUpPod() got.
Concrete examples include the sandbox's container somehow being removed
during that time, another OS error being encountered, or the plugin
returning a malformed result to the CNI driver.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1532965
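Roughly, the safe pattern looks like the sketch below; fakePlugin, setUpSandboxNetwork, and postSetup are illustrative stand-ins, not dockershim's actual types:

```go
package main

import (
	"errors"
	"fmt"
)

// fakePlugin stands in for the CNI driver. The SetUpPod/TearDownPod
// names mirror the calls discussed above, but these signatures are
// simplified assumptions, not dockershim's real interface.
type fakePlugin struct{}

func (fakePlugin) SetUpPod(ns, name, id string) error { return nil }

func (fakePlugin) TearDownPod(ns, name, id string) error {
	fmt.Println("tearing down network for", id)
	return nil
}

// setUpSandboxNetwork sketches the defensive pattern implied above: if
// anything fails after the plugin has succeeded, tear networking down
// before returning, because RunPodSandbox will not clean up and, with
// networkReady=false, garbage collection never calls TearDownPod.
func setUpSandboxNetwork(p fakePlugin, ns, name, id string, postSetup func() error) error {
	if err := p.SetUpPod(ns, name, id); err != nil {
		return err
	}
	if err := postSetup(); err != nil {
		// Best-effort teardown; report it but keep the original error.
		if terr := p.TearDownPod(ns, name, id); terr != nil {
			fmt.Println("teardown failed:", terr)
		}
		return err
	}
	return nil
}

func main() {
	err := setUpSandboxNetwork(fakePlugin{}, "default", "pod", "sandbox123", func() error {
		// One of the failure modes listed above: the sandbox container
		// vanished between plugin success and sandbox setup.
		return errors.New("sandbox container removed")
	})
	fmt.Println("setup returned:", err)
}
```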
With CRI, kubelet no longer sets up networking for the pods. The
dockershim package is the rightful owner and the only user of the
network package. This change moves the package into dockershim to make
the distinction obvious, and untangles the codebase.
The `network/dns` package is kept in its original location since it is
only used by kubelet.
dockershim: don't check pod IP in StopPodSandbox
We're about to tear the container down, so there's no point. It also suppresses
an annoying error message due to kubelet stupidity that causes multiple
parallel calls to StopPodSandbox for the same sandbox.
docker_sandbox.go:355] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "docker-registry-1-deploy_default": Unexpected command output nsenter: cannot open /proc/22646/ns/net: No such file or directory
1) A first StopPodSandbox() request triggered by SyncLoop(PLEG) for
a ContainerDied event calls into TearDownPod() and thus the network
plugin. Until this completes, networkReady=true for the
sandbox.
2) A second StopPodSandbox() request triggered by SyncLoop(REMOVE)
calls PodSandboxStatus() and calls into the network plugin to read
the IP address because networkReady=true.
3) The first request exits the network plugin, sets networkReady=false,
and calls StopContainer() on the sandbox. This destroys the network
namespace.
4) The second request finally gets around to running nsenter, but
the network namespace is already destroyed. It returns an error
which is logged by getIP().
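A self-contained Go sketch of the idea (sandboxState, stopPodSandbox, and getIPFromPlugin are assumed names, not dockershim's real ones): StopPodSandbox flips the ready flag and never asks the plugin for the IP, so the status path short-circuits instead of racing.

```go
package main

import (
	"fmt"
	"sync"
)

// sandboxState loosely models dockershim's per-sandbox networkReady
// bookkeeping; all names here are assumptions for illustration.
type sandboxState struct {
	mu           sync.Mutex
	networkReady map[string]bool
}

// getIPFromPlugin stands in for the nsenter-based status hook that
// produced the error above once the namespace was gone.
func getIPFromPlugin(id string) (string, error) {
	return "", fmt.Errorf("cannot open /proc/.../ns/net: no such file or directory")
}

// podSandboxStatus reads the IP only while networking is up.
func (s *sandboxState) podSandboxStatus(id string) string {
	s.mu.Lock()
	ready := s.networkReady[id]
	s.mu.Unlock()
	if !ready {
		return "" // step 2 of the race now short-circuits here
	}
	ip, err := getIPFromPlugin(id)
	if err != nil {
		fmt.Println("failed to read pod IP from plugin:", err)
	}
	return ip
}

// stopPodSandbox skips the IP lookup entirely: the sandbox is about to
// be torn down, so the address is useless, and looking it up only
// races against a parallel StopPodSandbox for the same sandbox.
func (s *sandboxState) stopPodSandbox(id string) {
	s.mu.Lock()
	s.networkReady[id] = false
	s.mu.Unlock()
	// ... TearDownPod() and StopContainer() would follow here ...
}

func main() {
	s := &sandboxState{networkReady: map[string]bool{"sb1": true}}
	s.stopPodSandbox("sb1")
	fmt.Printf("IP after stop: %q\n", s.podSandboxStatus("sb1"))
}
```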
```release-note
NONE
```
@yujuhong @freehan
This also incorporates the version string into the package name so
that incompatible versions will fail to connect.
Arbitrary choices:
- The proto3 package name is runtime.v1alpha2. The proto compiler
normally translates this to a Go package of "runtime_v1alpha2", but
I renamed it to "v1alpha2" for consistency with existing packages.
- kubelet/apis/cri is used as "internalapi". I left it alone and put the
public "runtimeapi" in kubelet/apis/cri/runtime.
GenericPLEG's 1s relist() loop races against pod network setup. It
may be called after the infra container has started but before
network setup is done, since PLEG and the runtime's SyncPod() run
in different goroutines.
Track network setup status and don't bother trying to read the pod's
IP address if networking is not yet ready.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1434950
Mar 22 12:18:17 ip-172-31-43-89 atomic-openshift-node: E0322
12:18:17.651013 25624 docker_manager.go:378] NetworkPlugin
cni failed on the status hook for pod 'pausepods22' - Unexpected
command output Device "eth0" does not exist.
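A minimal sketch of that bookkeeping, with assumed names (readyTracker and friends): SyncPod marks the sandbox ready only once network setup finishes, and the relist path skips the IP read until then.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// readyTracker mimics the networkReady tracking described above; the
// type and method names are assumptions for illustration.
type readyTracker struct {
	mu    sync.Mutex
	ready map[string]bool
}

func (t *readyTracker) set(id string, v bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.ready[id] = v
}

func (t *readyTracker) get(id string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.ready[id]
}

func main() {
	t := &readyTracker{ready: map[string]bool{}}
	var wg sync.WaitGroup

	// SyncPod: the infra container starts first; network setup follows.
	wg.Add(1)
	go func() {
		defer wg.Done()
		time.Sleep(50 * time.Millisecond) // CNI plugin still running
		t.set("sb1", true)                // only now is the IP readable
	}()

	// PLEG relist: may observe the sandbox before setup finishes.
	if !t.get("sb1") {
		fmt.Println("network not ready, skipping pod IP read") // no spurious eth0 error
	}

	wg.Wait()
	if t.get("sb1") {
		fmt.Println("network ready, safe to read pod IP")
	}
}
```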
The enum constants are not namespaced. The shorter, unspecific names are likely
to cause naming conflicts in the future.
Also replace "SandBox" with "Sandbox" in the API.
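To see the collision risk concretely: in proto3, enum values share the enclosing package's namespace, so prefixing each value with its enum name keeps them unique. The Go constants below mirror what the proto compiler would then generate (a sketch, not the exact CRI source):

```go
package main

import "fmt"

// A bare READY value would clash with any other enum in the same proto
// package that also wants READY; SANDBOX_READY cannot.
type PodSandboxState int32

const (
	PodSandboxState_SANDBOX_READY    PodSandboxState = 0
	PodSandboxState_SANDBOX_NOTREADY PodSandboxState = 1
)

func main() {
	fmt.Println("state:", PodSandboxState_SANDBOX_READY)
}
```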