Commit Graph

100 Commits (3e97969bdc5ed3387a1c61ec1ab8266be4d6b411)

Author SHA1 Message Date
Kubernetes Submit Queue 1902a18c88
Merge pull request #59301 from dcbw/dockershim-stop-sandbox-no-ip
Automatic merge from submit-queue (batch tested with PRs 60435, 60334, 60458, 59301, 60125). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim: don't check pod IP in StopPodSandbox

We're about to tear the container down, there's no point.  It also suppresses
an annoying error message due to kubelet stupidity that causes multiple
parallel calls to StopPodSandbox for the same sandbox.

docker_sandbox.go:355] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "docker-registry-1-deploy_default": Unexpected command output nsenter: cannot open /proc/22646/ns/net: No such file or directory

1) A first StopPodSandbox() request triggered by SyncLoop(PLEG) for
a ContainerDied event calls into TearDownPod() and thus the network
plugin.  Until this completes, networkReady=true for the
sandbox.

2) A second StopPodSandbox() request triggered by SyncLoop(REMOVE)
calls PodSandboxStatus() and calls into the network plugin to read
the IP address because networkReady=true

3) The first request exits the network plugin, sets networReady=false,
and calls StopContainer() on the sandbox.  This destroys the network
namespace.

4) The second request finally gets around to running nsenter but
the network namespace is already destroyed.  It returns an error
which is logged by getIP().

```release-note
NONE
```
@yujuhong @freehan
2018-02-26 17:48:50 -08:00
Dan Williams 60a955d414 dockershim: don't check pod IP in StopPodSandbox
We're about to tear the container down, there's no point.  It also suppresses
an annoying error message due to kubelet stupidity that causes multiple
parallel calls to StopPodSandbox for the same sandbox.

docker_sandbox.go:355] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "docker-registry-1-deploy_default": Unexpected command output nsenter: cannot open /proc/22646/ns/net: No such file or directory

1) A first StopPodSandbox() request triggered by SyncLoop(PLEG) for
a ContainerDied event calls into TearDownPod() and thus the network
plugin.  Until this completes, networkReady=true for the
sandbox.

2) A second StopPodSandbox() request triggered by SyncLoop(REMOVE)
calls PodSandboxStatus() and calls into the network plugin to read
the IP address because networkReady=true

3) The first request exits the network plugin, sets networReady=false,
and calls StopContainer() on the sandbox.  This destroys the network
namespace.

4) The second request finally gets around to running nsenter but
the network namespace is already destroyed.  It returns an error
which is logged by getIP().
2018-02-08 12:22:44 -06:00
Kubernetes Submit Queue fb340a4695
Merge pull request #57824 from thockin/gcr-vanity
Automatic merge from submit-queue (batch tested with PRs 57824, 58806, 59410, 59280). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

2nd try at using a vanity GCR name

The 2nd commit here is the changes relative to the reverted PR.  Please focus review attention on that.

This is the 2nd attempt.  The previous try (#57573) was reverted while we
figured out the regional mirrors (oops).
    
New plan: k8s.gcr.io is a read-only facade that auto-detects your source
region (us, eu, or asia for now) and pulls from the closest.  To publish
an image, push k8s-staging.gcr.io and it will be synced to the regionals
automatically (similar to today).  For now the staging is an alias to
gcr.io/google_containers (the legacy URL).
    
When we move off of google-owned projects (working on it), then we just
do a one-time sync, and change the google-internal config, and nobody
outside should notice.
    
We can, in parallel, change the auto-sync into a manual sync - send a PR
to "promote" something from staging, and a bot activates it.  Nice and
visible, easy to keep track of.

xref https://github.com/kubernetes/release/issues/281

TL;DR:
  *  The new `staging-k8s.gcr.io` is where we push images.  It is literally an alias to `gcr.io/google_containers` (the existing repo) and is hosted in the US.
  * The contents of `staging-k8s.gcr.io` are automatically synced to `{asia,eu,us)-k8s.gcr.io`.
  * The new `k8s.gcr.io` will be a read-only alias to whichever regional repo is closest to you.
  * In the future, images will be promoted from `staging` to regional "prod" more explicitly and auditably.

 ```release-note
Use "k8s.gcr.io" for pulling container images rather than "gcr.io/google_containers".  Images are already synced, so this should not impact anyone materially.
    
Documentation and tools should all convert to the new name. Users should take note of this in case they see this new name in the system.
```
2018-02-08 03:29:32 -08:00
Tim Hockin 3586986416 Switch to k8s.gcr.io vanity domain
This is the 2nd attempt.  The previous was reverted while we figured out
the regional mirrors (oops).

New plan: k8s.gcr.io is a read-only facade that auto-detects your source
region (us, eu, or asia for now) and pulls from the closest.  To publish
an image, push k8s-staging.gcr.io and it will be synced to the regionals
automatically (similar to today).  For now the staging is an alias to
gcr.io/google_containers (the legacy URL).

When we move off of google-owned projects (working on it), then we just
do a one-time sync, and change the google-internal config, and nobody
outside should notice.

We can, in parallel, change the auto-sync into a manual sync - send a PR
to "promote" something from staging, and a bot activates it.  Nice and
visible, easy to keep track of.
2018-02-07 21:14:19 -08:00
Lee Verberne e10042d22f Increment CRI version from v1alpha1 to v1alpha2
This also incorporates the version string into the package name so
that incompatibile versions will fail to connect.

Arbitrary choices:
- The proto3 package name is runtime.v1alpha2. The proto compiler
  normally translates this to a go package of "runtime_v1alpha2", but
  I renamed it to "v1alpha2" for consistency with existing packages.
- kubelet/apis/cri is used as "internalapi". I left it alone and put the
  public "runtimeapi" in kubelet/apis/cri/runtime.
2018-02-07 09:06:26 +01:00
Lee Verberne 0f1de41790 Update kubelet for enumerated CRI namespaces
This adds support to both the Generic Runtime Manager and the
dockershim for the CRI's enumerated namespaces.
2018-02-07 09:06:26 +01:00
Pengfei Ni cabd2bb619 Add experimental hyperv containers support on Windows 2018-01-30 12:58:08 +08:00
Yu-Ju Hong e8da890aee dockershim: remove the use of kubelet's internal API
We let dockershim implement the kubelet's internal (CRI) API as an
intermediary step before transitioning fully to communicate using gRPC.
Now that kubelet has been communicating to the runtime over gRPC for
multiple releases, we can safely retire the extra interface in
dockershim.
2018-01-19 16:31:18 -08:00
Kubernetes Submit Queue 2e9a277a3c
Merge pull request #57845 from yujuhong/minor-clean-up
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim: bump the minimum supported docker version to 1.11

Drop the 1.10 compatibilty code.

**Release note**:

```release-note
NONE
```
2018-01-09 18:14:27 -08:00
Yu-Ju Hong 059fa35a84 dockershim: bump the minimum supported docker version to 1.11
Drop the 1.10 compatibilty code.
2018-01-04 10:22:16 -08:00
Lee Verberne 1ea697044a Update pause container version to 3.1
This updates the version of the pause container used by the kubelet and
various test utilities to 3.1.

This also adds a CHANGELOG.md for build/pause
2018-01-04 11:35:29 +01:00
Tim Hockin e9dd8a68f6 Revert k8s.gcr.io vanity domain
This reverts commit eba5b6092a.

Fixes https://github.com/kubernetes/kubernetes/issues/57526
2017-12-22 14:36:16 -08:00
Tim Hockin eba5b6092a Use k8s.gcr.io vanity domain for container images 2017-12-18 09:18:34 -08:00
Kubernetes Submit Queue 02f803cc02
Merge pull request #52842 from yanxuean/reduntdant-cgroups
Automatic merge from submit-queue (batch tested with PRs 50457, 55558, 53483, 55731, 52842). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

improve the logic setting cgroupparent in RunPodSandbox

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>

**What this PR does / why we need it**:
The setting of cgroupparent is too confused!
The old logic is:
1. set CgroupParent correctly
2. reset CgroupParent incorrectly
3. set CgroupParent again  (refer to #42055 )

The login is too confused, and It is sure that there are many people who drop in trap.
We only need to set it in one place.

kubernetes/pkg/kubelet/dockershim/docker_sandbox.go
```
func (ds *dockerService) makeSandboxDockerConfig(c *runtimeapi.PodSandboxConfig, image string) (*dockertypes.ContainerCreateConfig, error) {
        ....
       // Apply linux-specific options.
	if lc := c.GetLinux(); lc != nil {
		if err := ds.applySandboxLinuxOptions(hc, lc, createConfig, image, securityOptSep); err != nil {
			return nil, err
		}
	}

	// Apply resource options.
        setSandboxResources(hc)      **<-- reset the CgroupParent incorrectly**

       // Apply cgroupsParent derived from the sandbox config.
	if lc := c.GetLinux(); lc != nil {
		// Apply Cgroup options.
		cgroupParent, err := ds.GenerateExpectedCgroupParent(lc.CgroupParent)
		if err != nil {
			return nil, fmt.Errorf("failed to generate cgroup parent in expected syntax for container %q: %v", c.Metadata.Name, err)
		}
		hc.CgroupParent = cgroupParent
	}
```

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-11-18 11:36:26 -08:00
Seth Jennings a4bc7707d4 dockershim: remove corrupt checkpoints immediately upon detection 2017-11-13 20:34:17 -06:00
Kubernetes Submit Queue b448dfa0e9
Merge pull request #55028 from sjenning/remove-orphaned-checkpoints
Automatic merge from submit-queue (batch tested with PRs 55050, 53464, 54936, 55028, 54928). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: dockershim: remove orphaned checkpoint files

Fixes https://github.com/kubernetes/kubernetes/issues/55070

Currently, `ListPodSandbox()` returns a combined list of sandboxes populated from both the runtime and the dockershim checkpoint files.  However the sandboxes in the checkpoint files might not exist anymore.

The kubelet sees the sandbox returned by `ListPodSandbox()` and determines it shouldn't be running and calls `StopPodSandbox()` on it.  This generates an error when `StopContainer()` is called as the container does not exist.  However the checkpoint file is not cleaned up.  This leads to subsequent calls to `StopPodSandbox()` that fail in the same way each time.

This PR removes the checkpoint file if StopContainer fails due to container not found.

The only other place `RemoveCheckpoint()` is called, except if it is corrupt, is from `RemoveSandbox()`.  If the container does not exist, what `RemoveSandbox()` would have done has been effectively been done already.  So this is just clean up.

@derekwaynecarr @eparis @freehan @dcbw
2017-11-03 12:59:19 -07:00
Seth Jennings 9f66666a30 kubelet: dockershim: remove orphaned checkpoint files 2017-11-02 16:59:39 -05:00
Derek Carr 79a08a1c90 StopPodSandbox should not log when container is already removed 2017-11-02 15:12:25 -04:00
yanxuean 3f3dae56cf improve setting cgroupparent
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-31 16:47:39 +08:00
Kubernetes Submit Queue fc8a647f78 Merge pull request #52864 from dcbw/dockershim-fix-net-teardown
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

dockershim: fine-tune network-ready handling on sandbox teardown and removal

If sandbox teardown results in an error, GC will periodically attempt
to again remove the sandbox.  Until the sandbox is removed, pod
sandbox status calls will attempt to enter the pod's namespace and
retrieve the pod IP, but the first teardown attempt may have already
removed the network namespace, resulting in a pointless log error
message that the network namespace doesn't exist, or that nsenter
can't find eth0.

The network-ready mechanism originally attempted to suppress those
messages by ensuring that pod sandbox status skipped network checks
when networking was already torn down, but unfortunately the ready
value was cleared too early.

Also, don't tear down the pod network multiple times if the first
time we tore it down, it succeeded.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-24 04:32:12 -07:00
Dan Williams ddb5075842 dockershim: fine-tune network-ready handling on sandbox teardown and removal
If sandbox teardown results in an error, GC will periodically attempt
to again remove the sandbox.  Until the sandbox is removed, pod
sandbox status calls will attempt to enter the pod's namespace and
retrieve the pod IP, but the first teardown attempt may have already
removed the network namespace, resulting in a pointless log error
message that the network namespace doesn't exist, or that nsenter
can't find eth0.

The network-ready mechanism originally attempted to suppress those
messages by ensuring that pod sandbox status skipped network checks
when networking was already torn down, but unfortunately the ready
value was cleared too early.

Also, don't tear down the pod network multiple times if the first
time we tore it down, it succeeded.
2017-09-21 14:53:50 -05:00
Yu-Ju Hong aaf26b2eaa dockershim: remove support for legacy containers
The code was first introduced in 1.6 to help pre-CRI-kubelet upgrade to
using the CRI implementation. They can safely be removed now.
2017-09-11 08:44:27 -07:00
Pengfei Ni bf01fa2f00 Use seccomp from security context 2017-08-13 15:42:15 +08:00
Yang Guo bf2ced837c Updates Docker Engine API 2017-07-13 12:55:07 -07:00
Dan Williams 5100925a90 dockershim: checkpoint HostNetwork property
To ensure kubelet doesn't attempt network teardown on HostNetwork
containers that no longer exist but are still checkpointed, make
sure we preserve the HostNetwork property in checkpoints.  If
the checkpoint indicates the container was a HostNetwork one,
don't tear down the network since that would fail anyway.

Related: https://github.com/kubernetes/kubernetes/issues/44307#issuecomment-299548609
2017-06-21 13:10:47 -05:00
Dan Williams f76cc7642c dockershim: don't spam logs with pod IP errors before networking is ready
GenericPLEG's 1s relist() loop races against pod network setup.  It
may be called after the infra container has started but before
network setup is done, since PLEG and the runtime's SyncPod() run
in different goroutines.

Track network setup status and don't bother trying to read the pod's
IP address if networking is not yet ready.

See also: https://bugzilla.redhat.com/show_bug.cgi?id=1434950

Mar 22 12:18:17 ip-172-31-43-89 atomic-openshift-node: E0322
   12:18:17.651013   25624 docker_manager.go:378] NetworkPlugin
   cni failed on the status hook for pod 'pausepods22' - Unexpected
   command output Device "eth0" does not exist.
2017-06-12 15:07:38 -05:00
Pengfei Ni 22e99504d7 Update CRI references 2017-06-09 10:16:40 +08:00
Dong Liu 5936e81b2e Add determinePodIPBySandboxID. 2017-06-02 08:03:07 -05:00
Dong Liu 9c2309b7cb Add os dependent getSecurityOpts helper method. 2017-06-02 05:59:20 -05:00
Dawn Chen 78c1649f5b Revert "kubelet/network: report but tolerate errors returned from GetNetNS()" 2017-05-31 17:16:32 -07:00
Dan Williams 02200ba752 dockershim: don't spam logs with pod IP errors before networking is ready
GenericPLEG's 1s relist() loop races against pod network setup.  It
may be called after the infra container has started but before
network setup is done, since PLEG and the runtime's SyncPod() run
in different goroutines.

Track network setup status and don't bother trying to read the pod's
IP address if networking is not yet ready.

See also: https://bugzilla.redhat.com/show_bug.cgi?id=1434950

Mar 22 12:18:17 ip-172-31-43-89 atomic-openshift-node: E0322
   12:18:17.651013   25624 docker_manager.go:378] NetworkPlugin
   cni failed on the status hook for pod 'pausepods22' - Unexpected
   command output Device "eth0" does not exist.
2017-05-23 22:42:41 -05:00
zhengjiajin c79b0c797f fix typo in kubelet 2017-05-23 19:54:10 +08:00
Kubernetes Submit Queue f82bdca459 Merge pull request #44326 from xlgao-zju/forcibly-remove
Automatic merge from submit-queue (batch tested with PRs 44326, 45768)

[CRI] Forcibly remove container

Forcibly remove the running containers in `RemoveContainer`. Since we should forcibly remove the running containers in `RemovePodSandbox`. See [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/api/v1alpha1/runtime/api.proto#L35).

cc @feiskyer @Random-Liu 

Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
2017-05-16 10:39:05 -07:00
Pengfei Ni 2b4956c208 dockershim: get sysctls from sandbox config instead of annotations 2017-05-15 12:53:32 +08:00
Michael Taufen cbad320205 Reorganize kubelet tree so apis can be independently versioned 2017-05-12 10:02:33 -07:00
Xianglin Gao 0144803c07 Forcibly remove container
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
2017-05-11 18:39:37 +08:00
Yu-Ju Hong 389c140eaf Move docker client code from dockertools to dockershim/dockerlib
The code affected include DockerInterface (renamed to Interface),
FakeDockerClient, etc.
2017-05-05 11:48:08 -07:00
Yu-Ju Hong 5644587e07 More dockertools cleanup
Move some constants/functions to dockershim and remove unused tests.
2017-05-03 11:22:06 -07:00
Pengfei Ni d301f22863 CRI: remove PodSandboxStatus.Linux.Namespaces.Network
Closes: #44972
2017-05-02 10:34:41 +08:00
Pengfei Ni ac76766a92 CRI: move apparmor annotations to container security context 2017-05-01 20:55:16 +08:00
Kubernetes Submit Queue 27cf62ac29 Merge pull request #43940 from xlgao-zju/rm-all-containers
Automatic merge from submit-queue (batch tested with PRs 42025, 44169, 43940)

[CRI] Remove all containers in the sandbox

Remove all containers in the sandbox, when we remove the sandbox.

/cc @feiskyer @Random-Liu 

Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
2017-04-06 17:00:23 -07:00
Xianglin Gao b9c1d6c7c8 Remove all containers in the sandbox
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
2017-04-05 13:36:30 +08:00
Random-Liu b1ce4b7a1d Use DNSOptions passed by CRI in dockershim. 2017-04-03 10:24:42 -07:00
Pengfei Ni a16758396c Fix tiny typo 2017-03-21 14:22:33 +08:00
Pengfei Ni 95c3782043 Rewrite resolv.conf for dockershim
PR #29378 introduces ClusterFirstWithHostNet, but docker doesn't support
setting dns options togather with hostnetwork. This commit rewrites
resolv.conf same as dockertools.
2017-03-20 18:45:39 +08:00
Yu-Ju Hong 035afab901 dockershim: remove corrupted sandbox checkpoints
This is a workaround to ensure that kubelet doesn't block forever when
the checkpoint is corrupted.
2017-03-13 15:41:01 -07:00
Kubernetes Submit Queue 4cf553f78e Merge pull request #42767 from Random-Liu/cleanup-infra-container-on-error
Automatic merge from submit-queue (batch tested with PRs 42768, 42760, 42771, 42767)

Stop sandbox container when hit network error.

Fixes https://github.com/kubernetes/kubernetes/issues/42698.

This PR stops the sandbox container when hitting a network error.
This PR also adds a unit test for it.

I'm not sure whether we should try teardown pod network after `SetUpPod` failure. We don't do that in dockertools https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockertools/docker_manager.go#L2276.

@yujuhong @freehan
2017-03-09 00:08:01 -08:00
Random-Liu 2690461cbb Stop sandbox container when hit network error. 2017-03-08 17:28:42 -08:00
Yu-Ju Hong 8328a66bdf dockershim: Fix the race condition in ListPodSandbox
In ListPodSandbox(), we
 1. List all sandbox docker containers
 2. List all sandbox checkpoints. If the checkpoint does not have a
    corresponding container in (1), we return partial result based on
    the checkpoint.

The problem is that new PodSandboxes can be created between step (1) and
(2). In those cases, we will see the checkpoints, but not the sandbox
containers. This leads to strange behavior because the partial result
from the checkpoint does not include some critical information. For
example, the creation timestamp'd be zero, and that would cause kubelet's
garbage collector to immediately remove the sandbox.

This change fixes that by getting the list of checkpoints before listing
all the containers (since in RunPodSandbox we create them in the reverse
order).
2017-03-08 17:02:34 -08:00
Kubernetes Submit Queue 5ee6ba2f59 Merge pull request #42223 from Random-Liu/dockershim-better-implement-cri
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)

CRI: Make dockershim better implements CRI.

When thinking about CRI Validation test, I found that `PodSandboxStatus.Linux.Namespaces.Options.HostPid` and `PodSandboxStatus.Linux.Namespaces.Options.HostIpc` are not populated. Although they are not used by kuberuntime now, we should populate them to conform to CRI.

/cc @yujuhong @feiskyer
2017-03-02 00:59:19 -08:00