Automatic merge from submit-queue
Improved code coverage for pkg/kubelet/types/labels
The test coverage improved from 0% to 100%.
This fixed part of #40780
**What this PR does / why we need it**:
Increase test coverage.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
release-note-none
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 45515, 45579)
Ignore openrc cgroup
**What this PR does / why we need it**:
It is a work-around for the following: https://github.com/opencontainers/runc/issues/1440
**Special notes for your reviewer**:
I am open to a cleaner way to do this, but we have many developer users on Macs that ran containerized kubelets that are not able to run them right now due to the inclusion of openrc tripping up our existence checks. Ideally, runc can give us a call to say "does this exist according to what runc knows about". Or we could add a whitelist check. Right now, this was the smallest hack pending more discussion.
Automatic merge from submit-queue
Remove the deprecated `--enable-cri` flag
Except for rkt, CRI is the default and only integration point for
container runtimes.
```release-note
Remove the deprecated `--enable-cri` flag. CRI is now the default,
and the only way to integrate with kubelet for the container runtimes.
```
Automatic merge from submit-queue (batch tested with PRs 45382, 45384, 44781, 45333, 45543)
Ensure desired state of world populator runs before volume reconstructor
If the kubelet's volumemanager reconstructor for actual state of world runs before the desired state of world has been populated, the pods in the actual state of world will have some incorrect volume information: namely outerVolumeSpecName, which if incorrect leads to part of the issue here https://github.com/kubernetes/kubernetes/issues/43515, because WaitForVolumeAttachAndMount searches the actual state of world with the correct outerVolumeSpecName and won't find it so reports 'timeout waiting....', etc. forever for existing pods. The comments acknowledge that this is a known issue
The all sources ready check doesn't work because the sources being ready doesn't necessarily mean the desired state of world populator added pods from the sources. So instead let's put the all sources ready check in the *populator*, and when the sources are ready, it will be able to populate the desired state of world and make "HasAddedPods()" return true. THEN, the reconstructor may run.
@jingxu97 PTAL, you wrote all of the reconstruction stuff
```release-note
NONE
```
Automatic merge from submit-queue
Enable shared PID namespace by default for docker pods
**What this PR does / why we need it**: This PR enables PID namespace sharing for docker pods by default, bringing the behavior of docker in line with the other CRI runtimes when used with docker >= 1.13.1.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref #1615
**Special notes for your reviewer**: cc @dchen1107 @yujuhong
**Release note**:
```release-note
Kubernetes now shares a single PID namespace among all containers in a pod when running with docker >= 1.13.1. This means processes can now signal processes in other containers in a pod, but it also means that the `kubectl exec {pod} kill 1` pattern will cause the pod to be restarted rather than a single container.
```
Automatic merge from submit-queue (batch tested with PRs 45453, 45307, 44987)
Migrate the docker client code from dockertools to dockershim
Move docker client code from dockertools to dockershim/libdocker. This includes
DockerInterface (renamed to Interface), FakeDockerClient, etc.
This is part of #43234
Despite its name, AssertCalls() does not assert anything. It returns an
error that must be checked. This was causing false negatives for
a handful of unit tests.
Automatic merge from submit-queue
rkt: Generate a new Network Namespace for each Pod
**What this PR does / why we need it**:
This PR concerns the Kubelet with the Container runtime rkt.
Currently, when a Pod stops and the kubelet restart it, the Pod will use the **same network namespace** based on its PodID.
When the Garbage Collection is triggered, it delete all the old resources and the current network namespace.
The Pods and all containers inside it loose the _eth0_ interface.
I explained more in details in #45149 how to reproduce this behavior.
This PR generates a new unique network namespace name for each new/restarting Pod.
The Garbage collection retrieve the correct network namespace and remove it safely.
**Which issue this PR fixes** :
fix#45149
**Special notes for your reviewer**:
Following @yifan-gu guidelines, so maybe expecting him for the final review.
**Release note**:
`NONE`
Automatic merge from submit-queue (batch tested with PRs 45018, 45330)
Clean up for qos.go
**What this PR does / why we need it**:
Seems we are not using any of those functions.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#39148
**Release note**:
```release-note
A small clean up to remove unnecessary functions.
```
Automatic merge from submit-queue (batch tested with PRs 45200, 45203)
Allow certificate manager to be initialized with no certs.
Adds support to the certificate manager so it can be initialized with no
certs and only a connection to the certificate request signing API. This
specifically covers the scenario for the kubelet server certificate,
where there is a request signing client but on first boot there is no
bootstrapping or local certs.
Automatic merge from submit-queue (batch tested with PRs 45508, 44258, 44126, 45441, 45320)
Use existing global var criSupportedLogDrivers
**What this PR does / why we need it**:
Use existing global var `criSupportedLogDrivers` defined in docker_service.go. If CRI supports other log drivers in the future, we will only need to modify that global var.
cc @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 45508, 44258, 44126, 45441, 45320)
cloud initialize node in external cloud controller
@thockin This PR adds support in the `cloud-controller-manager` to initialize nodes (instead of kubelet, which did it previously)
This also adds support in the kubelet to skip node cloud initialization when `--cloud-provider=external`
Specifically,
Kubelet
1. The kubelet has a new flag called `--provider-id` which uniquely identifies a node in an external DB
2. The kubelet sets a node taint - called "ExternalCloudProvider=true:NoSchedule" if cloudprovider == "external"
Cloud-Controller-Manager
1. The cloud-controller-manager listens on "AddNode" events, and then processes nodes that starts with that above taint. It performs the cloud node initialization steps that were previously being done by the kubelet.
2. On addition of node, it figures out the zone, region, instance-type, removes the above taint and updates the node.
3. Then periodically queries the cloudprovider for node addresses (which was previously done by the kubelet) and updates the node if there are new addresses
```release-note
NONE
```
Automatic merge from submit-queue
adds log when gpuManager.start() failed
If gpuManager.start() returns error, there is no log.
We confused with scheduler do not schedule any pod(with gpu) to one node.
kubectl describe node xxx shows there is no gpu on that node, because the gpu driver do not work on that node, gpuManager.start() failed, but we can not see anything in log.
Automatic merge from submit-queue
Fix crash on Pods().Get() failure
**What this PR does / why we need it**:
Fixes a potential crash in syncPod when Pods().Get() returns an error other than NotFound. This is unlikely to occur with the standard client, but easily shows up with a stub kube client that returns Unimplemented to everything. Updates the unit test as well.
**Release note**:
`NONE`
Automatic merge from submit-queue
remove useless code in kubelet
**What this PR does / why we need it**:
This code has logical error as the etc-hosts file will be recreated even it already exists. In addition, if do not recreate etc-hosts file when it exists, the pod ip in it will be out of date when pod ips change. So remove this code as it is not needed.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
xrefer: #44481, #44473
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45316, 45341)
Pass NoOpLegacyHost to dockershim in --experimental-dockershim mode
This allows dockershim to use network plugins, if needed.
/cc @Random-Liu
Automatic merge from submit-queue
Use Docker API Version instead of docker version
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#42492
**Special notes for your reviewer**:
**Release note**:
`Update cadvisor to latest head to use docker APIversion exposed by cadvisor`
Automatic merge from submit-queue (batch tested with PRs 45056, 44904, 45312)
CRI: clarify the behavior of PodSandboxStatus and ContainerStatus
**What this PR does / why we need it**:
Currently, we define that ImageStatus should return `nil, nil` when requested image doesn't exist, and kubelet is relying on this behavior now.
However, we haven't clearly defined the behavior of PodSandboxStatus and ContainerStatus. Currently, they return error when requested sandbox/container doesn't exist, and kubelet is also relying on this behavior.
**Which issue this PR fixes**
Fixes#44885.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45314, 45250, 41733)
CRI: add ImageFsInfo API
**What this PR does / why we need it**:
kubelet currently relies on cadvisor to get the ImageFS info for supported runtimes, i.e., docker and rkt. This PR adds ImageFsInfo API to CRI so kubelet could get the ImageFS correctly for all runtimes.
**Which issue this PR fixes**
First step for #33048 ~~also reverts temporary ImageStats in #33870~~.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
The test was originally in docker_manager_test.go (now removed). I
copied and adapated the logic for the new test.
Also move the original test fixtures needed for the test.
Automatic merge from submit-queue (batch tested with PRs 45005, 43053)
kubelet: fix sandbox garbage collection
**What this PR does / why we need it**:
Currently, kuberuntime garbage collection can't distinguish just-created sandboxes from failed sandboxes. Especially when the time from sandbox creation to ready is longer than GC's minAge. In such cases, those sandboxes may be garbage collected early before they are ready.
This PR removes `sandboxMinGCAge` and only garbage collect sandboxes when
* they are containing no containers at all
* and not the latest sandbox if it is belonging to an existing pod.
**Which issue this PR fixes**
Fixes#42856.
**Release note**:
```release-note
NONE
```
cc @yujuhong @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 45013, 45166)
CRI: remove PodSandboxStatus.Linux.Namespaces.Network
**What this PR does / why we need it**:
PodSandboxStatus.Linux.Namespaces.Network is not used, so this PR removes it from CRI.
**Which issue this PR fixes**
Closes: #44972
**Special notes for your reviewer**:
**Release note**:
```release-note
Remove PodSandboxStatus.Linux.Namespaces.Network from CRI.
```
/assign @Random-Liu @yujuhong
These commands are important enough to be in the Kubelet itself.
By default, Ubuntu 14.04 and Debian Jessie have these set to 200 and
20000. Without this setting, nodes are limited in the number of
containers that they can start.
Adds support to the certificate manager so it can be initialized with no
certs and only a connection to the certificate request signing API. This
specifically covers the scenario for the kubelet server certificate,
where there is a request signing client but on first boot there is no
bootstrapping or local certs.
Previously we exported many constants and functions in dockertools to
share with the dockershim package. This change moves such
constants/functions to dockershim and unexport them.
This change involves only mechnical changes and should not have any
functional impact.
Automatic merge from submit-queue
Restructure unit tests for more cert/keys.
Just changing the unit tests so there is multiple cert/key pairs to be used.
No functional change, no new tests. Follow on PRs will make more use
of the multiple cert/key pairs.
This commit deletes code in dockertools that is only used by
DockerManager. A follow-up change will rename and clean up the rest of
the files in this package.
The commit also sets EnableCRI to true if the container runtime is not
rkt. A follow-up change will remove the flag/field and all references to
it.
Automatic merge from submit-queue
Fix nil pointer issue when making mounts for container
When rebooting one of the nodes in my colleague's cluster, two panics were discovered:
```
E1216 04:07:00.193058 2394 runtime.go:52] Recovered from panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:58
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:51
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:41
/usr/local/go/src/runtime/asm_amd64.s:472
/usr/local/go/src/runtime/panic.go:443
/usr/local/go/src/runtime/panic.go:62
/usr/local/go/src/runtime/sigpanic_unix.go:24
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1313
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1473
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockertools/docker_manager.go:1495
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockertools/docker_manager.go:2125
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockertools/docker_manager.go:2093
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1971
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:530
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:171
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:154
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:215
/usr/local/go/src/runtime/asm_amd64.s:1998
E1216 04:07:00.275030 2394 runtime.go:52] Recovered from panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:58
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:51
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:41
/usr/local/go/src/runtime/asm_amd64.s:472
/usr/local/go/src/runtime/panic.go:443
/usr/local/go/src/runtime/panic.go:62
/usr/local/go/src/runtime/sigpanic_unix.go:24
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/server/stats/volume_stat_caculator.go:98
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/server/stats/volume_stat_caculator.go:63
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/wait/wait.go:86
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/wait/wait.go:87
/usr/local/go/src/runtime/asm_amd64.s:1998
```
kubectl version
```
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.8", GitCommit:"693ef591120267007be359f97191a6253e0e4fb5", GitTreeState:"clean", BuildDate:"2016-09-28T03:03:21Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.8", GitCommit:"693ef591120267007be359f97191a6253e0e4fb5", GitTreeState:"clean", BuildDate:"2016-09-28T02:52:25Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
```
The second panic had already been fixed by #33616 and #34251. Not sure what caused the first nil pointer issue and whether it has been fixed yet in the master branch. Just fix it by ignoring the nil pointer when making mounts.
cc @jingxu97 @yujuhong
Automatic merge from submit-queue (batch tested with PRs 45110, 45148)
write HostAliases to hosts file
**What this PR does / why we need it**: using the PodSpec's `HostAliases`, we write entries into the Kubernetes-managed hosts file.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#43632
**Special notes for your reviewer**:
Previous PRs in this series:
- https://github.com/kubernetes/kubernetes/pull/44572 isolates the logic of creating the file and writing the file
- https://github.com/kubernetes/kubernetes/pull/44641 introduces the `HostAliases` field in PodSpec along with validations
**Release note**:
```release-note
PodSpec's `HostAliases` now write entries into the Kubernetes-managed hosts file.
```
@thockin @yujuhong
Thanks for reviewing!
Automatic merge from submit-queue (batch tested with PRs 45110, 45148)
Make timeouts in the Kubelet slightly offset to aid debugging
Several of these loops overlap, and when they are the reason a failure
is happening it is difficult to sort them out. Slighly misalign these
loops to make their impact obvious.
We are seeing exactly 2 minute pod worker timeouts in a wide range of test flake scenarios, and I want to be confident we know exactly which one is the culprit.
Automatic merge from submit-queue (batch tested with PRs 41583, 45117, 45123)
Implement shared PID namespace in the dockershim
**What this PR does / why we need it**: Defaults the Docker CRI to using a shared PID namespace for pods. Implements proposal in https://github.com/kubernetes/community/pull/207 tracked by #1615.
//cc @dchen1107 @vishh @timstclair
**Special notes for your reviewer**: none
**Release note**:
```release-note
Some container runtimes share a process (PID) namespace for all containers in a pod. This will become the default for Docker in a future release of Kubernetes. You can preview this functionality if running with the CRI and Docker 1.13.1 by enabling the --experimental-docker-enable-shared-pid kubelet flag.
```
Automatic merge from submit-queue (batch tested with PRs 45033, 44961, 45021, 45097, 44938)
Cleanup orphan logging that goes on in the sync loop.
**What this PR does / why we need it**:
Fixes#44937
**Before this PR** The older logs were like this:
```
E0426 00:06:33.763347 21247 kubelet_volumes.go:114] Orphaned pod "35c4a858-2a12-11e7-910c-42010af00003" found, but volume paths are still present on disk.
E0426 00:06:33.763400 21247 kubelet_volumes.go:114] Orphaned pod "e7676365-1580-11e7-8c27-42010af00003" found, but volume paths are still present on disk.
```
The problem being that, all the volumes were spammed w/ no summary info.
**After this PR** the logs look like this:
```
E0426 01:32:27.295568 22261 kubelet_volumes.go:129] Orphaned pod "408b060e-2a1d-11e7-90e8-42010af00003" found, but volume paths are still present on disk. : There were a total of 2 errors similar to this. Turn up verbosity to see them.
E0426 01:32:29.295515 22261 kubelet_volumes.go:129] Orphaned pod "408b060e-2a1d-11e7-90e8-42010af00003" found, but volume paths are still present on disk. : There were a total of 2 errors similar to this. Turn up verbosity to see them.
E0426 01:32:31.293180 22261 kubelet_volumes.go:129] Orphaned pod "408b060e-2a1d-11e7-90e8-42010af00003" found, but volume paths are still present on disk. : There were a total of 2 errors similar to this. Turn up verbosity to see them.
```
And with logging turned up, the extra info logs are shown with details:
```
E0426 01:34:21.933983 26010 kubelet_volumes.go:129] Orphaned pod "1c565800-2a20-11e7-bbc2-42010af00003" found, but volume paths are still present on disk. : There were a total of 3 errors similar to this. Turn up verbosity to see them.
I0426 01:34:21.934010 26010 kubelet_volumes.go:131] Orphan pod: Orphaned pod "1c565800-2a20-11e7-bbc2-42010af00003" found, but volume paths are still present on disk.
I0426 01:34:21.934015 26010 kubelet_volumes.go:131] Orphan pod: Orphaned pod "408b060e-2a1d-11e7-90e8-42010af00003" found, but volume paths are still present on disk.
I0426 01:34:21.934019 26010 kubelet_volumes.go:131] Orphan pod: Orphaned pod "e7676365-1580-11e7-8c27-42010af00003" found, but volume paths are still present on disk.
```
**Release note**
```release-note
Roll up volume error messages in the kubelet sync loop.
```
Several of these loops overlap, and when they are the reason a failure
is happening it is difficult to sort them out. Slighly misalign these
loops to make their impact obvious.
Automatic merge from submit-queue
don't HandleError on container start failure
Failing to start containers is a common error case if there is something wrong with the container image or environment like missing mounts/configs/permissions/etc. Not only is it common; it is reoccurring as backoff happens and new attempts to start the container are made. `HandleError` it too verbose for this very common situation.
Replace `HandleError` with `glog.V(3).Infof`
xref https://github.com/openshift/origin/issues/13889
@smarterclayton @derekwaynecarr @eparis
Automatic merge from submit-queue (batch tested with PRs 45052, 44983, 41254)
Non-controversial part of #44523
For easier review of #44523, i extracted the non-controversial part out to this PR.
Automatic merge from submit-queue (batch tested with PRs 42740, 44980, 45039, 41627, 45044)
Improved code coverage for /pkg/kubelet/types
**What this PR does / why we need it**:
The test coverage for /pkg/kubelet/types was increased from 50% to 87.5%
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 44970, 43618)
CRI: Fix StopContainer timeout
Fixes https://github.com/kubernetes/kubernetes/issues/44956.
I verified this PR with the example provided in https://github.com/kubernetes/kubernetes/issues/44956, and now pod deletion will respect grace period timeout:
```
NAME READY STATUS RESTARTS AGE
gracefully-terminating-pod 1/1 Terminating 0 6m
```
@dchen1107 @yujuhong @feiskyer /cc @kubernetes/sig-node-bugs
Add support for following redirects to the SpdyRoundTripper. This is
necessary for clients using it directly (e.g. the apiserver talking
directly to the kubelet) because the CRI streaming server issues a
redirect for streaming requests.
Also extract common logic for following redirects.
Automatic merge from submit-queue
Add bootstrap support to certificate manager.
Adds configuration options to certificate manager for using bootstrap cert/key
pairs to handle the scenario where new nodes are initialized using a generic
cert/key pair. Bootstrap cert/key pairs are quickly rotated, independent of
duration remaining, so that each kubelet has a unique cert/key pair.
Automatic merge from submit-queue (batch tested with PRs 42202, 40784, 44642, 44623, 44761)
fix comment error for network plugin
**What this PR does / why we need it**:
**Which issue this PR fixes** : fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
** reason for this change **
CNI has recently introduced a new configuration list feature. This
allows for plugin chaining. It also supports varied plugin versions.
Automatic merge from submit-queue (batch tested with PRs 41849, 42033)
fix TODO: find and add active pods for dswp
loops through the list of active pods and ensures that each one exists in the desired state of the world cache
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44469, 44566, 44467, 44526)
Kubelet:rkt Fix the hostPath Volume creation
**What this PR does / why we need it**:
This PR fix the `hostPath` volume when the path exist and it's not a directory.
At the moment, the creation of a `hostPath` volume for an existing file leads to this error:
> kubelet[1984]: E0413 07:53:16.480922 1984 pod_workers.go:184] Error syncing pod 38359a57-1fb1-11e7-a484-76870fe7db83, skipping: failed to SyncPod: mkdir /usr/share/coreos/lsb-release: not a directory
**Special notes for your reviewer**:
You can have a look to the difference with this [gist](https://gist.github.com/JulienBalestra/28ae15efc8a1393d350300880c07ff4f)
Automatic merge from submit-queue
comment spelling correction in custommetrics
**What this PR does / why we need it**: fix spelling in a comment
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40055, 42085, 44509, 44568, 43956)
Fix gofmt errors
**What this PR does / why we need it**:
There were some gofmt errors on master. Ran the following to fix:
```
hack/verify-gofmt.sh | grep ^diff | awk '{ print $2 }' | xargs gofmt -w -s
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: none
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44569, 44398)
Move v1/refs and v1/resource
This PR moves pkg/api/v1/ref.go and pkg/api/v1/resource_helper.go to their own sub packages, it's very similar to 44299 and 44302.
The PR is mostly mechanical, except that
* i moved some utility function from resource.go to pkg/api/v1/pod and pkg/api/v1/node, as they are more appropriate
* i updated the staging/copy.sh to copy the new subpackages, so that helper functions are copied. We can get rid of this copy after client-go stops copying API types.
rktnetes is not a CRI implementation, and does not provide runtime
conditions. This change fixes the issue where rkt will never be
considered running from kubelet's point of view.
Automatic merge from submit-queue (batch tested with PRs 44364, 44361, 42498)
Fix the certificate rotation threshold and add jitter.
Adjusts the certificate rotation threshold to be fixed, with some jitter to
spread out the load on the Certificate Signing Request API. The rotation
threshold is fixed at 20% now, meaning when 20% of the certificate's total
duration is remaining, the certificate manager will attempt to rotate, with
jitter +/-10%. For certificates of duration 1 month that means they will
rotate after 24 days, +/- 3 days.
On a 6000 node cluster, assuming all nodes added at nearly the same time, this
should result in 6000 nodes rotating spread over 6 days (total range of the
jitter), or ~42 nodes / hour requesting new certificates.
Automatic merge from submit-queue (batch tested with PRs 44406, 41543, 44071, 44374, 44299)
Decouple remotecommand
Refactored unversioned/remotecommand to decouple it from undesirable dependencies:
- term package now is not required, and functionality required to resize terminal size can be plugged in directly in kubectl
- in order to remove dependency on kubelet package - constants from kubelet/server/remotecommand were moved to separate util package (pkg/util/remotecommand)
- remotecommand_test.go moved to pkg/client/tests module
Automatic merge from submit-queue
CRI: Stop following container log when container exited.
Fixes https://github.com/kubernetes/kubernetes/issues/44340.
This PR changed kubelet to periodically check whether container is running when following container logs, and stop following when container exited.
I've tried this PR in my local cluster:
```
Wed Apr 12 20:23:54 UTC 2017
Wed Apr 12 20:23:58 UTC 2017
Wed Apr 12 20:24:02 UTC 2017
Wed Apr 12 20:24:06 UTC 2017
Wed Apr 12 20:24:10 UTC 2017
Wed Apr 12 20:24:14 UTC 2017
Wed Apr 12 20:24:18 UTC 2017
Wed Apr 12 20:24:22 UTC 2017
Wed Apr 12 20:24:26 UTC 2017
Wed Apr 12 20:24:30 UTC 2017
Wed Apr 12 20:24:34 UTC 2017
Wed Apr 12 20:24:38 UTC 2017
Wed Apr 12 20:24:42 UTC 2017
Wed Apr 12 20:24:46 UTC 2017
failed to wait logs for log file "/var/log/pods/1d54634c7b31346fc3219f5e0b7507cc/nginx_0.log": container "b9a17a2c53550c3703ab350d85911743af8bf164a41813544fd08fb9585f7501" is not running (state="CONTAINER_EXITED")
```
The only difference is that `ReadLogs` will return error when container exits during following. I'm not sure whether we should get rid of it or not.
@yujuhong @feiskyer @JorritSalverda
/cc @kubernetes/sig-node-bugs
**Release note**:
```release-note
`kubectl logs -f` now stops following when container stops.
```
Automatic merge from submit-queue
Add prometheus metrics for age of stats used for evictions.
Completes #42923
This PR adds metrics for evictions, and records how stale data used for evictions is.
cc @vishh @derekwaynecarr @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue
update docker version parser for its new versioning scheme
**What this PR does / why we need it**:
Docker has change its release strategy and versioning scheme from [v17.03.0-ce-rc1](https://github.com/docker/docker/releases/tag/v17.03.0-ce-rc1). We need to update the version verify condition to satisfy the new docker versions.
**Which issue this PR fixes** : fixes#44140
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43777, 44121)
Add patchMergeKey and patchStrategy support to OpenAPI
Support generating Open API extensions for strategic merge patch tags in go struct tags
Support `patchStrategy` and `patchMergeKey`.
Also support checking if the Open API extension and struct tags match.
```release-note
Support generating Open API extensions for strategic merge patch tags in go struct tags
```
cc: @pwittrock @ymqytw
(Description mostly copied from #43833)
Automatic merge from submit-queue (batch tested with PRs 43951, 43386)
Move & export ConstructPodPortMapping
ConstructPodPortMapping: move & export
Move ConstructPodPortMapping to pkg/kubelet/network/hostport and export
it so downstream projects (such as OpenShift) can use it.
cc @sttts @kubernetes/sig-node-pr-reviews @kubernetes/sig-network-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 43373, 41780, 44141, 43914, 44180)
kubelet: make dockershim.sock configurable
**What this PR does / why we need it**: allow the path to dockershim.sock to be configurable, so downstream projects such as OpenShift can run integration tests without needing to run them as root
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
```
cc @derekwaynecarr @sttts @kubernetes/rh-cluster-infra @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42025, 44169, 43940)
[CRI] Remove all containers in the sandbox
Remove all containers in the sandbox, when we remove the sandbox.
/cc @feiskyer @Random-Liu
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
Make the location of dockershim.sock configurable, so downstream
projects (such as OpenShift) can place it in a location that does not
require root access (e.g. for integration tests).
Make the kubelet respect and use the values of
--container-runtime-endpoint and --image-service-endpoint, if set. If
unset, the default value of /var/run/dockershim.sock is used.
Automatic merge from submit-queue
Fix container hostPid settings
**What this PR does / why we need it**:
HostPid is not set correctly for containers.
**Which issue this PR fixes**
Fixes#44041.
**Special notes for your reviewer**:
Should be cherry-picked into v1.6 branch.
**Release note**:
```release-note
Fix container hostPid settings.
```
cc @yujuhong @derekwaynecarr @unclejack @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue
Clearer ImageGC failure errors. Fewer events.
Addresses #26000. Kubelet often "fails" image garbage collection if cAdvisor has not completed the first round of stats collection. Don't create events for a single failure, and make log messages more specific.
@kubernetes/sig-node-bugs
Automatic merge from submit-queue
Support status.hostIP in downward API
**What this PR does / why we need it**:
Exposes pod's hostIP (node IP) via downward API.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes https://github.com/kubernetes/kubernetes/issues/24657
**Special notes for your reviewer**:
Not sure if there's more documentation that's needed, please point me in the right direction and I will add some :)
Automatic merge from submit-queue
Add separate KubeletFlags struct and remove HostnameOverride and NodeIP from config type
Add a separate flags struct for Kubelet flags
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration.This is a preleminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
/cc @ncdc @kubernetes/sig-node-misc @kubernetes/sig-cluster-lifecycle-misc
Automatic merge from submit-queue (batch tested with PRs 42973, 41582)
Improve status manager unit testing
This is designed to simplify testing logic in the status manager, and decrease reliance on syncBatch. This is a smaller portion of #37119, and should be easier to review than that change.
It makes the following changes:
- creates convenience functions for get, update, and delete core.Action
- prefers using syncPod on elements in the podStatusChannel to using syncBatch to reduce unintended reliance on syncBatch
- combines consuming, validating, and clearing actions into single verifyActions function. This replaces calls to testSyncBatch(), verifyActions(), and ClearActions
- changes comments in testing functions into log statements for easier debugging
@Random-Liu
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration. This is a preleminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
Automatic merge from submit-queue
[CRI] Use DNSOptions passed by CRI in dockershim.
When @xlgao-zju is working on the CRI validation test, he found that dockershim is not using the DNSOptions passed in CRI. https://github.com/kubernetes-incubator/cri-tools/pull/30#issuecomment-290644357
This PR fixed the issue. I've manually tried, for `ClusterFirst` DNSPolicy, the resolv.conf will be:
```
nameserver 8.8.8.8
search corp.google.com prod.google.com prodz.google.com google.com
options ndots:5
```
For `Default` DNSPolicy, the resolv.conf will be:
```
nameserver 127.0.1.1
search corp.google.com prod.google.com prodz.google.com google.com
```
@xlgao-zju You should be able to test after this PR is merged.
/cc @yujuhong @feiskyer
Automatic merge from submit-queue
test/e2e_node: prepull images with CRI
Part of https://github.com/kubernetes/kubernetes/issues/40739
- This PR builds on top of #40525 (and contains one commit from #40525)
- The second commit contains a tiny change in the `Makefile`.
- Third commit is a patch to be able to prepull images using the CRI (as opposed to run `docker` to pull images which doesn't make sense if you're using CRI most of the times)
Marked WIP till #40525 makes its way into master
@Random-Liu @lucab @yujuhong @mrunalp @rhatdan
Automatic merge from submit-queue
refactor getPidsForProcess and change error handling
xref https://github.com/openshift/origin/issues/13262
Right now, failure to read the docker pid from the pid file results in some premature nasty logging. There is still a chance we can get the docker pid from `procfs.PidOf()`. If that fails we should just log at `V(4)` rather than `runtime.HanldeError()`.
This PR refactors `getPidsForProcess()` to wait until both methods for determining the pid fail before logging anything.
@smarterclayton @ncdc @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 42379, 42668, 42876, 41473, 43260)
accurate hint
accurate hint
same err hint (Error adding network) in one method,cann't position problem
Automatic merge from submit-queue
Print dereferenced pod status fields when logging status update
Before: "Terminated:0xc421932af0"
After:"Terminated:&ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:0001-01-01 00:00:00 +0000 UTC,FinishedAt:2017-03-07 14:50:48 -0500 EST,ContainerID:docker://bd453bb969264b3ace2b3934a568af7679a0d51fee543a5f8a82429ff654970e,}"
"Ignoring same status for pod" messages already print status fully, these "Status for pod updated" messages should too IMO
```release-note
NONE
```
Automatic merge from submit-queue
Create subPaths and set their permissions like we do mountPaths
fixes https://github.com/kubernetes/kubernetes/issues/41638
If a subPath does not exist at the time MountVolume.Setup happens, SetVolumeOwnership will not have walked to the subPath and set appropriate permissions on it, leading to the above issue
So later, at makeMounts when we are parsing subPaths, let's create all subPaths and set their permissions according to how the parent mountPath looks.
```release-note
NONE
```
Automatic merge from submit-queue
kubelet: check and enforce minimum docker api version
**What this PR does / why we need it**:
This PR adds enforcing a minimum docker api version (same with what we have do for dockertools).
**Which issue this PR fixes**
Fixes#42696.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)
Fix tiny typo
**What this PR does / why we need it**:
**Which issue this PR fixes**
Fix type typo introduced by PR #43368.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)
Kubelet:rkt Create any missing hostPath Volumes
When using a `hostPath` inside the `Pod.spec.volumes`, this PR allows to creates any missing directory on the node.
**What this PR does / why we need it**:
With rkt as the container runtime we cannot use `hostPath` volumes if the directory is missing.
**Special notes for your reviewer**:
This PR follows [#39965](https://github.com/kubernetes/kubernetes/pull/39965)
The labels should be
> area/rkt
> area/kubelet
Automatic merge from submit-queue (batch tested with PRs 42998, 42902, 42959, 43020, 42948)
Add Host field to TCPSocketAction
Currently, TCPSocketAction always uses Pod's IP in connection. But when a pod uses the host network, sometimes firewall rules may prevent kubelet from connecting through the Pod's IP.
This PR introduces the 'Host' field for TCPSocketAction, and if it is set to non-empty string, the probe will be performed on the configured host rather than the Pod's IP. This gives users an opportunity to explicitly specify 'localhost' as the target for the above situations.
```release-note
Add Host field to TCPSocketAction
```
Automatic merge from submit-queue (batch tested with PRs 42672, 42770, 42818, 42820, 40849)
Return early from eviction debug helpers if !glog.V(3)
Should keep us from running a bunch of loops needlessly.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43653, 43654, 43652)
CRI: Check nil pointer to avoid kubelet panic.
When working on the containerd kubernetes integration, I casually returns an empty `sandboxStatus.Linux{}`, but it cause kubelet to panic.
This won't happen when runtime returns valid data, but we should not make the assumption here.
/cc @yujuhong @feiskyer
Automatic merge from submit-queue (batch tested with PRs 42522, 42545, 42556, 42006, 42631)
Use pod sandbox id in checkpoint
**What this PR does / why we need it**: we should log out sandbox id when checkpoint error
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 41139, 41186, 38882, 37698, 42034)
Make kubelet never delete files on mounted filesystems
With bug #27653, kubelet could remove mounted volumes and delete user data.
The bug itself is fixed, however our trust in kubelet is significantly lower.
Let's add an extra version of RemoveAll that does not cross mount boundary
(rm -rf --one-file-system).
It calls lstat(path) three times for each removed directory - once in
RemoveAllOneFilesystem and twice in IsLikelyNotMountPoint, however this way
it's platform independent and the directory that is being removed by kubelet
should be almost empty.
Automatic merge from submit-queue (batch tested with PRs 43533, 43539)
kuberuntime: don't override the pod IP for pods using host network
This fixes the issue of not passing pod IP via downward API for host network pods.
Automatic merge from submit-queue (batch tested with PRs 43465, 43529, 43474, 43521)
kubelet/cni: hook network plugin Status() up to CNI network discovery
Ensure that the plugin returns NotReady status until there is a
CNI network available which can be used to set up pods.
Fixes: https://github.com/kubernetes/kubernetes/issues/43014
I think the only reason it wasn't done like this in the first place was that the dynamic "reread /etc/cni/net.d every 10s forever" was added long after the Status() hook was. What do you think?
@freehan @caseydavenport @luxas @jbeda
Automatic merge from submit-queue (batch tested with PRs 43398, 43368)
CRI: add support for dns cluster first policy
**What this PR does / why we need it**:
PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools was updated to support the feature.
This PR updates kuberuntime to support it for all runtimes.
**Which issue this PR fixes**
fixes#43352
**Special notes for your reviewer**:
Candidate for v1.6.
**Release note**:
```release-note
NONE
```
cc @thockin @luxas @vefimova @Random-Liu
PR #29378 introduces ClusterFirstWithHostNet, but docker doesn't support
setting dns options togather with hostnetwork. This commit rewrites
resolv.conf same as dockertools.
PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools
was updated to support the feature. This PR updates kuberuntime to
support it for all runtimes.
Also fixes#43352.
Automatic merge from submit-queue (batch tested with PRs 42828, 43116)
Apply taint tolerations for NoExecute for all static pods.
Fixed https://github.com/kubernetes/kubernetes/issues/42753
**Release note**:
```
Apply taint tolerations for NoExecute for all static pods.
```
cc/ @davidopp
Automatic merge from submit-queue (batch tested with PRs 40964, 42967, 43091, 43115)
Improve code coverage for pkg/kubelet/status/generate.go
**What this PR does / why we need it**:
Improve code coverage for pkg/kubelet/status/generate.go from #39559
Thanks.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 42747, 43030)
dockershim: remove corrupted sandbox checkpoints
This is a workaround to ensure that kubelet doesn't block forever when
the checkpoint is corrupted.
This is a workaround for #43021
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)
[Bug] Handle container restarts and avoid using runtime pod cache while allocating GPUs
Fixes#42412
**Background**
Support for multiple GPUs is an experimental feature in v1.6.
Container restarts were handled incorrectly which resulted in stranding of GPUs
Kubelet is incorrectly using runtime cache to track running pods which can result in race conditions (as it did in other parts of kubelet). This can result in same GPU being assigned to multiple pods.
**What does this PR do**
This PR tracks assignment of GPUs to containers and returns pre-allocated GPUs instead of (incorrectly) allocating new GPUs.
GPU manager is updated to consume a list of active pods derived from apiserver cache instead of runtime cache.
Node e2e has been extended to validate this failure scenario.
**Risk**
Minimal/None since support for GPUs is an experimental feature that is turned off by default. The code is also isolated to GPU manager in kubelet.
**Workarounds**
In the absence of this PR, users can mitigate the original issue by setting `RestartPolicyNever` in their pods.
There is no workaround for the race condition caused by using the runtime cache though.
Hence it is worth including this fix in v1.6.0.
cc @jianzhangbjz @seelam @kubernetes/sig-node-pr-reviews
Replaces #42560
Currently, TCPSocketAction always uses Pod's IP in connection. But when a
pod uses the host network, sometimes firewall rules may prevent kubelet
from connecting through the Pod's IP. This PR introduces the 'Host' field
for TCPSocketAction, and if it is set to non-empty string, the probe will
be performed on the configured host rather than the Pod's IP. This gives
users an opportunity to explicitly specify 'localhost' as the target for
the above situations.
Automatic merge from submit-queue
Invalid environment var names are reported and pod starts
When processing EnvFrom items, all invalid keys are collected and
reported as a single event.
The Pod is allowed to start.
fixes#42583
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)
Dropped docker 1.9.x support. Changed the minimumDockerAPIVersion to
1.22
cc/ @Random-Liu @yujuhong
We talked about dropping docker 1.9.x support for a while. I just realized that we haven't really done it yet.
```release-note
Dropped the support for docker 1.9.x and the belows.
```
Automatic merge from submit-queue
dockershim: Fix the race condition in ListPodSandbox
In ListPodSandbox(), we
1. List all sandbox docker containers
2. List all sandbox checkpoints. If the checkpoint does not have a
corresponding container in (1), we return partial result based on
the checkpoint.
The problem is that new PodSandboxes can be created between step (1) and
(2). In those cases, we will see the checkpoints, but not the sandbox
containers. This leads to strange behavior because the partial result
from the checkpoint does not include some critical information. For
example, the creation timestamp'd be zero, and that would cause kubelet's
garbage collector to immediately remove the sandbox.
This change fixes that by getting the list of checkpoints before listing
all the containers (since in RunPodSandbox we create them in the reverse
order).