Automatic merge from submit-queue
Update OWNERS to correct members' handles
**What this PR does / why we need it**:
Fix some typos of members' handles as per https://github.com/kubernetes/kubernetes/issues/50048#issuecomment-319831957.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Associated with: #50048
**Special notes for your reviewer**:
/cc @madhusudancs @sebgoa @liggitt @saad-ali
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)
Fail on swap enabled and deprecate experimental-fail-swap-on flag
**What this PR does / why we need it**:
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true
Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#34726
**Special notes for your reviewer**:
**Release note**:
```release-note
Kubelet will by default fail with swap enabled from now on. The experimental flag "--experimental-fail-swap-on" has been deprecated, please set the new "--fail-swap-on" flag to false if you wish to run with /proc/swaps on.
```
Requires a separate transport that is guaranteed not to be HTTP/2 for
exec/attach/portforward, because otherwise the Go client attempts to
upgrade us to HTTP/2 first.
Automatic merge from submit-queue
kubelet: remove code for handling old pod/containers paths
**What this PR does / why we need it**:
This PR removes the code for handling the paths that has been deprecated for a long time.
**Release note**:
```release-note
NONE
```
CC @simo5
Automatic merge from submit-queue (batch tested with PRs 50103, 49677, 49449, 43586, 48969)
Do not try to run preStopHook when the gracePeriod is 0
**What this PR does / why we need it**:
1. Sometimes when the user force deletes a POD with no gracePeriod, its possible that kubelet attempts to execute the preStopHook which will certainly fail. This PR prevents this inavitable PreStopHook failure.
```
kubectl delete --force --grace-period=0 po/<pod-name>
```
2. This also adds UT for LifeCycle Hooks
```
time go test --cover -v --run "Hook" ./pkg/kubelet/kuberuntime/
.
.
.
--- PASS: TestLifeCycleHook (0.00s)
--- PASS: TestLifeCycleHook/PreStop-CMDExec (0.00s)
--- PASS: TestLifeCycleHook/PreStop-HTTPGet (0.00s)
--- PASS: TestLifeCycleHook/PreStop-NoTimeToRun (0.00s)
--- PASS: TestLifeCycleHook/PostStart-CmdExe (0.00s)
PASS
coverage: 15.3% of statements
ok k8s.io/kubernetes/pkg/kubelet/kuberuntime 0.429s
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
Do not try to run preStopHook when the gracePeriod is 0
```
Automatic merge from submit-queue
Fix incorrect owner in OWNERS
**What this PR does / why we need it**:
typo: yuyuhong --> yujuhong
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/50048#issuecomment-319846621
**Special notes for your reviewer**:
/assign @yujuhong
I don't know whether you can approve this PR or not in such case 😄
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Switch from package syscall to golang.org/x/sys/unix
**What this PR does / why we need it**:
The syscall package is locked down and the comment in https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24 advises to switch code to use the corresponding package from golang.org/x/sys. This PR does so and replaces usage of package syscall with package golang.org/x/sys/unix where applicable. This will also allow to get updates and fixes
without having to use a new go version.
In order to get the latest functionality, golang.org/x/sys/ is re-vendored. This also allows to use Eventfd() from this package instead of calling the eventfd() C function.
**Special notes for your reviewer**:
This follows previous works in other Go projects, see e.g. moby/moby#33399, cilium/cilium#588
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49237, 49656, 49980, 49841, 49899)
certificate manager: close existing client conns once cert rotates
After the kubelet rotates its client cert, it will keep connections to the API server open indefinitely, causing it to use its old credentials instead of the new certs. Because the API server authenticates client certs at the time of the request, and not the handshake, this could cause the kubelet to start hitting auth failures even if it rotated its certificate to a new, valid one.
When the kubelet rotates its cert, close down existing connections to force a new TLS handshake.
Ref https://github.com/kubernetes/features/issues/266
Updates https://github.com/kubernetes-incubator/bootkube/pull/663
```release-note
After a kubelet rotates its client cert, it now closes its connections to the API server to force a handshake using the new cert. Previously, the kubelet could keep its existing connection open, even if the cert used for that connection was expired and rejected by the API server.
```
/cc @kubernetes/sig-auth-bugs
/assign @jcbsmpsn @mikedanese
Automatic merge from submit-queue (batch tested with PRs 49237, 49656, 49980, 49841, 49899)
[Bug Fix] Set NodeOODCondition to false
fixes#49839, which was introduced by #48846
This PR makes the kubelet set NodeOODCondition to false, so that the scheduler and other controllers do not consider the node to be unschedulable.
/assign @vishh
/sig node
/release-note-none
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true
Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
Automatic merge from submit-queue
If error continue for loop
If err does not add continue, type conversion will be error.
If do not add continue, pod. (* V1.Pod) may cause panic to run.
**Release note**:
```release-note
NONE
```
After the kubelet rotates its client cert, it will keep connections
to the API server open indefinitely, causing it to use its old
credentials instead of the new certs
When the kubelet rotates its cert, close down existing connections
to force a new TLS handshake.
Automatic merge from submit-queue (batch tested with PRs 49284, 49555, 47639, 49526, 49724)
skip WaitForAttachAndMount for terminated pods in syncPod
Fixes https://github.com/kubernetes/kubernetes/issues/49663
I tried to tread lightly with a small localized change because this needs to be picked to 1.7 and 1.6 as well.
I suspect this has been as issue since we started unmounting volumes on pod termination https://github.com/kubernetes/kubernetes/pull/37228
xref openshift/origin#14383
@derekwaynecarr @eparis @smarterclayton @saad-ali @jwforres
/release-note-none
Automatic merge from submit-queue (batch tested with PRs 49284, 49555, 47639, 49526, 49724)
Change pod config to manifest
**What this PR does / why we need it**:
As per https://github.com/kubernetes/kubernetes/pull/46494#discussion_r119675805, change `config` to `manifest` to avoid ambiguous.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Since it's a minor fix, there is no issue here.
/cc @mtaufen
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747)
Add support for `no_new_privs` via AllowPrivilegeEscalation
**What this PR does / why we need it**:
Implements kubernetes/community#639
Fixes#38417
Adds `AllowPrivilegeEscalation` and `DefaultAllowPrivilegeEscalation` to `PodSecurityPolicy`.
Adds `AllowPrivilegeEscalation` to container `SecurityContext`.
Adds the proposed behavior to `kuberuntime`, `dockershim`, and `rkt`. Adds a bunch of unit tests to ensure the desired default behavior and that when `DefaultAllowPrivilegeEscalation` is explicitly set.
Tests pass locally with docker and rkt runtimes. There are also a few integration tests with a `setuid` binary for sanity.
**Release note**:
```release-note
Adds AllowPrivilegeEscalation to control whether a process can gain more privileges than it's parent process
```
The POSIX standard restricts environment variable names to uppercase
letters, digits, and the underscore character in shell contexts only.
For generic application usage, it is stated that all other characters
shall be tolerated.
This change relaxes the rules to some degree. Namely, we stop requiring
environment variable names to be strict C_IDENTIFIERS and start
permitting lowercase, dot, and dash characters.
Public container images using environment variable names beyond the
shell-only context can benefit from this relaxation. Elasticsearch is
one popular example.
Automatic merge from submit-queue (batch tested with PRs 47738, 49196, 48907, 48533, 48822)
Fix TODO: rename podInfraContainerID to sandboxID
**What this PR does / why we need it**:
Code-cleanup in kubelet to use consistent naming for sandbox ID. Not super urgent, but thought it would be nice to knock off some TODOs.
**Which issue this PR fixes**
Fixes a TODO in the code, no associated issue.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47357, 49514, 49271, 49572, 49476)
Using only the exit code to decide when to fall back on logs
We expect the exit code to be non-zero if the the container process was
OOM killed. Remove the check that uses the "Reason" field.
Automatic merge from submit-queue (batch tested with PRs 45040, 48960)
Do not generate warning event on dns search deduplication
In the case that the node is able to use the cluster DNS, `cluster.local` will already be in the node search domains in `/etc/resolv.conf`. The kubelet then parses `/etc/resolv.conf` on the node and combines it with $namespace.svc.$clusterDomain, svc.$clusterDomain, and $clusterDomain to create the pod DNS search domains. clusterDomain is `cluster.local` by default. This causes the code to generate a Warning event visible to the user for _every_ pod:
```
Warning DNSSearchForming Found and omitted duplicated dns domain in host search line: 'cluster.local' during merging with cluster dns domains
```
This is really overkill. IMHO, this should be done in the background with no user level notification or logging at all.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1471198
@derekwaynecarr @eparis @vefimova
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Use presence of kubeconfig file to toggle standalone mode
Fixes#40049
```release-note
The deprecated --api-servers flag has been removed. Use --kubeconfig to provide API server connection information instead. The --require-kubeconfig flag is now deprecated. The default kubeconfig path is also deprecated. Both --require-kubeconfig and the default kubeconfig path will be removed in Kubernetes v1.10.0.
```
/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-misc
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Remove duplicated import and wrong alias name of api package
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes#48975
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove flags low-diskspace-threshold-mb and outofdisk-transition-frequency
issue: #48843
This removes two flags replaced by the eviction manager. These have been depreciated for two releases, which I believe correctly follows the kubernetes depreciation guidelines.
```release-note
Remove depreciated flags: --low-diskspace-threshold-mb and --outofdisk-transition-frequency, which are replaced by --eviction-hard
```
cc @mtaufen since I am changing kubelet flags
cc @vishh @derekwaynecarr
/sig node
Automatic merge from submit-queue (batch tested with PRs 48636, 49088, 49251, 49417, 49494)
Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
This PR fixes issue #47809
Automatic merge from submit-queue (batch tested with PRs 49444, 47864, 48584, 49395, 49118)
Move event type
Change SandboxChanged to a constant and move to the event package below.
**Release note**:
```release-note
NONE
```
Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
The syscall package is locked down and the comment in [1] advises to
switch code to use the corresponding package from golang.org/x/sys. Do
so and replace usage of package syscall with package
golang.org/x/sys/unix where applicable.
[1] https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24
This will also allow to get updates and fixes for syscall wrappers
without having to use a new go version.
Errno, Signal and SysProcAttr aren't changed as they haven't been
implemented in /x/sys/. Stat_t from syscall is used if standard library
packages (e.g. os) require it. syscall.SIGTERM is used for
cross-platform files.
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
Make sure the previous symlink file is deleted before trying to create a new one
**What this PR does / why we need it**:
It deletes possibly existing symlinks to container log files.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes#49105
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 48981, 47316, 49180)
Add seccomp profile in sandbox security context
**What this PR does / why we need it**:
PR #46332 adds seccomp profile to container security context, but not sandbox. This PR adds seccomp profile in sandbox security context. Without this, we couldn't honour "seccomp.security.alpha.kubernetes.io/pod" for sandbox.
**Which issue this PR fixes**
fixes#49179.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/cc @yujuhong
Automatic merge from submit-queue (batch tested with PRs 48981, 47316, 49180)
Added golint check for pkg/kubelet.
**What this PR does / why we need it**:
Added golint check for pkg/kubelet, and make golint happy.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47315
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
Restore cAdvisor prometheus metrics to the main port
But under a new path - `/metrics/cadvisor`. This ensures a secure port still exists for metrics while getting the benefit of separating out container metrics from the kubelet's metrics as recommended in the linked issue.
Fixes#48483
```release-note-action-required
Restored cAdvisor prometheus metrics to the main port -- a regression that existed in v1.7.0-v1.7.2
cAdvisor metrics can now be scraped from `/metrics/cadvisor` on the kubelet ports.
Note that you have to update your scraping jobs to get kubelet-only metrics from `/metrics` and `container_*` metrics from `/metrics/cadvisor`
```
Automatic merge from submit-queue (batch tested with PRs 49116, 49095)
Move pkg/api/v1/ref -> client-go/tools/reference
`pkg/api/v1/ref` is the only remaining package copied from pkg/api/v1 to client-go via staging/copy.sh.
Automatic merge from submit-queue (batch tested with PRs 48702, 48965, 48740, 48974, 48232)
Cleanup the conversion of ObjectReference
**What this PR does / why we need it**:
No need to convert ObjectReference as `k8s.io/kubernetes/pkg/api/v1` and `k8s.io/client-go/pkg/api/v1` has been consistent in `k8s.io/api/core/v1`.
**Which issue this PR fixes**: fixes#48747
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48481, 48256)
Refactor: pkg/util into sub-pkgs
**What this PR does / why we need it**:
- move code in pkg/util into sub-pkgs
- delete some unused funcs
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#15634
**Special notes for your reviewer**:
This is the final work of #15634. It will close that issue.
/cc @thockin
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46094, 48544, 48807, 49102, 44174)
add InstanceID to fake cadvisor (used in Kubemark)
This change is for setting Node.Spec.ProviderID field in Kubemark hollow nodes. It shouldn't affect other tests using cadvisor.Fake as field is nil by default.
cc @gmarek
Automatic merge from submit-queue (batch tested with PRs 46094, 48544, 48807, 49102, 44174)
Static deepcopy – phase 1
This PR is the follow-up of https://github.com/kubernetes/kubernetes/pull/36412, replacing the
dynamic reflection based deepcopy with static DeepCopy+DeepCopyInto methods on API types.
This PR **does not yet** include the code dropping the cloner from the scheme and all the
porting of the calls to scheme.Copy. This will be part of a follow-up "Phase 2" PR.
A couple of the commits will go in first:
- [x] audit: fix deepcopy registration https://github.com/kubernetes/kubernetes/pull/48599
- [x] apimachinery+apiserver: separate test types in their own packages #48601
- [x] client-go: remove TPR example #48604
- [x] apimachinery: remove unneeded GetObjectKind() impls #48608
- [x] sanity check against origin, that OpenShift's types are fine for static deepcopy https://github.com/deads2k/origin/pull/34
TODO **after** review here:
- [x] merge https://github.com/kubernetes/gengo/pull/32 and update vendoring commit
But under a new path - `/metrics/cadvisor`. This ensures a secure port
still exists for metrics while getting the benefit of separating out
container metrics from the kubelet's metrics.
Automatic merge from submit-queue (batch tested with PRs 48231, 47377, 48797, 49020, 49033)
Migrate kubelet and linked code from clientset_generated to client-go
Ran a script in the kubernetes repo to migrate kubelet and linked code from clientset_generated package imports to client-go imports.
**NOTE:** There are also some manual changes that were made in order to accommodate some
differences between clientset_generated and client-go. The manual changes are isolated into a
single commit titled "Manual changes."
```sh
#! /bin/bash
for file in $(find . \( -name "clientset_generated" -o -name "informers_generated" \) -prune -o -type f -name "*.go"); do
if [[ -d $file ]]; then
continue
fi
if [[ $file == "./cmd/libs/go2idl/informer-gen/main.go" ]]; then
continue
fi
sed -i '
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset"|clientset "k8s.io/client-go/kubernetes"|;
# Correct a couple of unique cases.
s|clientset clientset "k8s.io/client-go/kubernetes"|clientset "k8s.io/client-go/kubernetes"|;
s|cs clientset "k8s.io/client-go/kubernetes"|clientset "k8s.io/client-go/kubernetes"|;
s|VersionedClientSetPackage: clientset "k8s.io/client-go/kubernetes"|VersionedClientSetPackage: "k8s.io/client-go/kubernetes"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/certificates/v1beta1"|"k8s.io/client-go/kubernetes/typed/certificates/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/core/v1"|"k8s.io/client-go/kubernetes/typed/core/v1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/extensions/v1beta1"|"k8s.io/client-go/kubernetes/typed/extensions/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/autoscaling/v1"|"k8s.io/client-go/kubernetes/typed/autoscaling/v1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/authentication/v1"|"k8s.io/client-go/kubernetes/typed/authentication/v1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/authorization/v1beta1"|"k8s.io/client-go/kubernetes/typed/authorization/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/apps/v1beta1"|"k8s.io/client-go/kubernetes/typed/apps/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/rbac/v1beta1"|"k8s.io/client-go/kubernetes/typed/rbac/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/fake"|"k8s.io/client-go/kubernetes/fake"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/core/v1/fake"|"k8s.io/client-go/kubernetes/typed/core/v1/fake"|;
s|k8s.io/kubernetes/pkg/client/clientset_generated/clientset|k8s.io/client-go/kubernetes|;
s|informers "k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions"|"k8s.io/client-go/informers"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/core/v1"|"k8s.io/client-go/informers/core/v1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/apps/v1beta1"|"k8s.io/client-go/informers/apps/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/extensions/v1beta1"|"k8s.io/client-go/informers/extensions/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/batch/v1"|"k8s.io/client-go/informers/batch/v1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/autoscaling/v1"|"k8s.io/client-go/informers/autoscaling/v1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/policy/v1beta1"|"k8s.io/client-go/informers/policy/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/certificates/v1beta1"|"k8s.io/client-go/informers/certificates/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/storage/v1"|"k8s.io/client-go/informers/storage/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/core/v1"|"k8s.io/client-go/listers/core/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/apps/v1beta1"|"k8s.io/client-go/listers/apps/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/listers/extensions/v1beta1"|"k8s.io/client-go/listers/extensions/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/listers/autoscaling/v1"|"k8s.io/client-go/listers/autoscaling/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/batch/v1"|"k8s.io/client-go/listers/batch/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/certificates/v1beta1"|"k8s.io/client-go/listers/certificates/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/listers/storage/v1"|"k8s.io/client-go/listers/storage/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/policy/v1beta1"|"k8s.io/client-go/listers/policy/v1beta1"|;
' $file
done
hack/update-bazel.sh
hack/update-gofmt.sh
```
Automatic merge from submit-queue (batch tested with PRs 49017, 45440, 48384, 45894, 48808)
Fix typo in ExecCommandParam
**What this PR does / why we need it**: Makes ExecCommandParam look like all of the other "Param"s
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48997, 48595, 48898, 48711, 48972)
Revert "Use go-ansiterm version matching docker/pkg/term/windows v1.11"
This reverts commit 72044a11a1.
**What this PR does / why we need it**: earlier this week, #47140 updated the vendored azure dependencies, which broke the windows build because the docker dependencies were too old. #48933 was merged, which reverted part of #47140 and fixed the build, but then #48308, which updated the vendored docker dependencies, broke the windows build again.
By reverting #48933, we should get back to a working build, I hope.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48887
**Release note**:
```release-note
NONE
```
/release-note-none
/test pull-kubernetes-cross
/assign @brendandburns
cc @karataliu @yguo0905 @yujuhong @dchen1107
Automatic merge from submit-queue
Fix comments and typo in the error message
**What this PR does / why we need it**:
This PR fixes outdated comments and typo in the error message.
**Release note**:
```release-note
NONE
```
CC @simo5
Automatic merge from submit-queue
Revert workaround in PR 46246 as APIs have been consistent
**What this PR does / why we need it**:
No need to convert v1.ObjectReference as APIs have been consistent in `k8s.io/api/core/v1`.
**Which issue this PR fixes** : fixes#48668
**Special notes for your reviewer**:
/assign @derekwaynecarr @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add OWNERS file to kubelet gpu package
GPU support is ramping up and we do not have a lot of reviewers that
are familiar with the codebase. I added myself as a reviewer and
copied a few people from the kubelet OWNERS file as approvers.
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
**Release note**:
```
NONE
```
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
Automatic merge from submit-queue
Fix subPath existence check to not follow symlink
**What this PR does / why we need it**:
Volume mounting logic introduced in #43775 and #45623 checks
for subPath existence before attempting to create a directory,
should subPath not be present.
This breaks if subPath is a dangling symlink, os.Stat returns
"do not exist" status, yet `os.MkdirAll` can't create directory
as symlink is present at the given path.
This patch makes existence check to use os.Lstat which works for
normal files/directories as well as doesn't not attempt to follow
symlink, therefore it's "do not exist" status is more reliable when
making a decision whether to create directory or not.
subPath symlinks can be dangling in situations where kubelet is
running in a container itself with access to docker socket, such
as CoreOS's kubelet-wrapper script
**Release note**:
```release-note
Fix pods failing to start when subPath is a dangling symlink from kubelet point of view, which can happen if it is running inside a container
```
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Prepare to introduce websockets for exec and portforward
Refactor the code in remotecommand to better represent the structure of
what is common between portforward and exec.
Ref #48633
Automatic merge from submit-queue (batch tested with PRs 48425, 41680, 48457, 48619, 48635)
Improved code coverage for pkg/kubelet/types/pod_update
The test coverage for pod_update.go was imprved from 36% to 100%.
**What this PR does / why we need it**:
This fixed part of #40780
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 48405, 48742, 48748, 48571, 48482)
dockershim: clean up unused security context code
Most of the code in the `dockershim/securitycontext` package are
unused and can be removed. This PR migrates the rest of the code,
cleans it up (e.g., remove references to kubernetes API objects),
and removes the package entirely.
Automatic merge from submit-queue (batch tested with PRs 48698, 48712, 48516, 48734, 48735)
Name change: s/timstclair/tallclair/
I changed my name, and I'm migrating my user name to be consistent.
Automatic merge from submit-queue (batch tested with PRs 48698, 48712, 48516, 48734, 48735)
share iptables util client within kubenet
reduce the number of goroutine waiting for dbus.
Automatic merge from submit-queue (batch tested with PRs 47232, 48625, 48613, 48567, 39173)
Fix issue when setting fileysystem capacity in container manager
In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
This PR fixes issue #48452
Automatic merge from submit-queue (batch tested with PRs 48402, 47203, 47460, 48335, 48322)
Added case on 'terminated-but-not-yet-deleted' for Admit.
**What this PR does / why we need it**:
Added case on 'terminated-but-not-yet-deleted' for Admit.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47867
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
Local storage teardown fix
**What this PR does / why we need it**: Local storage uses bindmounts and the method IsLikelyNotMountPoint does not detect these as mountpoints. Therefore, local PVs are not properly unmounted when they are deleted.
**Which issue this PR fixes**: fixes#48331
**Special notes for your reviewer**:
You can use these e2e tests to reproduce the issue and validate the fix works appropriately https://github.com/kubernetes/kubernetes/pull/47999
The existing method IsLikelyNotMountPoint purposely does not check mountpoints reliability (4c5b22d4c6/pkg/util/mount/mount_linux.go (L161)), since the number of mountpoints can be large. 4c5b22d4c6/pkg/util/mount/mount.go (L46)
This implementation changes the behavior for local storage to detect mountpoints reliably, and avoids changing the behavior for any other callers to a UnmountPath.
**Release note**:
```
Fixes bind-mount teardown failure with non-mount point Local volumes (issue https://github.com/kubernetes/kubernetes/issues/48331).
```
Automatic merge from submit-queue (batch tested with PRs 44412, 44810, 47130, 46017, 47829)
recheck pod volumes before marking pod as processed
This PR allows a pod's volumes to be re-checked until all are added correctly. There's a limited amount of time when a persistent volume claim is still in the Pending phase, and if a pod is created in that time, the volume will not be added. The issue is not uncommon with helm charts that create all objects in close succession, particularly when using aws-ebs volumes.
fixes#28962
Added IsNotMountPoint method to mount utils (pkg/util/mount/mount.go)
Added UnmountMountPoint method to volume utils (pkg/volume/util/util.go)
Call UnmountMountPoint method from local storage (pkg/volume/local/local.go)
IsLikelyNotMountPoint behavior was not modified, so the logic/behavior for UnmountPath is not modified
In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
Volume mounting logic introduced in #43775 and #45623 checks
for subPath existence before attempting to create a directory,
should subPath not be present.
This breaks if subPath is a dangling symlink, os.Stat returns
"do not exist" status, yet `os.MkdirAll` can't create directory
as symlink is present at the given path.
This patch makes existence check to use os.Lstat which works for
normal files/directories as well as doesn't not attempt to follow
symlink, therefore it's "do not exist" status is more reliable when
making a decision whether to create directory or not.
subPath symlinks can be dangling in situations where kubelet is
running in a container itself with access to docker socket, such
as CoreOS's kubelet-wrapper script
Automatic merge from submit-queue (batch tested with PRs 48518, 48525, 48269)
Move the kubelet certificate management code into a single package
Code is very similar and belongs together. Will allow future cert callers to potentially make this more generic, as well as to make it easier reuse code elsewhere.
Automatic merge from submit-queue (batch tested with PRs 47327, 48194)
Checked container spec when killing container.
**What this PR does / why we need it**:
Checked container spec when getting container, return error if failed.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48173
**Release note**:
```release-note-none
```
Automatic merge from submit-queue (batch tested with PRs 47700, 48464, 48502)
Provide a way to setup the limit NO files for rkt Pods
**What this PR does / why we need it**:
This PR allows to customize the Systemd unit files for rkt pods.
We start with the `systemd-unit-option.rkt.kubernetes.io/LimitNOFILE` to allows to run workloads like etcd, ES in kubernetes with rkt.
**Special notes for your reviewer**:
Once again, I followed @yifan-gu guidelines.
I made a basic check over the values given inside the `systemd-unit-option.rkt.kubernetes.io/LimitNOFILE` (integer and > 0).
If this check fails: I simply ignore the field.
The other implementation would be to fail the whole SetUpPod.
We discussed using a key like `rkt.kubernetes.io/systemd-unit-option/LimitNOFILE` but the validation only allows a single `/` in this field:
```The Deployment "tiller" is invalid: spec.template.annotations: Invalid value: "rkt.kubernetes.io/systemd-unit-option/LimitNOFILE": a qualified name must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName', or 'my.name', or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]') with an optional DNS subdomain prefix and '/' (e.g. 'example.com/MyName')```
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45467, 48091, 48033, 48498)
Allow Kubenet with ipv6
When running kubenet with IPv6, there is a panic as there
is IPv4 specific code the Event function.
With this change, Event will support IPv4 and IPv6
**What this PR does / why we need it**:
This PR allows kubenet to use IPv6. Currently there is a panic in kubenet_linux.go
as there is IPv4 specific code.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48089
**Special notes for your reviewer**:
**Release note**:
```release-note-NONE
```
Automatic merge from submit-queue (batch tested with PRs 48317, 48313, 48351, 48357, 48115)
Cleanup lint errors in the pkg/kubelet/server/... directory
Cleanup some issues that were found while experimenting with https://github.com/alecthomas/gometalinter on the `pkg/kubelet/server/...` directories.
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
move term to kubectl/util
move term from pkg/util/term to pkg/kubectl/util/term
remove dependency of `k8s.io/kubernetes/pkg/util/term` for `pkg/kubelet/dockershim/exec.go` and `pkg/kubelet/dockershim/exec.go`
Ref: https://github.com/kubernetes/kubernetes/issues/48209
```release-note
NONE
```
/assign @apelisse @monopole
cc: @pwittrock
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
Add unit test coverage for nvidiaGPUManager initialization
Part of #47750
```release-note
NONE
```
Since v1.5 and the removal of --configure-cbr0:
0800df74ab "Remove the legacy networking mode --configure-cbr0"
kubelet hasn't done any shaping operations internally. They
have all been delegated to network plugins like kubenet or
external CNI plugins. But some shaping code was still left
in kubelet, so remove it now that it's unused.
Automatic merge from submit-queue (batch tested with PRs 47850, 47835, 46197, 47250, 48284)
dockershim: checkpoint HostNetwork property
To ensure kubelet doesn't attempt network teardown on HostNetwork
containers that no longer exist but are still checkpointed, make
sure we preserve the HostNetwork property in checkpoints. If
the checkpoint indicates the container was a HostNetwork one,
don't tear down the network since that would fail anyway.
Related: https://github.com/kubernetes/kubernetes/issues/44307#issuecomment-299548609
@freehan @kubernetes/sig-network-misc
Automatic merge from submit-queue
Add type conversion judgment
If do not type conversion judgment, there may be panic.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove repeat type conversions
Here is the type of conversion for the variable is repeated.
**Release note**:
```release-note
NONE
```
GPU support is ramping up and we do not have a lot of reviewers that
are familiar with the codebase. I added myself as a reviewer and
copied a few people from the kubelet OWNERS file as approvers.
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
Automatic merge from submit-queue (batch tested with PRs 48123, 48079)
[Kubelet] Fix race condition in container manager
**What this PR does / why we need it**:
This fixes a race condition where the container manager capacity map was being updated without synchronization. It moves the storage capacity detection to kubelet initialization, which happens serially in one thread.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48045
**Release note**:
```release-note
Fixes kubelet race condition in container manager.
```
Centralize Capacity discovery of standard resources in Container manager.
Have storage derive node capacity from container manager.
Move certain cAdvisor interfaces to the cAdvisor package in the process.
This patch fixes a bug in container manager where it was writing to a map without synchronization.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
Automatic merge from submit-queue (batch tested with PRs 47484, 47904, 48034)
fix nits in kubelet server
Signed-off-by: allencloud <allen.sun@daocloud.io>
**What this PR does / why we need it**:
fix nits in kubelet server
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44058, 48085, 48077, 48076, 47823)
don't pass CRI error through to waiting state reason
Raw gRPC errors are getting into the `Reason` field of the container status `State`, causing it to be output inline on a `kubectl get pod`
xref https://bugzilla.redhat.com/show_bug.cgi?id=1449820
Basically the issue is that the err and msg are reversed in `startContainer()`. The msg is short and the err is long. It should be the other way around.
This PR changes `startContainer()` to return a short error that becomes the Reason and the extracted gPRC error description that becomes the Message.
@derekwaynecarr @smarterclayton @eparis
Automatic merge from submit-queue
kubelet should let cloud-controller-manager set the node addresses
*Before this change:*
1. cloud-controller-manager sets all the addresses for a node.
2. kubelet on that node replaces these addresses with an incomplete set. (i.e. replace InternalIP and Hostname and delete all other addresses--ExternalIP, etc.)
*After this change:*
kubelet doesn't touch its node's addresses when there is an external cloudprovider.
Fixes#47155
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47694, 47772, 47783, 47803, 47673)
Make different container runtimes constant
Make different container runtimes constant to avoid hardcode
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47915, 47856, 44086, 47575, 47475)
kubelet should resume csr bootstrap
Right now the kubelet creates a new csr object with the same key every
time it restarts during the bootstrap process. It should resume with the
old csr object if it exists. To do this the name of the csr object must
be stable.
Issue https://github.com/kubernetes/kubernetes/issues/47855
Automatic merge from submit-queue (batch tested with PRs 47227, 47119, 46280, 47414, 46696)
Move seccomp helper methods and tests to platform-specific files.
**What this PR does / why we need it**:
Seccomp helper methods are for linux only, move them to linux-specific helper file.
As discussed in https://github.com/kubernetes/kubernetes/pull/46744
**Which issue this PR fixes**
**Special notes for your reviewer**:
**Release note**:
Automatic merge from submit-queue (batch tested with PRs 47922, 47195, 47241, 47095, 47401)
Run cAdvisor on the same interface as kubelet
**What this PR does / why we need it**:
cAdvisor currently binds to all interfaces. Currently the only
solution is to use iptables to block access to the port. We
are better off making cAdvisor to bind to the interface that
kubelet uses for better security.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#11710
**Special notes for your reviewer**:
**Release note**:
```release-note
cAdvisor binds only to the interface that kubelet is running on instead of all interfaces.
```
Right now the kubelet creates a new csr object with the same key every
time it restarts during the bootstrap process. It should resume with the
old csr object if it exists. To do this the name of the csr object must
be stable. Also using a list watch here eliminates a race condition
where a watch event is missed and the kubelet stalls.
Automatic merge from submit-queue (batch tested with PRs 47851, 47824, 47858, 46099)
Revert 44714 manually
#44714 broke backward compatibility for old swagger spec that kubectl still uses. The decision on #47448 was to revert this change but the change was not automatically revertible. Here I semi-manually remove all references to UnixUserID and UnixGroupID and updated generated files accordingly.
Please wait for tests to pass then review that as there may still be tests that are failing.
Fixes#47448
Adding release note just because the original PR has a release note. If possible, we should remove both release notes as they cancel each other.
**Release note**: (removed by caesarxuchao)
UnixUserID and UnixGroupID is reverted back as int64 to keep backward compatibility.
Automatic merge from submit-queue (batch tested with PRs 34515, 47236, 46694, 47819, 47792)
Adding alpha feature gate to node statuses from local storage capacity isolation.
**What this PR does / why we need it**: The Capacity.storage node attribute should not be exposed since it's part of an alpha feature. Added an feature gate.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47809
There should be a test for new statuses in the alpha feature. Will include in a different PR.
To ensure kubelet doesn't attempt network teardown on HostNetwork
containers that no longer exist but are still checkpointed, make
sure we preserve the HostNetwork property in checkpoints. If
the checkpoint indicates the container was a HostNetwork one,
don't tear down the network since that would fail anyway.
Related: https://github.com/kubernetes/kubernetes/issues/44307#issuecomment-299548609
- Wrapping all node statuses from local storage capacity isolation under an alpha feature check. Currently there should not be any storage statuses.
- Replaced all "storage" statuses with "storage.kubernetes.io/scratch". "storage" should never be exposed as a status.
Automatic merge from submit-queue
Don't rerun certificate manager tests 1000 times.
**What this PR does / why we need it**:
Running every testcase 1000 times needlessly bloats the logs.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47669, 40284, 47356, 47458, 47701)
add unit test cases for kubelet.util.sliceutils
What this PR does / why we need it:
I have not found any unit test case for this file, so i do it, thank you!
Fixes#47001
Automatic merge from submit-queue (batch tested with PRs 47626, 47674, 47683, 47290, 47688)
validate host paths on the kubelet for backsteps
**What this PR does / why we need it**:
This PR adds validation on the kubelet to ensure the host path does not contain backsteps that could allow the volume to escape the PSP's allowed host paths. Currently, there is validation done at in API server; however, that does not account for mismatch of OS's on the kubelet vs api server.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47107
**Special notes for your reviewer**:
cc @liggitt
**Release note**:
```release-note
Paths containing backsteps (for example, "../bar") are no longer allowed in hostPath volume paths, or in volumeMount subpaths
```
Automatic merge from submit-queue (batch tested with PRs 47523, 47438, 47550, 47450, 47612)
append KUBE-HOSTPORTS to system chains instead of prepend
Bug fix for conflicting iptables rules between hostport and kube-proxy
Automatic merge from submit-queue
Strip container id from events
**What this PR does / why we need it**:
reduces spam events from kubelet in bad pod scenarios
**Which issue this PR fixes**:
relates to https://github.com/kubernetes/kubernetes/issues/47366
**Special notes for your reviewer**:
pods in permanent failure states created unique events
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 46441, 43987, 46921, 46823, 47276)
kubelet/network: report but tolerate errors returned from GetNetNS() v2
Runtimes should never return "" and nil errors, since network plugin
drivers need to treat netns differently in different cases. So return
errors when we can't get the netns, and fix up the plugins to do the
right thing.
Namely, we don't need a NetNS on pod network teardown. We do need
a netns for pod Status checks and for network setup.
V2: don't return errors from getIP(), since they will block pod status :( Just log them. But even so, this still fixes the original problem by ensuring we don't log errors when the network isn't ready.
@freehan @yujuhong
Fixes: https://github.com/kubernetes/kubernetes/issues/42735
Fixes: https://github.com/kubernetes/kubernetes/issues/44307
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
fix sync loop health check
This PR will do error logging about the fall behind sync for kubelet instead of sync loop healthz checking.
The reason is kubelet can not do sync loop and therefore can not update sync loop time when there is any runtime error, such as docker hung.
When there is any runtime error, according to current implementation, kubelet will not do sync operation and thus kubelet's sync loop time will not be updated. This will make when there is any runtime error, kubelet will also return non 200 response status code when accessing healthz endpoint. This is contrary with #37865 which prevents kubelet from being killed when docker hangs.
**Release note**:
```release-note
fix sync loop health check with seperating runtime errors
```
/cc @yujuhong @Random-Liu @dchen1107
GenericPLEG's 1s relist() loop races against pod network setup. It
may be called after the infra container has started but before
network setup is done, since PLEG and the runtime's SyncPod() run
in different goroutines.
Track network setup status and don't bother trying to read the pod's
IP address if networking is not yet ready.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1434950
Mar 22 12:18:17 ip-172-31-43-89 atomic-openshift-node: E0322
12:18:17.651013 25624 docker_manager.go:378] NetworkPlugin
cni failed on the status hook for pod 'pausepods22' - Unexpected
command output Device "eth0" does not exist.
Runtimes should never return "" and nil errors, since network plugin
drivers need to treat netns differently in different cases. So return
errors when we can't get the netns, and fix up the plugins to do the
right thing.
Namely, we don't need a NetNS on pod network teardown. We do need
a netns for pod Status checks and for network setup.
This reverts commit fee4c9a7d9.
This is not the correct fix for the problem; and it causes other problems
like continuous:
docker_sandbox.go:234] NetworkPlugin cni failed on the status hook for pod
"someotherdc-1-deploy_default": Unexpected command output nsenter: cannot
open : No such file or directory with error: exit status 1
Because GetNetNS() is returning an empty network namespace. That is
not helpful nor should really be allowed; that's what the error return
from GetNetNS() is for.
cAdvisor currently binds to all interfaces. Currently the only
solution is to use iptables to block access to the port. We
are better off making cAdvisor to bind to the interface that
kubelet uses for better security.
Fixes#11710
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
gpusInUse info error when kubelet restarts
**What this PR does / why we need it**:
In my test, I found 2 errors in the nvidia_gpu_manager.go.
1. the number of activePods in gpusInUse() equals to 0 when kubelet restarts. It seems the Start() method was called before pods recovery which caused this error. So I decide not to call gpusInUse() in the Start() function, just let it happen when new pod needs to be created.
2. the container.ContainerID in line 242 returns the id in format of "docker://<container_id>", this will make the client failed to inspect the container by id. We have to erase the prefix of "docker://".
**Special notes for your reviewer**:
**Release note**:
```
Avoid assigning the same GPU to multiple containers.
```
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
func parseEndpointWithFallbackProtocol should check if protocol of endpoint is empty
**What this PR does / why we need it**:
func parseEndpointWithFallbackProtocol should check if protocol of endpoint is empty
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #45927
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
Consolidate sysctl commands for kubelet
**What this PR does / why we need it**:
These commands are important enough to be in the Kubelet itself.
By default, Ubuntu 14.04 and Debian Jessie have these set to 200 and
20000. Without this setting, nodes are limited in the number of
containers that they can start.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#26005
**Special notes for your reviewer**:
I had a difficult time writing tests for this. It is trivial to create a fake sysctl for testing, but the Kubelet does not have any tests for the prior settings.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46775, 47009)
kuberuntime: check the value of RunAsNonRoot when verifying
The verification function is fixed to check the value of RunAsNonRoot,
not just the existence of it. Also adds unit tests to verify the correct
behavior.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46996
**Release note**:
```release-note
Fix the bug where container cannot run as root when SecurityContext.RunAsNonRoot is false.
```
The verification function is fixed to check the value of RunAsNonRoot,
not just the existence of it. Also adds unit tests to verify the correct
behavior.
This PR adds two features:
1. add support for isolating the emptyDir volume use. If user
sets a size limit for emptyDir volume, kubelet's eviction manager
monitors its usage
and evict the pod if the usage exceeds the limit.
2. add support for isolating the local storage for container overlay. If
the container's overly usage exceeds the limit defined in container
spec, eviction manager will evict the pod.
Automatic merge from submit-queue (batch tested with PRs 46734, 46810, 46759, 46259, 46771)
Improve code coverage for pkg/kubelet/images/image_gc_manager
**What this PR does / why we need it**:
#39559#40780
code coverage from 74.5% to 77.4%
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Delete all dead containers and sandboxes when under disk pressure.
This PR modifies the eviction manager to add dead container and sandbox garbage collection as a resource reclaim function for disk. It also modifies the container GC logic to allow pods that are terminated, but not deleted to be removed.
It still does not delete containers that are less than the minGcAge. This should prevent nodes from entering a permanently bad state if the entire disk is occupied by pods that are terminated (in the state failed, or succeeded), but not deleted.
There are two improvements we should consider making in the future:
- Track the disk space and inodes reclaimed by deleting containers. We currently do not track this, and it prevents us from determining if deleting containers resolves disk pressure. So we may still evict a pod even if we are able to free disk space by deleting dead containers.
- Once we can track disk space and inodes reclaimed, we should consider only deleting the containers we need to in order to relieve disk pressure. This should help avoid a scenario where we try and delete a massive number of containers all at once, and overwhelm the runtime.
/assign @vishh
cc @derekwaynecarr
```release-note
Disk Pressure triggers the deletion of terminated containers on the node.
```
Automatic merge from submit-queue
reset resultRun on pod restart
xref https://bugzilla.redhat.com/show_bug.cgi?id=1455056
There is currently an issue where, if the pod is restarted due to liveness probe failures exceeding failureThreshold, the failure count is not reset on the probe worker. When the pod restarts, if the liveness probe fails even once, the pod is restarted again, not honoring failureThreshold on the restart.
```yaml
apiVersion: v1
kind: Pod
metadata:
name: busybox
spec:
containers:
- name: busybox
image: busybox
command:
- sleep
- "3600"
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 3
timeoutSeconds: 1
periodSeconds: 3
successThreshold: 1
failureThreshold: 5
terminationGracePeriodSeconds: 0
```
Before this PR:
```
$ kubectl create -f busybox-probe-fail.yaml
pod "busybox" created
$ kubectl get pod -w
NAME READY STATUS RESTARTS AGE
busybox 1/1 Running 0 4s
busybox 1/1 Running 1 24s
busybox 1/1 Running 2 33s
busybox 0/1 CrashLoopBackOff 2 39s
```
After this PR:
```
$ kubectl create -f busybox-probe-fail.yaml
$ kubectl get pod -w
NAME READY STATUS RESTARTS AGE
busybox 0/1 ContainerCreating 0 2s
busybox 1/1 Running 0 4s
busybox 1/1 Running 1 27s
busybox 1/1 Running 2 45s
```
```release-note
Fix kubelet reset liveness probe failure count across pod restart boundaries
```
Restarts are now happen at even intervals.
@derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 46782, 46719, 46339, 46609, 46494)
Do not log the content of pod manifest if parsing fails.
**What this PR does / why we need it**:
- ~~only accepts text/plain config file~~
- ~~not log config file content when it's invalid~~
Do not log the content of pod manifest if parsing fails.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46493
**Special notes for your reviewer**:
/cc @yujuhong
@sig-node-reviewers
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46782, 46719, 46339, 46609, 46494)
Fix inconsistency in finding cni binaries
Fixes [#46476]
Signed-off-by: Abhinav Dahiya <abhinav.dahiya@coreos.com>
**What this PR does / why we need it**:
This fixes the inconsistency in finding the appropriate cni binaries.
Currently `lo` cniNetwork follows vendorCniDir > binDir whereas default for all others is binDir > vendorCniDir. This PR makes vendorCniDir > binDir as default behavior.
**Why we need it**:
Hypercube right now ships cni binaries in /opt/cni/bin.
And to use latest version of calico you need to override kubelet's /opt/cni/bin from host which means all other cni plugins (flannel, loopback etc...) have to be mounted from host too. Keeping vendordir at higher order allows easy installation of newer versions of plugins.
Automatic merge from submit-queue (batch tested with PRs 46620, 46732, 46773, 46772, 46725)
Improving test coverage for kubelet/kuberuntime.
**What this PR does / why we need it**:
Increases test coverage for kubelet/kuberuntime
https://github.com/kubernetes/kubernetes/issues/46123
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/46123
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add local storage (scratch space) allocatable support
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.
This feature is part of local storage capacity isolation and described in the proposal https://github.com/kubernetes/community/pull/306
**Release note**:
```release-note
This feature exposes local storage capacity for the primary partitions, and supports & enforces storage reservation in Node Allocatable
```
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
move labels to components which own the APIs
During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery. They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet. This pull moves the labels into their owning components and out of API machinery.
@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers
@derekwaynecarr since most of these are related to the kubelet
Automatic merge from submit-queue (batch tested with PRs 46726, 41912, 46695, 46034, 46551)
Rotate kubelet client certificate.
Changes the kubelet so it bootstraps off the cert/key specified in the
config file and uses those to request new cert/key pairs from the
Certificate Signing Request API, as well as rotating client certificates
when they approach expiration.
Default behavior is for client certificate rotation to be disabled. If enabled
using a command line flag, the kubelet exits each time the certificate is
rotated. I tried to use `GetCertificate` in [tls.Config](https://golang.org/pkg/crypto/tls/#Config) but it is only called
on the server side of connections. Then I tried `GetClientCertificate`,
but it is new in 1.8.
**Release note**
```release-note
With --feature-gates=RotateKubeletClientCertificate=true set, the kubelet will
request a client certificate from the API server during the boot cycle and pause
waiting for the request to be satisfied. It will continually refresh the certificate
as the certificates expiration approaches.
```
Automatic merge from submit-queue
Improved code coverage for pkg/kubelet/util.
The test coverage for pkg/kubelet/util.go increased from 45.1%
to 84.3%.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46432, 46701, 46326, 40848, 46396)
Fix selfLinks of pods started from manifests
**What this PR does / why we need it**:
When running `curl http://localhost:10255/pods` the selfLink for pods started from manifests were incorrect. This PR fixes it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46357
**Special notes for your reviewer**:
@number101010
**Release note**:
```release-note
NONE
```
This PR adds the check for local storage request when admitting pods. If
the local storage request exceeds the available resource, pod will be
rejected.
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.
Automatic merge from submit-queue
fix comment error in function newVolumeMounterFromPlugins
**What this PR does / why we need it**:
Fix the comment error in function newVolumeMounterFromPlugins, which may cause confusion.
Automatic merge from submit-queue
resolv.conf nameserver line has only one entry, ignore trailing garbage
**What this PR does / why we need it**:
Per the resolv.conf man page "name servers may be listed, one per keyword." Some tools such as udhcpc take advantage of this to append comments to nameserver entries. For example: `nameserver 8.8.8.8 # eth0`. This updates the resolv.conf parser to ignore trailing garbage on nameserver lines.
**Release note**:
NONE
Changes the kubelet so it bootstraps off the cert/key specified in the
config file and uses those to request new cert/key pairs from the
Certificate Signing Request API, as well as rotating client certificates
when they approach expiration.