Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
The syscall package is locked down and the comment in [1] advises to
switch code to use the corresponding package from golang.org/x/sys. Do
so and replace usage of package syscall with package
golang.org/x/sys/unix where applicable.
[1] https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24
This will also allow to get updates and fixes for syscall wrappers
without having to use a new go version.
Errno, Signal and SysProcAttr aren't changed as they haven't been
implemented in /x/sys/. Stat_t from syscall is used if standard library
packages (e.g. os) require it. syscall.SIGTERM is used for
cross-platform files.
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
Make sure the previous symlink file is deleted before trying to create a new one
**What this PR does / why we need it**:
It deletes possibly existing symlinks to container log files.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes#49105
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 48981, 47316, 49180)
Add seccomp profile in sandbox security context
**What this PR does / why we need it**:
PR #46332 adds seccomp profile to container security context, but not sandbox. This PR adds seccomp profile in sandbox security context. Without this, we couldn't honour "seccomp.security.alpha.kubernetes.io/pod" for sandbox.
**Which issue this PR fixes**
fixes#49179.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/cc @yujuhong
Automatic merge from submit-queue (batch tested with PRs 48981, 47316, 49180)
Added golint check for pkg/kubelet.
**What this PR does / why we need it**:
Added golint check for pkg/kubelet, and make golint happy.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47315
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
Restore cAdvisor prometheus metrics to the main port
But under a new path - `/metrics/cadvisor`. This ensures a secure port still exists for metrics while getting the benefit of separating out container metrics from the kubelet's metrics as recommended in the linked issue.
Fixes#48483
```release-note-action-required
Restored cAdvisor prometheus metrics to the main port -- a regression that existed in v1.7.0-v1.7.2
cAdvisor metrics can now be scraped from `/metrics/cadvisor` on the kubelet ports.
Note that you have to update your scraping jobs to get kubelet-only metrics from `/metrics` and `container_*` metrics from `/metrics/cadvisor`
```
Automatic merge from submit-queue (batch tested with PRs 49116, 49095)
Move pkg/api/v1/ref -> client-go/tools/reference
`pkg/api/v1/ref` is the only remaining package copied from pkg/api/v1 to client-go via staging/copy.sh.
Automatic merge from submit-queue (batch tested with PRs 48702, 48965, 48740, 48974, 48232)
Cleanup the conversion of ObjectReference
**What this PR does / why we need it**:
No need to convert ObjectReference as `k8s.io/kubernetes/pkg/api/v1` and `k8s.io/client-go/pkg/api/v1` has been consistent in `k8s.io/api/core/v1`.
**Which issue this PR fixes**: fixes#48747
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48481, 48256)
Refactor: pkg/util into sub-pkgs
**What this PR does / why we need it**:
- move code in pkg/util into sub-pkgs
- delete some unused funcs
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#15634
**Special notes for your reviewer**:
This is the final work of #15634. It will close that issue.
/cc @thockin
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46094, 48544, 48807, 49102, 44174)
add InstanceID to fake cadvisor (used in Kubemark)
This change is for setting Node.Spec.ProviderID field in Kubemark hollow nodes. It shouldn't affect other tests using cadvisor.Fake as field is nil by default.
cc @gmarek
Automatic merge from submit-queue (batch tested with PRs 46094, 48544, 48807, 49102, 44174)
Static deepcopy – phase 1
This PR is the follow-up of https://github.com/kubernetes/kubernetes/pull/36412, replacing the
dynamic reflection based deepcopy with static DeepCopy+DeepCopyInto methods on API types.
This PR **does not yet** include the code dropping the cloner from the scheme and all the
porting of the calls to scheme.Copy. This will be part of a follow-up "Phase 2" PR.
A couple of the commits will go in first:
- [x] audit: fix deepcopy registration https://github.com/kubernetes/kubernetes/pull/48599
- [x] apimachinery+apiserver: separate test types in their own packages #48601
- [x] client-go: remove TPR example #48604
- [x] apimachinery: remove unneeded GetObjectKind() impls #48608
- [x] sanity check against origin, that OpenShift's types are fine for static deepcopy https://github.com/deads2k/origin/pull/34
TODO **after** review here:
- [x] merge https://github.com/kubernetes/gengo/pull/32 and update vendoring commit
But under a new path - `/metrics/cadvisor`. This ensures a secure port
still exists for metrics while getting the benefit of separating out
container metrics from the kubelet's metrics.
Automatic merge from submit-queue (batch tested with PRs 48231, 47377, 48797, 49020, 49033)
Migrate kubelet and linked code from clientset_generated to client-go
Ran a script in the kubernetes repo to migrate kubelet and linked code from clientset_generated package imports to client-go imports.
**NOTE:** There are also some manual changes that were made in order to accommodate some
differences between clientset_generated and client-go. The manual changes are isolated into a
single commit titled "Manual changes."
```sh
#! /bin/bash
for file in $(find . \( -name "clientset_generated" -o -name "informers_generated" \) -prune -o -type f -name "*.go"); do
if [[ -d $file ]]; then
continue
fi
if [[ $file == "./cmd/libs/go2idl/informer-gen/main.go" ]]; then
continue
fi
sed -i '
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset"|clientset "k8s.io/client-go/kubernetes"|;
# Correct a couple of unique cases.
s|clientset clientset "k8s.io/client-go/kubernetes"|clientset "k8s.io/client-go/kubernetes"|;
s|cs clientset "k8s.io/client-go/kubernetes"|clientset "k8s.io/client-go/kubernetes"|;
s|VersionedClientSetPackage: clientset "k8s.io/client-go/kubernetes"|VersionedClientSetPackage: "k8s.io/client-go/kubernetes"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/certificates/v1beta1"|"k8s.io/client-go/kubernetes/typed/certificates/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/core/v1"|"k8s.io/client-go/kubernetes/typed/core/v1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/extensions/v1beta1"|"k8s.io/client-go/kubernetes/typed/extensions/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/autoscaling/v1"|"k8s.io/client-go/kubernetes/typed/autoscaling/v1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/authentication/v1"|"k8s.io/client-go/kubernetes/typed/authentication/v1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/authorization/v1beta1"|"k8s.io/client-go/kubernetes/typed/authorization/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/apps/v1beta1"|"k8s.io/client-go/kubernetes/typed/apps/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/rbac/v1beta1"|"k8s.io/client-go/kubernetes/typed/rbac/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/fake"|"k8s.io/client-go/kubernetes/fake"|;
s|"k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/core/v1/fake"|"k8s.io/client-go/kubernetes/typed/core/v1/fake"|;
s|k8s.io/kubernetes/pkg/client/clientset_generated/clientset|k8s.io/client-go/kubernetes|;
s|informers "k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions"|"k8s.io/client-go/informers"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/core/v1"|"k8s.io/client-go/informers/core/v1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/apps/v1beta1"|"k8s.io/client-go/informers/apps/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/extensions/v1beta1"|"k8s.io/client-go/informers/extensions/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/batch/v1"|"k8s.io/client-go/informers/batch/v1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/autoscaling/v1"|"k8s.io/client-go/informers/autoscaling/v1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/policy/v1beta1"|"k8s.io/client-go/informers/policy/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/certificates/v1beta1"|"k8s.io/client-go/informers/certificates/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/storage/v1"|"k8s.io/client-go/informers/storage/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/core/v1"|"k8s.io/client-go/listers/core/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/apps/v1beta1"|"k8s.io/client-go/listers/apps/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/listers/extensions/v1beta1"|"k8s.io/client-go/listers/extensions/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/listers/autoscaling/v1"|"k8s.io/client-go/listers/autoscaling/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/batch/v1"|"k8s.io/client-go/listers/batch/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/certificates/v1beta1"|"k8s.io/client-go/listers/certificates/v1beta1"|;
s|"k8s.io/kubernetes/pkg/client/listers/storage/v1"|"k8s.io/client-go/listers/storage/v1"|;
s|"k8s.io/kubernetes/pkg/client/listers/policy/v1beta1"|"k8s.io/client-go/listers/policy/v1beta1"|;
' $file
done
hack/update-bazel.sh
hack/update-gofmt.sh
```
Automatic merge from submit-queue (batch tested with PRs 49017, 45440, 48384, 45894, 48808)
Fix typo in ExecCommandParam
**What this PR does / why we need it**: Makes ExecCommandParam look like all of the other "Param"s
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48997, 48595, 48898, 48711, 48972)
Revert "Use go-ansiterm version matching docker/pkg/term/windows v1.11"
This reverts commit 72044a11a1.
**What this PR does / why we need it**: earlier this week, #47140 updated the vendored azure dependencies, which broke the windows build because the docker dependencies were too old. #48933 was merged, which reverted part of #47140 and fixed the build, but then #48308, which updated the vendored docker dependencies, broke the windows build again.
By reverting #48933, we should get back to a working build, I hope.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48887
**Release note**:
```release-note
NONE
```
/release-note-none
/test pull-kubernetes-cross
/assign @brendandburns
cc @karataliu @yguo0905 @yujuhong @dchen1107
Automatic merge from submit-queue
Fix comments and typo in the error message
**What this PR does / why we need it**:
This PR fixes outdated comments and typo in the error message.
**Release note**:
```release-note
NONE
```
CC @simo5
Automatic merge from submit-queue
Revert workaround in PR 46246 as APIs have been consistent
**What this PR does / why we need it**:
No need to convert v1.ObjectReference as APIs have been consistent in `k8s.io/api/core/v1`.
**Which issue this PR fixes** : fixes#48668
**Special notes for your reviewer**:
/assign @derekwaynecarr @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add OWNERS file to kubelet gpu package
GPU support is ramping up and we do not have a lot of reviewers that
are familiar with the codebase. I added myself as a reviewer and
copied a few people from the kubelet OWNERS file as approvers.
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
**Release note**:
```
NONE
```
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
Automatic merge from submit-queue
Fix subPath existence check to not follow symlink
**What this PR does / why we need it**:
Volume mounting logic introduced in #43775 and #45623 checks
for subPath existence before attempting to create a directory,
should subPath not be present.
This breaks if subPath is a dangling symlink, os.Stat returns
"do not exist" status, yet `os.MkdirAll` can't create directory
as symlink is present at the given path.
This patch makes existence check to use os.Lstat which works for
normal files/directories as well as doesn't not attempt to follow
symlink, therefore it's "do not exist" status is more reliable when
making a decision whether to create directory or not.
subPath symlinks can be dangling in situations where kubelet is
running in a container itself with access to docker socket, such
as CoreOS's kubelet-wrapper script
**Release note**:
```release-note
Fix pods failing to start when subPath is a dangling symlink from kubelet point of view, which can happen if it is running inside a container
```
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Prepare to introduce websockets for exec and portforward
Refactor the code in remotecommand to better represent the structure of
what is common between portforward and exec.
Ref #48633
Automatic merge from submit-queue (batch tested with PRs 48425, 41680, 48457, 48619, 48635)
Improved code coverage for pkg/kubelet/types/pod_update
The test coverage for pod_update.go was imprved from 36% to 100%.
**What this PR does / why we need it**:
This fixed part of #40780
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 48405, 48742, 48748, 48571, 48482)
dockershim: clean up unused security context code
Most of the code in the `dockershim/securitycontext` package are
unused and can be removed. This PR migrates the rest of the code,
cleans it up (e.g., remove references to kubernetes API objects),
and removes the package entirely.
Automatic merge from submit-queue (batch tested with PRs 48698, 48712, 48516, 48734, 48735)
Name change: s/timstclair/tallclair/
I changed my name, and I'm migrating my user name to be consistent.
Automatic merge from submit-queue (batch tested with PRs 48698, 48712, 48516, 48734, 48735)
share iptables util client within kubenet
reduce the number of goroutine waiting for dbus.
Automatic merge from submit-queue (batch tested with PRs 47232, 48625, 48613, 48567, 39173)
Fix issue when setting fileysystem capacity in container manager
In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
This PR fixes issue #48452
Automatic merge from submit-queue (batch tested with PRs 48402, 47203, 47460, 48335, 48322)
Added case on 'terminated-but-not-yet-deleted' for Admit.
**What this PR does / why we need it**:
Added case on 'terminated-but-not-yet-deleted' for Admit.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47867
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
Local storage teardown fix
**What this PR does / why we need it**: Local storage uses bindmounts and the method IsLikelyNotMountPoint does not detect these as mountpoints. Therefore, local PVs are not properly unmounted when they are deleted.
**Which issue this PR fixes**: fixes#48331
**Special notes for your reviewer**:
You can use these e2e tests to reproduce the issue and validate the fix works appropriately https://github.com/kubernetes/kubernetes/pull/47999
The existing method IsLikelyNotMountPoint purposely does not check mountpoints reliability (4c5b22d4c6/pkg/util/mount/mount_linux.go (L161)), since the number of mountpoints can be large. 4c5b22d4c6/pkg/util/mount/mount.go (L46)
This implementation changes the behavior for local storage to detect mountpoints reliably, and avoids changing the behavior for any other callers to a UnmountPath.
**Release note**:
```
Fixes bind-mount teardown failure with non-mount point Local volumes (issue https://github.com/kubernetes/kubernetes/issues/48331).
```
Automatic merge from submit-queue (batch tested with PRs 44412, 44810, 47130, 46017, 47829)
recheck pod volumes before marking pod as processed
This PR allows a pod's volumes to be re-checked until all are added correctly. There's a limited amount of time when a persistent volume claim is still in the Pending phase, and if a pod is created in that time, the volume will not be added. The issue is not uncommon with helm charts that create all objects in close succession, particularly when using aws-ebs volumes.
fixes#28962
Added IsNotMountPoint method to mount utils (pkg/util/mount/mount.go)
Added UnmountMountPoint method to volume utils (pkg/volume/util/util.go)
Call UnmountMountPoint method from local storage (pkg/volume/local/local.go)
IsLikelyNotMountPoint behavior was not modified, so the logic/behavior for UnmountPath is not modified
In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
Volume mounting logic introduced in #43775 and #45623 checks
for subPath existence before attempting to create a directory,
should subPath not be present.
This breaks if subPath is a dangling symlink, os.Stat returns
"do not exist" status, yet `os.MkdirAll` can't create directory
as symlink is present at the given path.
This patch makes existence check to use os.Lstat which works for
normal files/directories as well as doesn't not attempt to follow
symlink, therefore it's "do not exist" status is more reliable when
making a decision whether to create directory or not.
subPath symlinks can be dangling in situations where kubelet is
running in a container itself with access to docker socket, such
as CoreOS's kubelet-wrapper script
Automatic merge from submit-queue (batch tested with PRs 48518, 48525, 48269)
Move the kubelet certificate management code into a single package
Code is very similar and belongs together. Will allow future cert callers to potentially make this more generic, as well as to make it easier reuse code elsewhere.
Automatic merge from submit-queue (batch tested with PRs 47327, 48194)
Checked container spec when killing container.
**What this PR does / why we need it**:
Checked container spec when getting container, return error if failed.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48173
**Release note**:
```release-note-none
```
Automatic merge from submit-queue (batch tested with PRs 47700, 48464, 48502)
Provide a way to setup the limit NO files for rkt Pods
**What this PR does / why we need it**:
This PR allows to customize the Systemd unit files for rkt pods.
We start with the `systemd-unit-option.rkt.kubernetes.io/LimitNOFILE` to allows to run workloads like etcd, ES in kubernetes with rkt.
**Special notes for your reviewer**:
Once again, I followed @yifan-gu guidelines.
I made a basic check over the values given inside the `systemd-unit-option.rkt.kubernetes.io/LimitNOFILE` (integer and > 0).
If this check fails: I simply ignore the field.
The other implementation would be to fail the whole SetUpPod.
We discussed using a key like `rkt.kubernetes.io/systemd-unit-option/LimitNOFILE` but the validation only allows a single `/` in this field:
```The Deployment "tiller" is invalid: spec.template.annotations: Invalid value: "rkt.kubernetes.io/systemd-unit-option/LimitNOFILE": a qualified name must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName', or 'my.name', or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]') with an optional DNS subdomain prefix and '/' (e.g. 'example.com/MyName')```
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45467, 48091, 48033, 48498)
Allow Kubenet with ipv6
When running kubenet with IPv6, there is a panic as there
is IPv4 specific code the Event function.
With this change, Event will support IPv4 and IPv6
**What this PR does / why we need it**:
This PR allows kubenet to use IPv6. Currently there is a panic in kubenet_linux.go
as there is IPv4 specific code.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48089
**Special notes for your reviewer**:
**Release note**:
```release-note-NONE
```
Automatic merge from submit-queue (batch tested with PRs 48317, 48313, 48351, 48357, 48115)
Cleanup lint errors in the pkg/kubelet/server/... directory
Cleanup some issues that were found while experimenting with https://github.com/alecthomas/gometalinter on the `pkg/kubelet/server/...` directories.
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
move term to kubectl/util
move term from pkg/util/term to pkg/kubectl/util/term
remove dependency of `k8s.io/kubernetes/pkg/util/term` for `pkg/kubelet/dockershim/exec.go` and `pkg/kubelet/dockershim/exec.go`
Ref: https://github.com/kubernetes/kubernetes/issues/48209
```release-note
NONE
```
/assign @apelisse @monopole
cc: @pwittrock
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
Add unit test coverage for nvidiaGPUManager initialization
Part of #47750
```release-note
NONE
```
Since v1.5 and the removal of --configure-cbr0:
0800df74ab "Remove the legacy networking mode --configure-cbr0"
kubelet hasn't done any shaping operations internally. They
have all been delegated to network plugins like kubenet or
external CNI plugins. But some shaping code was still left
in kubelet, so remove it now that it's unused.
Automatic merge from submit-queue (batch tested with PRs 47850, 47835, 46197, 47250, 48284)
dockershim: checkpoint HostNetwork property
To ensure kubelet doesn't attempt network teardown on HostNetwork
containers that no longer exist but are still checkpointed, make
sure we preserve the HostNetwork property in checkpoints. If
the checkpoint indicates the container was a HostNetwork one,
don't tear down the network since that would fail anyway.
Related: https://github.com/kubernetes/kubernetes/issues/44307#issuecomment-299548609
@freehan @kubernetes/sig-network-misc
Automatic merge from submit-queue
Add type conversion judgment
If do not type conversion judgment, there may be panic.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove repeat type conversions
Here is the type of conversion for the variable is repeated.
**Release note**:
```release-note
NONE
```
GPU support is ramping up and we do not have a lot of reviewers that
are familiar with the codebase. I added myself as a reviewer and
copied a few people from the kubelet OWNERS file as approvers.
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
Automatic merge from submit-queue (batch tested with PRs 48123, 48079)
[Kubelet] Fix race condition in container manager
**What this PR does / why we need it**:
This fixes a race condition where the container manager capacity map was being updated without synchronization. It moves the storage capacity detection to kubelet initialization, which happens serially in one thread.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48045
**Release note**:
```release-note
Fixes kubelet race condition in container manager.
```
Centralize Capacity discovery of standard resources in Container manager.
Have storage derive node capacity from container manager.
Move certain cAdvisor interfaces to the cAdvisor package in the process.
This patch fixes a bug in container manager where it was writing to a map without synchronization.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
Automatic merge from submit-queue (batch tested with PRs 47484, 47904, 48034)
fix nits in kubelet server
Signed-off-by: allencloud <allen.sun@daocloud.io>
**What this PR does / why we need it**:
fix nits in kubelet server
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44058, 48085, 48077, 48076, 47823)
don't pass CRI error through to waiting state reason
Raw gRPC errors are getting into the `Reason` field of the container status `State`, causing it to be output inline on a `kubectl get pod`
xref https://bugzilla.redhat.com/show_bug.cgi?id=1449820
Basically the issue is that the err and msg are reversed in `startContainer()`. The msg is short and the err is long. It should be the other way around.
This PR changes `startContainer()` to return a short error that becomes the Reason and the extracted gPRC error description that becomes the Message.
@derekwaynecarr @smarterclayton @eparis
Automatic merge from submit-queue
kubelet should let cloud-controller-manager set the node addresses
*Before this change:*
1. cloud-controller-manager sets all the addresses for a node.
2. kubelet on that node replaces these addresses with an incomplete set. (i.e. replace InternalIP and Hostname and delete all other addresses--ExternalIP, etc.)
*After this change:*
kubelet doesn't touch its node's addresses when there is an external cloudprovider.
Fixes#47155
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47694, 47772, 47783, 47803, 47673)
Make different container runtimes constant
Make different container runtimes constant to avoid hardcode
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47915, 47856, 44086, 47575, 47475)
kubelet should resume csr bootstrap
Right now the kubelet creates a new csr object with the same key every
time it restarts during the bootstrap process. It should resume with the
old csr object if it exists. To do this the name of the csr object must
be stable.
Issue https://github.com/kubernetes/kubernetes/issues/47855
Automatic merge from submit-queue (batch tested with PRs 47227, 47119, 46280, 47414, 46696)
Move seccomp helper methods and tests to platform-specific files.
**What this PR does / why we need it**:
Seccomp helper methods are for linux only, move them to linux-specific helper file.
As discussed in https://github.com/kubernetes/kubernetes/pull/46744
**Which issue this PR fixes**
**Special notes for your reviewer**:
**Release note**:
Automatic merge from submit-queue (batch tested with PRs 47922, 47195, 47241, 47095, 47401)
Run cAdvisor on the same interface as kubelet
**What this PR does / why we need it**:
cAdvisor currently binds to all interfaces. Currently the only
solution is to use iptables to block access to the port. We
are better off making cAdvisor to bind to the interface that
kubelet uses for better security.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#11710
**Special notes for your reviewer**:
**Release note**:
```release-note
cAdvisor binds only to the interface that kubelet is running on instead of all interfaces.
```
Right now the kubelet creates a new csr object with the same key every
time it restarts during the bootstrap process. It should resume with the
old csr object if it exists. To do this the name of the csr object must
be stable. Also using a list watch here eliminates a race condition
where a watch event is missed and the kubelet stalls.
Automatic merge from submit-queue (batch tested with PRs 47851, 47824, 47858, 46099)
Revert 44714 manually
#44714 broke backward compatibility for old swagger spec that kubectl still uses. The decision on #47448 was to revert this change but the change was not automatically revertible. Here I semi-manually remove all references to UnixUserID and UnixGroupID and updated generated files accordingly.
Please wait for tests to pass then review that as there may still be tests that are failing.
Fixes#47448
Adding release note just because the original PR has a release note. If possible, we should remove both release notes as they cancel each other.
**Release note**: (removed by caesarxuchao)
UnixUserID and UnixGroupID is reverted back as int64 to keep backward compatibility.
Automatic merge from submit-queue (batch tested with PRs 34515, 47236, 46694, 47819, 47792)
Adding alpha feature gate to node statuses from local storage capacity isolation.
**What this PR does / why we need it**: The Capacity.storage node attribute should not be exposed since it's part of an alpha feature. Added an feature gate.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47809
There should be a test for new statuses in the alpha feature. Will include in a different PR.
To ensure kubelet doesn't attempt network teardown on HostNetwork
containers that no longer exist but are still checkpointed, make
sure we preserve the HostNetwork property in checkpoints. If
the checkpoint indicates the container was a HostNetwork one,
don't tear down the network since that would fail anyway.
Related: https://github.com/kubernetes/kubernetes/issues/44307#issuecomment-299548609
- Wrapping all node statuses from local storage capacity isolation under an alpha feature check. Currently there should not be any storage statuses.
- Replaced all "storage" statuses with "storage.kubernetes.io/scratch". "storage" should never be exposed as a status.
Automatic merge from submit-queue
Don't rerun certificate manager tests 1000 times.
**What this PR does / why we need it**:
Running every testcase 1000 times needlessly bloats the logs.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47669, 40284, 47356, 47458, 47701)
add unit test cases for kubelet.util.sliceutils
What this PR does / why we need it:
I have not found any unit test case for this file, so i do it, thank you!
Fixes#47001
Automatic merge from submit-queue (batch tested with PRs 47626, 47674, 47683, 47290, 47688)
validate host paths on the kubelet for backsteps
**What this PR does / why we need it**:
This PR adds validation on the kubelet to ensure the host path does not contain backsteps that could allow the volume to escape the PSP's allowed host paths. Currently, there is validation done at in API server; however, that does not account for mismatch of OS's on the kubelet vs api server.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47107
**Special notes for your reviewer**:
cc @liggitt
**Release note**:
```release-note
Paths containing backsteps (for example, "../bar") are no longer allowed in hostPath volume paths, or in volumeMount subpaths
```
Automatic merge from submit-queue (batch tested with PRs 47523, 47438, 47550, 47450, 47612)
append KUBE-HOSTPORTS to system chains instead of prepend
Bug fix for conflicting iptables rules between hostport and kube-proxy
Automatic merge from submit-queue
Strip container id from events
**What this PR does / why we need it**:
reduces spam events from kubelet in bad pod scenarios
**Which issue this PR fixes**:
relates to https://github.com/kubernetes/kubernetes/issues/47366
**Special notes for your reviewer**:
pods in permanent failure states created unique events
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 46441, 43987, 46921, 46823, 47276)
kubelet/network: report but tolerate errors returned from GetNetNS() v2
Runtimes should never return "" and nil errors, since network plugin
drivers need to treat netns differently in different cases. So return
errors when we can't get the netns, and fix up the plugins to do the
right thing.
Namely, we don't need a NetNS on pod network teardown. We do need
a netns for pod Status checks and for network setup.
V2: don't return errors from getIP(), since they will block pod status :( Just log them. But even so, this still fixes the original problem by ensuring we don't log errors when the network isn't ready.
@freehan @yujuhong
Fixes: https://github.com/kubernetes/kubernetes/issues/42735
Fixes: https://github.com/kubernetes/kubernetes/issues/44307
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
fix sync loop health check
This PR will do error logging about the fall behind sync for kubelet instead of sync loop healthz checking.
The reason is kubelet can not do sync loop and therefore can not update sync loop time when there is any runtime error, such as docker hung.
When there is any runtime error, according to current implementation, kubelet will not do sync operation and thus kubelet's sync loop time will not be updated. This will make when there is any runtime error, kubelet will also return non 200 response status code when accessing healthz endpoint. This is contrary with #37865 which prevents kubelet from being killed when docker hangs.
**Release note**:
```release-note
fix sync loop health check with seperating runtime errors
```
/cc @yujuhong @Random-Liu @dchen1107
GenericPLEG's 1s relist() loop races against pod network setup. It
may be called after the infra container has started but before
network setup is done, since PLEG and the runtime's SyncPod() run
in different goroutines.
Track network setup status and don't bother trying to read the pod's
IP address if networking is not yet ready.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1434950
Mar 22 12:18:17 ip-172-31-43-89 atomic-openshift-node: E0322
12:18:17.651013 25624 docker_manager.go:378] NetworkPlugin
cni failed on the status hook for pod 'pausepods22' - Unexpected
command output Device "eth0" does not exist.