Introduce feature gate for expanding PVs
Add a field to SC
Add new Conditions and feature tag pvc update
Add tests for size update via feature gate
register the resize admission plugin
Update golint failures
Automatic merge from submit-queue
Remove deprecated init-container in annotations
fixes#50655fixes#51816closes#41004fixes#51816
Builds on #50654 and drops the initContainer annotations on conversion to prevent bypassing API server validation/security and targeting version-skewed kubelets that still honor the annotations
```release-note
The deprecated alpha and beta initContainer annotations are no longer supported. Init containers must be specified using the initContainers field in the pod spec.
```
Automatic merge from submit-queue (batch tested with PRs 51805, 51725, 50925, 51474, 51638)
Flexvolume dynamic plugin discovery: Prober unit tests and basic e2e test.
**What this PR does / why we need it**: Tests for changes introduced in PR #50031 .
As part of the prober unit test, I mocked filesystem, filesystem watch, and Flexvolume plugin initialization.
Moved the filesystem event goroutine to watcher implementation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51147
**Special notes for your reviewer**:
First commit contains added functionality of the mock filesystem.
Second commit is the refactor for moving mock filesystem into a common util directory.
Third commit is the unit and e2e tests.
**Release note**:
```release-note
NONE
```
/release-note-none
/sig storage
/assign @saad-ali @liggitt
/cc @mtaufen @chakri-nelluri @wongma7
Automatic merge from submit-queue (batch tested with PRs 51666, 49829, 51058, 51004, 50938)
Fix threshold notifier build tags
**What this PR does / why we need it**:
Cross building from darwin is currently broken on the following error:
```
# k8s.io/kubernetes/pkg/kubelet/eviction
pkg/kubelet/eviction/threshold_notifier_unsupported.go:25: NewMemCGThresholdNotifier redeclared in this block
previous declaration at pkg/kubelet/eviction/threshold_notifier_linux.go:38
```
It looks like #49300 broke the build tags introduced in #38630 and #37384. This fixes the build tag on `threshold_notifier_unsupported.go` as the cgo requirement was removed from `threshold_notifier_linux.go`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50935
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45724, 48051, 46444, 51056, 51605)
Mount propagation in kubelet
Together with #45724 it implements mount propagation as proposed in https://github.com/kubernetes/community/pull/659
There is:
- New alpha annotation that allows user to explicitly set propagation mode for each `VolumeMount` in pod containers (to be replaced with real `VolumeMount.Propagation` field during beta) + validation + tests. "Private" is the default one (= no change to existing pods).
I know about proposal for real API fields for alpha feature in https://docs.google.com/document/d/1wuoSqHkeT51mQQ7dIFhUKrdi3-1wbKrNWeIL4cKb9zU/edit, but it seems it's not implemented yet. It would save me quite lot of code and ugly annotation.
- Updated CRI API to transport chosen propagation to Docker.
- New `kubelet --experimental-mount-propagation` option to enable the previous bullet without modifying types.go (worked around with changing `KubeletDeps`... not nice, but it's better than adding a parameter to `NewMainKubelet` and removing it in the next release...)
```release-note
kubelet has alpha support for mount propagation. It is disabled by default and it is there for testing only. This feature may be redesigned or even removed in a future release.
```
@derekwaynecarr @dchen1107 @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue
Make /var/lib/kubelet as shared during startup
This is part of ~~https://github.com/kubernetes/community/pull/589~~https://github.com/kubernetes/community/pull/659
We'd like kubelet to be able to consume mounts from containers in the future, therefore kubelet should make sure that `/var/lib/kubelet` has shared mount propagation to be able to see these mounts.
On most distros, root directory is already mounted with shared mount propagation and this code will not do anything. On older distros such as Debian Wheezy, this code detects that `/var/lib/kubelet` is a directory on `/` which has private mount propagation and kubelet bind-mounts `/var/lib/kubelet` as rshared.
Both "regular" linux mounter and `NsenterMounter` are updated here.
@kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews
@vishh
Release note:
```release-note
Kubelet re-binds /var/lib/kubelet directory with rshared mount propagation during startup if it is not shared yet.
```
Automatic merge from submit-queue (batch tested with PRs 51590, 48217, 51209, 51575, 48627)
Skip system container cgroup stats if undefined
**What this PR does / why we need it**:
the kubelet /stats/summary endpoint tried to look up cgroup stats for containers that are not required. this polluted logs with messages about not finding stats for "" container. this pr skips cgroup stats if the cgroup name is not specified (they are optional anyway)
**Special notes for your reviewer**:
i think this was a regression from recent refactor.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51590, 48217, 51209, 51575, 48627)
Deviceplugin jiayingz
**What this PR does / why we need it**:
This PR implements the kubelet Device Plugin Manager.
It includes four commits implemented by @RenaudWasTaken and a commit that supports allocation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Design document: kubernetes/community#695
PR tracking: kubernetes/features#368
**Special notes for your reviewer**:
**Release note**:
Extending Kubelet to support device plugin
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 50381, 51307, 49645, 50995, 51523)
Remove deprecated and experimental fields from KubeletConfiguration
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.
I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP (maybe not v1 in 1.8, given how close we are, but definitely in 1.9).
It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
fixes: #51657
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50381, 51307, 49645, 50995, 51523)
Bugfix: Use local JSON log buffer in parseDockerJSONLog.
**What this PR does / why we need it**:
The issue described in #47800 is due to a race condition in `ReadLogs`: Because the JSON log buffer (`dockerJSONLog`) is package-scoped, any two goroutines modifying the buffer could race and overwrite the other's changes. In particular, one goroutine could unmarshal a JSON log line into the buffer, then another goroutine could `Reset()` the buffer, and the resulting `Stream` would be empty (`""`). This empty `Stream` is caught in a `case` block and raises an `unexpected stream type` error.
This PR creates a new buffer for each execution of `parseDockerJSONLog`, so each goroutine is guaranteed to have a local instance of the buffer.
**Which issue this PR fixes**: fixes#47800
**Release note**:
```release-note
Fixed an issue (#47800) where `kubectl logs -f` failed with `unexpected stream type ""`.
```
Automatic merge from submit-queue (batch tested with PRs 49971, 51357, 51616, 51649, 51372)
Separate feature gates for dynamic kubelet config vs loading from a file
This makes it so these two features can be turned on independently, rather than bundling both under dynamic kubelet config.
fixes: #51664
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49971, 51357, 51616, 51649, 51372)
CPU manager wiring and `none` policy
Blocker for CPU manager #49186 (4 of 6)
* Previous PR in this series: #51140
* Next PR in this series: #51180
cc @balajismaniam @derekwaynecarr @sjenning
**Release note**:
```release-note
NONE
```
TODO:
- [X] In-memory CPU manager state
- [x] Kubelet config value
- [x] Feature gate
- [X] None policy
- [X] Unit tests
- [X] CPU manager instantiation
- [x] Calls into CPU manager from Kubelet container runtime
Automatic merge from submit-queue (batch tested with PRs 51628, 51637, 51490, 51279, 51302)
Fix pod local ephemeral storage usage calculation
We use podDiskUsage to calculate pod local ephemeral storage which is not correct, because podDiskUsage also contains HostPath volume which is considered as persistent storage
This pr fixes it
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51489
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @jingxu97 @vishh
cc @ddysher
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)
Add PVCRef to VolumeStats
**What this PR does / why we need it**:
For pod volumes that reference a PVC, add a PVCRef to the corresponding
volume stat. This allows metrics to be indexed/queried by PVC name
which is more user-friendly than Pod reference
**Which issue this PR fixes** : [#363](https://github.com/kubernetes/features/issues/363)
**Special notes for your reviewer**:
**Release note**:
```
`VolumeStats` reported by the kubelet stats summary API
(http://<node>:10255/stats/summary) now include a PVCRef
field describing the PVC referenced by the volume (if any).
```
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)
fix typo about volumes
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50719, 51216, 50212, 51408, 51381)
Use constants instead of magic string for runtime names
**What this PR does / why we need it**:
Use constants instead of magic string for runtime names.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51678
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.
I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP.
It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
For pod volumes that reference a PVC, add a PVCRef to the corresponding
volume stat. This allows metrics to be indexed/queried by PVC name
which is more user-friendly than Pod reference
Automatic merge from submit-queue
Implement stop function in streaming server.
Implement streaming server stop, so that we could properly stop streaming server.
We need this to properly stop cri-containerd.
Automatic merge from submit-queue (batch tested with PRs 49961, 50005, 50738, 51045, 49927)
adding validations on kubelet starting configurations
**What this PR does / why we need it**:
I found some validations of kubelet starting options were missing when I was creating a custom cluster from scratch. The kubelet does not check invalid configurations on `--cadvisor-port`, `--event-burst`, `--image-gc-high-threshold`, etc. I have added some validations in kubelet like validations in `cmd/kube-apiserver/app/options/validation.go`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Adds additional validation for kubelet in `pkg/kubelet/apis/kubeletconfig/validation`.
```
Automatic merge from submit-queue (batch tested with PRs 51425, 51404, 51459, 51504, 51488)
Admit NoNewPrivs for remote and rkt runtimes
**What this PR does / why we need it**:
#51347 is aiming to admit NoNewPrivis for remote container runtime, but it didn't actually solve the problem. See @miaoyq 's comments [here](https://github.com/kubernetes/kubernetes/pull/51347#discussion_r135379446).
This PR always admit NoNewPrivs for runtimes except docker, which should fix the problem.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Fixes#51319.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51471, 50561, 50435, 51473, 51436)
Fix inconsistent Prometheus cAdvisor metrics
**What this PR does / why we need it**:
We need this because otherwise kubelet is exposing different sets of Prometheus metrics that randomly include or do not include containers.
See also https://github.com/google/cadvisor/issues/1704; quoting here:
Prometheus requires that all metrics in the same family have the same labels, so we arrange to supply blank strings for missing labels
The function `containerPrometheusLabels()` conditionally adds various metric labels from container labels - pod name, image, etc. However, when it receives the metrics, Prometheus [checks](https://github.com/prometheus/client_golang/blob/master/prometheus/registry.go#L665) that all metrics in the same family have the same label set, and [rejects](https://github.com/prometheus/client_golang/blob/master/prometheus/registry.go#L497) those that do not.
Since containers are collected in (somewhat) random order, depending on which kind is seen first you get one set of metrics or the other.
Changing the container labels function to always add the same set of labels, adding `""` when it doesn't have a real value, eliminates the issue in my testing.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#50151
**Special notes for your reviewer**:
I have made the same fix in two places. I am 98% sure the one in `cadvisor_linux.go` isn't used and indeed cannot be used, but have not gone fully down that rabbit-hole.
**Release note**:
```release-note
Fix inconsistent Prometheus cAdvisor metrics
```