This PR fixes the following issues:
1. Use the ResourceStorageScratch API instead of ResourceStorage to represent
local storage capacity.
2. In the eviction manager, retrieve node capacity and reserved resources
from the container manager instead of the node provider (kubelet). The
node provider sits behind a feature gate, so scratch storage information
may not be exposed when the gate is not set; the container manager, by
contrast, always has the full capacity and allocatable resource
information (see the sketch after this list).
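To make the difference concrete, here is a minimal Go sketch. The types, method names, and resource name below are simplified stand-ins for the kubelet's real interfaces, not its actual API:

```go
package main

import "fmt"

// Simplified stand-ins, not the kubelet's real types. The node provider only
// reports scratch capacity when the relevant feature gate is enabled, while
// the container manager always has it.
const scratch = "storage.kubernetes.io/scratch"

type nodeProvider struct{ featureGateOn bool }

func (p nodeProvider) Capacity() map[string]int64 {
	caps := map[string]int64{"memory": 8 << 30}
	if p.featureGateOn {
		caps[scratch] = 100 << 30
	}
	return caps
}

type containerManager struct{}

func (containerManager) Capacity() map[string]int64 {
	return map[string]int64{"memory": 8 << 30, scratch: 100 << 30}
}

func main() {
	_, ok := nodeProvider{featureGateOn: false}.Capacity()[scratch]
	fmt.Println(ok) // false: eviction manager would see no scratch capacity
	_, ok = containerManager{}.Capacity()[scratch]
	fmt.Println(ok) // true: always available to the eviction manager
}
```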
This PR adds support for allocatable local storage (scratch space). The
feature covers only the root filesystem, which is shared by Kubernetes
components, users' containers, and/or images. Users can reserve storage
for kube system components with the --kube-reserved flag. If the
allocatable storage for users' pods is used up, pods are evicted to free
the storage resource.
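A rough sketch of the resulting arithmetic, with an illustrative resource size and a hypothetical helper (the kubelet's real threshold logic is more involved):

```go
package main

import "fmt"

// underStoragePressure reports whether pods have exhausted the scratch space
// left after the kube-reserved carve-out; a simplified model of the check,
// not the kubelet's actual eviction thresholds.
func underStoragePressure(podUsedBytes, capacityBytes, kubeReservedBytes int64) bool {
	allocatableBytes := capacityBytes - kubeReservedBytes
	return podUsedBytes >= allocatableBytes
}

func main() {
	capacity := int64(100 << 30)    // 100 GiB root filesystem
	kubeReserved := int64(10 << 30) // reserved via --kube-reserved (value illustrative)
	fmt.Println(underStoragePressure(85<<30, capacity, kubeReserved)) // false
	fmt.Println(underStoragePressure(95<<30, capacity, kubeReserved)) // true: evict pods
}
```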
Automatic merge from submit-queue (batch tested with PRs 46501, 45944, 46473)
fix func comment in helpers.go
**What this PR does / why we need it**:
fix func comment in helpers.go
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42369, 42375, 42397, 42435, 42455)
[Bug Fix]: Avoid evicting more pods than necessary by adding Timestamps for fsstats and ignoring stale stats
Continuation of #33121. Credit for most of this goes to @sjenning. I added volume fs timestamps.
**why is this a bug**
This PR attempts to fix part of https://github.com/kubernetes/kubernetes/issues/31362, which results in multiple pods being evicted unnecessarily whenever the node runs into resource pressure. It reduces the chance of such disruptions by not reacting to old or stale metrics.
Without this PR, Kubernetes nodes under resource pressure cause unnecessary disruptions to user workloads.
This PR will also help deflake a node e2e test suite.
The eviction manager currently avoids evicting pods if metrics are old. However, timestamps are not available for filesystem stats, which causes many unnecessary evictions.
See the [inode eviction test flakes](https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e) for examples.
This should probably be treated as a bugfix, since it helps mitigate unnecessary evictions.
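For illustration, a minimal Go sketch of the staleness guard the new timestamps enable; the types and field names are stand-ins, not the real summary API:

```go
package main

import (
	"fmt"
	"time"
)

// fsStats is an illustrative stand-in for a filesystem stat; the point of the
// fix is that it now carries the time the stat was collected.
type fsStats struct {
	AvailableBytes uint64
	Time           time.Time
}

// isStale reports whether a stat is too old to act on. Eviction decisions
// based on stats older than maxAge risk reacting to pressure that has
// already been relieved.
func isStale(collected time.Time, maxAge time.Duration) bool {
	return time.Since(collected) > maxAge
}

func main() {
	old := fsStats{AvailableBytes: 1 << 30, Time: time.Now().Add(-2 * time.Minute)}
	fresh := fsStats{AvailableBytes: 1 << 30, Time: time.Now()}
	fmt.Println(isStale(old.Time, time.Minute))   // true: ignore, wait for recollection
	fmt.Println(isStale(fresh.Time, time.Minute)) // false: safe to act on
}
```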
cc: @kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews @vishh @derekwaynecarr @sjenning
Automatic merge from submit-queue
kubelet: eviction: avoid duplicate action on stale stats
Currently, the eviction code can be overly aggressive when synchronize() is called two (or more) times before a particular stat has been recollected by cadvisor: the eviction manager takes additional action based on information it has already acted on.
This PR gives the eviction manager a way to track the timestamp of the last observation and to take no action if the stat has not been updated since the last synchronize() run.
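A rough Go sketch of that bookkeeping, with hypothetical names (the real helpers live in the eviction package and differ in detail):

```go
package main

import (
	"fmt"
	"time"
)

type signal string

// observation pairs a measured value with the time it was collected.
type observation struct {
	availableBytes uint64
	time           time.Time
}

// updatedSignals returns the signals whose stats were recollected since the
// previous synchronize() pass; anything else has already been acted on.
func updatedSignals(current, last map[signal]observation) []signal {
	var updated []signal
	for s, obs := range current {
		prev, seen := last[s]
		if !seen || obs.time.After(prev.time) {
			updated = append(updated, s)
		}
	}
	return updated
}

func main() {
	t0 := time.Now()
	last := map[signal]observation{"memory.available": {1 << 30, t0}}
	current := map[signal]observation{
		"memory.available": {1 << 30, t0},                   // not recollected yet
		"nodefs.available": {10 << 30, t0.Add(time.Second)}, // fresh
	}
	fmt.Println(updatedSignals(current, last)) // [nodefs.available]
}
```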
@derekwaynecarr @vishh @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
kubelet: eviction: allow minimum reclaim as percentage
Fixes #33354
xref #32537
**Release note**:
```release-note
The kubelet --eviction-minimum-reclaim option can now take percentages as well as absolute values for resource quantities
```
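For illustration, a simplified sketch of how such a dual-format value could be resolved; the real kubelet parses absolute values as resource quantities (e.g. "500Mi") via the apimachinery resource package rather than as raw byte counts:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMinimumReclaim resolves a minimum-reclaim value that may be either a
// percentage of capacity ("5%") or an absolute byte count. A hypothetical
// helper, simplified from what the kubelet actually does.
func parseMinimumReclaim(val string, capacityBytes int64) (int64, error) {
	if strings.HasSuffix(val, "%") {
		pct, err := strconv.ParseFloat(strings.TrimSuffix(val, "%"), 64)
		if err != nil {
			return 0, err
		}
		return int64(float64(capacityBytes) * pct / 100), nil
	}
	return strconv.ParseInt(val, 10, 64)
}

func main() {
	capacity := int64(100 << 30) // 100 GiB filesystem
	got, _ := parseMinimumReclaim("5%", capacity)
	fmt.Println(got) // 5368709120: reclaim at least 5 GiB once eviction starts
}
```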
@derekwaynecarr @vishh @mtaufen
Automatic merge from submit-queue
Set imagefs rank and reclaim functions when nodefs+imagefs share comm…
Fixes #31192
I decided that the behavior should match the current output of the kubelet summary API: with no dedicated imagefs, the imagefs ranking and reclaim functions match the nodefs ranking and reclaim functions.
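A minimal Go sketch of that fallback; the names are illustrative rather than the kubelet's actual helper functions:

```go
package main

import "fmt"

// rankFunc would order pods for eviction for a given signal; a no-op
// signature stands in here for the sketch.
type rankFunc func()

// buildRankFuncs wires eviction signals to ranking functions. Without a
// dedicated imagefs, the imagefs signal reuses the nodefs function, matching
// how the summary API reports a shared device.
func buildRankFuncs(hasDedicatedImagefs bool, rankNodefs, rankImagefs rankFunc) map[string]rankFunc {
	funcs := map[string]rankFunc{
		"nodefs.available": rankNodefs,
	}
	if hasDedicatedImagefs {
		funcs["imagefs.available"] = rankImagefs
	} else {
		// nodefs and imagefs share one device, so rank (and reclaim)
		// for imagefs signals exactly as for nodefs.
		funcs["imagefs.available"] = rankNodefs
	}
	return funcs
}

func main() {
	nodefsRank := rankFunc(func() { fmt.Println("rank by nodefs usage") })
	imagefsRank := rankFunc(func() { fmt.Println("rank by imagefs usage") })
	funcs := buildRankFuncs(false, nodefsRank, imagefsRank)
	funcs["imagefs.available"]() // prints "rank by nodefs usage"
}
```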
/cc @ronnielai @vishh