Automatic merge from submit-queue (batch tested with PRs 49342, 50581, 50777)
Update RegisterMandatoryFitPredicate to avoid double register.
**What this PR does / why we need it**:
In https://github.com/kubernetes/kubernetes/pull/50362 , we introduced `RegisterMandatoryFitPredicate` so that some predicates are always included by the scheduler. This PR improves it by avoiding double registration through both `RegisterFitPredicate` and `RegisterMandatoryFitPredicate`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50360
**Release note**:
```release-note
None
```
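To illustrate the intent, here is a minimal Go sketch of a registry in which a single mandatory registration both stores the predicate and marks it as always included, so no second registration call is needed. The `registry` type, its methods, and the predicate names in the usage note are hypothetical stand-ins, not the scheduler's actual code.

```go
package main

import "sort"

// fitPredicate is a hypothetical stand-in for the scheduler's predicate type.
type fitPredicate func(pod, node string) bool

// registry is an illustrative container, not the scheduler's real data structure.
type registry struct {
	predicates map[string]fitPredicate
	mandatory  map[string]bool
}

func newRegistry() *registry {
	return &registry{
		predicates: map[string]fitPredicate{},
		mandatory:  map[string]bool{},
	}
}

// register stores an optional predicate under name.
func (r *registry) register(name string, p fitPredicate) string {
	r.predicates[name] = p
	return name
}

// registerMandatory stores the predicate and marks it mandatory in one call,
// so callers do not have to register it twice.
func (r *registry) registerMandatory(name string, p fitPredicate) string {
	r.predicates[name] = p
	r.mandatory[name] = true
	return name
}

// selected returns the sorted names of the requested, known predicates
// plus every mandatory predicate.
func (r *registry) selected(requested []string) []string {
	seen := map[string]bool{}
	for _, n := range requested {
		if _, ok := r.predicates[n]; ok {
			seen[n] = true
		}
	}
	for n := range r.mandatory {
		seen[n] = true
	}
	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}
```

With this shape, registering one predicate via `register` and another via `registerMandatory` and then calling `selected(nil)` returns only the mandatory one, while requesting the optional predicate returns both.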
Automatic merge from submit-queue
Moved node condition filter into a predicate.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50360
**Release note**:
```release-note
A new predicate, named 'CheckNodeCondition', was added to replace the node condition filter. 'NetworkUnavailable', 'OutOfDisk' and 'NotReady' may be reported as reasons when pods fail to schedule.
```
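A rough Go sketch of what a node-condition predicate could look like is shown here. The types are simplified local stand-ins rather than the real `v1` API types, and the exact status checks are assumptions made for illustration; only the three reason strings come from the release note.

```go
package main

// conditionStatus and nodeCondition are simplified, hypothetical stand-ins
// for the corresponding Kubernetes API types.
type conditionStatus string

const (
	conditionTrue  conditionStatus = "True"
	conditionFalse conditionStatus = "False"
)

type nodeCondition struct {
	Type   string
	Status conditionStatus
}

// checkNodeCondition reports whether a node with the given conditions is
// schedulable and, if not, the reasons why. The status semantics used here
// are an assumption, not a copy of the scheduler's logic.
func checkNodeCondition(conds []nodeCondition) (bool, []string) {
	var reasons []string
	for _, c := range conds {
		switch {
		case c.Type == "Ready" && c.Status != conditionTrue:
			reasons = append(reasons, "NotReady")
		case c.Type == "OutOfDisk" && c.Status != conditionFalse:
			reasons = append(reasons, "OutOfDisk")
		case c.Type == "NetworkUnavailable" && c.Status != conditionFalse:
			reasons = append(reasons, "NetworkUnavailable")
		}
	}
	return len(reasons) == 0, reasons
}
```

Returning the reasons alongside the boolean is what lets the failure be surfaced to the user instead of the node being silently filtered out.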
Automatic merge from submit-queue
Retry scheduling pods after errors more consistently in scheduler
**What this PR does / why we need it**:
This fixes 2 places in the scheduler where pods can get stuck in Pending forever. In both these places, errors happen and `sched.config.Error` is not called afterwards. This is a problem because `sched.config.Error` is responsible for requeuing pods to retry scheduling when there are issues (see [here](2540b333b2/plugin/pkg/scheduler/factory/factory.go (L958))), so if we don't call `sched.config.Error` then the pod will never get scheduled (unless the scheduler is restarted).
One of these (where it returns when `ForgetPod` fails instead of continuing and reporting an error) is a regression from [this refactor](https://github.com/kubernetes/kubernetes/commit/ecb962e6585#diff-67f2b61521299ca8d8687b0933bbfb19L234), and with the [old behavior](80f26fa8a8/plugin/pkg/scheduler/scheduler.go (L233-L237)) the error was reported correctly. As far as I can tell changing the error handling in that refactor wasn't intentional.
When `AssumePod` fails, no error has ever been reported, but I think adding this will help the scheduler recover when something goes wrong instead of letting pods possibly never get scheduled.
This will help prevent issues like https://github.com/kubernetes/kubernetes/issues/49314 in the future.
**Release note**:
```release-note
Fix incorrect retry logic in scheduler
```
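The control flow described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `scheduler` struct, its function fields, and `stepError` are hypothetical, and `onError` merely plays the role that `sched.config.Error` has in the real code.

```go
package main

import "fmt"

// pod is a hypothetical, stripped-down stand-in for a Kubernetes pod.
type pod struct{ Name string }

// stepError is a simple string-backed error used by this sketch.
type stepError string

func (e stepError) Error() string { return string(e) }

// scheduler wires together the steps of one scheduling attempt.
// onError stands in for sched.config.Error, which requeues the pod.
type scheduler struct {
	assume  func(*pod) error
	bind    func(*pod) error
	forget  func(*pod) error
	onError func(*pod, error)
}

// scheduleOne reports every failure through onError so the pod is retried.
func (s *scheduler) scheduleOne(p *pod) {
	if err := s.assume(p); err != nil {
		s.onError(p, err) // an assume failure is now reported too
		return
	}
	if err := s.bind(p); err != nil {
		if ferr := s.forget(p); ferr != nil {
			// Do not return early here; the pod must still be requeued.
			err = fmt.Errorf("%v; forget failed: %v", err, ferr)
		}
		s.onError(p, err)
	}
}
```

The point of the sketch is that no error path leaves `scheduleOne` without going through `onError`.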
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)
Task 0: Added node taints labels and feature flags
**What this PR does / why we need it**:
Added node taint constants for node conditions.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #42001
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
Fix incorrect call to 'bind' in scheduler
I previously submitted https://github.com/kubernetes/kubernetes/pull/49661 -- I'm not sure if that PR is too big or what, but this is an attempt at a smaller PR that makes progress on the same issue and is easier to review.
**What this PR does / why we need it**:
In this refactor (https://github.com/kubernetes/kubernetes/commit/ecb962e6585#diff-67f2b61521299ca8d8687b0933bbfb19R223) the scheduler code was split into separate `bind` and `assume` functions. When that happened, `bind` was called with `pod` as an argument. The argument to `bind` should be the assumed pod, not the original pod. Evidence that `assumedPod`, not `pod`, is the correct argument to `bind`: 80f26fa8a8/plugin/pkg/scheduler/scheduler.go (L229-L234). (The function signature for `bind` also says `assumed`, even though it's not called with the assumed pod as an argument.)
This is an issue (and causes #49314, where pods that fail to bind to a node get stuck indefinitely) in the following scenario:
1. The pod fails to bind to the node
2. `bind` calls `ForgetPod` with the `pod` argument
3. Since `ForgetPod` is expecting the assumed pod as an argument (because that's what's in the scheduler cache), it fails with an error like `scheduler cache ForgetPod failed: pod test-677550-rc-edit-namespace/nginx-jvn09 state was assumed on a different node`
4. The pod gets lost forever because of some incomplete error handling (which I haven't addressed here in the interest of making a simpler PR)
In this PR I've fixed the call to `bind` and modified the tests to make sure that `ForgetPod` gets called with the correct argument (the assumed pod) when binding fails.
**Which issue this PR fixes**: fixes #49314
**Special notes for your reviewer**:
**Release note**:
```release-note
```
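To make the failure mode concrete, here is a small Go sketch of a cache whose `ForgetPod` compares the node of the pod it is given with the node recorded at assume time. The `pod` and `cache` types are simplified, hypothetical stand-ins for the scheduler cache, and the error text only loosely mirrors the message quoted above.

```go
package main

import "fmt"

// pod is a hypothetical, stripped-down pod with only the fields the sketch needs.
type pod struct {
	Name     string
	NodeName string
}

// cache remembers, per pod name, the node the pod was assumed on.
type cache struct {
	assumed map[string]string
}

func newCache() *cache {
	return &cache{assumed: map[string]string{}}
}

// AssumePod records the node carried by the assumed pod.
func (c *cache) AssumePod(p pod) {
	c.assumed[p.Name] = p.NodeName
}

// ForgetPod only succeeds when it is handed the same pod state that was assumed.
func (c *cache) ForgetPod(p pod) error {
	node, ok := c.assumed[p.Name]
	if !ok {
		return fmt.Errorf("pod %s is not assumed", p.Name)
	}
	if node != p.NodeName {
		return fmt.Errorf("pod %s state was assumed on a different node", p.Name)
	}
	delete(c.assumed, p.Name)
	return nil
}
```

In this sketch, assuming a copy of the pod with `NodeName` set and then calling `ForgetPod` with the original pod (empty `NodeName`) returns an error, while calling it with the assumed copy succeeds.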
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Remove duplicated import and wrong alias name of api package
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes #48975
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48636, 49088, 49251, 49417, 49494)
Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
This PR fixes issue #47809
Automatic merge from submit-queue (batch tested with PRs 48043, 48200, 49139, 36238, 49130)
Implement equivalence cache by caching and re-using predicate result
This is the last part of #30844. I opened a new PR instead of overwriting the old one because we changed a basic assumption by allowing equivalence cache items to be invalidated per individual predicate.
The idea of this PR is based on discussion in https://github.com/kubernetes/kubernetes/issues/32024
- [x] Pods that belong to the same controllerRef are considered equivalent
- [x] `podFitsOnNode` will use the cached predicate result if it's available
- [x] The equivalence cache will be updated when a fresh predicate result is computed
- [x] `factory.go` will invalidate specific predicate cache(s) based on the object change
- [x] Since `schedule` and `bind` are async, we need to optimistically invalidate affected cache(s) before `bind`
- [x] Full unit tests of affected files
- [x] e2e test to verify the cache update/invalidation workflow
- [x] performance test results
- [x] Some related nit fixes were expected to result in `needs-rebase`, so they are split into: #36060 #35968 #37512
cc @wojtek-t @davidopp
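The checklist above can be illustrated with a minimal Go sketch of a per-predicate result cache. All names here (`equivalenceCache`, `cacheKey`, `equivHash`, and the methods) are hypothetical; the sketch assumes equivalent pods have already been reduced to a shared hash, for example one derived from their controllerRef.

```go
package main

// cacheKey identifies one cached predicate result for a class of
// equivalent pods on a given node.
type cacheKey struct {
	node      string
	predicate string
	equivHash uint64
}

// predicateResult is the cached outcome of running one predicate.
type predicateResult struct {
	fit     bool
	reasons []string
}

// equivalenceCache is an illustrative, non-thread-safe cache.
type equivalenceCache struct {
	items map[cacheKey]predicateResult
}

func newEquivalenceCache() *equivalenceCache {
	return &equivalenceCache{items: map[cacheKey]predicateResult{}}
}

// lookup returns a cached result if one exists.
func (c *equivalenceCache) lookup(node, predicate string, equivHash uint64) (predicateResult, bool) {
	r, ok := c.items[cacheKey{node: node, predicate: predicate, equivHash: equivHash}]
	return r, ok
}

// update stores a freshly computed predicate result.
func (c *equivalenceCache) update(node, predicate string, equivHash uint64, r predicateResult) {
	c.items[cacheKey{node: node, predicate: predicate, equivHash: equivHash}] = r
}

// invalidate drops every cached result of one predicate on one node,
// leaving results of other predicates untouched.
func (c *equivalenceCache) invalidate(node, predicate string) {
	for k := range c.items {
		if k.node == node && k.predicate == predicate {
			delete(c.items, k)
		}
	}
}
```

Keying on the predicate name is what makes per-predicate invalidation cheap: an object change that only affects one predicate does not discard results for the others.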
Automatic merge from submit-queue (batch tested with PRs 49055, 49128, 49132, 49134, 49110)
Remove affinity annotations leftover
**What this PR does / why we need it**:
This is a further cleanup for affinity annotations, following #47869.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
ref: #47869
**Special notes for your reviewer**:
- I removed the commented-out test cases and left TODOs instead. I don't think converting these currently untestable test cases is necessary for now. We can add new test cases in the future.
- I removed the e2e test case `validates that embedding the JSON PodAffinity and PodAntiAffinity setting as a string in the annotation value work` because we have a test case `validates that InterPod Affinity and AntiAffinity is respected if matching` to test the same thing.
/cc @aveshagarwal @bsalamat @gyliu513 @k82cn @timothysc
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46094, 48544, 48807, 49102, 44174)
Static deepcopy – phase 1
This PR is the follow-up of https://github.com/kubernetes/kubernetes/pull/36412, replacing the
dynamic reflection based deepcopy with static DeepCopy+DeepCopyInto methods on API types.
This PR **does not yet** include the code dropping the cloner from the scheme and all the
porting of the calls to scheme.Copy. This will be part of a follow-up "Phase 2" PR.
A couple of the commits will go in first:
- [x] audit: fix deepcopy registration https://github.com/kubernetes/kubernetes/pull/48599
- [x] apimachinery+apiserver: separate test types in their own packages #48601
- [x] client-go: remove TPR example #48604
- [x] apimachinery: remove unneeded GetObjectKind() impls #48608
- [x] sanity check against origin, that OpenShift's types are fine for static deepcopy https://github.com/deads2k/origin/pull/34
TODO **after** review here:
- [x] merge https://github.com/kubernetes/gengo/pull/32 and update vendoring commit
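For readers unfamiliar with the pattern, here is a minimal Go sketch of what static `DeepCopy` and `DeepCopyInto` methods look like on a type. `Widget` is a hypothetical type invented for illustration, and the methods are hand-written in the general style of generated deepcopy code rather than taken from it.

```go
package main

// Widget is a hypothetical API type used only for illustration.
type Widget struct {
	Name   string
	Labels map[string]string
	Ports  []int32
}

// DeepCopyInto copies the receiver into out without using reflection.
// Maps and slices get fresh backing storage so the copy shares nothing
// mutable with the original.
func (in *Widget) DeepCopyInto(out *Widget) {
	*out = *in
	if in.Labels != nil {
		out.Labels = make(map[string]string, len(in.Labels))
		for k, v := range in.Labels {
			out.Labels[k] = v
		}
	}
	if in.Ports != nil {
		out.Ports = make([]int32, len(in.Ports))
		copy(out.Ports, in.Ports)
	}
}

// DeepCopy allocates a new Widget and copies the receiver into it.
// A nil receiver yields nil.
func (in *Widget) DeepCopy() *Widget {
	if in == nil {
		return nil
	}
	out := new(Widget)
	in.DeepCopyInto(out)
	return out
}
```

Because the copy logic is ordinary typed code, mistakes surface at compile time and no reflection is needed at run time.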
Automatic merge from submit-queue (batch tested with PRs 48333, 48806, 49046)
use v1.ResourcePods instead of hard coding "pods"
Signed-off-by: sakeven <jc5930@sina.cn>
**What this PR does / why we need it**:
use v1.ResourcePods instead of hard coding 'pods'
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue
[Scheduler] Remove error since err is always nil
Signed-off-by: sakeven <jc5930@sina.cn>
**What this PR does / why we need it**:
No need to log error since err is always nil.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48262, 48805)
[Scheduler] Use const value maxPriority instead of immediate value 10
Signed-off-by: sakeven <jc5930@sina.cn>
**What this PR does / why we need it**:
Use const value maxPriority instead of immediate value 10.
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue
forget pod first after binding failed
Signed-off-by: sakeven <jc5930@sina.cn>
**What this PR does / why we need it**:
In the scheduler cache implementation, `FinishBinding` marks the Pod as expired, and the pod is then cleaned up after ttl seconds. `ForgetPod`, meanwhile, checks whether the Pod is assumed and reports an error if it is not.
So if binding fails and the ttl (currently 30s) is too short, `ForgetPod` will return an error, and we won't record the `BindingRejected` event.
Although this is rare, we shouldn't depend on the value of the ttl.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
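The ordering problem can be reproduced with a small Go sketch. The `cache` type below is a hypothetical simplification of the scheduler cache that uses plain integer seconds instead of real clocks, and `cleanupExpired` stands in for whatever background cleanup removes expired pods.

```go
package main

import "fmt"

// assumedPod tracks whether binding has finished and when the entry expires.
type assumedPod struct {
	bindingFinished bool
	deadline        int64 // seconds; only meaningful once bindingFinished is set
}

// cache is an illustrative stand-in for the scheduler cache.
type cache struct {
	ttl     int64
	assumed map[string]*assumedPod
}

func newCache(ttl int64) *cache {
	return &cache{ttl: ttl, assumed: map[string]*assumedPod{}}
}

// AssumePod records the pod as assumed.
func (c *cache) AssumePod(name string) {
	c.assumed[name] = &assumedPod{}
}

// FinishBinding starts the expiry countdown for an assumed pod.
func (c *cache) FinishBinding(name string, now int64) {
	if p, ok := c.assumed[name]; ok {
		p.bindingFinished = true
		p.deadline = now + c.ttl
	}
}

// cleanupExpired removes pods whose deadline has passed.
func (c *cache) cleanupExpired(now int64) {
	for name, p := range c.assumed {
		if p.bindingFinished && now >= p.deadline {
			delete(c.assumed, name)
		}
	}
}

// ForgetPod fails if the pod is no longer assumed.
func (c *cache) ForgetPod(name string) error {
	if _, ok := c.assumed[name]; !ok {
		return fmt.Errorf("pod %s is not assumed", name)
	}
	delete(c.assumed, name)
	return nil
}
```

If `FinishBinding` runs first and the ttl elapses before `ForgetPod`, the forget call fails; calling `ForgetPod` first after a failed bind removes that dependency on the ttl.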
Automatic merge from submit-queue (batch tested with PRs 47417, 47638, 46930)
Added scheduler integration test owners.
**What this PR does / why we need it**:
Add an OWNERS file to the scheduler integration tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes # N/A
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
Improved code coverage for equivalence cache.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
none
```
Automatic merge from submit-queue (batch tested with PRs 48405, 48742, 48748, 48571, 48482)
Removed scheduler dependencies to testapi.
**What this PR does / why we need it**:
When refactoring the scheduler to use client-go and k8s.io/api, we also need to remove the dependency on testapi.
I prefer to only include import/BUILD changes for #44188, so I created a separate PR for the other enhancement and removal work.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: partially fixes #44188
**Release note**:
```release-note-none
```
Automatic merge from submit-queue (batch tested with PRs 46865, 48661, 48598, 48658, 48614)
Fix function names in the comments
This patch fixes function and type names in the comments
in predicates.go.
**What this PR does / why we need it**:
It fixes function and type names in the comments in predicates.go.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
This does not have an issue # because it is a trivial fix.
**Special notes for your reviewer**:
**Release note**:
```release-note
```