Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Critical pod priorityClass addition
**What this PR does / why we need it**:
@bsalamat - Apologies for the delay. This PR is to ensure that all pods with priorityClassName `system-node-critical` and `system-cluster-critical` will be critical pods while preserving backwards compatibility.
**Special notes for your reviewer**:
- Moved some constants and other data structures to scheduler/api/types.go where other constants are present.
- An automatic assignment of critical priorities to pods based on critical pod annotation for backwards compatibility including some unit tests.
xref: https://github.com/kubernetes/kubernetes/issues/57471
**Release note**:
```release-note
Critical pods to use priorityClasses.
```
Automatic merge from submit-queue (batch tested with PRs 60106, 59510, 60263, 60063, 59088). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reuse the `min*Nodes` slices in order to save GC time
**What this PR does / why we need it**:
Reuse the `min*Nodes` slices to save GC time when executing `pickOneNodeForPreemption`.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59748
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use consts as predicate key names in handlers
**What this PR does / why we need it**:
Per discussion in: https://github.com/kubernetes/kubernetes/pull/59335/files#r168351460
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59951
**Special notes for your reviewer**:
**Release note**:
```release-note
Use consts as predicate name in handlers
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve scheduling queue's logic
**What this PR does / why we need it**:
Improves scheduling queue's code based on some recent comments on [the original PR](https://github.com/kubernetes/kubernetes/pull/55109).
This PR does not fix any bugs or make any change of behavior.
**Release note**:
```release-note
NONE
```
/sig scheduling
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move volume scheduling and local storage to beta
**What this PR does / why we need it**:
* Move the feature gates and APIs for volume scheduling and local storage to beta
* Update tests to use the beta fields
@kubernetes/sig-storage-pr-reviews
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59390
**Special notes for your reviewer**:
**Release note**:
```release-note
ACTION REQUIRED: VolumeScheduling and LocalPersistentVolume features are beta and enabled by default. The PersistentVolume NodeAffinity alpha annotation is deprecated and will be removed in a future release.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Taint node when it under PID pressure.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313
**Release note**:
```release-note
If TaintNodesByCondition enabled, taint node when it under PID pressure
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies
**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm separating this effort into two separate PRs, since updating rules_go requires a large cleanup, removing an attribute from most build rules.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add node shutdown taint
**What this PR does / why we need it**: we need node stopped taint in order to detach volumes immediately without waiting timeout. More info in issue ticket #58635
**Which issue(s) this PR fixes**
Fixes#58635
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes volume predicate handler for equiv class
**What this PR does / why we need it**:
Per discussion in #58797 , we are missing some predicate handler in factory.go.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58797
Ref #58222
**Special notes for your reviewer**:
Kindly ping @msau42
**Release note**:
```release-note
Fixes volume predicate handler for equiv class
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Format some import statements in scheduler pkg
**What this PR does / why we need it**:
As the title says, apply `goimports` on some files under `pkg/scheduler` pkg.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve performance of scheduling queue by adding a hash map to track all pods with a nominatedNodeName
**What this PR does / why we need it**:
Our investigations show that there is a performance regression in the new scheduling queue which is not enabled by default and is enabled only if "priority and preemption" which is an alpha feature is enabled. This PR is an important performance improvement for those who want to use priority and preemption in larger clusters.
The PR adds a hash table to track nominated Pods so that finding such Pods will be faster.
Other than improving performance, we don't expect this PR to change behavior of scheduler.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
ref/ #56032
ref/ #57471
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/sig scheduling
Automatic merge from submit-queue (batch tested with PRs 59479, 59246). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add tests for schedulercache
**What this PR does / why we need it**:
Add tests for `node_info.go` under `schedulercache` package.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Avoid race condition when updating equivalence cache
**What this PR does / why we need it**:
Lock the ecache to update the ecache on each predicate running, to avoid race condition.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fix#58507
**Special notes for your reviewer**:
None
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Subtract local ephemeral storage resource from NodeInfo when removing pod
**What this PR does / why we need it**:
When we are removing pods, we need to subtract local ephemeral storage resource from NodeInfo
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/kind bug
/sig storage
/sig scheduling
/assign @jingxu97 @bsalamat
fix function
fix gofmt
fix function return value
fix tests
skip notimplemented error
remove factory unused
in openstack we should try to find instanceid from all states instead of ACTIVE, all other cloudproviders do this already
fix tests and lint
fix gofmt
fix nodelifecycletest
fix lint errors
shutdowned -> stopped
use shutdown everywhere
use patch in taints api call
use notimplemented in clouds use AddOrUpdateTaintOnNode
correct log text
add fake cloud
try to fix bazel
add shutdown tests
add context
Automatic merge from submit-queue (batch tested with PRs 59010, 59212, 59281, 59014, 59297). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Replace nominateNodeName annotation with PodStatus.NominatedNodeName
**What this PR does / why we need it**:
Replaces nominateNodeName annotation with PodStatus.NominatedNodeName in scheudler's logic. We don't expect any logic/behavior changes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
ref #57471
/sig scheduling
cc: @k82cn @aveshagarwal @resouer
Automatic merge from submit-queue (batch tested with PRs 58444, 59283, 59437, 59325, 59449). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix to register priority function ResourceLimitsPriority correctly.
**What this PR does / why we need it**:
This PR fixes registration of priority function ResourceLimitsPriority. Previously this function was being registered inside `init()`. Since this priority function ResourceLimitsPriority is behind feature gate `ResourceLimitsPriorityFunction` and if the feature is enabled, it was not visible in `init()` function. So now the registration of this priority function is moved inside `ApplyFeatureGates()` in scheduler where it can be correctly registered after the feature has been enabled.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
@kubernetes/sig-scheduling-pr-reviews @bsalamat @ravisantoshgudimetla
Automatic merge from submit-queue (batch tested with PRs 59394, 58769, 59423, 59363, 59245). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure euqiv hash calculation is per schedule
**What this PR does / why we need it**:
Currently, equiv hash is calculated per schedule, but also, per node. This is a potential cause of dragging integration test, see #58881
We should ensure this only happens once during scheduling of specific pod no matter how many nodes we have.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58989
**Special notes for your reviewer**:
**Release note**:
```release-note
Ensure euqiv hash calculation is per schedule
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Make predicate errors more human readable
**What this PR does / why we need it**:
Make predicate errors more human readable
Thanks.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #58546
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove unused func in FakeConfigurator of scheduler
**What this PR does / why we need it**:
Current scheduler `Configurator` interface looks like this:
```
type Configurator interface {
GetPriorityFunctionConfigs(priorityKeys sets.String) ([]algorithm.PriorityConfig, error)
GetPriorityMetadataProducer() (algorithm.PriorityMetadataProducer, error)
GetPredicateMetadataProducer() (algorithm.PredicateMetadataProducer, error)
GetPredicates(predicateKeys sets.String) (map[string]algorithm.FitPredicate, error)
GetHardPodAffinitySymmetricWeight() int32
GetSchedulerName() string
MakeDefaultErrorFunc(backoff *util.PodBackoff, podQueue core.SchedulingQueue) func(pod *v1.Pod, err error)
// Needs to be exposed for things like integration tests where we want to make fake nodes.
GetNodeLister() corelisters.NodeLister
GetClient() clientset.Interface
GetScheduledPodLister() corelisters.PodLister
Create() (*Config, error)
CreateFromProvider(providerName string) (*Config, error)
CreateFromConfig(policy schedulerapi.Policy) (*Config, error)
CreateFromKeys(predicateKeys, priorityKeys sets.String, extenders []algorithm.SchedulerExtender) (*Config, error)
}
```
Funcs `ResponsibleForPod` and `Run` once existed have been removed, so the funcs in `FakeConfigurator` should be removed accordingly.
**Special notes for your reviewer**:
/kind cleanup
/sig scheduling
**Release note**:
```release-note
NONE
```
This moves the equivalence hashing code from
algorithm/predicates/utils.go to core/equivalence_cache.go.
In the process, making the hashing function and hashing function factory
both injectable dependencies is removed.
This changes the equivalence class hashing function to use as inputs all
the Pod fields which are read by FitPredicates. Before we used a
combination of OwnerReference and PersistentVolumeClaim info, which was
a close approximation. The new method ensures that hashing remains
correct regardless of controller behavior.
The PVCSet field can be removed from equivalencePod because it is
implicitly included in the Volume list.
Tests are now broken.
Equivalence cache for CheckNodeConditionPred becomes invalid when
Node.Spec.Unschedulable changes. This can happen even if
Node.Status.Conditions does not change, so move the logic around.
This logic is covered by integration test
"test/integration/scheduler".TestUnschedulableNodes but equivalence
cache is currently skipped when test pods have no OwnerReference.
Automatic merge from submit-queue (batch tested with PRs 58595, 58689). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Checked node.Unscheulable in Toleration predicate.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58648
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix the wrong err print of assumepod
**What this PR does / why we need it**:
I think the err print is wrong, just opposite the original meaning.
/cc @timothysc
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
-Add scheduler optimization options, short circuit all predicates if …
…one predicate fails
Signed-off-by: Wang Guoliang <iamwgliang@gmail.com>
**What this PR does / why we need it**:
Short circuit all predicates if one predicate fails.
I think we can add a switch to control it, maybe some scenes do not need to know all the causes of failure, but also can get a great performance improvement; if you need to fully understand the reasons for the failure, and accept the current performance requirements, can maintain the current logic. It should expose this switch to the user.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56889 and #48186
**Special notes for your reviewer**:
@davidopp
**Release note**:
```
Allow scheduler set AlwaysCheckAllPredicates, short circuit all predicates if one predicate fails can greatly improve the scheduling performance.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improved readability for messages being logged
**What this PR does / why we need it**:
This improves the readability for messages seen by end-user. /cc @jwforres @bsalamat - For UX
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57152
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57266, 58187, 58186, 46245, 56509). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added metrics for predicate and priority evaluation
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#45972
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
- Remove string decode logic. It's not really helping to find the
conflict ports, and it's expensive to do encoding/decoding
- Not to parse the container ports information in predicate meta, use
straight []*v1.ContainerPort
- Use better data structure to search port conflict based on ip
addresses
- Collect scattered source code into common place
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.
Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
Balanced Resource Allocation policy can now be enabled to so that
host(s) with balanced resource usage would be preferred. The score
given by BRA also scales from 0 to 10 with 10 representing that the
resource usage is well balanced.
* Improper format specifier (e.g. %s for bools or %s for ints)
* More or less parameters than format specifiers
* Not calling a formatting function when it should have (e.g. Error() instead of Errorf())
Replaces the client public interface but leaves old references to "minions"
for a later refactor. Selects the path "nodes" for v1beta3 and "minions"
for older versions.
Also rename some to other names that make better reading. There are still a
bunch of "make" functions but they do things like assemble a string from parts
or build an array of things. It seemed that "make" there seemed fine. "New"
is for "constructors".