Commit Graph

68 Commits (e85b81bbee098a7ec75cc894a8785867ec586798)

Author SHA1 Message Date
Kubernetes Submit Queue 0a2467d849
Merge pull request #63459 from resouer/fix-63427
Automatic merge from submit-queue (batch tested with PRs 63598, 63913, 63459, 63963, 60464). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Check nodeInfo before ecache predicate

**What this PR does / why we need it**:

During tests, `nodeInfo` is sometimes nil, which can cause the ecache predicate to fail with a nil pointer dereference.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63427

**Special notes for your reviewer**:

I am not sure how to reproduce the original issue yet; that is, it is not clear to me why and when `nodeInfo` becomes nil in tests, which is why I labeled this as WIP.

cc @bsalamat who may have more input.

**Release note**:

```release-note
NONE
```
2018-05-19 06:49:19 -07:00
Harry Zhang 7f01ce4ec0 Update generated bazel 2018-05-11 14:25:23 +08:00
Harry Zhang 0377c69aad Use simple cache instead of LRU
Update generated bazel

Use map instead
2018-05-11 14:25:17 +08:00
Harry Zhang 8df3ab75a4 Check nodeInfo before ecache 2018-05-06 22:42:20 +08:00
Jonathan Basseri 79d30b1ad6 Hide EquivalenceCache mutex from users.
Since the equiv. cache lock no longer needs to be held across multiple
method calls, move the locking inside and don't expose it to users.
2018-04-27 15:55:10 -07:00
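A minimal sketch of the pattern the commit above describes, where the lock is taken inside each method and is never exposed to callers. The `resultCache` type and its `Store` and `Lookup` methods are hypothetical illustrations, not the actual `EquivalenceCache` API.

```go
package main

import (
	"fmt"
	"sync"
)

// resultCache is a hypothetical stand-in for the equivalence cache.
// The mutex is unexported and acquired inside each method, so callers
// never lock or unlock it themselves.
type resultCache struct {
	mu      sync.RWMutex
	results map[string]bool
}

func newResultCache() *resultCache {
	return &resultCache{results: make(map[string]bool)}
}

// Store records a predicate result under key while holding the write lock.
func (c *resultCache) Store(key string, fit bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.results[key] = fit
}

// Lookup returns the stored result and whether an entry was present.
func (c *resultCache) Lookup(key string) (fit, ok bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	fit, ok = c.results[key]
	return fit, ok
}

func main() {
	c := newResultCache()
	c.Store("node-a/PodFitsResources", true)
	fit, ok := c.Lookup("node-a/PodFitsResources")
	fmt.Println(fit, ok)
}
```

Because every method acquires and releases the lock itself, a caller cannot forget to unlock or hold the lock across unrelated work.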
Jonathan Basseri b85184227d Rename exported methods on EquivalenceCache.
This changes two methods in EquivalenceCache to be unexported, because
they should no longer be called by users of this type. (Even users in
the same package!)
2018-04-27 15:55:10 -07:00
Jonathan Basseri 55662f26f1 Simplify logic in podFitsOnNode.
Use new (*EquivalenceCache).RunPredicate to simplify how we read and
update the equivalence cache items.
2018-04-27 15:55:10 -07:00
Jonathan Basseri e67b3225a4 Remove predicateResults map from podFitsOnNode.
The purpose of this map is to combine two predicate results before
writing to the equivalence cache. However, the branch that combines
results is unreachable.

1. Combining results happens in the second iteration of the outer loop.
2. There is only a second iteration when podsAdded is true.
3. We skip equiv. cache when podsAdded is true.
2018-04-27 15:55:10 -07:00
Jonathan Basseri ca6b312c97 Add RunPredicate to EquivalenceCache.
This method combines "lookup" and "update" into one operation. The
benefit is that this method call is very similar to running an ordinary
predicate, so callers can simplify their code.
2018-04-27 15:55:10 -07:00
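The "lookup plus update in one operation" idea from the commit above can be sketched as follows. This is an illustration under assumptions: the `predicateCache` type, the `runPredicate` signature, and the string key are hypothetical and do not reproduce the real `(*EquivalenceCache).RunPredicate`.

```go
package main

import (
	"fmt"
	"sync"
)

// predicate is a hypothetical predicate signature: it reports whether a
// pod fits, or returns an error.
type predicate func() (bool, error)

// predicateCache is a hypothetical cache of predicate results.
type predicateCache struct {
	mu      sync.Mutex
	results map[string]bool
}

// runPredicate returns the cached result for key if one exists. Otherwise
// it runs pred, stores the result, and returns it. Errors are not cached.
func (c *predicateCache) runPredicate(key string, pred predicate) (bool, error) {
	c.mu.Lock()
	fit, ok := c.results[key]
	c.mu.Unlock()
	if ok {
		return fit, nil
	}

	fit, err := pred()
	if err != nil {
		return false, err
	}

	c.mu.Lock()
	c.results[key] = fit
	c.mu.Unlock()
	return fit, nil
}

func main() {
	c := &predicateCache{results: make(map[string]bool)}
	calls := 0
	pred := func() (bool, error) { calls++; return true, nil }

	c.runPredicate("node-a/PodFitsResources", pred)
	c.runPredicate("node-a/PodFitsResources", pred)
	fmt.Println("predicate invocations:", calls)
}
```

From the caller's side the call looks like running an ordinary predicate, which is the simplification the commit message points to.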
Kubernetes Submit Queue 0cf3788419
Merge pull request #63174 from misterikkit/equivHash
Automatic merge from submit-queue (batch tested with PRs 62937, 63105, 63031, 63174). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Revert "Revert "Revert revert of equivalence class hash calculation in scheduler""

This reverts commit 4386751b5d.



**What this PR does / why we need it**:
This re-introduces the change from https://github.com/kubernetes/kubernetes/pull/58555 which changes how the scheduler computes equivalence classes of pods. I believe we have fixed the flakiness observed previously (https://github.com/kubernetes/kubernetes/issues/61512, https://github.com/kubernetes/kubernetes/issues/62921). I have run the test in question a few dozen times without a failure.

```bash
make test-integration WHAT="./test/integration/scheduler" KUBE_TEST_ARGS="-run TestPreemptionStarvation" GOFLAGS="-v"
```

/ref https://github.com/kubernetes/kubernetes/issues/58222

**Special notes for your reviewer**:
I had to resolve several merge conflicts. I think I resolved them correctly, but keep an eye out for anything silly.

**Release note**:

```release-note
NONE
```
/sig scheduling
2018-04-26 16:40:19 -07:00
Da K. Ma 2c10d15ae5 Do not schedule pod to the node under PID pressure.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-04-26 10:07:42 +08:00
Jonathan Basseri eace2d08d0 Revert "Revert "Revert revert of equivalence class hash calculation in scheduler""
This reverts commit 4386751b5d.
2018-04-25 16:11:59 -07:00
Jonathan Basseri dacc1a8d52 Check for old NodeInfo when updating equiv. cache.
Because the scheduler takes a snapshot of cache data at the start of
each scheduling cycle, updates to the equivalence cache should be
skipped if there was a cache update during the cycle.

If the current NodeInfo becomes stale while we evaluate predicates, we
will not write any results into the equivalence cache. We will still use
the results for the current scheduling cycle, though.
2018-04-25 10:18:40 -07:00
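A minimal sketch of skipping the cache write when the node data went stale during the cycle. The commit message does not say how staleness is detected; the `generation` counter below is an assumption made for illustration, and `nodeSnapshot`, `cycleCache`, and `updateIfFresh` are hypothetical names.

```go
package main

import "fmt"

// nodeSnapshot is a hypothetical, trimmed-down view of per-node data.
// The generation counter is an assumption: it is presumed to change
// whenever the node's cached data is updated.
type nodeSnapshot struct {
	name       string
	generation int64
}

// cycleCache is a hypothetical cache of predicate results.
type cycleCache struct {
	results map[string]bool
}

// updateIfFresh writes fit only when the snapshot taken at the start of
// the cycle still matches the current node data. It reports whether the
// write happened. The caller can still use fit for the current cycle.
func (c *cycleCache) updateIfFresh(key string, fit bool, snapshot, current nodeSnapshot) bool {
	if snapshot.generation != current.generation {
		return false // stale snapshot: do not cache the result
	}
	c.results[key] = fit
	return true
}

func main() {
	c := &cycleCache{results: make(map[string]bool)}
	snap := nodeSnapshot{name: "node-a", generation: 7}

	unchanged := nodeSnapshot{name: "node-a", generation: 7}
	changed := nodeSnapshot{name: "node-a", generation: 8}

	fmt.Println(c.updateIfFresh("node-a/first", true, snap, unchanged))
	fmt.Println(c.updateIfFresh("node-a/second", true, snap, changed))
}
```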
Jonathan Basseri 02d657827c Test race condition in equivalence cache.
Add a unit test that invalidates equivalence cache during a scheduling
cycle. This exercises the bug described in
https://github.com/kubernetes/kubernetes/issues/62921
2018-04-25 10:18:40 -07:00
Harry Zhang 4f0bd4121e Disable pod preemption by config 2018-04-12 21:11:51 -07:00
Harry Zhang 083684d771 Add test to verify preempt ignore 2018-04-04 16:28:15 -07:00
Harry Zhang 7f04129736 Add Ignorable flag to extender
Ignore extender in generic scheduler

Add test to verify the ignorable flag

Fix warning msg
2018-03-30 15:10:31 -07:00
Harry Zhang 202c6b68ee Use inline func to ensure unlock is executed 2018-03-25 11:54:16 -07:00
Bobby (Babak) Salamat 4386751b5d
Revert "Revert revert of equivalence class hash calculation in scheduler" 2018-03-23 17:34:49 -07:00
Kubernetes Submit Queue 2d864f2359
Merge pull request #60953 from anfernee/sched-cache-resync
Automatic merge from submit-queue (batch tested with PRs 60919, 60953, 61085, 61083, 60971). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Sched cache resync

**What this PR does / why we need it**:  Scheduler cache comparer
    
    A debug tool that collects resources from the API server and compares them
    with the scheduler cache. It currently compares only the node list, but
    it should be easy to extend. The comparison is triggered by signal SIGUSR2,
    for example by running
    
      kill -12 ${SCHED_PID}
    
    The comparison result goes to the scheduler log.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Towards #60860

**Special notes for your reviewer**: @bsalamat 

**Release note**:
```release-note
None
```
2018-03-20 20:34:28 -07:00
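The signal-triggered comparison described in the PR above could look roughly like this. It is a sketch for Unix-like systems only, and `compareNodes` and the hard-coded node lists are hypothetical; the real tool reads from the API server and the scheduler cache. For demonstration, the program signals itself rather than waiting for an external `kill -12` (signal 12 is SIGUSR2 on Linux).

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// compareNodes reports node names present in apiNodes but absent from
// cachedNodes (missed), and names present only in cachedNodes (redundant).
func compareNodes(apiNodes, cachedNodes []string) (missed, redundant []string) {
	inCache := make(map[string]bool, len(cachedNodes))
	for _, n := range cachedNodes {
		inCache[n] = true
	}
	inAPI := make(map[string]bool, len(apiNodes))
	for _, n := range apiNodes {
		inAPI[n] = true
		if !inCache[n] {
			missed = append(missed, n)
		}
	}
	for _, n := range cachedNodes {
		if !inAPI[n] {
			redundant = append(redundant, n)
		}
	}
	return missed, redundant
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR2)

	// Demonstration only: send SIGUSR2 to this process.
	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR2); err != nil {
		fmt.Println("kill:", err)
		return
	}

	<-sigs // wait for the trigger, then run one comparison
	missed, redundant := compareNodes(
		[]string{"node-a", "node-b"}, // hypothetical API server view
		[]string{"node-b", "node-c"}, // hypothetical scheduler cache view
	)
	fmt.Printf("missed=%v redundant=%v\n", missed, redundant)
}
```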
Kubernetes Submit Queue 7a273aa85d
Merge pull request #60796 from ravisantoshgudimetla/extender-log-fix
Automatic merge from submit-queue (batch tested with PRs 60898, 60912, 60753, 61002, 60796). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Change to fix scheduler extender error return message

**What this PR does / why we need it**:
Currently, the scheduler always logs the extender endpoint without the verb (such as "filter" or "prioritize"). With this change, the log message also includes the verb, which helps with debugging.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-20 17:37:20 -07:00
Kubernetes Submit Queue 9a3b0bd74f
Merge pull request #60753 from resouer/equiv-hash
Automatic merge from submit-queue (batch tested with PRs 60898, 60912, 60753, 61002, 60796). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Revert revert of equivalence class hash calculation in scheduler

**What this PR does / why we need it**:

NOTE: This is a revert revert of https://github.com/kubernetes/kubernetes/pull/58555

But since the original PR has been changed, I have to copy the original changes and resend this new PR. See: https://github.com/kubernetes/kubernetes/pull/58555#issuecomment-364345972

I kept @misterikkit's change as the first commit in the history (using GitHub's co-author feature).

We decided to revert the revert because #58989 has been fixed, which should reduce the time consumed by the integration tests.

**But** we should still watch the integration tests for frequent timeouts.

**Special notes for your reviewer**:

**Release note**:

```release-note
Improve equivalence class hash calculation in scheduler
```
2018-03-20 17:37:14 -07:00
Kubernetes Submit Queue 14e3efe26a
Merge pull request #58717 from resouer/extender-interface
Automatic merge from submit-queue (batch tested with PRs 60759, 60531, 60923, 60851, 58717). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Implement preemption for extender with a verb and new interface

**What this PR does / why we need it**:

This is an alternative way of implementing #51656

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #51656

**Special notes for your reviewer**:

We will also want to compare with #56296 to see which one is the best solution. See: https://github.com/kubernetes/kubernetes/pull/56296#discussion_r163381235

cc @ravigadde @bsalamat 

**Release note**:

```release-note
Implement preemption for extender with a verb and new interface
```
2018-03-20 15:34:41 -07:00
Kubernetes Submit Queue c64f19dd1b
Merge pull request #59728 from wgliang/master.append
Automatic merge from submit-queue (batch tested with PRs 59740, 59728, 60080, 60086, 58714). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

more concise to merge the slice

**What this PR does / why we need it**:
Use a more concise way to merge the slices.

**Special notes for your reviewer**:
2018-03-19 21:34:30 -07:00
Yongkun Anfernee Gui cda749c237 Pod comparer should count pods in scheduling queue
The pods in the scheduler cache include both the scheduled pods and those
in the scheduling queue that are not scheduled yet. This commit takes the
second group of pods into account when comparing the cache.
2018-03-14 10:29:42 -07:00
Yongkun Anfernee Gui 5bad68ac58 Use pod UID as cache key instead of namespace/name
A UID uniquely identifies a pod across lifecycles, while the same
namespace/name can refer to two different pods across lifecycles. This
could result in tricky scheduler bugs.

Fixes #60966
2018-03-13 10:25:37 -07:00
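A small illustration of why the key choice matters. The `pod` struct and the two key functions are hypothetical; they only model a pod that is deleted and recreated with the same namespace and name but a new UID.

```go
package main

import "fmt"

// pod is a hypothetical, minimal pod identity.
type pod struct {
	namespace string
	name      string
	uid       string
}

// nameKey builds the namespace/name style of key.
func nameKey(p pod) string { return p.namespace + "/" + p.name }

// uidKey builds the UID style of key.
func uidKey(p pod) string { return p.uid }

func main() {
	// The same namespace/name, recreated: two distinct pods.
	first := pod{namespace: "default", name: "web-0", uid: "uid-1"}
	second := pod{namespace: "default", name: "web-0", uid: "uid-2"}

	byName := map[string]pod{}
	byUID := map[string]pod{}
	for _, p := range []pod{first, second} {
		byName[nameKey(p)] = p // the second pod overwrites the first
		byUID[uidKey(p)] = p   // each pod keeps its own entry
	}

	fmt.Println("entries keyed by namespace/name:", len(byName))
	fmt.Println("entries keyed by UID:", len(byUID))
}
```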
Harry Zhang 5cc841a337 Use inline func to fix deadlock 2018-03-09 10:57:03 -08:00
Harry Zhang 7a7f9dccd0 [PATCH] Use nodename as key 2018-03-07 22:10:47 -08:00
ravisantoshgudimetla 1c416b1c39 Change to fix logging 2018-03-05 11:15:33 -05:00
Harry Zhang 4e5901f947 Fix golints of equiv class 2018-03-04 17:12:09 -08:00
Harry Zhang c292af8f7b Use const in equiv class 2018-03-04 14:35:57 -08:00
Jonathan Basseri f5ab6d5ad4 [PATCH] Fix equiv. cache invalidation of Node condition.
Equivalence cache for CheckNodeConditionPred becomes invalid when
Node.Spec.Unschedulable changes. This can happen even if
Node.Status.Conditions does not change, so move the logic around.

This logic is covered by integration test
"test/integration/scheduler".TestUnschedulableNodes but equivalence
cache is currently skipped when test pods have no OwnerReference.

Add benchmark for equivalence hashing.

Change equivalence hash function.

This changes the equivalence class hashing function to use as inputs all
the Pod fields which are read by FitPredicates. Before we used a
combination of OwnerReference and PersistentVolumeClaim info, which was
a close approximation. The new method ensures that hashing remains
correct regardless of controller behavior.

The PVCSet field can be removed from equivalencePod because it is
implicitly included in the Volume list.

Tests are now broken.

Move equivalence class hash code.

This moves the equivalence hashing code from
algorithm/predicates/utils.go to core/equivalence_cache.go.

In the process, the hashing function and the hashing function factory
stop being injectable dependencies.

Fix equivalence cache hash tests.

Co-authored-by: Jonathan Basseri <misterikkit@google.com>
Co-authored-by: Harry Zhang <resouer@gmail.com>
2018-03-04 13:02:28 -08:00
Harry Zhang b62d82422d Fix golints in extender 2018-03-02 17:12:02 -08:00
Harry Zhang 71603f2f85 Add preemption in scheduler extender
Add verb and preemption for scheduler extender

Update bazel

Use simple preemption in extender

Use node name instead of v1.Node

Fix support method

Fix preemption dup

Remove unneeded logic

Remove nodeInfo from param to extender

Update bazel for scheduler types

Mock extender cache with nodeInfo

Add nodeInfo as extender cache

Choose node name or node based on cache flag

Always return meta victims in result
2018-03-02 17:12:02 -08:00
Shijun Qin 158257473a
Fix a grammatical error in a comment
Fix a grammatical error in a comment in scheduler's code. We should use a word's plural form after "one of".
2018-03-02 21:30:44 +08:00
Yang Guo 8d880506fe Support cluster-level extended resources in kubelet and kube-scheduler
Co-authored-by: Yang Guo <ygg@google.com>
Co-authored-by: Chun Chen <chenchun.feed@gmail.com>
2018-02-27 17:25:30 -08:00
Kubernetes Submit Queue 49a1478839
Merge pull request #60263 from tossmilestone/reuse-minNodes
Automatic merge from submit-queue (batch tested with PRs 60106, 59510, 60263, 60063, 59088). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Reuse the `min*Nodes` slices in order to save GC time

**What this PR does / why we need it**:
Reuse the `min*Nodes` slices to save GC time when executing `pickOneNodeForPreemption`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #59748

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
2018-02-23 02:59:47 -08:00
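The slice-reuse technique named in the PR above can be sketched in isolation. This is not the code of `pickOneNodeForPreemption`; `indicesOfMin` and its parameters are hypothetical. The point is that truncating with `[:0]` keeps the backing array, so repeated filtering passes do not allocate a new slice each time.

```go
package main

import "fmt"

// indicesOfMin returns the indices of the lowest value in scores. It
// writes into scratch's backing array instead of allocating, as long as
// scratch has enough capacity.
func indicesOfMin(scores []int, scratch []int) []int {
	scratch = scratch[:0] // keep the backing array, drop the contents
	if len(scores) == 0 {
		return scratch
	}
	lowest := scores[0]
	for i, s := range scores {
		switch {
		case s < lowest:
			lowest = s
			scratch = append(scratch[:0], i) // new minimum: restart the list
		case s == lowest:
			scratch = append(scratch, i)
		}
	}
	return scratch
}

func main() {
	// One scratch buffer is allocated up front and reused for each pass.
	scratch := make([]int, 0, 8)

	scratch = indicesOfMin([]int{3, 1, 2, 1}, scratch)
	fmt.Println(scratch)

	scratch = indicesOfMin([]int{5, 5, 9}, scratch)
	fmt.Println(scratch)
}
```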
tossmilestone 5a083f2038 Reuse the "min*Nodes" slices to save the GC time. 2018-02-23 14:16:19 +08:00
Bobby (Babak) Salamat 08406c3f6e Make the `Unschedulable Queue` interface private 2018-02-21 13:53:40 -08:00
Bobby (Babak) Salamat 5a00c42848 Minor improvements to scheduling queue 2018-02-21 12:57:28 -08:00
Kubernetes Submit Queue fe4b28cdf0
Merge pull request #60062 from bsalamat/sched_q_imprv
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Improve scheduling queue's logic

**What this PR does / why we need it**:
Improves scheduling queue's code based on some recent comments on [the original PR](https://github.com/kubernetes/kubernetes/pull/55109).
This PR does not fix any bugs or make any change of behavior.

**Release note**:

```release-note
NONE
```

/sig scheduling
2018-02-20 20:00:25 -08:00
Bobby (Babak) Salamat bba9b12d0c Improve scheduling queue's logic 2018-02-20 17:20:55 -08:00
Jeff Grafton ef56a8d6bb Autogenerated: hack/update-bazel.sh 2018-02-16 13:43:01 -08:00
Kubernetes Submit Queue 01bd3c4b74
Merge pull request #59734 from mlmhl/format_imports
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Format some import statements in scheduler pkg

**What this PR does / why we need it**:

As the title says, apply `goimports` on some files under `pkg/scheduler` pkg.

**Release note**:

```release-note
NONE
```
2018-02-13 08:04:15 -08:00
Kubernetes Submit Queue ba791275ce
Merge pull request #59671 from bsalamat/sched_queue_perf
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Improve performance of scheduling queue by adding a hash map to track all pods with a nominatedNodeName

**What this PR does / why we need it**:
Our investigations show that there is a performance regression in the new scheduling queue. The queue is not enabled by default; it is enabled only when "priority and preemption", an alpha feature, is enabled. This PR is an important performance improvement for those who want to use priority and preemption in larger clusters.
The PR adds a hash table to track nominated Pods so that finding such Pods is faster.
Other than improving performance, we don't expect this PR to change the behavior of the scheduler.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

ref/ #56032
ref/ #57471 

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/sig scheduling
2018-02-13 00:07:58 -08:00
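A minimal sketch of tracking nominated pods in a hash map keyed by node name, so that finding the pods nominated for a node is a single map lookup rather than a scan of the queue. The `nominatedPods` type and its methods are hypothetical and use plain string UIDs; the real scheduling queue's data structures are not shown in this log.

```go
package main

import "fmt"

// nominatedPods is a hypothetical index from node name to the UIDs of
// pods nominated to run on that node.
type nominatedPods struct {
	byNode map[string][]string
}

func newNominatedPods() *nominatedPods {
	return &nominatedPods{byNode: make(map[string][]string)}
}

// add records that podUID is nominated for node.
func (n *nominatedPods) add(podUID, node string) {
	n.byNode[node] = append(n.byNode[node], podUID)
}

// remove drops podUID from node's list and deletes empty entries.
func (n *nominatedPods) remove(podUID, node string) {
	pods := n.byNode[node]
	for i, uid := range pods {
		if uid == podUID {
			n.byNode[node] = append(pods[:i], pods[i+1:]...)
			break
		}
	}
	if len(n.byNode[node]) == 0 {
		delete(n.byNode, node)
	}
}

// forNode returns the UIDs nominated for node.
func (n *nominatedPods) forNode(node string) []string {
	return n.byNode[node]
}

func main() {
	idx := newNominatedPods()
	idx.add("uid-1", "node-a")
	idx.add("uid-2", "node-a")
	idx.remove("uid-1", "node-a")
	fmt.Println(idx.forNode("node-a"))
}
```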
Kubernetes Submit Queue ab2e1cb02a
Merge pull request #59479 from tossmilestone/avoid-ecahe-update-race
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Avoid race condition when updating equivalence cache

**What this PR does / why we need it**:
Hold the ecache lock while updating the ecache each time a predicate runs, to avoid a race condition.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fix #58507 

**Special notes for your reviewer**:
None

**Release note**:

```release-note
None
```
2018-02-12 16:38:07 -08:00
Bobby (Babak) Salamat df5fc09411 compare Pods by UID, not by name and namespace 2018-02-12 10:13:13 -08:00
mlmhl b3fff71161 format some import statements in scheduler pkg 2018-02-12 09:04:00 +08:00
Wang Guoliang 31aad75316 more concise to merge the array 2018-02-11 21:27:11 +08:00
Bobby (Babak) Salamat 69d62a9288 Improve performance of scheduling queue by adding a hash map to track all pods with a nominatedNodeName. 2018-02-09 14:07:29 -08:00