00bf292cdc
Automatic merge from submit-queue (batch tested with PRs 67042, 66480, 67053). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

ensure MatchNodeSelectorTerms() runs statelessly

**What this PR does**: Fix the sorting behavior in selector.go:

- move sorting from NewRequirement() out to String()
- add related unit tests
- add unit tests in one of the outer callers (pkg/apis/core/v1/helper)

**Why we need it**:

- Without this fix, the scheduler and the DaemonSet controller misbehave in some (corner) cases

**Which issue(s) this PR fixes**: Fixes #66298

**Special notes for your reviewer**:

The `nodeSelectorTerms` parameter of MatchNodeSelectorTerms() is a slice, which is fundamentally a {*elements, len, cap} tuple - i.e. it carries a pointer to the caller's underlying array. Inside that method, NodeSelectorRequirementsAsSelector() -> NewRequirement() is invoked, and `matchExpressions[*].values` is passed in and **modified** via `sort.Strings(vals)`. This causes the following DaemonSet pod to fall into an infinite create/delete loop:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: problem
spec:
  selector:
    matchLabels:
      app: sleeper
  template:
    metadata:
      labels:
        app: sleeper
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - 127.0.0.2
                - 127.0.0.1
      containers:
      - name: busybox
        image: busybox
        command: ["/bin/sleep", "7200"]
```

(The problem can be reliably reproduced on a local cluster started by `hack/local-up-cluster.sh`.)

The first time the DaemonSet YAML is handled, the apiserver persists it in etcd in its original form (the original order of values is kept: 127.0.0.2, 127.0.0.1). After that, the DaemonSet controller tries to schedule a pod, and it reuses the predicates logic of the scheduler component - where the values are **sorted** in place. This not only causes the pod to be created with the values in sorted order (127.0.0.1, 127.0.0.2), but also introduces a bug when updating the DaemonSet: internally the controller uses a "rawMessage" (the raw bytes of an object) to compute a hash that acts as the "controller-revision-hash" controlling revision rollingUpdate/rollback, so it keeps killing the "old" pod and spawning a "new" pod back and forth, falling into an infinite loop.

The issue exists in `master`, `release-1.11`, and `release-1.10`.

**Release note**:

```release-note
NONE
```
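To make the slice aliasing concrete, here is a minimal, self-contained Go sketch; the function names are illustrative and are not the actual Kubernetes code. It shows why sorting a slice parameter in place (as the old NewRequirement() did) mutates the caller's data, while sorting a private copy (as the fix does by moving the sort into String()) leaves the input untouched:

```go
package main

import (
	"fmt"
	"sort"
)

// mutatingSort mimics the pre-fix NewRequirement() behavior: sort.Strings
// reorders the caller's backing array in place, so the input is modified.
func mutatingSort(vals []string) {
	sort.Strings(vals)
}

// statelessSort mimics the fix (sorting moved into String()): it sorts a
// private copy, leaving the caller's slice untouched.
func statelessSort(vals []string) []string {
	sorted := make([]string, len(vals))
	copy(sorted, vals)
	sort.Strings(sorted)
	return sorted
}

func main() {
	values := []string{"127.0.0.2", "127.0.0.1"}

	fmt.Println(statelessSort(values)) // [127.0.0.1 127.0.0.2]
	fmt.Println(values)                // [127.0.0.2 127.0.0.1] -- original order preserved

	mutatingSort(values)
	fmt.Println(values) // [127.0.0.1 127.0.0.2] -- caller's data silently reordered
}
```

Once the controller's in-memory copy diverges byte-for-byte from the object persisted in etcd, the hash computed from the raw bytes changes, which is what drives the create/delete loop described above.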
# External Repository Staging Area
This directory is the staging area for packages that have been split out into their own repositories. The content here is periodically published to the respective top-level k8s.io repositories.
Repositories currently staged here:

- k8s.io/apiextensions-apiserver
- k8s.io/api
- k8s.io/apimachinery
- k8s.io/apiserver
- k8s.io/client-go
- k8s.io/kube-aggregator
- k8s.io/code-generator
- k8s.io/metrics
- k8s.io/sample-apiserver
- k8s.io/sample-controller
The code in the `staging/` directory is authoritative, i.e. it is the only copy of the code. You can modify it directly.
## Using staged repositories from Kubernetes code
Kubernetes code uses the repositories in this directory via symlinks from the `vendor/k8s.io` directory into this staging area. For example, when Kubernetes code imports a package from the `k8s.io/client-go` repository, that import resolves to `staging/src/k8s.io/client-go` relative to the project root:
```go
// pkg/example/some_code.go
package example

import (
	"k8s.io/client-go/dynamic" // resolves to staging/src/k8s.io/client-go/dynamic
)
```
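As a quick way to observe this resolution, here is a small sketch (not part of the repository) that inspects the symlink; it assumes it is run from the root of a Kubernetes checkout, and the target path in the comment is an assumption about a typical layout:

```go
// symlinkcheck.go - an illustrative sketch, assuming a Kubernetes checkout.
package main

import (
	"fmt"
	"os"
)

func main() {
	// vendor/k8s.io/client-go is expected to be a symlink into staging/.
	target, err := os.Readlink("vendor/k8s.io/client-go")
	if err != nil {
		fmt.Println("not a symlink (or not run from a Kubernetes checkout):", err)
		return
	}
	// Typically prints something like ../../staging/src/k8s.io/client-go.
	fmt.Println("vendor/k8s.io/client-go ->", target)
}
```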
Once the change-over to external repositories is complete, these repositories will actually be vendored from `k8s.io/<package-name>`.