Automatic merge from submit-queue (batch tested with PRs 61378, 60915, 61499, 61507, 61478). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Capture pod startup phases as metrics
Learning from https://github.com/kubernetes/kubernetes/issues/60589, we should also start collecting and graphing sub-parts of pod-startup latency.
/sig scalability
/kind feature
/priority important-soon
/cc @wojtek-t
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 61378, 60915, 61499, 61507, 61478). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Bump image for ingress downgrade test
**What this PR does / why we need it**:
Bumps the image to the latest released version. This should also fix the failing test.
/assign @bowei
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 61378, 60915, 61499, 61507, 61478). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix "Garbage collector should support cascading deletion of custom resources" flake
**What this PR does / why we need it**:
These webhook configuration objects were intended to have no effect on admission requests, and are just used to ensure that webhook configuration objects can always be deleted.
However, I think they are causing an entirely separate test to flake (Garbage collector should support cascading deletion of custom resources) because of the interaction between these two webhooks, which are run on all admission requests, and #61355 (Unable to create/update CRD when mutating webhook configured) when the two tests are run at around the same time.
This PR will make the effects of the webhook e2e test on other tests easier to reason about.
The flakiness is being caused by a legitimate failure of the underlying code (That CRDs currently don't work with webhooks), but not a failure which either of "Garbage collector should support cascading deletion of custom resources" or "AdmissionWebhooks Should not be able to prevent deleting validating-webhook-configurations or mutating-webhook-configurations" is supposed to reliably catch.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59997
**Release note**:
```release-note
NONE
```
/sig api-machinery
/kind flake
Automatic merge from submit-queue (batch tested with PRs 61453, 61393, 61379, 61373, 61494). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use inner volume name instead of outer volume name for subpath directory
**What this PR does / why we need it**:
Fixes volume reconstruction for PVCs with subpath
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61372
**Special notes for your reviewer**:
**Release note**:
```release-note
ACTION REQUIRED: In-place node upgrades to this release from versions 1.7.14, 1.8.9, and 1.9.4 are not supported if using subpath volumes with PVCs. Such pods should be drained from the node first.
```
Automatic merge from submit-queue (batch tested with PRs 61453, 61393, 61379, 61373, 61494). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixing kubemci conformance tests
Ref https://github.com/GoogleCloudPlatform/k8s-multicluster-ingress/issues/131
Changes:
* Add a static ip annotation while running the tests for kubemci. kubemci requires the IP to be preallocated.
* Add a default backend service. I have added it in the spec directly, so the change will be for ingress controller as well which should be fine. kubemci requires users to specify a default backend service that they need to ensure exists in all clusters.
* Disabled update SSL cert test for kubemci since it does not support that.
* Minor logging fixes.
Verified by running the tests locally that they now pass.
```
$ KUBECONFIG=~/.kube/config KUBE_MASTER_IP="<IP>" go run hack/e2e.go -- --test --test_args="--ginkgo.focus=kubemci"
• [SLOW TEST:629.179 seconds]
Loadbalancing: L7
/usr/local/google/home/nikhiljindal/code/src/github.com/kubernetes/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/network/framework.go:22
GCE [Slow] [Feature:kubemci]
/usr/local/google/home/nikhiljindal/code/src/github.com/kubernetes/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/network/ingress.go:604
should conform to Ingress spec
/usr/local/google/home/nikhiljindal/code/src/github.com/kubernetes/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/network/ingress.go:637
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSMar 19 19:45:59.242: INFO: Running AfterSuite actions on all node
Mar 19 19:45:59.242: INFO: Running AfterSuite actions on node 1
Ran 1 of 820 Specs in 631.245 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 819 Skipped PASS
Ginkgo ran 1 suite in 10m32.602848558s
Test Suite Passed
```
cc @G-Harmon @MrHohn
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 61396, 61321, 61443, 60911, 61461). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Change pods memory boundary.
Node e2e tests are running in parallel. There might be many pods running during summary api test. `20mb` is a bit too low.
@dashpole
/cc @kubernetes/sig-node-pr-reviews
Signed-off-by: Lantao Liu <lantaol@google.com>
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
none
```
Automatic merge from submit-queue (batch tested with PRs 60980, 61273, 60811, 61021, 61367). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use apps/v1 ReplicaSet in controller and tests.
This updates the RS/RC controller and RS integration/e2e tests to use apps/v1 ReplicaSet, as part of #55714.
It does *not* update the Deployment controller, nor its integration/e2e tests, to use apps/v1 ReplicaSet. That will be done in a separate PR (#61419) because Deployment has many more tendrils embedded throughout the system.
```release-note
Conformance: ReplicaSet must be supported in the `apps/v1` version.
```
/assign @janetkuo
Automatic merge from submit-queue (batch tested with PRs 60980, 61273, 60811, 61021, 61367). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added e2e test for local volume provisioner discovery of new mountpoints while running.
**What this PR does / why we need it**:
This PR adds an e2e test for the local volume provisioner leveraging of the MountPropogation feature (https://github.com/kubernetes/kubernetes/pull/46444) to discover newly created mountpoints. The MountPropogation feature is now in beta (https://github.com/kubernetes/kubernetes/blob/master/pkg/features/kube_features.go#L146).
**Which issue(s) this PR fixes**:
Fixes https://github.com/kubernetes/kubernetes/issues/58416
**Special notes for your reviewer**:
```
KUBE_FEATURE_GATES="BlockVolume=true" NUM_NODES=1 go run hack/e2e.go -- --up
go run hack/e2e.go -- --test --test_args='--ginkgo.focus=PersistentVolumes-local.*Local.*volume.*provisioner'
```
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60793, 61181, 61267, 61252, 61334). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Refactor disruptive tests to use more volume types
Refactor storage disruptive tests to use different volume types. Also mark existing tests to be NFS specific. Fixes https://github.com/kubernetes/kubernetes/issues/61150
cc @jeffvance @jingxu97
/sig storage
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 60632, 60806, 59471, 61251, 61013). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove DNS service from kubectl comformance test
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #https://github.com/kubernetes/kubernetes/issues/61177
**Special notes for your reviewer**:
/assign @pwittrock
/assign @soltysh
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59536, 61104, 61030, 59013, 61169). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add rolling update daemonset existing pod adoption integration test
**What this PR does / why we need it**:
This PR adds rolling update DaemonSet existing pod adoption integration test. It also shifts all helper functions to a new utility file.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
xref #52191
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59536, 61104, 61030, 59013, 61169). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
stop using AlwaysAdmit admission
`AlwaysAdmit ` was deprecated, and stop using it in test.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60919, 60953, 61085, 61083, 60971). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove unused `pkg/api/unversioned`
**What this PR does / why we need it**:
clean code, see #61084
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61084
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60759, 60531, 60923, 60851, 58717). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Implement preemption for extender with a verb and new interface
**What this PR does / why we need it**:
This is an alternative way of implementing #51656
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#51656
**Special notes for your reviewer**:
We will also want to compare with #56296 to see which one is the best solution. See: https://github.com/kubernetes/kubernetes/pull/56296#discussion_r163381235
cc @ravigadde @bsalamat
**Release note**:
```release-note
Implement preemption for extender with a verb and new interface
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding details to Conformance Tests using RFC 2119 standards.
This PR is part of the conformance documentation. This is to provide more formal specification using RFC 2119 keywords to describe the test so that who ever is running conformance tests do not have to go through the code to understand why and what is tested.
The documentation information added here into each of the tests eventually result into a document which is currently checked in at location https://github.com/cncf/k8s-conformance/blob/master/docs/KubeConformance-1.9.md
I would like to have this PR reviewed for v1.10 as I consider it important to strengthen the conformance documents.
Automatic merge from submit-queue (batch tested with PRs 60574, 60666, 60831, 60877, 60357). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove potential sources of flakes for kms_transformation_test.go.
**What this PR does / why we need it**:
Remove potential sources for flakes in TestKMSPlugin test.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
#60614
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59637, 60611, 60788, 60489, 60687). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix todo: use a better way to keep this label unique in the test
**What this PR does / why we need it**:
fix todo: use a better way to keep this label unique in the test in test/e2e/apimachinery/garbage_collector.go
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60457, 60331, 54970, 58731, 60562). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add e2e test for watch
**What this PR does / why we need it**:
Currently watch is only tested by kubectl tests, but all clients of kubernetes should be able to reliably watch resources for changes. This test should be able to accommodate testing watch for custom resources without many changes.
/sig api-machinery
```release-note
Added e2e test for watch
```
Automatic merge from submit-queue (batch tested with PRs 60457, 60331, 54970, 58731, 60562). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
tests: e2e: empty msg from channel other than stdout should be non-fatal
Currently, if the exec websocket encounters a message that is not in the stdout stream, it immediately fails. However it also currently requests the stderr steam in the query params. There doesn't seem to be any guarantee that we don't get an empty message on the stderr stream.
Requesting the stderr stream in the query is desirable if, for some reason, something in the container fails and writes to stderr.
However, we do not need fail the test if we get an empty message on another stream. If the message is not empty, then that _does_ indicate and error and we should fail.
This is the situation we are currently observing with docker 1.13 in the origin CI https://github.com/openshift/origin/issues/18726
@derekwaynecarr @smarterclayton @gabemontero @liggitt @deads2k
/sig node
Automatic merge from submit-queue (batch tested with PRs 59740, 59728, 60080, 60086, 58714). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
vSphere: Minimize property collection via Finder
The 'All' parameter of the 'NewFinder' function controls property collection while searching the inventory.
When 'All' is set to 'false', Finder collects the minimal set of object properties required to search inventory.
When 'All' is set to 'true', Finder collects *all* object properties, which are *not* required to search inventory.
Setting 'All' to 'true' is only useful when inspecting all properties of an object,
such as by certain govc commands when the '-json' or '-dump' flags are specified.
Changing All=false in VCP minimizes the SOAP payload size and marshalling required on both sides, without impacting any functionality.
**What this PR does / why we need it**:
Changing All=false in VCP minimizes the SOAP payload size and marshalling required on both sides, without impacting any functionality.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59740, 59728, 60080, 60086, 58714). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
more concise to merge the slice
**What this PR does / why we need it**:
more concise to merge the slice
**Special notes for your reviewer**:
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix test images
These commits fix volume_io tests for iSCSI and Ceph RBD. Both server images were quite old (Fedora 22), so I updated them to ~~something more stable (CentOS 7) and to newer Ceph (Jewel, 10.2.7).~~ something newer (Fedora 26).
The most important fix is that the test volumes have 120 MB so volume_io test can actually run - the tests put 100MB file to the volume to check its persistence.
When mount containers in #53440 are merged I'll try to run the tests regularly with every PR (or merge) so we catch regressions quickly.
```release-note
NONE
```
/sig testing
/sig storage
/assign @jeffvance
Fixes: #56725
Automatic merge from submit-queue (batch tested with PRs 61284, 61119, 61201). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix creation of subpath with SUID/SGID directories.
SafeMakeDir() should apply SUID/SGID/sticky bits to the directory it creates.
Fixes#61283
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added unschedulable taint
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #59194; fixes#61050
**Release note**:
```release-note
When `TaintNodesByCondition` enabled, added `node.kubernetes.io/unschedulable:NoSchedule`
taint to the node if `spec.Unschedulable` is true.
When `ScheduleDaemonSetPods` enabled, `node.kubernetes.io/unschedulable:NoSchedule`
toleration is added automatically to DaemonSet Pods; so the `unschedulable` field of
a node is not respected by the DaemonSet controller.
```
Automatic merge from submit-queue (batch tested with PRs 61118, 60579). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Increase apiserver mem-threshold in density test
Ref: https://github.com/kubernetes/kubernetes/issues/60500#issuecomment-372682659 (fixes part of that issue)
/sig scalability
/kind bug
/priority important-soon
/cc @wojtek-t
/cc @crassirostris (for the release-note)
```release-note
Audit logging with buffering enabled can increase apiserver memory usage (e.g. up to 200MB in 100-node cluster). The increase is bounded by the buffer size (configurable). Ref: issue #60500
```
Automatic merge from submit-queue (batch tested with PRs 61111, 61069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix subpath e2e tests on multizone cluster.
Use dynamically provisioned PV to run GCE PD tests. This will make sure that the pod is scheduled to the right zone and GCE PD can be attached to a node.
**Which issue(s) this PR fixes**:
Fixes#61101
**Release note**:
```release-note
NONE
```
/sig storage
@msau42 @verult
Automatic merge from submit-queue (batch tested with PRs 60737, 60739, 61080, 60968, 60951). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix broken gke regional logging test.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60882
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60737, 60739, 61080, 60968, 60951). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Detect backsteps correctly in base path detection
Avoids false positives with atomic writer `..<timestamp>` directories
Fixes#61076
/assign @msau42 @jsafrane
```release-note
Fix a regression that prevented using `subPath` volume mounts with secret, configMap, projected, and downwardAPI volumes
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Bump to etcd 3.1.12 to pick up critical fix
etcd [3.1.12](https://github.com/coreos/etcd/releases/tag/v3.1.12) (as well as 3.2.17 and 3.3.2) was released yesterday to fix a bug critical to kubernetes:
Fix [mvcc "unsynced" watcher restore operation](https://github.com/coreos/etcd/pull/9297).
- "unsynced" watcher is watcher that needs to be in sync with events that have happened.
- That is, "unsynced" watcher is the slow watcher that was requested on old revision.
- "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- Which possibly causes [missing events from "unsynced" watchers](https://github.com/coreos/etcd/issues/9086).
This will be backported to 1.9 as well.
Release note:
```release-note
Upgrade the default etcd server version to 3.1.12 to pick up critical etcd "mvcc "unsynced" watcher restore operation" fix.
```
cc @gyuho @wojtek-t @shyamjvs @timothysc @jdumars
Use dynamically provisioned PV to run GCE PD tests. This will make sure
that the pod is scheduled to the right zone and GCE PD can be attached
to a node.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes the races around devicemanager Allocate() and endpoint deletion.
There is a race in predicateAdmitHandler Admit() that getNodeAnyWayFunc()
could get Node with non-zero deviceplugin resource allocatable for a
non-existing endpoint. That race can happen when a device plugin fails,
but is more likely when kubelet restarts as with the current registration
model, there is a time gap between kubelet restart and device plugin
re-registration. During this time window, even though devicemanager could
have removed the resource initially during GetCapacity() call, Kubelet
may overwrite the device plugin resource capacity/allocatable with the
old value when node update from the API server comes in later. This
could cause a pod to be started without proper device runtime config set.
To solve this problem, introduce endpointStopGracePeriod. When a device
plugin fails, don't immediately remove the endpoint but set stopTime in
its endpoint. During kubelet restart, create endpoints with stopTime set
for any checkpointed registered resource. The endpoint is considered to be
in stopGracePeriod if its stoptime is set. This allows us to track what
resources should be handled by devicemanager during the time gap.
When an endpoint's stopGracePeriod expires, we remove the endpoint and
its resource. This allows the resource to be exported through other channels
(e.g., by directly updating node status through API server) if there is such
use case. Currently endpointStopGracePeriod is set as 5 minutes.
Given that an endpoint is no longer immediately removed upon disconnection,
mark all its devices unhealthy so that we can signal the resource allocatable
change to the scheduler to avoid scheduling more pods to the node.
When a device plugin endpoint is in stopGracePeriod, pods requesting the
corresponding resource will fail admission handler.
Tested:
Ran GPUDevicePlugin e2e_node test 100 times and all passed now.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/60176
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes the races around devicemanager Allocate() and endpoint deletion.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[e2e service] Fix CleanupGCEResources for regional test
*What this PR does / why we need it**:
From https://k8s-testgrid.appspot.com/google-gke-staging#gke-staging-1-8-1-9-upgrade-regional-cluster&width=20, regional cluster test is failing because the GCE resource cleanup function attempts to parse region from `--gcp-zone` while regional cluster only set `--gcp-region`.
This PR pipes region into the cleanup function as well. This will need to be cherrypicked to 1.8 and 1.9.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
/assign @bowei @wojtek-t
cc @nikhiljindal to see if there is anything should be fixed for federation.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[e2e service] Move apiserver restart validation logic into util
**What this PR does / why we need it**:
Follow up of #60906, on GKE apiserver pod is invisible on k8s, hence test is failing.
This PR bakes the restart validation logic into the util function instead so it could be env-awared.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60761
**Special notes for your reviewer**:
Sorry for the noise.
/assign @rramkumar1 @bowei
cc @krzyzacy
**Release note**:
```release-note
NONE
```
There is a race in predicateAdmitHandler Admit() that getNodeAnyWayFunc()
could get Node with non-zero deviceplugin resource allocatable for a
non-existing endpoint. That race can happen when a device plugin fails,
but is more likely when kubelet restarts as with the current registration
model, there is a time gap between kubelet restart and device plugin
re-registration. During this time window, even though devicemanager could
have removed the resource initially during GetCapacity() call, Kubelet
may overwrite the device plugin resource capacity/allocatable with the
old value when node update from the API server comes in later. This
could cause a pod to be started without proper device runtime config set.
To solve this problem, introduce endpointStopGracePeriod. When a device
plugin fails, don't immediately remove the endpoint but set stopTime in
its endpoint. During kubelet restart, create endpoints with stopTime set
for any checkpointed registered resource. The endpoint is considered to be
in stopGracePeriod if its stoptime is set. This allows us to track what
resources should be handled by devicemanager during the time gap.
When an endpoint's stopGracePeriod expires, we remove the endpoint and
its resource. This allows the resource to be exported through other channels
(e.g., by directly updating node status through API server) if there is such
use case. Currently endpointStopGracePeriod is set as 5 minutes.
Given that an endpoint is no longer immediately removed upon disconnection,
mark all its devices unhealthy so that we can signal the resource allocatable
change to the scheduler to avoid scheduling more pods to the node.
When a device plugin endpoint is in stopGracePeriod, pods requesting the
corresponding resource will fail admission handler.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert "[Test change - don't merge] Skip load test"
This reverts commit ba6bb999f7.
This was accidentally merged as part of 60891.
/cc @wojtek-t
/sig scalability
/kind bug
/priority important-soon
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use quotas in default performance tests
Better to use more features in default tests if possible.
LGTM whenever you think we're ready.
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[e2e service] Refine apiserver restart logic
**What this PR does / why we need it**:
Ref https://github.com/kubernetes/kubernetes/issues/60761#issuecomment-371308569, wait for apiserver's restart count increases before proceeding the test.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes (hopefully) #60761
**Special notes for your reviewer**:
/assign @rramkumar1 @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60891, 60935). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Rollback etcd server version to 3.1.11 due to #60589
Ref https://github.com/kubernetes/kubernetes/issues/60589#issuecomment-371171837
The dependencies were a bit complex (so many things relying on it) + the version was updated to 3.2.16 on top of the original bump.
So I had to mostly make manual reverting changes on a case-by-case basis - so likely to have errors :)
/cc @wojtek-t @jpbetz
```release-note
Downgrade default etcd server version to 3.1.11 due to #60589
```
(I'm not sure if we should instead remove release-notes of the original PRs)
Automatic merge from submit-queue (batch tested with PRs 60642, 60627). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create fake /etc/hosts for conformance test
**What this PR does / why we need it**:
"KubeletManagedEtcHosts should test kubelet managed /etc/hosts file"
conformance test fails in the CI's Docker-In-Docker environment.
This test mounts a /etc/hosts file and checks if "# Kubernetes-managed
hosts file." string is present or not under various conditions. The
specific failure with DIND happens when the /etc/hosts picked up
from the box where e2e test are running already has this string. This
happens because our CI runs on kubernetes and the e2e tests are running
in a container that was started on kubernetes (and hence already has
that string)
To avoid this situation, we create a new /etc/hosts file with known
contents (and does not have the "# Kubernetes-managed hosts file."
string)
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
Please see:
https://k8s-testgrid.appspot.com/sig-testing-misc#ci-kubernetes-local-e2ehttps://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-local-e2e/285
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add cblecker to test OWNERS
**What this PR does / why we need it**:
- Adds an OWNERS file for `test/typecheck/` that was just added
- Add cblecker to approvers for `test/`
I won't touch anything I don't understand :)
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
improve daemonset's retry creating failed daemon pods e2e test
**What this PR does / why we need it**:
This PR improves daemonset's retry creating failed daemon pods e2e test by checking that the failed pod does not show up among existing daemon pods.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59076
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixing e2e CSI test, II
The fix for #60803 in commit 2ae33cc324 had a typo, so the "Server
rejected event" error still showed up in the external-provisioner log
of the "Sanity CSI plugin test using hostPath CSI driver" e2e test.
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Bump etcd server patch version to 3.2.16
etcd 3.2.16 contains a critical fix for HA clusters: https://github.com/coreos/etcd/pull/9281
Also, update newly added tests to use `REGISTRY` make variable.
Release note:
```release-note
Upgrade the default etcd server version to 3.2.16
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix test failure and delete unused code
**What this PR does / why we need it**:
`Daemon set [Serial] should update pod when spec was updated and update strategy is RollingUpdate` is broken by recent updates as Template generation isn't supported by apps/v1.DaemonSet anymore
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60745
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
"KubeletManagedEtcHosts should test kubelet managed /etc/hosts file"
conformance test fails in the CI's Docker-In-Docker environment.
This test mounts a /etc/hosts file and checks if "# Kubernetes-managed
hosts file." string is present or not under various conditions. The
specific failure with DIND happens when the /etc/hosts picked up
from the box where e2e test are running already has this string. This
happens because our CI runs on kubernetes and the e2e tests are running
in a container that was started on kubernetes (and hence already has
that string)
To avoid this situation, we create a new /etc/hosts file with known
contents (and does not have the "# Kubernetes-managed hosts file."
string)
The fix for #60803 in commit 2ae33cc324 had a typo, so the "Server
rejected event" error still showed up in the external-provisioner log
of the "Sanity CSI plugin test using hostPath CSI driver" e2e test.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Run server-side print tests only on k8s 1.10+
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60765
**Release note**:
```release-note
None
```
/assign @smarterclayton @deads2k
@juanvallejo fyi
After client pod is stopped we need to wait for a while for kubelet to
unmount and clean up the client pod before we can destroy the server pod.
Killing e.g. iSCSI server too early may result in stale attached device
that cannot be removed.
- create 120MB volume instead of 1MB for volume_io tests
- rebase to Fedora 26
- added compatibility with ext4 and older ceph clients
- unify CephFS and Ceph RBD images.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Prevent webhooks from affecting admission requests for WebhookConfiguration objects
**What this PR does / why we need it**:
As it stands now webhooks can be added to the system which make it impossible for a user to remove that webhook, or two webhooks could be registered which make it impossible to remove each other.
The first commit of this will add a test to make sure webhook deletion is never blocked by a webhook. This test will fail until the second commit is added which will prevent webhooks from affecting admission requests for ValidatingWebhookConfiguration and MutatingWebhookConfiguration objects in the admissionregistration.k8s.io group
- [x] Test that webhook deletion is never blocked by a webhook ([test fails before second commit](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/59840/pull-kubernetes-e2e-gce/23731/))
- [x] Prevent webhooks from being called on admission requests for [Validating|Mutating]WebhookConfiguration objects
- [x] Document this new behavior maybe in another PR
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of fixing #59124 (Verifies that it can remove the broken webhook.)
**Release note**:
```release-note
ValidatingWebhooks and MutatingWebhooks will not be called on admission requests for ValidatingWebhookConfiguration and MutatingWebhookConfiguration objects in the admissionregistration.k8s.io group
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix DaemonSet e2e test for OnDelete
**What this PR does / why we need it**: DaemonSet `OnDelete` e2e test is broken after #59883 is merged, because default update strategy is different in apps/v1 API endpoint.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
xref #60003
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60679, 60804). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixing e2e CSI test
Closes#60803
After switching to CSI 0.2.0 spec, additional RBAC permissions are required. This PR adds missing permissions.
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add node-e2e test for ShareProcessNamespace
**What this PR does / why we need it**: Adds a node-e2e test for kubernetes/features#495
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59554
**Special notes for your reviewer**: This requires a feature gate to be enabled in both the kubelet and API server. I'm not sure which jenkins configs need to be updated (or if these are even still used) so I just updated a pile of them.
opened kubernetes/test-infra#7030 for https://github.com/kubernetes/test-infra/blob/master/jobs/config.json
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Volume deletion should be idempotent
- Describe* calls should return `aws.Error` so caller can handle individual errors. `aws.Error` already has enough context (`"InvalidVolume.NotFound: The volume 'vol-0a06cc096e989c5a2' does not exist"`)
- Deletion of already deleted volume should succeed.
**Release note**:
Fixes: #60778
```release-note
NONE
```
/sig storage
/sig aws
/assign @justinsb @gnufied
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add selector to DaemonSet in newDaemonSet function so that the v1 api…
**What this PR does / why we need it**:
When we upgraded the DaemonSet e2e to use apps v1 I neglected to add a selector to match the labels of the created Pods. This broke some apps Serial tests.
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update kubectl e2e test manifests to apps/v1
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60159, 60731, 60720, 60736, 60740). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Cap max number of nodes to use for local PV e2e tests
**What this PR does / why we need it**:
Large scale tests have thousands of nodes, which will make some local PV tests that use each node take forever
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Partially addresses #60589
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60159, 60731, 60720, 60736, 60740). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Expect NetworkTier not to be set as GCE value (all uppercase)
**What this PR does / why we need it**:
Reverts L76 and L123 from this PR - https://github.com/kubernetes/kubernetes/pull/59941/files#diff-497d33fc55a7de6c5bde6cbe33ecbb3cL78 . NetworkTier is set on the mock Service in Camel Case, not all uppercase.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60721
**Release note**:
```release-note
NONE
```
/assign MrHohn nicksardo
Automatic merge from submit-queue (batch tested with PRs 60159, 60731, 60720, 60736, 60740). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Promote LocalStorageCapacityIsolation feature to beta
The LocalStorageCapacityIsolation feature added a new resource type ResourceEphemeralStorage "ephemeral-storage" so that this resource can be allocated, limited, and consumed as the same way as CPU/memory. All the features related to resource management (resource request/limit, quota, limitrange) are available for local ephemeral storage.
This local ephemeral storage represents the storage for root file system, which will be consumed by containers' writtable layer and logs. Some volumes such as emptyDir might also consume this storage.
Fixes issue #60160
This PR also fixes data race issues discovered after open the feature gate. Basically setNodeStatus function in kubelet could be called by multiple threads so the data needs lock protection. Put the fix with this PR for easy testing.
**Release note**:
```release-note
ACTION REQUIRED: LocalStorageCapacityIsolation feature is beta and enabled by default.
```
Add verb and preemption for scheduler extender
Update bazel
Use simple preemption in extender
Use node name instead of v1.Node
Fix support method
Fix preemption dup
Remove uneeded logics
Remove nodeInfo from param to extender
Update bazel for scheduler types
Mock extender cache with nodeInfo
Add nodeInfo as extender cache
Choose node name or node based on cache flag
Always return meta victims in result
The LocalStorageCapacityIsolation feature added a new resource type
ResourceEphemeralStorage "ephemeral-storage" so that this resource can
be allocated, limited, and consumed as the same way as CPU/memory. All
the features related to resource management (resource request/limit, quota, limitrange) are avaiable for local ephemeral storage.
This local ephemeral storage represents the storage for root file system, which will be consumed by containers' writtable layer and logs. Some volumes such as emptyDir might also consume this storage.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Moves the DevicePlugin e2e_node test back to Serial
I forgot the fact that the DevicePlugin test itself restarts Kubelet
for testing purpose. Move that test back to Serial but constructs
a smaller test without kubelet restart that we may run during presubmit.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/60604
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 52077, 60456, 60591). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adds daemonset conformance tests
**What this PR does / why we need it**: Adds conformance tests for deamonset
```release-note
Conformance tests are added for the DaemonSet kinds in the apps/v1 group version. Deprecated versions of DaemonSet will not be tested for conformance, and conformance is only applicable to release 1.10 and later.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix todo:Get rid of this duplicate function IsRetryableAPIError in favour of the one in test/utils
**What this PR does / why we need it**:
fix todo:Get rid of this duplicate function IsRetryableAPIError in favour of the one in test/utils
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add missing table converters for server side printing
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60488
/assign @smarterclayton @juanvallejo
/kind bug
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 58171, 58036, 60540). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Changing Flexvolume plugin directory on COS in GCE to a durable directory
**What this PR does / why we need it**: The original `/etc/srv/...` directory is in an overlayfs over a path in /tmp, so Flexvolume drivers are erased across node restarts for any reason. Changing it to non-tmpfs location.
Also removing redundant Flexvolume path injection in `config-test.sh` because it's already in `cluster/common.sh`.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57353
**Release note**:
```release-note
[action required] Default Flexvolume plugin directory for COS images on GCE is changed to `/home/kubernetes/flexvolume`.
```
/assign @roberthbailey @saad-ali
/cc @chakri-nelluri @wongma7
/sig storage
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix todo:Move function readinessCheck to util
**What this PR does / why we need it**:
fix todo:Move function readinessCheck to util in test/e2e_node/services
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60475, 60514, 60506, 59903, 60319). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[e2e ingress-gce] Add test for backside re-encryption
**What this PR does / why we need it**:
Add a basic e2e testcase for ingress backside re-encryption. echoheaders image with HTTPS support was added by https://github.com/kubernetes/ingress-nginx/pull/2091.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
/assign @nicksardo @rramkumar1
cc @nikhiljindal
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60475, 60514, 60506, 59903, 60319). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Refactor reusable parts of scheduler perf test
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
/cc @shyamjvs
Automatic merge from submit-queue (batch tested with PRs 60342, 60505, 59218, 52900, 60486). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Increase failureThresholds for failing HTTP liveness test
**What this PR does / why we need it**:
Removes test from e2e which relies on HTTP liveness as a measure to tell if the container is good or bad. While this is not a bad idea, we cannot rely on this test as HTTP liveness relies on network/infrastructure etc on which sometimes we have no control over. While increasing the timeout may be an option it may not be ideal for all cloud providers/type of hardware etc.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59150
**Special notes for your reviewer**:
I have stated reasons in the issue #59150. We have seen that this test is flaking recently in https://github.com/openshift/origin/issues/12072
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60470, 59149, 56075, 60280, 60504). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding basic example Flexvolume drivers and DaemonSet deployment example.
**What this PR does / why we need it**: More example Flexvolume drivers as reference for driver authors, and an example driver deployment workflow.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59148
/sig storage
/assign @chakri-nelluri @zmerlynn
/cc @wongma7
/release-note-none
Automatic merge from submit-queue (batch tested with PRs 60376, 55584, 60358, 54631, 60291). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove gcloud docker -- since it's deprecated
docker handles this now and it raises an error.
try 3
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60236, 60332, 57375, 60451, 57408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update device plugin e2e_node test to not changing Kubelet config
as DevicePlugins feature is enabled by default now.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 60236, 60332, 57375, 60451, 57408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kube-scheduler: Support extender managed extended resources in kube-scheduler
**What this PR does / why we need it**:
This is the continuation of https://github.com/kubernetes/kubernetes/pull/58851.
- This PR adds new extender configurations in scheduler policy config.
- A set of extended resource names can be specified in an extender config. They are the resources that are managed by the extender. The scheduler will only send pods to the extender if the pod requests at least one of the extended resources in the set.
- An `IgnoredByScheduler` flag can be set along with each of such resources. If this flag is set to true, the scheduler will not check the resource in the `PodFitsResources` predicate.
- This PR also changes the default behavior of the `PodFitsResources` predicate. Now, by default, `PodFitsResources` will ignore the extended resources that are not in node status. This is required to support extender managed extended resources (including cluster-level resources) on node. Note that in kube-scheduler we override the default behavior by not ignoring such missing extended resources.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/53616https://github.com/kubernetes/kubernetes/issues/58850
**Special notes for your reviewer**:
**Release note**:
```
Support extender managed extended resources in kube-scheduler
```
Automatic merge from submit-queue (batch tested with PRs 60236, 60332, 57375, 60451, 57408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding Data Encryption Key (DEK) Key Encryption Key (KEK) integration…
… tests via KMS Plugin Mock.
**What this PR does / why we need it**:
Adding integration tests between KubeAPI and KMS Plugin.
Concretely, this test verifies the following integration contracts:
1. Raw records in ETCD that were processed by KMS Provider should be prefixed with k8s:enc:kms:v1:grpc-kms-provider-name:
2. Data Encryption Key (DEK) should be generated by envelopeTransformer and passed to KMS gRPC Plugin
3. KMS gRPC Plugin should encrypt the DEK with a Key Encryption Key (KEK) and pass it back to envelopeTransformer
4. The payload (ex. Secret) should be encrypted via AES CBC transform
5. Prefix-EncryptedDEK-EncryptedPayload structure should be deposited to ETCD
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53689, 56880, 55856, 59289, 60249). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add test/typecheck, a fast typecheck for all build platforms.
Add test/typecheck, a fast typecheck for all build platforms.
Most of the time spent compiling is spent optimizing and linking
binary code. Most errors occur at the syntax or semantic (type) layers.
Go's compiler is importable as a normal package, so we can do fast
syntax and type checking for the 10 platforms we build on.
This currently takes ~6 minutes of CPU time (parallelized).
This makes presubmit cross builds superfluous, since it should catch
most cross-build breaks (generally Unix and 64-bit assumptions).
Example output:
```$ time go run test/typecheck/main.go
type-checking: linux/amd64, windows/386, darwin/amd64, linux/arm,
linux/386, windows/amd64, linux/arm64, linux/ppc64le, linux/s390x, darwin/386
ERROR(windows/amd64) pkg/proxy/ipvs/proxier.go:1708:27: ENXIO not declared by package unix
ERROR(windows/386) pkg/proxy/ipvs/proxier.go:1708:27: ENXIO not declared by package unix
real 0m45.083s
user 6m15.504s
sys 1m14.000s
```
```release-note
NONE
```
Most of the time spent compiling is spent optimizing and linking
binary code. Most errors occur at the syntax or semantic (type) layers.
Go's compiler is importable as a normal package, so we can do fast
syntax and type checking for the 10 platforms we build on.
This currently takes ~6 minutes of CPU time (parallelized).
This makes presubmit cross builds superfluous, since it should catch
most cross-build breaks (generally Unix and 64-bit assumptions).
Example output:
$ time go run test/typecheck/main.go
type-checking: linux/amd64, windows/386, darwin/amd64, linux/arm, linux/386, windows/amd64, linux/arm64, linux/ppc64le, linux/s390x, darwin/386
ERROR(windows/amd64) pkg/proxy/ipvs/proxier.go:1708:27: ENXIO not declared by package unix
ERROR(windows/386) pkg/proxy/ipvs/proxier.go:1708:27: ENXIO not declared by package unix
real 0m45.083s
user 6m15.504s
sys 1m14.000s
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update gke nvidia-gpu-device-plugin to the latest version that supports
both v1alpha and v1beta1 device plugin versions.
Re-enables nvidia-gpus e2e test after verifying the test passes now.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 60157, 60337, 60246, 59714, 60467). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Feature gate for regional PDs
**What this PR does / why we need it**: Adding beta feature gate around regional PD support.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*: Partially addresses #59988
**Special notes for your reviewer**: This feature has already been in alpha for two releases, but at the time it was not gated with a Kubernetes feature gate. Instead it was controlled by a GCE-specific alpha gate. However, there are additional changes with GCE PD StorageClass parameters that we'd like to gate as well, and this is out of scope of GCE alpha gates.
/cc @saad-ali @lavalamp
Automatic merge from submit-queue (batch tested with PRs 59365, 60446, 60448, 55019, 60431). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
auth: allow nodes to create tokens for svcaccts of pods
ref https://github.com/kubernetes/kubernetes/issues/58790
running on them. nodes essentially have the power to do this today
but not explicitly. this allows agents using the node identity to
take actions on behalf of local pods.
@kubernetes/sig-auth-pr-reviews @smarterclayton
```release-note
The node authorizer now allows nodes to request service account tokens for the service accounts of pods running on them.
```
Automatic merge from submit-queue (batch tested with PRs 60430, 60115, 58052, 60355, 60116). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add Garbage Collector e2e conformance tests
**What this PR does / why we need it**:
The garbage collector is a core component of kubernetes and needs to be tested by conformance, so its functionality can be relied on in any kubernetes environment.
As we can see in [testgrid](https://k8s-testgrid.appspot.com/sig-api-machinery#gce), the garbage collector tests being promoted by this PR are consistently passing. And the intention to promote them to conformance tests was laid out by [this document](https://docs.google.com/document/d/1h2S9ff9N-4MKqfayE3A8TqjD_qIwuND_dAhOAJFxYS0)
**Special notes for your reviewer**:
The last two tests in this file are not added as conformance tests because they involve beta features (custom resources and cronjobs), and conformance tests are only allowed for features in GA.
**Release note**:
```release-note
New conformance tests added for the Garbage Collector
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[e2e ingress-gce] Enhance cleanup logic for pre-shared-cert test
**What this PR does / why we need it**:
Pre-shared-cert test are flaky (https://k8s-testgrid.appspot.com/sig-network-gce#ingress-gce-e2e&width=5), mostly due to the orphaned ssl cert.
This PR enhances the cleanup logic to continue deleting the orphaned cert for this case (without this test will panic on TryDeleteIngress if no ingress is created).
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
/assign @rramkumar1
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59310, 60424, 60308, 60436, 60020). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reduce number of pods created for local PV stress test
**What this PR does / why we need it**:
Local PV stress test is flaking. Failed runs show that test is timing out at 47/50 pods. Reduce the number of pods created by the test so that it's not so close to the max timeout.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
I'll need to investigate further to see why processing the pods is so slow and ways to speed it up. But for now, try to reduce flaking.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
refactor volume util files
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes # https://github.com/kubernetes/kubernetes/issues/44460
**Special notes for your reviewer**:
/assign @jsafrane @msau42
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Enable protection tests
**What this PR does / why we need it**:
- StorageObjectInUseProtection feature is enabled by default so the test can run in regular e2e test suite
- Rename PVC protection test, it tests only PVCs and not whole storage.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59674, 60059, 60220, 58916, 60336). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Conformance: Add StatefulSet tests.
Mark StatefulSet tests as Conformance where possible. I've excluded those that depend on a dynamic provisioner and a default storage class (i.e. those that use PVC), because I don't think those things are required for Conformance at this time.
@kow3ns @jagosan Please correct me if I'm wrong.
Part of #54256
```release-note
StatefulSet in apps/v1 is now included in Conformance Tests.
```