Automatic merge from submit-queue (batch tested with PRs 48665, 52849, 54006, 53755). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add named-port ingress test
**What this PR does / why we need it**:
Validate correct behavior when a `NetworkPolicyIngressRule` refers to a named port rather than a numerical port, e.g. `serve-80` rather than `80`.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53106, 52193, 51250, 52449, 53861). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add replicaset upgrade test
**What this PR does / why we need it**:
This PR adds an upgrade test for an existing ReplicaSet.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #52118
**Release note**:
```release-note
NONE
```
When starting an e2e test in a pod in a cluster, if the host is
not specified in the command line, we default to using
'http://127.0.0.1:8080' currently. We should instead try the in-cluster
config, save it to a temporary file, and use that with kubectl.
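The fallback order described above can be sketched roughly as follows. This is only an illustration, not the code from the change: `resolveHost` and its parameters are hypothetical names, and the sketch covers only how the address is chosen (from the standard `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` variables that are set inside a pod), not writing the kubeconfig file for kubectl.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// defaultHost is the address the text says is used when nothing else is known.
const defaultHost = "http://127.0.0.1:8080"

// resolveHost is a hypothetical helper. flagHost is the value given on the
// command line; svcHost and svcPort are the values of KUBERNETES_SERVICE_HOST
// and KUBERNETES_SERVICE_PORT. The first non-empty source wins.
func resolveHost(flagHost, svcHost, svcPort string) string {
	if flagHost != "" {
		return flagHost
	}
	if svcHost != "" && svcPort != "" {
		// Running inside a pod: use the in-cluster API server address.
		return "https://" + net.JoinHostPort(svcHost, svcPort)
	}
	return defaultHost
}

func main() {
	host := resolveHost("",
		os.Getenv("KUBERNETES_SERVICE_HOST"),
		os.Getenv("KUBERNETES_SERVICE_PORT"))
	fmt.Println("using host:", host)
}
```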
Automatic merge from submit-queue (batch tested with PRs 53507, 53772, 52903, 53543). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Split downward API e2e test case for pod/host IP into two
**What this PR does / why we need it**:
Split the test case so that the version check does not block the pod IP e2e test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
ref: https://github.com/kubernetes/kubernetes/pull/42717#discussion_r144026427
**Special notes for your reviewer**:
/cc @timothysc @andrewsykim
Automatic merge from submit-queue (batch tested with PRs 53507, 53772, 52903, 53543). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding e2e tests to verify vsphere volume lifecycle on a clustered datastore
**What this PR does / why we need it**:
This PR introduces tests for volume provisioning on a clustered datastore. It does so in three ways:
1. Static provisioning (create a vSphere volume and then create a pod with it)
2. Dynamic provisioning (specify the clustered datastore in the storage class parameters)
3. Dynamic provisioning with an SPBM policy (specify the storage policy name in the storage class parameters; this is a tag-based policy tagged to a clustered datastore)
**Which issue this PR fixes** :
fixes vmware#278
**Special notes for your reviewer**:
Set the environment variables as in the following example; the tests described above require them:
```
export CLUSTER_DATASTORE="dscl1/sharedVmfs-1"
export VSPHERE_SPBM_POLICY_DS_CLUSTER="gold_cluster"
```
Internally reviewed by VMware reviewers @divyenpatel @BaluDontu @tusharnt
**Release note**:
```
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use gcloud for enabling/disabling autoscaling in e2e tests
This removes the temporary solution added in #28011, as it is no longer necessary. It should reduce flakes caused by not waiting for the master restart after disabling autoscaling.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve e2e tests of audit logging.
The test now covers:
* Verbs: create, list, watch, delete, get, update, patch.
* Resources: pods, deployments, secrets, config maps, custom resource
definition.
* More fields: user, resource, level, stage, presence of request and
response objects.
Fixes #49653
Automatic merge from submit-queue (batch tested with PRs 53668, 53624, 52639, 53581, 51215). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Local e2e test fixes
**What this PR does / why we need it**:
1. Remove tests using TestContainerOutput because they don't wait for unmount
2. Fix scheduling error test to handle updated event msgs.
@kubernetes/sig-storage-pr-reviews
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53597
**Release note**:
NONE
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Bump kube-dns version used in e2e
**What this PR does / why we need it**: Updates the version of kube-dns used in the e2e network tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref #53153
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Skip podpreset test if the alpha feature settings/v1alpha1 is disabled
**What this PR does / why we need it**: Skip this test if it is not able to find the requested resource, so the test does not consistently fail.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53079
**Special notes for your reviewer**:
**Release note**:
```release-note
Skip podpreset test if the alpha feature settings/v1alpha1 is disabled
```
Automatic merge from submit-queue (batch tested with PRs 50223, 53205). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create e2e tests for Custom Metrics - Stackdriver Adapter and HPA based on custom metrics from Stackdriver
**What this PR does / why we need it**:
- Add e2e test for Custom Metrics - Stackdriver Adapter
- Add e2e test for HPA based on custom metrics from Stackdriver
- Enable HorizontalPodAutoscalerUseRESTClients option
**Release note**:
```release-note
Horizontal pod autoscaler uses REST clients through the kube-aggregator instead of the legacy client through the API server proxy.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Increase backoffLimit for job that we expect to fail several times
**What this PR does / why we need it**:
Since the introduction of `backoffLimit` for jobs, this single test has failed the majority of the time with: `BackoffLimitExceeded: Job has reach the specified backoff limit`.
I'm bumping this to 999, so that it has enough room to fail several times.
**Which issue this PR fixes**:
Fixes #35507.
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 53678, 53677, 53682, 53673). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix to prevent downward api change break on older versions
Signed-off-by: Timothy St. Clair <timothysc@gmail.com>
**What this PR does / why we need it**:
Prevents "should provide pod and host IP as an env var [Conformance]" from running against older versions whose API does not have that field, because the test breaks on those clusters.
This is not an upstream-tested configuration, but downstream folks do this regularly.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
N/A
**Special notes for your reviewer**:
N/A
**Release note**:
```
Prevent downward api-change from breaking on older version
```
/cc @kubernetes/sig-testing-bugs @jpbetz @marun
Automatic merge from submit-queue (batch tested with PRs 53678, 53677, 53682, 53673). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix typo in StatefulSet e2e test
Found it while reviewing #53218
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
wait for pod to be fully deleted
**What this PR does / why we need it**:
Fix flaky glusterfs io-streaming tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #49529
**Special notes for your reviewer**:
1) max potential wait for complete pod deletion is ~~15m~~ 5m.
2) ~~removed [Flaky] from HostCleanup, _e2e/node/kubelet.go_ since pod deletion is reliable now.~~
3) ~~added tag [Slow] to HostCleanup due to long max wait for pod deletion.~~
After all CI tests run reliably we can consider removing the [Flaky] tag (2, above), or do that in a separate pr.
```release-note
NONE
```
cc @msau42
Automatic merge from submit-queue (batch tested with PRs 52354, 52949, 53551). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add client and server versions to the e2e.test output.
Fixes #53502.
```release-note
NONE
```
Sample output:
```
Oct 6 15:02:44.001: INFO: Client version: v1.9.0-alpha.1.737+3b1b19a1e2a9a4-dirty
Oct 6 15:02:44.039: INFO: Server version: v1.8.0
```
/assign @timothysc
The etcd3 storage now attempts to fill partial pages to prevent clients
having to make more round trips (latency from server to etcd is lower
than client to server). The server makes repeated requests to etcd of
the current page size, then uses the filter function to eliminate any
non-matching items. After this change the apiserver will always return
full pages, but we leave the language in place that clients must tolerate
partial pages.
Reduces tail latency of large filtered lists, such as viewing pods
assigned to a node.
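The page-filling loop described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the etcd3 storage code: `fetchFunc`, `sliceStore`, and `listFiltered` are hypothetical names, and an in-memory slice of integers stands in for etcd.

```go
package main

import "fmt"

// fetchFunc stands in for one range request to the backing store. It returns
// up to limit items starting at offset start, and whether more items remain.
type fetchFunc func(start, limit int) (items []int, more bool)

// sliceStore builds a fetchFunc over an in-memory slice (an assumption made
// purely so the example is runnable).
func sliceStore(data []int) fetchFunc {
	return func(start, limit int) ([]int, bool) {
		end := start + limit
		if end > len(data) {
			end = len(data)
		}
		if start > end {
			start = end
		}
		return data[start:end], end < len(data)
	}
}

// listFiltered keeps requesting batches of size limit, starting at offset
// start, until it has collected limit matching items or the store is
// exhausted. It returns the page, the offset to continue from, and whether
// more items may remain.
func listFiltered(fetch fetchFunc, keep func(int) bool, start, limit int) (out []int, next int, more bool) {
	next, more = start, true
	for more && len(out) < limit {
		var items []int
		items, more = fetch(next, limit)
		if len(items) == 0 {
			break
		}
		for i, it := range items {
			next++
			if keep(it) {
				out = append(out, it)
			}
			if len(out) == limit {
				// Unread items in this batch also count as "more".
				more = more || i < len(items)-1
				break
			}
		}
	}
	return out, next, more
}

func main() {
	data := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	even := func(v int) bool { return v%2 == 0 }
	page, next, more := listFiltered(sliceStore(data), even, 0, 3)
	fmt.Println(page, next, more)
}
```

The client sees one full page per request, while the extra round trips happen on the cheaper server-to-store hop.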
Automatic merge from submit-queue (batch tested with PRs 53621, 52320, 53625). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
revamp replicaset e2e tests
**What this PR does / why we need it**:
This PR removes some replicaset e2e tests as they will be converted to integration tests:
(1) condition check test
(2) pod adoption test
(3) pod release (orphaning) test
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #52118
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53567, 53197, 52944, 49593). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean up in `cluster_size_autoscaling.go`
**What this PR does / why we need it**:
Fix `golint` errors.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50447, 53308). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[e2e] add service session affinity test case
**What this PR does / why we need it**:
**Which issue this PR fixes**:
Add service session affinity test case for e2e
fixes #31712
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51771, 52971). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
pass labelSelector to server side opaquely
**What this PR does / why we need it**:
From @smarterclayton
> The server is responsible for handling label selection for the most part. There is some level of client side processing possible, but for the most part `label selector` should be able to be passed opaquely.
xref #50140
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
/assign @smarterclayton @liggitt
**Release note**:
```release-note
None
```
This change modifies the way that config.NodeIP is selected at the
start of e2e Networking tests such that if no external addresses are
available from the cloud provider (e.g. either no cloud provider being
used [baremetal or VMs], or the provider doesn't have external IPs
configured), then one of the internal addresses is used.
Without this change, the e2e service-related Networking tests would always
panic when config.ExternalAddrs[0] is accessed and the slice is empty.
This change eliminates the panic, and in some setups, the fallback choice
of using an internal address will provide the necessary connectivity
for the e2e Networking tests to access each node.
fixes #53568
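A minimal sketch of the selection logic described above, assuming the candidate addresses are already available as two string slices. `pickNodeIP` is a hypothetical helper, not the function changed in the e2e framework.

```go
package main

import (
	"errors"
	"fmt"
)

// pickNodeIP prefers the first external address and falls back to the first
// internal address. It returns an error instead of indexing an empty slice,
// which is the panic the change description mentions.
func pickNodeIP(externalAddrs, internalAddrs []string) (string, error) {
	if len(externalAddrs) > 0 {
		return externalAddrs[0], nil
	}
	if len(internalAddrs) > 0 {
		return internalAddrs[0], nil
	}
	return "", errors.New("no external or internal node addresses available")
}

func main() {
	// Bare-metal style setup: no external addresses reported.
	ip, err := pickNodeIP(nil, []string{"10.0.0.5"})
	fmt.Println(ip, err)
}
```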
Automatic merge from submit-queue (batch tested with PRs 53350, 52688, 53531, 52515). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Skip e2e check for logs API path if provider is skeleton
There is a networking e2e test with the It() description:
```
"should provide unchanging, static URL paths for kubernetes api services"
```
This test performs GETs from the Kubernetes API using various paths,
including "/logs". This test for a GET using path "/logs" should be
skipped for provider type "skeleton", since this path is unsupported.
This change adds "skeleton" to the list of providers for which
this test case should be skipped.
fixes #53529
**What this PR does / why we need it**:
This change adds "skeleton" to the list of providers for which
the test for an API GET using the "/logs" path should be skipped.
This is needed because, as far as I can tell, the "skeleton" provider
doesn't support the "/logs" api path.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53529
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53350, 52688, 53531, 52515). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
PodReady should be replaced with podutil.IsPodReady
**What this PR does / why we need it**:
PodReady should be replaced with podutil.IsPodReady.
Thanks.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 52768, 51898, 53510, 53097, 53058). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
migration of federation test
**What this PR does / why we need it**:
Migrate federation(multicluster) e2e test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
Ref issue https://github.com/kubernetes/kubernetes/issues/50735
- Move `ubernetes_lite.go` to new created directory named **multicluster**.
**Special notes for your reviewer**:
**Release note**:
none
/cc @quinton-hoole
Automatic merge from submit-queue (batch tested with PRs 53227, 53120). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
E2E test to verify clean up of stale dummy VM for vSphere dynamic provisioning
Verify that the stale dummy VMs created during dynamic provisioning are deleted by the cleanup routine in the vSphere Cloud Provider.
**Testing Done:**
- Create a storage class with an invalid policy on a VSAN datastore.
- Create a PVC using the above storage class.
- Verify that the PVC is not bound.
- Delete the PVC.
- Sleep for 6 minutes so that the vSphere Cloud Provider cleanup routine can delete the stale dummy VMs.
- Verify that the VM is not present. Otherwise fail the test.
@rohitjogvmw @divyenpatel
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 51750, 53195, 53384, 53410). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add ping6 option for e2e ext connectivity test for IPv6-only clusters
e2e tests provide only an (IPv4) ping test for external connectivity.
We need a way to conditionally run a ping6 external connectivity check,
and disable the (IPv4) ping-based external connectivity check,
for end-to-end testing on IPv6-only clusters.
This feature will be needed for creating gating IPv6 CI tests.
fixes #53383
**What this PR does / why we need it**:
This adds an IPv6 (ping6) version of the external connectivity ping check to the e2e test suite,
and adds "Feature:" flags for selecting whether the IPv4 or IPv6 (or both) versions
of the connectivity test should be run. We need this change to be able to use the
e2e test suite in upstream gating IPv6 CI tests on IPv6-only clusters (at least until
dual-stack operation is fully supported in Kubernetes).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53383
**Special notes for your reviewer**:
Please let me know if there are better tags to use for selecting IPv4 vs IPv6 testing.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53345, 53389). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add IPv6 option for e2e iPerf test
The e2e iPerf test case currently only runs in IPv4 mode.
This change adds an option to run an iPerf test in IPv6 mode (i.e. by running
iPerf with a "-V" command line flag), so that the test can be run on
IPv6-only clusters.
**What this PR does / why we need it**:
This change adds an option to run an iPerf test in IPv6 mode (i.e. by running
iPerf with a "-V" command line flag), so that the test can be run on
IPv6-only clusters. It also adds a Feature tag to the current IPv4 iPerf test
so that it can be disabled when running e2e tests on an IPv6-only cluster.
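As a rough illustration of the option described above, the client arguments could be assembled as in the sketch below. `iperfClientArgs` is a hypothetical helper, not code from the change; the only flags used are `-c` (client mode, target host) and the `-V` flag named in the description.

```go
package main

import "fmt"

// iperfClientArgs builds the argument list for an iPerf client run against
// host. When ipv6 is true it appends the "-V" flag mentioned in the text.
func iperfClientArgs(host string, ipv6 bool) []string {
	args := []string{"-c", host}
	if ipv6 {
		args = append(args, "-V")
	}
	return args
}

func main() {
	fmt.Println(iperfClientArgs("10.0.0.1", false))
	fmt.Println(iperfClientArgs("fd00::1", true))
}
```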
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53388
**Special notes for your reviewer**:
Please let me know if there are better "Feature:" tags to use for selecting whether to run the IPv4 vs IPv6 test case.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53228, 53232, 53353). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes a regression introduced by PR 52290 in which extended resource
capacity may temporarily drop to zero after the kubelet restarts, so that pods
restarted during that time window could fail to be scheduled.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53342
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 52723, 53271). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update file location in e2e test comment
**What this PR does / why we need it**: The location given in the comment, "docs/design/expansion.md", leads to a notice saying the file has moved, and the link in that notice returns a 404 error. The file was moved out of tree to https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/expansion.md, so the comment here should be updated.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53270
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Disable autoscaling before removing autoscaled node pool
This is to prevent flakes due to API calls failing in AfterEach during the master restart, which is triggered by deleting an autoscaled node pool. Adding a disable call before deleting the node pool should prevent this, as we'll wait for the master restart in the disableAutoscaler function.
While it may be faster to wait after deletion of autoscaled node pools, this is less complex and will be easier to remove in the future when changing autoscaling settings no longer triggers a master restart.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix skip condition for autoscaling test of scale to zero
This fixes the test running in the wrong setup (on a single MIG rather than on multiple MIGs, as was intended).
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remake cluster size autoscaling scale to zero test
This PR affects only cluster size autoscaling test suite. Changes:
* check whether autoscaling is enabled by looking for a node group with a given max number of nodes instead of min, as the field is omitted if its value is 0
* split scale to zero test into GKE & GCE versions, add GKE-specific setup and verification
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Enable specifying the unconfined AppArmor profile
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52370
**Special notes for your reviewer**:
/assign @tallclair @liggitt
**Release note**:
```release-note
Enable specifying the unconfined AppArmor profile
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Migrate sig-ui e2e test
**What this PR does / why we need it**:
Migrate sig-ui e2e tests
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
**Special notes for your reviewer**:
**Release note**:
none
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes flakiness in the GPUDevicePlugin e2e test.
Waits until the NVIDIA GPU disappears from all nodes after deleting the
device plugin DaemonSet, to make sure its pods are deleted from all nodes.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53281
**Special notes for your reviewer**:
**Release note**:
```release-note
```
RunCmd uses Go's os/exec library to run commands directly. Since these
are not run through a shell, we can't use shell syntax for piping or
file redirection. The proper way to do that is to create a Command
object and set the Std{in,out,err} pipes appropriately. Luckily sed
can handle the behavior we need without having to manually set this up.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
In cluster autoscaling tests, improve error logging on enable autoscaling failure
This logs the command and the request output in addition to the error.
Automatic merge from submit-queue (batch tested with PRs 51311, 52575, 53169). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix a scheduler flaky e2e test
**What this PR does / why we need it**:
Makes a scheduler e2e test that verifies the resource limit predicate more robust.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53066
**Release note**:
```release-note
NONE
```
@kubernetes/sig-scheduling-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50280, 52529, 53093, 53108, 53168). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve PVC ref volume metric test robustness
This test has been flaking. The current working theory is that
volume stats collection didn't run in time to grab the metrics
from the newly created pod.
Made the following changes:
- Added more logs to help debug future failures
- Poll metrics a few additional times before failing the test
fixes #53150
Automatic merge from submit-queue (batch tested with PRs 50988, 50509, 52660, 52663, 52250). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added device plugin e2e kubelet failure test
Signed-off-by: Renaud Gaubert <renaud.gaubert@gmail.com>
**What this PR does / why we need it**:
This is part of issue #52859 (fixes #52859)
This PR adds an e2e_node test for the device plugin.
Specifically, it tests failure handling by the device plugin components when the kubelet restarts or crashes.
I might try to refactor the GPU tests in a later PR.
**Special notes for your reviewer**:
@jiayingz @vishh
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52990, 53064, 52686, 52221, 53069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Allow kubelet metrics tests to run on gke
**What this PR does / why we need it**:
On GKE, you can still access kubelet metrics, so allow the kubelet metrics test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
NONE
Automatic merge from submit-queue (batch tested with PRs 52721, 53057, 52493, 52998, 52896). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move deployment collision avoidance e2e test to integration
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref #52113
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
In autoscaling tests, add waiting for new pool to become ready
This adds the missing timeout when adding a node pool in the GKE scale-to-0 test and improves error logging when enabling autoscaling.
Automatic merge from submit-queue (batch tested with PRs 51648, 53030, 53009). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fixed intermittent e2e aggregator test on GKE.
**What this PR does / why we need it**: Issue was caused by another test cleaning up its namespace.
This caused the namespace controller to try to clean up that namespace.
This involves deleting all flunders under that namespace.
However the sample-apiserver was not honoring the namespace filter.
So the flunders for the test would randomly disappear.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50945
**Special notes for your reviewer**: Requires we fix the container image to contain this fix to work.
**Release note**:
```release-note
NONE
```
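For illustration only, here is a minimal Go sketch of a collection delete that honors the namespace filter. It is not the sample-apiserver code: the `flunder` struct and the `deleteCollection` helper are hypothetical stand-ins, and the namespace names are made up.

```go
package main

import "fmt"

// flunder is a hypothetical, simplified stand-in for the sample-apiserver resource.
type flunder struct {
	namespace string
	name      string
}

// deleteCollection returns the store with every flunder in the given
// namespace removed. Flunders in other namespaces are kept, which is what
// the namespace controller relies on when it cleans up a single namespace.
func deleteCollection(store []flunder, namespace string) []flunder {
	var kept []flunder
	for _, f := range store {
		if f.namespace != namespace {
			kept = append(kept, f)
		}
	}
	return kept
}

func main() {
	store := []flunder{
		{namespace: "other-test", name: "a"},
		{namespace: "aggregator-test", name: "b"},
	}
	// Cleaning up "other-test" must not touch the aggregator test's flunder.
	fmt.Println(deleteCollection(store, "other-test"))
}
```

Without the namespace check in the loop, the cleanup of any namespace would delete every flunder, which matches the symptom described above.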
Fixes issues/50945.
Fixed image path to pick up newly built fixes from this PR.
Automatic merge from submit-queue (batch tested with PRs 51759, 53001, 52806). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix broken statefulset e2e test
**What this PR does / why we need it**:
Fixes the CockroachDB statefulset e2e test.
This was broken back in #43637 when the logic in
`(*StatefulSetTester).CreateStatefulSet` switched from using
`generated.ReadOrDie` to read the entire service.yaml file and pass it
to kubectl to using `manifest.SvcFromManifest`, which assumes that the
file contains only a single service.
To fix the test, just remove the second service, which isn't needed to test the Statefulset functionality.
**Which issue this PR fixes**:
Fixes #52750
**Special notes for your reviewer**:
N/A
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51067, 52319, 52803, 52961, 51972). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add support for skeleton in GetSigner
Adding support for skeleton to GetSigner to be able to run
e2e tests against a bare metal multinode cluster.
Closes #35613
Automatic merge from submit-queue (batch tested with PRs 51067, 52319, 52803, 52961, 51972). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Move prometheus metrics for docker operations into dockershim
Automatic merge from submit-queue (batch tested with PRs 52905, 52766). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Refactor parsing cluster autoscaler status, add logging error
Minor improvements to autoscaling test suite and e2e framework.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix GCE LB resource cleanup for service e2e tests.
**What this PR does / why we need it**: Fix GCE LB resource cleanup logic.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52347
**Special notes for your reviewer**:
/assign @shyamjvs @nicksardo
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52880, 52855, 52761, 52885, 52929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Don't need to check useAnnotation in dns e2e test
**What this PR does / why we need it**:
hostname/subdomain annotations were removed in #44137. This PR removes the check.
Also, `var dnsServiceLabelSelector` is not used anymore.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
ref: https://github.com/kubernetes/kubernetes/pull/44137
**Special notes for your reviewer**:
/cc @bowei @MrHohn
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Allow dns e2e test case for ExternalName to run on aws
**What this PR does / why we need it**:
#52840 uses the allocated clusterIP instead of a hard-coded one, so we no longer need to care about the clusterIP range in the CI job config. Let the test run on pull-kubernetes-e2e-kops-aws.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47224
**Special notes for your reviewer**:
ref: https://github.com/kubernetes/test-infra/pull/4462
/cc @bowei @MrHohn @justinsb
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
bazel: build/test almost everything
**What this PR does / why we need it**: Miscellaneous cleanups and bug fixes. The main motivating idea here was to make `bazel build //...` and `bazel test //...` mostly work. (There are a few reasons these still don't work, but we're a lot closer.)
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @BenTheElder @mikedanese @spxtr
Automatic merge from submit-queue (batch tested with PRs 52469, 52574, 52330, 52689, 52829). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fixing E2E Test - After restarting kubelet test expects node's status to be NotReady
**What this PR does / why we need it**:
This PR fixes the e2e tests that involve restarting the kubelet. After the kubelet is restarted, the test expected the node to reach the NotReady state.
After restarting the kubelet, we should instead wait for some time and then check that the node's status is Ready.
The node should not be checked for the NotReady state after the kubelet restarts.
**Which issue this PR fixes**
fixes # https://github.com/vmware/kubernetes/issues/285
**Special notes for your reviewer**:
@BaluDontu @rohitjogvmw @tusharnt
Test logs before fix
-----
STEP: Restarting kubelet
Sep 15 11:26:32.768: INFO: Attempting sudo systemctl restart kubelet
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: command: sudo systemctl restart kubelet
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: stdout: ""
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: stderr: ""
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: exit code: 0
Sep 15 11:26:33.002: INFO: Waiting up to 1m0s for node kubernetes-node2 condition Ready to be false
Sep 15 11:26:33.012: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:35.023: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:37.032: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:39.041: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:41.051: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:43.061: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:45.070: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:47.080: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:49.093: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:51.105: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:53.117: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:55.128: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:57.140: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:59.151: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:01.158: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:03.167: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:05.180: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:07.188: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:09.210: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:11.221: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:13.231: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:15.240: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:17.249: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:19.263: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:21.272: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:23.283: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:25.309: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:27.317: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:29.327: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:31.342: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:33.343: INFO: Node kubernetes-node2 didn't reach desired Ready condition status (false) within 1m0s
Sep 15 11:27:33.343: INFO: Node kubernetes-node2 failed to enter NotReady state
[AfterEach] [sig-storage] PersistentVolumes:vsphere
Test logs after fix
-----
STEP: Restarting kubelet
Sep 18 15:40:49.066: INFO: Checking if sudo command is present
Sep 18 15:40:49.342: INFO: Checking if systemctl command is present
Sep 18 15:40:49.573: INFO: Attempting `sudo systemctl status kubelet | grep 'Main PID'`
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: command: sudo systemctl status kubelet | grep 'Main PID'
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: stdout: " Main PID: 19715 (docker)\n"
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: stderr: ""
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: exit code: 0
Sep 18 15:40:49.733: INFO: Attempting `sudo systemctl restart kubelet`
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: command: sudo systemctl restart kubelet
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: stdout: ""
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: stderr: ""
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: exit code: 0
Sep 18 15:40:49.988: INFO: Attempting `sudo systemctl status kubelet | grep 'Main PID'`
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: command: sudo systemctl status kubelet | grep 'Main PID'
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: stdout: " Main PID: 25021 (docker)\n"
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: stderr: ""
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: exit code: 0
Sep 18 15:40:50.158: INFO: Noticed that kubelet PID is changed. Waiting for 30 Seconds for Kubelet to come back
Sep 18 15:41:20.159: INFO: Waiting up to 1m0s for node kubernetes-node4 condition Ready to be true
STEP: Testing that written file is accessible.
Sep 18 15:41:20.191: INFO: Running '/Users/divyenp/github/vmware/kubernetes/_output/dockerized/bin/darwin/amd64/kubectl --server=https://10.162.0.45 --kubeconfig=/Users/divyenp/.kube/config exec --namespace=e2e-tests-pv-9j8j0 pvc-tester-3t9ds -- /bin/sh -c cat /mnt/_SUCCESS'
Sep 18 15:41:20.855: INFO: stderr: ""
Sep 18 15:41:20.855: INFO:
Sep 18 15:41:20.855: INFO: Volume mount detected on pod pvc-tester-3t9ds and written file /mnt/_SUCCESS is readable post-restart.
**Release note**:
```release-note
NONE
```
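A minimal Go sketch of the "wait, then expect Ready" pattern described in this PR. This is not the e2e framework code: `waitForCondition` is a hypothetical helper, and the fake node that becomes Ready on the third poll is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForCondition polls check every interval until it returns true or the
// timeout elapses. It is a hypothetical stand-in for the framework's
// node-condition wait.
func waitForCondition(check func() bool, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if check() {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for condition")
		}
		time.Sleep(interval)
	}
}

func main() {
	// Hypothetical node that reports Ready on the third poll after a restart.
	polls := 0
	nodeReady := func() bool {
		polls++
		return polls >= 3
	}
	// Wait for Ready == true instead of expecting the node to go NotReady.
	err := waitForCondition(nodeReady, 10*time.Millisecond, time.Second)
	fmt.Println("ready:", err == nil, "polls:", polls)
}
```

Waiting for the positive condition avoids the failure in the "before" logs, where the kubelet came back quickly enough that the node never reported NotReady.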
Automatic merge from submit-queue (batch tested with PRs 52485, 52443, 52597, 52450, 51971). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Removing PrometheusPushGateway --prom-push-gateway flag from e2e tests.
**What this PR does / why we need it**: Removing obsolete PrometheusPushGateway --prom-push-gateway flag from e2e tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #45947
**Special notes for your reviewer**:
**Release note**:
```release-note
Removing `--prom-push-gateway` flag from e2e tests
```
Automatic merge from submit-queue (batch tested with PRs 50392, 52108, 52083, 52134, 51526). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
e2e: minor changes to network/service testing utils
Add more logging to help debug. Also refactor several functions to improve
reusability.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
should use time.Since instead of time.Now().Sub
**What this PR does / why we need it**:
should use time.Since instead of time.Now().Sub
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
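For reference, a small Go sketch of the two forms. `time.Since(t)` is documented as shorthand for `time.Now().Sub(t)`; the `measure` helper below is hypothetical and exists only to show both side by side.

```go
package main

import (
	"fmt"
	"time"
)

// measure runs work and reports the elapsed time using both spellings.
func measure(work func()) (viaSub, viaSince time.Duration) {
	start := time.Now()
	work()
	viaSub = time.Now().Sub(start) // the form this PR replaces
	viaSince = time.Since(start)   // the preferred shorthand
	return viaSub, viaSince
}

func main() {
	viaSub, viaSince := measure(func() { time.Sleep(20 * time.Millisecond) })
	fmt.Println(viaSub, viaSince)
}
```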
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Debug for issues #50945
Aggregator e2e test is intermittently failing on GKE but not GCE.
Adding the following debugging to help trace the issue.
Make sure we always use the same rest client.
Randomly generate the flunder resource name to detect parallel tests.
Print endpoints for sample-system in case multiple instances.
Print original and new pods in case the pod has been restarted.
**What this PR does / why we need it**: Adds debugging for aggregator e2e test to track down GKE flakiness.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50945
**Special notes for your reviewer**: This is primarily additional debugging information.
**Release note**:
```release-note
NONE
```
Fixed import list.
Remove rand seed.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Don't specify clusterIP in dns e2e test
**What this PR does / why we need it**:
Different upgrade tests may configure different service clusterIP ranges. If we specify the clusterIP in dns e2e test, it will succeed in one upgrade test but fail in another. This PR doesn't specify clusterIP. It just uses the allocated clusterIP.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50274
**Special notes for your reviewer**:
Hopefully this really fixes that issue.
/cc @thockin @MrHohn
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52843, 52710, 52821, 52844). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
improve retrying logic when checking CA status
This should reduce the flake rate in cluster size autoscaling test suite.
Automatic merge from submit-queue (batch tested with PRs 48406, 52819). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fixed nil dereference in dynamic provisioning e2e tests
**What this PR does / why we need it**: Fixed nil dereference in dynamic provisioning e2e tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52815
**Release note**:
```release-note-none
NONE
```
/sig storage
/assign @saad-ali
/cc @wongma7
/release-note-none
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Retry if possible while creating latency pods in density test
Saw the [last run](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/37) of density test on 5k-node fail due to it:
```
Expected error:
<*errors.StatusError | 0xc44f2fd7a0>: {
ErrStatus: {
TypeMeta: {Kind: "", APIVersion: ""},
ListMeta: {SelfLink: "", ResourceVersion: "", Continue: ""},
Status: "Failure",
Message: "timeout",
Reason: "",
Details: nil,
Code: 500,
},
}
timeout
not to have occurred
```
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Support kubernetes-anywhere provider
**What this PR does / why we need it**:
Implements a new `kubernetes-anywhere` provider to allow upgrade testing in the e2e binary. This is the final step to allow https://github.com/kubernetes/test-infra/pull/4495 and https://github.com/kubernetes/kubernetes-anywhere/pull/450.
**Which issue this PR fixes**:
https://github.com/kubernetes/kubeadm/issues/311
**Special notes for your reviewer**:
Some questions I had
- Does the `--provider` flag specified [here](dbbf6261e0/jobs/config.json (L8587)) get sent to the flag defined [here](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/framework/test_context.go#L219)? Or should I add another `--provider` flag inside `--upgrade_args` like this: `--upgrade_args=... --provider=kubernetes-anywhere`?
- Is it necessary to add waiting logic after the `make` command, or will it implicitly handle that by itself?
Some other points:
- I chose `sed` to manipulate the current kubernetes-anywhere `.config` rather than duplicating another [`anywhere.go`](https://github.com/kubernetes/test-infra/blob/master/kubetest/anywhere.go). One suggestion was to use `jq` but since the config on disk is not serialized to JSON yet, I'm not sure how that'd work.
- Since I don't have a GCE/GKE account or vCenter, I can't actually verify the e2e binary works. I've managed to build it, but if somebody could quickly run a smoke test, I'd appreciate it. This is my first poke around test-infra and e2e, so there might be some plumbing missing
/cc @jessicaochen @luxas @pipejakob @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 52500, 52533). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add mount options e2e test
**What this PR does / why we need it**: A test for newly added StorageClass.mountOptions and PV.mountOptions: provision a pv using a class with its storageclass.mountoptions set, and the end result should be that the mount options can be seen from the mounter.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: Fixes #52138
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
In autoscaling integration test, use allocatable instead of capacity for node memory
This makes the remaining cluster autoscaling test (integration test of HPA and CA working together to scale up the cluster) use node allocatable resources when computing how much memory we need to consume in order to trigger scale up/prevent scale down. Follow up to #52650 as that one is already merging.
cc @wasylkowski
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
In cluster size autoscaling tests, use allocatable instead of capacity for node memory
This makes cluster size autoscaling e2e tests use node allocatable resources when computing how much memory we need to consume in order to trigger scale up/prevent scale down. It should fix failing tests in GKE.
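To illustrate why the choice matters, here is a minimal Go sketch comparing the two figures. The `node` struct, the helper names, and the memory numbers are all hypothetical; the real tests read these values from the node status.

```go
package main

import "fmt"

// node is a hypothetical, simplified view of a node's memory figures in MB.
type node struct {
	capacityMB    int64 // total memory on the machine
	allocatableMB int64 // memory left for pods after system reservations
}

// capacityMemoryMB sums raw capacity, the figure the tests used before.
func capacityMemoryMB(nodes []node) int64 {
	var total int64
	for _, n := range nodes {
		total += n.capacityMB
	}
	return total
}

// allocatableMemoryMB sums allocatable memory, the figure the tests use now.
func allocatableMemoryMB(nodes []node) int64 {
	var total int64
	for _, n := range nodes {
		total += n.allocatableMB
	}
	return total
}

func main() {
	// Made-up numbers: each node reserves part of its memory for the system.
	nodes := []node{
		{capacityMB: 7500, allocatableMB: 5700},
		{capacityMB: 7500, allocatableMB: 5700},
	}
	fmt.Println("capacity:", capacityMemoryMB(nodes), "allocatable:", allocatableMemoryMB(nodes))
}
```

When the gap between the two totals is large, a memory reservation sized from capacity no longer matches what pods can actually be scheduled onto, so the scale-up or scale-down expectation can be off.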
Automatic merge from submit-queue (batch tested with PRs 52350, 52659). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add e2e test for storageclass.reclaimpolicy
**What this PR does / why we need it**: Adds another dynamic provisioning test where the storageclass.reclaimpolicy == retain. Have to manually delete the PV at the end of the test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: https://github.com/kubernetes/kubernetes/issues/52138
**Special notes for your reviewer**: I have not tested it yet, but it's ready for review. I will comment and edit this once I've verified that it actually works.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Bugfix: Fix e2e Flaky Apps/Job BackoffLimit test
This fix is linked to PR #51153, which introduced `JobSpec.BackoffLimit`.
Previously, the timeout used in the test was too aggressive and produced flaky test runs. The test now uses the default `framework.JobTimeout` used in other tests.
**What this PR does / why we need it**:
This PR should fix flaky "[sig-apps] Job should exceed backoffLimit" test, due to a too short timeout duration.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes #51153
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 51824, 50476, 52451, 52009, 52237)
Improve apiserver metrics reporting
Normalize "WATCHLIST" to "WATCH", add "scope" to the other metrics (listing 50k pods is != listing pods in a namespace), and add a new scope "resource" to cover individual resource calls.
This roughly aligns metrics with our ACL model (technically resource scope is GET, but POST to a subresource and POST to a namespace are not the same thing).
```release-note
WATCHLIST calls are now reported as WATCH verbs in prometheus for the apiserver_request_* series. A new "scope" label is added to all apiserver_request_* values that is either 'cluster', 'resource', or 'namespace' depending on which level the query is performed at.
```
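A rough Go sketch of the labeling described above. The verb normalization follows the release note; how the scope is derived from a request is an assumption here (`requestScope` is a hypothetical helper that looks only at whether a namespace and a resource name are present), not the apiserver's actual implementation.

```go
package main

import "fmt"

// normalizeVerb reports WATCHLIST calls as WATCH, as the release note states.
func normalizeVerb(verb string) string {
	if verb == "WATCHLIST" {
		return "WATCH"
	}
	return verb
}

// requestScope is a hypothetical helper: a request naming a single object is
// "resource" scoped, a request limited to a namespace is "namespace" scoped,
// and anything else is "cluster" scoped.
func requestScope(namespace, name string) string {
	switch {
	case name != "":
		return "resource"
	case namespace != "":
		return "namespace"
	default:
		return "cluster"
	}
}

func main() {
	// Listing all pods in the cluster vs. listing pods in one namespace.
	fmt.Println(normalizeVerb("LIST"), requestScope("", ""))
	fmt.Println(normalizeVerb("LIST"), requestScope("default", ""))
	// A watch on a namespace, reported as WATCH rather than WATCHLIST.
	fmt.Println(normalizeVerb("WATCHLIST"), requestScope("default", ""))
}
```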
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)
Add more tests for pod preemption
**What this PR does / why we need it**:
Adds more e2e and integration tests for pod preemption.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
This PR is based on #50949. Only the last commit is new.
**Release note**:
```release-note
NONE
```
ref/ #47604
@kubernetes/sig-scheduling-pr-reviews @davidopp
Automatic merge from submit-queue
[fluentd-gcp addon] Remove some e2e tests out of blocking suites
Fixes https://github.com/kubernetes/kubernetes/issues/52433
Some Stackdriver Logging e2e tests are broken in release-blocking suites:
- Due to the change in Docker 1.13, on some systems logs are automatically split by 16K chunks. This PR removes an e2e test that assumes otherwise
- In large clusters, it's not possible to ingest system logs from all nodes
Since it's not a Kubernetes problem per se, mitigating this by removing these tests from blocking suites.
Automatic merge from submit-queue
Fix failing autoscaling test in GKE
This should fix `[sig-autoscaling] Cluster size autoscaling [Slow] should increase cluster size if pending pods are small and there is another node pool that is not autoscaled [Feature:ClusterSizeAutoscalingScaleUp]` by getting a list of nodes from GKE nodepool in a different way (filtering nodes by labels.) Currently, gcloud command used for it is failing, as we only have GKE node pool name in the test and not the actual MIG name.
Automatic merge from submit-queue (batch tested with PRs 52376, 52439, 52382, 52358, 52372)
Remove the conversion of client config
It was needed because the clientset code in client-go was a copy of the clientset code in Kubernetes. client-go is authoritative now, so we can remove the nasty copy.
This fix is linked to PR #51153, which introduced
JobSpec.BackoffLimit.
Previously, the timeout used in the test was too aggressive and produced
flaky test runs. The test now uses the default framework.JobTimeout used
in other tests.
Automatic merge from submit-queue
Make CPU constraint for l7-lb-controller in density test scale with #nodes
Just noticed that we changed the memory last time, but didn't change cpu. From the last run:
```
Sep 13 04:25:03.360: INFO: Unexpected error occurred: Container l7-lb-controller-v0.9.6-gce-scale-cluster-master/l7-lb-controller is using 0.642709233/0.15 CPU
```
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)
Extends GPUDevicePlugin e2e test to exercise device plugin restarts.
**What this PR does / why we need it**:
This is part of issue #52189 but does not fix it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)
[fluentd-gcp addon] Trim too long log entries due to Stackdriver limitations
Stackdriver doesn't support log entries bigger than 100KB, so by default fluentd plugin just drops such entries. To avoid that and increase the visibility of this problem it's suggested to trim long lines instead.
/cc @igorpeshansky
```release-note
[fluentd-gcp addon] Fluentd will trim lines exceeding 100KB instead of dropping them.
```
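The idea of trimming instead of dropping can be sketched as follows. This is an illustration in Go, not the fluentd plugin configuration this PR changes: `trimEntry` is a hypothetical helper, and reading "100KB" as 100*1024 bytes is an assumption.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxEntryBytes assumes the 100KB limit means 100*1024 bytes.
const maxEntryBytes = 100 * 1024

// trimEntry returns entry unchanged if it fits within limit bytes, and
// otherwise cuts it at or below limit without splitting a multi-byte UTF-8
// character. limit is assumed to be non-negative.
func trimEntry(entry string, limit int) string {
	if len(entry) <= limit {
		return entry
	}
	cut := limit
	// Back up until the cut point is at the start of a character.
	for cut > 0 && !utf8.RuneStart(entry[cut]) {
		cut--
	}
	return entry[:cut]
}

func main() {
	long := make([]byte, maxEntryBytes+500)
	for i := range long {
		long[i] = 'x'
	}
	trimmed := trimEntry(string(long), maxEntryBytes)
	fmt.Println(len(long), "->", len(trimmed))
}
```

Trimming keeps the start of an oversized entry visible in the backend, whereas dropping it leaves no trace that the line ever existed.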
Automatic merge from submit-queue
Version gates the ephemeral storage e2e test
Version gates the ephemeral storage e2e test.
**Release note**:
```
NONE
```
@kubernetes/sig-testing-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)
StatefulSet: Deflake e2e RunHostCmd more.
It turns out that at some points while the Node is recovering from a reboot, we get a different kind of error ("unable to upgrade connection"). Since we can't distinguish these transient errors from an error encountered after successfully executing the remote command, let's just retry all errors for 5min. If this doesn't work, I'm gonna blame it on sig-node.
ref #48031
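A minimal Go sketch of the "retry every error until a deadline" approach described above. It is not the e2e framework's `RunHostCmd` code: `retryUntil`, `errTransient`, and the fake command in `main` are hypothetical, and the short durations stand in for the 5 minute window.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient mimics the kind of error seen while a node is recovering.
var errTransient = errors.New("unable to upgrade connection")

// retryUntil calls run until it succeeds or timeout elapses. Every error is
// retried, because transient connection errors cannot be told apart from
// errors returned by the remote command itself.
func retryUntil(timeout, interval time.Duration, run func() (string, error)) (string, error) {
	deadline := time.Now().Add(timeout)
	for {
		out, err := run()
		if err == nil {
			return out, nil
		}
		if time.Now().After(deadline) {
			return out, fmt.Errorf("giving up after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Hypothetical command that fails twice while the node reboots.
	attempts := 0
	out, err := retryUntil(time.Second, 10*time.Millisecond, func() (string, error) {
		attempts++
		if attempts < 3 {
			return "", errTransient
		}
		return "ok", nil
	})
	fmt.Println(out, err, attempts)
}
```

The trade-off is that a genuinely failing command now takes the full timeout to surface, which the description above accepts in exchange for fewer flakes.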
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)
Port Guestbook tests to multiarch
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52232
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50289, 52106)
Fix AppArmor test at scale
**What this PR does / why we need it**:
The AppArmor test only runs on a single node, but previously was loading the necessary profiles to every node. This caused unnecessary churn in very large clusters, so this PR updates the test to only load the profiles to a single node, and ensure the test pod is run on that node (using pod affinity).
**Which issue this PR fixes**: fixes #51791
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52227, 52120)
Fix discovery restmapper finding resources in non-preferred versions
Fixes: #52219
Also reverts behavioral changes to tests that version-qualified cronjobs to work around this issue.
The discovery rest mapper was only populating the priority rest mapper's search list with preferred groupversions.
That meant that if a resource existed in multiple non-preferred versions, AND did not exist in the preferred version (like cronjob, which only exists in v1beta2.batch and v2alpha1.batch, but not v1.batch), the priority restmapper would not find it in its group/version priority list, and would return an error.
```release-note
Fixed an issue looking up cronjobs when they existed in more than one API version
```
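The lookup problem described above can be sketched as follows. This is a simplified, hypothetical model (`groupInfo`, `searchOrder`, and `findVersion` are not the real restmapper types or functions), using the version names as given in the description. The point is that the search list has to include non-preferred versions after the preferred one.

```go
package main

import "fmt"

// groupInfo is a hypothetical, simplified stand-in for discovery data
// about one API group.
type groupInfo struct {
	preferred string              // preferred version of the group
	order     []string            // all versions, in discovery order
	resources map[string][]string // version -> resources served
}

// searchOrder lists the preferred version first, followed by every other
// version, so resources absent from the preferred version can be found.
func searchOrder(g groupInfo) []string {
	out := []string{g.preferred}
	for _, v := range g.order {
		if v != g.preferred {
			out = append(out, v)
		}
	}
	return out
}

// findVersion returns the first version in search order serving resource.
func findVersion(g groupInfo, resource string) (string, bool) {
	for _, v := range searchOrder(g) {
		for _, r := range g.resources[v] {
			if r == resource {
				return v, true
			}
		}
	}
	return "", false
}

func main() {
	batch := groupInfo{
		preferred: "v1",
		order:     []string{"v1", "v1beta2", "v2alpha1"},
		resources: map[string][]string{
			"v1":       {"jobs"},
			"v1beta2":  {"cronjobs"},
			"v2alpha1": {"cronjobs"},
		},
	}
	v, ok := findVersion(batch, "cronjobs")
	fmt.Println(v, ok)
}
```

If `searchOrder` returned only the preferred version, the `cronjobs` lookup in this model would fail, which mirrors the bug being fixed.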
Automatic merge from submit-queue
Extend nvidia-gpus e2e test to include a device plugin based test
**What this PR does / why we need it**:
This is needed to verify device plugin feature.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/features/issues/368
**Special notes for your reviewer**:
Related test_infra PR: https://github.com/kubernetes/test-infra/pull/4265
**Release note**:
Add an e2e test for nvidia gpu device plugin
Automatic merge from submit-queue
Add pod preemption to the scheduler
**What this PR does / why we need it**:
This is the last of a series of PRs to add priority-based preemption to the scheduler. This PR connects the preemption logic to the scheduler workflow.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48646
**Special notes for your reviewer**:
This PR includes other PRs which are under review (#50805, #50405, #50190). All the new code is located in 43627afdf9.
**Release note**:
```release-note
Add priority-based preemption to the scheduler.
```
ref/ #47604
/assign @davidopp
@kubernetes/sig-scheduling-pr-reviews
Automatic merge from submit-queue
Pipe in upgrade image target for kube-proxy migration tests
**What this PR does / why we need it**:
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-upgrade-kube-proxy-ds&width=20
and
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-downgrade-kube-proxy-ds&width=20
are still failing.
Reproduced it locally and found that the node image defaults to debian during upgrade (it was gci before the upgrade) because we don't pass in `gci` via `--upgrade-target`. And for some reason (not figured out yet), the upgraded node uses the debian image with gci startup scripts...
This PR pipes in `--upgrade-target` for kube-proxy migration tests, hopefully in conjunction with https://github.com/kubernetes/test-infra/pull/4447 it will bring the tests back to normal.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE
**Special notes for your reviewer**:
Sorry for bothering again.
/assign @krousey
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52097, 52054)
Move paused deployment e2e tests to integration
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #52113
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
StatefulSet: Deflake e2e RunHostCmd.
The initial retry up to 20s was giving up too soon. I'm seeing this test flake because the Node rebooted and it takes ~2min to recover. Now StatefulSet RunHostCmd calls will use the same 5min timeout as with other Pod state checks.
ref #48031
Automatic merge from submit-queue (batch tested with PRs 51839, 51987)
Disable rbac/v1alpha1, settings/v1alpha1, and scheduling/v1alpha1 by default
**What this PR does / why we need it**: Disables alpha features which were previously enabled by default. Also changes tests which relied on these alpha features being enabled by default.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47691
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixed a bug where some alpha features were enabled by default.
```
The feature is still Alpha and, at times, the IP address previously used
by the load balancer in the test is not completely freed even after
the load balancer is long gone. In this case, the test URL with the IP
returns a 404 response. Tolerate this error and retry until the new
load balancer is fully established.
Charge object count when the object is created, whether or not the
object is initialized.
Charge the remaining quota when the object is initialized.
Also check initializer.Pending and initializer.Result when
determining if an object is initialized. We didn't need to check them
previously because, before 51082, having 0 pending initializers and a nil
initializers.Result was invalid.
Automatic merge from submit-queue (batch tested with PRs 51733, 51838)
Decouple kube-proxy upgrade/downgrade tests from upgradeTests
**What this PR does / why we need it**:
Fixes the failing kube-proxy migration CI jobs:
- https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-upgrade-kube-proxy-ds
- https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-downgrade-kube-proxy-ds
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51729
**Special notes for your reviewer**:
/assign @krousey @nicksardo
Could you please take a look post code-freeze (I believe it is fixing things)? Thanks!
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51733, 51838)
Relax update validation of uninitialized pod
Split from https://github.com/kubernetes/kubernetes/pull/50344
Fix https://github.com/kubernetes/kubernetes/issues/47837
* Let the podStrategy call only `validation.ValidatePod()` if the old pod is not initialized, so fields are mutable.
* Let the podStatusStrategy refuse updates if the old pod is not initialized.
cc @smarterclayton
```release-note
Pod spec is mutable when the pod is uninitialized. The apiserver requires the pod spec to be valid even if it's uninitialized. Updating the status field of uninitialized pods is invalid.
```