Automatic merge from submit-queue (batch tested with PRs 48168, 48199)
Fix some flakes in autoscaler e2e on gke
This PR should fix some of the flakes we found in e2e runs, while testing for 1.7 release:
- if one of the nodes is unschedulable in set up (causing set up to fail) we used to wait for wrong number of nodes in clean-up, adding unnecessary 20 minute wait to failing test
- we did not check for errors when creating RC in test, leading to tests failing later in hard to debug way (added retry loop and explicit test failure)
Automatic merge from submit-queue (batch tested with PRs 48004, 48205, 48130, 48207)
Add e2e tests for CA scale up when pending pod requests volume
Test verifying pending pods with PVC don't interfere with scale up, issue: kubernetes/autoscaler#22
Automatic merge from submit-queue
Further reduce cluster-autoscaler e2e flakiness
Ref: https://github.com/kubernetes/autoscaler/issues/89
Add pdbs for additional kube-system pod, move adding
pdbs to separate function, as it will need to be reused
in new tests we're working on (ex. scale to 0).
Automatic merge from submit-queue
Added an e2e test timing HPA + CA scaling up from 1 to 8 pods and from 3 to >=4 clusters
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46847
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add new e2e test for cluster size autoscaler (evicting system pods)
This test verifies that cluster autoscaler drains nodes with system pods running if they have a PDB.
Automatic merge from submit-queue (batch tested with PRs 46835, 46856)
Made tests that create Horizontal Pod Autoscaler delete it after they are done.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46847
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43844, 44284)
Add a retry to cluster-autoscaler e2e
This should fix https://github.com/kubernetes/kubernetes/issues/44268.
The flake was caused by following sequence of events:
1. Cluster was at minimum size (3), some node was unneeded for a while.
2. Setup for some test (scale-down, failure) would increase node group size (to 5) and wait for new nodes to come up.
3. As soon as new node come up (cluster size 4) CA would scale-down the old unneeded node (setting node group size to 4).
4. Node group would not reach size 5 (as the target was now 4) and the test would timeout and fail.
This PR makes the setup monitor re-set the target node group size if the above scenario happens.