Automatic merge from submit-queue (batch tested with PRs 40055, 42085, 44509, 44568, 43956)
improve error handling in e2e helpers
**What this PR does / why we need it**:
Changes most of the volume related helper funcs to return error rather than calling `Expect`. This is a better programming practice, is consistent with Go and Kubernetes, and allows helper funcs that create multiple resources to perform cleanup.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40055, 42085, 44509, 44568, 43956)
Change the default CLUSTER_IP_RANGE used by e2e
The existing choice intersects with the range reserved for auto
subnets and cannot be used with some GCP features.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44364, 44361, 42498)
Move v1 helpers
The first 3 commits are other PRs.
This PR move pkg/api/v1/helpers.go to a subpackage, which is almost symmetric to #44296, where pkg/api/helpers.go was moved.
This PR is mostly mechanic, except that
1. moved the 3 methods of Taint and Toleration to pkg/api/methods.go
2. moved constants and types defined in v1/helpers.go to pkg/api/v1/annotataion_key_constants.go and nonstandard_types.go
3. updated staging/copy.sh to copy pkg/api/helpers to client-go, it's otherwise removed from client-go because no other code in client-go depends on the package. Some test code in pkg/controller imports client-go/pkg/api/helpers. After moving api types to its own repo, we can remove these copies of utility function from client-go and ask users to use the ones in the main repo.
(This PR breaks a cyclic import problem i met when I tried to move global variables pkg/api/Scheme and Registry to a subpackage)
Automatic merge from submit-queue (batch tested with PRs 44362, 44421, 44468, 43878, 44480)
use sudo in mv ssh cmd
**What this PR does / why we need it**:
Fixes _HostCleanup_ kubelet e2e test where `sudo` was missing from a ssh `mv` command.
The exact test is:
`host cleanup with volume mounts [Volume][HostCleanup][Flaky] Host cleanup after disrupting NFS volume [NFS] move NFS client pod's UID directory then delete pod` This test has been consistently red in the gce-flaky suite.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44440, 44038, 44302, 44316, 43876)
Make resource gatherer get data about etcd resource usage in kubemark…
… setup
Automatic merge from submit-queue (batch tested with PRs 44424, 44026, 43939, 44386, 42914)
Try in cluster config when input KubeConfig is empty
**What this PR does / why we need it**:
Allows for downstream providers to run e2es "incluster" sans kubeconfig
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
/cc @kubernetes/sig-testing-pr-reviews @marun
Automatic merge from submit-queue
Segregate storage e2es for easier assignment and dep isolation.
**What this PR does / why we need it**:
Follow on from slack conversations to isolate storage deps and have easier assignment of e2es.
As of today there is general consensus that the e2es need to be broken apart, this is 1st iteration on the sig-storage test splitting. Follow on work is needed here.
**Release note**:
```
NONE
```
/cc @saad-ali @kubernetes/sig-storage-pr-reviews @kubernetes/sig-testing-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 44406, 41543, 44071, 44374, 44299)
Decouple remotecommand
Refactored unversioned/remotecommand to decouple it from undesirable dependencies:
- term package now is not required, and functionality required to resize terminal size can be plugged in directly in kubectl
- in order to remove dependency on kubelet package - constants from kubelet/server/remotecommand were moved to separate util package (pkg/util/remotecommand)
- remotecommand_test.go moved to pkg/client/tests module
Automatic merge from submit-queue (batch tested with PRs 44447, 44456, 43277, 41779, 43942)
Clean up pre-ControllerRef compatibility logic
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#43323
**Special notes for your reviewer**:
No
**Release note**:
```
NONE
```
Automatic merge from submit-queue
add kubelet tests to verify host clean up
**What this PR does / why we need it**:
Increasingly we're seeing more failures in persistent volume e2e tests where pv tests are run in parallel with disruptive tests. The quick solution is to tag the pv tests as Flaky. This pr addresses one cause of the flakiness and adds a disruptive kubelet test.
Once this pr is shown to not produce flakes the [Flaky] tag for the "HostCleanup" tests will be removed in a separate pr.
+ Adds volume tests to _kubelet.go_ motivated by issues [31272](https://github.com/kubernetes/kubernetes/issues/31272) and [37657](https://github.com/kubernetes/kubernetes/issues/37657)
+ Addresses reverted pr [41178](https://github.com/kubernetes/kubernetes/pull/41178) and negates the need for pr [41229](https://github.com/kubernetes/kubernetes/pull/41229)
**Which issue this PR fixes**
Adds regression tests to cover issues: #31272 and #37657
**Special notes for your reviewer**:
It's possible that one of the new tests, which relies on the existence of _/usr/sbin/rpc.nfsd_ in the nfs-server pod, will not work in the GCI container env. If this turns out to be true then I will add a `SkipIfProviderIs("gke")` to the `It` block.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
[e2e] Bump up pod deletion time for source pod IP test
From #44225.
Source pod IP e2e test is pretty flaky lately, and most of the failures seem to be timeout waiting for pod "kube-proxy-mode-detector" to disappear. Didn't found any other thing suspicious.
This PR bumps pod deletion timeout to DefaultPodDeletionTimeout, which is 3 minutes. Hopefully it will mitigate the flakes.
/assign @freehan
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
local up dns defaults/Privileged defaults so that [Conformance] sets mostly pass on local clusters.
Fixes#43651 So that only 4 tests fail out of the box.
Automatic merge from submit-queue
Adding load balancer src cidrs to GCE cloudprovider
**What this PR does / why we need it**:
As of January 31st, 2018, GCP will be sending health checks and l7 traffic from two CIDRs and legacy health checks from three CIDS. This PR moves them into the cloudprovider package and provides a flag for override.
Another PR will need to be address firewall rule creation for external L4 network loadbalancing #40778
**Which issue this PR fixes**
Step one of #40778
Step one of https://github.com/kubernetes/ingress/issues/197
**Release note**:
```release-note
Add flags to GCE cloud provider to override known L4/L7 proxy & health check source cidrs
```
Automatic merge from submit-queue
In 'kubectl describe', find controllers with ControllerRef, instead of showing the original creator
@enisoc @kargakis @kubernetes/sig-apps-pr-reviews @kubernetes/sig-cli-pr-reviews
```release-note
In 'kubectl describe', find controllers with ControllerRef, instead of showing the original creator.
```
Automatic merge from submit-queue
Remove alphaProvisioner in PVController and AlphaStorageClassAnnotation
remove alpha annotation and alphaProvisioner
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove [Flaky] from presistent volume NFS tests.
**What this PR does / why we need it**:
PV e2e test flaked because of a lack of isolation of test objects (PVs, claims). The common symptoms being volume-mounting pods timing out or an unexpected number of unbound claims being detected.
PR #43645 Introduced the selector labels to test objects that restricted binds to within each "testspace". To accomplish this the test namespace was set as a label for all PVs and selector for all Claims. This allows each test to act as if it's isolated while still creating and binding PVs in parallel w/ other tests.
This has been tested in parallel on both 3-node gci and debian clusters as
`--ginkgo.focus="\[Volume\]" --ginkgo.skip="\[Disruptive\]|\[Feature.*\]|\[Serial\]"`
with out signs of flakiness.
cc @jeffvance
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43304, 41427, 43490, 44352)
Node failure tests for cluster autoscaler
E2e tests checking whether CA is still working with a single broken node.
cc: @MaciekPytel @jszczepkowski @fgrzadkowski
Automatic merge from submit-queue (batch tested with PRs 43844, 44284)
Add a retry to cluster-autoscaler e2e
This should fix https://github.com/kubernetes/kubernetes/issues/44268.
The flake was caused by following sequence of events:
1. Cluster was at minimum size (3), some node was unneeded for a while.
2. Setup for some test (scale-down, failure) would increase node group size (to 5) and wait for new nodes to come up.
3. As soon as new node come up (cluster size 4) CA would scale-down the old unneeded node (setting node group size to 4).
4. Node group would not reach size 5 (as the target was now 4) and the test would timeout and fail.
This PR makes the setup monitor re-set the target node group size if the above scenario happens.