Automatic merge from submit-queue
Ensure that we are closing files.
**What this PR does / why we need it**: In several places we are leaking file descriptors. This could be problematic on systems with low ulimits for them.
**Release note**:
```release-note
```
Automatic merge from submit-queue
Generate 1 5 clientset
Generate the 1.5 clientset. Stop updating 1.4 clientset. Remove 1.2 clientset.
@nikhiljindal @lavalamp
I will rebase #31994 atop of this one.
Automatic merge from submit-queue
remove the rest of the non-generated clients from the kubectl code
Die `Client` Die!
It's always bigger than you think. Last bit @kargakis after this, it's gone.
Automatic merge from submit-queue
Add node e2e density test using 60 QPS for benchmark
This PR adds a new benchmark node e2e density test which sets Kubelet API QPS limit from default 5 to 60, through ConfigMap.
The latency caused by API QPS limit is as large as ~30% when creating a large batch of pods (e.g. 105). It makes the pod startup latency, as well creation throughput underestimated. This test helps us to know the real performance of Kubelet core.
Automatic merge from submit-queue
Fix memory eviction test parameters!
The parameters currently in master should NOT have come through in b9f0bd95. Must have happened when I squashed. Apologies.
Automatic merge from submit-queue
Remove long sleep in provisioning e2e tests.
PV controller sync is now 15 seconds, i.e. the controller re-tries to delete a PV four times in a minute until it succeeds. There is no need to wait for three minutes.
@kubernetes/sig-storage
Automatic merge from submit-queue
Provide an e2e skip helper checking for available resource
@janetkuo @dims this is the promised util function, but unfortunately I just learned that dynamic client suffers from the problem I've fixed in the manually written one (https://github.com/kubernetes/kubernetes/pull/29187) I need to look into the dynamic client in that case :/
Automatic merge from submit-queue
Change rbac roleref to reflect the information we want
@liggitt @ericchiang This is a version of https://github.com/kubernetes/kubernetes/pull/31359 which updates the `RoleRef` to be (I think) the type that we want, with a group, resource, and name.
This is **not** backwards compatible with any existing data. I'm ok with doing this since rbac was considered alpha, but its something to consider.
If we want this instead, I'll close the previous pull (or update it with this content).
Automatic merge from submit-queue
Update container image version for downward api volume tests
Some tests were using 0.7, and some were using 0.6, so updating all to 0.7.
@kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Revert "tag scheduledjob e2e as [Feature:ScheduledJob]"
Reverts kubernetes/kubernetes#32233
The way the e2e jobs are configured, `[Feature:...]` tests can't easily be run in jenkins-pr or any of submit-queue blocking jobs.
Automatic merge from submit-queue
Replace gcloud shelling out with cloudprovider calls.
gcloud flakes a lot leading to resource leak. Also fixes https://github.com/kubernetes/kubernetes/issues/16636 by verifying instance-groups, ssl-certs and firewall-rules and cleaned up.
Automatic merge from submit-queue
Add e2e tests that check for wrapped volume race
This PR adds two new e2e tests that reproduce the race condition fixed in #29641 (see e.g. #29297)
In order to observe the race, you need to revert the PR that fixes it, via e.g.
```
git revert -n df1e925143
```
or
```
curl -sL https://github.com/kubernetes/kubernetes/pull/29641.patch | patch -p1 -R
```
The tests are `[Slow]` because they need to run several passes that involve creating pods with many volumes. They also are `[Serial]` because the load on the cluster may affect reproducibility of the race. They take about ~450s each when they fail on standard GCE cluster created by `go run hack/e2e.go -v --up`. `git_repo` test takes about 66s to run when it succeeds (fix PR not reverted) and `configmap` test takes about 546s in this case because configmap mounting is slower and still requires 3 passes x 5 pods x 50 configmap volumes to fail constantly with fix PR reverted. Probably these times can be reduced but frankly I've already spent quite a bit of time on tuning the numbers to find a balance between reproducibility and speed.
Managed to reproduce the problem in more or less reliable way for `configMap` and `gitRepo` volumes. Tried to reproduce it for `secret` volumes too but without success so far because they use tmpfs-based `emptyDir` variety. For `downwardAPI` volumes I expect the same problems with race reproducibility as with `secret` volumes, although I think some e2e races were caused by the bug, e.g. #29633.
The tests operate by creating several pods (via an RC) with many volumes and waiting for them to become Running. It sets node affinity for pods so that they all get created on a single node (the first one in the node list). The race condition leads to volume mount failures with slow retries, thus causing the test to time out.
The test failures look like this:
configmap:
```
• Failure [435.547 seconds]
[k8s.io] Wrapped EmptyDir volumes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709
should not cause race condition when used for configmaps [Serial] [Slow] [It]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:170
Failed waiting for pod wrapped-volume-race-8c097734-6376-11e6-9ffa-5254003793ad-acbtt to enter running state
Expected error:
<*errors.errorString | 0xc8201758d0>: {
s: "timed out waiting for the condition",
}
timed out waiting for the condition
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395
```
You'll see errors like this in kubelet log on the first node in the cluster:
```
E0816 00:27:23.319431 3510 configmap.go:174] Error creating atomic writer: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory
E0816 00:27:23.319478 3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14\" (\"e5986355-6347-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-16 00:28:27.319450118 +0000 UTC (durationBeforeRetry 1m4s). Error: MountVolume.SetUp failed for volume "kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14" (spec.Name: "racey-configmap-14") pod "e5986355-6347-11e6-a5d7-42010af00002" (UID: "e5986355-6347-11e6-a5d7-42010af00002") with: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory
```
git_repo:
```
• Failure [455.035 seconds] [0/1882]
[k8s.io] Wrapped EmptyDir volumes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709
should not cause race condition when used for git_repo [Serial] [Slow] [It]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:179
Failed waiting for pod wrapped-volume-race-71b12b3d-6375-11e6-9ffa-5254003793ad-b0slz to enter running state
Expected error:
<*errors.errorString | 0xc8201758d0>: {
s: "timed out waiting for the condition",
}
timed out waiting for the condition
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395
```
Errors in kubelet log:
```
E0815 23:41:08.670203 3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8\" (\"97636bd8-6341-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-15 23:42:12.670181604 +0000 UTC (durationBeforeRetry 1m4s). Error: MountVolume.SetUp failed for volume "kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8" (spec.Name: "racey-git-repo-8") pod "97636bd8-6341-11e6-a5d7-42010af00002" (UID: "97636bd8-6341-11e6-a5d7-42010af00002") with: failed to exec 'git clone http://10.0.68.35:2345 test': : chdir /var/lib/kubelet/pods/97636bd8-6341-11e6-a5d7-42010af00002/volumes/kubernetes.io~git-repo/racey-git-repo-8: no such file or directory
```
Generally, the races cause unexpected "no such directory" errors in kubelet logs with subsequent volume mount failures.
I've added race tests to e2e test `empty_dir_wrapper.go` ("EmptyDir wrapper volumes"). This test was added in #18445, the same PR that introduced the race bug. The original purpose of the test was making sure that no conflicts occur between different wrapped emptyDir volumes, so I've replaced "should becomes" with "should not conflict" in the first `It(...)`.
Automatic merge from submit-queue
Add a check in ConfirmUsable() to validate the contextName
**What this PR does / why we need it**:
When a context name is provided, but can't be found (miss spelling), it currently
uses the defaults. This PR will cause the command to fail, to prevent unexpected side effects
of using the wrong configuration.
**Which issue this PR fixes**
fixes#21062
**Special notes for your reviewer**:
None
**Release note**:
```release-note
Error if a contextName is provided but not found in the kubeconfig.
```
Automatic merge from submit-queue
Automated Docker Validation: Change wrong name in perf config.
The config key `containervm-density*` is improper, remove it.
/cc @coufon
Automatic merge from submit-queue
update taints e2e, restrict taints operation with key, effect
Since taints are now unique by key, effect on a node, this PR is to restrict existing taints adding/removing/updating operations in taints e2e.
Also fixes https://github.com/kubernetes/kubernetes/issues/31066#issuecomment-242870101
Related prior Issue/PR #29362 and #30590
Automatic merge from submit-queue
Use PV shared informer in PV controller
Use the PV shared informer, addressing (partially) https://github.com/kubernetes/kubernetes/issues/26247 . Using the PVC shared informer is not so simple because sometimes the controller wants to `Requeue` and...
Automatic merge from submit-queue
Log pressure condition, memory usage, events in memory eviction test
I want to log this to help us debug some of the latest memory eviction test flakes, where we are seeing burstable "fail" before the besteffort. I saw (in the logs) attempts by the eviction manager to evict besteffort a while before burstable phase changed to "Failed", but the besteffort's phase appeared to remain "Running". I want to see the pressure condition interleaved with the pod phases to get a sense of the eviction manager's knowledge vs. pod phase.
Automatic merge from submit-queue
Plumb --feature-gates from TEST_ARGS to components in node e2e tests
This means you can set `TEST_ARGS` on the command line, in a `.properties` config for a Jenkins job, etc, to toggle gated features. For example:
`TEST_ARGS='--feature-gates=DynamicKubeletConfig=true'`
/cc @vishh @jlowdermilk
Automatic merge from submit-queue
Use etcd 2.3.7
This will switch to etcd 2.3.7 for release 1.4, to resolve issues rolling back from 1.4 to 1.3 (while preventing those same issues rolling back to 1.4.0 from a release including etcd 3.0.x).
Fixes#32253.
See #32253 (comment) for etcd roadmap.
Automatic merge from submit-queue
re-enable provisioning test
Reverts https://github.com/kubernetes/kubernetes/pull/32199 for when the gke control plane is updated. This should be merged AFTER gke is ready.
@kubernetes/sig-storage @wojtek-t