Automatic merge from submit-queue
In each PD test, create and delete the pod on every iteration so that each `kubectl exec` targets a fresh pod name.
Fixes #26141
Based on a chat with @ncdc.
The following is a snapshot of the log; each iteration now has a new pod name:
```text
[It] should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow] [Flaky]
/srv/dev/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/pd.go:277
STEP: creating PD1
Jun 10 15:55:45.878: INFO: Successfully created a new PD: "rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c".
STEP: creating PD2
Jun 10 15:55:49.794: INFO: Successfully created a new PD: "rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c".
Jun 10 15:55:49.794: INFO: PD Read/Writer Iteration #0
STEP: submitting host0Pod to kubernetes
W0610 15:55:49.860308 17282 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c: need to check if this is versioned correctly.
STEP: writing a file in the container
Jun 10 15:56:09.792: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '988876932586416926' > '/testpd1/tracker0''
Jun 10 15:56:12.003: INFO: Wrote value: "988876932586416926" to PD1 ("rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
Jun 10 15:56:12.003: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '8414937992264649637' > '/testpd2/tracker0''
Jun 10 15:56:13.170: INFO: Wrote value: "8414937992264649637" to PD2 ("rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
Jun 10 15:56:13.170: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:14.325: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:14.325: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:15.590: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: deleting host0Pod
Jun 10 15:56:15.841: INFO: PD Read/Writer Iteration #1
STEP: submitting host0Pod to kubernetes
W0610 15:56:15.905485 17282 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c: need to check if this is versioned correctly.
STEP: reading a file in the container
Jun 10 15:56:16.832: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:18.132: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:18.132: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:19.354: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: writing a file in the container
Jun 10 15:56:19.354: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '7639503234625274799' > '/testpd1/tracker1''
Jun 10 15:56:20.526: INFO: Wrote value: "7639503234625274799" to PD1 ("rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
Jun 10 15:56:20.526: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '7400445987108171911' > '/testpd2/tracker1''
Jun 10 15:56:21.694: INFO: Wrote value: "7400445987108171911" to PD2 ("rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
Jun 10 15:56:21.694: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:22.904: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:22.905: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:24.080: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: reading a file in the container
Jun 10 15:56:24.081: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker1'
Jun 10 15:56:25.290: INFO: Read file "/testpd1/tracker1" with content: 7639503234625274799
STEP: reading a file in the container
Jun 10 15:56:25.290: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker1'
Jun 10 15:56:26.491: INFO: Read file "/testpd2/tracker1" with content: 7400445987108171911
STEP: deleting host0Pod
Jun 10 15:56:26.756: INFO: PD Read/Writer Iteration #2
STEP: submitting host0Pod to kubernetes
W0610 15:56:26.821828 17282 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c: need to check if this is versioned correctly.
STEP: reading a file in the container
Jun 10 15:56:27.898: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker1'
Jun 10 15:56:29.096: INFO: Read file "/testpd1/tracker1" with content: 7639503234625274799
STEP: reading a file in the container
Jun 10 15:56:29.096: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker1'
Jun 10 15:56:30.325: INFO: Read file "/testpd2/tracker1" with content: 7400445987108171911
STEP: reading a file in the container
Jun 10 15:56:30.325: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:31.528: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:31.529: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:32.972: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: writing a file in the container
Jun 10 15:56:32.972: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1846555975530999997' > '/testpd1/tracker2''
Jun 10 15:56:34.157: INFO: Wrote value: "1846555975530999997" to PD1 ("rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
Jun 10 15:56:34.157: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '2775947264799611726' > '/testpd2/tracker2''
Jun 10 15:56:35.661: INFO: Wrote value: "2775947264799611726" to PD2 ("rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
Jun 10 15:56:35.662: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:36.868: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:36.868: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:38.062: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: reading a file in the container
Jun 10 15:56:38.062: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker1'
Jun 10 15:56:39.221: INFO: Read file "/testpd1/tracker1" with content: 7639503234625274799
STEP: reading a file in the container
Jun 10 15:56:39.221: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker1'
Jun 10 15:56:40.397: INFO: Read file "/testpd2/tracker1" with content: 7400445987108171911
STEP: reading a file in the container
Jun 10 15:56:40.397: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker2'
Jun 10 15:56:41.584: INFO: Read file "/testpd1/tracker2" with content: 1846555975530999997
STEP: reading a file in the container
Jun 10 15:56:41.585: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker2'
Jun 10 15:56:42.800: INFO: Read file "/testpd2/tracker2" with content: 2775947264799611726
STEP: deleting host0Pod
```
@saad-ali
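For reference, a minimal sketch of the per-iteration pattern this change adopts; all names here (createPod, execInPod, the file paths) are illustrative stand-ins for the e2e helpers, not the actual test code:
```go
package main

import "fmt"

// createPod stands in for submitting host0Pod via the Kubernetes client;
// the real test generates a fresh UUID-suffixed name, a counter is used here.
func createPod(iteration int) string { return fmt.Sprintf("pd-test-%d", iteration) }

func deletePod(name string)      { fmt.Println("deleting", name) }
func execInPod(name, cmd string) { fmt.Printf("kubectl exec %s -- %s\n", name, cmd) }

func main() {
	const iterations = 3
	for i := 0; i < iterations; i++ {
		// The pod is created inside the loop, so every iteration's
		// kubectl exec targets a brand-new pod name, and it is deleted
		// before the next pass begins.
		pod := createPod(i)
		execInPod(pod, fmt.Sprintf("sh -c 'echo data > /testpd1/tracker%d'", i))
		execInPod(pod, fmt.Sprintf("cat /testpd1/tracker%d", i))
		deletePod(pod)
	}
}
```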
Automatic merge from submit-queue
Dumping logs of federation pods (federation-apiserver, federation-controller-manager) on e2e test failure
Ref https://github.com/kubernetes/kubernetes/issues/26762
This should help with debugging failures.
Right now there is no way to access those logs.
@kubernetes/sig-cluster-federation @colhom
Automatic merge from submit-queue
Call the NewFramework constructor instead of hand-creating the framework.
Addresses https://github.com/kubernetes/kubernetes/issues/27486, which probably occurred because we defined a new clientConfigGetter for node e2es while this test was hand-creating the framework.
Automatic merge from submit-queue
Kubelet Volume Attach/Detach/Mount/Unmount Redesign
This PR redesigns the Volume Attach/Detach/Mount/Unmount in Kubelet as proposed in https://github.com/kubernetes/kubernetes/issues/21931
```release-note
A new volume manager was introduced in kubelet that synchronizes volume mount/unmount (and attach/detach, if attach/detach controller is not enabled).
This eliminates the race conditions between the pod creation loop and the orphaned volumes loops. It also removes the unmount/detach from the `syncPod()` path so volume clean up never blocks the `syncPod` loop.
```
Automatic merge from submit-queue
federation: choosing a default federation name in test instead of failing
The tests are failing right now:
http://kubekins.dls.corp.google.com/job/kubernetes-e2e-gce-federation/
```
[k8s.io] Service [Feature:Federation] should be able to discover a non-local federated service
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/federated-service.go:130 Jun 14 12:40:35.091: FEDERATION_NAME environment variable must be set
[k8s.io] Service [Feature:Federation] should be able to discover a federated service
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/federated-service.go:130 Jun 14 12:40:40.802: FEDERATION_NAME environment variable must be set
```
This is to fix them.
cc @kubernetes/sig-cluster-federation @mml
This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if attach/detach controller
is not enabled).
This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume clean up never blocks the
`syncPod` loop.
Automatic merge from submit-queue
Make timeout for starting system pods configurable
Context: in 2000-node clusters (if only one node is big enough to fit heapster, which is our testing configuration), heapster won't be scheduled until that node has a route. However, creating routes is pretty expensive and can currently take as long as 2 hours.
@zmerlynn @gmarek
Automatic merge from submit-queue
Add image pulling node e2e
Fixes #27007.
Based on #27309, will rebase after #27309 gets merged.
This PR added all tests mentioned in #27007:
* Pull an image from invalid registry;
* Pull an invalid image from gcr;
* Pull an image from gcr;
* Pull an image from docker hub;
* Pull an image that needs auth, with and without secrets.
For the imagePullSecrets test, I created a new gcloud project "authenticated-image-pulling", and the service account in the code only has "Storage Object Viewer" permission.
/cc @pwittrock @vishh
Automatic merge from submit-queue
Add description to created node images
Make it a little easier to see who to contact about important node e2e images.
The number of pods to start must be non-zero;
otherwise the function waits for pods forever if waitForRunning is true.
If the number of replicas is zero, panic so the mistake is heard all over the e2e realm.
Update all callers of StartPods to check for a non-zero number of replicas.
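A minimal sketch of the guard described above (the real StartPods signature differs; this only illustrates the non-zero check):
```go
package e2e

import "fmt"

// startPods mirrors the new guard: panic on zero replicas so the mistake
// surfaces immediately, instead of the helper waiting forever for zero
// pods to become running when waitForRunning is true.
func startPods(replicas int, waitForRunning bool) {
	if replicas == 0 {
		panic("StartPods: the number of replicas must be non-zero")
	}
	fmt.Printf("starting %d pods (waitForRunning=%v)\n", replicas, waitForRunning)
	// ... create the pods; if waitForRunning, poll until all are running ...
}
```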
Automatic merge from submit-queue
Updating federation up scripts to work in non e2e setup
Ref: https://github.com/kubernetes/kubernetes.github.io/pull/656
Updating the federation up scripts so that they work as per steps in https://github.com/kubernetes/kubernetes.github.io/pull/656.
Changes are:
* Updating the default namespace to be "federation" instead of "federation-e2e"
* Updated the kubeconfig context to be named "federation-cluster" instead of "federated-context"
* Fixing federation-up so that FEDERATION_IMAGE_TAG is set even when federation-up is run without running `e2e.go --up`. e2e-up.sh sets it here: 6a388d4a0d/hack/e2e-internal/e2e-up.sh (L44).
* Adding a "missingkey=zero" option to template parser. Without this, the parser adds `"<no value>"` at the place of an env var that is not set. With this change, it instead replaces it with the corresponding zero value (for ex "" for strings). This is required for the FEDERATION_DNS_PROVIDER_CONFIG env var.
cc @kubernetes/sig-cluster-federation @colhom @mml
Automatic merge from submit-queue
Implement first set of federated service e2e tests.
These tests are untested and there is no guarantee that they work. The ongoing auth problems are blocking these e2es from being tested, and upon @quinton-hoole's request I am submitting them now.
Only the last commit here needs review.
Depends on #26953
cc @nikhiljindal @colhom @mfanjie @kubernetes/sig-cluster-federation
Automatic merge from submit-queue
Fix node e2e coreos kubelet cgroup detection
Fixes #26979 and #26431
The root issue, as best I can tell, is that cgroup detection does not work when the kubelet is started under an ssh session and the systemd `*Accounting` variables are set. I added additional logging and noted some differences between the cgroup slice names cadvisor returns and those the kubelet detects for itself.
This difference does not occur if the kubelet is properly running under a unit, which is also the more common and sane environment.
See also discussion in #26903
cc @derekwaynecarr @vishh @pwittrock
Note that these tests are untested and there is no guarantee that they work.
The ongoing auth problems are blocking these e2es from being tested, and upon
@quinton-hoole's request I am submitting them now.
Automatic merge from submit-queue
Add pending pod check in cluster autoscaler e2e tests
The tests should wait until all pods are running before declaring a success and resizing the MIG.
cc: @fgrzadkowski @piosz @jszczepkowski
Automatic merge from submit-queue
volume integration: wait for PVs before creating PVCs
The test should wait until all volumes are processed by volume controller (i.e. in the controller cache) before creating a PVC.
Without that, the "best" matching PV could not be in the cache and controller might bind the PVC to suboptiomal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity 50000000000 but got pvc-2 capacity 52000000000".
Fixes #27179 (together with #26894)
Automatic merge from submit-queue
Fix integration pv flakes
There are two fixes in this PR:
- run tests in separate functions and use objects with different names; otherwise events from the beginning of the function are caught later when we watch for events of a different PV/PVC
- don't set PV.Spec.ClaimRef.UID of pre-bound PVs. PVs with a UID set are considered bound, and they are deleted/recycled when the appropriate PVC does not exist yet.
Fixes #26730 and probably also ~~#26894~~ #26256
The test should wait until all volumes are processed by volume controller (i.e.
in the controller cache) before creating a PVC.
Without that, the "best" matching PV could not be in the cache and controller
might bind the PVC to suboptiomal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity
50000000000 but got pvc-2 capacity 52000000000".
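A minimal sketch of the ordering this enforces; waitForPVPhase and getPhase are illustrative stand-ins for polling the real API:
```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForPVPhase polls getPhase until it reports the wanted phase or the
// timeout expires, mirroring "wait for PVs before creating PVCs".
func waitForPVPhase(want string, getPhase func() string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if getPhase() == want {
			return nil
		}
		time.Sleep(10 * time.Millisecond)
	}
	return errors.New("timed out waiting for PV phase " + want)
}

func main() {
	// Simulate a PV that becomes Available after a few sync iterations.
	calls := 0
	getPhase := func() string {
		if calls++; calls < 3 {
			return "Pending"
		}
		return "Available"
	}
	// Only create the PVC once the PV is Available (i.e. processed by the
	// controller), so the controller can bind the best matching PV.
	if err := waitForPVPhase("Available", getPhase, time.Second); err != nil {
		panic(err)
	}
	fmt.Println("PV available; safe to create the PVC")
}
```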
Automatic merge from submit-queue
e2e: actually check error when we fail to GET the scheduled pod
and GET the pod before we try to gracefully delete it.
and add a debug log.
ref #26224
Automatic merge from submit-queue
Considering all nodes for the scheduler cache to allow lookups
Fixes the actual issue that led me to create https://github.com/kubernetes/kubernetes/issues/22554
Currently the nodes in the cache provided to the predicates exclude the unschedulable nodes, using field-level filtering of the watch results. This causes the issue above because the `ServiceAffinity` predicate uses the cached node list to look up the node metadata for a peer pod (another pod belonging to the same service). Since this peer pod could be hosted on a node that is currently unschedulable, the lookup can fail, and the pod then fails to be scheduled.
As part of the fix, we now include all nodes in the watch results and exclude the unschedulable nodes using `NodeCondition`
@derekwaynecarr PTAL
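A sketch of the resulting shape, assuming a pared-down Node type (the real change filters on node conditions rather than a plain boolean): the cache keeps every node for metadata lookups, and schedulability is filtered at lookup time.
```go
package main

import "fmt"

// Node models only what the sketch needs; the real API type carries
// conditions that mark a node unschedulable.
type Node struct {
	Name          string
	Unschedulable bool
}

// schedulableNodes filters at lookup time instead of excluding nodes from
// the watch, so predicates like ServiceAffinity can still find a peer
// pod's node in the full cache even when that node is unschedulable.
func schedulableNodes(all []Node) []Node {
	out := make([]Node, 0, len(all))
	for _, n := range all {
		if !n.Unschedulable {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	cache := []Node{{"node-a", false}, {"node-b", true}}
	fmt.Printf("%d nodes cached, %d schedulable\n", len(cache), len(schedulableNodes(cache)))
}
```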
Automatic merge from submit-queue
Port the downward api test to the node e2e suite
Also extend the framework to allow a custom client config loading function, so
that the node e2e suite can reuse the same framework across tests.
This fixes #26609
/cc @timstclair @pwittrock
Automatic merge from submit-queue
Add e2e test for node problem detector
Based on https://github.com/kubernetes/node-problem-detector/pull/12.
This PR adds an e2e test for the node problem detector. It starts a node problem detector with a test configuration and tests its functionality.
Questions:
* Should this be a node e2e test or an e2e test?
* Should we mark this as Serial? The test should not affect other tests, but it will add a `TestCondition` to the node conditions and remove it in cleanup.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Listing pods only once when getting pods for RS in deployment
Fixes#26834
1. Avoid ranging over RSes and then `List`-ing the pods of each RS. Instead, `List` the pods of the deployment once, and then filter the pods of each RS (see the sketch below).
2. Avoid using the clientset to `List` pods in the deployment controller; use podStore instead. (TODO in some functions, because the unit tests don't have a podStore.)
@kubernetes/deployment
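A self-contained sketch of the list-once-then-filter pattern (Pod and the selector matching are simplified stand-ins for the API types):
```go
package main

import "fmt"

type Pod struct {
	Name   string
	Labels map[string]string
}

// matches is a simplified stand-in for label selector matching.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	// List the deployment's pods once...
	allPods := []Pod{
		{"web-1a", map[string]string{"rs": "web-v1"}},
		{"web-2a", map[string]string{"rs": "web-v2"}},
	}
	rsSelectors := map[string]map[string]string{
		"web-v1": {"rs": "web-v1"},
		"web-v2": {"rs": "web-v2"},
	}
	// ...then filter in memory for each RS instead of issuing one List
	// call per replica set.
	for rs, sel := range rsSelectors {
		for _, p := range allPods {
			if matches(sel, p.Labels) {
				fmt.Printf("%s owns %s\n", rs, p.Name)
			}
		}
	}
}
```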
Automatic merge from submit-queue
support for mounting local-ssds on GCI
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
@vulpecula @roberthbailey @andyzheng0831 @kubernetes/goog-image
This reverts commit 2494c77972.
The previous commit, which launches the kubelet under `systemd-run`,
fixes an error in detecting the kubelet's cgroup stats.
Fixes #26979
Automatic merge from submit-queue
Fix GKE upgrade e2e util.
The containers command group at HEAD no longer accepts --zone; the flag
has to be specified after the subcommand group. Fixes #27011
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
Automatic merge from submit-queue
Enable WatchCache in test/integration/ tests
We already run cmd/integration/ with watch cache on. We should also run tests in test/integration/ with watch cache on.
@wojtek-t @lavalamp
Automatic merge from submit-queue
Add a custom main instead of the standard test main, to reduce stack …
Adds a custom test main handler (see: `TestMain` in https://golang.org/pkg/testing/ for details)
Partial fix for https://github.com/kubernetes/kubernetes/issues/25965
This does the standard timeout, but strips non-kubernetes stacks out of the stack trace (e.g. it filters things like:
```
goroutine 466 [IO wait, 7 minutes]:
net.runtime_pollWait(0x7fd74c4672c0, 0x72, 0xc821614000)
/usr/local/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc8215c21b0, 0x72, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8215c21b0, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).Read(0xc8215c2150, 0xc821614000, 0x1000, 0x1000, 0x0, 0x7fd74c491050, 0xc820014058)
/usr/local/go/src/net/fd_unix.go:250 +0x23a
net.(*conn).Read(0xc820a5a090, 0xc821614000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:172 +0xe4
net/http.noteEOFReader.Read(0x7fd74c465258, 0xc820a5a090, 0xc8215f0068, 0xc821614000, 0x1000, 0x1000, 0x405773, 0x0, 0x0)
/usr/local/go/src/net/http/transport.go:1687 +0x67
net/http.(*noteEOFReader).Read(0xc8215ae1a0, 0xc821614000, 0x1000, 0x1000, 0xc82159ad1d, 0x0, 0x0)
<autogenerated>:284 +0xd0
bufio.(*Reader).fill(0xc8202a2b40)
/usr/local/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).Peek(0xc8202a2b40, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/bufio/bufio.go:132 +0xcc
net/http.(*persistConn).readLoop(0xc8215f0000)
/usr/local/go/src/net/http/transport.go:1073 +0x177
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:857 +0x10a6
```
We may want to get even more aggressive in the future.
@kubernetes/sig-testing
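A minimal sketch of a custom test main with a hard timeout (the duration and the stack handling are illustrative; the real handler captures the trace and strips non-kubernetes frames before printing):
```go
package integration

import (
	"os"
	"testing"
	"time"
)

// TestMain replaces the standard test main: run the suite in a goroutine
// and enforce a deadline ourselves, so on timeout we control what gets
// printed instead of Go dumping every parked net/http goroutine.
func TestMain(m *testing.M) {
	done := make(chan int, 1)
	go func() { done <- m.Run() }()
	select {
	case code := <-done:
		os.Exit(code)
	case <-time.After(10 * time.Minute):
		// Here the real handler prints a filtered stack trace.
		panic("test suite timed out")
	}
}
```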
Automatic merge from submit-queue
Move quota usage testing for loadbalancers into unit tests
Fixes https://github.com/kubernetes/kubernetes/issues/26319
* moved testing for node port and load balancer usage in quota to unit tests
* removed node port and node port -> load balancer service testing from e2e
* already covered in the replenishment_controller_test scenario
Given the time it takes to even allocate a load balancer, it seems better to test that outside of this test case to avoid unnecessary flakes.
/cc @bprashanth
Automatic merge from submit-queue
Flake 26210: decouple explicit access from port 80
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up in the flakes now, the flake really is tied to the
actual port number.
Automatic merge from submit-queue
Reduce huge amount of logs in large cluster tests
When running tests in 2000-node clusters, I got more than 100,000 lines like this:
```
Jun 8 01:03:11.850: INFO: Condition NetworkUnavailable of node gke-gke-large-cluster-default-pool-1-03ee5a12-knrw is true instead of false. Reason: NoRouteCreated, message: Node created without a route
```
which doesn't add much value.
This PR reduces the number of logs.
Also includes other improvements:
- Makefile rule to run tests against remote instance using existing host or image
- Makefile will reuse an instance created from an image if it was not torn down
- Runner starts gce instances in parallel with building source
- Runner uses instance ip instead of hostname so that it doesn't need to resolve
- Runner supports cleaning up files and processes on an instance without stopping / deleting it
- Runner runs tests using `ginkgo` binary to support running tests in parallel
Now that GCE routes take an extremely long time to come up and there's
a variance in "Ready" and "Schedulable", start cherry-picking tests
where we really want to have all nodes routable/schedulable for
testing. Adding logging. This will increase test times on large
clusters but should have 0 impact on normal testing.
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up in the flakes now, the flake really is tied to the
actual port number.
Automatic merge from submit-queue
Mark runtime conformance tests as flaky and do not run them in Jenkins CI.
For #26809
@pwittrock As discussed offline, marking the runtime tests as flaky for now. I'm not sure those tests are required; testing docker in every Kubernetes PR is unnecessary.
These tests can be run periodically in a separate CI. AFAIK, these tests don't seem to exercise any kube features.
When we create a PV, we should create it without Spec.ClaimRef.UID.
In rare cases, when a 'PV added' event with a UID is processed before 'PVC
added' (created by the for loop a few lines above), the controller does not know
of a PVC with this UID and considers the PV released. The reclaim policy is
then executed, so the PV is deleted and never bound.
With UID="", the controller waits for the PVC to get created and binds
it.
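A pared-down sketch of the pre-binding shape (ObjectReference here is a stand-in with just the relevant fields):
```go
package main

import "fmt"

// ObjectReference is a stand-in with only the fields the sketch needs.
type ObjectReference struct {
	Namespace, Name, UID string
}

func main() {
	// Pre-bind by namespace/name only. Leaving UID empty makes the
	// controller wait for the PVC to be created and then bind it,
	// instead of treating the PV as released (and reclaiming it) when
	// the 'PV added' event races ahead of 'PVC added'.
	claimRef := &ObjectReference{
		Namespace: "default",
		Name:      "pvc-1",
		UID:       "", // intentionally empty
	}
	fmt.Printf("pre-bound to %s/%s (UID %q)\n", claimRef.Namespace, claimRef.Name, claimRef.UID)
}
```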
Different tests should use different objects and watchers - I noticed that
sometimes an event from an old test leaked into a subsequent test in the
same function.
And add some logs.
Automatic merge from submit-queue
Update Node e2e Core OS image to run systemd with CPU & Memory accounting enabled by default
cc @derekwaynecarr
For #26289
Automatic merge from submit-queue
volume controller: add configurable integration test to stress the binder
The test tries to bind a configured number of PVs to the same number of PVCs. '100' is used by default, which should take ~1-3 seconds (depending on log level). Periodic sync is needed in rare cases, which may add another 10 seconds; the cache from #25881 will help here, and sync should not be needed at all.
The test is configurable and may be reused to measure binder performance. Set KUBE_INTEGRATION_PERSISTENTVOLUME_* env. variables as described in persistent_volume_test.go and run the tests:
```
# compile
$ cd test/integration
$ godep go test -tags 'integration no-docker' -c
# run the tests
$ KUBE_INTEGRATION_PERSISTENTVOLUME_SYNC_PERIOD=10s KUBE_INTEGRATION_PERSISTENTVOLUME_OBJECTS=1000 time ./integration.test -test.run TestPersistentVolumeMultiPVsPVCs -v 2
```
Log level '2' is useful to get timestamps of various events like 'TestPersistentVolumeMultiPVsPVCs: start' and 'TestPersistentVolumeMultiPVsPVCs: claims are bound'.
Automatic merge from submit-queue
kubelet e2e: enforce that image prepulling must finish before the test
The image prepulling pod calls docker directly to pull images. If the pod
hasn't finished before running the resource usage tracking test, there'd be a
cpu spike in docker. We'd rather wait and fail if this is the case, before
running the test.
The image prepulling pod calls docker directly to pull images. If the pod
hasn't finished before running the resource usage tracking test, there'd be a
cpu spike in docker. We'd rather wait and fail if this is the case, before
running the test.
The test tries to bind a configured number of PVs to the same number of PVCs.
'100' is used by default, which should take ~1-3 seconds (depending on log level).
Periodic sync is needed in rare cases, which may add another 10 seconds; the cache
from #25881 will help here, and sync should not be needed at all.
The test is configurable and may be reused to measure binder performance.
Set KUBE_INTEGRATION_PV_* env. variables as described in
persistent_volume_test.go and run the tests:
# compile
$ cd test/integration
$ godep go test -tags 'integration no-docker' -c
# run the tests
$ KUBE_INTEGRATION_PV_SYNC_PERIOD=10s KUBE_INTEGRATION_PV_OBJECTS=1000 time ./integration.test -test.run TestPersistentVolumeMultiPVsPVCs -v 2
Log level '2' is useful to get timestamps of various events like
'TestPersistentVolumeMultiPVsPVCs: start' and 'TestPersistentVolumeMultiPVsPVCs:
claims are bound'.
Automatic merge from submit-queue
Stabilize persistent volume integration tests
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Fixes #26499 (at least I hope so...)
Automatic merge from submit-queue
Don't allow deps with no discernible license
This updates the few deps we had with no LICENSE file to current versions that do have that file. It also disallows new deps without obvious licenses.
Automatic merge from submit-queue
Enable node e2e accounting on systemd
Updated the e2e setup.sh script to enable cpu and memory accounting.
Related to https://github.com/kubernetes/kubernetes/issues/26198
/cc @pwittrock
Automatic merge from submit-queue
Disable PodAffinity SchedulerPredicates test
This feature is disabled, so it's not surprising that tests don't work.
cc @davidopp @kevin-wangzefeng
@david-mcmahon - this disables the second test that causes failures in SchedulerPredicates suite. When this and #26695 are merged it should be passing in serial.
Automatic merge from submit-queue
Revert revert of adding resource constraints for master components in density tests
The problem was the time when resource constraints were generated. It turns out that the provider is not set there. This version should work.
cc @roberthbailey @alex-mohr
Automatic merge from submit-queue
Add direct serializer
Fixes #25589. Implemented a direct codec that doesn't do conversion, but sets the group, version and kind before serialization, as Clayton suggested [here](https://github.com/kubernetes/kubernetes/issues/25589#issuecomment-219168009).
First commit is cherry-picked from #24826.
@kubernetes/sig-api-machinery
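To illustrate the idea (not the actual implementation), a toy encoder that stamps group/version/kind onto the object and serializes it directly, with no conversion step; all types here are made up:
```go
package main

import "fmt"

// Object is a toy stand-in for a serializable API object.
type Object struct {
	APIVersion, Kind, Payload string
}

type Encoder interface{ Encode(Object) string }

// jsonish fakes a serializer for the sketch.
type jsonish struct{}

func (jsonish) Encode(o Object) string {
	return fmt.Sprintf(`{"apiVersion":%q,"kind":%q,"payload":%q}`,
		o.APIVersion, o.Kind, o.Payload)
}

// DirectEncoder sets the target group/version and kind before handing the
// object to the underlying serializer, skipping conversion entirely.
type DirectEncoder struct {
	GroupVersion, Kind string
	Delegate           Encoder
}

func (d DirectEncoder) Encode(o Object) string {
	o.APIVersion, o.Kind = d.GroupVersion, d.Kind
	return d.Delegate.Encode(o)
}

func main() {
	e := DirectEncoder{GroupVersion: "v1", Kind: "Pod", Delegate: jsonish{}}
	fmt.Println(e.Encode(Object{Payload: "spec"}))
}
```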
Automatic merge from submit-queue
kubelet e2e: bumping cpu limit
The previous limit was too aggressive and caused kubernetes-e2e-gce-serial build 1404 to fail.
Automatic merge from submit-queue
Change Kubelet test to run also in large clusters
This should fix the recent failure of kubelet test in 2000-node cluster.
@zmerlynn - FYI
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Automatic merge from submit-queue
SSH e2e: Limit to 100 nodes, limit combinatorics
This limits the "for all hosts" to 100 nodes, and also limits the
combinatorial section so that we only do the other SSH command variant
testing on the first host rather than *all* of the hosts. I also
killed one of the variants because it didn't seem to be testing anything
important.
Fixes #26600
This limits the "for all hosts" to 100 nodes, and also limits the
combinatorial section so that we only do the other SSH command variant
testing on the first host rather than *all* of the hosts. I also
killed one of the variants because it didn't seem to be testing anything
important.
Fixes #26600
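A sketch of the capping logic, assuming a hypothetical runSSHCommand helper in place of the real e2e SSH utilities:
```go
package main

import "fmt"

// runSSHCommand is a placeholder for the e2e SSH helper.
func runSSHCommand(host, cmd string) { fmt.Printf("ssh %s: %s\n", host, cmd) }

func main() {
	// Stand-in host list; the real test discovers these from the cluster.
	hosts := make([]string, 250)
	for i := range hosts {
		hosts[i] = fmt.Sprintf("node-%d", i)
	}
	// Cap the "for all hosts" phase at 100 nodes.
	if len(hosts) > 100 {
		hosts = hosts[:100]
	}
	for _, h := range hosts {
		runSSHCommand(h, "echo Hello from $(whoami)@$(hostname)")
	}
	// Run the remaining command variants against the first host only,
	// instead of the full hosts x variants combinatorial product.
	runSSHCommand(hosts[0], "exit 42")
}
```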
Automatic merge from submit-queue
kubelet e2e: set cpu/memory limits for docker 1.11
Docker 1.11 consumes more memory. Bump the limit to fix the tests. Also add
new limits for the 100-pod resource usage tracking test.
This fixes #26495
Automatic merge from submit-queue
Fix some gce-only tests to run on gke as well
Enable "Services should work after restarting apiserver [Disruptive]" and DaemonRestart tests, except the 2 that require master ssh access.
Move restart/upgrade related test helpers into their own file in framework package.
- Exit non-0 if infrastructure failures happen
- Exit 0 if no infrastructure failures happen regardless of test results
(Jenkins will use junit.xml to determine test results)
Automatic merge from submit-queue
Use ubuntu-slim to reduce size of the iperf:e2e image
from
```
gcr.io/google_containers/iperf e2e 8b3cc7064090 5 weeks ago 737.9 MB
```
to
```
gcr.io/google_containers/iperf e2e 204325491636 33 seconds ago 61.09 MB
```
related to https://github.com/kubernetes/kubernetes/pull/25784#issuecomment-221706886
ping @bprashanth
Automatic merge from submit-queue
Support per-test-environment ginkgo flags for node e2e tests to facilitate skipping misbehaving tests in the PR builder
We had an issue today where some node e2e tests were timing out in the PR builder. We want to be able to skip tests in the PR builder and leave them running in the CI if this happens again.
Automatic merge from submit-queue
Make Privileged pods node e2e use the framework
Made the test more readable along the way with more logs. This should help us triage failures/flakes in the future.
#24577
Automatic merge from submit-queue
Use pause image depending on the server's platform when testing
Removed all pause image constant strings; the pause image is now chosen by arch. Part of the effort of making e2e arch-agnostic.
The pause image name and version now live in only two places, and it is documented that both must be bumped.
Also removed "amd64" constants in the code. Such constants should be replaced by `runtime.GOARCH` or by looking up the server's platform.
Fixes: #22876 and #15140
Makes it easier for: #25730
Related: #17981
This is for `v1.3`
@ixdy @thockin @vishh @kubernetes/sig-testing @andyzheng0831 @pensu
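A sketch of picking the pause image by architecture (the registry, image name, and tag below are illustrative assumptions):
```go
package main

import (
	"fmt"
	"runtime"
)

// pauseImage builds an arch-suffixed pause image name instead of
// hard-coding "amd64"; the exact repository and tag are assumptions.
func pauseImage(arch string) string {
	return fmt.Sprintf("gcr.io/google_containers/pause-%s:3.0", arch)
}

func main() {
	// In the e2e code the arch would come from the server's reported
	// platform; runtime.GOARCH suffices when client and server match.
	fmt.Println(pauseImage(runtime.GOARCH))
}
```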
Automatic merge from submit-queue
Attach Detach Controller Business Logic
This PR adds the meat of the attach/detach controller proposed in #20262.
The PR splits the in-memory cache into a desired and actual state of the world.
Automatic merge from submit-queue
Flake 21484: retrieve pod log during e2e error
Print the pod log when an error occurs in
> Proxy version 1 should proxy through a service and a pod [Conformance]
e2e test. This will help to understand flake https://github.com/kubernetes/kubernetes/issues/21484 better.
Many tests expect all kube-system pods to be running and ready. The newly
added image prepull add-on pod, however, can be in the "Succeeded" state. This commit fixes
the tests to allow kube-system pods to have succeeded.
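A one-function sketch of the relaxed check (phase strings follow the pod phase conventions; the helper name is made up):
```go
package main

import "fmt"

// podOK treats a kube-system pod as acceptable if it is running and
// ready, or if it ran to completion, as the prepull add-on pod does.
func podOK(phase string, ready bool) bool {
	return phase == "Succeeded" || (phase == "Running" && ready)
}

func main() {
	fmt.Println(podOK("Running", true))    // true
	fmt.Println(podOK("Succeeded", false)) // true: no longer fails the check
	fmt.Println(podOK("Pending", false))   // false
}
```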
Automatic merge from submit-queue
Downward API implementation for resources limits and requests
This is an implementation of the Downward API for resource limits and requests, and it works with environment variables and the volume plugin.
It is based on proposal https://github.com/kubernetes/kubernetes/pull/24051 and follows the API-with-magic-keys approach discussed in the proposal.
@kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Prepull images in e2e
Quick and dirty image puller because the SQ stalled multiple times just *today* on image pull flake (https://github.com/kubernetes/kubernetes/issues/25277).
@kubernetes/sig-node @kubernetes/sig-testing wdyt?
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
- Add junit test reporter
- Write etcd.log, kubelet.log and kube-apiserver.log to files instead of stdout
- Scp artifacts to the jenkins WORKSPACE
Fixes #25966
Automatic merge from submit-queue
In the e2e test, when kubectl exec fails to find the container to run a command in, it should retry.
Fixes #26076
Without retrying upon a "container not found" error, the `Pod Disks` test failed with the following error:
```console
[k8s.io] Pod Disks
should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:271
[BeforeEach] [k8s.io] Pod Disks
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:108
STEP: Creating a kubernetes client
May 23 19:18:02.254: INFO: >>> TestContext.KubeConfig: /root/.kube/config
STEP: Building a namespace api object
STEP: Waiting for a default service account to be provisioned in namespace
[BeforeEach] [k8s.io] Pod Disks
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:69
[It] should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:271
STEP: creating PD1
May 23 19:18:06.678: INFO: Successfully created a new PD: "rootfs-e2e-11dd5f5b-211b-11e6-a3ff-b8ca3a62792c".
STEP: creating PD2
May 23 19:18:11.216: INFO: Successfully created a new PD: "rootfs-e2e-141f062d-211b-11e6-a3ff-b8ca3a62792c".
May 23 19:18:11.216: INFO: PD Read/Writer Iteration #0
STEP: submitting host0Pod to kubernetes
W0523 19:18:11.279910 4984 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c: need to check if this is versioned correctly.
STEP: writing a file in the container
May 23 19:18:39.088: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1394466581702052925' > '/testpd1/tracker0''
May 23 19:18:40.250: INFO: Wrote value: "1394466581702052925" to PD1 ("rootfs-e2e-11dd5f5b-211b-11e6-a3ff-b8ca3a62792c") from pod "pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
May 23 19:18:40.251: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1740704063962701662' > '/testpd2/tracker0''
May 23 19:18:41.433: INFO: Wrote value: "1740704063962701662" to PD2 ("rootfs-e2e-141f062d-211b-11e6-a3ff-b8ca3a62792c") from pod "pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
May 23 19:18:41.433: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
May 23 19:18:42.585: INFO: Read file "/testpd1/tracker0" with content: 1394466581702052925
STEP: reading a file in the container
May 23 19:18:42.585: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
May 23 19:18:43.779: INFO: Read file "/testpd2/tracker0" with content: 1740704063962701662
STEP: deleting host0Pod
May 23 19:18:44.048: INFO: PD Read/Writer Iteration #1
STEP: submitting host0Pod to kubernetes
W0523 19:18:44.132475 4984 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c: need to check if this is versioned correctly.
STEP: reading a file in the container
May 23 19:18:45.186: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
May 23 19:18:46.290: INFO: error running kubectl exec to read file: exit status 1
stdout=
stderr=error: error executing remote command: error executing command in container: container not found ("mycontainer")
)
May 23 19:18:46.290: INFO: Error reading file: exit status 1
May 23 19:18:46.290: INFO: Unexpected error occurred: exit status 1
```
I have now run the e2e PD test five times with this fix and no longer see any failures.
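A sketch of the retry wrapper, with made-up helper names; the point is to retry only on the transient "container not found" error:
```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// execWithRetry retries run while it fails with "container not found",
// which can happen briefly while a recreated pod's container starts.
func execWithRetry(run func() (string, error), attempts int) (string, error) {
	var err error
	for i := 0; i < attempts; i++ {
		var out string
		if out, err = run(); err == nil {
			return out, nil
		}
		if !strings.Contains(err.Error(), "container not found") {
			return "", err // other errors are returned immediately
		}
		time.Sleep(100 * time.Millisecond)
	}
	return "", err
}

func main() {
	calls := 0
	out, err := execWithRetry(func() (string, error) {
		// Fail twice with the transient error, then succeed.
		if calls++; calls < 3 {
			return "", errors.New(`container not found ("mycontainer")`)
		}
		return "1394466581702052925", nil
	}, 5)
	fmt.Println(out, err)
}
```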
Automatic merge from submit-queue
Don't dump everything in kubemarks
Don't dump all events etc. on kubemark failures; they are useless anyway with that amount of data.
Automatic merge from submit-queue
Fix panic in auth test failure
I got a spurious failure in the webhook integration test but couldn't see the error returned, because a panic was hit in code that assumed a body was always returned with the response.
Automatic merge from submit-queue
E2e tests for GKE cluster with local SSD.
The tests cover node pool creation with local SSD and scheduling a pod that writes to and reads from it. The pod accesses the local disk via hostPath.
```release-note
E2e tests for GKE cluster with local SSD.
```
Automatic merge from submit-queue
kubelet/cadvisor: Refactor cadvisor disk stat/usage interfaces.
Basically:
1. The cadvisor struct will know which runtime the kubelet uses, passed in via an additional argument to New(). (mock/fake/unsupported are modified to take the same additional argument in New().)
2. Rename the cadvisor wrapper function DockerImagesFsInfo() to ImagesFsInfo(), and have the linux implementation choose a label based on the runtime inside the cadvisor struct.
3. Rename the kubelet's wrapper for the cadvisor wrapper in parallel.
4. Make all tests use the new interface.
Automatic merge from submit-queue
Cache Webhook Authentication responses
Add a simple LRU cache with a 2-minute TTL to the webhook authenticator.
kubectl is a little spammy, with >= 4 API requests per command. The cache also prevents a single unauthenticated user from being able to DoS the remote authenticator.
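A minimal sketch of the caching idea, assuming a plain TTL map in place of the real bounded LRU:
```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ttlCache memoizes authenticator responses for a short TTL so a burst of
// requests hits the remote webhook once. (The real cache is an LRU; size
// bounding is omitted here.)
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]entry
}

type entry struct {
	allowed bool
	expires time.Time
}

func (c *ttlCache) get(token string, check func(string) bool) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[token]; ok && time.Now().Before(e.expires) {
		return e.allowed // cache hit: no webhook round trip
	}
	allowed := check(token)
	c.entries[token] = entry{allowed, time.Now().Add(c.ttl)}
	return allowed
}

func main() {
	calls := 0
	webhook := func(string) bool { calls++; return true }
	c := &ttlCache{ttl: 2 * time.Minute, entries: map[string]entry{}}
	for i := 0; i < 4; i++ {
		c.get("bearer-abc", webhook)
	}
	fmt.Println("webhook calls:", calls) // 1
}
```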
Automatic merge from submit-queue
Extend secrets volumes with path control
As per [1], this PR extends secrets mapped into a volume with:
* key-to-path mapping, the same way as for configmap. E.g.:
```
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "mypod",
"namespace": "default"
},
"spec": {
"containers": [{
"name": "mypod",
"image": "redis",
"volumeMounts": [{
"name": "foo",
"mountPath": "/etc/foo",
"readOnly": true
}]
}],
"volumes": [{
"name": "foo",
"secret": {
"secretName": "mysecret",
"items": [{
"key": "username",
"path": "my-username"
}]
}
}]
}
}
```
Here the ``spec.volumes[0].secret.items`` field is added, changing the original target ``/etc/foo/username`` to ``/etc/foo/my-username``.
* secondly, refactoring the ``pkg/volumes/secrets/secrets.go`` volume plugin to use ``AtomicWriter`` to project a secret into a file.
[1] https://github.com/kubernetes/kubernetes/blob/master/docs/design/configmap.md#changes-to-secret
Automatic merge from submit-queue
Check status of framework.CheckPodsRunningReady
Check the status of framework.CheckPodsRunningReady and fail the test if it's false, instead of silently
ignoring the failure.
This doesn't fix whatever is causing the pod not to start in #17523, but it does fail the test as soon as it detects that the pod didn't start, instead of allowing the testing to proceed.
cc @kubernetes/sig-testing @spxtr @ixdy @kubernetes/rh-cluster-infra
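A sketch of the change in shape (the helper names stand in for the framework functions):
```go
package main

import "fmt"

// checkPodsRunningReady stands in for framework.CheckPodsRunningReady;
// the point is only that its boolean result is now checked and a failure
// ends the test instead of being ignored.
func checkPodsRunningReady(pods []string) bool { return len(pods) > 0 }

// failf stands in for framework.Failf.
func failf(format string, args ...interface{}) { panic(fmt.Sprintf(format, args...)) }

func main() {
	pods := []string{"kube-dns", "kube-proxy"}
	if !checkPodsRunningReady(pods) {
		// Previously the return value was dropped and the test carried on.
		failf("%d pods did not become running and ready", len(pods))
	}
	fmt.Println("all pods running and ready")
}
```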