github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Michail Kargakis	69bb4e4c84	test: add/remove myself from tests appropriately	2016-09-15 12:27:05 +02:00
Kubernetes Submit Queue	f2951a54f9	Merge pull request #30674 from ivan4th/add-e2e-tests-for-wrapped-volume-race Automatic merge from submit-queue Add e2e tests that check for wrapped volume race This PR adds two new e2e tests that reproduce the race condition fixed in #29641 (see e.g. #29297). In order to observe the race, you need to revert the PR that fixes it, via e.g. ``` git revert -n df1e925143 ``` or ``` curl -sL https://github.com/kubernetes/kubernetes/pull/29641.patch | patch -p1 -R ``` The tests are `[Slow]` because they need to run several passes that involve creating pods with many volumes. They are also `[Serial]` because the load on the cluster may affect reproducibility of the race. They take about 450s each when they fail on a standard GCE cluster created by `go run hack/e2e.go -v --up`. The `git_repo` test takes about 66s to run when it succeeds (fix PR not reverted) and the `configmap` test takes about 546s in this case because configmap mounting is slower and still requires 3 passes x 5 pods x 50 configmap volumes to fail consistently with the fix PR reverted. Probably these times can be reduced but frankly I've already spent quite a bit of time on tuning the numbers to find a balance between reproducibility and speed. Managed to reproduce the problem in a more or less reliable way for `configMap` and `gitRepo` volumes. Tried to reproduce it for `secret` volumes too but without success so far because they use tmpfs-based `emptyDir` variety. For `downwardAPI` volumes I expect the same problems with race reproducibility as with `secret` volumes, although I think some e2e races were caused by the bug, e.g. #29633. The tests operate by creating several pods (via an RC) with many volumes and waiting for them to become Running. They set node affinity for pods so that they all get created on a single node (the first one in the node list). The race condition leads to volume mount failures with slow retries, thus causing the test to time out. 
The test failures look like this: configmap: ``` • Failure [435.547 seconds] [k8s.io] Wrapped EmptyDir volumes /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709 should not cause race condition when used for configmaps [Serial] [Slow] [It] /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:170 Failed waiting for pod wrapped-volume-race-8c097734-6376-11e6-9ffa-5254003793ad-acbtt to enter running state Expected error: <errors.errorString | 0xc8201758d0>: { s: "timed out waiting for the condition", } timed out waiting for the condition not to have occurred /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395 ``` You'll see errors like this in the kubelet log on the first node in the cluster: ``` E0816 00:27:23.319431 3510 configmap.go:174] Error creating atomic writer: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory E0816 00:27:23.319478 3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14\" (\"e5986355-6347-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-16 00:28:27.319450118 +0000 UTC (durationBeforeRetry 1m4s). 
Error: MountVolume.SetUp failed for volume "kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14" (spec.Name: "racey-configmap-14") pod "e5986355-6347-11e6-a5d7-42010af00002" (UID: "e5986355-6347-11e6-a5d7-42010af00002") with: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory ``` git_repo: ``` • Failure [455.035 seconds] [0/1882] [k8s.io] Wrapped EmptyDir volumes /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709 should not cause race condition when used for git_repo [Serial] [Slow] [It] /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:179 Failed waiting for pod wrapped-volume-race-71b12b3d-6375-11e6-9ffa-5254003793ad-b0slz to enter running state Expected error: <errors.errorString | 0xc8201758d0>: { s: "timed out waiting for the condition", } timed out waiting for the condition not to have occurred /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395 ``` Errors in kubelet log: ``` E0815 23:41:08.670203 3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8\" (\"97636bd8-6341-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-15 23:42:12.670181604 +0000 UTC (durationBeforeRetry 1m4s). 
Error: MountVolume.SetUp failed for volume "kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8" (spec.Name: "racey-git-repo-8") pod "97636bd8-6341-11e6-a5d7-42010af00002" (UID: "97636bd8-6341-11e6-a5d7-42010af00002") with: failed to exec 'git clone http://10.0.68.35:2345 test': : chdir /var/lib/kubelet/pods/97636bd8-6341-11e6-a5d7-42010af00002/volumes/kubernetes.io~git-repo/racey-git-repo-8: no such file or directory ``` Generally, the races cause unexpected "no such directory" errors in kubelet logs with subsequent volume mount failures. I've added race tests to e2e test `empty_dir_wrapper.go` ("EmptyDir wrapper volumes"). This test was added in #18445, the same PR that introduced the race bug. The original purpose of the test was making sure that no conflicts occur between different wrapped emptyDir volumes, so I've replaced "should becomes" with "should not conflict" in the first `It(...)`.	2016-09-11 03:39:21 -07:00
Kubernetes Submit Queue	8780961e94	Merge pull request #32112 from soltysh/test_owners Automatic merge from submit-queue Updated test owners and assigned ScheduledJobs to soltysh I've updated test owners by running `hack/update_owners.py` and assigned all ScheduledJob related issues to myself. @fejta ptal	2016-09-09 00:48:14 -07:00
Kubernetes Submit Queue	ddcbdcb8c8	Merge pull request #31535 from aveshagarwal/master-e2e-downward-api-issues Automatic merge from submit-queue Fix downward api tests to output node allocatable not node capacity @kubernetes/rh-cluster-infra @derekwaynecarr	2016-09-07 16:25:19 -07:00
Maciej Szulik	ac1335c979	Updated test owners and assigned ScheduledJobs to soltysh	2016-09-06 11:38:57 +02:00
Ryan Hitchman	0c80bce7a7	Fix test owners for horizontal pod autoscaling.	2016-08-30 13:30:45 -07:00
Erick Fejta	fdb085ff61	Add missing tests	2016-08-29 15:22:06 -07:00
Avesh Agarwal	db74d4dbc2	Fix downward api tests to output node allocatable not node capacity	2016-08-26 16:13:24 -04:00
Erick Fejta	5c821c1fed	Update test assignments	2016-08-19 18:43:40 -07:00
Ivan Shvedunov	8ff00d17d8	Add e2e tests that check for wrapped volume race See #29641 for details.	2016-08-17 12:14:14 +03:00
Erick Fejta	17d91dd2ec	Assign Probing Container tests to Random-Liu	2016-08-09 17:20:00 -07:00
Kubernetes Submit Queue	7da75631f6	Merge pull request #29956 from david-mcmahon/test_owners Automatic merge from submit-queue Remove myself from test ownership. These are almost certainly not correct, but probably more likely owners than myself. @rmmh @dchen1107 @timstclair @erictune @mtaufen @caesarxuchao @fgrzadkowski @krousey @lavalamp	2016-08-04 00:01:51 -07:00
David McMahon	3a88747ef8	Remove myself from test ownership.	2016-08-03 14:34:31 -07:00
gmarek	f1167e9b9c	Change the owner of JSON NodeAffinity test	2016-08-03 10:42:07 +02:00
Alex Robinson	0ed8fa5693	Give away my e2e tests.	2016-08-02 22:43:20 +00:00
Ryan Hitchman	5d53b3a686	Update test-owners with new tests, add catch-all assignment to test-infra team. A future update to the munger will use this to assign any flake without an explicit owner to a member of the test-infra team.	2016-08-01 16:02:39 -07:00
Ryan Hitchman	616e938662	Address PR comments, randomly assign owners for new tests.	2016-07-06 13:22:53 -07:00
Ryan Hitchman	3d485098c3	Add test/test_owners.csv, for automatic assignment of test failures. This file will be read by the munger -- see kubernetes/contrib#1264 This also includes a simple script to do minor automatic updates to the CSV.	2016-07-01 17:39:14 -07:00


218 Commits (ce4fd07b0624d654b5b7c9bda4747b2b7f239876)