It can run tests against multiple existing images that match a regex.
GCI images will be using a regex.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
Automatic merge from submit-queue
Add test for supplemental gid annotation to pv e2e test
Test for feature added in #29119. Adds the annotation to the pv in the PersistentVolumes e2e test. It tests it by checking the output of 'id -G.' Also tests the actual use case of an nfs server by using an nfs server image 'volume-nfs-gid', modified slightly from 'volume-nfs' to have some permissions set.
@pmorie
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/30084)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Cut the client repo, staging it in the main repo
Tracking issue: #28559
ref: https://github.com/kubernetes/kubernetes/pull/25978#issuecomment-232710174
This PR implements the plan a few of us came up with last week for cutting client into its own repo:
1. creating "_staging" (name is tentative) directory in the main repo, using a script to copy the client and its dependencies to this directory
2. periodically publishing the contents of this staging client to k8s.io/client-go repo
3. converting k8s components in the main repo to use the staged client. They should import the staged client as if the client were vendored. (i.e., the import line should be `import "k8s.io/client-go/<pacakge name>`). This requirement is to ease step 4.
4. In the future, removing the staging area, and vendoring the real client-go repo.
The advantage of having the staging area is that we can continuously run integration/e2e tests with the latest client repo and the latest main repo, without waiting for the client repo to be vendored back into the main repo. This staging area will exist until our test matrix is vendoring both the client and the server.
In the above plan, the tricky part is step 3. This PR achieves it by creating a symlink under ./vendor, pointing to the staging area, so packages in the main repo can refer to the client repo as if it's vendored. To prevent the godep tool from messing up the staging area, we export the staged client to GOPATH in hack/godep-save.sh so godep will think the client packages are local and won't attempt to manage ./vendor/k8s.io/client-go.
This is a POC. We'll rearrange the directory layout of the client before merge.
@thockin @lavalamp @bgrant0607 @kubernetes/sig-api-machinery
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29147)
<!-- Reviewable:end -->
Automatic merge from submit-queue
E2E & Node E2E: Move pods test into common directory
This is the 4th part of #29494.
For #29081.
Based on #29092, #29806.
The first commit is squash of all dependent commits. Please only review the last 2 commits.
The 2nd commit migrates pods.go to `common/pods.go`.
Notice that the test `should be schedule with cpu and memory limits` is removed because:
* It doesn't make sense at the node level.
* It should have been tested in scheduler_predicates at the cluster level https://github.com/kubernetes/kubernetes/blob/master/test/e2e/scheduler_predicates.go#L264
The 3rd commit splits pods.go into several pods (nothing is changed, only move code):
* **Liveness probe test:** Moved into `container_probe.go`.
* **Init container test:** Moved into a new file `init_container.go`.
* **Others:** Still in pods.go
@vishh @timstclair
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29814)
<!-- Reviewable:end -->
Automatic merge from submit-queue
[GarbageCollector] only store typeMeta and objectMeta in the gc store
GC only needs to know the apiVersion, kind, and objectMeta of an object. This PR makes the stores of GC only save these fields.
cc @kubernetes/sig-api-machinery
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/28480)
<!-- Reviewable:end -->
This test creates three pods with QoS of besteffort, burstable, and
guaranteed, respectively, which each contain a container that tries to
consume almost all the available memory at a rate of about 12Mi/10sec.
The expectation is that eviction will be initiated when the hard
memory.available<250Mi threshold is triggered, and that eviction will proceed
in the order of besteffort, then burstable. Since guaranteed pods should
only be evicted if something charged to the host uses more resources
than were reserved for it, we currently end the test when besteffort and
burstable have both been evicted.
Note that this commit also sets --eviction-hard=memory.available<250Mi
to enable eviction during tests.
Automatic merge from submit-queue
Add integration test for volume controller startup.
Tests #28002 with real etcd (unit tests have a fake one with different behavior).
@kubernetes/sig-storage
Automatic merge from submit-queue
e2e flake: real fix of PodAntiAffinity test
The fix in PR https://github.com/kubernetes/kubernetes/pull/30135 was wrong in using a wrong test condition for an already broken test.
### Summary
The test tries to launch a pod with an anti-affinity annotation, waits 10
seconds and then checks that it is still pending.
But the anti-affinity annotation does not forbid to launch that pod on just
another node that does not have the zone label at all.
This commit changes this behavior by labeling two nodes with the zone label
and then forcing the pod to be launched on one of those two nodes.
**I assume here that a non-existing label is considered as a different label value.**
Fixes#30078
Automatic merge from submit-queue
SchedulerExtender: add failedPredicateMap in Filter() returns
Fix#25797. modify extender.Filter for adding extenders information to “failedPredicateMap” in findNodesThatFit.
When all the filtered nodes that passed "predicateFuncs" don’t pass the extenders filter, the failedPredicateMap hasn’t the extenders information, should add it, I think. So when the length of the “filteredNodes.Items” is 0, we can know the integral information. (The length of the “filteredNodes.Items” is 0, may be because the extenders filter failed.)
Automatic merge from submit-queue
Use report-dir in test framework instead.
We already have `report-dir` option in framework test context.
The node e2e framework should use it as well.
/cc @ronnielai
Automatic merge from submit-queue
node_e2e: Use upstream CoreOS image directly
.. and update it to the latest alpha
This will make updating the CoreOS image in the future much simpler since it won't involve project-copying, manual-baking, or so on.
cc @pwittrock @vishh @bboreham @yifan-gu
Automatic merge from submit-queue
E2E & Node E2E: Move configmap, docker_containers, downward_api, expansion and secrets test into common directory.
This is the 3rd part of #29494.
For #29081.
Based on #29092, #29806.
The first commit is squash of all dependent commits. Please only review the second commit.
The second PR added 17 lines.
@vishh @timstclair
Automatic merge from submit-queue
For e2e_node tests tell etcd to listen on ports 2379 and 4001
This is the default for etcd2, but etcd3 only listens on 2379.
Specifying the ports keeps things consistent no matter which version the user has installed.
Fixes#29117
Automatic merge from submit-queue
Bug fix: Use p.Name instead of pod.Name
For example, if you used `pod.GenerateName`, `pod.Name` might be the empty
string while `p.Name` contains the actual name of your pod. Thus passing
`pod.Name` can result in a `resource name may not be empty` error.
For example, if you used pod.GenerateName, pod.Name might be the empty
string while p.Name contains the actual name of your pod. Thus passing
pod.Name can result in a `resource name may not be empty` error.
The test tries to launch a pod with an anti-affinity annotation, waits 10
seconds and then checks that it is still pending.
But the anti-affinity annotation does not forbid to launch that pod on just
another node that does not have the zone label at all.
This commit changes this behavior by labeling two nodes with the zone label
and then forcing the pod to be launched on one of those two nodes.
Automatic merge from submit-queue
Node E2E: Move the node name initialization to first function of SynchronizedBeforeEach
Currently, we start e2e services in the first function of `SynchronizedBeforeEach` to make sure that we only start them once even we are running test in parallel test nodes.
However, e2e services require `NodeName`, but we initialize `NodeName` in the second function.
This PR moved the initialization logic into the first function, and shared the node name with all test nodes via the `SharedContext`.
Automatic merge from submit-queue
Add density (batch pods creation latency and resource) and resource performance tests to `test-e2e-node' built for Linux only
This PR adds `+build linux' to density_test.go, resource_usage.go and resource_collector.go to last PR #29764.
#29764 fails build because it depends on cgroup which can not be built for os other than Linux.
Automatic merge from submit-queue
E2E & NodeE2E: Move host_path, downwardapi_volume and empty_dir into common directory.
This is the second part of #29494.
For #29081.
Based on #29092, #29806.
The first commit is squash of all dependent commits. Please only review the second commit.
The second PR is only 20 lines of change.
@vishh @timstclair
Automatic merge from submit-queue
Node E2E: Change the node e2e junit file name to junit_{image-name}{test-node-number}.xml
Fixes https://github.com/kubernetes/kubernetes/issues/30103.
Reuse the `report-prefix` in e2e test framework. Now the junit file will be like: `junit_{image-name}{test-node-number}.xml`.
Mark P2 to fix the test result.
/cc @rmmh
Automatic merge from submit-queue
federation: Adding secret API
Adding secret API to federation-apiserver and updating the federation client to include secrets
Automatic merge from submit-queue
Added test to density that will run maximum capacity pods on nodes
Added a test to the Density Suite that will load the kubelets with their maximum capacity number of pods
Automatic merge from submit-queue
Install go-bindata in cross-build image
Another follow-up to #25584.
We need `go-bindata` to create `test/e2e/generated`, and downloading it with `go get` at build time is painful for a variety of reasons. We can just include it in the cross-build image and not worry about it, especially as it updates very infrequently.
This fixes `hack/update-generated-protobuf.sh` as well.
cc @jayunit100 @soltysh
Automatic merge from submit-queue
pv e2e refactor and pre-bind test
refactored persistentvolume e2e so that multiple It() tests can be run. Added one test case for pre-binding, but the overall structure of the test should allow additional test cases to be more easily added.
Automatic merge from submit-queue
Resolve docker-daemon cgroup issue for both systemd and non-systemd node for node e2e tests
Fixed https://github.com/kubernetes/kubernetes/issues/29827
cc/ @coufon this should unblock your pr: #29764
I validated both containervm image and coreos image, and works as expected.
This is also required for adding gci image to node e2e test infrastructure.
Automatic merge from submit-queue
Fix deployment e2e test: waitDeploymentStatus should error when entering an invalid state
Follow up #28162
1. We should check that max unavailable and max surge aren't violated at all times in e2e tests (didn't check this in deployment scaled rollout yet, but we should wait for it to become valid and then continue do the check until it finishes)
2. Fix some minor bugs in e2e tests
@kubernetes/deployment
Automatic merge from submit-queue
Remove myself from test ownership.
These are almost certainly not correct, but probably more likely owners than myself.
@rmmh @dchen1107 @timstclair @erictune @mtaufen @caesarxuchao @fgrzadkowski @krousey @lavalamp
Automatic merge from submit-queue
Fix 29992
Fix#29992.
I copied RC test code to the wrong place to the RS test in #29798. I took a look at the failure reports, they were all failed on the RS test, so #29798 itself is correct.
Marked as P2 since it fixes a test flake that will block everyone.
Automatic merge from submit-queue
Revert "Revert "Drop support for --gce-service-account, require activated creds""
Reverts kubernetes/kubernetes#29242
Automatic merge from submit-queue
Limit number of pods spawned in SchedulerPredicates validates resourc…
Fixes https://github.com/kubernetes/kubernetes/issues/29190,
With this patch test should spawn at most 10 pods on the smallest node.
Automatic merge from submit-queue
integration test: Modify PVs/PVCs during binding.
Previous volume binder code was not able to cope with PVs or PVCs getting modified during the binding process. Current one should be resilient to these changes, so let's test it.
It makes the test approximately twice as long as before, from ~2 seconds to ~4-5.
@kubernetes/sig-storage
Marking as 1.3 target, however it does not really matter here, it's just a test.
Add min size of pod and max number of pods for SchedulerPredicates validate resouce limits test
Fix typo in patch for SchedulerPredicates validate resouce limits test
Moving max number of pods and min pod cpu request to constants
Automatic merge from submit-queue
Add density (batch pods creation latency and resource) and resource performance tests to `test-e2e-node'
This PR contains two new tests (migrate from e2e test):
1. Density test: verify startup latency and resource usage when create a batch of pod with throughput control. Throughput control is done by sleep for an interval between firing concurrently create pod operations.
It tests both batch creation and sequential (back-to-back) creation and report the throughputs.
2. Verify resource usage of steady state kubelet.
The test creates a new resource controller for `test-node-e2e' (resource_controller.go) which monitors resource through a standalone Cadvisor pod (port 8090) with 1s housekeeping interval.
Automatic merge from submit-queue
[Garbage Collector] add e2e tests again
#27151 is reverted because gke didn't start correctly after it's merged (https://github.com/kubernetes/kubernetes/pull/27151#issuecomment-233030686).
The possible problem is the `unbound variable`, which is fixed in the second commit of this PR. However, I cannot verify if the PR will fail the gke suite since I don't have the environment to run that suite.
@wojtek-t @lavalamp
Automatic merge from submit-queue
Update test-owners with new tests, add catch-all assignment to test-infra team.
We will triage any additional failures, since they're more likely to be infra related. If they're not, they can always be reassigned (and the owners list can be updated!)
/cc @kubernetes/test-infra-maintainers
Automatic merge from submit-queue
Node E2E: Add serial jenkins job.
This PR added a jenkins job for serial test. It will run all serial test one by one.
This will be useful for https://github.com/kubernetes/kubernetes/pull/29809.
@coufon @yujuhong @dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Add support to quota pvc storage requests
Adds support to quota cumulative `PersistentVolumeClaim` storage requests in a namespace.
Per our chat today @markturansky @abhgupta - this is not done (lacks unit testing), but is functional.
This lets quota enforcement for `PersistentVolumeClaim` to occur at creation time. Supporting bind time enforcement would require substantial more work. It's possible this is sufficient for many, so I am opening it up for feedback.
In the future, I suspect we may want to treat local disk in a special manner, but that would have to be a different resource altogether (i.e. `requests.disk`) or something.
Example quota:
```
apiVersion: v1
kind: ResourceQuota
metadata:
name: quota
spec:
hard:
persistentvolumeclaims: "10"
requests.storage: "40Gi"
```
/cc @kubernetes/rh-cluster-infra @deads2k
Automatic merge from submit-queue
E2E & Node E2E: Add exec util in framework
For #29081.
Based on #29092 and #29494.
For first commit is a squashed commit of all old commits.
**The last 2 commits are new.**
This PR added exec util in framework, and moved `privileged.go` and `kubelet_etc_hosts` into `common` directory.
@vishh @timstclair
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Node E2E: Make node e2e parallel
For https://github.com/kubernetes/kubernetes/issues/29081.
Fix https://github.com/kubernetes/kubernetes/issues/26215.
Based on https://github.com/kubernetes/kubernetes/pull/28807, https://github.com/kubernetes/kubernetes/pull/29020, will rebase after they are merged.
**Only the last commit is new.**
We are going to move more tests into the node e2e test. However, currently node e2e test only run sequentially, the test duration will increase quickly when we add more test.
This PR makes the node e2e test run in parallel so as to shorten test duration, so that we can add more test to improve the test coverage.
* If you run the test locally with `make test-e2e-node`, it will use `-p` ginkgo flag, which uses `(cores-1)` parallel test nodes by default.
* If you run the test remotely or in the Jenkin, the parallelism will be controlled by the environment variable `PARALLELISM`. The default value is `8`, which is reasonable for our test node (n1-standard-1).
Before this PR, it took **833.592s** to run all test on my desktop.
With this PR, it only takes **234.058s** to run.
The pull request node e2e run with this PR takes **232.327s**.
The pull request node e2e run for other PRs takes **673.810s**.
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Adding GCI to node e2e.
Depends on https://github.com/kubernetes/kubernetes/pull/29486
Adding the dev release as of now since stable and beta run docker v1.9.1
which is incompatible with kubelet.
Automatic merge from submit-queue
Fix 29451
Fix#29451. I've also checked other tests in that file to make sure they don't have similar problems.
The issue is P0 and will block the submit queue, so I marked this PR as P0.
bindata and yaml, Gobindata automation
bindata utils for generating, go generate
match server version
gitignore for dirty, ca, rbase, KUBE_ROOT, buildfix
(rebased jul-25,29)
Automatic merge from submit-queue
Add API for StorageClasses
This is the API objects only required for dynamic provisioning picked apart from the controller logic.
Entire feature is here: https://github.com/kubernetes/kubernetes/pull/29006
Automatic merge from submit-queue
Remove redundant pod deletion in scheduler predicates tests and fix taints-tolerations e2e
~~In scheduler predicates test, some tests won't clean pods they created when exit with failure, which may lead to pod leak. This PR is to fix it.~~
Remove redundant pod deletion in scheduler predicates tests, since framework.AfterEach() already did the cleanup work after every test.
Also fix the test "validates that taints-tolerations is respected if not matching", refer to the change on taint-toleration test in #29003, and https://github.com/kubernetes/kubernetes/pull/24134#discussion_r63794924.
Automatic merge from submit-queue
make the resource prefix in etcd configurable for cohabitation
This looks big, its not as bad as it seems.
When you have different resources cohabiting, the resource name used for the etcd directory needs to be configurable. HPA in two different groups worked fine before. Now we're looking at something like RC<->RS. They normally store into two different etcd directories. This code allows them to be configured to store into the same location.
To maintain consistency across all resources, I allowed the `StorageFactory` to indicate which `ResourcePrefix` should be used inside `RESTOptions` which already contains storage information.
@lavalamp affects cohabitation.
@smarterclayton @mfojtik prereq for our rc<->rs and d<->dc story.
Automatic merge from submit-queue
Fix mount collision timeout issue
Short- or medium-term workaround for #29555. The root issue being fixed here is that the recent attach/detach work in the kubelet uses a unique volume name as a key that tracks the work that has to be done for each volume in a pod to attach/mount/umount/detach. However, the non-attachable volume plugins do not report unique names for themselves, which causes collisions when a single secret or configmap is mounted multiple times in a pod.
This is still a WIP -- I need to add a couple E2E tests that ensure that tests break in the future if there is a regression -- but posting for early review.
cc @kubernetes/sig-storage
Ultimately, I would like to refine this a bit further. A couple things I would like to change:
1. `GetUniqueVolumeName` should be a property ONLY of attachable volumes
2. I would like to see the kubelet apparatus for attach/mount/umount/detach handle non-attachable volumes specifically to avoid things like the `WaitForControllerAttach` call that has to be done for those volume types now
Automatic merge from submit-queue
Fix killing child sudo process in e2e_node tests
Fixes#29211.
The context is we are trying to kill a process started as `sudo kube-apiserver`, but `sudo` ignores signals from the same process group. Applying `Setpgid` means the `sudo kill` process won't be in the same process group, so will not fall foul of this nifty feature.
I also took the liberty of removing some code setting `Pdeathsig` because it claims to be doing something in the same area, but actually it doesn't do that at all. The setting is applied to the forked process, i.e. `sudo`, and it means the `sudo` will get killed if we (`e2e_node.test`) die. This (a) isn't what the comment says and (b) doesn't help because sending SIGKILL to the sudo process leaves sudo's child alive.
I didn't use the "hack for linux-only" approach because I think `Setpgid` is available on all platforms that `e2e_node` builds on.
This is the default for etcd2, but etcd3 only listens on 2379.
Specifying the ports keeps things consistent no matter which
version the user has installed.
Automatic merge from submit-queue
Faster test
<!--
Checklist for submitting a Pull Request
Please remove this comment block before submitting.
1. Please read our [contributor guidelines](https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md).
2. See our [developer guide](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md).
3. If you want this PR to automatically close an issue when it is merged,
add `fixes #<issue number>` or `fixes #<issue number>, fixes #<issue number>`
to close multiple issues (see: https://github.com/blog/1506-closing-issues-via-pull-requests).
4. Follow the instructions for [labeling and writing a release note for this PR](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes) in the block below.
-->
In attempting to troubleshoot flakes with this test case I actually wanted to understand how it worked.
There's some poor comments that need work.
I added some additional output which may or may not help in debugging the flakes.
I doubt this fixes the flake.
My major concern is the 'refactor' I did of the test case to batch up runs by sub-test-case. As it stood there was a 200ms pause between each sub, so they should not have interfered with each other. Now they are just started as fast as possible, but only 20 run at a time before moving on to the next 20. I am not sure if I am violating the ethos of the original test case.
Runs on my computer are down from 2m40s -> 40s.
Getting rid of the arbitrary client limiting brings it down to ~12 seconds. 11 to fetch the image and <1 to actually run the tests against the proxies. I can add a zero to the number of loops if you want to hit it harder. It would result in 10x as much text output though.
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Add support for kubectl create quota command
Follow-up of https://github.com/kubernetes/kubernetes/pull/19625
```
Create a resourcequota with the specified name, hard limits and optional scopes
Usage:
kubectl create quota NAME [--hard=key1=value1,key2=value2] [--scopes=Scope1,Scope2] [--dry-run=bool] [flags]
Aliases:
quota, q
Examples:
// Create a new resourcequota named my-quota
$ kubectl create quota my-quota --hard=cpu=1,memory=1G,pods=2,services=3,replicationcontrollers=2,resourcequotas=1,secrets=5,persistentvolumeclaims=10
// Create a new resourcequota named best-effort
$ kubectl create quota best-effort --hard=pods=100 --scopes=BestEffort
```
Automatic merge from submit-queue
Rework pod waiting mechanism in e2e tests to accept pod and watch based
This PR re-applies #28212 which was reverted in #29223. The only difference is that the initial PR contained also `PodStartTimeout` shortening (see [here](4b0c0bd924)) which might caused the problems. Let's give it a 2nd try. I've tested all the flakes and they were passing on my machine.
@smarterclayton @apelisse ptal
- what the test is doing
- how the test is set up
- subsections of the test setup
additional output
- print time spent getting ready to run proxy attempts
- number of test cases
- multiple attempts of each test case
- how many total proxying attempts will be made
- fast path output now has numerical identity of attempt like error output
- error output has time taken and http status like fast path output
batching runs
- run groups of test cases vs starting all 34*20=680 proxy attempts at
the same time.
- don't wait between starting proxy attempts anymore.
proxy e2e changes
- disable the client side rate limiter
- use `By` construct of ginkgo for inline `STEP` logging
- move the waitGroup add outside of the loop
Automatic merge from submit-queue
Syncing imaging pulling backoff logic
- Syncing the backoff logic in the parallel image puller and the sequential image puller to prepare for merging the two pullers into one.
- Moving image error definitions under kubelet/images
Automatic merge from submit-queue
Change SETUP_NODE to True for node e2e docker validation test.
The continuous node e2e docker validation test is failing because:
```
W0722 00:48:52.163940 1265 image_list.go:85] Could not pre-pull image gcr.io/google_containers/netexec:1.4 exit status 1 output: Cannot connect to the Docker daemon. Is the docker daemon running on this host?
```
This is because jenkins is not added to docker user group.
For other images tested in node e2e, jenkins is added to docker user group when the images are initially created https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/environment/setup_host.sh#L102.
However, in node e2e docker validation test, we are using GCI image which doesn't do that.
So we should use the `SETUP_NODE` option to add user to docker group before test running b6c87904f6/test/e2e_node/e2e_remote.go (L150-L159).
This is only one line change, could you help me review the PR? @wonderfly
Thanks a lot! :)
Automatic merge from submit-queue
test/e2e: plug time.Ticker resource leak.
This commit ensures that `logPodStartupStatus` does not leak
running `time.Ticker` instances. Upon termination of the consuming
routine, we stop the ticker.
Automatic merge from submit-queue
use regular client instead of kubectl in scheduler predicate tests when checking/setting/cleanning taints/labels
The existing implementation in scheduler predicate tests uses kubectl to check/set/clean taints/labels on node, which makes the test very related to kubectl.
This PR is to use regular client instead.
Automatic merge from submit-queue
Revert "Drop support for --gce-service-account, require activated creds"
Reverts kubernetes/kubernetes#28802
This appears to break the soak tests with "invalid grant" errors -- see the recent batch of errors in #27920.
Automatic merge from submit-queue
Allow for overriding throughput in load test
We seem to be already supporting higher throughput that what the default is.
I'm going to increase the throughput in our tests:
- speed up scalability tests
- ensure that what I'm seeing locally is really the repeatable case
This PR is a short preparation for those experiments.
[Ideally, I would like to have kubemark-500 to be finishing within 30 minutes. And I think this should be doable pretty soon.]
@gmarek
Automatic merge from submit-queue
Change some node e2e test to use the prepull image framework.
Fix https://github.com/kubernetes/kubernetes/issues/28868.
Node e2e test framework pre-pulls all images in [image_list.go](bc2f223f5a/test/e2e_node/image_list.go)
All node e2e test should use image from the "image_list". If a test needs new image, we should update the image_list to include the new image.
/cc @kubernetes/sig-node to notice people to use `image_list` when adding test. :)
Automatic merge from submit-queue
add tokenreviews endpoint to implement webhook
Wires up an API resource under `apis/authentication.k8s.io/v1beta1` to expose the webhook token authentication API as an API resource. This allows one API server to use another for authentication and uses existing policy engines for the "authoritative" API server to controller access to the endpoint.
@cjcullen you wrote the initial type
Automatic merge from submit-queue
Start namespace controller in node e2e
Fix https://github.com/kubernetes/kubernetes/issues/28320.
Based on https://github.com/kubernetes/kubernetes/pull/28807, only the last 2 commits are new.
Before this PR, there was no namespace controller running in node e2e test infrastructure. We can not enable the [`delete-namespace`](f2ddd60eb9/test/e2e/framework/test_context.go (L109)) flag in the test framework.
So after the test running, there will be running pod left on the test node. This seems to be acceptable in our test infrastructure because we create an new instance each time.
However, in 1.4 we may want to provide part of the test as node conformance test to the user, they definitely don't want the test to leave tons of pods on their node after test running.
Currently, there is no easy way to only start namespace controller in kube-controller-manager (confirmed with @mikedanese), so in this PR I started a "uncontainerized" one in the test infrastructure.
This PR:
* Started the namespace controller in the node e2e test infrastructure and enable the automatic namespace deletion.
* Change the privileged test to use framework (@yujuhong), so that all node e2e tests are using the framework and test pods will be cleaned up by namespace controller.
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Switched watches in tests require ResourceVersion to be passed
For testing the Watches are not sufficient in that it might miss the event of transitioning a Pod from one state to another which might happen before we start Watching events. To remedy this, I'm proposing to switch to Gets to always read the actual state of a Pod.
@smarterclayton this fixes https://github.com/openshift/origin/issues/9192 and hopefully all `gave up waiting for pod...` flakes
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Don't repeat the program name in healthCheckCommand.String()
The name is in both `Path` and `Args[0]`, so start printing args at 1.
Also refactor to avoid an extra space character in the output.
I pondered whether `healthCheckCommand.String()` should check if the slice is empty, to avoid a panic, but it didn't check for `Cmd==nil` before.
Fixes#29107
Automatic merge from submit-queue
Change the docker validation node e2e test to use gci-canary-test
This PR changed the continuous docker validation node e2e test to use the image config file introduced in https://github.com/kubernetes/kubernetes/pull/28708. @euank
This PR also changed the gci image family from `gci-preview-test` to `gci-canary-test`. @wonderfly
Automatic merge from submit-queue
Return (bool, error) in Authorizer.Authorize()
Before this change, Authorize() method was just returning an error, regardless of whether the user is unauthorized or whether there is some other unrelated error. Returning boolean with information about user authorization and error (which should be unrelated to the authorization) separately will make it easier to debug.
Fixes#27974
Automatic merge from submit-queue
Node E2E: Make it possible to share test between e2e and node e2e
This PR is part of the plan to improve node e2e test coverage.
* Now to improve test coverage, we have to copy test from e2e to node e2e.
* When adding a new test, we have to decide its destiny at the very beginning - whether it is a node e2e or e2e.
This PR makes it possible to share test between e2e and node e2e.
By leveraging the mechanism of ginkgo, as long as we can import the test package in the test suite, the corresponding `Describe` will be run to initialize the global variable `_`, and the test will be inserted into the test suite. (See https://github.com/onsi/composition-ginkgo-example)
In the future, we just need to use the framework to write the test, and put the test into `test/e2e/node`, then it will be automatically shared by the 2 test suites.
This PR:
1) Refactored the framework to make it automatically differentiate e2e and node e2e (Mainly refactored the `PodClient` and the apiserver client initialization).
2) Created a new directory `test/e2e/node` and make it shared by e2e and node e2e.
3) Moved `container_probe.go` into `test/e2e/node` to verify the change.
@kubernetes/sig-node
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
[flake fix] Wait for the podInformer to observe the pod
Fix#29065
The problem is that the rc manager hasn't observed pod1, so it creates another pod and scales down, pod1 might get deleted. To fix it, wait for the podInformer to observe the pod before running the rc manager.
Marked as P0 as it's fixing a P0 flake.
Automatic merge from submit-queue
Drop support for --gce-service-account, require activated creds
Now that `gcloud auth activate-service-account` is in remove support in the test framework for default service accounts -- testing GCE/GKE now requires prior gcloud activation.
This commit ensures that `logPodStartupStatus` does not leak
running `time.Ticker` instances. Upon termination of the consuming
routine, we stop the ticker.
Before this change, Authorize() method was just returning an error,
regardless of whether the user is unauthorized or whether there
is some other unrelated error. Returning boolean with information
about user authorization and error (which should be unrelated to
the authorization) separately will make it easier to debug.
Fixes#27974
Automatic merge from submit-queue
Fix verify results in MaxPods
As we already have "unschedulable" PodCondition we can stop relying on Events, which should make the tests more reliable.
cc @davidopp
Automatic merge from submit-queue
authorize based on user.Info
Update the `authorization.Attributes` to use the `user.Info` instead of discrete getters for each piece.
@kubernetes/sig-auth
Automatic merge from submit-queue
Fix a bug in mirror pod node e2e test.
Fixed a bug in test/e2e_node/mirror_pod_test.go. The function 'checkMirrorPodDisappear' returns nil even when the pod does not disappear. It should return a non-nil error.
@Random-Liu
Automatic merge from submit-queue
[GarbageCollector] Let the RC manager set/remove ControllerRef
What's done:
* RC manager sets Controller Ref when creating new pods
* RC manager sets Controller Ref when adopting pods with matching labels but having no controller
* RC manager clears Controller Ref when pod labels change
* RC manager clears pods' Controller Ref when rc's selector changes
* RC manager stops adoption/creating/deleting pods when rc's DeletionTimestamp is set
* RC manager bumps up ObservedGeneration: The [original code](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/replication/replication_controller_utils.go#L36) will do this.
* Integration tests:
* verifies that changing RC's selector or Pod's Labels triggers adoption/abandoning
* e2e tests (separated to #27151):
* verifies GC deletes the pods created by RC if DeleteOptions.OrphanDependents=false, and orphans the pods if DeleteOptions.OrphanDependents=true.
TODO:
- [x] we need to be able to select Pods that have a specific ControllerRef. Then each time we sync the RC, we will iterate through all the Pods that has a controllerRef pointing the RC, event if the labels of the Pod doesn't match the selector of RC anymore. This will prevent a Pod from stuck with a stale controllerRef, which could be caused by the race between abandoner (the goroutine that removes controllerRef) and worker the goroutine that add controllerRef to pods).
- [ ] use controllerRef instead of calling `getPodController`. This might be carried out by the control-plane team.
- [ ] according to the controllerRef proposal (#25256): "For debugging purposes we want to add an adoptionTime annotation prefixed with kubernetes.io/ which will keep the time of last controller ownership transfer." This might be carried out by the control-plane team.
cc @lavalamp @gmarek
Automatic merge from submit-queue
[garbage collector] add e2e test
This PR also includes some changes to plumb controller-manager's `--enable_garbage_collector` from the environment variable.
The e2e test will not be run by the core suite because it's marked `[Feature:GarbageCollector]`.
The corresponding jenkins job configuration PR is https://github.com/kubernetes/test-infra/pull/132.
Automatic merge from submit-queue
Support terminal resizing for exec/attach/run
```release-note
Add support for terminal resizing for exec, attach, and run. Note that for Docker, exec sessions
inherit the environment from the primary process, so if the container was created with tty=false,
that means the exec session's TERM variable will default to "dumb". Users can override this by
setting TERM=xterm (or whatever is appropriate) to get the correct "smart" terminal behavior.
```
Fixes#13585