Automatic merge from submit-queue (batch tested with PRs 51824, 50476, 52451, 52009, 52237)
Improve apiserver metrics reporting
Normalize "WATCHLIST" to "WATCH", add "scope" to the other metrics (listing 50k pods is != listing pods in a namespace), and add a new scope "resource" to cover individual resource calls.
This roughly aligns metrics with our ACL model (technically resource scope is GET, but POST to a subresource and POST to a namespace are not the same thing).
```release-note
WATCHLIST calls are now reported as WATCH verbs in prometheus for the apiserver_request_* series. A new "scope" label is added to all apiserver_request_* values that is either 'cluster', 'resource', or 'namespace' depending on which level the query is performed at.
```
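For illustration, here is a hedged sketch of what the reporting roughly looks like; the metric name, helper function, and wiring are assumptions, not the actual apiserver code:

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// Illustrative only: a request counter broken out by verb and scope,
// mirroring the labels described in the release note above.
var requestCounter = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "apiserver_request_count",
		Help: "Counter of apiserver requests, broken out by verb and scope.",
	},
	[]string{"verb", "scope"},
)

func init() {
	prometheus.MustRegister(requestCounter)
}

// recordRequest folds WATCHLIST into WATCH and records the request under
// its scope ("cluster", "namespace", or "resource").
func recordRequest(verb, scope string) {
	if verb == "WATCHLIST" {
		verb = "WATCH"
	}
	requestCounter.WithLabelValues(verb, scope).Inc()
}
```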
Automatic merge from submit-queue (batch tested with PRs 52376, 52439, 52382, 52358, 52372)
Remove the conversion of client config
It was needed because the clientset code in client-go was a copy of the clientset code in Kubernetes. Now that client-go is authoritative, we can remove the nasty copy.
It turns out that at some points while the Node is recovering from a
reboot, we get a different kind of error ("unable to upgrade
connection"). Since we can't distinguish these transient errors from an
error encountered after successfully executing the remote command,
let's just retry all errors for 5min. If this doesn't work, I'm gonna
blame it on sig-node.
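A minimal sketch of that retry, assuming a hypothetical `runHostCmd` closure for the remote command (not the real e2e helper):

```go
package e2e

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// retryHostCmd retries every error for up to 5 minutes, since transient
// errors during a Node reboot can't be told apart from real failures.
func retryHostCmd(runHostCmd func() (string, error)) (string, error) {
	var out string
	err := wait.PollImmediate(5*time.Second, 5*time.Minute, func() (bool, error) {
		var cmdErr error
		out, cmdErr = runHostCmd()
		if cmdErr != nil {
			return false, nil // treat every error as transient and retry
		}
		return true, nil
	})
	return out, err
}
```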
Automatic merge from submit-queue
Pipe in upgrade image target for kube-proxy migration tests
**What this PR does / why we need it**:
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-upgrade-kube-proxy-ds&width=20
and
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-downgrade-kube-proxy-ds&width=20
are still failing.
Reproduced it locally and found the node image is being defaulted to debian during the upgrade (it was gci before the upgrade) because we don't pass in `gci` via `--upgrade-target`. And for some reason (haven't figured out why yet), the upgraded node uses the debian image with gci startup scripts...
This PR pipes in `--upgrade-target` for kube-proxy migration tests, hopefully in conjunction with https://github.com/kubernetes/test-infra/pull/4447 it will bring the tests back to normal.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE
**Special notes for your reviewer**:
Sorry for bothering again.
/assign @krousey
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52097, 52054)
Move paused deployment e2e tests to integration
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #52113
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
StatefulSet: Deflake e2e RunHostCmd.
The initial retry up to 20s was giving up too soon. I'm seeing this test flake because the Node rebooted and it takes ~2min to recover. Now StatefulSet RunHostCmd calls will use the same 5min timeout as with other Pod state checks.
ref #48031
The feature is still Alpha and at times, the IP address previously used
by the load balancer in the test will not be completely freed even after
the load balancer is long gone. In this case, the test URL with the IP
would return a 404 response. Tolerate this error and retry until the new
load balancer is fully established.
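A rough sketch of that tolerance (the timings and URL handling are placeholders, not the test's actual values): a 404 served from a stale load-balancer IP is treated as "not ready yet" rather than a failure.

```go
package e2e

import (
	"net/http"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// waitForLoadBalancer polls the test URL until it answers with 200,
// retrying on transport errors and on 404s from a not-yet-freed IP.
func waitForLoadBalancer(url string) error {
	return wait.Poll(10*time.Second, 15*time.Minute, func() (bool, error) {
		resp, err := http.Get(url)
		if err != nil {
			return false, nil // transient; retry
		}
		defer resp.Body.Close()
		if resp.StatusCode == http.StatusNotFound {
			return false, nil // old LB IP still answering; retry
		}
		return resp.StatusCode == http.StatusOK, nil
	})
}
```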
Job failure policy integration in JobController. From the
JobSpec.BackoffLimit, the JobController determines the backoff
duration between Job retries.
It uses the `workqueue.RateLimitingInterface` to store the number of
"retries" as "requeues", and the default Job backoff initial duration is set
during the initialization of the `workqueue.RateLimiter`.
Since the number of retries for each Job is stored in a local structure
(`JobController.queue`), if the JobController restarts, the number of retries
will be lost and the backoff duration will be reset to 0.
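As a rough sketch of the mechanism (the durations below are placeholders, not the controller's actual defaults), the queue can be built on an exponential failure rate limiter, so each requeue of the same key backs off further:

```go
package jobsketch

import (
	"time"

	"k8s.io/client-go/util/workqueue"
)

// newJobQueue builds a rate-limited queue whose per-key requeue count
// drives the backoff. The count lives only in this in-memory queue, so a
// controller restart resets the backoff to its initial duration.
func newJobQueue() workqueue.RateLimitingInterface {
	limiter := workqueue.NewItemExponentialFailureRateLimiter(10*time.Second, 6*time.Minute)
	return workqueue.NewRateLimitingQueue(limiter)
}
```

On a failed sync the key goes back via `AddRateLimited`, and `Forget` clears the per-key count once the Job succeeds.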
Add e2e test for Job backoff failure policy
Automatic merge from submit-queue
e2e: Add tests for network tiers in GCE
This test depends on #51301, which adds the new feature. Only the `e2e: Add tests for network tiers in GCE` commit is new.
#51301 should pass this new test.
Automatic merge from submit-queue (batch tested with PRs 47054, 50398, 51541, 51535, 51545)
Switch away from gcloud deprecated flags in compute resource listings
**What is fixed**
Remove deprecated `gcloud compute` flags, see linked issue.
**Which issue this PR fixes**:
fixes #49673
**Special notes for your reviewer**:
The change in `gcloudComputeResourceList` in `test/e2e/framework/ingress_utils.go` isn't strictly needed as currently no affected resources are called on within that file, however the function has the _potential_ to access affected resources so I covered it as well. Happy to change if deemed unnecessary.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51228, 50185, 50940, 51544, 51543)
Add upgrades tests for kube-proxy daemonset migration path
**What this PR does / why we need it**:
From #23225, this is a part of setting up CIs to validate the kube-proxy migration path (static pods -> daemonset and reverse).
The other part of the works (adding real CIs that run these tests) will be in a separate PR against [kubernetes/test-infra](https://github.com/kubernetes/test-infra).
Though this is currently blocked by #50705.
**Special notes for your reviewer**:
cc @roberthbailey @pwittrock
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51377, 46580, 50998, 51466, 49749)
Adding e2e SELinux test for local storage
Adding e2e test for SELinux enabled local storage
/sig storage
Closes #45054
Automatic merge from submit-queue (batch tested with PRs 44719, 48454)
check job ActiveDeadlineSeconds
**What this PR does / why we need it**:
enqueue a sync task after ActiveDeadlineSeconds
**Which issue this PR fixes**:
fixes #32149
**Special notes for your reviewer**:
**Release note**:
```release-note
Enqueue a sync task to wake up the JobController so it checks a Job's ActiveDeadlineSeconds in time
```
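A hedged sketch of the idea described in this PR (the helper name and wiring are assumptions, not the actual JobController code): when a Job has `ActiveDeadlineSeconds` set, re-enqueue its key so the controller wakes up once the deadline can have expired.

```go
package jobsketch

import (
	"time"

	batchv1 "k8s.io/api/batch/v1"
	"k8s.io/client-go/util/workqueue"
)

// requeueAfterDeadline schedules another sync for when the Job's
// ActiveDeadlineSeconds expires, instead of waiting for an unrelated event.
func requeueAfterDeadline(queue workqueue.RateLimitingInterface, key string, job *batchv1.Job) {
	if job.Spec.ActiveDeadlineSeconds == nil || job.Status.StartTime == nil {
		return
	}
	deadline := time.Duration(*job.Spec.ActiveDeadlineSeconds) * time.Second
	if remaining := deadline - time.Since(job.Status.StartTime.Time); remaining > 0 {
		queue.AddAfter(key, remaining)
	}
}
```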
Automatic merge from submit-queue
AllowedNotReadyNodes allowed to be not ready for absolutely *any* reason
It's effectively the same as allowing that many nodes to not be part of the cluster at all, ever.
Btw - currently our 5k-node correctness test fails on errors like "kubelet stopped posting node status" or "route not created" (ref: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-correctness/3/build-log.txt)
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue (batch tested with PRs 50213, 50707, 49502, 51230, 50848)
StatefulSet: Deflake e2e `kubectl exec` commands.
This may help with another source of flakiness found while investigating #48031.
We seem to get a lot of flakes due to "connection refused" while running `kubectl exec`. I can't find any reason this would be caused by the test flow, so I'm adding retries to see if that helps.
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)
StatefulSet: Deflake e2e "restart" phase.
This addresses another source of flakiness found while investigating #48031.
The test used to scale the StatefulSet down to 0, wait for ListPods to return 0 matching Pods, and then scale the StatefulSet back up.
This was prone to a race in which StatefulSet was told to scale back up before it had observed its own deletion of the last Pod, as evidenced by logs showing the creation of Pod ss-1 prior to the creation of the replacement Pod ss-0.
Instead, we now wait for the controller to observe all deletions before scaling it back up. This should fix flakes of the form:
```
Too many pods scheduled, expected 1 got 2
```
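A hedged sketch of that wait (`getStatefulSet` is a hypothetical accessor, not the e2e framework's helper): after scaling to 0, block until the controller itself reports zero replicas, i.e. it has observed every Pod deletion.

```go
package e2e

import (
	"time"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/wait"
)

// waitForZeroObservedReplicas returns once the StatefulSet controller's
// own status shows 0 replicas, so scaling back up can't race with a
// not-yet-observed deletion.
func waitForZeroObservedReplicas(getStatefulSet func() (*appsv1.StatefulSet, error)) error {
	return wait.PollImmediate(2*time.Second, 5*time.Minute, func() (bool, error) {
		ss, err := getStatefulSet()
		if err != nil {
			return false, err
		}
		return ss.Status.Replicas == 0, nil
	})
}
```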
Automatic merge from submit-queue (batch tested with PRs 51108, 51035, 50539, 51160, 50947)
Auto-calculate CLUSTER_IP_RANGE based on cluster size
In preparation for eliminating the CLUSTER_IP_RANGE env var from job configs, making it less error-prone when folks try to start their own large-cluster tests (https://github.com/kubernetes/kubernetes/issues/50907).
/cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
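As a rough illustration of the idea only (the base address and sizing rule are placeholders, not the actual cluster/gce defaults), the cluster CIDR can be derived from the node count so that every node still gets a /24 Pod range:

```go
package clustersketch

import (
	"fmt"
	"math"
)

// suggestedClusterIPRange picks a mask just wide enough to hand each node
// a /24 Pod CIDR. The 10.64.0.0 base is a placeholder, not the real default.
func suggestedClusterIPRange(numNodes int) string {
	if numNodes < 1 {
		numNodes = 1
	}
	bitsForNodes := int(math.Ceil(math.Log2(float64(numNodes))))
	mask := 24 - bitsForNodes
	if mask < 8 {
		mask = 8
	}
	return fmt.Sprintf("10.64.0.0/%d", mask)
}
```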
Automatic merge from submit-queue
StatefulSet: Deflake e2e "Saturate" phase.
This should reduce one source of flakiness found while investigating #48031.
The "Saturate" phase of StatefulSet e2e tests verifies orderly startup by controlling when each Pod is allowed to report Ready. If a Pod unexepectedly goes down during the test, the replacement Pod
created by the controller will forget if it was already allowed to report Ready.
After this change, the signal that allows each Pod to report Ready is persisted in the Pod's PVC. Thus, the replacement Pod will remember that it was already told to proceed to a Ready state.
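A minimal sketch of the mechanism (the file name and helper are made up): the Pod reports Ready only once a signal file exists on its PVC, so a replacement Pod reusing the same volume remembers it was already told to proceed.

```go
package e2e

import (
	"os"
	"path/filepath"
)

// allowedToReportReady checks for the signal file persisted on the PVC;
// its presence survives Pod replacement because the volume does.
func allowedToReportReady(pvcMountPath string) bool {
	_, err := os.Stat(filepath.Join(pvcMountPath, "signal"))
	return err == nil
}
```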
Automatic merge from submit-queue (batch tested with PRs 50893, 50913, 50963, 50629, 50640)
Increase latency threshold for list api calls
This is only a short-term solution to make our density test green. In the long-term, we should measure as per our new SLIs.
From @wojtek-t's [doc](https://docs.google.com/document/d/1Q5qxdeBPgTTIXZxdsFILg7kgqWhvOwY8uROEf0j5YBw) on the new SLIs/SLOs, we have the following SLO for list calls:
```
SLO1: In default Kubernetes installation, 99th percentile of SLI2 per cluster-day:
<= 1s if total number of objects of the same type as resource in the system <= X
<= 5s if total number of objects of the same type as resource in the system <= Y
<= 30s if total number of objects of the same types as resource in the system <= Z
```
I would guess that 170,000 pods would fall into the 2nd bracket (at least) and hence the new value of 5s. WDYT?
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
The "Saturate" phase of StatefulSet e2e tests verifies orderly startup
by controlling when each Pod is allowed to report Ready.
If a Pod unexepectedly goes down during the test, the replacement Pod
created by the controller will forget if it was already allowed to
report Ready.
After this change, the signal that allows each Pod to report Ready is
persisted in the Pod's PVC. Thus, the replacement Pod will remember that
it was already told to proceed to a Ready state.
Automatic merge from submit-queue (batch tested with PRs 46512, 50146)
Make metav1.(Micro)?Time functions take pointers
Is there any reason for those functions not to be on pointers?
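For illustration, a minimal stand-in showing the shape of the change (not the exact metav1 code): with a pointer receiver the method avoids copying the struct and can treat a nil `*Time` as the zero value.

```go
package metasketch

import "time"

// Time is a stand-in for metav1.Time, just to show the receiver change.
type Time struct {
	time.Time
}

// IsZero on a pointer receiver handles nil and avoids a struct copy.
func (t *Time) IsZero() bool {
	return t == nil || t.Time.IsZero()
}
```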
What this PR does / why we need it:
This adds an e2e test for aggregation based on the sample-apiserver.
Currently it uses a sample-apiserver built as of 1.7.
This should ensure that the aggregation system works end-to-end.
It will also help detect if we break "old" extension api servers.
Which issue this PR fixes:
fixes #43714
Fixed bazel for the change.
Fixed # of args issue from govet.
Added code to test dynamic.Client.