Commit Graph

10031 Commits (ba535d57f6e2eb338f6fbf121d6be2e6f9204136)

Author SHA1 Message Date
Kubernetes Submit Queue a3f40dd8df
Merge pull request #60856 from jiayingz/race-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixes the races around devicemanager Allocate() and endpoint deletion.

There is a race in predicateAdmitHandler Admit() that getNodeAnyWayFunc()
could get Node with non-zero deviceplugin resource allocatable for a
non-existing endpoint. That race can happen when a device plugin fails,
but is more likely when kubelet restarts as with the current registration
model, there is a time gap between kubelet restart and device plugin
re-registration. During this time window, even though devicemanager could
have removed the resource initially during GetCapacity() call, Kubelet
may overwrite the device plugin resource capacity/allocatable with the
old value when node update from the API server comes in later. This
could cause a pod to be started without proper device runtime config set.

To solve this problem, introduce endpointStopGracePeriod. When a device
plugin fails, don't immediately remove the endpoint but set stopTime in
its endpoint. During kubelet restart, create endpoints with stopTime set
for any checkpointed registered resource. The endpoint is considered to be
in stopGracePeriod if its stopTime is set. This allows us to track what
resources should be handled by devicemanager during the time gap.
When an endpoint's stopGracePeriod expires, we remove the endpoint and
its resource. This allows the resource to be exported through other channels
(e.g., by directly updating node status through API server) if there is such
a use case. Currently endpointStopGracePeriod is set to 5 minutes.

Given that an endpoint is no longer immediately removed upon disconnection,
mark all its devices unhealthy so that we can signal the resource allocatable
change to the scheduler to avoid scheduling more pods to the node.
When a device plugin endpoint is in stopGracePeriod, pods requesting the
corresponding resource will fail the admission handler.

Tested:
Ran the GPUDevicePlugin e2e_node test 100 times; all runs now pass.



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/60176

**Special notes for your reviewer**:

**Release note**:

```release-note
Fixes the races around devicemanager Allocate() and endpoint deletion.
```
2018-03-12 02:50:13 -07:00
Kubernetes Submit Queue 36058cb0c3
Merge pull request #60997 from MrHohn/e2e-fix-cleanup-svc-regional
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[e2e service] Fix CleanupGCEResources for regional test

**What this PR does / why we need it**:
According to https://k8s-testgrid.appspot.com/google-gke-staging#gke-staging-1-8-1-9-upgrade-regional-cluster&width=20, the regional cluster test is failing because the GCE resource cleanup function attempts to parse the region from `--gcp-zone`, while regional clusters only set `--gcp-region`.

This PR pipes the region into the cleanup function as well. This will need to be cherry-picked to 1.8 and 1.9.
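
A minimal sketch of the idea, not the actual `CleanupGCEResources` change: prefer an explicitly supplied region and only fall back to deriving one from the zone. The function names `cleanupRegion` and `regionFromZone` are hypothetical, and the derivation assumes zone names of the form `<region>-<suffix>` (for example `us-central1-a`).

```go
package main

import (
	"fmt"
	"strings"
)

// regionFromZone derives a region by dropping the final "-<suffix>" of a zone
// name. This assumes zone names look like "<region>-<suffix>".
func regionFromZone(zone string) (string, error) {
	i := strings.LastIndex(zone, "-")
	if i <= 0 {
		return "", fmt.Errorf("cannot derive region from zone %q", zone)
	}
	return zone[:i], nil
}

// cleanupRegion prefers an explicitly supplied region (the regional cluster
// case) and falls back to parsing the zone (the zonal cluster case).
func cleanupRegion(region, zone string) (string, error) {
	if region != "" {
		return region, nil
	}
	return regionFromZone(zone)
}

func main() {
	for _, c := range []struct{ region, zone string }{
		{"us-central1", ""},   // regional cluster: only the region is set
		{"", "us-central1-a"}, // zonal cluster: only the zone is set
	} {
		r, err := cleanupRegion(c.region, c.zone)
		fmt.Println(r, err)
	}
}
```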

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE 

**Special notes for your reviewer**:
/assign @bowei @wojtek-t 
cc @nikhiljindal to see if there is anything that should be fixed for federation.

**Release note**:

```release-note
NONE
```
2018-03-12 00:03:38 -07:00
Kubernetes Submit Queue 9ad5ea2d61
Merge pull request #60993 from MrHohn/e2e-restart-apiserver-refine-followup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[e2e service] Move apiserver restart validation logic into util

**What this PR does / why we need it**:
Follow-up to #60906. On GKE the apiserver pod is invisible on k8s, so the test is failing.

This PR bakes the restart validation logic into the util function instead, so that it can be environment-aware.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60761

**Special notes for your reviewer**:
Sorry for the noise.
/assign @rramkumar1 @bowei 
cc @krzyzacy 

**Release note**:

```release-note
NONE
```
2018-03-09 18:33:05 -08:00
Jiaying Zhang 5514a1f4dd Fixes the races around devicemanager Allocate() and endpoint deletion.
There is a race in predicateAdmitHandler Admit() that getNodeAnyWayFunc()
could get Node with non-zero deviceplugin resource allocatable for a
non-existing endpoint. That race can happen when a device plugin fails,
but is more likely when kubelet restarts as with the current registration
model, there is a time gap between kubelet restart and device plugin
re-registration. During this time window, even though devicemanager could
have removed the resource initially during GetCapacity() call, Kubelet
may overwrite the device plugin resource capacity/allocatable with the
old value when node update from the API server comes in later. This
could cause a pod to be started without proper device runtime config set.

To solve this problem, introduce endpointStopGracePeriod. When a device
plugin fails, don't immediately remove the endpoint but set stopTime in
its endpoint. During kubelet restart, create endpoints with stopTime set
for any checkpointed registered resource. The endpoint is considered to be
in stopGracePeriod if its stopTime is set. This allows us to track what
resources should be handled by devicemanager during the time gap.
When an endpoint's stopGracePeriod expires, we remove the endpoint and
its resource. This allows the resource to be exported through other channels
(e.g., by directly updating node status through API server) if there is such
a use case. Currently endpointStopGracePeriod is set to 5 minutes.

Given that an endpoint is no longer immediately removed upon disconnection,
mark all its devices unhealthy so that we can signal the resource allocatable
change to the scheduler to avoid scheduling more pods to the node.
When a device plugin endpoint is in stopGracePeriod, pods requesting the
corresponding resource will fail the admission handler.
2018-03-09 17:00:57 -08:00
Kubernetes Submit Queue 36fd62eed8
Merge pull request #60972 from wojtek-t/fix_upgrade_test
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix upgrade tests for GKE Regional Clusters
2018-03-09 15:44:46 -08:00
Zihong Zheng 9bb962e238 [e2e service] Fix CleanupGCEResources for regional test 2018-03-09 13:29:30 -08:00
Zihong Zheng e7c673086f [e2e service] Fix gke failure: move apiserver restart validation logic into util 2018-03-09 10:56:46 -08:00
Shyam Jeedigunta 34e7a7cf06 Revert "Use quotas in default performance tests"
This reverts commit c3c10208bd.
2018-03-09 18:18:18 +01:00
Kubernetes Submit Queue 7c9293e1c3
Merge pull request #60973 from shyamjvs/revert-accidental-load-test-remove
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "[Test change - don't merge] Skip load test"

This reverts commit ba6bb999f7.

This was accidentally merged as part of 60891.

/cc @wojtek-t 
/sig scalability
/kind bug
/priority important-soon

```release-note
NONE
```
2018-03-09 05:34:18 -08:00
Kubernetes Submit Queue b13105d43b
Merge pull request #60421 from gmarek/quotas
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use quotas in default performance tests

Better to use more features in default tests if possible.

LGTM whenever you think we're ready.

```release-note
NONE
```
2018-03-09 04:39:31 -08:00
Shyam Jeedigunta 62f62fc93a Revert "[Test change - don't merge] Skip load test"
This reverts commit ba6bb999f7.
2018-03-09 13:06:50 +01:00
wojtekt 875c1a7053 Fix upgrade tests for GKE Regional Clusters 2018-03-09 12:23:29 +01:00
Kubernetes Submit Queue a5a81da4f3
Merge pull request #60906 from MrHohn/e2e-restart-apiserver-refine
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[e2e service] Refine apiserver restart logic

**What this PR does / why we need it**:
Ref https://github.com/kubernetes/kubernetes/issues/60761#issuecomment-371308569: wait for the apiserver's restart count to increase before proceeding with the test.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes (hopefully) #60761

**Special notes for your reviewer**:
/assign @rramkumar1 @bowei 

**Release note**:

```release-note
NONE
```
2018-03-08 14:32:42 -08:00
Kubernetes Submit Queue 56195fd1d3
Merge pull request #60891 from shyamjvs/go-back-to-etcd-3.1.10
Automatic merge from submit-queue (batch tested with PRs 60891, 60935). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Rollback etcd server version to 3.1.11 due to #60589

Ref https://github.com/kubernetes/kubernetes/issues/60589#issuecomment-371171837

The dependencies were a bit complex (so many things relying on it) + the version was updated to 3.2.16 on top of the original bump.
So I had to mostly make manual reverting changes on a case-by-case basis - so likely to have errors :)

/cc @wojtek-t @jpbetz 

```release-note
Downgrade default etcd server version to 3.1.11 due to #60589
```

(I'm not sure if we should instead remove release-notes of the original PRs)
2018-03-08 12:45:46 -08:00
Shyam Jeedigunta ba6bb999f7 [Test change - don't merge] Skip load test 2018-03-08 13:07:21 +01:00
Shyam Jeedigunta 21f5e69f08 Rollback etcd server version to 3.1.11 due to #60589 2018-03-08 13:07:15 +01:00
Zihong Zheng a25187329b [e2e service] Refine apiserver restart logic 2018-03-07 16:26:17 -08:00
Kubernetes Submit Queue c74c826452
Merge pull request #60627 from dims/create-fake-etc-hosts-for-conformance-test
Automatic merge from submit-queue (batch tested with PRs 60642, 60627). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Create fake /etc/hosts for conformance test

**What this PR does / why we need it**:

    "KubeletManagedEtcHosts should test kubelet managed /etc/hosts file"
    conformance test fails in the CI's Docker-In-Docker environment.

    This test mounts a /etc/hosts file and checks if "# Kubernetes-managed
    hosts file." string is present or not under various conditions. The
    specific failure with DIND happens when the /etc/hosts picked up
    from the box where e2e tests are running already has this string. This
    happens because our CI runs on kubernetes and the e2e tests are running
    in a container that was started on kubernetes (and hence already has
    that string)

    To avoid this situation, we create a new /etc/hosts file with known
    contents (one that does not have the "# Kubernetes-managed hosts file."
    string)

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
Please see:
https://k8s-testgrid.appspot.com/sig-testing-misc#ci-kubernetes-local-e2e
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-local-e2e/285

**Release note**:

```release-note
NONE
```
2018-03-07 12:40:47 -08:00
Kubernetes Submit Queue 07c04c2a17
Merge pull request #60874 from cblecker/cblecker-test-approver
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add cblecker to test OWNERS

**What this PR does / why we need it**:
- Adds an OWNERS file for `test/typecheck/` that was just added
- Add cblecker to approvers for `test/`

I won't touch anything I don't understand :)

**Release note**:
```release-note
NONE
```
2018-03-07 10:33:59 -08:00
Christoph Blecker bf063aa6b1
Add OWNERS file to test/typecheck/ 2018-03-06 22:35:47 -08:00
Christoph Blecker b2d62fcd5d
Add cblecker to test/ approvers 2018-03-06 22:35:26 -08:00
Sen Lu b01d2ba048 purge all the -v references from e2e.go 2018-03-06 20:36:06 -08:00
Kubernetes Submit Queue fad7ca836d
Merge pull request #59224 from crimsonfaith91/ds-integration
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

improve daemonset's retry creating failed daemon pods e2e test

**What this PR does / why we need it**:
This PR improves daemonset's retry creating failed daemon pods e2e test by checking that the failed pod does not show up among existing daemon pods.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #59076

**Release note**:
```release-note
NONE
```
2018-03-06 15:23:58 -08:00
Kubernetes Submit Queue 86ca3afb4d
Merge pull request #60840 from pohly/e2e_csi_create
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixing e2e CSI test, II

The fix for #60803 in commit 2ae33cc324 had a typo, so the "Server
rejected event" error still showed up in the external-provisioner log
of the "Sanity CSI plugin test using hostPath CSI driver" e2e test.

```release-note
None
```
2018-03-06 10:45:05 -08:00
Kubernetes Submit Queue 9aae9b58a5
Merge pull request #59836 from jpbetz/etcd-3.2.16-patch-upgrade
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Bump etcd server patch version to 3.2.16

etcd 3.2.16 contains a critical fix for HA clusters: https://github.com/coreos/etcd/pull/9281

Also, update newly added tests to use `REGISTRY` make variable.

Release note:
```release-note
Upgrade the default etcd server version to 3.2.16
```
2018-03-06 10:00:53 -08:00
Kubernetes Submit Queue a83aec0a0c
Merge pull request #60794 from crassirostris/fix-audit-e2e
Automatic merge from submit-queue (batch tested with PRs 60630, 60794). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add retrying to audit logging e2e tests

Fixes https://github.com/kubernetes/kubernetes/issues/60719

Adds retrying to the audit logging e2e tests so that they can work when audit logging is in batch mode and actual writing is delayed.

```release-note
NONE
```

/cc @tallclair @liggitt @sttts
2018-03-06 08:39:48 -08:00
Mik Vyatskov f327a2a4c0 Add retrying to audit logging e2e tests
Signed-off-by: Mik Vyatskov <vmik@google.com>
2018-03-06 14:56:56 +01:00
Kubernetes Submit Queue 48a7048d98
Merge pull request #60747 from dims/fix-daemonset-conformance-test-failure-and-remove-unused-code
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix test failure and delete unused code

**What this PR does / why we need it**:
`Daemon set [Serial] should update pod when spec was updated and update strategy is RollingUpdate` is broken by recent updates as Template generation isn't supported by apps/v1.DaemonSet anymore

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60745 

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-06 05:41:06 -08:00
Kubernetes Submit Queue 100d82935a
Merge pull request #60503 from serathius/fix-passing-location
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[fluentd-gcp addon] Fix passing location to event exporter

Fix passing argument to event-exporter in https://github.com/kubernetes/kubernetes/pull/58090

```release-note
NONE
```
2018-03-06 04:54:28 -08:00
Davanum Srinivas ba20e63446 Create fake /etc/hosts for conformance test
"KubeletManagedEtcHosts should test kubelet managed /etc/hosts file"
conformance test fails in the CI's Docker-In-Docker environment.

This test mounts a /etc/hosts file and checks if "# Kubernetes-managed
hosts file." string is present or not under various conditions. The
specific failure with DIND happens when the /etc/hosts picked up
from the box where e2e tests are running already has this string. This
happens because our CI runs on kubernetes and the e2e tests are running
in a container that was started on kubernetes (and hence already has
that string)

To avoid this situation, we create a new /etc/hosts file with known
contents (one that does not have the "# Kubernetes-managed hosts file."
string)
2018-03-06 06:51:41 -05:00
Patrick Ohly 17d9a0c5ab Fixing e2e CSI test, II
The fix for #60803 in commit 2ae33cc324 had a typo, so the "Server
rejected event" error still showed up in the external-provisioner log
of the "Sanity CSI plugin test using hostPath CSI driver" e2e test.
2018-03-06 11:43:25 +01:00
Kubernetes Submit Queue 7e98a3ad7c
Merge pull request #60797 from soltysh/issue60765
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Run server-side print tests only on k8s 1.10+

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60765

**Release note**:
```release-note
None
```

/assign @smarterclayton @deads2k 
@juanvallejo fyi
2018-03-06 02:12:40 -08:00
Kubernetes Submit Queue 5066a67caa
Merge pull request #59840 from jennybuckley/webhooks-on-webhooks
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Prevent webhooks from affecting admission requests for WebhookConfiguration objects

**What this PR does / why we need it**:
As it stands now, a webhook can be added to the system that makes it impossible for a user to remove that webhook, or two webhooks can be registered that make it impossible to remove each other.

The first commit of this will add a test to make sure webhook deletion is never blocked by a webhook. This test will fail until the second commit is added which will prevent webhooks from affecting admission requests for ValidatingWebhookConfiguration and MutatingWebhookConfiguration objects in the admissionregistration.k8s.io group

- [x] Test that webhook deletion is never blocked by a webhook ([test fails before second commit](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/59840/pull-kubernetes-e2e-gce/23731/))
- [x] Prevent webhooks from being called on admission requests for [Validating|Mutating]WebhookConfiguration objects
- [x] Document this new behavior maybe in another PR
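
The exemption in the checklist above can be sketched as a simple predicate. This is an illustration rather than the admission plugin code from the PR: `groupKind` and `shouldCallWebhooks` are hypothetical names, while the group and kind strings are the ones named in the description.

```go
package main

import "fmt"

// groupKind is a hypothetical stand-in for an API group/kind pair.
type groupKind struct {
	Group string
	Kind  string
}

// shouldCallWebhooks reports whether admission webhooks should be invoked for
// a request on the given object type. Webhook configuration objects are
// exempt, so a broken webhook can never block its own removal.
func shouldCallWebhooks(gk groupKind) bool {
	if gk.Group != "admissionregistration.k8s.io" {
		return true
	}
	switch gk.Kind {
	case "ValidatingWebhookConfiguration", "MutatingWebhookConfiguration":
		return false
	}
	return true
}

func main() {
	fmt.Println(shouldCallWebhooks(groupKind{"", "Pod"}))
	fmt.Println(shouldCallWebhooks(groupKind{"admissionregistration.k8s.io", "ValidatingWebhookConfiguration"}))
	fmt.Println(shouldCallWebhooks(groupKind{"admissionregistration.k8s.io", "MutatingWebhookConfiguration"}))
}
```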

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of fixing #59124 (Verifies that it can remove the broken webhook.)

**Release note**:
```release-note
ValidatingWebhooks and MutatingWebhooks will not be called on admission requests for ValidatingWebhookConfiguration and MutatingWebhookConfiguration objects in the admissionregistration.k8s.io group
```
2018-03-05 19:09:33 -08:00
Kubernetes Submit Queue 01504f66e3
Merge pull request #60820 from janetkuo/ds-test-ondelete
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix DaemonSet e2e test for OnDelete

**What this PR does / why we need it**: DaemonSet `OnDelete` e2e test is broken after #59883 is merged, because default update strategy is different in apps/v1 API endpoint. 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
xref #60003

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-05 17:14:46 -08:00
jennybuckley 58b43ad27d Prevent webhooks from affecting admission requests for webhooks 2018-03-05 16:35:52 -08:00
Kubernetes Submit Queue f68eb1e171
Merge pull request #60804 from sbezverk/e2e_csi_test
Automatic merge from submit-queue (batch tested with PRs 60679, 60804). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixing e2e CSI test

Closes #60803

After switching to CSI 0.2.0 spec, additional RBAC permissions are required.  This PR adds missing permissions.

```release-note
None
```
2018-03-05 15:38:46 -08:00
Janet Kuo 79d7b425fa Fix DaemonSet e2e test for OnDelete 2018-03-05 15:23:01 -08:00
Kubernetes Submit Queue ae7be34c32
Merge pull request #60509 from verb/pid-e2e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add node-e2e test for ShareProcessNamespace

**What this PR does / why we need it**: Adds a node-e2e test for kubernetes/features#495

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #59554

**Special notes for your reviewer**: This requires a feature gate to be enabled in both the kubelet and API server. I'm not sure which jenkins configs need to be updated (or if these are even still used) so I just updated a pile of them.

opened kubernetes/test-infra#7030 for https://github.com/kubernetes/test-infra/blob/master/jobs/config.json

**Release note**:

```release-note
NONE
```
2018-03-05 14:20:14 -08:00
Davanum Srinivas 4cbf397280 fix test failure and delete unused code
Removing the template generation that was left behind when the API was
updated in 22fb5c4762

Also cleaned up many unused methods left behind
2018-03-05 17:15:33 -05:00
Joe Betz 04c6d0ab26 Bump etcd server patch version to 3.2.16 2018-03-05 13:58:51 -08:00
Kubernetes Submit Queue 3d60b3cd67
Merge pull request #60490 from jsafrane/fix-aws-delete
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Volume deletion should be idempotent

- Describe* calls should return `aws.Error` so caller can handle individual errors. `aws.Error` already has enough context (`"InvalidVolume.NotFound: The volume 'vol-0a06cc096e989c5a2' does not exist"`)
- Deletion of already deleted volume should succeed.
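
Idempotent deletion along the lines of the two points above can be sketched as follows. This does not use the AWS SDK: `codedError` is a hypothetical stand-in for an error that carries a service error code, and `deleteVolume` with its `del` callback is an assumed shape chosen for illustration. Only the `InvalidVolume.NotFound` code is taken from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// codedError is a hypothetical stand-in for an error carrying a service code.
type codedError struct {
	code string
	msg  string
}

func (e *codedError) Error() string { return e.code + ": " + e.msg }

// volumeNotFound is the error code quoted in the PR description.
const volumeNotFound = "InvalidVolume.NotFound"

// deleteVolume calls del and treats "volume not found" as success, so that
// deleting an already deleted volume succeeds. Other errors are returned as-is.
func deleteVolume(id string, del func(string) error) error {
	err := del(id)
	var ce *codedError
	if errors.As(err, &ce) && ce.code == volumeNotFound {
		return nil
	}
	return err
}

func main() {
	volumes := map[string]bool{"vol-1": true}
	// del is a fake backend that reports NotFound for unknown volumes.
	del := func(id string) error {
		if !volumes[id] {
			return &codedError{code: volumeNotFound, msg: "The volume '" + id + "' does not exist"}
		}
		delete(volumes, id)
		return nil
	}
	fmt.Println(deleteVolume("vol-1", del)) // first deletion
	fmt.Println(deleteVolume("vol-1", del)) // repeated deletion
}
```

Because the underlying error keeps its code, a caller that needs different handling for other failures can still inspect it with `errors.As`.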


**Release note**:


Fixes: #60778

```release-note
NONE
```

/sig storage
/sig aws

/assign @justinsb @gnufied
2018-03-05 12:42:22 -08:00
Serguei Bezverkhi 2ae33cc324 Fixing e2e CSI test 2018-03-05 14:20:18 -05:00
Maciej Szulik 64abf5b8ad
Run server-side print tests only on k8s 1.10+ 2018-03-05 17:52:51 +01:00
Marek Siarkowicz 288dbd03e5 [fluentd-gcp addon] Fix passing location to event exporter 2018-03-05 15:05:35 +01:00
Kubernetes Submit Queue a81787052c
Merge pull request #60566 from serathius/fix-stackdriver-logging-test
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix stackdriver logging test

Fixes https://k8s-testgrid.appspot.com/google-gce#gci-gce-sd-logging

Removes location verification that was used for event-exporter tests.

```release-note
NONE
```
2018-03-05 03:11:23 -08:00
Jan Safranek 97fae903a6 Add e2e test for deletion 2018-03-05 09:25:51 +01:00
Kubernetes Submit Queue a456d1cec2
Merge pull request #60672 from kow3ns/fix-60003
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add selector to DaemonSet in newDaemonSet function so that the v1 api…

**What this PR does / why we need it**:
When we upgraded the DaemonSet e2e to use apps v1 I neglected to add a selector to match the labels of the created Pods. This broke some apps Serial tests.

```release-note
NONE
```
2018-03-02 21:17:15 -08:00
Kubernetes Submit Queue 03a97f8ec9
Merge pull request #52130 from janetkuo/e2e-kubectl-apps
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update kubectl e2e test manifests to apps/v1

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-02 19:20:34 -08:00
Kubernetes Submit Queue 8182501bb5
Merge pull request #60740 from msau42/local-e2es
Automatic merge from submit-queue (batch tested with PRs 60159, 60731, 60720, 60736, 60740). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cap max number of nodes to use for local PV e2e tests

**What this PR does / why we need it**:
Large scale tests have thousands of nodes, which will make some local PV tests that use each node take forever

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Partially addresses #60589

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-02 18:36:01 -08:00
Kubernetes Submit Queue 63a05c8bc9
Merge pull request #60720 from dashpole/allocatable_flake
Automatic merge from submit-queue (batch tested with PRs 60159, 60731, 60720, 60736, 60740). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[Flaky Test] Increase amount of memory filled by memory allocatable eviction test

**What this PR does / why we need it**:
MemoryAllocatableEviction tests have been somewhat flaky: https://k8s-testgrid.appspot.com/sig-node-kubelet#kubelet-serial-gce-e2e&include-filter-by-regex=MemoryAllocatable
The failure on the flakes is ["Pod ran to completion"](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial/3785#k8sio-memoryallocatableeviction-slow-serial-disruptive-when-we-run-containers-that-should-cause-memorypressure-should-eventually-evict-all-of-the-correct-pods).
Looking at [an example log](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial/3785/artifacts/tmp-node-e2e-6070a774-cos-stable-63-10032-71-0/kubelet.log) (search for memory-hog-pod), we can see that this pod fails admission because the allocatable memory threshold has already been crossed.
`eviction manager: thresholds - ignoring grace period: threshold [signal=allocatableMemory.available, quantity=250Mi] observed 242404Ki`
There is likely residual memory usage because the allocatable cgroup is not low on memory and thus has not reclaimed all pages belonging to previous test containers.  Of the 300Mi of capacity in the allocatable cgroup, 250Mi is reserved for the eviction threshold, and only 50Mi is left for the test.  Increasing this to a 400Mi cgroup limit, with 150Mi for pods, should eliminate this flake.
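
The arithmetic behind these numbers can be checked with a short sketch. The helper names `podBudgetMi` and `thresholdCrossed` are hypothetical; the figures come from the description and the log line above, and the comparison assumes the threshold is crossed when available memory drops below the configured quantity.

```go
package main

import "fmt"

const kiPerMi = 1024

// podBudgetMi returns how much of the cgroup limit remains for pods once the
// eviction threshold is set aside.
func podBudgetMi(cgroupLimitMi, evictionThresholdMi int64) int64 {
	return cgroupLimitMi - evictionThresholdMi
}

// thresholdCrossed reports whether the observed available memory (in Ki) is
// below the eviction threshold (in Mi).
func thresholdCrossed(availableKi, thresholdMi int64) bool {
	return availableKi < thresholdMi*kiPerMi
}

func main() {
	fmt.Println("old pod budget (Mi):", podBudgetMi(300, 250))
	fmt.Println("new pod budget (Mi):", podBudgetMi(400, 250))
	fmt.Println("threshold crossed at 242404Ki:", thresholdCrossed(242404, 250))
}
```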

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority critical-urgent
/assign @Random-Liu @yujuhong
2018-03-02 18:35:55 -08:00