Commit Graph

7842 Commits (0939602ca65d977ce334852c229b12d3711a4233)

Author SHA1 Message Date
Andrzej Wasylkowski 79d3d795b5 Removed a race condition from ResourceConsumer. 2017-06-08 06:05:11 +02:00
Kubernetes Submit Queue 1901cf8a37 Merge pull request #47138 from smarterclayton/delete_collection
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)

DeleteCollection should include uninitialized resources

Users who delete a collection expect all resources to be deleted, and
users can also delete an uninitialized resource. To preserve this
expectation, DeleteCollection selects all resources regardless of
initialization.

The namespace controller should list uninitialized resources in order to
gate cleanup of a namespace.

Fixes #47137
2017-06-07 19:01:47 -07:00
Kubernetes Submit Queue 6bc4006d23 Merge pull request #46979 from shashidharatd/fed-cleanup-cp-resource
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)

[federation][e2e] Fix cleanupServiceShardLoadBalancer

**What this PR does / why we need it**:
Fixes the issue mentioned in #46976

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #46976

**Special notes for your reviewer**:

```release-note
NONE
```
2017-06-07 19:01:43 -07:00
Kubernetes Submit Queue 551d01c129 Merge pull request #46630 from danwinship/networkpolicy-test-v1
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)

update NetworkPolicy e2e test for v1 semantics

This makes the NetworkPolicy test at least correct for v1, although ideally we'll eventually add a few more tests... (So this covers about half of #46625.)

I've tested that this compiles, but not that it passes, since I don't have a v1-compatible NetworkPolicy implementation yet...

@caseydavenport @ozdanborne, maybe you're closer to having a testable plugin than I am?

**Release note**:
```release-note
NONE
```
2017-06-07 17:55:48 -07:00
Kubernetes Submit Queue 43295501a3 Merge pull request #47050 from sttts/sttts-deprecate-tpr-example
Automatic merge from submit-queue (batch tested with PRs 47024, 47050, 47086, 47081, 47013)

client-go: deprecate TPR example and add CRD example

/cc @nilebox

Part of https://github.com/kubernetes/kubernetes/issues/46702
2017-06-07 16:53:40 -07:00
Clayton Coleman 9ad1f80fdc
DeleteCollection should include uninitialized resources
Users who delete a collection expect all resources to be deleted, and
users can also delete an uninitialized resource. To preserve this
expectation, DeleteCollection selects all resources regardless of
initialization.

The namespace controller should list uninitialized resources in order to
gate cleanup of a namespace.
2017-06-07 17:50:57 -04:00
Dan Gillespie fa67fdea1e use actual hostname in NodeSelector for network test Pod 2017-06-07 13:51:24 -07:00
Kubernetes Submit Queue 8f4aacf6d5 Merge pull request #47136 from smarterclayton/skip_on_not_found
Automatic merge from submit-queue

Skip dynamic configuration of initializers test on alpha disable

Fixes #47133
2017-06-07 13:49:00 -07:00
Kubernetes Submit Queue 24f958d46e Merge pull request #46991 from MrHohn/defer-deleteStaticIP-before-assert
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)

[gke-slow always fails] Defer DeleteGCEStaticIP before asserting error

From https://github.com/kubernetes/kubernetes/issues/46918.

I'm getting close to the root cause: During tests, CreateGCEStaticIP() in fact successfully created static IP, but the parser we wrote in test mistakenly think we failed, probably because the gcloud output format was changed recently (or not). I'm still looking into fixing that.

This PR defer the delete function before asserting the error so that we can stop consistently leaking static IP in every run.

/assign @krzyzacy @dchen1107 

```release-note
NONE
```
2017-06-07 13:31:02 -07:00
Kubernetes Submit Queue 6ee028249b Merge pull request #46385 from rickypai/rpai/host_mapping_node_e2e_test
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)

add e2e node test for Pod hostAliases feature

**What this PR does / why we need it**: adds node e2e test for #45148 

tests requested in https://github.com/kubernetes/kubernetes/issues/43632#issuecomment-298434125

**Release note**:
```release-note
NONE
```

@yujuhong @thockin
2017-06-07 13:31:00 -07:00
shashidharatd 25dfe33bdb Auto generated file 2017-06-08 00:02:22 +05:30
shashidharatd f81c1e6702 Fix cleanupServiceShardLoadBalancer 2017-06-08 00:02:22 +05:30
shashidharatd 605b106d2d Made CleanupGCEResources explicitly take zone parameter 2017-06-08 00:02:22 +05:30
Quinton Hoole 5b5a2e512c Add quinton-hoole to test/federation_e2e/OWNERS 2017-06-07 11:04:24 -07:00
Clayton Coleman a48ba2873b
Skip dynamic configuration of initializers test on alpha disable 2017-06-07 14:00:49 -04:00
Kubernetes Submit Queue 335794674f Merge pull request #46120 from shashidharatd/federation-service-e2e-1
Automatic merge from submit-queue

Federation: create loadbalancer service in tests only if test depends on it

**What this PR does / why we need it**:
Creating LoadBalancer type of service for every test case is kind of expensive and time consuming to provision. So this PR changes the test cases to use LoadBalancer type services only when necessary.

**Which issue this PR fixes** (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #47068 

**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-federation-pr-reviews 
/assign @madhusudancs
2017-06-07 09:13:04 -07:00
Kubernetes Submit Queue 3b14924904 Merge pull request #47061 from madhusudancs/fed-sync-loop-clause-bug
Automatic merge from submit-queue (batch tested with PRs 46977, 47005, 47018, 47061, 46809)

Directly grab map values instead of using loop-clause variables when setting up federated sync controller tests.

Go's loop-clause variables are allocated once and the items are copied to that variable while iterating through the loop. This means, these variables can't escape the scope since closures are bound to loop-clause variables whose value change during each iteration. Doing so would lead to undesired behavior. For more on this topic see: https://github.com/golang/go/wiki/CommonMistakes

So in order to workaround this problem in sync controller e2e tests, we iterate through the map and copy the map value to a variable inside the loop before using it in closures.

Fixes issue: #47059

**Release note**:
```release-note
NONE
```

/assign @marun @shashidharatd @perotinus 

cc @csbell @nikhiljindal 

/sig federation
2017-06-07 08:10:46 -07:00
Kubernetes Submit Queue e7bb6af46c Merge pull request #47005 from MaciekPytel/add_e2e_setup_time
Automatic merge from submit-queue (batch tested with PRs 46977, 47005, 47018, 47061, 46809)

Fix for cluster-autoscaler e2e failures

This may help with cluster-autoscaler e2e failing in setup if the tests are run before all machines in mig get fully ready.
2017-06-07 08:10:40 -07:00
Dan Winship a2cb516690 Basic port of NetworkPolicy test to v1 semantics 2017-06-07 09:37:41 -04:00
Dan Winship bc13aa5e60 Abstract out duplicated cleanup code 2017-06-07 09:37:40 -04:00
Dan Winship a0a7f0148e Update NetworkPolicy test for v1 API (and use generated client) 2017-06-07 09:37:40 -04:00
Dr. Stefan Schimanski e2b50ac9b8 client-go: deprecate TPR example and add CRD example 2017-06-07 13:45:58 +02:00
Kubernetes Submit Queue 0613ae5077 Merge pull request #46669 from kow3ns/statefulset-update
Automatic merge from submit-queue (batch tested with PRs 46235, 44786, 46833, 46756, 46669)

implements StatefulSet update

**What this PR does / why we need it**:
1. Implements rolling update for StatefulSets
2. Implements controller history for StatefulSets.
3. Makes StatefulSet status reporting consistent with DaemonSet and ReplicaSet.

https://github.com/kubernetes/features/issues/188

**Special notes for your reviewer**:

**Release note**:
```release-note
Implements rolling update for StatefulSets. Updates can be performed using the RollingUpdate, Paritioned, or OnDelete strategies. OnDelete implements the manual behavior from 1.6. status now tracks 
replicas, readyReplicas, currentReplicas, and updatedReplicas. The semantics of replicas is now consistent with DaemonSet and ReplicaSet, and readyReplicas has the semantics that replicas did prior to this release.
```
2017-06-07 00:27:53 -07:00
Kubernetes Submit Queue 07e4cca7b3 Merge pull request #46833 from wasylkowski/fix-rc-cleanup
Automatic merge from submit-queue (batch tested with PRs 46235, 44786, 46833, 46756, 46669)

Fixed ResourceConsumer.CleanUp to properly clean up non-replication-controller resources and pods

**What this PR does / why we need it**: Without this fix CleanUp does not remove non-replication-controller resources and pods. This leads to pollution that in some cases inadvertently affects what is happening in AfterEachs before the namespace gets deleted.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-06-07 00:27:49 -07:00
Kubernetes Submit Queue 25352a7fb5 Merge pull request #46997 from jlowdermilk/no-hr-gcloud
Automatic merge from submit-queue (batch tested with PRs 46997, 47021)

Don't parse human-readable output from gcloud in tests

This is the reason  `[k8s.io] Services should be able to change the type and ports of a service [Slow]` is currently failing on GKE e2e tests. For GKE jobs we run a prerelease version of gcloud, in which the default command output was changed.

gcloud's default output for commands is human readable, and is subject to change. Anything scripting against gcloud should always pass `--format=json|yaml|value(...)`  so you get standardized output.

fixes: #46918
2017-06-06 20:12:16 -07:00
Kubernetes Submit Queue 379a15a478 Merge pull request #46881 from smarterclayton/fixes_to_table_print
Automatic merge from submit-queue (batch tested with PRs 47083, 44115, 46881, 47082, 46577)

Add an e2e test for server side get

Print a better error from the response. Performs validation to ensure it
does not regress in alpha state.

This is tests and bug fixes for https://github.com/kubernetes/community/pull/363

@kubernetes/sig-api-machinery-pr-reviews
2017-06-06 18:48:16 -07:00
Zihong Zheng 23dfe56b53 Make firewall test get tag from config instead of instance and fix multi-zone issue 2017-06-06 17:29:36 -07:00
Zihong Zheng 0acdc89d96 Pipe in GCE master/node tags through flags for e2e test 2017-06-06 17:27:07 -07:00
Random-Liu 1d3979190c Bump up npd version to v0.4.0 2017-06-06 16:30:02 -07:00
Kenneth Owens 1b55f57391 Implements StatefulSet update
Implements history utilities for ControllerRevision in the controller/history package
StatefulSetStatus now has additional fields for consistency with DaemonSet and Deployment
StatefulSetStatus.Replicas now represents the current number of createdPods and StatefulSetStatus.ReadyReplicas is the current number of ready Pods
2017-06-06 12:00:28 -07:00
Kubernetes Submit Queue 6ed4bc7b97 Merge pull request #46828 from cblecker/links-update
Automatic merge from submit-queue (batch tested with PRs 46718, 46828, 46988)

Update docs/ links to point to main site

**What this PR does / why we need it**:
This updates various links to either point to kubernetes.io or to the kubernetes/community repo instead of the legacy docs/ tree in k/k
Pre-requisite for #46813

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```

@kubernetes/sig-docs-maintainers @chenopis @ahmetb @thockin
2017-06-06 11:43:18 -07:00
Andrzej Wasylkowski abb5e6e709 Made tests that create Horizontal Pod Autoscaler delete it after they are done. 2017-06-06 19:59:14 +02:00
Andrzej Wasylkowski c12f4978c2 Made WaitForReplicas and EnsureDesiredReplicas use PollImmediate and improved logging. 2017-06-06 19:52:34 +02:00
Madhusudan.C.S 28a6578325 Directly grab map values instead of using loop-clause variables when setting up federated sync controller tests.
Go's loop-clause variables are allocated once and the items are copied
to that variable while iterating through the loop. This means, these
variables can't escape the scope since closures are bound to loop-clause
variables whose value change during each iteration. Doing so would lead
to undesired behavior. For more on this topic see:
https://github.com/golang/go/wiki/CommonMistakes

So in order to workaround this problem in sync controller e2e tests, we
iterate through the map and copy the map value to a variable inside the
loop before using it in closures.

Fixes issue: #47059
2017-06-06 10:41:19 -07:00
Yu-Ju Hong ce57de9a84 Improve the e2e node restart test
This commit includes the following two changes:
 * Move pre-test checks (pods/nodes ready) to BeforeEach() so that it's
   clear whether the test has run or not.
 * Dumping logs for unready pods.
2017-06-06 09:30:17 -07:00
Kubernetes Submit Queue 3338d784ba Merge pull request #46899 from mindprince/issue-46889-node-e2e-gpu-cos-fix
Automatic merge from submit-queue (batch tested with PRs 46897, 46899, 46864, 46854, 46875)

Wait for cloud-init to finish before starting tests.

This fixes #46889.


**Release note**:
```release-note
NONE
```
2017-06-06 05:22:42 -07:00
Kubernetes Submit Queue c0407972e9 Merge pull request #46938 from shyamjvs/no-reprinting-gcloud-result
Automatic merge from submit-queue

Avoid double printing output of gcloud commands in kubemark

Just noticed we were unnecessarily echoing the result again.

/cc @wojtek-t
2017-06-06 04:07:55 -07:00
Kubernetes Submit Queue 8da89aeb00 Merge pull request #46112 from sttts/sttts-unversioned-to-meta
Automatic merge from submit-queue

apimachinery: move unversioned registration to metav1

Follow-up from the discussions in https://github.com/kubernetes/kubernetes/pull/43027:

We need `Status` as unversioned type which is hardcoded to `GroupVersion{Group: "", Version: "v1"}`. If the core group is not in the scheme, we miss `Status`.

Fixing https://github.com/kubernetes/kubernetes/issues/47030.
2017-06-06 03:13:01 -07:00
Kubernetes Submit Queue cb681321c7 Merge pull request #45686 from jingxu97/May/emptyDir
Automatic merge from submit-queue

Add EmptyDir volume capacity isolation

This PR adds the support for isolating the emptyDir volume use. If user
sets a size limit for emptyDir volume, kubelet's eviction manager monitors its usage
and evict the pod if the usage exceeds the limit.

This feature is part of local storage capacity isolation and described in the proposal kubernetes/community#306

**Release note**:

```release-note
Alpha feature: allows users to set storage limit to isolate EmptyDir volumes. It enforces the limit by evicting pods that exceed their storage limits  
```
2017-06-05 23:08:58 -07:00
Christoph Blecker 1bdc7a29ae
Update docs/ URLs to point to proper locations 2017-06-05 22:13:54 -07:00
Kubernetes Submit Queue a552ee61a0 Merge pull request #46672 from smarterclayton/initializer_with_config
Automatic merge from submit-queue (batch tested with PRs 46967, 46992, 43338, 46717, 46672)

Select initializers from the dynamic configuration

Continues #36721

kubernetes/features#209
2017-06-05 20:27:50 -07:00
Kubernetes Submit Queue 356d4e8ce2 Merge pull request #44883 from ravigadde/bind-1.7
Automatic merge from submit-queue (batch tested with PRs 44883, 46836, 46765, 46683, 46050)

Added Bind method to Scheduler Extender

- only one extender can support the bind method
- if an extender supports bind, scheduler delegates the pod binding to the extender



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #41235

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-06-05 18:43:06 -07:00
Kubernetes Submit Queue f842bb9987 Merge pull request #46802 from shyamjvs/npd-kernel-config
Automatic merge from submit-queue (batch tested with PRs 46972, 42829, 46799, 46802, 46844)

Add KernelDeadlock condition to hollow NPD

Ref https://github.com/kubernetes/kubernetes/issues/44701

/cc @wojtek-t @gmarek
2017-06-05 17:46:55 -07:00
Kubernetes Submit Queue 1c64f31fdb Merge pull request #42829 from msau42/multizone_pv_tests
Automatic merge from submit-queue (batch tested with PRs 46972, 42829, 46799, 46802, 46844)

Multizone static pv test

**What this PR does / why we need it**:
Adds an e2e test for checking that pods get scheduled to the same zone as statically created PVs.  This tests the PersistentVolumeLabel admission controller, which adds zone and region labels when PVs are created.  As part of this, I also had to make changes to volume test utility code to pass in a zone parameter for creating PDs, and also had to add an argument to the e2e test program to accept a list of zones.

Fixes #46995

**Special notes for your reviewer**:
It's probably easier to review each commit separately.

**Release note**:

NONE
2017-06-05 17:46:49 -07:00
Maciej Pytel ecc33fd8c2 Wait for instances boot in cluster-autoscaler e2e 2017-06-06 01:46:57 +02:00
Kubernetes Submit Queue 4faf7f1f4c Merge pull request #46663 from nicksardo/gce-internallb
Automatic merge from submit-queue (batch tested with PRs 46550, 46663, 46816, 46820, 46460)

[GCE] Support internal load balancers

**What this PR does / why we need it**:
Allows users to expose K8s services externally of the K8s cluster but within their GCP network. 

Fixes #33483

**Important User Notes:**
- This is a beta feature. ILB could be enabled differently in the future. 
- Requires nodes having version 1.7.0+ (ILB requires health checking and a health check endpoint on kube-proxy has just been exposed)
- This cannot be used for intra-cluster communication. Do not call the load balancer IP from a K8s node/pod.  
- There is no reservation system for private IPs. You can specify a RFC 1918 address in `loadBalancerIP` field, but it could be lost to another VM or LB if service settings are modified.
- If you're running an ingress, your existing loadbalancer backend service must be using BalancingMode type `RATE` - not `UTILIZATION`. 
  - Option 1: With a 1.5.8+ or 1.6.4+ version master, delete all your ingresses, and re-create them.
  - Option 2: Migrate to a new cluster running 1.7.0. Considering ILB requires nodes with 1.7.0, this isn't a bad idea.
  - Option 3: Possible migration opportunity, but use at your own risk. More to come later.


**Reviewer Notes**:
Several files were renamed, so github thinks ~2k lines have changed. Review commits one-by-one to see the actual changes.

**Release note**:
```release-note
Support creation of GCP Internal Load Balancers from Service objects
```
2017-06-05 16:43:41 -07:00
Jeff Lowdermilk ac1ce7f1cd Don't parse human-readable output from gcloud in tests 2017-06-05 16:15:57 -07:00
Clayton Coleman 772ab8e1b4
Load initializers from dynamic config
Handle failure cases on startup gracefully to avoid causing cascading
errors and poor initialization in other components. Initial errors from
config load cause the initializer to pause and hold requests. Return
typed errors to better communicate failures to clients.

Add code to handle two specific cases - admin wants to bypass
initialization defaulting, and mirror pods (which want to bypass
initialization because the kubelet owns their lifecycle).
2017-06-05 19:12:41 -04:00
Zihong Zheng d455fad134 Defer DeleteGCEStaticIP before asserting error 2017-06-05 14:24:32 -07:00
Jing Xu 0b13aee0c0 Add EmptyDir Volume and local storage for container overlay Isolation
This PR adds two features:
1. add support for isolating the emptyDir volume use. If user
sets a size limit for emptyDir volume, kubelet's eviction manager
monitors its usage
and evict the pod if the usage exceeds the limit.
2. add support for isolating the local storage for container overlay. If
the container's overly usage exceeds the limit defined in container
spec, eviction manager will evict the pod.
2017-06-05 12:05:48 -07:00
Rohit Agarwal 1561f55c4c Wait for cloud-init to finish before starting tests.
This fixes #46889.
2017-06-05 10:50:24 -07:00
Ravi Gadde 7f179bf936 Added Bind method to Scheduler Extender
- only one extender can support the bind method
- if an extender supports bind, scheduler delegates the pod binding to the extender
2017-06-05 09:44:53 -07:00
Jeff Peeler 08a59530e1 Allow pods to opt out of PodPreset mutation
An annotation in the pod spec of the form:
podpreset.admission.kubernetes.io/exclude: "true"
Will cause the admission controller to skip manipulating the pod spec,
no matter the labelling.

The annotation for a podpreset acting on a pod has also been slightly
modified to contain a podpreset prefix:
podpreset.admission.kubernetes.io/podpreset-{name} = resource version

Fixes #44161
2017-06-05 11:56:30 -04:00
Kubernetes Submit Queue 6fef1a1deb Merge pull request #46810 from vishh/gpu-cos-image-validation
Automatic merge from submit-queue (batch tested with PRs 46734, 46810, 46759, 46259, 46771)

Update the COS kernel sha for node e2e gpu installer

cc @mindprince

Relevant COS image - https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/jenkins/image-config-serial.yaml#L19
2017-06-05 06:51:23 -07:00
Kubernetes Submit Queue d3146080b4 Merge pull request #46804 from verult/gce-pdflake
Automatic merge from submit-queue (batch tested with PRs 45871, 46498, 46729, 46144, 46804)

PD e2e test: Ready node check now uses the most up-to-date node count.

Follow-up to PR #46746 

<!--  Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
2017-06-05 03:06:29 -07:00
Shyam Jeedigunta 163f1de5ed Avoid double printing output of gcloud commands in kubemark 2017-06-04 20:07:36 +02:00
Shyam Jeedigunta b655953e21 Enable DefaultTolerationSeconds and PodPreset admission plugins for kubemark 2017-06-04 19:52:57 +02:00
Nick Sardo 7248c61ea5 Update test utilities & build file 2017-06-04 10:25:05 -07:00
Kubernetes Submit Queue 3837d95191 Merge pull request #45748 from mml/reliable-node-upgrade
Automatic merge from submit-queue

Respect PDBs during node upgrades and add test coverage to the ServiceTest upgrade test.

This is still a WIP... needs to be squashed at least, and I don't think it's currently passing until I increase the scale of the RC, but please have a look at the general outline.  Thanks!

Fixes #38336 

@kow3ns @bdbauer @krousey @erictune @maisem @davidopp 

```
On GCE, node upgrades will now respect PodDisruptionBudgets, if present.
```
2017-06-04 06:11:59 -07:00
Kubernetes Submit Queue f28fe811ad Merge pull request #46680 from cheftako/aggregate
Automatic merge from submit-queue (batch tested with PRs 46681, 46786, 46264, 46680, 46805)

Enable Dialer on the Aggregator

Centralize the creation of the dialer during startup.
Have the dialer then passed in to both APIServer and Aggregator.
Aggregator the uses the dialer as its Transport base.

**What this PR does / why we need it**:Enables the Aggregator to use the Dialer/SSHTunneler to connect to the user-apiserver.

**Which issue this PR fixes** : fixes ##46679

**Special notes for your reviewer**:

**Release note**: None
2017-06-03 21:16:46 -07:00
Kubernetes Submit Queue dbd1503b65 Merge pull request #45924 from janetkuo/daemonset-history
Automatic merge from submit-queue

Implement Daemonset history

~Depends on #45867 (the 1st commit, ignore it when reviewing)~ (already merged)

Ref https://github.com/kubernetes/community/pull/527/ and https://github.com/kubernetes/community/pull/594

@kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @kargakis 

---

TODOs:
- [x] API changes
  - [x] (maybe) Remove rollback subresource if we decide to do client-side rollback 
- [x] deployment controller 
  - [x] controller revision
    - [x] owner ref (claim & adoption)
    - [x] history reconstruct (put revision number, hash collision avoidance)
    - [x] de-dup history and relabel pods
    - [x] compare ds template with history 
  - [x] hash labels (put it in controller revision, pods, and maybe deployment)
  - [x] clean up old history 
  - [x] Rename status.uniquifier when we reach consensus in #44774 
- [x] e2e tests 
- [x] unit tests 
  - [x] daemoncontroller_test.go 
  - [x] update_test.go 
  - [x] ~(maybe) storage_test.go // if we do server side rollback~

kubectl part is in #46144

--- 

**Release note**:

```release-note
```
2017-06-03 16:52:38 -07:00
Tim Hockin be987b015c Merge pull request #46716 from thockin/proxy-comments
Kube-proxy cleanups
2017-06-03 15:57:17 -07:00
Clayton Coleman ce972ca475
Add an e2e test for server side get
Print a better error from the response. Performs validation to ensure it
does not regress in alpha state.
2017-06-03 18:22:39 -04:00
Kubernetes Submit Queue 903c40b5d3 Merge pull request #46725 from timstclair/apparmor-debug
Automatic merge from submit-queue (batch tested with PRs 46620, 46732, 46773, 46772, 46725)

Fix AppArmor test for docker 1.13

... & better debugging.

The issue is that we run the pod containers in a shared PID namespace with docker 1.13, so PID 1 is no longer the container's root process. Since it's messy to get the container's root process, I switched to using `/proc/self` to read the apparmor profile. While this wouldn't catch a regression that caused only the init process to run with the wrong profile, I think it's a good approximation.

/cc @aulanov @Amey-D
2017-06-03 11:39:46 -07:00
Kubernetes Submit Queue a281ad8d4b Merge pull request #46773 from wasylkowski/nig-doc-change
Automatic merge from submit-queue (batch tested with PRs 46620, 46732, 46773, 46772, 46725)

Added missing documentation to NodeInstanceGroup.

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-06-03 11:39:42 -07:00
Kubernetes Submit Queue 07f85565a2 Merge pull request #36721 from smarterclayton/initializers
Automatic merge from submit-queue

Add initializer support to admission and uninitialized filtering to rest storage

Initializers are the opposite of finalizers - they allow API clients to react to object creation and populate fields prior to other clients seeing them.

High level description:

1. Add `metadata.initializers` field to all objects
2. By default, filter objects with > 0 initializers from LIST and WATCH to preserve legacy client behavior (known as partially-initialized objects)
3. Add an admission controller that populates .initializer values per type, and denies mutation of initializers except by certain privilege levels (you must have the `initialize` verb on a resource)
4. Allow partially-initialized objects to be viewed via LIST and WATCH for initializer types
5. When creating objects, the object is "held" by the server until the initializers list is empty
6. Allow some creators to bypass initialization (set initializers to `[]`), or to have the result returned immediately when the object is created.

The code here should be backwards compatible for all clients because they do not see partially initialized objects unless they GET the resource directly. The watch cache makes checking for partially initialized objects cheap. Some reflectors may need to change to ask for partially-initialized objects.

```release-note
Kubernetes resources, when the `Initializers` admission controller is enabled, can be initialized (defaulting or other additive functions) by other agents in the system prior to those resources being visible to other clients.  An initialized resource is not visible to clients unless they request (for get, list, or watch) to see uninitialized resources with the `?includeUninitialized=true` query parameter.  Once the initializers have completed the resource is then visible.  Clients must have the the ability to perform the `initialize` action on a resource in order to modify it prior to initialization being completed.
```
2017-06-03 07:16:52 -07:00
Kubernetes Submit Queue e6c74bbaaf Merge pull request #46221 from FengyunPan/close-file
Automatic merge from submit-queue

Close file after os.Open()

None
2017-06-03 04:42:00 -07:00
Janet Kuo 85ec49c9bb Verify histories and pods in DaemonSet e2e test 2017-06-03 00:46:11 -07:00
Kubernetes Submit Queue b8c9ee8abb Merge pull request #46456 from jingxu97/May/allocatable
Automatic merge from submit-queue

Add local storage (scratch space) allocatable support

This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.

This feature is part of local storage capacity isolation and described in the proposal https://github.com/kubernetes/community/pull/306

**Release note**:

```release-note
This feature exposes local storage capacity for the primary partitions, and supports & enforces storage reservation in Node Allocatable 
```
2017-06-03 00:24:29 -07:00
Kubernetes Submit Queue e837c3bbc2 Merge pull request #46388 from lavalamp/whitlockjc-generic-webhook-admission
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)

Dynamic webhook admission control plugin

Unit tests pass.

Needs plumbing:
* [ ] service resolver (depends on @wfender PR)
* [x] client cert (depends on ????)
* [ ] hook source (depends on @caesarxuchao PR)

Also at least one thing will need to be renamed after Chao's PR merges.

```release-note
Allow remote admission controllers to be dynamically added and removed by administrators.  External admission controllers make an HTTP POST containing details of the requested action which the service can approve or reject.
```
2017-06-02 23:37:42 -07:00
Kubernetes Submit Queue 348bf1e032 Merge pull request #46627 from deads2k/api-12-labels
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)

move labels to components which own the APIs

During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery.  They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet.  This pull moves the labels into their owning components and out of API machinery.

@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers 
@derekwaynecarr  since most of these are related to the kubelet
2017-06-02 23:37:38 -07:00
Clayton Coleman 4ce3907639
Add Initializers to all admission control paths by default 2017-06-02 22:09:04 -04:00
Clayton Coleman 331eea67d8
Allow initialization of resources
Add support for creating resources that are not immediately visible to
naive clients, but must first be initialized by one or more privileged
cluster agents. These controllers can mark the object as initialized,
allowing others to see them.

Permission to override initialization defaults or modify an initializing
object is limited per resource to a virtual subresource "RESOURCE/initialize"
via RBAC.

Initialization is currently alpha.
2017-06-02 22:09:03 -04:00
Kubernetes Submit Queue d063ce213f Merge pull request #46801 from dashpole/summary_container_restart
Automatic merge from submit-queue

[Flaky PR Test] Fix summary test

fixes issue: #46797 

As we can see in the [example failure build log](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet/4319/build-log.txt), the summary containers are pinging google 100s of times a second.  This causes the summary container to be killed occasionally, and fail the test.  The summary containers are only supposed to ping every 10 seconds according to the current test.  As it turns out, we were missing a semicolon, and were not sleeping between pings.  For background, we ping google to generate network traffic, so that the summary test can validate network metrics.

This PR adds the semicolon to make the container sleep between calls, and decreases the sleep time from 10 seconds to 1 second, as 1 call / 10 seconds did not produce enough activity.

cc @kubernetes/kubernetes-build-cops @dchen1107
2017-06-02 18:02:19 -07:00
Ricky Pai 4e7fed4479 e2e node test for PodSpec HostAliases 2017-06-02 17:01:44 -07:00
Kubernetes Submit Queue 310ea94b6e Merge pull request #46557 from timstclair/audit-test
Automatic merge from submit-queue (batch tested with PRs 46648, 46500, 46238, 46668, 46557)

Add an e2e test for AdvancedAuditing

Enable a simple "advanced auditing" setup for e2e tests running on GCE, and add an e2e test that creates & deletes a pod, a secret, and verifies that they're audited.

Includes https://github.com/kubernetes/kubernetes/pull/46548

For https://github.com/kubernetes/features/issues/22

/cc @ericchiang @sttts @soltysh @ihmccreery
2017-06-02 15:20:52 -07:00
Kubernetes Submit Queue a6f0033164 Merge pull request #46238 from yguo0905/package-validator
Automatic merge from submit-queue (batch tested with PRs 46648, 46500, 46238, 46668, 46557)

Support validating package versions in node conformance test

**What this PR does / why we need it**:

This PR adds a package validator in node conformance test for checking whether the locally installed packages meet the image spec.

**Special notes for your reviewer**:

The image spec for GKE (which has the package spec) will be in a separate PR. Then we will publish a new node conformance test image for GKE whose name should use the convention in https://github.com/kubernetes/kubernetes/issues/45760 and have `gke` in it.


**Release note**:
```
NONE
```
2017-06-02 15:20:47 -07:00
Ke Wu cb28ed1f95 Let COS docker validation node tests against gci-next-canary
We plan to use family gci-next-canary in container-vm-image-staging
for future Docker upgration and validation.
2017-06-02 14:52:01 -07:00
shashidharatd a453131f95 create loadbalancer service in tests only if test depends on it 2017-06-02 18:51:14 +05:30
Andrzej Wasylkowski 30b3472f89 Added new helper methods FailfWithOffset and ExpectNoErrorWithOffset. 2017-06-02 12:01:52 +02:00
Andrzej Wasylkowski 5678bcf224 Fixed ResourceConsumer.CleanUp to properly clean up non-replication-controller resources and pods. 2017-06-02 10:37:06 +02:00
Matt Liggett 43e2bec49d update-bazel.sh 2017-06-01 17:58:45 -07:00
Matt Liggett 775f2ef9a0 Respect PDBs during GCE node upgrades.
Respect PDBs during node upgrades and add test coverage to the
ServiceTest upgrade test.  Modified that test so that we include pod
anti-affinity constraints and a PDB.
2017-06-01 17:58:45 -07:00
Vishnu kannan d45286c575 update cos kernel sha for node e2e GPU installer
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-06-01 17:09:18 -07:00
Tim Hockin fc34a9d6ba 'Global' -> 'Cluster' for traffic policy 2017-06-01 16:17:38 -07:00
Jing Xu 943fc53bf7 Add predicates check for local storage request
This PR adds the check for local storage request when admitting pods. If
the local storage request exceeds the available resource, pod will be
rejected.
2017-06-01 15:57:50 -07:00
Jing Xu dd67e96c01 Add local storage (scratch space) allocatable support
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.
2017-06-01 15:57:50 -07:00
Cheng Xing 6a073374f8 PD e2e test: Ready node check now uses the most up-to-date node count. 2017-06-01 14:03:02 -07:00
David Ashpole d1545e1e47 add semicolon 2017-06-01 13:32:59 -07:00
Shyam Jeedigunta 23ef37d9ce Add KernelDeadlock condition to hollow NPD 2017-06-01 22:23:28 +02:00
Dawn Chen 5943e83417 Merge pull request #46746 from verult/gce-pdflake
Added API node ready check after PD test deleting a GCE instance.
2017-06-01 11:39:41 -07:00
prateekgogia eb067a9ba3 Fixed e2e test flake - ClusterDns [Feature:Example] should create pod that uses dns 2017-06-01 16:07:40 +00:00
supereagle dc9f0f9729 mark --network-plugin-dir deprecated for kubelet, and update related bootstrap scripts 2017-06-01 22:06:44 +08:00
Kubernetes Submit Queue 98e5496aa2 Merge pull request #46677 from enisoc/tpr-migrate-etcd
Automatic merge from submit-queue (batch tested with PRs 43505, 45168, 46439, 46677, 46623)

Add TPR to CRD migration helper.

This is a helper for migrating TPR data to CustomResource. It's rather hacky because it requires crossing apiserver boundaries, but doing it this way keeps the mess contained to the TPR code, which is scheduled for deletion anyway.

It's also not completely hands-free because making it resilient enough to be completely automated is too involved to be worth it for an alpha-to-beta migration, and would require investing significant effort to fix up soon-to-be-deleted TPR code. Instead, this feature will be documented as a best-effort helper whose results should be verified by hand.

The intended benefit of this over a totally manual process is that it should be possible to copy TPR data into a CRD without having to tear everything down in the middle. The process would look like this:

1. Upgrade to k8s 1.7. Nothing happens to your TPRs.
1. Create CRD with group/version and resource names that match the TPR. Still nothing happens to your TPRs, as the CRD is hidden by the overlapping TPR.
1. Delete the TPR. The TPR data is converted to CustomResource data, and the CRD begins serving at the same REST path.

Note that the old TPR data is left behind by this process, so watchers should not receive DELETE events. This also means the user can revert to the pre-migration state by recreating the TPR definition.

Ref. https://github.com/kubernetes/kubernetes/issues/45728
2017-06-01 05:43:44 -07:00
Andrzej Wasylkowski 4280b95915 Added missing documentation to NodeInstanceGroup. 2017-06-01 13:19:41 +02:00
Walter Fender 5b3f4684ed Enable Dialer on the Aggregator
Centralize the creation of the dialer during startup.
Have the dialer then passed in to both APIServer and Aggregator.
Aggregator the sets the dialer on its Transport base.
This should allow the SSTunnel to be used but also allow the Aggregation
Auth to work with it.
Depending on Environment InsecureSkipTLSVerify *may* need to be set to
true.
Fixed as few tests to call CreateDialer as part of start-up.
2017-06-01 00:05:02 -07:00
Anthony Yeh ba59e14d44
Add TPR to CRD migration helper. 2017-05-31 19:07:38 -07:00
Cheng Xing 5c2cba391d Added API node ready check after PD test deleting a GCE instance.
- Need to ensure that all nodes are ready, i.e. back to the state before the test.
2017-05-31 18:38:24 -07:00
Tim St. Clair b1af8da735
Fix AppArmor test for docker 1.13 2017-05-31 17:09:22 -07:00
Daniel Smith d6e1140b5d Implement dynamic admission webhooks
Also fix a bug in rest client
2017-05-31 16:38:46 -07:00
Tim St. Clair 81c9181995
Capture better debug logs on AppArmor test failure 2017-05-31 15:31:11 -07:00
Tim St. Clair 63d1d5a500
Add AdvancedAuditing E2E test 2017-05-31 09:52:55 -07:00
deads2k 954eb3ceb9 move labels to components which own the APIs 2017-05-31 10:32:06 -04:00
Shyam Jeedigunta 52ef3e6e94 Performance tests also cover configmaps now 2017-05-31 13:13:15 +02:00
Kubernetes Submit Queue 0aad9d30e3 Merge pull request #44897 from msau42/local-storage-plugin
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)

Local storage plugin

**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes.  Scheduler predicate will direct already-bound PVCs to the node that the local PV is at.  PVC binding still happens independently.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 
Part of #43640

**Release note**:

```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume.  These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
2017-05-30 23:20:02 -07:00
Kubernetes Submit Queue 5995690396 Merge pull request #46076 from liggitt/node-authorizer
Automatic merge from submit-queue

Node authorizer

This PR implements the authorization portion of https://github.com/kubernetes/community/blob/master/contributors/design-proposals/kubelet-authorizer.md and kubernetes/features#279:
* Adds a new authorization mode (`Node`) that authorizes requests from nodes based on a graph of related pods,secrets,configmaps,pvcs, and pvs:
  * Watches pods, adds edges (secret -> pod, configmap -> pod, pvc -> pod, pod -> node)
  * Watches pvs, adds edges (secret -> pv, pv -> pvc)
  * When both Node and RBAC authorization modes are enabled, the default RBAC binding that grants the `system:node` role to the `system:nodes` group is not automatically created.
* Tightens the `NodeRestriction` admission plugin to require identifiable nodes for requests from users in the `system:nodes` group.

This authorization mode is intended to be used in combination with the `NodeRestriction` admission plugin, which limits the pods and nodes a node may modify. To enable in combination with RBAC authorization and the NodeRestriction admission plugin:
* start the API server with `--authorization-mode=Node,RBAC --admission-control=...,NodeRestriction,...`
* start kubelets with TLS boostrapping or with client credentials that place them in the `system:nodes` group with a username of `system:node:<nodeName>`

```release-note
kube-apiserver: a new authorization mode (`--authorization-mode=Node`) authorizes nodes to access secrets, configmaps, persistent volume claims and persistent volumes related to their pods.
* Nodes must use client credentials that place them in the `system:nodes` group with a username of `system:node:<nodeName>` in order to be authorized by the node authorizer (the credentials obtained by the kubelet via TLS bootstrapping satisfy these requirements)
* When used in combination with the `RBAC` authorization mode (`--authorization-mode=Node,RBAC`), the `system:node` role is no longer automatically granted to the `system:nodes` group.
```

```release-note
RBAC: the automatic binding of the `system:node` role to the `system:nodes` group is deprecated and will not be created in future releases. It is recommended that nodes be authorized using the new `Node` authorization mode instead. Installations that wish to continue giving all members of the `system:nodes` group the `system:node` role (which grants broad read access, including all secrets and configmaps) must create an installation-specific ClusterRoleBinding.
```

Follow-up:
- [ ] enable e2e CI environment with admission and authorizer enabled (blocked by kubelet TLS bootstrapping enablement in https://github.com/kubernetes/kubernetes/pull/40760)
- [ ] optionally enable this authorizer and admission plugin in kubeadm
- [ ] optionally enable this authorizer and admission plugin in kube-up
2017-05-30 22:42:54 -07:00
Kubernetes Submit Queue 1f213765f6 Merge pull request #46521 from dashpole/summary_container_restart
Automatic merge from submit-queue

Fix Cross-Build, and reduce test to 1 restart to reduce flakyness

In response to https://github.com/kubernetes/kubernetes/pull/46308#issuecomment-304248450

This fixes the error: `test/e2e_node/summary_test.go:138: constant 100000000000 overflows int` from the cross build.

This [recent flake](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet/4179) was because the container restarted during the period where the test expected to Continually see the container in the Summary API.

/assign @dchen1107 
cc @gmarek @luxas 

/release-note-none
2017-05-30 21:45:56 -07:00
Kubernetes Submit Queue 32bce030d8 Merge pull request #46635 from krzyzacy/copy-files
Automatic merge from submit-queue

Switch gcloud compute copy-files to scp

gcloud is deprecating `gcloud compute copy-files` and switching to `gcloud compute scp`. Make the change before things start to break.

https://cloud.google.com/sdk/gcloud/reference/compute/copy-files

Warnings we get: `W0529 10:28:59.097] WARNING: `gcloud compute copy-files` is deprecated.  Please use `gcloud compute scp` instead.  Note that `gcloud compute scp` does not have recursive copy on by default.  To turn on recursion, use the `--recurse` flag.`

/cc @jlowdermilk
2017-05-30 19:35:50 -07:00
Yang Guo ecf214729d Support validating package versions in node conformance test 2017-05-30 17:44:40 -07:00
Kubernetes Submit Queue 40dcbc4eb3 Merge pull request #46461 from ncdc/e2e-suite-metrics
Automatic merge from submit-queue

Support grabbing test suite metrics

**What this PR does / why we need it**:
Add support for grabbing metrics that cover the entire test suite's execution.

Update the "interesting" controller-manager metrics to match the
current names for the garbage collector, and add namespace controller
metrics to the list.

If you enable `--gather-suite-metrics-at-teardown`, the metrics file is written to a file with a name such as `MetricsForE2ESuite_2017-05-25T20:25:57Z.json` in the `--report-dir`. If you don't specify `--report-dir`, the metrics are written to the test log output.

I'd like to enable this for some of the `pull-*` CI jobs, which will require a separate PR to test-infra.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

@kubernetes/sig-testing-pr-reviews @smarterclayton @wojtek-t @gmarek @derekwaynecarr @timothysc
2017-05-30 16:41:49 -07:00
Jordan Liggitt fc8e915a4b
Add Node authorization mode based on graph of node-related objects 2017-05-30 16:53:03 -04:00
David Ashpole e2718f3bc5 fix crossbuild, verify container restarts, and restart only once 2017-05-30 13:15:22 -07:00
Sen Lu d237e54a24 Switch gcloud compute copy-files to scp 2017-05-30 10:19:33 -07:00
Kubernetes Submit Queue 36548b07cd Merge pull request #46605 from shyamjvs/fix-perfdata-subresource
Automatic merge from submit-queue (batch tested with PRs 46552, 46608, 46390, 46605, 46459)

Make kubemark scripts fail fast

Fixes https://github.com/kubernetes/kubernetes/issues/46601

/cc @wojtek-t @gmarek
2017-05-30 08:42:00 -07:00
Kubernetes Submit Queue 38b26db33a Merge pull request #46613 from FengyunPan/fix-e2e-service
Automatic merge from submit-queue (batch tested with PRs 45534, 37212, 46613, 46350)

[e2e]Fix define redundant parameter

When timeout to reach HTTP service, redundant parameter make the
error is nil.
2017-05-30 04:46:04 -07:00
Shyam Jeedigunta 02092312bb Make kubemark scripts fail fast 2017-05-30 11:59:13 +02:00
gmarek 0cc1999e16 Make log-monitor give up on trying to ssh to a dead node after some time 2017-05-30 10:27:10 +02:00
FengyunPan 38e8c32a26 [e2e]Fix define redundant parameter
When timeout to reach HTTP service, redundant parameter make the
error is nil.
2017-05-30 16:09:33 +08:00
Kubernetes Submit Queue 755d368c4a Merge pull request #45782 from mtaufen/no-snat-test
Automatic merge from submit-queue

no-snat test

This test checks that Pods can communicate with each other in the same cluster without SNAT.

I intend to create a job that runs this in small clusters (\~3 nodes) at a low frequency (\~once per day) so that we have a signal as we work on allowing multiple non-masquerade CIDRs to be configured (see [kubernetes-incubator/ip-masq-agent](https://github.com/kubernetes-incubator/ip-masq-agent), for example).

/cc @dnardo
2017-05-29 16:19:46 -07:00
Kubernetes Submit Queue 52337d5db6 Merge pull request #46502 from gmarek/run_kubemark_tests
Automatic merge from submit-queue

Fix kubemark/run-e2e-tests.sh

This should make most common arguments work.

cc @shyamjvs
2017-05-29 09:35:01 -07:00
Kubernetes Submit Queue d9f3ea5191 Merge pull request #46593 from shyamjvs/fix-perfdata-subresource
Automatic merge from submit-queue

Fix minor bugs in setting API call metrics with subresource

Based on changes from https://github.com/kubernetes/kubernetes/pull/46354

/cc @wojtek-t @smarterclayton
2017-05-29 08:45:02 -07:00
gmarek 0ca6aeb95c Fix kubemark/run-e2e-tests.sh 2017-05-29 15:20:54 +02:00
Shyam Jeedigunta e897b21506 Fix minor bugs in setting API call metrics with subresource 2017-05-29 15:04:52 +02:00
Wojciech Tyczynski 1583912dd0 Fix panics in load test 2017-05-29 13:09:53 +02:00
Dr. Stefan Schimanski 3eca0b5ad0 integration test: check API version onf Status object 2017-05-29 11:53:45 +02:00
Kubernetes Submit Queue 451d0a436c Merge pull request #46509 from k82cn/add_k82cn_as_approver
Automatic merge from submit-queue

Added k82cn as one of scheduler approver.

According to the requirement of Approver at [community-membership.md](https://github.com/kubernetes/community/blob/master/community-membership.md), I meet the requirements as follow; so I'd like to add myself as an approver of scheduler.

* Reviewer of the codebase for at least 3 months
[k82cn]: [~3 months](6cc40678b6 )
* Primary reviewer for at least 10 substantial PRs to the codebase
[k82cn] Reviewed [40 PRs](https://github.com/issues?q=assignee%3Ak82cn+is%3Aclosed)
* Reviewed or merged at least 30 PRs to the codebase
[k82cn]: 71 merged PRs in kubernetes/kubernetes, and ~100 PRs in kuberentes at https://goo.gl/j2D1fR

As an approver,

* I agree to only approve familiar PRs
* I agree to be responsive to review/approve requests as per community expectations
* I agree to continue my reviewer work as per community expectations
* I agree to continue my contribution, e.g. PRs, mentor contributors
2017-05-28 22:01:32 -07:00
Kubernetes Submit Queue 1444d252e1 Merge pull request #46457 from nicksardo/gce-api-refactor
Automatic merge from submit-queue (batch tested with PRs 46407, 46457)

GCE - Refactor API for firewall and backend service creation

**What this PR does / why we need it**:
 - Currently, firewall creation function actually instantiates the firewall object; this is inconsistent with the rest of GCE api calls. The API normally gets passed in an existing object.
 - Necessary information for firewall creation, (`computeHostTags`,`nodeTags`,`networkURL`,`subnetworkURL`,`region`) were private to within the package. These now have public getters.
 - Consumers might need to know whether the cluster is running on a cross-project network. A new `OnXPN` func will make that information available.
 - Backend services for regions have been added. Global ones have been renamed to specify global. 
 - NamedPort management of instance groups has been changed from an `AddPortsToInstanceGroup` func (and missing complementary `Remove...`) to a single, simple `SetNamedPortsOfInstanceGroup`
 - Addressed nitpick review comments of #45524 

ILB needs the regional backend services and firewall refactor.  The ingress controller needs the new `OnXPN` func to decide whether to create a firewall.

**Release note**:
```release-note
NONE
```
2017-05-28 13:16:58 -07:00
Kubernetes Submit Queue 382a170054 Merge pull request #39164 from danwinship/networkpolicy-v1
Automatic merge from submit-queue

Move NetworkPolicy to v1

Move NetworkPolicy to v1

@kubernetes/sig-network-misc 

**Release note**:
```release-note
NetworkPolicy has been moved from `extensions/v1beta1` to the new
`networking.k8s.io/v1` API group. The structure remains unchanged from
the beta1 API.

The `net.beta.kubernetes.io/network-policy` annotation on Namespaces
to opt in to isolation has been removed. Instead, isolation is now
determined at a per-pod level, with pods being isolated if there is
any NetworkPolicy whose spec.podSelector targets them. Pods that are
targeted by NetworkPolicies accept traffic that is accepted by any of
the NetworkPolicies (and nothing else), and pods that are not targeted
by any NetworkPolicy accept all traffic by default.

Action Required:

When upgrading to Kubernetes 1.7 (and a network plugin that supports
the new NetworkPolicy v1 semantics), to ensure full behavioral
compatibility with v1beta1:

    1. In Namespaces that previously had the "DefaultDeny" annotation,
       you can create equivalent v1 semantics by creating a
       NetworkPolicy that matches all pods but does not allow any
       traffic:

           kind: NetworkPolicy
           apiVersion: networking.k8s.io/v1
           metadata:
             name: default-deny
           spec:
             podSelector:

       This will ensure that pods that aren't match by any other
       NetworkPolicy will continue to be fully-isolated, as they were
       before.

    2. In Namespaces that previously did not have the "DefaultDeny"
       annotation, you should delete any existing NetworkPolicy
       objects. These would have had no effect before, but with v1
       semantics they might cause some traffic to be blocked that you
       didn't intend to be blocked.
```
2017-05-28 11:13:14 -07:00
Dan Winship 0683e55fc1 Add networking.k8s.io v1 API, with NetworkPolicy 2017-05-28 10:11:01 -04:00
Kubernetes Submit Queue 1fd6e97ad9 Merge pull request #46538 from shyamjvs/kubemark-chmod
Automatic merge from submit-queue

chmod +x kubemark scripts

Just noticed. We don't need to do chmod each time.

/cc @wojtek-t @gmarek
2017-05-27 23:01:48 -07:00
Kubernetes Submit Queue f219f3c153 Merge pull request #46558 from MrHohn/esipp-endpoint-waittime
Automatic merge from submit-queue

Apply KubeProxyEndpointLagTimeout to ESIPP tests

Fixes #46533.

The previous construction of ESIPP tests is weird, so I redo it a bit.

A 30 seconds `KubeProxyEndpointLagTimeout` is introduced, as these tests ain't verifying performance, may be better to not make it too tight.

/assign @thockin 

**Release note**:

```release-note
NONE
```
2017-05-27 11:17:51 -07:00
Nick Sardo 9063526dfb GCE: Refactor firewalls/backendservices api; other small changes 2017-05-27 10:25:03 -07:00
Kubernetes Submit Queue daee6d4826 Merge pull request #45524 from MrHohn/l4-lb-healthcheck
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)

Make GCE load-balancers create health checks for nodes

From #14661. Proposal on kubernetes/community#552. Fixes #46313.

Bullet points:
- Create nodes health check and firewall (for health checking) for non-OnlyLocal service.
- Create local traffic health check and firewall (for health checking) for OnlyLocal service.
- Version skew: 
   - Don't create nodes health check if any nodes has version < 1.7.0.
   - Don't backfill nodes health check on existing LBs unless users explicitly trigger it.

**Release note**:

```release-note
GCE Cloud Provider: New created LoadBalancer type Service now have health checks for nodes by default.
An existing LoadBalancer will have health check attached to it when:
- Change Service.Spec.Type from LoadBalancer to others and flip it back.
- Any effective change on Service.Spec.ExternalTrafficPolicy.
```
2017-05-26 19:47:57 -07:00
Zihong Zheng e332828690 Apply KubeProxyEndpointLagTimeout to ESIPP tests 2017-05-26 18:14:20 -07:00
Kubernetes Submit Queue 2b084af6dd Merge pull request #46484 from guoyunxian/remove
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)

Remove the reduplicated case judement

This patch remove the  reduplicated case judgement
2017-05-26 16:59:04 -07:00
Kubernetes Submit Queue bd1311a0a4 Merge pull request #46515 from ncdc/vet
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)

Fix incorrect printf format

**What this PR does / why we need it**: changes `%s` to `%d` for something that is actually an `int` (found via `make vet`).

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-05-26 16:59:02 -07:00
Michael Taufen a653603e13 no-snat test
Test checks that Pods can communicate with each other in the same
cluster without SNAT.
2017-05-26 13:45:10 -07:00
Zihong Zheng 897da549bc Autogenerated files 2017-05-26 13:19:14 -07:00
Zihong Zheng a61cc7f477 Update firewall e2e test for LB healthcheck firewall 2017-05-26 13:18:50 -07:00
Shyam Jeedigunta b72cbc074c chmod +x kubemark scripts 2017-05-26 22:03:12 +02:00
Michelle Au f385dfcb3b Address review comments 2017-05-26 11:48:31 -07:00
Andy Goldstein ab76f7320a Fix incorrect printf format 2017-05-26 11:36:52 -04:00
Andy Goldstein 41345418cb Support grabbing test suite metrics
Update the "interesting" controller-manager metrics to match the
current names for the garbage collector, and add namespace controller
metrics to the list.
2017-05-26 11:21:27 -04:00
Klaus Ma 68a34c1baf Added k82cn as kube-scheduler approver. 2017-05-26 22:26:20 +08:00
guoyunxian 0bf96a3ca4 Remove the same case judement
This patch remove the same case judement
2017-05-26 17:28:53 +08:00
Chao Xu 262799f91f serve the api in kube-apiserver 2017-05-25 23:55:15 -07:00
Kubernetes Submit Queue 7d37a2685c Merge pull request #45867 from kow3ns/controller-history
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)

Controller history

**What this PR does / why we need it**:
Implements the ControllerRevision API object and clientset to allow for the implementation of StatefulSet update and DaemonSet history

```release-note
ControllerRevision type added for StatefulSet and DaemonSet history.
```
2017-05-25 22:42:08 -07:00
Kubernetes Submit Queue 54a47a6f1d Merge pull request #46308 from dashpole/summary_container_restart
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)

Summary Test looks at pods that have containers that restart.

Occasionally, the node can report extra containers that had been restarted through the summary API.
This test change tests a pod that restarts, and hopefully should allow us to reproduce and debug this behavior.

/assign @dchen1107 

/release-note-none
2017-05-25 22:42:04 -07:00
Kubernetes Submit Queue 59ee250ced Merge pull request #46429 from wojtek-t/bump_go_to_183
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)

Bump Go version to 1.8.3

This PR also removed this patched version of Go 1.8.1 which we used to use to workaround performance problem of Go 1.8.1.

Fix https://github.com/kubernetes/kubernetes/issues/45216
Ref #46391

@timothysc @bradfitz
2017-05-25 22:42:01 -07:00
Kubernetes Submit Queue c60bc53921 Merge pull request #46434 from shyamjvs/kubemark-config-upload
Automatic merge from submit-queue (batch tested with PRs 46124, 46434, 46089, 45589, 46045)

Copy kubeconfig to kubemark master

This should save the effort of digging through jenkins agent and its container to get the kubeconfig.
Ideally we should have kubectl directly working on the kubemark master, but I'm facing some issues due to older version of kubectl present by default on the node.

cc @wojtek-t @gmarek
2017-05-25 21:39:59 -07:00
Kubernetes Submit Queue b8dc4915f7 Merge pull request #46423 from gmarek/fix_perf
Automatic merge from submit-queue (batch tested with PRs 45949, 46009, 46320, 46423, 46437)

Fix performance test issues

Fix #46198
2017-05-25 19:41:04 -07:00
Kubernetes Submit Queue b9416c2c91 Merge pull request #46320 from vmware/e2evSphereStoragePolicySupport
Automatic merge from submit-queue (batch tested with PRs 45949, 46009, 46320, 46423, 46437)

e2e tests for storage policy support in Kubernetes

This PR covers e2e test cases for vSphere storage policy support in Kubernetes - #46176.

The following test scenario have been implemented.
- Specify only SPBM storage policy name.
     - Verify if the disk is provisioned on a compatible datastore with max free space.
- Specify a storage policy name which is not defined on VC.
    - Verify if PVC create errors out that no pbm profile with this policy is found.
- Specify both SPBM storage policy name and VSAN capabilities together.
    - Verify if PVC create errors out that you can't use both SPBM policy name with VSAN capabilities. You can only specify one.
- Specify SPBM storage policy name with user specified datastore which is non-compatible.
   - Verify if PVC create errors out that it can't provision a disk on a non-compatible datastore.

@jeffvance @divyenpatel

**Release note**:

```release-note
None
```
2017-05-25 19:41:02 -07:00
Kubernetes Submit Queue 470a6a45d5 Merge pull request #45949 from NickrenREN/kubelet-metric
Automatic merge from submit-queue (batch tested with PRs 45949, 46009, 46320, 46423, 46437)

Unregister some metrics

delete some registered metrics since they are not observed


**Release note**:
```release-note
NONE
```
2017-05-25 19:40:58 -07:00
Kenneth Owens ba128e6e41 Implements ControllerRevision API Object without codec and code
generation
2017-05-25 11:38:57 -07:00
Wojciech Tyczynski 3e8c27af34 Bump Go version to 1.8.3 2017-05-25 20:05:34 +02:00
Michail Kargakis e18f6cb591
test: set failure traps for all deployment e2e tests 2017-05-25 19:01:50 +02:00
Kubernetes Submit Queue 4a58809d88 Merge pull request #46219 from aleksandra-malinowska/stackdriver-performance-test-2
Automatic merge from submit-queue (batch tested with PRs 45269, 46219, 45966)

Add overriding Stackdriver API endpoint

Allow using Stackdriver test endpoint.
2017-05-25 07:21:01 -07:00
Kubernetes Submit Queue 26d7ee0447 Merge pull request #44774 from kargakis/uniquifier
Automatic merge from submit-queue

Switch Deployments to new hashing algo w/ collision avoidance mechanism

Implements https://github.com/kubernetes/community/pull/477

@kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews 

Fixes https://github.com/kubernetes/kubernetes/issues/29735
Fixes https://github.com/kubernetes/kubernetes/issues/43948

```release-note
Deployments are updated to use (1) a more stable hashing algorithm (fnv) than the previous one (adler) and (2) a hashing collision avoidance mechanism that will ensure new rollouts will not block on hashing collisions anymore.
```
2017-05-25 06:09:58 -07:00
Shyam Jeedigunta 8f2b4c3b33 Copy kubeconfig to kubemark master 2017-05-25 14:55:28 +02:00
Michail Kargakis 9190a47c37
Generated changes for collision count
Signed-off-by: Michail Kargakis <mkargaki@redhat.com>
2017-05-25 12:23:17 +02:00
Kubernetes Submit Queue 9c1480bb61 Merge pull request #46366 from nicksardo/gce-subnetwork-url
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)

GCE - Retrieve subnetwork name/url from gce.conf 

**What this PR does / why we need it**:
Features like ILB require specifying the subnetwork if the network is type manual.

**Notes:**
The network URL can be [constructed](68e7e18698/pkg/cloudprovider/providers/gce/gce.go (L211-L217)) by fetching instance metadata; however, the subnetwork is not provided through this feature. Users must specify the subnetwork name/url through the gce.conf.

Although multiple subnets can exist in the same region for a network, the cloud provider will only use one subnet url for creating LBs. 


**Release note**:
```release-note
NONE
```
2017-05-25 03:14:05 -07:00
Kubernetes Submit Queue 23348ceedc Merge pull request #46354 from smarterclayton/metrics_subresource
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)

Subresources are not included in apiserver prometheus metrics

Subresources are very often completely different code paths and errors
generated on those code paths are important to distinguish.

@kubernetes/sig-api-machinery-pr-reviews

```release-note
The Prometheus metrics for the kube-apiserver for tracking incoming API requests and latencies now return the `subresource` label for correctly attributing the type of API call.
```
2017-05-25 03:13:59 -07:00
gmarek 2437cf4d59 fix type in start-kubemark 2017-05-25 11:48:01 +02:00
gmarek 02951f182e Correctly handle nil resource usage in performance e2e tests 2017-05-25 11:44:03 +02:00
gmarek ded8e03fc3 Reduce service creation/deletion parallelism in the load test 2017-05-25 11:44:03 +02:00
Michail Kargakis 4a2c5eae92
Implement hash collision avoidance mechanism
Signed-off-by: Michail Kargakis <mkargaki@redhat.com>
2017-05-25 11:17:45 +02:00
Kubernetes Submit Queue d84f3f4b7e Merge pull request #46363 from MrHohn/fix-CheckPodsCondition
Automatic merge from submit-queue (batch tested with PRs 45913, 46065, 46352, 46363, 46373)

Fix CheckPodsCondition to print out the correct podName

From a couple CIs (https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-serial/1114, https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-gci-qa-serial-master/2246, https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gke-pre-release/2187), all indicate we print out the wrong pod name in CheckPodsCondition for _"Pod XXX failed to be running and ready, or succeeded."_:
```
I0524 02:09:50.173] May 24 02:09:50.173: INFO: Waiting for pod heapster-v1.3.0-3806988011-kzkg6 in namespace 'kube-system' status to be 'running and ready, or succeeded'(found phase: "Running", readiness: false) (4m55.033881993s elapsed)
I0524 02:09:52.178] May 24 02:09:52.178: INFO: Waiting for pod heapster-v1.3.0-3806988011-kzkg6 in namespace 'kube-system' status to be 'running and ready, or succeeded'(found phase: "Running", readiness: false) (4m57.03848264s elapsed)
I0524 02:09:54.183] May 24 02:09:54.182: INFO: Waiting for pod heapster-v1.3.0-3806988011-kzkg6 in namespace 'kube-system' status to be 'running and ready, or succeeded'(found phase: "Running", readiness: false) (4m59.043463323s elapsed)
I0524 02:09:56.183] May 24 02:09:56.183: INFO: Pod fluentd-gcp-v2.0-6wf67 failed to be running and ready, or succeeded.
I0524 02:09:56.184] May 24 02:09:56.183: INFO: Wanted all 23 pods to be running and ready, or succeeded. Result: false. Pods: [heapster-v1.3.0-3806988011-kzkg6 kube-proxy-bootstrap-e2e-minion-group-bbwn rescheduler-v0.3.0-bootstrap-e2e-master monitoring-influxdb-grafana-v4-1q59k l7-default-backend-1044750973-zgxsc etcd-server-events-bootstrap-e2e-master kube-apiserver-bootstrap-e2e-master kube-proxy-bootstrap-e2e-minion-group-6nqb kube-proxy-bootstrap-e2e-minion-group-mzbz fluentd-gcp-v2.0-chd2x kube-dns-806549836-f8p46 fluentd-gcp-v2.0-44x97 kube-dns-autoscaler-2528518105-vlg8t fluentd-gcp-v2.0-p1h4b kube-controller-manager-bootstrap-e2e-master l7-lb-controller-v0.9.3-bootstrap-e2e-master kubernetes-dashboard-2917854236-tn3nx kube-dns-806549836-fq2fp kube-scheduler-bootstrap-e2e-master etcd-empty-dir-cleanup-bootstrap-e2e-master kube-addon-manager-bootstrap-e2e-master etcd-server-bootstrap-e2e-master fluentd-gcp-v2.0-6wf67]
I0524 02:09:56.184] May 24 02:09:56.183: INFO: At least one pod wasn't running and ready or succeeded at test start.
I0524 02:09:56.184] [AfterEach] [k8s.io] Restart [Disruptive]
```

Check the codes and found we always print out the last pod name, which is random. Pass the pod name into channel to fix.

**Release note**:

```release-note
NONE
```
2017-05-25 00:11:05 -07:00
Kubernetes Submit Queue fe5b303365 Merge pull request #45913 from enj/enj/t/etcd_cohabitating_resources
Automatic merge from submit-queue (batch tested with PRs 45913, 46065, 46352, 46363, 46373)

Detect cohabitating resources in etcd storage test

**What this PR does / why we need it**:

This change updates the etcd storage path test to detect cohabitating resources by looking at their expected location in etcd.  This was not detected in the past because the GVK check did not span across groups.

To limit noise from failures caused by multiple objects at the same location in etcd, the test now fails when different GVRs share the same expected path.  Thus every object is expected to have a unique path.

@liggitt PTAL

Signed-off-by: Monis Khan <mkhan@redhat.com>

**Release note**:

```
NONE
```
2017-05-25 00:10:59 -07:00
Kubernetes Submit Queue ed8843406e Merge pull request #46303 from Random-Liu/fix-cos-image-project
Automatic merge from submit-queue (batch tested with PRs 46299, 46309, 46311, 46303, 46150)

Fix cos image project to cos-cloud.

Addressed https://github.com/kubernetes/kubernetes/pull/45136#discussion_r118092211.

@vishh @yujuhong @dchen1107
2017-05-24 23:19:09 -07:00
Kubernetes Submit Queue 8d88c55231 Merge pull request #46311 from dashpole/disable_ubuntu_gpu_test
Automatic merge from submit-queue (batch tested with PRs 46299, 46309, 46311, 46303, 46150)

Dont attach a GPU to ubuntu test machines for node e2e serial tests

This should fix flakes in the e2e_node serial suite.

@vishh I think this is what you were asking for...

/assign @vishh
2017-05-24 23:19:07 -07:00
Kubernetes Submit Queue b71ca6691b Merge pull request #46309 from Random-Liu/move-docker-validation-to-separate-project
Automatic merge from submit-queue (batch tested with PRs 46299, 46309, 46311, 46303, 46150)

Move docker validation test to separate project.

Docker validation test is leaking VMs because new docker version `DOCKER_VERSION=17.05.0-c` totally breaks the new gci image `GCE_IMAGES=gci-test-60-9579-0-0` with the `gci-docker-version` metadata specified.

The test successfully created the instance, but timed out when checking VM aliveness, and leaked the VM.

I've cleaned up all leaked VMs. This PR moves docker validation node e2e test into a separate project to not influencing other node e2e test.

@kewu1992 We should fix the docker automated validation test.

/cc @dchen1107 @yujuhong @abgworrall
2017-05-24 23:19:05 -07:00
System Administrator 9c8e92b8ff e2e tests for storage policy support in Kubernetes 2017-05-24 16:39:00 -07:00
David Ashpole 1a6572fc6c summary test now tests a pod that has containers that have restarted 2017-05-24 13:27:57 -07:00
Clayton Coleman ad431c454c
Subresources are not included in apiserver prometheus metrics
Subresources are very often completely different code paths and errors
generated on those code paths are important to distinguish.
2017-05-24 16:23:50 -04:00
Nick Sardo e7ee3913d7 Add subnetworkUrl param to e2e 2017-05-24 10:54:51 -07:00
Zihong Zheng 03d08623e8 Fix CheckPodsCondition to print out the correct podName 2017-05-24 10:20:57 -07:00
Kubernetes Submit Queue d4ff0f2a0e Merge pull request #46312 from dashpole/remove_memcg_jenkins_properties
Automatic merge from submit-queue (batch tested with PRs 42042, 46139, 46126, 46258, 46312)

Remove unused test properties

Issue:  #42676
A separate serial memcg suite was created for the initial stages of re-enabling memcg notifications.  Now that all e2e tests have memcg notifications enabled, this suite is no longer needed.
2017-05-23 19:43:07 -07:00
Kubernetes Submit Queue dae6955555 Merge pull request #46293 from nicksardo/chaosmonkey-defer-stop
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)

Chaosmonkey - Signal stop to tests and wait for done when disruption fails

**What this PR does / why we need it**:
Prevents tests from leaking resources because their Teardown was never called when test disruption fails.   

**Which issue this PR fixes**
First problem of #45842 

**Release note**:
```release-note
NONE
```
2017-05-23 15:48:59 -07:00
Kubernetes Submit Queue 45b275d52c Merge pull request #45897 from ncdc/gc-require-list-watch
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)

GC: update required verbs for deletable resources, allow list of ignored resources to be customized

The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.

Also allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.

cc @caesarxuchao @deads2k @smarterclayton @mfojtik @liggitt @sttts @kubernetes/sig-api-machinery-pr-reviews
2017-05-23 15:48:57 -07:00
Random-Liu 82f588b483 Fix cos image project to cos-cloud. 2017-05-23 15:12:03 -07:00
David Ashpole 8341d544f3 remove unused test properties 2017-05-23 14:39:18 -07:00
David Ashpole 20eb016597 dont attach a GPU to ubuntu machines 2017-05-23 14:34:18 -07:00
Random-Liu dc023144a3 Move docker validation test to separate project. 2017-05-23 14:07:15 -07:00
Kubernetes Submit Queue 1e2105808b Merge pull request #45136 from vishh/cos-nvidia-driver-install
Automatic merge from submit-queue

Enable "kick the tires" support for Nvidia GPUs in COS

This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User space libraries and debug utilities from the Nvidia driver installation are made available on the host in a special directory on the host -
* `/home/kubernetes/bin/nvidia/lib` for libraries
*  `/home/kubernetes/bin/nvidia/bin` for debug utilities

Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.

Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.

This PR updates the COS base image version to m59. This is coupled with this PR for the following reasons:
1. Driver installation requires disabling a kernel feature in COS. 
2. The kernel API for disabling this interface changed across COS versions
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post `3` PRs, one each for adding the basic installer, updating COS to m59, and then updating the installer again, this PR combines all the changes to reduce review overhead and latency, and additional noise that will be created when GPU tests break.

**Try out this PR**
1. Get Quota for GPUs in any region
2. `export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.

**Another option is to run a e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up` 
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.

TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
2017-05-23 10:46:10 -07:00
Kubernetes Submit Queue 1602e2a338 Merge pull request #45587 from foxish/pdb-maxunavailab
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)

PDB Max Unavailable Field

Completes https://github.com/kubernetes/features/issues/285

```release-note
Adds a MaxUnavailable field to PodDisruptionBudget
```


Individual commits are self-contained; Last commit can be ignored because it is autogenerated code.
cc @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews
2017-05-23 10:29:56 -07:00
Nick Sardo f40f45abc1 Defer test stop & cleanup 2017-05-23 10:11:46 -07:00
Andy Goldstein d1a0384678 GC: allow ignored resources to be customized
Allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
2017-05-23 12:05:09 -04:00
Kubernetes Submit Queue 8e07e61a43 Merge pull request #46223 from smarterclayton/scheduler_max
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)

Scheduler should use a shared informer, and fix broken watch behavior for cached watches

Can be used either from a true shared informer or a local shared
informer created just for the scheduler.

Fixes a bug in the cache watcher where we were returning the "current" object from a watch event, not the historic event.  This means that we broke behavior when introducing the watch cache.  This may have API implications for filtering watch consumers - but on the other hand, it prevents clients filtering from seeing objects outside of their watch correctly, which can lead to other subtle bugs.

```release-note
The behavior of some watch calls to the server when filtering on fields was incorrect.  If watching objects with a filter, when an update was made that no longer matched the filter a DELETE event was correctly sent.  However, the object that was returned by that delete was not the (correct) version before the update, but instead, the newer version.  That meant the new object was not matched by the filter.  This was a regression from behavior between cached watches on the server side and uncached watches, and thus broke downstream API clients.
```
2017-05-23 07:42:00 -07:00
Anirudh 63e51dc66e PDB MaxUnavailable: e2e tests 2017-05-23 07:18:44 -07:00
Kubernetes Submit Queue cc6e51c6e8 Merge pull request #45427 from ncdc/gc-shared-informers
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

Use shared informers in gc controller if possible

Modify the garbage collector controller to try to use shared informers for resources, if possible, to reduce the number of unique reflectors listing and watching the same thing.

cc @kubernetes/sig-api-machinery-pr-reviews @caesarxuchao @deads2k @liggitt @sttts @smarterclayton @timothysc @soltysh @kargakis @kubernetes/rh-cluster-infra @derekwaynecarr @wojtek-t @gmarek
2017-05-22 20:58:03 -07:00
Kubernetes Submit Queue bb56937b92 Merge pull request #46055 from deads2k/crd-01-embed
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)

embed kube-apiextensions inside of kube-apiserver

To reduce operation complexity, we decided to include the kube-apiextensions-server inside of kube-apiserver (https://github.com/kubernetes/community/blob/master/sig-api-machinery/api-extensions-position-statement.md#q-should-kube-aggregator-be-a-separate-binaryprocess-than-kube-apiserver).  With the API reasonably well established and a finalizer about merge, I think its time to add ourselves.

This pull wires kube-apiextensions-server ahead of the TPRs so that one will replace the other if both are added by accident (CRDs should have priority) and wires a controller for automatic aggregation.

WIP because I still need tests: unit test for controller, test-cmd test to mirror the TPR test.


```release-note
Adds the `CustomResourceDefinition` (crd) types to the `kube-apiserver`.  These are the successors to `ThirdPartyResource`.  See https://github.com/kubernetes/community/blob/master/contributors/design-proposals/thirdpartyresources.md for more details.
```
2017-05-22 19:59:57 -07:00
Kubernetes Submit Queue c2c5051adf Merge pull request #44899 from smarterclayton/burst
Automatic merge from submit-queue (batch tested with PRs 38990, 45781, 46225, 44899, 43663)

Support parallel scaling on StatefulSets

Fixes #41255

```release-note
StatefulSets now include an alpha scaling feature accessible by setting the `spec.podManagementPolicy` field to `Parallel`.  The controller will not wait for pods to be ready before adding the other pods, and will replace deleted pods as needed.  Since parallel scaling creates pods out of order, you cannot depend on predictable membership changes within your set.
```
2017-05-22 19:07:09 -07:00
Kubernetes Submit Queue a572f10387 Merge pull request #46205 from billy2180/bump-network-tester-json-image-version-to-1.9
Automatic merge from submit-queue (batch tested with PRs 46133, 46211, 46224, 46205, 45910)

test/images/network-tester:bump rc/pod image version to 1.9

Current image version is 1.9,update the image version of the associated json file to 1.9
```release-note
NONE
```
2017-05-22 15:50:05 -07:00
Kubernetes Submit Queue 03ba1324cf Merge pull request #46224 from gmarek/kubemark_heapster
Automatic merge from submit-queue (batch tested with PRs 46133, 46211, 46224, 46205, 45910)

Make CPU request for heapster in kubemark scale with the number of Nodes
2017-05-22 15:50:03 -07:00
Kubernetes Submit Queue 0329e3fdaf Merge pull request #46211 from gmarek/panic
Automatic merge from submit-queue (batch tested with PRs 46133, 46211, 46224, 46205, 45910)

Add more logs to kubelet_stats

Ref. #46198
2017-05-22 15:50:00 -07:00
Michelle Au 1a280993a9 Local persistent volume basic e2e 2017-05-22 14:46:03 -07:00
Clayton Coleman 8cd95c78c4
Scheduler should use a shared informer
Can be used either from a true shared informer or a local shared
informer created just for the scheduler.
2017-05-22 13:50:14 -04:00
Monis Khan cbfe566e49
Detect cohabitating resources in etcd storage test
This change updates the etcd storage path test to detect cohabitating
resources by looking at their expected location in etcd.  This was not
detected in the past because the GVK check did not span across groups.

To limit noise from failures caused by multiple objects at the same
location in etcd, the test now fails when different GVRs share the same
expected path.  Thus every object is expected to have a unique path.

Signed-off-by: Monis Khan <mkhan@redhat.com>
2017-05-22 13:48:18 -04:00
Andy Goldstein 2480f2ceb6 Use shared informers in gc controller if possible 2017-05-22 12:51:37 -04:00
Mik Vyatskov f605040165 Make Stackdriver Logging e2e tests less restrictive 2017-05-22 18:14:20 +02:00
Kubernetes Submit Queue b00c1b66f4 Merge pull request #46164 from shyamjvs/master-log-kubemark
Automatic merge from submit-queue

Add script to dump kubemark master logs

First step towards solving the issue https://github.com/kubernetes/kubernetes/issues/46109.

cc @kubernetes/test-infra-maintainers  @wojtek-t @gmarek
2017-05-22 09:01:36 -07:00
gmarek 27fc7be396 Make CPU request for heapster in kubemark scale with the number of Nodes 2017-05-22 16:20:27 +02:00
FengyunPan 287f703d3a Close file after os.Open() 2017-05-22 21:51:11 +08:00
gmarek 38981e9fd4 Add more logs to kubelet_stats 2017-05-22 15:49:57 +02:00
Aleksandra Malinowska 0e5051a84c Add overriding Stackdriver API endpoint 2017-05-22 15:47:39 +02:00
deads2k 446e959bf7 make CRD apiservice controller 2017-05-22 08:54:14 -04:00
billy2180 952ad3f4a7 test/images/network-tester:bump rc/pod image verison to 1.9 2017-05-22 17:11:23 +08:00
Wojciech Tyczynski 8de8446840 Revert "Scheduler should use shared informer for pods"
This reverts commit 479f01d340.
2017-05-22 09:03:35 +02:00
Kubernetes Submit Queue 06c12e717a Merge pull request #46071 from emaildanwilson/fedClusterSelectorIntegration
Automatic merge from submit-queue

[Federation] ClusterSelector Integration Testing

This pull request adds integration testing for the federated ClusterSelector ref: design #29887 merged pull #40234

cc: @nikhiljindal @marun
2017-05-21 23:18:44 -07:00
Kubernetes Submit Queue ba4c8b8db2 Merge pull request #46025 from billy2180/bump-netexec-pod-xml-to-1.7
Automatic merge from submit-queue

Bump e2e netexec pod.xml image version to 1.7

Changing the image version from 1.5 to 1.7
2017-05-21 02:27:05 -07:00
Clayton Coleman e40648de68
E2E test for statefulset burst 2017-05-21 01:14:31 -04:00
Vishnu kannan 86b5edb79a Update COS version to m59
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-05-20 21:17:19 -07:00
Vishnu kannan 1e77594958 Adding an installer script that installs Nvidia drivers in Container Optimized OS
Packaged the script as a docker container stored in gcr.io/google-containers
A daemonset deployment is included to make it easy to consume the installer
A cluster e2e has been added to test the installation daemonset along with verifying installation
by using a sample CUDA application.
Node e2e for GPUs updated to avoid running on nodes without GPU devices.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-05-20 21:17:19 -07:00
Clayton Coleman 479f01d340
Scheduler should use shared informer for pods
Previously, the scheduler created two separate list watchers. This
changes the scheduler to be able to leverage a shared informer, whether
passed in externally or spawned using the new in place method. This
removes the last use of a "special" informer in the codebase.

Allows someone wrapping the scheduler to use a shared informer if they
have more information avaliable.
2017-05-20 14:19:49 -04:00
Clayton Coleman 784e3ae5fa
Switch the tokens controller to use shared informers
Tokens controller previously needed a bit of extra help in order to be
safe for concurrent use. The new MutationCache allows it to keep a local
cache and still use a shared informer. The filtering event handler lets
it only see changes to secrets it cares about.
2017-05-20 14:19:49 -04:00
Shyam Jeedigunta 360054a75f Add script to dump kubemark master logs 2017-05-20 13:12:38 +02:00
Kubernetes Submit Queue 03495f12c3 Merge pull request #46152 from shashidharatd/federation
Automatic merge from submit-queue (batch tested with PRs 46014, 46152)

Updated test/test_owners.csv for federation test cases

To the best of my knowledge have updated the test owners for federation e2e test cases. PTAL and comment if any concern.

**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-federation-pr-reviews @fejta 
/assign @madhusudancs
2017-05-20 00:59:27 -07:00
Kubernetes Submit Queue 8fe818b2a1 Merge pull request #45981 from fabianofranz/kubectl_plugins_v1_part1
Automatic merge from submit-queue (batch tested with PRs 46033, 46122, 46053, 46018, 45981)

Command tree and exported env in kubectl plugins

This is part of `kubectl` plugins V1:
- Adds support to several env vars passing context information to the plugin. Plugins can make use of them to connect to the REST API, access global flags, get the path of the plugin caller (so that `kubectl` can be invoked) and so on. Exported env vars include
  - `KUBECTL_PLUGINS_DESCRIPTOR_*`: the plugin descriptor fields
  - `KUBECTL_PLUGINS_GLOBAL_FLAG_*`: one for each global flag, useful to access namespace, context, etc
  - ~`KUBECTL_PLUGINS_REST_CLIENT_CONFIG_*`: one for most fields in `rest.Config` so that a REST client can be built.~
  - `KUBECTL_PLUGINS_CALLER`: path to `kubectl`
  - `KUBECTL_PLUGINS_CURRENT_NAMESPACE`: namespace in use
- Adds support for plugins as child of other plugins so that a tree of commands can be built (e.g. `kubectl myplugin list`, `kubectl myplugin add`, etc)

**Release note**:

```release-note
Added support to a hierarchy of kubectl plugins (a tree of plugins as children of other plugins).

Added exported env vars to kubectl plugins so that plugin developers have access to global flags, namespace, the plugin descriptor and the full path to the caller binary.
```
@kubernetes/sig-cli-pr-reviews
2017-05-19 23:29:32 -07:00
Kubernetes Submit Queue 112ed869c7 Merge pull request #46053 from dashpole/test_eviction_metrics
Automatic merge from submit-queue (batch tested with PRs 46033, 46122, 46053, 46018, 45981)

Log age of stats used for evictions during eviction tests

I recently added prometheus metrics for the age of the metrics used for evictions #43031.  It would be nice to surface these during eviction tests, so I can better assess how old stats are, and whether or not the age of stats causes extra evictions.

This isnt super-high priority, and can be done after code-freeze, since it is a testing improvement.  Feel free to take a look whenever either of you has time.

/assign @mtaufen 
/assign @Random-Liu
2017-05-19 23:29:28 -07:00
Kubernetes Submit Queue 4f55f49035 Merge pull request #46042 from derekwaynecarr/quota-admission-registry
Automatic merge from submit-queue (batch tested with PRs 45346, 45903, 45958, 46042, 45975)

ResourceQuota admission control injects registry

**What this PR does / why we need it**:
The `ResourceQuota` admission controller works with a registry that maps a GroupKind to an Evaluator.  The registry used in the existing plug-in is not injectable, which makes usage of the ResourceQuota plug-in in other API server contexts difficult.  This PR updates the code to support late injection of the registry via a plug-in initializer.
2017-05-19 22:29:34 -07:00
shashidharatd b4792014d1 Updated test/test_owners.csv for federation test cases 2017-05-20 08:43:30 +05:30
Kubernetes Submit Queue 2473c24f81 Merge pull request #45979 from bowei/owners
Automatic merge from submit-queue

Add bowei to OWNERS: e2e/test dns,network; cloud route, node, service…
2017-05-19 19:39:05 -07:00
Kubernetes Submit Queue 73e7ef1f8c Merge pull request #46011 from MrHohn/e2e-fix-return-podnames
Automatic merge from submit-queue (batch tested with PRs 45996, 46121, 45707, 46011, 45564)

Fix waitForNPods in restart.go

From https://github.com/kubernetes/kubernetes/issues/45991#issuecomment-302292404.

Don't redefine `pods` so we can return real pod names instead of empty array.

/assign @dchen1107 @bowei 

**Release note**:

```release-note
NONE
```
2017-05-19 18:57:36 -07:00
Fabiano Franz 18cb56bf78 kubectl plugins have access config, global flags and environment 2017-05-19 19:17:43 -03:00
Bowei Du 3af1c0efcb Add bowei to OWNERS: e2e/test dns,network; cloud route, node, service controller 2017-05-19 14:49:43 -07:00
Fabiano Franz da85262f70 Adds support to a tree hierarchy of kubectl plugins 2017-05-19 18:06:15 -03:00
emaildanwilson 2cef454fd3 fed cluster selector integration test
updates from review
2017-05-19 13:47:52 -07:00
Shashidhara T D 40c32b02d7 Revert "[Federation] Fix federated service reconcilation issue due to addition of External…" 2017-05-19 18:29:07 +05:30
Kubernetes Submit Queue 51f3ac1b99 Merge pull request #45004 from feiskyer/hostnetwork
Automatic merge from submit-queue

Add node e2e tests for hostNetwork

**What this PR does / why we need it**:

Add node e2e tests for hostNetwork.

**Which issue this PR fixes**

Part of #44118.

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/assign @Random-Liu @yujuhong
2017-05-19 01:18:07 -07:00
Derek Carr a71bea312a ResourceQuota admission control injects registry 2017-05-18 23:17:13 -04:00
David Ashpole 0bd0d705e3 log age of stats used for evictions during eviction tests 2017-05-18 13:51:23 -07:00
Kubernetes Submit Queue 93579d637e Merge pull request #45968 from marun/fed-remove-redundant-e2e
Automatic merge from submit-queue (batch tested with PRs 45950, 45968)

[Federation] Remove redundant e2e for secret and daemonset

Federation of daemonset and secret types is now implemented by the sync controller, and e2e testing for each type is provided via crud lifecycle e2e tests.  This renders the legacy e2e tests for these types redundant, and this commit removes those tests.

The secret wait and delete functions required by the ingress e2e tests have been retained and moved to ingress.go.

cc: @kubernetes/sig-federation-pr-reviews
2017-05-18 08:39:05 -07:00
Kubernetes Submit Queue a1c2db2fec Merge pull request #45950 from shyamjvs/revert-proxier
Automatic merge from submit-queue

Make real proxier in hollow-proxy optional (default=true)

Ref https://github.com/kubernetes/kubernetes/pull/45622
This allows using real proxier for hollow proxy, but we use the fake one by default.

cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
2017-05-18 07:55:09 -07:00
Shyam Jeedigunta 804a4f558c Make usage of real proxier in hollow-proxy optional (default=true) 2017-05-18 14:30:12 +02:00
billy2180 4cd92e8f37 Bump e2e netexec pod.xml image version to 1.7 2017-05-18 17:54:13 +08:00
Zihong Zheng 2095e0bee6 Fix waitForNPods in restart.go 2017-05-17 20:47:11 -07:00
Kubernetes Submit Queue 7df0178076 Merge pull request #42975 from smarterclayton/time_namespace
Automatic merge from submit-queue (batch tested with PRs 40234, 45885, 42975)

Log how much time it takes e2e tests to clean up the namespace
2017-05-17 20:27:52 -07:00
Kubernetes Submit Queue 747b706c4b Merge pull request #45678 from a-robinson/1.0
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)

Add explicit image tag to cockroachdb example and test

@gyliu513 

```release-note
NONE
```
2017-05-17 18:40:59 -07:00
Kubernetes Submit Queue 4f56e8b756 Merge pull request #45742 from janetkuo/deployment-integration-test
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)

Add integration test for deployment

We have no integration test for Deployment currently. In this PR, add an integration test which covers an e2e test (create a new RollingUpdate deployment), add more replicas to the Deployment, and set minReadySeconds so that we can test maxUnavailable. 

Plan to add more integration tests that cover e2e tests after this initial PR is merged.

@kubernetes/sig-apps-pr-reviews
2017-05-17 18:40:57 -07:00
Kubernetes Submit Queue 8710f6e62d Merge pull request #45745 from marun/fed-test-cluster-addition
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)

[Federation]  Add integration testing for cluster addition

This PR adds integration testing of the sync controller for cluster addition.  This ensures coverage equivalency between the integration tests and the old controller unit tests, so those tests are removed by this PR.

Resolves #45257

cc: @kubernetes/sig-federation-pr-reviews
2017-05-17 18:40:54 -07:00
Clayton Coleman 4f35b31fc7
Try speeding up ConfigMap e2e namespace deletion 2017-05-17 17:45:05 -04:00
Clayton Coleman fcab3a442d
Log how much time it takes e2e tests to clean up the namespace
Will get a better handle on deletion test wasted time
2017-05-17 17:45:04 -04:00
Janet Kuo 3f2d8ae682 Extract common code in deployment e2e and integration test 2017-05-17 14:41:59 -07:00
Janet Kuo 282c90bc1a Remove e2e test for creating a new deployment 2017-05-17 14:41:59 -07:00
Janet Kuo 1ced5ae22c Add integration test for deployment 2017-05-17 14:41:59 -07:00
Kubernetes Submit Queue 581ebf46a9 Merge pull request #45852 from wongma7/subpath-e2e
Automatic merge from submit-queue

Add subPath e2e: test permissions and when subPath pre-exists

These tests cover issues https://github.com/kubernetes/kubernetes/issues/45613 and ~~https://github.com/kubernetes/kubernetes/pull/43775~~ https://github.com/kubernetes/kubernetes/issues/41638
-->
```release-note
NONE
```
2017-05-17 12:21:36 -07:00
Kubernetes Submit Queue ac62748480 Merge pull request #44230 from mtaufen/remove-babysit-daemons
Automatic merge from submit-queue

Remove the deprecated --babysit-daemons kubelet flag

```release-note
Removes the deprecated kubelet flag --babysit-daemons
```

This flag has been deprecated for over a year (git blame says marked deprecated on March 1, 2016).
Relatively easy removal - nothing in the Kubelet relies on it anymore.

There was still some stuff in the provisioning scripts. It was easy to rip out, but in general we probably need to be more disciplined about updating the provisioning scripts at the same time that we initially mark things deprecated.
2017-05-17 11:23:17 -07:00
Kubernetes Submit Queue 4a9a702ee1 Merge pull request #45926 from MrHohn/api-annotations-move
Automatic merge from submit-queue

Move all API related annotations into annotation_key_constants.go

Separate from #45869. See https://github.com/kubernetes/kubernetes/pull/45869#discussion_r116839411 for details.

This PR does nothing but move constants around :)

/assign @caesarxuchao 

**Release note**:

```release-note
NONE
```
2017-05-17 10:34:53 -07:00
Michael Taufen 2ee2ec5e21 Remove the deprecated --babysit-daemons kubelet flag 2017-05-17 09:08:57 -07:00
Maru Newby 9b7270821b fed: Remove redundant e2e for secret and daemonset
Federation of daemonset and secret types is now implemented by the
sync controller, and e2e testing for each type is provided via crud
lifecycle e2e tests.  This renders the legacy e2e tests for these types
redundant, and this commit removes those tests.
2017-05-17 08:28:04 -07:00