Commit Graph

1583 Commits (d7b3c62c563beae64cc9d9bf60fea73b1784b877)

Author SHA1 Message Date
Kubernetes Submit Queue 18fe59264c Merge pull request #41875 from wanghaoran1988/fix_issue
Automatic merge from submit-queue (batch tested with PRs 41954, 40528, 41875, 41165, 41877)

Move the scheduler Fake* to testing folder

Address this issue: https://github.com/kubernetes/kubernetes/issues/41662
@timothysc I moved the interface to the types.go, and the Fake* to the testing package, not sure whether you like or not.
2017-02-26 14:54:52 -08:00
Kubernetes Submit Queue 11b9a2d038 Merge pull request #41119 from sarat-k/master
Automatic merge from submit-queue (batch tested with PRs 41701, 41818, 41897, 41119, 41562)

Optimization of on-wire information sent to scheduler extender interfaces that are stateful

The changes are to address the issue described in https://github.com/kubernetes/kubernetes/issues/40991

```release-note
Added support to minimize sending verbose node information to scheduler extender by sending only node names and expecting extenders to cache the rest of the node information
```

cc @ravigadde
2017-02-26 14:02:54 -08:00
Kubernetes Submit Queue 1359ffc502 Merge pull request #41818 from aveshagarwal/master-taints-tolerations-api-fields-pod-spec-updates
Automatic merge from submit-queue (batch tested with PRs 41701, 41818, 41897, 41119, 41562)

Allow updates to pod tolerations.

Opening this PR to continue discussion for pod spec tolerations updates when a pod has been scheduled already. This PR is built on top of https://github.com/kubernetes/kubernetes/pull/38957.

@kubernetes/sig-scheduling-pr-reviews @liggitt @davidopp @derekwaynecarr @kubernetes/rh-cluster-infra
2017-02-26 14:02:51 -08:00
Kubernetes Submit Queue 16f87fe7d8 Merge pull request #40952 from dashpole/premption
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)

Guaranteed admission for Critical Pods

This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical, and if the only failed predicates are InsufficientResourceErrors, then it prempts (not yet implemented) other other pods to allow admission of the critical pod.

cc: @vishh
2017-02-26 12:57:59 -08:00
Kubernetes Submit Queue 452420484c Merge pull request #41982 from deads2k/agg-18-ca-permissions
Automatic merge from submit-queue

Add namespaced role to inspect particular configmap for delegated authentication

Builds on https://github.com/kubernetes/kubernetes/pull/41814 and https://github.com/kubernetes/kubernetes/pull/41922 (those are already lgtm'ed) with the ultimate goal of making an extension API server zero-config for "normal" authentication cases.

This part creates a namespace role in `kube-system` that can *only* look the configmap which gives the delegated authentication check.  When a cluster-admin grants the SA running the extension API server the power to run delegated authentication checks, he should also bind this role in this namespace.

@sttts Should we add a flag to aggregated API servers to indicate they want to look this up so they can crashloop on startup?  The alternative is sometimes having it and sometimes not.  I guess we could try to key on explicit "disable front-proxy" which may make more sense.

@kubernetes/sig-api-machinery-misc 

@ncdc I spoke to @liggitt about this before he left and he was ok in concept.  Can you take a look at the details?
2017-02-26 12:12:49 -08:00
Kubernetes Submit Queue 2eef3b1a14 Merge pull request #41957 from liggitt/mirror-pod-secrets
Automatic merge from submit-queue (batch tested with PRs 41814, 41922, 41957, 41406, 41077)

Use consistent helper for getting secret names from pod

Kubelet secret-manager and mirror-pod admission both need to know what secrets a pod spec references. Eventually, a node authorizer will also need to know the list of secrets.

This creates a single (well, double, because api versions) helper that can be used to traverse the secret names referenced from a pod, optionally short-circuiting (for places that are just looking to see if any secrets are referenced, like admission, or are looking for a particular secret ref, like authorization)

Fixes:
* secret manager not handling secrets used by env/envFrom in initcontainers
* admission allowing mirror pods with secret references

@smarterclayton @wojtek-t
2017-02-26 10:22:51 -08:00
Kubernetes Submit Queue 77ba346f55 Merge pull request #41815 from kevin-wangzefeng/enable-defaulttolerationseconds-admission-controller
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)

enable DefaultTolerationSeconds admission controller by default

**What this PR does / why we need it**:
Continuation of PR #41414, enable DefaultTolerationSeconds admission controller by default.


**Which issue this PR fixes**: 
fixes: #41860
related Issue: #1574, #25320
related PRs: #34825, #41133, #41414 

**Special notes for your reviewer**:

**Release note**:

```release-note
enable DefaultTolerationSeconds admission controller by default
```
2017-02-26 08:09:58 -08:00
Haoran Wang 4540bb9e77 move the lister.go to testing folder 2017-02-25 23:36:27 +08:00
Kubernetes Submit Queue 8e6af485f9 Merge pull request #41918 from ncdc/shared-informers-14-scheduler
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)

Switch scheduler to use generated listers/informers

Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).

I think this can wait until master is open for 1.7 pulls, given that we're close to the 1.6 freeze.

After this and #41482 go in, the only code left that references legacylisters will be federation, and 1 bit in a stateful set unit test (which I'll clean up in a follow-up).

@resouer I imagine this will conflict with your equivalence class work, so one of us will be doing some rebasing 😄 

cc @wojtek-t @gmarek  @timothysc @jayunit100 @smarterclayton @deads2k @liggitt @sttts @derekwaynecarr @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scalability-pr-reviews
2017-02-25 02:17:55 -08:00
gmarek 6637592b1d generated 2017-02-24 09:24:33 +01:00
gmarek d88af7806c NodeController sets NodeTaints instead of deleting Pods 2017-02-24 09:24:33 +01:00
Kubernetes Submit Queue 46dda7e32a Merge pull request #41821 from deads2k/apiserver-15-healthz
Automatic merge from submit-queue

redact detailed errors from healthz and expose in default policy

Makes `/healthz` less sensitive and exposes it by default.

@kubernetes/sig-auth-pr-reviews @kubernetes/sig-api-machinery-misc @liggitt
2017-02-24 00:22:55 -08:00
Kubernetes Submit Queue 51f498f6f3 Merge pull request #41645 from ncdc/shared-informers-12-admission
Automatic merge from submit-queue (batch tested with PRs 41667, 41820, 40910, 41645, 41361)

Switch admission to use shared informers

Originally part of #40097

cc @smarterclayton @derekwaynecarr @deads2k @liggitt @sttts @gmarek @wojtek-t @timothysc @lavalamp @kubernetes/sig-scalability-pr-reviews @kubernetes/sig-api-machinery-pr-reviews
2017-02-23 20:57:31 -08:00
Kubernetes Submit Queue b799bbf0a8 Merge pull request #38816 from deads2k/rbac-23-switch-kubedns-sa
Automatic merge from submit-queue

move kube-dns to a separate service account

Switches the kubedns addon to run as a separate service account so that we can subdivide RBAC permission for it.  The RBAC permissions will need a little more refinement which I'm expecting to find in https://github.com/kubernetes/kubernetes/pull/38626 .

@cjcullen @kubernetes/sig-auth since this is directly related to enabling RBAC with subdivided permissions
 @thockin @kubernetes/sig-network since this directly affects now kubedns is added.  


```release-note
`kube-dns` now runs using a separate `system:serviceaccount:kube-system:kube-dns` service account which is automatically bound to the correct RBAC permissions.
```
2017-02-23 12:06:13 -08:00
David Ashpole c58970e47c critical pods can preempt other pods to be admitted 2017-02-23 10:31:20 -08:00
Sarat Kamisetty dda62ec207 - scheduler extenders that are capable of maintaining their own
node cache don't need to get all the information about every
  candidate node. For huge clusters, sending full node information
  of all nodes in the cluster on the wire every time a pod is scheduled
  is expensive. If the scheduler is capable of caching node information
  along with its capabilities, sending node name alone is sufficient.
  These changes provide that optimization in a backward compatible way

- removed the inadvertent signature change of Prioritize() function
- added defensive checks as suggested
-  added annotation specific test case
- updated the comments in the scheduler types
- got rid of apiVersion thats unused
- using *v1.NodeList as suggested
- backing out pod annotation update related changes made in the
  1st commit
- Adjusted the comments in types.go and v1/types.go as suggested
  in the code review
2017-02-23 10:25:42 -08:00
deads2k da3da29223 add kube-system local roles 2017-02-23 11:33:12 -05:00
Andy Goldstein 022bff7fbe Switch admission to use shared informers 2017-02-23 11:16:09 -05:00
Avesh Agarwal b9d95b4426 Allow toleration updates via pod spec. 2017-02-23 11:06:13 -05:00
Andy Goldstein 9d8d6ad16c Switch scheduler to use generated listers/informers
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
2017-02-23 09:57:12 -05:00
Kubernetes Submit Queue 787b1a2388 Merge pull request #41281 from ericchiang/bootstrap-token-authenticator
Automatic merge from submit-queue (batch tested with PRs 41812, 41665, 40007, 41281, 41771)

kube-apiserver: add a bootstrap token authenticator for TLS bootstrapping

Follows up on https://github.com/kubernetes/kubernetes/pull/36101

Still needs:

* More tests.
* To be hooked up to the API server.
  - Do I have to do that in a separate PR after k8s.io/apiserver is synced?
* Docs (kubernetes.io PR).
* Figure out caching strategy.
* Release notes.

cc @kubernetes/sig-auth-api-reviews @liggitt @luxas @jbeda

```release-notes
Added a new secret type "bootstrap.kubernetes.io/token" for dynamically creating TLS bootstrapping bearer tokens.
```
2017-02-23 00:11:40 -08:00
Jordan Liggitt a5526304bc
Use consistent helper for getting secret names from pod 2017-02-23 00:40:17 -05:00
Avesh Agarwal b4d3d24eaf Update tests. 2017-02-22 09:27:42 -05:00
Avesh Agarwal 9b640838a5 Change taint/toleration annotations to api fields. 2017-02-22 09:27:42 -05:00
deads2k 4cd0b7cdbe redact detailed errors from healthz and expose in default policy 2017-02-22 07:52:13 -05:00
Kevin cd427fa4be enable DefaultTolerationSeconds admission controller by default 2017-02-22 00:45:56 +08:00
Eric Chiang a0df658b20 kube-apiserver: add a bootstrap token authenticator for TLS bootstrapping 2017-02-21 08:43:55 -08:00
Kubernetes Submit Queue f2e234e47f Merge pull request #41398 from codablock/azure_max_pd
Automatic merge from submit-queue

Add scheduler predicate to filter for max Azure disks attached

**What this PR does / why we need it**: This PR adds scheduler predicates for maximum Azure Disks count. This allows to use the environment variable KUBE_MAX_PD_VOLS on scheduler the same as it's already possible with GCE and AWS.

This is needed as we need a way to specify the maximum attachable disks on Azure to avoid permanently failing disk attachment in cases k8s scheduled too many PODs with AzureDisk volumes onto the same node. 

I've chosen 16 as the default value for DefaultMaxAzureDiskVolumes even though it may be too high for many smaller VM types and too low for the larger VM types. This means, the default behavior may change for clusters with large VM types. For smaller VM types, the behavior will not change (it will keep failing attaching).

In the future, the value should be determined at run time on a per node basis, depending on the VM size. I know that this is already implemented in the ongoing Azure Managed Disks work, but I don't remember where to find this anymore and also forgot who was working on this. Maybe @colemickens can help here.

**Release note**:

```release-note
Support KUBE_MAX_PD_VOLS on Azure
```

CC @colemickens @brendandburns
2017-02-21 06:11:09 -08:00
Kubernetes Submit Queue 506950ada0 Merge pull request #36765 from derekwaynecarr/quota-precious-resources
Automatic merge from submit-queue (batch tested with PRs 41421, 41440, 36765, 41722)

ResourceQuota ability to support default limited resources

Add support for the ability to configure the quota system to identify specific resources that are limited by default.  A limited resource means its consumption is denied absent a covering quota.  This is in contrast to the current behavior where consumption is unlimited absent a covering quota.  Intended use case is to allow operators to restrict consumption of high-cost resources by default.

Example configuration:

**admission-control-config-file.yaml**
```
apiVersion: apiserver.k8s.io/v1alpha1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: resourcequota.admission.k8s.io/v1alpha1
    kind: Configuration
    limitedResources:
    - resource: pods
      matchContains:
      - pods
      - requests.cpu
    - resource: persistentvolumeclaims
      matchContains:
      - .storageclass.storage.k8s.io/requests.storage
```

In the above configuration, if a namespace lacked a quota for any of the following:
* cpu
* any pvc associated with particular storage class

The attempt to consume the resource is denied with a message stating the user has insufficient quota for the matching resources.

```
$ kubectl create -f pvc-gold.yaml 
Error from server: error when creating "pvc-gold.yaml": insufficient quota to consume: gold.storageclass.storage.k8s.io/requests.storage
$ kubectl create quota quota --hard=gold.storageclass.storage.k8s.io/requests.storage=10Gi
$ kubectl create -f pvc-gold.yaml 
... created
```
2017-02-20 10:37:42 -08:00
Kubernetes Submit Queue af41d2f57c Merge pull request #41661 from liggitt/satoken
Automatic merge from submit-queue

Make controller-manager resilient to stale serviceaccount tokens

Now that the controller manager is spinning up controller loops using service accounts, we need to be more proactive in making sure the clients will actually work.

Future additional work:
* make a controller that reaps invalid service account tokens (c.f. https://github.com/kubernetes/kubernetes/issues/20165)
* allow updating the client held by a controller with a new token while the controller is running (c.f. https://github.com/kubernetes/kubernetes/issues/4672)
2017-02-20 08:39:31 -08:00
deads2k 36b586d5d7 move kube-dns to a separate service account 2017-02-20 07:35:08 -05:00
Kubernetes Submit Queue ba6dca94bc Merge pull request #41458 from humblec/iscsi-nodisk-conflict
Automatic merge from submit-queue

Adjust nodiskconflict support based on iscsi multipath.

With the multipath support is in place, to declare whether both iscsi disks are same, we need to only depend on IQN.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2017-02-20 03:54:41 -08:00
Alexander Block 73a0083a84 Add scheduler predicate to filter for max Azure disks attached 2017-02-20 09:00:18 +01:00
Kubernetes Submit Queue b3d627c2e2 Merge pull request #41387 from gyliu513/most-request
Automatic merge from submit-queue

Improved code coverage for plugin/pkg/scheduler/algorithm/priorities…

…/most_requested.go



**What this PR does / why we need it**:
Part of #39559 , code coverage improved from 70+% to 80+%
2017-02-19 23:04:02 -08:00
Kubernetes Submit Queue 4a75c1b2aa Merge pull request #41617 from timothysc/affinity_annotations_flaggate
Automatic merge from submit-queue (batch tested with PRs 39373, 41585, 41617, 41707, 39958)

Feature-Gate affinity in annotations 

**What this PR does / why we need it**:
Adds back basic flaggated support for alpha Affinity annotations

**Special notes for your reviewer**:
Reconcile function is placed in the lowest common denominator, which in this case is schedulercache, because you can't place flag-gated functions in apimachinery. 

**Release note**:

```
NONE
```

/cc @davidopp
2017-02-19 13:50:40 -08:00
Kubernetes Submit Queue 070ebfe622 Merge pull request #41414 from kevin-wangzefeng/tolerationseconds-admission-controller
Automatic merge from submit-queue (batch tested with PRs 41043, 39058, 41021, 41603, 41414)

add defaultTolerationSeconds admission controller

**What this PR does / why we need it**:
Splited from #34825, add a new admission-controller that
1. adds toleration (with tolerationSeconds = 300) for taint `notReady:NoExecute` to every pod that does not already have a toleration for that taint, and
2. adds toleration (with tolerationSeconds = 300) for taint `unreachable:NoExecute` to every pod that does not already have a toleration for that taint.

**Which issue this PR fixes**: 
Related issue: #1574
Related PR: #34825

**Special notes for your reviewer**:

**Release note**:

```release-note
add defaultTolerationSeconds admission controller
```
2017-02-19 00:58:47 -08:00
Derek Carr 3fad0cb52a Implement support for limited resources in quota 2017-02-18 12:10:22 -05:00
Derek Carr 8575978d7a ResourceQuota API configuration type 2017-02-18 12:09:54 -05:00
Kevin 83545a65f1 add defaultTolerationSeconds admission controller 2017-02-18 23:48:03 +08:00
Timothy St. Clair 2bcd63c524 Cleanup work to enable feature gating annotations 2017-02-18 09:25:57 -06:00
Robert Rati 32c4683242 Feature-Gate affinity in annotations 2017-02-18 09:08:38 -06:00
Jordan Liggitt b83e6f7d91
Make controller-manager resilient to stale serviceaccount tokens 2017-02-17 23:59:00 -05:00
Kubernetes Submit Queue 97921ff38e Merge pull request #41195 from wojtek-t/remove_default_failure_domains
Automatic merge from submit-queue (batch tested with PRs 41401, 41195, 41664, 41521, 41651)

Remove default failure domains from anti-affinity feature

Removing it is necessary to make performance of this feature acceptable at some point.

With default failure domains (or in general when multiple topology keys are possible), we don't have transitivity between node belonging to a topology. And without this, it's pretty much impossible to solve this effectively.

@timothysc
2017-02-17 19:46:40 -08:00
Matthew Wong 33f98d4db3 Switch pv controller to shared informers 2017-02-16 10:08:23 -05:00
Wojciech Tyczynski 3de7195cf8 Remove default failure domains from anti-affinity feature 2017-02-16 13:32:34 +01:00
Humble Chirammal 7a1ac6c6db Adjust nodiskconflict support based on iscsi multipath feature.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2017-02-16 16:24:53 +05:30
Kubernetes Submit Queue 97212f5b3a Merge pull request #37953 from liggitt/automount
Automatic merge from submit-queue (batch tested with PRs 37137, 41506, 41239, 41511, 37953)

Add field to control service account token automounting

Fixes https://github.com/kubernetes/kubernetes/issues/16779

* adds an `automountServiceAccountToken *bool` field to `ServiceAccount` and `PodSpec`
* if set in both the service account and pod, the pod wins
* if unset in both the service account and pod, we automount for backwards compatibility

```release-note
An `automountServiceAccountToken *bool` field was added to ServiceAccount and PodSpec objects. If set to `false` on a pod spec, no service account token is automounted in the pod. If set to `false` on a service account, no service account token is automounted for that service account unless explicitly overridden in the pod spec.
```
2017-02-15 20:05:13 -08:00
Jordan Liggitt 0d6e877de2
Add automountServiceAccountToken field to PodSpec and ServiceAccount types 2017-02-15 16:04:09 -05:00
Kubernetes Submit Queue 1ad5cea24e Merge pull request #41261 from ncdc/shared-informers-07-resourcequota
Automatic merge from submit-queue

Switch resourcequota controller to shared informers

Originally part of #40097 

I have had some issues with this change in the past, when I updated `pkg/quota` to use the new informers while `pkg/controller/resourcequota` remained on the old informers. In this PR, both are switched to using the new informers. The issues in the past were lots of flakey test failures in the ResourceQuota e2es, where it would randomly fail to see deletions and handle replenishment. I am hoping that now that everything here is consistently using the new informers, there won't be any more of these flakes, but it's something to keep an eye out for.

I also think `pkg/controller/resourcequota` could be cleaned up. I don't think there's really any need for `replenishment_controller.go` any more since it's no longer running individual controllers per kind to replenish. It instead just uses the shared informer and adds event handlers to it. But maybe we do that in a follow up.

cc @derekwaynecarr @smarterclayton @wojtek-t @deads2k @sttts @liggitt @timothysc @kubernetes/sig-scalability-pr-reviews
2017-02-15 11:37:04 -08:00
Kubernetes Submit Queue e4a4fe4a89 Merge pull request #41285 from liggitt/kube-scheduler-role
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)

Secure kube-scheduler

This PR:
* Adds a bootstrap `system:kube-scheduler` clusterrole
* Adds a bootstrap clusterrolebinding to the `system:kube-scheduler` user
* Sets up a kubeconfig for kube-scheduler on GCE (following the controller-manager pattern)
* Switches kube-scheduler to running with kubeconfig against secured port (salt changes, beware)
* Removes superuser permissions from kube-scheduler in local-up-cluster.sh
* Adds detailed RBAC deny logging

```release-note
On kube-up.sh clusters on GCE, kube-scheduler now contacts the API on the secured port.
```
2017-02-15 03:25:10 -08:00