Automatic merge from submit-queue (batch tested with PRs 39538, 40188, 40357, 38214, 40195)
Decoupling scheduler creation from creation of scheduler.Config struc…
**What this PR does / why we need it**:
Adds functionality to the scheduler to initialize from an Configurator interface, rather then via a Config struct.
**Which issue this PR fixes**
Reduces coupling to `scheduler.Config` data structure format so that we can proliferate more interface driven composition of scheduler components.
Automatic merge from submit-queue
Add serviceaccount owners files
Poor @derekwaynecarr is the sole approver/reviewer for the entire admission package.
This adds OWNERS files for service account controller and admission packages
When a pod uses a service account that references multiple secrets,
prefer the secrets in the order they're listed.
Without this change, the added test fails:
--- FAIL: TestMultipleReferencedSecrets (0.00s)
admission_test.go:832: expected first referenced secret to be mounted, got "token2"
Automatic merge from submit-queue
move client/cache and client/discovery to client-go
mechanical changes to move those packages. Had to create a `k8s.io/kubernetes/pkg/client/tests` package for tests that were blacklisted from client-go. We can rewrite these tests later and move them, but for now they'll still run at least.
@caesarxuchao @sttts
Automatic merge from submit-queue
Skip schedule deleting pod
Since binding a deleting pod will always return fail, we should skip that kind of pod early
These files have been created lately, so we don't have much information
about them anyway, so let's just:
- Remove assignees and make them approvers
- Copy approves as reviewers
Automatic merge from submit-queue (batch tested with PRs 36693, 40154, 40170, 39033)
make client-go authoritative for pkg/client/restclient
Moves client/restclient to client-go and a util/certs, util/testing as transitives.
Automatic merge from submit-queue (batch tested with PRs 36693, 40154, 40170, 39033)
Minor hygiene in scheduler.
**What this PR does / why we need it**:
Minor cleanups in scheduler, related to PR #31652.
- Unified lazy opaque resource caching.
- Deleted a commented-out line of code.
**Release note**:
```release-note
N/A
```
Automatic merge from submit-queue
move pkg/fields to apimachinery
Purely mechanical move of `pkg/fields` to apimachinery.
Discussed with @lavalamp on slack. Moving this an `labels` to apimachinery.
@liggitt any concerns? I think the idea of field selection should become generic and this ends up shared between client and server, so this is a more logical location.
Automatic merge from submit-queue
make client-go more authoritative
Builds on https://github.com/kubernetes/kubernetes/pull/40103
This moves a few more support package to client-go for origination.
1. restclient/watch - nodep
1. util/flowcontrol - used interface
1. util/integer, util/clock - used in controllers and in support of util/flowcontrol
Automatic merge from submit-queue (batch tested with PRs 39898, 39904)
[scheduler] interface for config
**What this PR fixes**
This PR converts the Scheduler configuration factory into an interface, so that
- the scheduler_perf and scheduler integration tests dont rely on the struct for their implementation
- the exported functionality of the factory (i.e. what it needs to provide to create a scheduler configuration) is completely explicit, rather then completely coupled to a struct.
- makes some parts of the factory immutable, again to minimize possible coupling.
This makes it easier to make a custom factory in instances where we might specifically want to import scheduler logic without actually reusing the entire scheduler codebase.
Automatic merge from submit-queue (batch tested with PRs 36467, 36528, 39568, 40094, 39042)
Improve code coverage for algorithm/predicates.
Improve code coverage for algorithm/predicates for #39559 .
Improved coverage from 71.3% to 81.9%.
Coverage report: [combined-coverage.html.gz](https://github.com/kubernetes/kubernetes/files/691518/combined-coverage.html.gz)
Automatic merge from submit-queue (batch tested with PRs 39625, 39842)
Add RBAC v1beta1
Add `rbac.authorization.k8s.io/v1beta1`. This scrubs `v1alpha1` to remove cruft, then add `v1beta1`. We'll update other bits of infrastructure to code to `v1beta1` as a separate step.
```release-note
The `attributeRestrictions` field has been removed from the PolicyRule type in the rbac.authorization.k8s.io/v1alpha1 API. The field was not used by the RBAC authorizer.
```
@kubernetes/sig-auth-misc @liggitt @erictune
Automatic merge from submit-queue
Include "ingresses" resource in RBAC bootstrap roles
The bootstrap RBAC roles "admin", "edit", and "view" should all be able to apply their respective access verbs to the "ingresses" resource in order to facilitate both publishing Ingress resources (for
service administrators) and consuming them (for ingress controllers).
Note that I alphabetized the resources listed in the role definitions that I changed to make it easier to decide later where to insert new entries. The original order looked like it may have started out alphabetized, but lost its way. If I missed an intended order there, please advise.
I am uncertain whether this change deserves mention in a release note, given the RBAC feature's alpha state. Regardless, it's possible that a cluster administrator could have been happy with the previous set of permissions afforded by these roles, and would be surprised to discover that bound subjects can now control _Ingress_ resources. However, in order to be afflicted, that administrator would have had to have applied these role definitions again which, if I understand it, would be a deliberate act, as bootstrapping should only occur once in a given cluster.
The bootstrap RBAC roles "admin", "edit", and "view" should all be
able to apply their respective access verbs to the "ingresses"
resource in order to facilitate both publishing Ingress resources (for
service administrators) and consuming them (for ingress controllers).
Automatic merge from submit-queue
Corrected a typo in scheduler factory.go.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 39945, 39601)
bugfix for PodToleratesNodeTaints
`PodToleratesNodeTaints`predicate func should return true if pod has no toleration annotations and node's taint effect is `PreferNoSchedule`
Automatic merge from submit-queue (batch tested with PRs 37680, 39968)
Update Owners for Scheduler
Update Owners file for scheduler component to spread the reviews around.
/cc @davidopp per previous sig-mtg.
Automatic merge from submit-queue
add patch RS to deployment controller
Found in http://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/2841/artifacts/bootstrap-e2e-master/, `RBAC DENY: user "system:serviceaccount:kube-system:deployment-controller" groups [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] cannot "patch" on "replicasets.extensions/" in namespace "e2e-tests-deployment-3rj5g"
`
@kubernetes/sig-auth-misc
Automatic merge from submit-queue
Give replicaset controller patch permission on pods
Needed for AdoptPod/ReleasePod
Fixes denials seen in autoscaling test log:
`RBAC DENY: user "system:serviceaccount:kube-system:replicaset-controller" groups [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] cannot "patch" on "pods./"`
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue
Fix examples e2e permission check
Ref #39382
Follow-up from #39896
Permission check should be done within the e2e test namespace, not cluster-wide
Also improved RBAC audit logging to make the scope of the permission check clearer
Automatic merge from submit-queue
eliminate duplicated codes in estimateContainer method
**What this PR does / why we need it**:
there are two code snippets about when to estimate resource for cpu and mem are duplicated, i extracted them into method `getEstimationIfNeeded` method
Signed-off-by: bruceauyeung <ouyang.qinhua@zte.com.cn>
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Admission control support for versioned configuration files
**What this PR does / why we need it**:
Today, the `--admission-control-config-file=` argument takes an opaque file that is shared across all admission controllers to provide configuration. This file is not well-versioned and it's shared across multiple plug-ins. Some plugins take file based configuration (`ImagePolicyWebhook`) and others abuse flags to provide configuration because we lacked a good example (`InitialResources`). This PR defines a versioned configuration format that we can use moving forward to provide configuration input to admission controllers that is well-versioned, and does not require the addition of new flags.
The sample configuration file would look as follows:
```
apiVersion: componentconfig/v1alpha1
kind: AdmissionConfiguration
plugins:
- name: "ImagePolicyWebhook"
path: "image-policy-webhook.json"
```
The general behavior is each plugin that requires additional configuration is enumerated by name. An alternate file location is provided for its specific configuration, or the configuration can be embedded as a raw extension via the configuration section.
**Special notes for your reviewer**:
A follow-on PR will be needed to make `ImagePolicyWebhook` to use versioned configuration. This PR maintains backwards compatibility by ignoring configuration it cannot understand and therefore treating the file as opaque. I plan to make use of this PR to complete https://github.com/kubernetes/kubernetes/pull/36765 which attempts to allow more configuration parameters to the `ResourceQuota` admission plugin.
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Made cache.Controller to be interface.
**What this PR does / why we need it**:
#37504
Automatic merge from submit-queue
run staging client-go update
Chasing to see what real problems we have in staging-client-go.
@sttts you get similar results?
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39803, 39698, 39537, 39478)
[scheduling] Moved pod affinity and anti-affinity from annotations to api fields #25319
Converted pod affinity and anti-affinity from annotations to api fields
Related: #25319
Related: #34508
**Release note**:
```Pod affinity and anti-affinity has moved from annotations to api fields in the pod spec. Pod affinity or anti-affinity that is defined in the annotations will be ignored.```
Automatic merge from submit-queue (batch tested with PRs 39803, 39698, 39537, 39478)
Use controller interface for everything in config factory
**What this PR does / why we need it**:
We want to replace controller structs with interfaces
- per the TODO in `ControllerInterface`
- Specifically this will make the decoupling from Config and reuse of the scheduler's subcomponents cleaner.
Automatic merge from submit-queue (batch tested with PRs 39230, 39718)
Remove special case for StatefulSets in scheduler
**What this PR does / why we need it**: Removes special case for StatefulSet in scheduler code
/ref: https://github.com/kubernetes/kubernetes/issues/39687
**Special notes for your reviewer**:
**Release note**:
```release-note
Scheduler treats StatefulSet pods as belonging to a single equivalence class.
```
Automatic merge from submit-queue (batch tested with PRs 39684, 39577, 38989, 39534, 39702)
Set PodStatus QOSClass field
This PR continues the work for https://github.com/kubernetes/kubernetes/pull/37968
It converts all local usage of the `qos` package class types to the new API level types (first commit) and sets the pod status QOSClass field in the at pod creation time on the API server in `PrepareForCreate` and in the kubelet in the pod status update path (second commit). This way the pod QOS class is set even if the pod isn't scheduled yet.
Fixes#33255
@ConnorDoyle @derekwaynecarr @vishh
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)
Add support for groups to passwordfile
As we move deployment methods to using RBAC, it is useful to be able to place the admin user in the bootstrap kubeconfig files in a superuser group. The tokencsv file supports specifying group membership, but the basicauth file does not. This adds it for parity.
I plan to update the generated password file to put the admin user in a group (similar to the way https://github.com/kubernetes/kubernetes/pull/39537 puts that user in a group in the token file)
```release-note
--basic-auth-file supports optionally specifying groups in the fourth column of the file
```
Automatic merge from submit-queue (batch tested with PRs 38212, 38792, 39641, 36390, 39005)
Allow node-controller to update node status
ref: #39639
* adds required permissions to node-controller
* fixes typo in role name for pod-garbage-collector role
* adds event watching permissions to persistent volume controller
* adds event permissions to node proxier
Automatic merge from submit-queue (batch tested with PRs 34488, 39511, 39619, 38342, 39491)
Update FitError as a message component into the PodConditionUpdater.
Fixes#20064 , after a roundabout volley of ideas, we ended up digging into existing Conditions for this, rather then a first class API object. This is just a quick sketch of the skeleton minimal implementation, it should pretty much "just work". I'll test it more later today.
Release Note:
```
Histogram data of predicate failures is contained in pod conditions and thus available to users by kubectl commands.
```
Automatic merge from submit-queue
Start moving genericapiserver to staging
This moves `pkg/auth/user` to `staging/k8s.io/genericapiserver/pkg/authentication/user`. I'll open a separate pull into the upstream gengo to support using `import-boss` on vendored folders to support staging.
After we agree this is the correct approach and see everything build, I'll start moving other packages over which don't have k8s.io/kubernetes deps.
@kubernetes/sig-api-machinery-misc @lavalamp
@sttts @caesarxuchao ptal
Automatic merge from submit-queue
Fix comment and optimize code
**What this PR does / why we need it**:
Fix comment and optimize code.
Thanks.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
oidc auth-n plugin: enforce email_verified claim
This change causes the OpenID Connect authenticator to start
enforcing the 'email_verified' claim.
https://openid.net/specs/openid-connect-core-1_0.html#StandardClaims
If the OIDC authenticator uses the 'email' claim as a user's username
and the 'email_verified' is not set to `true`, reject that authentication attempt.
cc @erictune @kubernetes/sig-auth @mlbiam
```release-note
When using OIDC authentication and specifying --oidc-username-claim=email, an `"email_verified":true` claim must be returned from the identity provider.
```
Automatic merge from submit-queue (batch tested with PRs 39075, 39350, 39353)
Move pkg/api.{Context,RequestContextMapper} into pkg/genericapiserver/api/request
**Based on #39350**
Automatic merge from submit-queue
Ensure that assumed pod won't be expired until the end of binding
In case if api server is overloaded and will reply with 429 too many requests error, binding may take longer than ttl of scheduler cache for assumed pods 1199d42210/pkg/client/restclient/request.go (L787-L850)
This problem was mitigated by using this fix e4d215d508 and increased rate limit for api server. But it is possible that it will occur again.
In such cases when api server is overloaded and returns a lot of
429 (too many requests) errors - binding may take a lot of time
to succeed due to retry policy implemented in rest client.
In such events cache ttl for assumed pods wasn't big enough.
In order to minimize probability of such errors ttl for assumed pods
will be counted from the time when binding for particular pod is finished
(either with error or success)
Change-Id: Ib0122f8a76dc57c82f2c7c52497aad1bdd8be411
Automatic merge from submit-queue
Avoid unnecessary memory allocations
Low-hanging fruits in saving memory allocations. During our 5000-node kubemark runs I've see this:
ControllerManager:
- 40.17% k8s.io/kubernetes/pkg/util/system.IsMasterNode
- 19.04% k8s.io/kubernetes/pkg/controller.(*PodControllerRefManager).Classify
Scheduler:
- 42.74% k8s.io/kubernetes/plugin/pkg/scheduler/algrorithm/predicates.(*MaxPDVolumeCountChecker).filterVolumes
This PR is eliminating all of those.
Automatic merge from submit-queue
Optimize pod affinity when predicate
Optimize by returning as early as possible to avoid invoking priorityutil.PodMatchesTermsNamespaceAndSelector.
* Cache OpenID Connect clients to prevent reinitialization
* Don't retry requests in the http.RoundTripper.
* Don't rely on the server not reading POST bodies.
* Don't leak response body FDs.
* Formerly ignored any throttling requests by the server.
* Determine if the id token's expired by inspecting it.
* Similar to logic in golang.org/x/oauth2
* Synchronize around refreshing tokens and persisting the new config.
Automatic merge from submit-queue
Remove two zany unit tests.
These two tests aren't unit tests in the usual sense. We can consider switching them to run as verify checks, but I'm not convinced that they're even necessary.
They essentially work by searching their code for public functions with signatures that look like `FitPredicate`, then they shell out to grep to see that they're used somewhere in the source tree. This will never work in bazel.
Automatic merge from submit-queue
Add use case to service affinity
Also part of nits in refactoring predicates, I found the explanation of `serviceaffinity` in its comment is very hard to understand. So I added example instead here to help user/developer to digest it.
Automatic merge from submit-queue (batch tested with PRs 38154, 38502)
Rename "release_1_5" clientset to just "clientset"
We used to keep multiple releases in the main repo. Now that [client-go](https://github.com/kubernetes/client-go) does the versioning, there is no need to keep releases in the main repo. This PR renames the "release_1_5" clientset to just "clientset", clientset development will be done in this directory.
@kubernetes/sig-api-machinery @deads2k
```release-note
The main repository does not keep multiple releases of clientsets anymore. Please find previous releases at https://github.com/kubernetes/client-go
```
Automatic merge from submit-queue (batch tested with PRs 38648, 38747)
fix typo
**What this PR does / why we need it**:
fix typo.
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 35436, 37090, 38700)
Make iscsi pv/pvc aware of nodiskconflict feature
Being iscsi a `RWO, ROX` volume we should conflict if more than one pod is using same iscsi LUN.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 38315, 38624, 38572, 38544)
scheduler: refactor main entry Run()
The kube-scheduler/app.Run() is the main entry of scheduler program.
It's enormous.
This PR tries to clean it up. Should be more modular and readable.
Automatic merge from submit-queue
Remove json serialization annotations from internal types
fixes#3933
Internal types should never be serialized, and including json serialization tags on them makes it possible to accidentally do that without realizing it.
fixes in this PR:
* types
* [x] remove json tags from internal types
* [x] fix references from serialized types to internal ObjectMeta
* generation
* [x] remove generated json codecs for internal types (they should never be used)
* kubectl
* [x] fix `apply` to operate on versioned object
* [x] fix sorting by field to operate on versioned object
* [x] fix `--record` to build annotation patch using versioned object
* hpa
* [x] fix unmarshaling to internal CustomMetricTargetList in validation
* thirdpartyresources
* [x] fix encoding API responses using internal ObjectMeta
* tests
* [x] fix tests to use versioned objects when checking encoded content
* [x] fix tests passing internal objects to generic printers
follow ups (will open tracking issues or additional PRs):
- [ ] remove json tags from internal kubeconfig types (`kubectl config set` pathfinding needs to work against external type)
- [ ] HPA should version CustomMetricTargetList serialization in annotations
- [ ] revisit how TPR resthandlers encoding objects
- [ ] audit and add tests for printer use (human-readable printer requires internal versions, generic printers require external versions)
- [ ] add static analysis tests preventing new internal types from adding tags
- [ ] add static analysis tests requiring json tags on external types (and enforcing lower-case first letter)
- [ ] add more tests for `kubectl get` exercising known and unknown types with all output options
This change causes the OpenID Connect authenticator to start
enforcing the 'email_verified' claim.
https://openid.net/specs/openid-connect-core-1_0.html#StandardClaims
If the OIDC authenticator uses the 'email' claim as a user's password
and the 'email_verified' holds the value false, reject that
authentication attempt.
If 'email_verified' is true or not present, continue as before.
Automatic merge from submit-queue (batch tested with PRs 38377, 36365, 36648, 37691, 38339)
Do not create selector and namespaces in a loop where possible
With 1000 nodes and 5000 pods (5 pods per node) with anti-affinity a lot of CPU wasted on creating LabelSelector and sets.String (map).
With this change we are able to deploy that number of pods in ~25 minutes. Without - it takes 30 minutes to deploy 500 pods with anti-affinity configured.
failureDomains are only used for PreferredDuringScheduling pod
anti-affinity, which is ignored by PodAffinityChecker.
This unnecessary requirement was making it hard to move
PodAffinityChecker to GeneralPredicates because that would require
passing --failure-domains to both kubelet and kube-controller-manager.
Automatic merge from submit-queue (batch tested with PRs 35101, 38215, 38092)
auth duplicate detect
I think we should not allow people set duplicate tokens in token file or set duplicate usernames in password file. because the default action overwriting the old data may let people misunderstand.
Automatic merge from submit-queue
add default label to rbac bootstrap policy
allow people to retrieve information of bootstrap policy by label :
`kubectl get clusterroles -l key=value`
`kubectl get clusterrolebindings -l key=value`
Automatic merge from submit-queue (batch tested with PRs 38076, 38137, 36882, 37634, 37558)
[scheduler] Use V(10) for anything which may be O(N*P) logging
Fixes#37014
This PR makes sure that logging statements which are capable of being called on a perNode / perPod basis (i.e. non essential ones that will just clog up logs at large scale) are at V(10) level.
I dreamt of a levenstein filter that built a weak map of word frequencies and alerted once log throughput increased w/o varying information content.... but then I woke up and realized this is probably all we really need for now :)
Automatic merge from submit-queue (batch tested with PRs 38111, 38121)
remove rbac super user
Cleaning up cruft and duplicated capabilities as we transition from RBAC alpha to beta. In 1.5, we added a secured loopback connection based on the `system:masters` group name. `system:masters` have full power in the API, so the RBAC super user is superfluous.
The flag will stay in place so that the process can still launch, but it will be disconnected.
@kubernetes/sig-auth
Automatic merge from submit-queue (batch tested with PRs 36816, 37534)
Move pkg/api/unversioned to pkg/apis/meta/v1
This moves code from using pkg/api/unversioned to pkg/apis/meta/v1 with the `metav1` local package name.
Built on top of #37532 (the first three commits related to ExportOptions)
Part of #37530
Automatic merge from submit-queue
plumb in front proxy group header
Builds on https://github.com/kubernetes/kubernetes/pull/36662 and https://github.com/kubernetes/kubernetes/pull/36774, so only the last commit is unique.
This completes the plumbing for front proxy header information and makes it possible to add just the front proxy header authenticator.
WIP because I'm going to assess it in use downstream.
Automatic merge from submit-queue (batch tested with PRs 35300, 36709, 37643, 37813, 37697)
add rbac action to subjects type
This adds the ability to go from an authorization action to the list subjects who have the power to perform the action. This will be used to either back an RBAC specific endpoint or generic authorization endpoint. Because of the way authorization works today, the set of subjects returned will always be a subset of those with access since any authorizer can say yes.
@kubernetes/sig-auth
Automatic merge from submit-queue (batch tested with PRs 37945, 37498, 37391, 37209, 37169)
add controller roles
Upstream controller roles that have downstream.
@sttts this is a start at roles for controllers. I've made names match for now, but they could use some love in both the controller manager and here. I'd recommend using this as a starting point.
Automatic merge from submit-queue (batch tested with PRs 36263, 36755, 37357, 37222, 37524)
Add flag to enable contention profiling in scheduler.
```release-note
Add flag to enable contention profiling in scheduler.
```
Automatic merge from submit-queue
auth delegation role
Add a bootstrap role for authentication and authorization delegation. Useful for extension API servers.
@kubernetes/sig-auth
Automatic merge from submit-queue
Revert "Avoid hard-coding list of Node Conditions"
* we don't know how other API consumers are using node conditions (there was no prior expectation that the scheduler would block on custom conditions)
* not all conditions map directly to schedulability (e.g. `MemoryPressure`/`DiskPressure`)
* not all conditions use True to mean "unschedulable" (e.g. `Ready`)
This reverts commit 511b2ecaa8 to avoid breaking existing API users and to avoid constraining future uses of the node conditions API
Automatic merge from submit-queue
Add authz to psp admission
Add authz integration to PSP admission to enable granting access to use specific PSPs on a per-user and per-service account basis. This allows an administrator to use multiple policies in a cluster that grant different levels of access for different types of users.
Builds on https://github.com/kubernetes/kubernetes/pull/32555. Second commit adds authz check to matching policy function in psp admission.
@deads2k @sttts @timstclair
Automatic merge from submit-queue
specify custom ca file to verify the keystone server
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
Sometimes the keystone server's certificate is self-signed, mainly used for internal development, testing and etc.
For this kind of ca, we need a way to verify the keystone server.
Otherwise, below error will occur.
> x509: certificate signed by unknown authority
This patch provide a way to pass in a ca file to verify the keystone server when starting `kube-apiserver`.
**Which issue this PR fixes** : fixes#22695, #24984
**Special notes for your reviewer**:
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
``` release-note
```
Automatic merge from submit-queue
'Max' and 'MIn' don't seem to used anywhere, so I would suggest removing them
Signed-off-by: Yanqiang Miao miao.yanqiang@zte.com.cn
Automatic merge from submit-queue
Rename ScheduledJobs to CronJobs
I went with @smarterclayton idea of registering named types in schema. This way we can support both the new (CronJobs) and old (ScheduledJobs) resource name. Fixes#32150.
fyi @erictune @caesarxuchao @janetkuo
Not ready yet, but getting close there...
**Release note**:
```release-note
Rename ScheduledJobs to CronJobs.
```
Automatic merge from submit-queue
We only report diskpressure to users, and no longer report inodepressure
See #36180 for more information on why #33218 was reverted.
Automatic merge from submit-queue
Add cmd support to gcp auth provider plugin
**What this PR does / why we need it**:
Adds ability for gcp auth provider plugin to get access token by shelling out to an external command. We need this because for GKE, kubectl should be using gcloud credentials. It currently uses google application default credentials, which causes confusion if user has configured both with different permissions (previously the two were almost always identical).
**Which issue this PR fixes**:
Addresses #35530 with gcp-only solution, as generic cmd plugin was deemed not useful for other providers.
**Special notes for your reviewer**:
Configuration options are to support whatever future command gcloud provides for printing access token of active user. Also works with existing command (`gcloud auth print-access-token`)
```release-note
```
We are more liberal in what we accept as a volume id in k8s, and indeed
we ourselves generate names that look like `aws://<zone>/<id>` for
dynamic volumes.
This volume id (hereafter a KubernetesVolumeID) cannot directly be
compared to an AWS volume ID (hereafter an awsVolumeID).
We introduce types for each, to prevent accidental comparison or
confusion.
Issue #35746
Automatic merge from submit-queue
split scheduler priorities into separate files
In the current state it's really hard to find a thing one is looking for, if he doesn't know already know where to look. cc @davidopp
Automatic merge from submit-queue
Add tooling to generate listers
Add lister-gen tool to auto-generate listers. So far this PR only demonstrates replacing the manually-written `StoreToLimitRangeLister` with the generated `LimitRangeLister`, as it's a small and easy swap.
cc @deads2k @liggitt @sttts @nikhiljindal @lavalamp @smarterclayton @derekwaynecarr @kubernetes/sig-api-machinery @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Update PodAntiAffinity to ignore calls to subresources
@smarterclayton I hit this when I was trying to evict a pod, apparently k8s does not have this particular admission plugin on by default. ptal
@mml @davidopp fyi
Automatic merge from submit-queue
allow authentication through a front-proxy
This allows a front proxy to set a request header and have that be a valid `user.Info` in the authentication chain. To secure this power, a client certificate may be used to confirm the identity of the front proxy
@kubernetes/sig-auth fyi
@erictune per-request
@liggitt you wrote the openshift one, ptal.
Automatic merge from submit-queue
Simplify negotiation in server in preparation for multi version support
This is a pre-factor for #33900 to simplify runtime.NegotiatedSerializer, tighten up a few abstractions that may break when clients can request different client versions, and pave the way for better negotiation.
View this as pure simplification.
Automatic merge from submit-queue
[PHASE 1] Opaque integer resource accounting.
## [PHASE 1] Opaque integer resource accounting.
This change provides a simple way to advertise some amount of arbitrary countable resource for a node in a Kubernetes cluster. Users can consume these resources by including them in pod specs, and the scheduler takes them into account when placing pods on nodes. See the example at the bottom of the PR description for more info.
Summary of changes:
- Defines opaque integer resources as any resource with prefix `pod.alpha.kubernetes.io/opaque-int-resource-`.
- Prevent kubelet from overwriting capacity.
- Handle opaque resources in scheduler.
- Validate integer-ness of opaque int quantities in API server.
- Tests for above.
Feature issue: https://github.com/kubernetes/features/issues/76
Design: http://goo.gl/IoKYP1
Issues:
kubernetes/kubernetes#28312kubernetes/kubernetes#19082
Related:
kubernetes/kubernetes#19080
CC @davidopp @timothysc @balajismaniam
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
Added support for accounting opaque integer resources.
Allows cluster operators to advertise new node-level resources that would be
otherwise unknown to Kubernetes. Users can consume these resources in pod
specs just like CPU and memory. The scheduler takes care of the resource
accounting so that no more than the available amount is simultaneously
allocated to pods.
```
## Usage example
```sh
$ echo '[{"op": "add", "path": "pod.alpha.kubernetes.io~1opaque-int-resource-bananas", "value": "555"}]' | \
> http PATCH http://localhost:8080/api/v1/nodes/localhost.localdomain/status \
> Content-Type:application/json-patch+json
```
```http
HTTP/1.1 200 OK
Content-Type: application/json
Date: Thu, 11 Aug 2016 16:44:55 GMT
Transfer-Encoding: chunked
{
"apiVersion": "v1",
"kind": "Node",
"metadata": {
"annotations": {
"volumes.kubernetes.io/controller-managed-attach-detach": "true"
},
"creationTimestamp": "2016-07-12T04:07:43Z",
"labels": {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"kubernetes.io/hostname": "localhost.localdomain"
},
"name": "localhost.localdomain",
"resourceVersion": "12837",
"selfLink": "/api/v1/nodes/localhost.localdomain/status",
"uid": "2ee9ea1c-47e6-11e6-9fb4-525400659b2e"
},
"spec": {
"externalID": "localhost.localdomain"
},
"status": {
"addresses": [
{
"address": "10.0.2.15",
"type": "LegacyHostIP"
},
{
"address": "10.0.2.15",
"type": "InternalIP"
}
],
"allocatable": {
"alpha.kubernetes.io/nvidia-gpu": "0",
"cpu": "2",
"memory": "8175808Ki",
"pods": "110"
},
"capacity": {
"alpha.kubernetes.io/nvidia-gpu": "0",
"pod.alpha.kubernetes.io/opaque-int-resource-bananas": "555",
"cpu": "2",
"memory": "8175808Ki",
"pods": "110"
},
"conditions": [
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-07-12T04:07:43Z",
"message": "kubelet has sufficient disk space available",
"reason": "KubeletHasSufficientDisk",
"status": "False",
"type": "OutOfDisk"
},
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-07-12T04:07:43Z",
"message": "kubelet has sufficient memory available",
"reason": "KubeletHasSufficientMemory",
"status": "False",
"type": "MemoryPressure"
},
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-08-10T06:27:11Z",
"message": "kubelet is posting ready status",
"reason": "KubeletReady",
"status": "True",
"type": "Ready"
},
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-08-10T06:27:01Z",
"message": "kubelet has no disk pressure",
"reason": "KubeletHasNoDiskPressure",
"status": "False",
"type": "DiskPressure"
}
],
"daemonEndpoints": {
"kubeletEndpoint": {
"Port": 10250
}
},
"images": [],
"nodeInfo": {
"architecture": "amd64",
"bootID": "1f7e95ca-a4c2-490e-8ca2-6621ae1eb5f0",
"containerRuntimeVersion": "docker://1.10.3",
"kernelVersion": "4.5.7-202.fc23.x86_64",
"kubeProxyVersion": "v1.3.0-alpha.4.4285+7e4b86c96110d3-dirty",
"kubeletVersion": "v1.3.0-alpha.4.4285+7e4b86c96110d3-dirty",
"machineID": "cac4063395254bc89d06af5d05322453",
"operatingSystem": "linux",
"osImage": "Fedora 23 (Cloud Edition)",
"systemUUID": "D6EE0782-5DEB-4465-B35D-E54190C5EE96"
}
}
}
```
After patching, the kubelet's next sync fills in allocatable:
```
$ kubectl get node localhost.localdomain -o json | jq .status.allocatable
```
```json
{
"alpha.kubernetes.io/nvidia-gpu": "0",
"pod.alpha.kubernetes.io/opaque-int-resource-bananas": "555",
"cpu": "2",
"memory": "8175808Ki",
"pods": "110"
}
```
Create two pods, one that needs a single banana and another that needs a truck load:
```
$ kubectl create -f chimp.yaml
$ kubectl create -f superchimp.yaml
```
Inspect the scheduler result and pod status:
```
$ kubectl describe pods chimp
Name: chimp
Namespace: default
Node: localhost.localdomain/10.0.2.15
Start Time: Thu, 11 Aug 2016 19:58:46 +0000
Labels: <none>
Status: Running
IP: 172.17.0.2
Controllers: <none>
Containers:
nginx:
Container ID: docker://46ff268f2f9217c59cc49f97cc4f0f085d5ac0e251f508cc08938601117c0cec
Image: nginx:1.10
Image ID: docker://sha256:82e97a2b0390a20107ab1310dea17f539ff6034438099384998fd91fc540b128
Port: 80/TCP
Limits:
cpu: 500m
memory: 64Mi
pod.alpha.kubernetes.io/opaque-int-resource-bananas: 3
Requests:
cpu: 250m
memory: 32Mi
pod.alpha.kubernetes.io/opaque-int-resource-bananas: 1
State: Running
Started: Thu, 11 Aug 2016 19:58:51 +0000
Ready: True
Restart Count: 0
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
No volumes.
QoS Class: Burstable
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
9m 9m 1 {default-scheduler } Normal Scheduled Successfully assigned chimp to localhost.localdomain
9m 9m 2 {kubelet localhost.localdomain} Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
9m 9m 1 {kubelet localhost.localdomain} spec.containers{nginx} Normal Pulled Container image "nginx:1.10" already present on machine
9m 9m 1 {kubelet localhost.localdomain} spec.containers{nginx} Normal Created Created container with docker id 46ff268f2f92
9m 9m 1 {kubelet localhost.localdomain} spec.containers{nginx} Normal Started Started container with docker id 46ff268f2f92
```
```
$ kubectl describe pods superchimp
Name: superchimp
Namespace: default
Node: /
Labels: <none>
Status: Pending
IP:
Controllers: <none>
Containers:
nginx:
Image: nginx:1.10
Port: 80/TCP
Requests:
cpu: 250m
memory: 32Mi
pod.alpha.kubernetes.io/opaque-int-resource-bananas: 10Ki
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
PodScheduled False
No volumes.
QoS Class: Burstable
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
3m 1s 15 {default-scheduler } Warning FailedScheduling pod (superchimp) failed to fit in any node
fit failure on node (localhost.localdomain): Insufficient pod.alpha.kubernetes.io/opaque-int-resource-bananas
```
Automatic merge from submit-queue
Add sync state loop in master's volume reconciler
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.
This PR adds the logic to periodically sync up the truth for attached
volumes kept in
the actual state cache. If the volume is no longer attached to the node,
the actual state will be updated to reflect the truth. In turn,
reconciler will take actions if needed.
To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.
- Prevents kubelet from overwriting capacity during sync.
- Handles opaque integer resources in the scheduler.
- Adds scheduler predicate tests for opaque resources.
- Validates opaque int resources:
- Ensures supplied opaque int quantities in node capacity,
node allocatable, pod request and pod limit are integers.
- Adds tests for new validation logic (node update and pod spec).
- Added e2e tests for opaque integer resources.
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.
This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If the volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, reconciler will take actions if needed.
To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.
More details are explained in PR #33760
Alter how runtime.SerializeInfo is represented to simplify negotiation
and reduce the need to allocate during negotiation. Simplify the dynamic
client's logic around negotiating type. Add more tests for media type
handling where necessary.
Automatic merge from submit-queue
Make "Unschedulable" reason a constant in api
String "Unschedulable" is used in couple places in K8S:
* scheduler
* federation replicaset and deployment controllers
* cluster autoscaler
* rescheduler
This PR makes the string a part of API so it not changed.
cc: @davidopp @fgrzadkowski @wojtek-t
Automatic merge from submit-queue
Create restclient interface
Refactoring of code to allow replace *restclient.RESTClient with any RESTClient implementation that implements restclient.RESTClientInterface interface.
Automatic merge from submit-queue
Predicate cacheing and cleanup
Fix to #31795
First pass @ cleanup and caching of the CheckServiceAffinity function.
The cleanup IMO is necessary because the logic around the pod listing and the use of the "implicit selector" (which is reverse engineered to enable the homogenous pod groups).
Should still pass the E2Es.
@timothysc @wojtek-t
Comments addressed, Make emptyMetadataProducer a func to avoid casting,
FakeSvcLister: remove error return for len(svc)=0. New test for
predicatePrecomp to make method semantics explictly enforced when meta
is missing. Precompute wrapper.
Automatic merge from submit-queue
PVC informer lister supports listing
This will be used in follow-on PRs for quota evaluation backed by informers for pvcs.
/cc @deads2k @eparis @kubernetes/sig-storage
Automatic merge from submit-queue
Adding default StorageClass annotation printout for resource_printer and describer and some refactoring
adding ISDEFAULT for _kubectl get storageclass_ output
```
[root@screeley-sc1 gce]# kubectl get storageclass
NAME TYPE ISDEFAULT
another-class kubernetes.io/gce-pd NO
generic1-slow kubernetes.io/gce-pd YES
generic2-fast kubernetes.io/gce-pd YES
```
```release-note
Add ISDEFAULT to kubectl get storageClass output
```
@kubernetes/sig-storage
Most normal codec use should perform defaulting. DirectCodecs should not
perform defaulting. Update the defaulting_test to fuzz the list of known
defaulters. Use the new versioning.NewDefaultingCodec() method.
Automatic merge from submit-queue
Add PSP support for seccomp profiles
Seccomp support for PSP. There are still a couple of TODOs that need to be fixed but this is passing tests.
One thing of note, since seccomp is all being stored in annotations right now it breaks some of the assumptions we've stated for the provider in terms of mutating the passed in pod. I've put big warning comments around the pieces that do that to make sure it's clear and covered the rollback in admission if the policy fails to validate.
@sttts @pmorie @erictune @smarterclayton @liggitt
Automatic merge from submit-queue
Refactor scheduler to enable switch on equivalence cache
Part 2 of #30844
Refactoring to enable easier switch on of equivalence cache.
Automatic merge from submit-queue
add ownerref permission checks
Adds an admission plugin that ensures that anyone adding an `ownerReference` to a resource has delete rights on the resource they're setting up a delete for.
@caesarxuchao example admission plugin that tests for ownerReference diffs and uses an authorizer to drive the decision.
@liggitt @ncdc we've talked about this before