Automatic merge from submit-queue (batch tested with PRs 49316, 46117, 49064, 48073, 49323)
add e2e tests for the bootstrapsigner and tokencleaner controllers, integration testing for bootstrap token auth
**What this PR does / why we need it**:
Add e2e test for bootstrap signer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
None
```
Automatic merge from submit-queue (batch tested with PRs 49316, 46117, 49064, 48073, 49323)
Modular extensions for kube scheduler perf testing framework
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#45973
**Special notes for your reviewer**:
It is not same as the existing one, the previous one has a single nodeaffinity key with multiple values. This one has multiple keys, values.
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
tighten quota controller interface
While debugging a quota performance problem, I had to chase some references deeper than necessary because the interfaces were overly broad. This tightens them.
```release-note
NONE
```
Automatic merge from submit-queue
Add PriorityClass API object under new "scheduling" API group
**What this PR does / why we need it**: This PR is a part of a series of PRs to add pod priority to Kubernetes. This PR adds a new API group called "scheduling" with a new API object called "PriorityClass". PriorityClass maps the string value of priority to its integer value.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: Given the size of this PR, I will add the admission controller for the PriorityClass in a separate PR.
**Release note**:
```release-note
Add PriorityClass API object under new "scheduling" API group
```
ref/ #47604
ref/ #48646
Automatic merge from submit-queue (batch tested with PRs 49058, 49072, 49137, 49182, 49045)
*: remove --insecure-allow-any-token option
~Since the authenticator is still used in e2e tests, don't remove
the actual package. Maybe a follow up?~
edit: e2e and integration tests have been switched over to the tokenfile
authenticator instead.
```release-note
The --insecure-allow-any-token flag has been removed from kube-apiserver. Users of the flag should use impersonation headers instead for debugging.
```
closes#49031
cc @kubernetes/sig-auth-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 49116, 49095)
Move pkg/api/v1/ref -> client-go/tools/reference
`pkg/api/v1/ref` is the only remaining package copied from pkg/api/v1 to client-go via staging/copy.sh.
# The first commit's message is:
Modular extensions for kube scheduler perf testing framework
# This is the 2nd commit message:
Modular extensions for kube scheduler perf testing framework
Add PriorityClass to pkg/registry
Add PriorityClass to pkg/master/master.go
Add PriorityClass to import_know_versions.go
Update linted packages
minor fix
e2e and integration tests have been switched over to the tokenfile
authenticator instead.
```release-note
The --insecure-allow-any-token flag has been removed from kube-apiserver. Users of the flag should use impersonation headers instead for debugging.
```
Automatic merge from submit-queue
Remove old, core/v1 specific constructs from RESTClient
Now that metav1 is abstracted from the APIs, RESTClient should also be agnostic to the core API.
* Remove `LabelSelectorParam` and `FieldSelectorParam` - use `VersionedParams` with `ListOptions`
* Remove `UintParam`
* Remove all legacy field selector logic from `VersionedParams` - ParameterCodec now handles that
* Remove special parameters (like `timeout`) which is no longer set by most clients
Automatic merge from submit-queue (batch tested with PRs 47417, 47638, 46930)
Added scheduler integration test owners.
**What this PR does / why we need it**:
Add OWNER file into scheduler integration test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes # N/A
**Release note**:
```release-note-none
```
Automatic merge from submit-queue (batch tested with PRs 48497, 48604, 48599, 48560, 48546)
remove dead code
This removes the dead code cruft since we stopped serving TPRs.
ref #48152
Automatic merge from submit-queue
Small fix for number of pods and nodes in test function
**What this PR does / why we need it**:
Small fix to have correct pods and nodes in test function.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Scheduler perf, nodes and pods number fix for 100 nodes 3k pods.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add ownership for the future of scheduler_perf and kubemark
**What this PR does / why we need it**:
The scheduler_perf project is cross-cutting with the other goals of the performance and scale initiatives, so, I've put together a list of interested parties who have been running, using, and contributing to it.
cc @kubernetes/sig-scheduling-pr-reviews @ravisantoshgudimetla @sjug
Automatic merge from submit-queue (batch tested with PRs 47162, 48444, 48445)
Introducing a cluster-scoped resource in the wardle.k8s.io group.
**What this PR does / why we need it**:
This PR adds a cluster-scoped resource to the wardle.k8s.io group.
The cluster scoped resource has a field that indicates Flunder.Names that are disallowed.
The resource is going to be used by an admission plugin.
The admission plugin will list the cluster-scope resources and check against banned names.
**Special notes for your reviewer**:
Issue: #47868
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48480, 48353)
remove tpr api access
xref https://github.com/kubernetes/kubernetes/issues/48152
TPR tentacles go pretty deep. This gets us started by removing API access and we'll move down from there.
@kubernetes/sig-api-machinery-misc
@ironcladlou this should free up the GC implementation since TPRs will no longer be present and failing.
```release-note
Removing TPR api access per https://github.com/kubernetes/kubernetes/issues/48152
```
The cluster scoped resource has a field that indicates Flunder.Names that are disallowed.
The resource is going to be used by an admission plugin.
The admission plugin will list the cluster-scope resources and check against banned names.
Issue: #47868
Internal attach/detach controller timers should be configurable and tests
should use much shorter values.
reconcilerSyncDuration is deliberately left out of TimerConfig because it's
the only one that's not a constant one, it's configurable by user.
Automatic merge from submit-queue (batch tested with PRs 46750, 47141)
Speed up volume integration test
Partly solves https://github.com/kubernetes/kubernetes/issues/47129 .
On my local box:
before - 7m56.751s
after - 5m53.132s
So approx. 2m time saving. More saving will require refactoring of attach detach controller.
cc @mikedanese
Automatic merge from submit-queue (batch tested with PRs 36376, 47251)
client-go: GetOptions for dynamic client
Looks like `GetOptions` were forgotten in the dynamic client. Without them it's hard to write a dynamic initializer controller (useful for custom resources).
Automatic merge from submit-queue (batch tested with PRs 46718, 46828, 46988)
Update docs/ links to point to main site
**What this PR does / why we need it**:
This updates various links to either point to kubernetes.io or to the kubernetes/community repo instead of the legacy docs/ tree in k/k
Pre-requisite for #46813
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
@kubernetes/sig-docs-maintainers @chenopis @ahmetb @thockin
Automatic merge from submit-queue (batch tested with PRs 46681, 46786, 46264, 46680, 46805)
Enable Dialer on the Aggregator
Centralize the creation of the dialer during startup.
Have the dialer then passed in to both APIServer and Aggregator.
Aggregator the uses the dialer as its Transport base.
**What this PR does / why we need it**:Enables the Aggregator to use the Dialer/SSHTunneler to connect to the user-apiserver.
**Which issue this PR fixes** : fixes ##46679
**Special notes for your reviewer**:
**Release note**: None
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
Dynamic webhook admission control plugin
Unit tests pass.
Needs plumbing:
* [ ] service resolver (depends on @wfender PR)
* [x] client cert (depends on ????)
* [ ] hook source (depends on @caesarxuchao PR)
Also at least one thing will need to be renamed after Chao's PR merges.
```release-note
Allow remote admission controllers to be dynamically added and removed by administrators. External admission controllers make an HTTP POST containing details of the requested action which the service can approve or reject.
```
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
move labels to components which own the APIs
During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery. They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet. This pull moves the labels into their owning components and out of API machinery.
@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers
@derekwaynecarr since most of these are related to the kubelet
Automatic merge from submit-queue (batch tested with PRs 43505, 45168, 46439, 46677, 46623)
Add TPR to CRD migration helper.
This is a helper for migrating TPR data to CustomResource. It's rather hacky because it requires crossing apiserver boundaries, but doing it this way keeps the mess contained to the TPR code, which is scheduled for deletion anyway.
It's also not completely hands-free because making it resilient enough to be completely automated is too involved to be worth it for an alpha-to-beta migration, and would require investing significant effort to fix up soon-to-be-deleted TPR code. Instead, this feature will be documented as a best-effort helper whose results should be verified by hand.
The intended benefit of this over a totally manual process is that it should be possible to copy TPR data into a CRD without having to tear everything down in the middle. The process would look like this:
1. Upgrade to k8s 1.7. Nothing happens to your TPRs.
1. Create CRD with group/version and resource names that match the TPR. Still nothing happens to your TPRs, as the CRD is hidden by the overlapping TPR.
1. Delete the TPR. The TPR data is converted to CustomResource data, and the CRD begins serving at the same REST path.
Note that the old TPR data is left behind by this process, so watchers should not receive DELETE events. This also means the user can revert to the pre-migration state by recreating the TPR definition.
Ref. https://github.com/kubernetes/kubernetes/issues/45728
Centralize the creation of the dialer during startup.
Have the dialer then passed in to both APIServer and Aggregator.
Aggregator the sets the dialer on its Transport base.
This should allow the SSTunnel to be used but also allow the Aggregation
Auth to work with it.
Depending on Environment InsecureSkipTLSVerify *may* need to be set to
true.
Fixed as few tests to call CreateDialer as part of start-up.
Automatic merge from submit-queue (batch tested with PRs 45913, 46065, 46352, 46363, 46373)
Detect cohabitating resources in etcd storage test
**What this PR does / why we need it**:
This change updates the etcd storage path test to detect cohabitating resources by looking at their expected location in etcd. This was not detected in the past because the GVK check did not span across groups.
To limit noise from failures caused by multiple objects at the same location in etcd, the test now fails when different GVRs share the same expected path. Thus every object is expected to have a unique path.
@liggitt PTAL
Signed-off-by: Monis Khan <mkhan@redhat.com>
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
GC: update required verbs for deletable resources, allow list of ignored resources to be customized
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
Also allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
cc @caesarxuchao @deads2k @smarterclayton @mfojtik @liggitt @sttts @kubernetes/sig-api-machinery-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)
PDB Max Unavailable Field
Completes https://github.com/kubernetes/features/issues/285
```release-note
Adds a MaxUnavailable field to PodDisruptionBudget
```
Individual commits are self-contained; Last commit can be ignored because it is autogenerated code.
cc @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews
Allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)
Scheduler should use a shared informer, and fix broken watch behavior for cached watches
Can be used either from a true shared informer or a local shared
informer created just for the scheduler.
Fixes a bug in the cache watcher where we were returning the "current" object from a watch event, not the historic event. This means that we broke behavior when introducing the watch cache. This may have API implications for filtering watch consumers - but on the other hand, it prevents clients filtering from seeing objects outside of their watch correctly, which can lead to other subtle bugs.
```release-note
The behavior of some watch calls to the server when filtering on fields was incorrect. If watching objects with a filter, when an update was made that no longer matched the filter a DELETE event was correctly sent. However, the object that was returned by that delete was not the (correct) version before the update, but instead, the newer version. That meant the new object was not matched by the filter. This was a regression from behavior between cached watches on the server side and uncached watches, and thus broke downstream API clients.
```
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)
Use shared informers in gc controller if possible
Modify the garbage collector controller to try to use shared informers for resources, if possible, to reduce the number of unique reflectors listing and watching the same thing.
cc @kubernetes/sig-api-machinery-pr-reviews @caesarxuchao @deads2k @liggitt @sttts @smarterclayton @timothysc @soltysh @kargakis @kubernetes/rh-cluster-infra @derekwaynecarr @wojtek-t @gmarek
This change updates the etcd storage path test to detect cohabitating
resources by looking at their expected location in etcd. This was not
detected in the past because the GVK check did not span across groups.
To limit noise from failures caused by multiple objects at the same
location in etcd, the test now fails when different GVRs share the same
expected path. Thus every object is expected to have a unique path.
Signed-off-by: Monis Khan <mkhan@redhat.com>
Previously, the scheduler created two separate list watchers. This
changes the scheduler to be able to leverage a shared informer, whether
passed in externally or spawned using the new in place method. This
removes the last use of a "special" informer in the codebase.
Allows someone wrapping the scheduler to use a shared informer if they
have more information avaliable.
Tokens controller previously needed a bit of extra help in order to be
safe for concurrent use. The new MutationCache allows it to keep a local
cache and still use a shared informer. The filtering event handler lets
it only see changes to secrets it cares about.
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)
Add integration test for deployment
We have no integration test for Deployment currently. In this PR, add an integration test which covers an e2e test (create a new RollingUpdate deployment), add more replicas to the Deployment, and set minReadySeconds so that we can test maxUnavailable.
Plan to add more integration tests that cover e2e tests after this initial PR is merged.
@kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)
[Federation] Add integration testing for cluster addition
This PR adds integration testing of the sync controller for cluster addition. This ensures coverage equivalency between the integration tests and the old controller unit tests, so those tests are removed by this PR.
Resolves#45257
cc: @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 45247, 45810, 45034, 45898, 45899)
Apiregistration v1alpha1→v1beta1
Promoting apiregistration api from v1alpha1 to v1beta1.
API Registration is responsible for registering an API `Group`/`Version` with
another kubernetes like API server. The `APIService` holds information
about the other API server in `APIServiceSpec` type as well as general
`TypeMeta` and `ObjectMeta`. The `APIServiceSpec` type have the main
configuration needed to do the aggregation. Any request coming for
specified `Group`/`Version` will be directed to the service defined by
`ServiceReference` (on port 443) after validating the target using provided
`CABundle` or skipping validation if development flag `InsecureSkipTLSVerify`
is set. `Priority` is controlling the order of this API group in the overall
discovery document.
The return status is a set of conditions for this aggregation. Currently
there is only one condition named "Available", if true, it means the
api/server requests will be redirected to specified API server.
```release-note
API Registration is now in beta.
```
ApplyTo adds the admission chain to the server configuration the method lazily initializes a generic plugin
that is appended to the list of pluginInitializers.
apiserver.Config will hold an instance of SharedInformerFactory to ensure we only have once instance.
The field will be initialized in apisever.SecureServingOptions
Automatic merge from submit-queue (batch tested with PRs 45481, 45463)
ThirdPartyResource example: added watcher example, code cleanup
**NOTE**: This is a cleaned and updated version of PR https://github.com/kubernetes/kubernetes/pull/43027
**What this PR does / why we need it**:
An example of using go-client for watching on ThirdPartyResource events (create/update/delete).
Automatic merge from submit-queue (batch tested with PRs 41903, 45311, 45474, 45472, 45501)
Removed old scheduler constructor.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes # N/A
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
Fix golint verify error
I don't know why CI pass the hack/verify-golint.sh test.
But in my environment I get this:
> staging/src/k8s.io/client-go/util/workqueue/queue_test.go is in package workqueue_test, not workqueue
Errors from golint:
test/integration/apiserver/apiserver_test.go:102:13: should omit type string from declaration of var cascDel; it will be inferred from the right-hand side
Please fix the above errors. You can test via "golint" and commit the result.
!!! Error in hack/verify-golint.sh:98
Error in hack/verify-golint.sh:98. 'false' exited with status 1
Call stack:
1: hack/verify-golint.sh:98 main(...)
Exiting with status 1
This change fix this err in my environment.
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 45120, 45243)
skip integration test when run make bazel-test
we should opt for a seperate target for integration tests. This is breaking @deads2k who is trying to add an integration test in staging.
Automatic merge from submit-queue
don't use build tags to mark integration tests
This is a bad pattern that leads to checked in code that isn't check for compilation. We should avoid this if it doesn't provide value, which it seems like it doesn't.
`/var/run` is not world-writable on my OSX 10.11.x setup, so tests that
standup a secure apiserver fail with the default cert dir. Use a
tempdir instead.
Automatic merge from submit-queue
Scheduler configurator looks for a specific key in ConfigMap.Data
**What this PR does / why we need it**: Changes scheduler configurator to look for a specific key in ConfigMap.Data instead of the old logic which expected only one entry to exist in the map. The key is a constant whose value is "policy.cfg".
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
TestEtcdStoragePath prevents the accidental movement of objects stored
in etcd. It creates a stub of each object and then checks the expected
location in etcd. Inadvertent GroupVersionKind changes are prevented.
Signed-off-by: Monis Khan <mkhan@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 44469, 44566, 44467, 44526)
WaitForCacheSync before running attachdetach controller
@gnufied you wrote the test and @ncdc the TODO comment. Let's just run the pv and pvc informers, we do not care about them in this test. But we want to be able to stop the pod Informer at will, hence not just using informers.Start, is my understanding.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44569, 44398)
Move v1/refs and v1/resource
This PR moves pkg/api/v1/ref.go and pkg/api/v1/resource_helper.go to their own sub packages, it's very similar to 44299 and 44302.
The PR is mostly mechanical, except that
* i moved some utility function from resource.go to pkg/api/v1/pod and pkg/api/v1/node, as they are more appropriate
* i updated the staging/copy.sh to copy the new subpackages, so that helper functions are copied. We can get rid of this copy after client-go stops copying API types.
Automatic merge from submit-queue
Exit from NewController() for PersistentVolumeController when InitPlugins() failed
Exit from NewController() for PersistentVolumeController when InitPlugins() failed just like NewAttachDetachController() does
**Release note**:
```release-note
NONE
```
@jsafrane @saad-ali PTAL. Thanks in advance
Automatic merge from submit-queue (batch tested with PRs 44019, 42225)
federation: Fixing runtime-config support for federation-apiserver
Fixes https://github.com/kubernetes/kubernetes/issues/42587
Ref https://github.com/kubernetes/kubernetes/issues/38593
Fixing the broken `--runtime-config` flag support in federation-apiserver. Fixing the bugs and using it to disable batch and autoscaling groups. Users can enable them by passing `--runtime-config=apis/all=true` to federation-apiserver.
~This also includes a bug fix to kube-apiserver registry that allows users to disable api/v1 resources~
cc @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue
[Federation] Add integration test for secrets
This PR adds an integration test for secrets that:
- performs create/read/update/delete on federation resources and validates that the changes are propagated to member clusters.
- uses an abstraction layer (fixture and adapter) to minimize the code required to support each federated type
- It should be possible to replace a test-specific adapter with a runtime adapter in the future (as per #41050)
- reuses fixture (federation api and clusters) across different resource types to minimize setup overhead
- on a fast machine, setup takes ~4s, and validating each type takes ~2s
- uses the [Subtest feature added in Go 1.7](https://blog.golang.org/subtests) to allow the test for a specific controller to be run in isolation
- ``make test-integration WHAT="federation -test.run=TestFederationCRUD/secret"``
Once this PR merges the test can be extended to target other federated types.
This PR targets #40705
cc: @kubernetes/sig-federation-pr-reviews @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 42835, 42974)
remove legacy insecure port options from genericapiserver
The insecure port has been a source of problems and it will prevent proper aggregation into a cluster, so the genericapiserver has no need for it. In addition, there's no reason for it to be in the main kube-apiserver flow either. This pull removes it from genericapiserver and removes it from the shared kube-apiserver code. It's still wired up in the command, but its no longer possible for someone to mess up and start using in mainline code.
@kubernetes/sig-api-machinery-misc @ncdc
Automatic merge from submit-queue
add local option to APIService
APIServices need an option to avoid proxying in cases where the groupversion is handled later in the chain. This will allow a coherent and complete set of APIServices, but won't require extra connections.
@kubernetes/sig-api-machinery-misc @ncdc @cheftako
Automatic merge from submit-queue (batch tested with PRs 40964, 42967, 43091, 43115)
fixes dswp flake
Sometimes a pod may not appear in desired state
of world immediately, we poll before failing.
It only adds additional 30s to tests in worst case.
Fixes#42990
cc @jingxu97
Automatic merge from submit-queue (batch tested with PRs 42728, 42278)
[Federation] Create integration test fixture for api
This PR factors a reusable fixture for the federation api server out of the existing integration test.
Targets #40705
cc: @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue
Use namespace from context
Fixes#42653
Updates rbac_test.go to submit objects without namespaces set, which matches how actual objects are submitted to the API.
Automatic merge from submit-queue (batch tested with PRs 41984, 41682, 41924, 41928)
RC/RS: Fully Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings ReplicaSet and ReplicationController into full compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
Although RC/RS had partially implemented ControllerRef, they didn't use it to determine which controller to sync, or to update expectations. This could lead to instability or controllers getting stuck.
Ref: https://github.com/kubernetes/kubernetes/issues/24433
**Special notes for your reviewer**:
**Release note**:
```release-note
```
cc @erictune @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41597, 42185, 42075, 42178, 41705)
auto discovery CA for extension API servers
This is what the smaller pulls were leading to. Only the last commit is unique and I expect I'll still tweak some pod definitions, but this is where I was going.
@sttts @liggitt
Automatic merge from submit-queue (batch tested with PRs 42200, 39535, 41708, 41487, 41335)
Add support for statefulset spreading to the scheduler
**What this PR does / why we need it**:
The scheduler SelectorSpread priority funtion didn't have the code to spread pods of StatefulSets. This PR adds StatefulSets to the list of controllers that SelectorSpread supports.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#41513
**Special notes for your reviewer**:
**Release note**:
```release-note
Add the support to the scheduler for spreading pods of StatefulSets.
```
Automatic merge from submit-queue
clean up generic apiserver options
Clean up generic apiserver options before we tag any levels. This makes them more in-line with "normal" api servers running on the platform.
Also remove dead example code.
@sttts
Automatic merge from submit-queue (batch tested with PRs 41954, 40528, 41875, 41165, 41877)
Updating apiserver to return 202 when resource is being deleted asynchronously via cascading deletion
As per https://github.com/kubernetes/kubernetes/issues/33196#issuecomment-278440622.
cc @kubernetes/sig-api-machinery-pr-reviews @smarterclayton @caesarxuchao @bgrant0607 @kubernetes/api-reviewers
```release-note
Updating apiserver to return http status code 202 for a delete request when the resource is not immediately deleted because of user requesting cascading deletion using DeleteOptions.OrphanDependents=false.
```
Automatic merge from submit-queue (batch tested with PRs 41701, 41818, 41897, 41119, 41562)
Optimization of on-wire information sent to scheduler extender interfaces that are stateful
The changes are to address the issue described in https://github.com/kubernetes/kubernetes/issues/40991
```release-note
Added support to minimize sending verbose node information to scheduler extender by sending only node names and expecting extenders to cache the rest of the node information
```
cc @ravigadde
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)
Add storage.k8s.io/v1 API
v1 API is direct copy of v1beta1 API. This v1 API gets installed and exposed in this PR, I tested that kubectl can create both v1beta1 and v1 StorageClass.
~~Rest of Kubernetes (controllers, examples,. tests, ...) still use v1beta1 API, I will update it when this PR gets merged as these changes would get lost among generated code.~~ Most parts use v1 API now, it would not compile / run tests without it.
**Release note**:
```
Kubernetes API storage.k8s.io for storage objects is now fully supported and is available as storage.k8s.io/v1. Beta version of the API storage.k8s.io/v1beta1 is still available in this release, however it will be removed in a future Kubernetes release.
Together with the API endpoint, StorageClass annotation "storageclass.beta.kubernetes.io/is-default-class" is deprecated and "storageclass.kubernetes.io/is-default-class" should be used instead to mark a default storage class. The beta annotation is still working in this release, however it won't be supported in the next one.
```
@kubernetes/sig-storage-misc
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)
Switch scheduler to use generated listers/informers
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
I think this can wait until master is open for 1.7 pulls, given that we're close to the 1.6 freeze.
After this and #41482 go in, the only code left that references legacylisters will be federation, and 1 bit in a stateful set unit test (which I'll clean up in a follow-up).
@resouer I imagine this will conflict with your equivalence class work, so one of us will be doing some rebasing 😄
cc @wojtek-t @gmarek @timothysc @jayunit100 @smarterclayton @deads2k @liggitt @sttts @derekwaynecarr @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41667, 41820, 40910, 41645, 41361)
Switch admission to use shared informers
Originally part of #40097
cc @smarterclayton @derekwaynecarr @deads2k @liggitt @sttts @gmarek @wojtek-t @timothysc @lavalamp @kubernetes/sig-scalability-pr-reviews @kubernetes/sig-api-machinery-pr-reviews
node cache don't need to get all the information about every
candidate node. For huge clusters, sending full node information
of all nodes in the cluster on the wire every time a pod is scheduled
is expensive. If the scheduler is capable of caching node information
along with its capabilities, sending node name alone is sufficient.
These changes provide that optimization in a backward compatible way
- removed the inadvertent signature change of Prioritize() function
- added defensive checks as suggested
- added annotation specific test case
- updated the comments in the scheduler types
- got rid of apiVersion thats unused
- using *v1.NodeList as suggested
- backing out pod annotation update related changes made in the
1st commit
- Adjusted the comments in types.go and v1/types.go as suggested
in the code review
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
Automatic merge from submit-queue (batch tested with PRs 41421, 41440, 36765, 41722)
Use watch param instead of deprecated /watch/ prefix
Switches clients to use watch param instead of /watch/ prefix
```release-note
Clients now use the `?watch=true` parameter to make watch API calls, instead of the `/watch/` path prefix
```
Automatic merge from submit-queue (batch tested with PRs 41401, 41195, 41664, 41521, 41651)
Remove default failure domains from anti-affinity feature
Removing it is necessary to make performance of this feature acceptable at some point.
With default failure domains (or in general when multiple topology keys are possible), we don't have transitivity between node belonging to a topology. And without this, it's pretty much impossible to solve this effectively.
@timothysc
Automatic merge from submit-queue (batch tested with PRs 41604, 41273, 41547)
Switch pv controller to shared informer
This is WIP because I still need to do something with bazel? and add 'get storageclasses' to the controller-manager rbac role
@jsafrane PTAL and make sure I did not break anything in the PV controller. Do we need to clone the volumes/claims we get from the shared informer before we use them? I could not find a place where we modify them but you would know for certain.
cc @ncdc because I copied what you did in your other PRs.
Automatic merge from submit-queue (batch tested with PRs 41517, 41494, 41163)
Make scheduler_perf warn rather then fail if scheduling rate is betwe…
**What this PR does / why we need it**
Fix scheduler per bench tests so they pass even if there is a cold interval .
**Special notes for your reviewer**:
This wont effect CI, as this bench test is run manually by devs.
`/var/run` is not world-writable on my OSX 10.11.x setup, so tests that
standup a secure apiserver fail with the default cert dir. Use a
tempdir instead.
Automatic merge from submit-queue
Switch serviceaccounts controller to generated shared informers
Originally part of #40097
cc @deads2k @sttts @liggitt @smarterclayton @gmarek @wojtek-t @timothysc @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Switch resourcequota controller to shared informers
Originally part of #40097
I have had some issues with this change in the past, when I updated `pkg/quota` to use the new informers while `pkg/controller/resourcequota` remained on the old informers. In this PR, both are switched to using the new informers. The issues in the past were lots of flakey test failures in the ResourceQuota e2es, where it would randomly fail to see deletions and handle replenishment. I am hoping that now that everything here is consistently using the new informers, there won't be any more of these flakes, but it's something to keep an eye out for.
I also think `pkg/controller/resourcequota` could be cleaned up. I don't think there's really any need for `replenishment_controller.go` any more since it's no longer running individual controllers per kind to replenish. It instead just uses the shared informer and adds event handlers to it. But maybe we do that in a follow up.
cc @derekwaynecarr @smarterclayton @wojtek-t @deads2k @sttts @liggitt @timothysc @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41248, 41214)
Switch hpa controller to shared informer
**What this PR does / why we need it**: switch the hpa controller to use a shared informer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: Only the last commit is relevant. The others are from #40759, #41114, #41148
**Release note**:
```release-note
```
cc @smarterclayton @deads2k @sttts @liggitt @DirectXMan12 @timothysc @kubernetes/sig-scalability-pr-reviews @jszczepkowski @mwielgus @piosz
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
Implement TTL controller and use the ttl annotation attached to node in secret manager
For every secret attached to a pod as volume, Kubelet is trying to refresh it every sync period. Currently Kubelet has a ttl-cache of secrets of its pods and the ttl is set to 1 minute. That means that in large clusters we are targetting (5k nodes, 30pods/node), given that each pod has a secret associated with ServiceAccount from its namespaces, and with large enough number of namespaces (where on each node (almost) every pod is from a different namespace), that resource in ~30 GETs to refresh all secrets every minute from one node, which gives ~2500QPS for GET secrets to apiserver.
Apiserver cannot keep up with it very easily.
Desired solution would be to watch for secret changes, but because of security we don't want a node watching for all secrets, and it is not possible for now to watch only for secrets attached to pods from my node.
So as a temporary solution, we are introducing an annotation that would be a suggestion for kubelet for the TTL of secrets in the cache and a very simple controller that would be setting this annotation based on the cluster size (the large cluster is, the bigger ttl is).
That workaround mean that only very local changes are needed in Kubelet, we are creating a well separated very simple controller, and once watching "my secrets" will be possible it will be easy to remove it and switch to that. And it will allow us to reach scalability goals.
@dchen1107 @thockin @liggitt
Automatic merge from submit-queue (batch tested with PRs 40917, 41181, 41123, 36592, 41183)
fix scheduler performance test script
**What this PR does / why we need it**:
`test-performance.sh` is in dir `kubernetes/test/integration/scheduler_perf`
the dir `kubernetes/test/component/scheduler/perf` does not exist
Thanks.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 41023, 41031, 40947)
scrub aggregator names to eliminate discovery
Cleanup old uses of `discovery`. Also removes the legacy functionality.
@kubernetes/sig-api-machinery-misc @sttts
Automatic merge from submit-queue
Replace hand-written informers with generated ones
Replace existing uses of hand-written informers with generated ones.
Follow-up commits will switch the use of one-off informers to shared
informers.
This is a precursor to #40097. That PR will switch one-off informers to shared informers for the majority of the code base (but not quite all of it...).
NOTE: this does create a second set of shared informers in the kube-controller-manager. This will be resolved back down to a single factory once #40097 is reviewed and merged.
There are a couple of places where I expanded the # of caches we wait for in the calls to `WaitForCacheSync` - please pay attention to those. I also added in a commented-out wait in the attach/detach controller. If @kubernetes/sig-storage-pr-reviews is ok with enabling the waiting, I'll do it (I'll just need to tweak an integration test slightly).
@deads2k @sttts @smarterclayton @liggitt @soltysh @timothysc @lavalamp @wojtek-t @gmarek @sjenning @derekwaynecarr @kubernetes/sig-scalability-pr-reviews