Improve GC discovery sync performance by only syncing when discovered
resource diffs are detected. Before, the GC worker pool was shut down
and monitors resynced unconditionally every sync period, leading to
significant processing delays and causing test flakes in which otherwise
reasonable GC timeouts were exceeded.
Related to https://github.com/kubernetes/kubernetes/issues/49966.
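Conceptually, the change gates the expensive resync on a set comparison of the discovered resources. A minimal sketch of that idea in Go follows; it is not the actual garbage collector code, and `syncNeeded` is a hypothetical helper.
```go
package main

import (
	"fmt"
	"reflect"

	"k8s.io/apimachinery/pkg/runtime/schema"
)

type resourceSet map[schema.GroupVersionResource]struct{}

// syncNeeded reports whether the newly discovered deletable resources differ
// from the ones currently monitored; only then is a worker-pool shutdown and
// monitor resync worth paying for.
func syncNeeded(current, discovered resourceSet) bool {
	return !reflect.DeepEqual(current, discovered)
}

func main() {
	current := resourceSet{
		{Group: "apps", Version: "v1", Resource: "deployments"}: {},
	}
	discovered := resourceSet{
		{Group: "apps", Version: "v1", Resource: "deployments"}:    {},
		{Group: "example.com", Version: "v1", Resource: "widgets"}: {}, // e.g. a newly added CRD
	}
	fmt.Println("resync required:", syncNeeded(current, discovered))
}
```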
Automatic merge from submit-queue (batch tested with PRs 50173, 50324, 50288, 50263, 50333)
Honor --use-service-account-credentials and warn when missing private key
Fixes #50275 by logging a warning and failing to start rather than continuing to run while ignoring the user's specified config
- Move public key functions to client-go/util/cert
- Move pki file helper functions to client-go/util/cert
- Standardize on certutil package alias
- Update dependencies to client-go/util/cert
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Enable garbage collection of custom resources
Enhance the garbage collector to periodically refresh the resources it monitors (via discovery) to enable custom resource definition GC (addressing #44507 and reverting #47432).
This is a replacement for #46000.
/cc @lavalamp @deads2k @sttts @caesarxuchao
/ref https://github.com/kubernetes/kubernetes/pull/48065
```release-note
The garbage collector now supports custom APIs added via CustomResourceDefinition or aggregated apiservers. Note that the garbage collector controller refreshes periodically, so there is a latency between when the API is added and when the garbage collector starts to manage it.
```
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Use the core client with version
**What this PR does / why we need it**:
Replace the **deprecated** `clientSet.Core()` with `clientSet.CoreV1()`.
**Which issue this PR fixes**: fixes #49535
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Enhance the garbage collector to periodically refresh the resources it
monitors (via discovery) to enable custom resource definition GC.
This implementation caches Unstructured structs for any kinds not
covered by a shared informer. The existing meta-only codec only supports
compiled types; an improved codec which supports arbitrary types could
be introduced to optimize caching to store only metadata for all
non-informer types.
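To illustrate what a meta-only cache would keep for such kinds, here is a small sketch using apimachinery's `meta.Accessor` over an `Unstructured` object; it illustrates the trade-off described above, not the garbage collector's actual cache code.
```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

func main() {
	// A custom resource as the GC would see it: a full Unstructured object.
	obj := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "example.com/v1",
		"kind":       "Widget",
		"metadata": map[string]interface{}{
			"name":      "w1",
			"namespace": "default",
			"uid":       "1234-abcd",
		},
		// Everything below is what a meta-only codec could drop before caching.
		"spec": map[string]interface{}{"size": int64(3)},
	}}

	m, err := meta.Accessor(obj)
	if err != nil {
		panic(err)
	}
	// Name, namespace, UID and ownerReferences are all the GC graph needs.
	fmt.Println(m.GetNamespace(), m.GetName(), m.GetUID())
}
```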
Automatic merge from submit-queue (batch tested with PRs 49665, 49689, 49495, 49146, 48934)
make it possible to allow discovery errors for controllers
Update the discovery client to return partial discovery information *and* an error. Since we can aggregate API servers, discovery of some resources can fail independently. Callers of this function who want to tolerate the errors now can; existing callers will still get an error and fail in their normal error-handling blocks.
@kubernetes/sig-api-machinery-misc @sttts
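A minimal sketch of how a tolerant caller might use this, assuming client-go's discovery helpers (`ServerPreferredResources` and `IsGroupDiscoveryFailedError`) behave as described above:
```go
package main

import (
	"log"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	resources, err := dc.ServerPreferredResources()
	if err != nil {
		if discovery.IsGroupDiscoveryFailedError(err) {
			// Some aggregated API groups failed; the slice still holds the
			// groups that did respond, so a tolerant controller can proceed.
			log.Printf("partial discovery failure: %v", err)
		} else {
			log.Fatal(err) // intolerant callers still fail here as before
		}
	}
	log.Printf("discovered %d resource lists", len(resources))
}
```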
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Remove duplicated import and wrong alias name of api package
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes #48975
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48224, 45431, 45946, 48775, 49396)
add reflector metrics
This adds metrics (optionally Prometheus) to reflectors so that you can see when a reflector is behaving poorly and just how poorly it's doing.
@eparis
```release-note
Adds metrics for checking reflector health.
```
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
tighten quota controller interface
While debugging a quota performance problem, I had to chase some references deeper than necessary because the interfaces were overly broad. This tightens them.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48377, 48940, 49144, 49062, 49148)
support fc volume attach and detach
**What this PR does / why we need it**:
Support FC volume attach and detach to enforce RWO access
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48953
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47232, 48625, 48613, 48567, 39173)
Include leaderelection in client-go;
Fix #39117
Fix https://github.com/kubernetes/client-go/issues/28
This PR:
* includes the leaderelection to the staging client-go
* to avoid conflict with golang's testing package, renames package /testing to /testutil, and renames cache/testing to cache/testframework
```release-note
client-go now includes the leaderelection package
```
Automatic merge from submit-queue
controller-manager: fix horizontal-pod-autoscaler-use-rest-clients flag help info
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Use endpoints informer for the endpoint controller
This substantially reduces the number of API calls made by the endpoint
controller. Currently the controller makes an API call per endpoint for
each service that is synced. When the 30s resync is triggered, this
results in an API call for every single endpoint in the cluster. This
quickly exceeds the default qps/burst limit of 20/30 even in small
clusters, leading to delays in endpoint updates.
This change modifies the controller to use the endpoint informer cache
for all endpoint GETs. This means we only make API calls for changes in
endpoints. As a result, qps only depends on the pod activity in the
cluster, rather than the number of services.
**What this PR does / why we need it**:
Address endpoint update delays as described in https://github.com/kubernetes/kubernetes/issues/47597.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/47597
**Special notes for your reviewer**:
**Release note**:
```release-note
```
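A minimal sketch of the cache-read pattern, assuming client-go's shared informer factory and Endpoints lister; this is not the endpoint controller's actual wiring:
```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	endpointsLister := factory.Core().V1().Endpoints().Lister()

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)

	// Cache read; no API call is made here, so QPS now tracks endpoint
	// changes rather than the number of services.
	ep, err := endpointsLister.Endpoints("default").Get("kubernetes")
	if err != nil {
		panic(err)
	}
	fmt.Println("subsets:", len(ep.Subsets))
}
```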
Internal attach/detach controller timers should be configurable and tests
should use much shorter values.
reconcilerSyncDuration is deliberately left out of TimerConfig because it is
the only one that is not a constant; it is configurable by the user.
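A rough sketch of the idea with hypothetical field names (the real struct lives in the attach/detach controller package); production uses long defaults while tests substitute short ones:
```go
package main

import (
	"fmt"
	"time"
)

// TimerConfig groups the controller's internal loop periods so tests can
// swap in much shorter values. Field names here are illustrative.
type TimerConfig struct {
	ReconcilerLoopPeriod                        time.Duration
	DesiredStateOfWorldPopulatorLoopSleepPeriod time.Duration
}

// DefaultTimerConfig holds production values.
var DefaultTimerConfig = TimerConfig{
	ReconcilerLoopPeriod:                        100 * time.Millisecond,
	DesiredStateOfWorldPopulatorLoopSleepPeriod: 1 * time.Minute,
}

// TestTimerConfig shrinks every timer so tests converge in milliseconds.
var TestTimerConfig = TimerConfig{
	ReconcilerLoopPeriod:                        5 * time.Millisecond,
	DesiredStateOfWorldPopulatorLoopSleepPeriod: 10 * time.Millisecond,
}

func main() {
	fmt.Printf("default: %+v\ntest: %+v\n", DefaultTimerConfig, TestTimerConfig)
}
```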
Automatic merge from submit-queue
bazel: stamp multiple packages by using x_defs instead of linkstamp in go_binary rules
**What this PR does / why we need it**: Fixes a regression introduced sometime in the last few months that prevented bazel-built clusters from identifying their version properly.
It does so by updating the bazelbuild/rules_go and kubernetes/repo-infra dependencies to support using stamp values in `go_binary` `x_defs`, and then changing our `go_binary` rules to use `x_defs` instead of `linkstamp`.
This whole charade is necessary because we need to stamp version information in multiple packages.
This pretty much only affects the bazel build, so it should be low risk.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #45298
**Special notes for your reviewer**: depends on https://github.com/kubernetes/repo-infra/pull/18; should not be merged before it.
**Release note**:
```release-note
NONE
```
/assign @spxtr @mikedanese
Automatic merge from submit-queue (batch tested with PRs 44883, 46836, 46765, 46683, 46050)
While deleting a namespace, the TPR instances under this ns should be deleted.
Fixed #46736
**Release note**:
```
None
```
Automatic merge from submit-queue (batch tested with PRs 40760, 46706, 46783, 46742, 46751)
complete the controller context for init funcs
This completes the conversion to initFuncs for controller initialization, making it easier and more manageable to add new controllers.
1. Create controllerrevisions (history) and label pods with template
hash for both RollingUpdate and OnDelete update strategy
2. Clean up old, non-live history based on revisionHistoryLimit
3. Remove duplicate controllerrevisions (the ones with the same template)
and relabel their pods
4. Update RBAC to allow DaemonSet controller to manage
controllerrevisions
5. In DaemonSet controller unit tests, create new pods with hash labels
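A small sketch of the hash-labelling idea from step 1, using an illustrative FNV hash over a serialized template; the controller's real hashing code and label key may differ.
```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// hashTemplate returns a stable hash of an arbitrary pod template
// representation; the real controller hashes the serialized PodTemplateSpec.
func hashTemplate(template interface{}) string {
	data, _ := json.Marshal(template)
	h := fnv.New32a()
	h.Write(data)
	return fmt.Sprintf("%d", h.Sum32())
}

func main() {
	template := map[string]interface{}{
		"containers": []map[string]string{{"name": "app", "image": "nginx:1.13"}},
	}
	// Pods and controllerrevisions created from this template get the same label,
	// so old and new history can be told apart across update strategies.
	labels := map[string]string{
		"controller-revision-hash": hashTemplate(template),
	}
	fmt.Println(labels)
}
```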
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)
Local storage plugin
**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes. Scheduler predicate will direct already-bound PVCs to the node that the local PV is at. PVC binding still happens independently.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #43640
**Release note**:
```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume. These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
Automatic merge from submit-queue (batch tested with PRs 45514, 45635)
refactor certificate controller to break it into two parts
Break pkg/controller/certificates into:
* pkg/controller/certificates/approver: containing the group approver
* pkg/controller/certificates/signer: containing the local signer
* pkg/controller/certificates: containing shared infrastructure
```release-note
Break the 'certificatesigningrequests' controller into a 'csrapprover' controller and 'csrsigner' controller.
```
Allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
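A minimal sketch of filtering discovery output down to resources that support every verb the collector needs, assuming client-go's `discovery.FilteredBy` and `SupportsAllVerbs` helpers; the resource list here is made up for illustration.
```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/discovery"
)

func main() {
	// Hypothetical discovery output for illustration.
	all := []*metav1.APIResourceList{{
		GroupVersion: "apps/v1",
		APIResources: []metav1.APIResource{
			{Name: "deployments", Verbs: metav1.Verbs{"list", "watch", "get", "patch", "update", "delete"}},
			{Name: "deployments/status", Verbs: metav1.Verbs{"get", "update"}},
		},
	}}

	// Keep only resources that support everything the GC needs to do.
	deletable := discovery.FilteredBy(discovery.SupportsAllVerbs{
		Verbs: []string{"delete", "list", "watch", "get", "patch", "update"},
	}, all)

	for _, list := range deletable {
		for _, r := range list.APIResources {
			fmt.Println(list.GroupVersion, r.Name)
		}
	}
}
```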
Tokens controller previously needed a bit of extra help in order to be
safe for concurrent use. The new MutationCache allows it to keep a local
cache and still use a shared informer. The filtering event handler lets
it only see changes to secrets it cares about.
Automatic merge from submit-queue (batch tested with PRs 45374, 44537, 45739, 44474, 45888)
Allow kcm and scheduler to lock on ConfigMaps.
**What this PR does / why we need it**:
Plumbs through the ability to lock on ConfigMaps through the kcm and scheduler.
**Which issue this PR fixes**
Fixes: #44857
Addresses issues with: #45415
**Special notes for your reviewer**:
**Release note**:
```
Add leader-election-resource-lock support to kcm and scheduler to allow for locking on ConfigMaps as well as Endpoints(default)
```
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews @jamiehannaford @bsalamat @mikedanese
Automatic merge from submit-queue (batch tested with PRs 45382, 45384, 44781, 45333, 45543)
Copy internal types to metrics
Supersedes #45306.
#45306 removed the internal types and suggested whoever needs the internal types should define their own copy and use the code-gen tools to generate the conversion functions. Per offline discussion with @DirectXMan12, we wanted to go that direction, but it's not clear where to put the internal types yet. Hence, as a temporary solution, we decided to copy the referred client-go/pkg/api types into the metrics API to avoid the dependency.
The commit "remove need of registry from custom_metrics/client.go" is similar to what I did to the fake client in an earlier PR. Let me know if you want to put the commit in another PR.
Automatic merge from submit-queue (batch tested with PRs 45304, 45006, 45527)
increase the QPS for namespace controller
The namespace controller is really chatty. Especially to discovery since that involves two requests for every API version available. This bumps the QPS and burst on the namespace controller to avoid being stuck waiting.
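For context, the client-side limit being raised is the QPS/burst on the controller's rest.Config; a minimal sketch (the numbers are illustrative, not the values chosen in this PR):
```go
package main

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	// Raise the client-side rate limits so a chatty controller (such as the
	// namespace controller doing discovery) is not stuck waiting on itself.
	cfg.QPS = 50
	cfg.Burst = 100

	_ = kubernetes.NewForConfigOrDie(cfg)
}
```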
Automatic merge from submit-queue (batch tested with PRs 44862, 42241, 42101, 43181, 44147)
Feature/hpa upscale downscale delay configurable
**What this PR does / why we need it**:
Makes "upscale forbidden window" and "downscale forbidden window" duration configurable in arguments of kube-controller-manager. Those are options of horizontal pod autoscaler.
**Special notes for your reviewer**:
Please have a look @DirectXMan12 , the PR as discussed in Slack.
**Release note**:
```
Make "upscale forbidden window" and "downscale forbidden window" duration configurable in arguments of kube-controller-manager. Those are options of horizontal pod autoscaler. Right now are hardcoded 3 minutes for upscale, and 5 minutes to downscale. But sometimes cluster administrator might want to change this for his own needs.
```
Automatic merge from submit-queue
Add support for IP aliases for pod IPs (GCP alpha feature)
```release-note
Adds support for allocation of pod IPs via IP aliases.
# Adds KUBE_GCE_ENABLE_IP_ALIASES flag to the cluster up scripts (`kube-{up,down}.sh`).
KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes. This feature is currently
only available on GCE.
## Usage
$ CLUSTER_IP_RANGE=10.100.0.0/16 KUBE_GCE_ENABLE_IP_ALIASES=true bash -x cluster/kube-up.sh
# Adds CloudAllocator to the node CIDR allocator (kubernetes-controller manager).
If CIDRAllocatorType is set to `CloudCIDRAllocator`, then CIDR allocation
is instead done by the external cloud provider and
the node controller is only responsible for reflecting the allocation
into the node spec.
- Splits off the rangeAllocator from the cidr_allocator.go file.
- Adds cloudCIDRAllocator, which is used when the cloud provider allocates
the CIDR ranges externally. (GCE support only)
- Updates RBAC permission for node controller to include PATCH
```
Automatic merge from submit-queue
Remove alphaProvisioner in PVController and AlphaStorageClassAnnotation
remove alpha annotation and alphaProvisioner
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43144, 42671, 43226, 43314, 43361)
don't start controllers against unhealthy master
Operating against an unhealthy apiserver is unpredictable. Some clients like `kubectl` need to be best effort in this regard so that you can debug broken apiservers. Controllers shouldn't run against unhealthy masters.
Automatic merge from submit-queue (batch tested with PRs 43642, 43170, 41813, 42170, 41581)
Enable storage class support in Azure File volume
**What this PR does / why we need it**:
Support StorageClass in Azure file volume
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Support StorageClass in Azure file volume
```
Automatic merge from submit-queue (batch tested with PRs 42692, 42169, 42173)
DaemonSet: Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings DaemonSet into full compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
This ensures that DaemonSet does not fight with other controllers over control of Pods.
**Special notes for your reviewer**:
**Release note**:
```release-note
DaemonSet now respects ControllerRef to avoid fighting over Pods.
```
cc @erictune @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42692, 42169, 42173)
Add pprof trace support
Add support for `/debug/pprof/trace`
Can wait for master to reopen for 1.7.
cc @smarterclayton @wojtek-t @gmarek @timothysc @jeremyeder @kubernetes/sig-scalability-pr-reviews
This commit switches over the HPA controller to use the custom metrics
API. It also converts the HPA controller to use the generated client
in k8s.io/metrics for the resource metrics API.
In order to enable support, you must enable
`--horizontal-pod-autoscaler-use-rest-clients` on the
controller-manager, which will switch the HPA controller's MetricsClient
implementation over to use the standard rest clients for both custom
metrics and resource metrics. This requires that at the least resource
metrics API is registered with kube-aggregator, and that the controller
manager is pointed at kube-aggregator. For this to work, Heapster
must be serving the new-style API server (`--api-server=true`).
- Add a new type PortworxVolumeSource
- Implement the kubernetes volume plugin for Portworx Volumes under pkg/volume/portworx
- The Portworx Volume Driver uses the libopenstorage/openstorage specifications and apis for volume operations.
Changes for k8s configuration and examples for portworx volumes.
- Add PortworxVolume hooks in kubectl, kube-controller-manager and validation.
- Add a README for PortworxVolume usage as PVs, PVCs and StorageClass.
- Add example spec files
Handle code review comments.
- Modified READMEs to incorporate to suggestions.
- Add a test for ReadWriteMany access mode.
- Use util.UnmountPath in TearDown.
- Add ReadOnly flag to PortworxVolumeSource
- Use hostname:port instead of unix sockets
- Delete the mount dir in TearDown.
- Fix link issue in persistentvolumes README
- In unit test check for mountpath after Setup is done.
- Add PVC Claim Name as a Portworx Volume Label
Generated code and documentation.
- Updated swagger spec
- Updated api-reference docs
- Updated generated code under pkg/api/v1
Godeps update for Portworx Volume Driver
- Adds github.com/libopenstorage/openstorage
- Adds go.pedge.io/pb/go/google/protobuf
- Updates Godep Licenses
Automatic merge from submit-queue (batch tested with PRs 42053, 41282, 42056, 41663, 40927)
Fully remove hand-written listers and informers
Note: the first commit is from #41927. Adding do-not-merge for now as we'll want that to go in first, and then I'll rebase this on top.
Update statefulset controller to use a lister for PVCs instead of a client request. Also replace a unit test's dependency on legacylisters with the generated ones. cc @kargakis @kow3ns @foxish @kubernetes/sig-apps-pr-reviews
Remove all references to pkg/controller/informers and pkg/client/legacylisters, and remove those packages.
@smarterclayton @deads2k this should be it!
cc @gmarek @wojtek-t @derekwaynecarr @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Make controller-manager resilient to stale serviceaccount tokens
Now that the controller manager is spinning up controller loops using service accounts, we need to be more proactive in making sure the clients will actually work.
Future additional work:
* make a controller that reaps invalid service account tokens (c.f. https://github.com/kubernetes/kubernetes/issues/20165)
* allow updating the client held by a controller with a new token while the controller is running (c.f. https://github.com/kubernetes/kubernetes/issues/4672)
Automatic merge from submit-queue
Switch service controller to shared informers
Originally part of #40097
cc @deads2k @smarterclayton @gmarek @wojtek-t @timothysc @sttts @liggitt @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Remove alpha provisioning
This is the first part of https://github.com/kubernetes/features/issues/36
@kubernetes/sig-storage-misc
**Release note**:
```release-note
Alpha version of dynamic volume provisioning is removed in this release. Annotation
"volume.alpha.kubernetes.io/storage-class" does not have any special meaning. A default storage class
and DefaultStorageClass admission plugin can be used to preserve similar behavior of Kubernetes cluster,
see https://kubernetes.io/docs/user-guide/persistent-volumes/#class-1 for details.
```
Automatic merge from submit-queue
Switch serviceaccounts controller to generated shared informers
Originally part of #40097
cc @deads2k @sttts @liggitt @smarterclayton @gmarek @wojtek-t @timothysc @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Change default attach_detach_controller reconciler sync period to 1 minute
When the default reconciler sync period is set to 5 seconds, we often see
rate-limit issues in large clusters. This PR changes the period to 1
minute to mitigate this problem.
Making this period longer means that there might be some period of time
during which the cached information in the master's attach_detach_controller is out
of date. The node might use this information to mount the wrong
device. For GCE PD, since the device path is uniquely associated with the volume
id, the mount operation will simply fail because of this outdated
information. For AWS, the kubelet could previously mount the wrong volume
because a device path could be reused immediately once it became available.
But after PR #38818, a device path will only be reused after all device
paths have been explored. That means it is very unlikely that the kubelet will
mount a wrong volume that is using an old device path that had been
assigned to the same node.
**Release note**:
```release-note
We change the default attach_detach_controller sync period to 1 minute to reduce the query frequency through cloud provider to check whether volumes are attached or not.
```
Automatic merge from submit-queue (batch tested with PRs 41137, 41268)
Allow the CertificateController to use any Signer implementation.
**What this PR does / why we need it**:
This will allow developers to create `CertificateController`s with arbitrary `Signer`s, instead of forcing the use of `CFSSLSigner`. It matches the behavior of allowing an arbitrary `AutoApprover` to be passed in the constructor.
**Release note**:
```release-note
NONE
```
CC @mikedanese
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
TaintController
```release-note
This PR adds a manager to NodeController that is responsible for removing Pods from Nodes tainted with NoExecute Taints. This feature is beta (as are the rest of taints) and enabled by default. It's gated by the controller-manager's enable-taint-manager flag.
```
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
ResyncPeriod Comment
ResyncPeriod Comment:
// ResyncPeriod returns a function which generates a duration each time it is
// invoked; this is so that multiple controllers don't get into lock-step and all
// hammer the apiserver with list requests simultaneously.
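A minimal sketch of what such a jittered-period factory looks like; the exact factor used by the controller manager may differ.
```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// ResyncPeriod returns a function that yields a random duration between
// minResyncPeriod and 2*minResyncPeriod each time it is invoked, so callers
// that resync do not all relist in lock-step.
func ResyncPeriod(minResyncPeriod time.Duration) func() time.Duration {
	return func() time.Duration {
		factor := rand.Float64() + 1
		return time.Duration(float64(minResyncPeriod.Nanoseconds()) * factor)
	}
}

func main() {
	next := ResyncPeriod(12 * time.Hour)
	fmt.Println(next(), next(), next()) // three different periods
}
```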
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
Implement TTL controller and use the ttl annotation attached to node in secret manager
For every secret attached to a pod as a volume, the kubelet tries to refresh it every sync period. Currently the kubelet has a TTL cache of its pods' secrets and the TTL is set to 1 minute. That means that in the large clusters we are targeting (5k nodes, 30 pods/node), given that each pod has a secret associated with the ServiceAccount from its namespace, and with a large enough number of namespaces (where on each node (almost) every pod is from a different namespace), this results in ~30 GETs per node every minute to refresh all secrets, which gives ~2500 QPS of secret GETs to the apiserver (5,000 nodes × 30 GETs / 60 s).
The apiserver cannot keep up with that very easily.
The desired solution would be to watch for secret changes, but for security reasons we don't want a node watching all secrets, and it is not currently possible to watch only the secrets attached to pods on a given node.
So as a temporary solution, we are introducing an annotation that acts as a suggestion to the kubelet for the TTL of secrets in its cache, and a very simple controller that sets this annotation based on the cluster size (the larger the cluster, the bigger the TTL).
This workaround means that only very local changes are needed in the kubelet; we are creating a well-separated, very simple controller, and once watching "my secrets" becomes possible it will be easy to remove it and switch over. And it will allow us to reach our scalability goals.
@dchen1107 @thockin @liggitt
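A toy sketch of the controller's core decision, with made-up thresholds and an annotation key that is only illustrative:
```go
package main

import "fmt"

// secretTTLForClusterSize picks a TTL suggestion (in seconds); the larger the
// cluster, the longer secrets may be cached before the kubelet re-GETs them.
// The thresholds and values here are hypothetical.
func secretTTLForClusterSize(nodeCount int) string {
	switch {
	case nodeCount >= 2000:
		return "300"
	case nodeCount >= 500:
		return "60"
	default:
		return "0" // no extra caching suggested for small clusters
	}
}

func main() {
	annotations := map[string]string{
		// Annotation key is illustrative, not necessarily the one used upstream.
		"node.alpha.kubernetes.io/ttl": secretTTLForClusterSize(5000),
	}
	fmt.Println(annotations)
}
```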
Automatic merge from submit-queue (batch tested with PRs 40345, 38183, 40236, 40861, 40900)
refactor approver and signer interfaces to be consistent w.r.t. apiserver interaction
This makes it so that only the controller loop talks to the
API server directly. The signatures for Sign and Approve also
become more consistent, while allowing the Signer to report
conditions (which it wasn't able to do before).
Automatic merge from submit-queue
Remove packages which are now apimachinery
Removes all the content from the packages that were moved to `apimachinery`. This will force all vendoring projects to figure out what's wrong. I had to leave many empty marker packages behind to have verify-godep succeed on vendoring heapster.
@sttts straight deletes and simple adds
Automatic merge from submit-queue (batch tested with PRs 39947, 39936, 39902, 39859, 39915)
don't lie about starting the controllers in the controller manager
We printed "started" even when a controller didn't actually start.
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39661, 39740, 39801, 39468, 39743)
add --controllers to controller manager
Adds a `--controllers` flag to the `kube-controller-manager` to indicate which controllers are enabled and disabled. From the help:
```
--controllers stringSlice A list of controllers to enable. '*' enables all on-by-default controllers, 'foo' enables the controller named 'foo', '-foo' disables the controller named 'foo'.
All controllers: certificatesigningrequests, cronjob, daemonset, deployment, disruption, endpoint, garbagecollector, horizontalpodautoscaling, job, namespace, podgc, replicaset, replicationcontroller, resourcequota, serviceaccount, statefulset
```
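A minimal sketch of the enable/disable semantics described by the help text; this mirrors the idea rather than the exact implementation.
```go
package main

import "fmt"

// isControllerEnabled applies the flag semantics: "*" turns on every
// on-by-default controller, "foo" force-enables foo, "-foo" disables it.
func isControllerEnabled(name string, controllers []string, onByDefault bool) bool {
	hasStar := false
	for _, c := range controllers {
		if c == name {
			return true
		}
		if c == "-"+name {
			return false
		}
		if c == "*" {
			hasStar = true
		}
	}
	if !hasStar {
		return false // only explicitly named controllers run
	}
	return onByDefault
}

func main() {
	flags := []string{"*", "-garbagecollector"}
	fmt.Println(isControllerEnabled("daemonset", flags, true))        // true
	fmt.Println(isControllerEnabled("garbagecollector", flags, true)) // false
}
```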
Automatic merge from submit-queue
make discovery static when extensions/thirdpartyresources is not enabled
This should be a bug fix: if `extensions/thirdpartyresources` is enabled, the result of `Discovery().ServerPreferredNamespacedResources` becomes dynamic, so we make `discoverResourcesFn` static only when `extensions/thirdpartyresources` is not enabled.
Automatic merge from submit-queue (batch tested with PRs 39150, 38615)
Add work queues to PV controller
PV controller should not use Controller.Requeue, as it is not available in
shared informers. We need to implement our own work queues instead, where we
can enqueue volumes/claims as we want.
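A minimal sketch of enqueueing claims by namespace/name key into a client-go workqueue, which is the pattern the text describes; this is not the PV controller's actual code.
```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

func main() {
	claimQueue := workqueue.NewNamed("claims")
	defer claimQueue.ShutDown()

	claim := &v1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Namespace: "default", Name: "data"},
	}
	// Enqueue by "namespace/name" key, the usual shared-informer pattern.
	key, err := cache.MetaNamespaceKeyFunc(claim)
	if err != nil {
		panic(err)
	}
	claimQueue.Add(key)

	// A worker goroutine would loop on Get/Done; shown inline for brevity.
	item, _ := claimQueue.Get()
	fmt.Println("processing", item) // "default/data"
	claimQueue.Done(item)
}
```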
Automatic merge from submit-queue (batch tested with PRs 37959, 36221)
Recycle Pod Template Check
The kube-controller-manager has two command line arguments (--pv-recycler-pod-template-filepath-hostpath and --pv-recycler-pod-template-filepath-nfs) that specify a recycle pod template. The recycle pod template might not contain the volume that is to be recycled.
A check is added to make sure that the recycle pod template contains at least one volume.
Currently, the HPA considers unready pods the same as ready pods when
looking at their CPU and custom metric usage. However, pods frequently
use extra CPU during initialization, so we want to consider them
separately.
This commit causes the HPA to consider unready pods as having 0 CPU
usage when scaling up, and ignores them when scaling down. If, when
scaling up, factoring the unready pods as having 0 CPU would cause a
downscale instead, we simply choose not to scale. Otherwise, we simply
scale up at the reduced amount calculated by factoring the pods in at
zero CPU usage.
The effect is that unready pods cause the autoscaler to be a bit more
conservative -- large increases in CPU usage can still cause scales,
even with unready pods in the mix, but will not cause the scale factors
to be as large, in anticipation of the new pods later becoming ready and
handling load.
Similarly, if there are pods for which no metrics have been retrieved,
these pods are treated as having 100% of the requested metric when
scaling down, and 0% when scaling up. As above, this cannot change the
direction of the scale.
This commit also changes the HPA to ignore superfluous metrics -- as
long as metrics for all ready pods are present, the HPA will make scaling
decisions. Currently, this only works for CPU. For custom metrics, we
cannot identify which metrics go to which pods if we get superfluous
metrics, so we abort the scale.
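A toy numeric sketch of the "don't flip direction" rule for unready pods, with made-up numbers and a simplified ratio formula:
```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the basic ratio rule: replicas * (usage / target).
func desiredReplicas(currentReplicas int, usageRatio float64) int {
	return int(math.Ceil(float64(currentReplicas) * usageRatio))
}

func main() {
	const target = 100.0 // target milli-cores per pod (illustrative)
	readyUsage := []float64{180, 170, 190}
	unreadyCount := 2

	total := 0.0
	for _, u := range readyUsage {
		total += u
	}
	allPods := len(readyUsage) + unreadyCount

	// For a scale-up decision, unready pods are counted as using 0.
	ratioWithUnready := total / (target * float64(allPods))

	if ratioWithUnready < 1.0 {
		// Counting unready pods at 0 would point to a downscale: don't scale.
		fmt.Println("direction would flip; choosing not to scale")
		return
	}
	fmt.Println("scale up to", desiredReplicas(allPods, ratioWithUnready))
}
```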
Automatic merge from submit-queue
Enable HPA controller based on autoscaling/v1 api group
ref #29778
``` release-note
Enable HPA controller based on autoscaling/v1 api group.
```
Automatic merge from submit-queue
lister-gen updates
- Remove "zz_generated." prefix from generated lister file names
- Add support for expansion interfaces
- Switch to new generated JobLister
@deads2k @liggitt @sttts @mikedanese @caesarxuchao for the lister-gen changes
@soltysh @deads2k for the informer / job controller changes
Automatic merge from submit-queue
convert SA controller to shared informers
convert the SA controller to shared informer + workqueue.
I think one of @derekwaynecarr @ncdc or @liggitt
Alter how runtime.SerializeInfo is represented to simplify negotiation
and reduce the need to allocate during negotiation. Simplify the dynamic
client's logic around negotiating type. Add more tests for media type
handling where necessary.
Automatic merge from submit-queue
decouple workqueue metrics from prometheus
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
We want to include the workqueue in client-go, but do not want to have to import Prometheus. This PR decouples the workqueue from Prometheus.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Partially address https://github.com/kubernetes/kubernetes/issues/33497
User requested for `workqueue` in client-go: https://github.com/kubernetes/client-go/issues/4#issuecomment-249444848
**Special notes for your reviewer**:
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
The implicit registration of Prometheus metrics for workqueue has been removed, and a plug-able interface was added. If you were using workqueue in your own binaries and want these metrics, add the following to your imports in the main package: "k8s.io/pkg/util/workqueue/prometheus".
```
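The general shape of the plug-able pattern looks roughly like the sketch below; the interface and function names here are hypothetical, not the workqueue package's exact API.
```go
package main

import "fmt"

// CounterMetric is all the queue needs; it has no Prometheus dependency.
type CounterMetric interface {
	Inc()
}

// noopCounter is the default when no provider is registered.
type noopCounter struct{}

func (noopCounter) Inc() {}

// MetricsProvider is implemented by an optional add-on package, e.g. one that
// backs counters with Prometheus and registers itself via a blank import.
type MetricsProvider interface {
	NewAddsMetric(name string) CounterMetric
}

type noopProvider struct{}

func (noopProvider) NewAddsMetric(string) CounterMetric { return noopCounter{} }

var provider MetricsProvider = noopProvider{}

// SetProvider is what a registration package would call from its init().
func SetProvider(p MetricsProvider) { provider = p }

func main() {
	adds := provider.NewAddsMetric("example_queue")
	adds.Inc() // no-op unless a real provider was registered
	fmt.Println("queue metrics wired via provider")
}
```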
Automatic merge from submit-queue
Add ECDSA support for service account tokens
Fixes #28180
```release-note
ECDSA keys can now be used for signing and verifying service account tokens.
```
Automatic merge from submit-queue
Abstraction of endpoints in leaderelection code
**Problem Statement**:
Currently the Leader Election code is hard coded against the endpoints api. This causes performance issues on large scale clusters due to incessant iptables refreshes, see: https://github.com/kubernetes/kubernetes/issues/26637
The goal of this PR is to:
- Abstract Endpoints out of the leader election code
- Fix a known bug in the event recording
fixes #18386
**Special notes for your reviewer**:
This is a 1st pass at abstracting the details of endpoints out into an interface. Any suggestions around how we want to refactor this interface are welcome and could be addressed in either this PR or a follow-on PR.
/cc @ncdc @wojtek-t @rrati
Automatic merge from submit-queue
Dynamic provisioning for flocker volume plugin
Refactor flocker volume plugin
* [x] Support provisioning beta (#29006)
* [x] Support deletion
* [x] Use bind mounts instead of /flocker in containers
* [x] support ownership management or SELinux relabeling.
* [x] adds volume specification via datasetUUID (this is guaranteed to be unique)
I based my refactoring work on replicating GCE-PD behaviour pretty closely.
**Related issues**: #29006 #26908
@jsafrane @mattbates @wallrj @wallnerryan
persistentvolumecontroller.NewPersistentVolumeController has 11 arguments now;
put them into a structure.
Also, rename NewPersistentVolumeController to NewController; persistentvolume
is already the name of the package.
Fixes #30219
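The refactor amounts to something like the sketch below; field names are hypothetical placeholders rather than the actual parameters struct definition.
```go
package main

import (
	"fmt"
	"time"
)

// ControllerParameters stands in for the 11 positional constructor arguments.
type ControllerParameters struct {
	SyncPeriod                time.Duration
	VolumeInformer            interface{} // placeholders for the real informer types
	ClaimInformer             interface{}
	ClassInformer             interface{}
	EnableDynamicProvisioning bool
}

type Controller struct {
	p ControllerParameters
}

// NewController replaces NewPersistentVolumeController; the package name
// already says "persistentvolume", so the shorter name reads better.
func NewController(p ControllerParameters) *Controller {
	return &Controller{p: p}
}

func main() {
	c := NewController(ControllerParameters{
		SyncPeriod:                15 * time.Second,
		EnableDynamicProvisioning: true,
	})
	fmt.Printf("%+v\n", c.p)
}
```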
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
Automatic merge from submit-queue
Refactor cert utils into one pkg, add funcs from bootkube for kubeadm to use
**What this PR does / why we need it**:
We have ended up with a rather incomplete and fragmented collection of utils for handling certificates. It may be worth considering using `cfssl` for doing all of these things, but for now there is some functionality that we need in `kubeadm` that we can borrow from bootkube. It makes sense to move the utils from bootkube into core, as discussed in #31221.
**Special notes for your reviewer**: I've taken the opportunity to review names of existing funcs and tried to make some improvements in that area (with help from @peterbourgon).
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Support Quobyte as StorageClass
This PR allows users to use Quobyte as a StorageClass for dynamic volume provisioning and implements the Provisioner/Deleter interface.
@quolix @kubernetes/sig-storage @rootfs
Automatic merge from submit-queue
convert daemonset controller to shared informers
Convert the daemonset controller completely to `SharedInformers` for its list/watch resources.
@kubernetes/rh-cluster-infra @ncdc
Automatic merge from submit-queue
Switch ScheduledJob controller to use clientset
**What this PR does / why we need it**:
This is part of #25442. I've applied here the same fix I've applied in the manual client in #29187, see the 1st commit for that (@caesarxuchao we've talked about it in #29856).
@deads2k as promised
@janetkuo ptal
Automatic merge from submit-queue
Use PV shared informer in PV controller
Use the PV shared informer, addressing (partially) https://github.com/kubernetes/kubernetes/issues/26247 . Using the PVC shared informer is not so simple because sometimes the controller wants to `Requeue` and...
Automatic merge from submit-queue
support storage class in Ceph RBD volume
replace WIP PR #30959, using PV annotation idea from @jsafrane
@kubernetes/sig-storage @johscheuer @elsonrodriguez
Automatic merge from submit-queue
Split the version metric out to its own package
This PR breaks a client dependency on prometheus. Combined with #30638, the client will no longer depend on these packages.
Automatic merge from submit-queue
Implements Attacher Plugin Interface for vSphere
This PR does the following,
Fixes #29028 (vsphere volume should implement attacher interface): Implements Attacher Plugin Interface for vSphere.
See file:
pkg/volume/vsphere_volume/vsphere_volume.go. - Removed attach and detach calls from SetupAt and TearDownAt.
pkg/volume/vsphere_volume/attacher.go. - Implements Attacher & Detacher Plugin Interface for vSphere. (Ref :- GCE_PD & AWS attacher.go)
pkg/cloudproviders/provider/vsphere.go - Added DiskIsAttach method.
The vSphere plugin code needs cleanup (e.g., the code for getting the vSphere instance is repeated in pkg/cloudprovider/providers/vsphere.go). I will fix this in the next PR.
Automatic merge from submit-queue
Remove implicit Prometheus metrics from client
**What this PR does / why we need it**: This PR starts to cut away at dependencies that the client has.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
The implicit registration of Prometheus metrics for request count and latency have been removed, and a plug-able interface was added. If you were using our client libraries in your own binaries and want these metrics, add the following to your imports in the main package: "k8s.io/pkg/client/metrics/prometheus".
```
cc: @kubernetes/sig-api-machinery @kubernetes/sig-instrumentation @fgrzadkowski @wojtek-t
Automatic merge from submit-queue
Enable the garbage collector by default
Turning GC on by default.
Memory usage of GC is back to normal after #30943. The CPU usage is a little higher than the cap in the scalability test (1.11 cores vs. 1 core). This PR adjusts the default number of GC workers to 20 to see if that helps CPU usage.
@kubernetes/sig-api-machinery @wojtek-t @lavalamp
Automatic merge from submit-queue
Add Events for operation_executor to show status of mounts, failed/successful to show in describe events
Fixes #27590
@saad-ali @pmorie @erinboyd
After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy it. The refactoring broke my previously merged UXP PRs that correctly showed failed mount errors in the describe events. I'm not sure I implemented this correctly, but it tested out and seems to be working; let me know what I missed or if this is not the correct approach.
```
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned nfs-bb-pod1 to 127.0.0.1
44s 44s 1 {kubelet 127.0.0.1} Warning FailedMount Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
44s 44s 1 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
38s 38s 1 {kubelet } Warning FailedMount Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32
Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs []
Output: mount.nfs: Connection timed out
Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable.
Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3).
Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity.
```
Automatic merge from submit-queue
Cut the client repo, staging it in the main repo
Tracking issue: #28559
ref: https://github.com/kubernetes/kubernetes/pull/25978#issuecomment-232710174
This PR implements the plan a few of us came up with last week for cutting client into its own repo:
1. creating "_staging" (name is tentative) directory in the main repo, using a script to copy the client and its dependencies to this directory
2. periodically publishing the contents of this staging client to k8s.io/client-go repo
3. converting k8s components in the main repo to use the staged client. They should import the staged client as if the client were vendored. (i.e., the import line should be `import "k8s.io/client-go/<package name>`). This requirement is to ease step 4.
4. In the future, removing the staging area, and vendoring the real client-go repo.
The advantage of having the staging area is that we can continuously run integration/e2e tests with the latest client repo and the latest main repo, without waiting for the client repo to be vendored back into the main repo. This staging area will exist until our test matrix is vendoring both the client and the server.
In the above plan, the tricky part is step 3. This PR achieves it by creating a symlink under ./vendor, pointing to the staging area, so packages in the main repo can refer to the client repo as if it's vendored. To prevent the godep tool from messing up the staging area, we export the staged client to GOPATH in hack/godep-save.sh so godep will think the client packages are local and won't attempt to manage ./vendor/k8s.io/client-go.
This is a POC. We'll rearrange the directory layout of the client before merge.
@thockin @lavalamp @bgrant0607 @kubernetes/sig-api-machinery
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29147)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Rewrite service controller to apply best controller pattern
This PR is a long term solution for #21625:
We apply the same pattern as the replication controller to the service controller to avoid potential processing-order messes in the service controller; the changes include:
1. introduce informer controller to watch service changes from kube-apiserver, so that every changes on same service will be kept in serviceStore as the only element.
2. put the service name to be processed to working queue
3. when process service, always get info from serviceStore to ensure the info is up-to-date
4. keep the retry mechanism: sleep for a certain interval and add it back to the queue.
5. remove the logic of reading the last service info from kube-apiserver before processing the LB info, as we trust the info from serviceStore.
The unit tests pass, and manual testing passed after hardcoding the cloud provider as FakeCloud; however, I am not able to boot a k8s cluster with any available cloud provider, so e2e testing is not done.
Submitting this PR first for review and to trigger an e2e test.
Automatic merge from submit-queue
add configz.InstallHandler in controllermanager.go
I think the Run function in controllermanager.go should call configz.InstallHandler.
Automatic merge from submit-queue
two optimizations for StartControllers in controllermanager.go
The PR changes two places to optimise the StartControllers function in controllermanager.go.
Automatic merge from submit-queue
controller-manager: support a configurable number of garbage collector workers
The number of garbage collector workers in the controller-manager is currently fixed at 5; making it configurable would be more appropriate.
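A minimal sketch of what making that worker count configurable looks like, assuming the spf13/pflag flag set the controller-manager uses; the flag name and default here are illustrative.
```go
package main

import (
	"fmt"

	"github.com/spf13/pflag"
)

func main() {
	fs := pflag.NewFlagSet("kube-controller-manager", pflag.ExitOnError)

	// Replace the hard-coded worker count of 5 with a configurable option.
	concurrentGCSyncs := fs.Int32("concurrent-gc-syncs", 20,
		"The number of garbage collector workers that are allowed to sync concurrently.")

	fs.Parse([]string{"--concurrent-gc-syncs=30"})
	fmt.Println("gc workers:", *concurrentGCSyncs)
}
```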