Automatic merge from submit-queue
convert SA controller to shared informers
convert the SA controller to shared informer + workqueue.
I think one of @derekwaynecarr @ncdc or @liggitt
Alter how runtime.SerializeInfo is represented to simplify negotiation
and reduce the need to allocate during negotiation. Simplify the dynamic
client's logic around negotiating type. Add more tests for media type
handling where necessary.
Automatic merge from submit-queue
decouple workqueue metrics from prometheus
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
We want to include the workqueue in client-go, but do not want to having to import Prometheus. This PR decouples the workqueue from prometheus.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Partially address https://github.com/kubernetes/kubernetes/issues/33497
User requested for `workqueue` in client-go: https://github.com/kubernetes/client-go/issues/4#issuecomment-249444848
**Special notes for your reviewer**:
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
The implicit registration of Prometheus metrics for workqueue has been removed, and a plug-able interface was added. If you were using workqueue in your own binaries and want these metrics, add the following to your imports in the main package: "k8s.io/pkg/util/workqueue/prometheus".
```
Automatic merge from submit-queue
Add ECDSA support for service account tokens
Fixes#28180
```release-note
ECDSA keys can now be used for signing and verifying service account tokens.
```
Automatic merge from submit-queue
Abstraction of endpoints in leaderelection code
**Problem Statement**:
Currently the Leader Election code is hard coded against the endpoints api. This causes performance issues on large scale clusters due to incessant iptables refreshes, see: https://github.com/kubernetes/kubernetes/issues/26637
The goal of this PR is to:
- Abstract Endpoints out of the leader election code
- Fix a known bug in the event recording
fixes#18386
**Special notes for your reviewer**:
This is a 1st pass at abstracting the details of endpoints out into an interface. Any suggestions around how we we want to refactor this interface is welcome and could be addressed in either this PR or follow on PR.
/cc @ncdc @wojtek-t @rrati
Automatic merge from submit-queue
Dynamic provisioning for flocker volume plugin
Refactor flocker volume plugin
* [x] Support provisioning beta (#29006)
* [x] Support deletion
* [x] Use bind mounts instead of /flocker in containers
* [x] support ownership management or SELinux relabeling.
* [x] adds volume specification via datasetUUID (this is guranted to be unique)
I based my refactor work to replicate pretty much GCE-PD behaviour
**Related issues**: #29006#26908
@jsafrane @mattbates @wallrj @wallnerryan
persistentvolumecontroller.NewPersistentVolumeController has 11 arguments now,
put them into a structure.
Also, rename NewPersistentVolumeController to NewController, persistentvolume
is already name of the package.
Fixes#30219
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
Automatic merge from submit-queue
Refactor cert utils into one pkg, add funcs from bootkube for kubeadm to use
**What this PR does / why we need it**:
We have ended-up with rather incomplete and fragmented collection of utils for handling certificates. It may be worse to consider using `cfssl` for doing all of these things, but for now there is some functionality that we need in `kubeadm` that we can borrow from bootkube. It makes sense to move the utils from bookube into core, as discussed in #31221.
**Special notes for your reviewer**: I've taken the opportunity to review names of existing funcs and tried to make some improvements in that area (with help from @peterbourgon).
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Support Quobyte as StorageClass
This PR allows Users to use Quobyte as StorageClass for dynamic volume provisioning and implements the Provisioner/Deleter Interface.
@quolix @kubernetes/sig-storage @rootfs
Automatic merge from submit-queue
convert daemonset controller to shared informers
Convert the daemonset controller completely to `SharedInformers` for its list/watch resources.
@kubernetes/rh-cluster-infra @ncdc
Automatic merge from submit-queue
Switch ScheduledJob controller to use clientset
**What this PR does / why we need it**:
This is part of #25442. I've applied here the same fix I've applied in the manual client in #29187, see the 1st commit for that (@caesarxuchao we've talked about it in #29856).
@deads2k as promised
@janetkuo ptal
Automatic merge from submit-queue
Use PV shared informer in PV controller
Use the PV shared informer, addressing (partially) https://github.com/kubernetes/kubernetes/issues/26247 . Using the PVC shared informer is not so simple because sometimes the controller wants to `Requeue` and...
Automatic merge from submit-queue
support storage class in Ceph RBD volume
replace WIP PR #30959, using PV annotation idea from @jsafrane
@kubernetes/sig-storage @johscheuer @elsonrodriguez
Automatic merge from submit-queue
Split the version metric out to its own package
This PR breaks a client dependency on prometheus. Combined with #30638, the client will no longer depend on these packages.
Automatic merge from submit-queue
Implements Attacher Plugin Interface for vSphere
This PR does the following,
Fixes#29028 (vsphere volume should implement attacher interface): Implements Attacher Plugin Interface for vSphere.
See file:
pkg/volume/vsphere_volume/vsphere_volume.go. - Removed attach and detach calls from SetupAt and TearDownAt.
pkg/volume/vsphere_volume/attacher.go. - Implements Attacher & Detacher Plugin Interface for vSphere. (Ref :- GCE_PD & AWS attacher.go)
pkg/cloudproviders/provider/vsphere.go - Added DiskIsAttach method.
The vSphere plugin code needs clean up. (ex: The code for getting vSphere instance is repeated in file pkg/cloudprovider/providers/vsphere.go). I will fix this in next PR.
Automatic merge from submit-queue
Remove implicit Prometheus metrics from client
**What this PR does / why we need it**: This PR starts to cut away at dependencies that the client has.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
The implicit registration of Prometheus metrics for request count and latency have been removed, and a plug-able interface was added. If you were using our client libraries in your own binaries and want these metrics, add the following to your imports in the main package: "k8s.io/pkg/client/metrics/prometheus".
```
cc: @kubernetes/sig-api-machinery @kubernetes/sig-instrumentation @fgrzadkowski @wojtek-t
Automatic merge from submit-queue
Enable the garbage collector by default
Turning GC on by default.
Memory usage of GC is back to normal after #30943. The CPU usage is a little higher than the cap in scalability test (1.11 core vs. 1 core). This PR adjusted the default GC worker to 20 to see if that helps CPU usage.
@kubernetes/sig-api-machinery @wojtek-t @lavalamp
Automatic merge from submit-queue
Add Events for operation_executor to show status of mounts, failed/successful to show in describe events
Fixes#27590
@saad-ali @pmorie @erinboyd
After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy. The refactoring broke my previous UXP merged PR's that correctly showed failed mount errors in the describe events. However, Not sure I implemented correctly, but it tested out and seems to be working, let me know what I missed or if this is not the correct approach.
```
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned nfs-bb-pod1 to 127.0.0.1
44s 44s 1 {kubelet 127.0.0.1} Warning FailedMount Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
44s 44s 1 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
38s 38s 1 {kubelet } Warning FailedMount Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32
Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs []
Output: mount.nfs: Connection timed out
Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable.
Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3).
Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity.
```
Automatic merge from submit-queue
Cut the client repo, staging it in the main repo
Tracking issue: #28559
ref: https://github.com/kubernetes/kubernetes/pull/25978#issuecomment-232710174
This PR implements the plan a few of us came up with last week for cutting client into its own repo:
1. creating "_staging" (name is tentative) directory in the main repo, using a script to copy the client and its dependencies to this directory
2. periodically publishing the contents of this staging client to k8s.io/client-go repo
3. converting k8s components in the main repo to use the staged client. They should import the staged client as if the client were vendored. (i.e., the import line should be `import "k8s.io/client-go/<pacakge name>`). This requirement is to ease step 4.
4. In the future, removing the staging area, and vendoring the real client-go repo.
The advantage of having the staging area is that we can continuously run integration/e2e tests with the latest client repo and the latest main repo, without waiting for the client repo to be vendored back into the main repo. This staging area will exist until our test matrix is vendoring both the client and the server.
In the above plan, the tricky part is step 3. This PR achieves it by creating a symlink under ./vendor, pointing to the staging area, so packages in the main repo can refer to the client repo as if it's vendored. To prevent the godep tool from messing up the staging area, we export the staged client to GOPATH in hack/godep-save.sh so godep will think the client packages are local and won't attempt to manage ./vendor/k8s.io/client-go.
This is a POC. We'll rearrange the directory layout of the client before merge.
@thockin @lavalamp @bgrant0607 @kubernetes/sig-api-machinery
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29147)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Rewrite service controller to apply best controller pattern
This PR is a long term solution for #21625:
We apply the same pattern like replication controller to service controller to avoid the potential process order messes in service controller, the change includes:
1. introduce informer controller to watch service changes from kube-apiserver, so that every changes on same service will be kept in serviceStore as the only element.
2. put the service name to be processed to working queue
3. when process service, always get info from serviceStore to ensure the info is up-to-date
4. keep the retry mechanism, sleep for certain interval and add it back to queue.
5. remote the logic of reading last service info from kube-apiserver before processing the LB info as we trust the info from serviceStore.
The UT has been passed, manual test passed after I hardcode the cloud provider as FakeCloud, however I am not able to boot a k8s cluster with any available cloudprovider, so e2e test is not done.
Submit this PR first for review and for triggering a e2e test.
Automatic merge from submit-queue
add configz.InstallHandler in controllermanager.go
I think it should add configz.InstallHandler for Run function in controllermanager.go.
Automatic merge from submit-queue
two optimization for StartControllers in controllermanager.go
The PR changed two places to optimise StartControllers function in controllermanager.go.
Automatic merge from submit-queue
controller-manager support number of garbage collector workers to be configurable
The number of garbage collector workers of controller-manager is a fixed value 5 now, make it configurable should more properly
Automatic merge from submit-queue
kube-controller-manager: Add configure-cloud-routes option
This allows kube-controller-manager to allocate CIDRs to nodes (with
allocate-node-cidrs=true), but will not try to configure them on the
cloud provider, even if the cloud provider supports Routes.
The default is configure-cloud-routes=true, and it will only try to
configure routes if allocate-node-cidrs is also configured, so the
default behaviour is unchanged.
This is useful because on AWS the cloud provider configures routes by
setting up VPC routing table entries, but there is a limit of 50
entries. So setting configure-cloud-routes on AWS would allow us to
continue to allocate node CIDRs as today, but replace the VPC
route-table mechanism with something not limited to 50 nodes.
We can't just turn off the cloud-provider entirely because it also
controls other things - node discovery, load balancer creation etc.
Fix#25602
This allows kube-controller-manager to allocate CIDRs to nodes (with
allocate-node-cidrs=true), but will not try to configure them on the
cloud provider, even if the cloud provider supports Routes.
The default is configure-cloud-routes=true, and it will only try to
configure routes if allocate-node-cidrs is also configured, so the
default behaviour is unchanged.
This is useful because on AWS the cloud provider configures routes by
setting up VPC routing table entries, but there is a limit of 50
entries. So setting configure-cloud-routes on AWS would allow us to
continue to allocate node CIDRs as today, but replace the VPC
route-table mechanism with something not limited to 50 nodes.
We can't just turn off the cloud-provider entirely because it also
controls other things - node discovery, load balancer creation etc.
Fix#25602
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
Automatic merge from submit-queue
vSphere Volume Plugin Implementation
This PR implements vSphere Volume plugin support in Kubernetes (ref. issue #23932).
Automatic merge from submit-queue
Refactor persistent volume controller
Here is complete persistent controller as designed in https://github.com/pmorie/pv-haxxz/blob/master/controller.go
It's feature complete and compatible with current binder/recycler/provisioner. No new features, it *should* be much more stable and predictable.
Testing
--
The unit test framework is quite complicated, still it was necessary to reach reasonable coverage (78% in `persistentvolume_controller.go`). The untested part are error cases, which are quite hard to test in reasonable way - sure, I can inject a VersionConflictError on any object update and check the error bubbles up to appropriate places, but the real test would be to run `syncClaim`/`syncVolume` again and check it recovers appropriately from the error in the next periodic sync. That's the hard part.
Organization
---
The PR starts with `rm -rf kubernetes/pkg/controller/persistentvolume`. I find it easier to read when I see only the new controller without old pieces scattered around.
[`types.go` from the old controller is reused to speed up matching a bit, the code looks solid and has 95% unit test coverage].
I tried to split the PR into smaller patches, let me know what you think.
~~TODO~~
--
* ~~Missing: provisioning, recycling~~.
* ~~Fix integration tests~~
* ~~Fix e2e tests~~
@kubernetes/sig-storage
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24331)
<!-- Reviewable:end -->
Fixes#15632
Automatic merge from submit-queue
update controllers watching all pods to share an informer
This plumbs the shared pod informer through the various controllers to avoid duplicated watches.
Volume names have now format <cluster-name>-dynamic-<pv-name>.
pv-name is guaranteed to be unique in Kubernetes cluster, adding
<cluster-name> ensures we don't conflict with any running cluster
in the cloud project (kube-controller-manager --cluster-name=XXX).
'kubernetes' is the default cluster name.
I can't revert with github which says "Sorry, this pull request couldn’t be
reverted automatically. It may have already been reverted, or the content may
have changed since it was merged."
Reverts commit: 0c191e787b
Recycle controller tries to recycle or delete a PV several times.
It stores count of failed attempts and timestamp of the last attempt in
annotations of the PV.
By default, the controller tries to recycle/delete a PV 3 times in
10 minutes interval. These values are configurable by
kube-controller-manager --pv-recycler-maximum-retry=X --pvclaimbinder-sync-period=Y
arguments.
Public utility methods and JWT parsing, and controller specific logic.
Also remove the coupling between ServiceAccountTokenGetter and the
authenticator class.
The HPA controller had previously used a single Client
object to act as three different Namespacers. To improve
ease of extensibility and to make it clearer what the HPA
controller actually needs to use from the client, it should
use separate Namespacers for each of its needs (Scales, HPAs,
and Events).
This commit makes the HPA metrics client configurable in where
it looks for heapster instead of hard coding it to
"kube-system/heapster". The values of "kube-system/heapster"
are still recorded as constants in the metrics client package
for use as default values.
Default to hardcodes for components that had them, and 5.0 qps, 10 burst
for those that relied on client defaults
Unclear if maybe it'd be better to just assume these are set as part of
the incoming kubeconfig. For now just exposing them as flags since it's
easier for me to manually tweak.