Automatic merge from submit-queue
controller-manager support number of garbage collector workers to be configurable
The number of garbage collector workers of controller-manager is a fixed value 5 now, make it configurable should more properly
Automatic merge from submit-queue
kube-controller-manager: Add configure-cloud-routes option
This allows kube-controller-manager to allocate CIDRs to nodes (with
allocate-node-cidrs=true), but will not try to configure them on the
cloud provider, even if the cloud provider supports Routes.
The default is configure-cloud-routes=true, and it will only try to
configure routes if allocate-node-cidrs is also configured, so the
default behaviour is unchanged.
This is useful because on AWS the cloud provider configures routes by
setting up VPC routing table entries, but there is a limit of 50
entries. So setting configure-cloud-routes on AWS would allow us to
continue to allocate node CIDRs as today, but replace the VPC
route-table mechanism with something not limited to 50 nodes.
We can't just turn off the cloud-provider entirely because it also
controls other things - node discovery, load balancer creation etc.
Fix#25602
This allows kube-controller-manager to allocate CIDRs to nodes (with
allocate-node-cidrs=true), but will not try to configure them on the
cloud provider, even if the cloud provider supports Routes.
The default is configure-cloud-routes=true, and it will only try to
configure routes if allocate-node-cidrs is also configured, so the
default behaviour is unchanged.
This is useful because on AWS the cloud provider configures routes by
setting up VPC routing table entries, but there is a limit of 50
entries. So setting configure-cloud-routes on AWS would allow us to
continue to allocate node CIDRs as today, but replace the VPC
route-table mechanism with something not limited to 50 nodes.
We can't just turn off the cloud-provider entirely because it also
controls other things - node discovery, load balancer creation etc.
Fix#25602
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
Automatic merge from submit-queue
Refactor persistent volume controller
Here is complete persistent controller as designed in https://github.com/pmorie/pv-haxxz/blob/master/controller.go
It's feature complete and compatible with current binder/recycler/provisioner. No new features, it *should* be much more stable and predictable.
Testing
--
The unit test framework is quite complicated, still it was necessary to reach reasonable coverage (78% in `persistentvolume_controller.go`). The untested part are error cases, which are quite hard to test in reasonable way - sure, I can inject a VersionConflictError on any object update and check the error bubbles up to appropriate places, but the real test would be to run `syncClaim`/`syncVolume` again and check it recovers appropriately from the error in the next periodic sync. That's the hard part.
Organization
---
The PR starts with `rm -rf kubernetes/pkg/controller/persistentvolume`. I find it easier to read when I see only the new controller without old pieces scattered around.
[`types.go` from the old controller is reused to speed up matching a bit, the code looks solid and has 95% unit test coverage].
I tried to split the PR into smaller patches, let me know what you think.
~~TODO~~
--
* ~~Missing: provisioning, recycling~~.
* ~~Fix integration tests~~
* ~~Fix e2e tests~~
@kubernetes/sig-storage
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24331)
<!-- Reviewable:end -->
Fixes#15632
Automatic merge from submit-queue
update controllers watching all pods to share an informer
This plumbs the shared pod informer through the various controllers to avoid duplicated watches.
Volume names have now format <cluster-name>-dynamic-<pv-name>.
pv-name is guaranteed to be unique in Kubernetes cluster, adding
<cluster-name> ensures we don't conflict with any running cluster
in the cloud project (kube-controller-manager --cluster-name=XXX).
'kubernetes' is the default cluster name.
I can't revert with github which says "Sorry, this pull request couldn’t be
reverted automatically. It may have already been reverted, or the content may
have changed since it was merged."
Reverts commit: 0c191e787b
Recycle controller tries to recycle or delete a PV several times.
It stores count of failed attempts and timestamp of the last attempt in
annotations of the PV.
By default, the controller tries to recycle/delete a PV 3 times in
10 minutes interval. These values are configurable by
kube-controller-manager --pv-recycler-maximum-retry=X --pvclaimbinder-sync-period=Y
arguments.
Public utility methods and JWT parsing, and controller specific logic.
Also remove the coupling between ServiceAccountTokenGetter and the
authenticator class.
The HPA controller had previously used a single Client
object to act as three different Namespacers. To improve
ease of extensibility and to make it clearer what the HPA
controller actually needs to use from the client, it should
use separate Namespacers for each of its needs (Scales, HPAs,
and Events).
This commit makes the HPA metrics client configurable in where
it looks for heapster instead of hard coding it to
"kube-system/heapster". The values of "kube-system/heapster"
are still recorded as constants in the metrics client package
for use as default values.
Default to hardcodes for components that had them, and 5.0 qps, 10 burst
for those that relied on client defaults
Unclear if maybe it'd be better to just assume these are set as part of
the incoming kubeconfig. For now just exposing them as flags since it's
easier for me to manually tweak.
Since pflag can handle net.IPNet arguements use that code. This means
that our code no longer has casts back and forth and just natively uses
net.IPNet.
pflag can handle IP addresses so use the pflag code instead of doing it
ourselves. This means our code just uses net.IP and we don't have all of
the useless casting back and forth!
Currently setting the `--allocate-node-cidrs` flag to true with an
empty cloud provider causes the kube-controller-manager to crash during
startup.
Fix the issue by checking for an empty cloud provider before setting
up route management on the cloud provider. This change introduces a
change in behavior. The kube-controller-manager now supports allocating
pod CIDRs without a cloud provider. This means users must manage routes
through some other mechanism.
The controller manager logs a warning if `--allocate-node-cidrs` is set,
but not a cloud provider:
```
I0725 17:10:41.587888 43185 plugins.go:70] No cloud provider specified.
I0725 17:10:41.588036 43185 nodecontroller.go:114] Sending events to api server.
E0725 17:10:41.588122 43185 controllermanager.go:201] Failed to start service controller: ServiceController should not be run without a cloudprovider.
W0725 17:10:41.588136 43185 controllermanager.go:213] allocate-node-cidrs is set, but no cloud provider specified. Will not manage routes.
E0725 17:10:41.589703 43185 nodecontroller.go:187] Error monitoring node status: Get http://127.0.0.1:8080/api/v1/nodes: dial tcp 127.0.0.1
```
Fixes#11866
This should ensure all load balancers get deleted even if a reordering of
watch events causes us to strand one after its service has been deleted,
because the sync will notice that the service controller's cache has a
service in it that no longer exists in the apiserver.
It could still leak in the case that the controller manager is killed
between when it leaks something and the sync runs, but this should
improve things.
--node-milli-cpu
--node-memory
--machines
--minion-regexp
--sync-nodes
Remove the following flags from the standalon kubernetes binary:
--node-milli-cpu
--node-memory
- Delete nodes when they are no longer ready and don't exist in the
cloud provider.
- Label each node with it's hostname.
- Add flag to skip node registration.
- Add a test for registering an existing node.
--master option still supported.
--kubeconfig option added to kube-proxy,
kube-scheduler, and kube-controller-manager
binaries.
Kube-proxy now always makes some kind of API
source, since that is its only kind of config.
Warn if it is using a default client, which probably won't work.
Uses the clientcmd builder.
external load balancers up-to-date based on the service's specs, using
the new DeltaFIFO watch queue class. Remove the old registry REST
handler code for creating/updating/deleting load balancers.
Also clean up a bunch of the GCE cloudprovider code related to load balancers.
It currently is impossible to use two healthz handlers on different
ports in the same process. This removes the global variables in favor
of requiring the consumer to specify all health checks up front.