Since the HPA controller pulls information from an external source that
makes no guarantees about consistency, it's possible for the HPA
to get into an infinite update loop -- if the metrics change with
every query, the HPA controller will run it's normal reconcilation,
post a status update, see that status update itself, fetch new metrics,
and if those metrics are different, post another status update, and
repeat. This can lead to continuously updating a single HPA.
By rate-limiting each HPA to once per sync interval, we prevent this
from happening.
Conversions can mutate the underlying object (and ours were).
Make a deepcopy before our first conversion at the very start
of the reconciler method in order to avoid mutating the shared
informer cache during conversion.
Fixes#41768
This commit converts the HPA controller over to using the new version of
the HorizontalPodAutoscaler object found in autoscaling/v2alpha1. Note
that while the autoscaler will accept requests for object metrics, the
scale client will return an error on attempts to get object metrics
(since that requires the new custom metrics API, which is not yet
implemented).
This also enables the HPA object in v2alpha1 as a retrievable API
version by default.
Automatic merge from submit-queue
rescale immediately if the basic constraints are not satisfied
refactor reconcileAutoscaler.
If the basic constraints are not satisfied, we should rescale the target ref immediately.
Currently, the HPA considers unready pods the same as ready pods when
looking at their CPU and custom metric usage. However, pods frequently
use extra CPU during initialization, so we want to consider them
separately.
This commit causes the HPA to consider unready pods as having 0 CPU
usage when scaling up, and ignores them when scaling down. If, when
scaling up, factoring the unready pods as having 0 CPU would cause a
downscale instead, we simply choose not to scale. Otherwise, we simply
scale up at the reduced amount caculated by factoring the pods in at
zero CPU usage.
The effect is that unready pods cause the autoscaler to be a bit more
conservative -- large increases in CPU usage can still cause scales,
even with unready pods in the mix, but will not cause the scale factors
to be as large, in anticipation of the new pods later becoming ready and
handling load.
Similarly, if there are pods for which no metrics have been retrieved,
these pods are treated as having 100% of the requested metric when
scaling down, and 0% when scaling up. As above, this cannot change the
direction of the scale.
This commit also changes the HPA to ignore superfluous metrics -- as
long as metrics for all ready pods are present, the HPA we make scaling
decisions. Currently, this only works for CPU. For custom metrics, we
cannot identify which metrics go to which pods if we get superfluous
metrics, so we abort the scale.
Automatic merge from submit-queue
Additional go vet fixes
Mostly:
- pass lock by value
- bad syntax for struct tag value
- example functions not formatted properly
Here are a list of changes along with an explanation of how they work:
1. Add a new string field called TargetSelector to the external version of
extensions Scale type (extensions/v1beta1.Scale). This is a serialized
version of either the map-based selector (in case of ReplicationControllers)
or the unversioned.LabelSelector struct (in case of Deployments and
ReplicaSets).
2. Change the selector field in the internal Scale type (extensions.Scale) to
unversioned.LabelSelector.
3. Add conversion functions to convert from two external selector fields to a
single internal selector field. The rules for conversion are as follows:
i. If the target resource that this scale targets supports LabelSelector
(Deployments and ReplicaSets), then serialize the LabelSelector and
store the string in the TargetSelector field in the external version
and leave the map-based Selector field as nil.
ii. If the target resource only supports a map-based selector
(ReplicationControllers), then still serialize that selector and
store the serialized string in the TargetSelector field. Also,
set the the Selector map field in the external Scale type.
iii. When converting from external to internal version, parse the
TargetSelector string into LabelSelector struct if the string isn't
empty. If it is empty, then check if the Selector map is set and just
assign that map to the MatchLabels component of the LabelSelector.
iv. When converting from internal to external version, serialize the
LabelSelector and store it in the TargetSelector field. If only
the MatchLabel component is set, then also copy that value to
the Selector map field in the external version.
4. HPA now just converts the LabelSelector field to a Selector interface
type to list the pods.
5. Scale Get and Update etcd methods for Deployments and ReplicaSets now
return extensions.Scale instead of autoscaling.Scale.
6. Consequently, SubresourceGroupVersion override and is "autoscaling"
enabled check is now removed from pkg/master/master.go
7. Other small changes to labels package, fuzzer and LabelSelector
helpers to piece this all together.
8. Add unit tests to HPA targeting Deployments and ReplicaSets.
9. Add an e2e test to HPA targeting ReplicaSets.
The HPA controller had previously used a single Client
object to act as three different Namespacers. To improve
ease of extensibility and to make it clearer what the HPA
controller actually needs to use from the client, it should
use separate Namespacers for each of its needs (Scales, HPAs,
and Events).