- Scale down based on a custom metric was flaking. Increase the target
value of the metric.
- Scale down based on CPU was flaking during stabilization. Increase the
stabilization tolerance (the flaking was caused by the resource consumer
using more CPU than requested); see the sketch after this list.
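For context, a minimal sketch of the tolerance check both bullets are working around, with illustrative names (in the real controller this logic lives in the replica calculator, and the tolerance defaults to 0.1):

```go
package sketch

import "math"

// withinTolerance reports whether the HPA would hold the current replica
// count: the usage ratio (observed / target) must stay within a band
// around 1.0. Names are illustrative; in the real controller this check
// lives in the replica calculator, and the tolerance defaults to 0.1.
func withinTolerance(observedMilliCPU, targetMilliCPU, tolerance float64) bool {
	return math.Abs(observedMilliCPU/targetMilliCPU-1.0) <= tolerance
}
```

With a 500m request, a consumer burning 560m yields a ratio of 1.12, just outside the default band, so the replica count moves even though the test expects a steady state. Raising the metric's target value and widening the tolerated band both restore that margin.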
I increased the request from 500m CPU to 1 CPU in
https://github.com/kubernetes/kubernetes/pull/70125, but this interacts
poorly with other e2e tests (higher requests mean pods fail to schedule).
So revert that change.
The resource consumer might use slightly more CPU than requested. That
sometimes caused HPA to increase the size of deployments during e2e
tests. Deflake the tests by:
- Scaling up CPU requests in those tests. The resource consumer may go a
fixed number of milli-CPU-seconds above target, so higher requests make
the tests less sensitive to that overshoot.
- On scale down, consuming CPU in the middle between what would generate
a recommendation of the expected size and one pod fewer, instead of
right on the edge between expected and expected + 1; see the sketch
below.
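A minimal sketch of that midpoint arithmetic, assuming the recommendation is ceil(totalUsage / perPodTarget); the function and names are illustrative, not the test's actual helpers:

```go
package sketch

// scaleDownUsage returns the total milli-CPU the resource consumer should
// burn so the HPA recommendation lands exactly on expectedPods with
// maximal margin. Assuming recommendation = ceil(totalUsage/perPodTarget),
// any usage in (perPodTarget*(expectedPods-1), perPodTarget*expectedPods]
// recommends expectedPods; the midpoint of that window is
// perPodTarget*(expectedPods - 0.5), leaving half a pod's target of
// headroom on each side instead of sitting on the expected/expected+1 edge.
func scaleDownUsage(perPodTargetMilli, expectedPods int) int {
	return perPodTargetMilli*expectedPods - perPodTargetMilli/2
}
```

For a 500m per-pod target and an expected size of 3 this consumes 1250m; anything in (1000m, 1500m] still recommends 3 pods, so a fixed overshoot of up to 250m no longer flips the result.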
Some variables were int32 but always cast to int before use. Make them
int.
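Illustratively (hypothetical variable, not the actual code):

```go
package sketch

func example() {
	// Before: declared int32, so every use needed a cast to int.
	var count32 int32 = 3
	names := make([]string, int(count32))

	// After: declared int, the casts disappear.
	count := 3
	names = make([]string, count)
	_ = names
}
```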
Previously, the HPA controller ignored APIVersion when resolving the
scale subresource for a kind, meaning if it was set incorrectly in the
HPA's scaleTargetRef, it would not matter. This was the case for
several of the HPA e2e tests.
Since the polymorphic scale client merged, APIVersion now matters. This
updates the HPA e2es to care about APIVersion by passing the kind as a
full GroupVersionKind rather than just a string.
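A minimal sketch of the shape of that change; the helper and names here are hypothetical, not the actual e2e utilities:

```go
package sketch

import (
	autoscalingv1 "k8s.io/api/autoscaling/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// scaleTargetRef builds the HPA target reference from a full
// GroupVersionKind, so the APIVersion the polymorphic scale client
// resolves against is explicit rather than guessed from a bare kind string.
func scaleTargetRef(gvk schema.GroupVersionKind, name string) autoscalingv1.CrossVersionObjectReference {
	return autoscalingv1.CrossVersionObjectReference{
		APIVersion: gvk.GroupVersion().String(), // e.g. "apps/v1"
		Kind:       gvk.Kind,
		Name:       name,
	}
}
```

A caller would pass something like `schema.GroupVersionKind{Group: "apps", Version: "v1", Kind: "Deployment"}` together with the target's name, instead of the bare string "Deployment".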