Rolling back from a broken update with only one replica fails with a
timeout in the existing code.
The problem is that the scale-down logic does not count unavailable
replicas in the old replication controller when calculating how much to
scale down by. With a single replica and a min unavailable of 1, the
updater can never scale the old controller down, which is what produces
the timeout.
The fix is to allow scaling down all unavailable replicas in the old
controller, while still maintaining the min unavailable invariant.
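A minimal sketch of the corrected scale-down calculation, with illustrative
names; the minAvailable floor here stands in for the min unavailable
invariant described above:

    // oldControllerScaleDown computes how many old replicas can be removed.
    // Unavailable replicas are always safe to scale down, because removing
    // them cannot reduce availability any further, so a single broken replica
    // no longer blocks progress.
    func oldControllerScaleDown(oldReplicas, oldAvailable, minAvailable int) int {
        // Replicas that can be removed without dropping below the availability floor.
        byAvailability := oldAvailable - minAvailable
        if byAvailability < 0 {
            byAvailability = 0
        }
        // Unavailable replicas do not count toward availability, so add them all.
        unavailable := oldReplicas - oldAvailable
        decrement := byAvailability + unavailable
        if decrement > oldReplicas {
            decrement = oldReplicas
        }
        return decrement
    }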
The pending codec -> conversion split makes the signatures of Encode
and Decode more complicated. Create a stub helper with the exact
semantics of today's calls and do the simple mechanical refactor here
to reduce the cost of that change.
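A hypothetical sketch of such a stub helper; the Object and Codec types
below are simplified stand-ins, not the real runtime package. Call sites
switch to the helpers now (a purely mechanical change), and only the helper
bodies change when the richer signatures arrive:

    type Object interface{}

    type Codec interface {
        Encode(obj Object) ([]byte, error)
        Decode(data []byte) (Object, error)
    }

    // Encode and Decode preserve today's semantics exactly, so the later
    // signature change is confined to these two functions.
    func Encode(c Codec, obj Object) ([]byte, error) {
        return c.Encode(obj)
    }

    func Decode(c Codec, data []byte) (Object, error) {
        return c.Decode(data)
    }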
Accept a codec as a parameter to the CreateNewControllerFromCurrentController
function. Add tests for performing a rolling update on a container in a
multi-container pod.
A lot of packages use StringSet, but they don't use anything else from
the util package. Moving StringSet into another package will shrink
their dependency trees significantly.
Add an UpdateAcceptor interface to the rolling updater which supports
injecting code to validate the first replica during scale-up. If the
replica is not accepted, the deployment fails. This facilitates canary
checking so that many broken replicas aren't rolled out during an update.
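A minimal sketch of what such an acceptance hook could look like; the names
and the placeholder ReplicationController type are assumptions, not the
exact kubectl API:

    // ReplicationController stands in for the real API object in this sketch.
    type ReplicationController struct {
        Name     string
        Replicas int
    }

    // UpdateAcceptor is consulted after the first new replica is scaled up;
    // a non-nil error aborts the rolling update before more replicas roll out.
    type UpdateAcceptor interface {
        Accept(rc *ReplicationController) error
    }

    // UpdateAcceptorFunc adapts a plain function to the interface.
    type UpdateAcceptorFunc func(rc *ReplicationController) error

    func (f UpdateAcceptorFunc) Accept(rc *ReplicationController) error { return f(rc) }

A caller could, for example, pass an acceptor that probes the first new
replica's health endpoint and returns an error if the probe fails.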
Make the rolling update scale amount configurable as a percentage of the
replica count; a negative value reverses the scale order (down first, then
up) to support in-place deployments.
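A sketch of how the percentage might translate into a scale amount; the
function and names are illustrative, not the real kubectl code:

    // scaleAmount converts a configured percentage into an absolute number of
    // replicas to move each interval. A negative percentage means the old
    // controller is scaled down before the new one is scaled up, so total
    // capacity never grows (in-place deployment).
    func scaleAmount(replicas, updatePercent int) (amount int, scaleDownFirst bool) {
        scaleDownFirst = updatePercent < 0
        if updatePercent < 0 {
            updatePercent = -updatePercent
        }
        amount = (replicas * updatePercent) / 100
        if amount < 1 {
            amount = 1 // always make some progress, even for small replica counts
        }
        return amount, scaleDownFirst
    }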
This commit wires together the graceful delete option for pods
on the Kubelet. When a pod is deleted on the API server, a
grace period is calculated based on
Pod.Spec.TerminationGracePeriodSeconds, the user's provided grace
period, or a default. The grace period can only shrink once set.
The value provided by the user (or the default) is set onto metadata
as DeletionGracePeriodSeconds.

When the Kubelet sees a pod with DeletionTimestamp set, it uses the
value of ObjectMeta.DeletionGracePeriodSeconds as the grace period
sent to Docker. When updating status, if the pod has DeletionTimestamp
set and all containers are terminated, the Kubelet will update the
status one last time and then invoke Delete(pod, grace: 0) to
clean up the pod immediately.
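A sketch of the grace period selection rules described above, written as a
hypothetical helper. The parameter names are placeholders for the user's
requested value, the pod spec's TerminationGracePeriodSeconds, and any
DeletionGracePeriodSeconds already recorded on metadata; the precedence
shown (explicit request over spec over default) and the default value are
assumptions:

    const defaultGracePeriodSeconds int64 = 30 // illustrative default

    func effectiveGracePeriod(requested, specSeconds, current *int64) int64 {
        grace := defaultGracePeriodSeconds
        if specSeconds != nil {
            grace = *specSeconds
        }
        if requested != nil {
            grace = *requested // an explicit request wins over the spec value
        }
        // Once a grace period has been set, it can only shrink.
        if current != nil && *current < grace {
            grace = *current
        }
        return grace
    }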
* Support configurable cleanup policies in RollingUpdater (a sketch of the
policy follows this list). Downstream library consumers don't necessarily
have the same rules for post-deployment cleanup; making the behavior
policy-driven is more flexible.
* Refactor RollingUpdater to accept a config object during Update instead
of a long argument list.
* Add test coverage for cleanup policy.
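A sketch of what the policy-driven configuration might look like; the type,
constant, and field names below are assumptions based on the description,
not necessarily the exact kubectl identifiers:

    type RollingUpdaterCleanupPolicy string

    const (
        // DeleteRollingUpdateCleanupPolicy deletes the old controller after the update.
        DeleteRollingUpdateCleanupPolicy RollingUpdaterCleanupPolicy = "Delete"
        // PreserveRollingUpdateCleanupPolicy leaves the old controller in place.
        PreserveRollingUpdateCleanupPolicy RollingUpdaterCleanupPolicy = "Preserve"
        // RenameRollingUpdateCleanupPolicy deletes the old controller and renames
        // the new controller to take its name.
        RenameRollingUpdateCleanupPolicy RollingUpdaterCleanupPolicy = "Rename"
    )

    // RollingUpdaterConfig replaces the long argument list to Update with a
    // single struct, so new knobs such as CleanupPolicy can be added without
    // breaking callers.
    type RollingUpdaterConfig struct {
        OldName       string
        NewName       string
        CleanupPolicy RollingUpdaterCleanupPolicy
        // interval, timeout, and other update parameters would live here too
    }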
Use custom, narrowly scoped interfaces for client access from the
RollingUpdater and Resizer. This allows for more flexible downstream
integration and unit testing without imposing the burden of implementing
the entire client.Interface for just a handful of methods.
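A sketch of such a narrowly scoped interface; the method set and the
placeholder ReplicationController type are illustrative, not the exact
kubectl definitions:

    // ReplicationController stands in for the real API object in this sketch.
    type ReplicationController struct {
        Name     string
        Replicas int
    }

    // RollingUpdaterClient declares only the calls the updater actually makes,
    // so downstream consumers and tests can satisfy it without implementing
    // the full client.Interface.
    type RollingUpdaterClient interface {
        GetReplicationController(namespace, name string) (*ReplicationController, error)
        UpdateReplicationController(namespace string, rc *ReplicationController) (*ReplicationController, error)
        DeleteReplicationController(namespace, name string) error
    }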
Standardize how our fakes are used so that a test case can use a
simpler mechanism for providing large, complex data sets, as well
as represent queries over time.
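Reusing the placeholder types from the sketch above, a fake in this style
can be loaded with a scripted sequence of responses, so a test expresses
"queries over time" as data rather than bespoke fake logic (the names here
are hypothetical):

    type fakeResponse struct {
        controller *ReplicationController
        err        error
    }

    type fakeRcClient struct {
        responses []fakeResponse // consumed one per call, in order
        calls     int
    }

    func (f *fakeRcClient) GetReplicationController(namespace, name string) (*ReplicationController, error) {
        r := f.responses[f.calls]
        if f.calls+1 < len(f.responses) {
            f.calls++ // stay on the final response once the script is exhausted
        }
        return r.controller, r.err
    }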
When kubectl does rolling updates of replication controllers, retry updates that
fail due to version mismatches (caused by concurrent updates by other clients).
These failed rolling updates were causing intermittent e2e test failures
(e.g. issue 5821).
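A sketch of the retry loop; the helper name, the one-field
ReplicationController stand-in, and the conflictError type are hypothetical
(real code would use the client library's conflict check):

    type ReplicationController struct{ ResourceVersion string }

    // conflictError stands in for the API server's version-mismatch error.
    type conflictError struct{}

    func (conflictError) Error() string { return "object has been modified" }

    // updateWithRetries re-reads the controller and re-applies the mutation
    // whenever an update fails with a conflict, up to a bounded number of tries.
    func updateWithRetries(
        get func() (*ReplicationController, error),
        update func(*ReplicationController) error,
        mutate func(*ReplicationController),
    ) error {
        const maxRetries = 3
        rc, err := get()
        if err != nil {
            return err
        }
        for i := 0; ; i++ {
            mutate(rc)
            err = update(rc)
            if err == nil || i == maxRetries {
                return err
            }
            if _, isConflict := err.(conflictError); !isConflict {
                return err
            }
            // Another client changed the controller concurrently; fetch the
            // latest version and re-apply the change before retrying.
            if rc, err = get(); err != nil {
                return err
            }
        }
    }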