Automatic merge from submit-queue
Retry job update after failure to prevent modification conflict
This fixes#34585 flake.
@janetkuo || @kubernetes/sig-apps ptal
I've been getting too many emails recently wrt to that issue, so I wanted to "clean" my inbox a bit 😉
Automatic merge from submit-queue
Speed up job's e2e when waiting for failure
**What this PR does / why we need it**:
Job controller synchronizes objects only when job itself or underlying pod changes. Or, when full resync is performed once 10 mins. This leads e2e test to unnecessarily wait that longer timeout, sometimes at least. I've added job modification action which triggers resync, if the job wasn't terminated within shorter period of time.
@ixdy ptal
@janetkuo @erictune fyi
Job controller synchronizes objects only when job itself or underlying pod
changes. Or, when full resync is performed once 10 mins. This leads e2e test
to unnecessarily wait that longer timeout, sometimes at least. I've added job
modification action which triggers resync, if the job wasn't terminated within
shorter period of time.
Added selector generation to Job's
strategy.Validate, right before validation.
Can't do in defaulting since UID is not known.
Added a validation to Job to ensure that the generated
labels and selector are correct when generation was requested.
This happens right after generation, but validation is in a better
place to return an error.
Adds "manualSelector" field to batch/v1 Job to control selector generation.
Adds same field to extensions/__internal. Conversion between those two
is automatic.
Adds "autoSelector" field to extensions/v1beta1 Job. Used for storing batch/v1 Jobs
- Default for v1 is to do generation.
- Default for v1beta1 is to not do it.
- In both cases, unset == false == do the default thing.
Release notes:
Added batch/v1 group, which contains just Job, and which is the next
version of extensions/v1beta1 Job.
The changes from the previous version are:
- Users no longer need to ensure labels on their pod template are unique to the enclosing
job (but may add labels as needed for categorization).
- In v1beta1, job.spec.selector was defaulted from pod labels, with the user responsible for uniqueness.
In v1, a unique label is generated and added to the pod template, and used as the selector (other
labels added by user stay on pod template, but need not be used by selector).
- a new field called "manualSelector" field exists to control whether the new behavior is used,
versus a more error-prone but more flexible "manual" (not generated) seletor. Most users
will not need to use this field and should leave it unset.
Users who are creating extensions.Job go objects and then posting them using the go client
will see a change in the default behavior. They need to either stop providing a selector (relying on
selector generation) or else specify "spec.manualSelector" until they are ready to do the former.
Makes number of failures per pod fixed at 1, for the RestartOnFailure
case, which prevents Kubelet restart backoff, which causes test timeout.
For RestartNever tests, it keeps using the random success/failure.
Fixes#15389.
Renables previously flaky e2e.