Automatic merge from submit-queue
Corrected a typo in scheduler factory.go.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 39945, 39601)
bugfix for PodToleratesNodeTaints
`PodToleratesNodeTaints`predicate func should return true if pod has no toleration annotations and node's taint effect is `PreferNoSchedule`
Automatic merge from submit-queue
genericapiserver: cut off pkg/serviceaccount dependency
**Blocked** by pkg/api/validation/genericvalidation to be split up and moved into apimachinery.
Automatic merge from submit-queue (batch tested with PRs 39948, 39997)
Fix ScheduledJob -> CronJob rename leftovers
I found a few leftovers from the rename I did some time ago.
@kubernetes/sig-apps-misc ptal
Automatic merge from submit-queue
Move pkg/api/rest into genericapiserver
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Do not list CronJob unmet starting times beyond deadline
**What this PR does / why we need it**:
See #36311. `getRecentUnmetScheduleTimes` gives up after 100 unmet times to avoid wasting too much CPU or memory generating all the times, as it generates them sequentially.
When concurrency is forbidden, this is conceptually un-necessary: we only need the last unmet start time. This suggests that when concurrency is forbidden, we could generate times by going backward in time from now. This is not very practical as CronJob currently relies on a package that only provides `Next` and no `Prev`. Hand-cooking a `Prev` does not seem like a good idea. I could submit a PR to the cron library to add a `Prev` method, and use that when concurrency is forbidden through something like `getLastUnmetScheduleTime`. This would be `O(1)` and there would be no limit involved.
(edit: actually, even for the other concurrency settings, we only start the last unmet start times -- there is a `TODO` in the controller to actually start all of them, but that is not implemented at the moment. This means the solution would apply, at least temporarily, to all concurrency settings).
cc @soltysh what do you think?
In the meantime, I would suggest to do something simple. Currently, the user has no way to configure anything to ensure that his CronJob will not get stuck if one job takes more that 100 unmet times.
`getRecentUnmetScheduleTimes` starts with an initial time corresponding to the last start (or to the creation of the CronJob, if nothing has started yet). However, when `StartingDeadlineSeconds` is set, the controller will not start anything that is older than the deadline, so if the last start is way beyond the deadline, we are generating potentially lots of unmet start times that will not be considered by the scheduler for scheduling anyway.
Consider a job running every minute, where the last instance has taken 120 minutes. This means there are more than 100 unmet times when we start counting from the last start time.
**The PR makes `getRecentUnmetScheduleTimes` only consider times that do not fall beyond the deadline.** Here, the CronJob can be configured with a `StartingDeadlineSeconds` of, say, 10 minutes. After the 120min job has run, `getRecentUnmetScheduleTimes` will only consider the times in the last 10 minutes from now, and will not get stuck.
As a side note on the max. number of unmet times to use as limits in terms of CPU used by the controller: I have run a quick benchmark on my i7 mac. Schedules corresponding to "once a week" tend to be more expensive to generate unmet times for. Just FYI.
```
+--------------+---------------+--------------+
| SCHEDULE | MISSED STARTS | TIMING |
+--------------+---------------+--------------+
| */1 * * * ? | 100 | 383.645µs |
| */30 * * * ? | 100 | 354.765µs |
| 30 1 * * ? | 100 | 1.065124ms |
| 30 1 * * 0 | 100 | 1.80034ms |
| */1 * * * ? | 500 | 1.341365ms |
| */30 * * * ? | 500 | 1.814441ms |
| 30 1 * * ? | 500 | 8.475012ms |
| 30 1 * * 0 | 500 | 10.020613ms |
| */1 * * * ? | 1000 | 2.551697ms |
| */30 * * * ? | 1000 | 4.075813ms |
| 30 1 * * ? | 1000 | 17.674945ms |
| 30 1 * * 0 | 1000 | 19.149324ms |
| */1 * * * ? | 10000 | 25.725531ms |
| */30 * * * ? | 10000 | 87.520022ms |
| 30 1 * * ? | 10000 | 174.29216ms |
| 30 1 * * 0 | 10000 | 196.565748ms |
+--------------+---------------+--------------+
```
using
```.go
package main
import (
"fmt"
"time"
"os"
"strconv"
"github.com/robfig/cron"
"github.com/olekukonko/tablewriter"
)
func timeSchedule(schedule string, iterations int) (time.Duration) {
sched, err := cron.ParseStandard(schedule)
if err != nil {
panic(fmt.Sprintf("Unparseable schedule: %s", err))
}
start := time.Now()
t := time.Now()
for i := 1; i <= iterations; i++ {
t = sched.Next(t)
}
return time.Since(start)
}
func main() {
table := tablewriter.NewWriter(os.Stdout)
table.SetHeader([]string{"Schedule", "Missed starts", "Timing"})
schedules := []string{"*/1 * * * ?", "*/30 * * * ?", "30 1 * * ?", "30 1 * * 0"}
iteration_nums := []int{100, 500, 1000, 10000}
for _, iterations := range iteration_nums {
for _, schedule := range schedules {
table.Append([]string{schedule,
strconv.Itoa(iterations),
timeSchedule(schedule, iterations).String()})
}
}
table.Render()
}
```
**Which issue this PR fixes**: fixes#36311
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 37680, 39968)
Update Owners for Scheduler
Update Owners file for scheduler component to spread the reviews around.
/cc @davidopp per previous sig-mtg.
Automatic merge from submit-queue
Report the Pod name and namespace when kubelet fails to sync the container
This helps debugging problems with SELinux (and other problems related to the Docker failed to run the container) as currently only the UUID of the Pod is reported:
```
Error syncing pod 670f607d-b5a8-11a4-b673-005056b7468b, skipping: failed to "StartContainer" for "deployment" with RunContainerError: "runContainer: Error response from daemon: Relabeling content in /usr is not allowed."
```
Here it would be useful to know what pod in which namespace is trying to mount the "/usr".
Automatic merge from submit-queue
Added validation for API server's 'apiserver-count' flag.
Added validation for API server's 'apiserver-count' flag. The value of this flag should be a positive number, otherwise, will cause error while reconciling endpoints in MasterCountEndpointsReconciler.
Fixed#38143
Automatic merge from submit-queue
move name generation to generic api server storage helpers
Move name generation to the genericapiserver since only the server needs to know about it.
@kubernetes/sig-api-machinery-misc @sttts
Automatic merge from submit-queue
kubeadm: must lower-case token portion used in DNS label.
**What this PR does / why we need it**: In Kubernetes, DNS labels must be lower-case. `kubeadm` doesn't care when creating certain objects through the API. This PR fixes that erroneous behavior.
**Which issue this PR fixes**: fixes https://github.com/kubernetes/kubeadm/issues/104
**Special notes for your reviewer**: /cc @luxas @mikedanese @dgoodwin
Automatic merge from submit-queue (batch tested with PRs 39806, 39887, 39401)
refactor delete to remove cobra dependency
FYI. As part of CLI Q1 roadmap, we would like to reduce the dependency of Cobra from actual commands implementations. In this PR, I tried to refactor delete command to achieve this. @kubernetes/sig-cli-misc a quick review is quite welcome, and I am just working on more PRs.
Automatic merge from submit-queue (batch tested with PRs 39806, 39887, 39401)
export list of user resources
This patch exports the list of "userResources" found in
`pkg/kubectl/cmd/util/shortcut_resmapper.go` to allow its use in
external packages and clients.
Related downstream PR: https://github.com/openshift/origin/pull/12147
**Release note**:
```release-note
release-note-none
```
cc @deads2k
Automatic merge from submit-queue
fixup some apimachinery packages
Moves the meta/v1/validation to apimachinery (since its logically shared) and moves the funny `apimachinery/pkg/genericapiserver/openapi/common` to `apimachinery/pkg/openapi`.
@kubernetes/sig-api-machinery-misc
Automatic merge from submit-queue
add patch RS to deployment controller
Found in http://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/2841/artifacts/bootstrap-e2e-master/, `RBAC DENY: user "system:serviceaccount:kube-system:deployment-controller" groups [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] cannot "patch" on "replicasets.extensions/" in namespace "e2e-tests-deployment-3rj5g"
`
@kubernetes/sig-auth-misc
Automatic merge from submit-queue
Give replicaset controller patch permission on pods
Needed for AdoptPod/ReleasePod
Fixes denials seen in autoscaling test log:
`RBAC DENY: user "system:serviceaccount:kube-system:replicaset-controller" groups [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] cannot "patch" on "pods./"`
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
genericapiserver: cut off pkg/apis/extensions and pkg/storage dependencies
Move BuildDefaultStorageFactory to kubeapiserver.
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
Remove fluentd buffers if fluentd is stuck
Fluentd now stores its buffers on disk for the resiliency. However, if buffer is corrupted, fluentd will be restarting forever.
Following change will make fluentd liveness probe delete buffers if fluentd is stuck for more than X minutes (15 by default).
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
Add optional per-request context to restclient
**What this PR does / why we need it**: It adds per-request contexts to restclient's API, and uses them to add timeouts to all proxy calls in the e2e tests. An entire e2e shouldn't hang for hours on a single API call.
**Which issue this PR fixes**: #38305
**Special notes for your reviewer**:
This adds a feature to the low-level rest client request feature that is entirely optional. It doesn't affect any requests that don't use it. The api of the generated clients does not change, and they currently don't take advantage of this.
I intend to patch this in to 1.5 as a mostly test only change since it's not going to affect any controller, generated client, or user of the generated client.
cc @kubernetes/sig-api-machinery
cc @saad-ali