Automatic merge from submit-queue
Add dockershim only mode
This PR added a `experimental-dockershim` hidden flag in kubelet to run dockershim only.
We introduce this flag mainly for cri validation test. In the future we should compile dockershim into another binary.
@yujuhong @feiskyer @xlgao-zju
/cc @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue
Scheduler can recieve its policy configuration from a ConfigMap
**What this PR does / why we need it**: This PR adds the ability to scheduler to receive its policy configuration from a ConfigMap. Before this, scheduler could receive its policy config only from a file. The logic to watch the ConfigMap object will be added in a subsequent PR.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```Add the ability to the default scheduler to receive its policy configuration from a ConfigMap object.
```
Automatic merge from submit-queue
Remove 'beta' from default storage class annotation (storage/util)
**What this PR does / why we need it**:
This is a follow up to: #42991 where I believe this file was overlooked.
It removes `beta` from the default storageclass annotation.
Without this fix you are not able to specify a default storage class like this:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: standard
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
```
because the annotation is ignored in: https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/admission/storageclass/default/admission.go#L129
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
/cc @jsafrane
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration. This is a preleminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
Automatic merge from submit-queue
kubelet: change image-gc-high-threshold below docker dm.min_free_space
docker dm.min_free_space defaults to 10%, which "specifies the min free space percent in a thin pool require for new device creation to succeed....Whenever a new a thin pool device is created (during docker pull or during container creation), the Engine checks if the minimum free space is available. If sufficient space is unavailable, then device creation fails and any relevant docker operation fails." [1]
This setting is preventing the storage usage to cross the 90% limit. However, image GC is expected to kick in only beyond image-gc-high-threshold. The image-gc-high-threshold has a default value of 90%, and hence GC never triggers. If image-gc-high-threshold is set to a value lower than (100 - dm.min_free_space)%, GC triggers.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1408309
```release-note
changed kubelet default image-gc-high-threshold to 85% to resolve a conflict with default settings in docker that prevented image garbage collection from resolving low disk space situations when using devicemapper storage.
```
@derekwaynecarr @sdodson @rhvgoyal
Automatic merge from submit-queue
validate activeDeadlineSeconds in rs/rc
**What this PR does / why we need it**:
if setting activeDeadlineSeconds, deployment will continuously created new pods after old pod dies.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#38684
**Special notes for your reviewer**:
**Release note**:
```release-note
ActiveDeadlineSeconds is validated in workload controllers now, make sure it's not set anywhere (it shouldn't be set by default and having it set means your controller will restart the Pods at some point)
```
Automatic merge from submit-queue
Disable readyReplicas validation for Deployments
Because there is no field in 1.5, when we update to 1.6 and the
controller tries to update the Deployment, it will be denied by
validation because the pre-existing availableReplicas field is greater
than readyReplicas (normally readyReplicas should always be greater or
equal).
Fixes https://github.com/kubernetes/kubernetes/issues/43392
@kubernetes/sig-apps-bugs
Because there is no field in 1.5, when we update to 1.6 and the
controller tries to update the Deployment, it will be denied by
validation because the pre-existing availableReplicas field is greater
than readyReplicas (normally readyReplicas should always be greater or
equal).
Automatic merge from submit-queue
Use Semantic.DeepEqual to compare DaemonSet template on updates
Switch to `Semantic.DeepEqual` when comparing templates on DaemonSet updates, since we can't distinguish between `null` and `[]` in protobuf. This avoids unnecessary DaemonSet pods restarts.
I didn't touch `reflect.DeepEqual` used in controller because it's close to release date, and the DeepEqual in the controller doesn't cause serious issues (except for maybe causing more enqueues than needed).
Fixes#43218
@liggitt @kargakis @lukaszo @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42775, 42991, 42968, 43029)
Remove 'beta' from default storage class annotation
I forgot to update default storage class annotation in my storage.k8s.io/v1beta1 -> v1 PRs. Let's fix it before 1.6 is released.
I consider it as a bugfix, in #40088 I already updated the release notes to include non-beta annotation `storageclass.kubernetes.io/is-default-class`
```release-note
NONE
```
@kubernetes/sig-storage-pr-reviews
@msau42, please help with merging.
Automatic merge from submit-queue
Add DaemonSet templateGeneration validation and tests, and fix a bunch of validation test errors
For DaemonSet update:
1. Validate that templateGeneration is increased when and only when template is changed
1. Validate that templateGeneration is never decreased
1. Added validation tests for templateGeneration
1. Fix a bunch of errors in validate tests
- fake tests: almost all validation test error cases failed on "missing resource version", "name changes", "missing update strategy", "selector/template labels mismatch", not on the real validation we wanted to test
- some error cases should be success cases
@kargakis @lukaszo @kubernetes/sig-apps-bugs
*I've verified locally that all DaemonSet e2e tests pass with this change.*
Automatic merge from submit-queue (batch tested with PRs 42786, 42553)
Updated auto generated protobuf codes.
Generated by `./hack/update-generated-protobuf-dockerized.sh` in Mac.
1. Validate that templateGeneration is increased when and only when template is changed
2. Validate that templateGeneration is never decreased
3. Added validation tests for templateGeneration
4. Fix a bunch of errors in validate tests, for example, all validation test error cases failed
on lack of resource version, or on name changes, not on the real validation we wanted to test
Automatic merge from submit-queue (batch tested with PRs 42692, 42169, 42173)
DaemonSet: Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings DaemonSet into full compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
This ensures that DaemonSet does not fight with other controllers over control of Pods.
**Special notes for your reviewer**:
**Release note**:
```release-note
DaemonSet now respects ControllerRef to avoid fighting over Pods.
```
cc @erictune @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42692, 42169, 42173)
Add pprof trace support
Add support for `/debug/pprof/trace`
Can wait for master to reopen for 1.7.
cc @smarterclayton @wojtek-t @gmarek @timothysc @jeremyeder @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42443, 38924, 42367, 42391, 42310)
Dell EMC ScaleIO Volume Plugin
**What this PR does / why we need it**
This PR implements the Kubernetes volume plugin to allow pods to seamlessly access and use data stored on ScaleIO volumes. [ScaleIO](https://www.emc.com/storage/scaleio/index.htm) is a software-based storage platform that creates a pool of distributed block storage using locally attached disks on every server. The code for this PR supports persistent volumes using PVs, PVCs, and dynamic provisioning.
You can find examples of how to use and configure the ScaleIO Kubernetes volume plugin in [examples/volumes/scaleio/README.md](examples/volumes/scaleio/README.md).
**Special notes for your reviewer**:
To facilitate code review, commits for source code implementation are separated from other artifacts such as generated, docs, and vendored sources.
```release-note
ScaleIO Kubernetes Volume Plugin added enabling pods to seamlessly access and use data stored on ScaleIO volumes.
```
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
enable cgroups tiers and node allocatable enforcement on pods by default.
```release-note
Pods are launched in a separate cgroup hierarchy than system services.
```
Depends on #41753
cc @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
kubelet: enable qos-level memory limits
```release-note
Experimental support to reserve a pod's memory request from being utilized by pods in lower QoS tiers.
```
Enables the QoS-level memory cgroup limits described in https://github.com/kubernetes/community/pull/314
**Note: QoS level cgroups have to be enabled for any of this to take effect.**
Adds a new `--experimental-qos-reserved` flag that can be used to set the percentage of a resource to be reserved at the QoS level for pod resource requests.
For example, `--experimental-qos-reserved="memory=50%`, means that if a Guaranteed pod sets a memory request of 2Gi, the Burstable and BestEffort QoS memory cgroups will have their `memory.limit_in_bytes` set to `NodeAllocatable - (2Gi*50%)` to reserve 50% of the guaranteed pod's request from being used by the lower QoS tiers.
If a Burstable pod sets a request, its reserve will be deducted from the BestEffort memory limit.
The result is that:
- Guaranteed limit matches root cgroup at is not set by this code
- Burstable limit is `NodeAllocatable - Guaranteed reserve`
- BestEffort limit is `NodeAllocatable - Guaranteed reserve - Burstable reserve`
The only resource currently supported is `memory`; however, the code is generic enough that other resources can be added in the future.
@derekwaynecarr @vishh