Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Removal of KubeletConfigFile feature gate: Step 1
This feature gate was redundant with the `--config` flag, which already
enables/disables loading Kubelet config from a file.
Since the gate guarded an alpha feature, removing it is not a violation
of our API guidelines.
Some stuff in `kubernetes/test-infra` currently sets the gate,
so removing will be a 3 step process:
1. This PR, which makes the gate a no-op.
2. Stop setting the gate in `kubernetes/test-infra`.
3. Completely remove the gate (this PR will get the release note).
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57973, 57990). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Set pids limit at pod level
**What this PR does / why we need it**:
Add a new Alpha Feature to set a maximum number of pids per Pod.
This is to allow the use case where cluster administrators wish
to limit the pids consumed per pod (example when running a CI system).
By default, we do not set any maximum limit, If an administrator wants
to enable this, they should enable `SupportPodPidsLimit=true` in the
`--feature-gates=` parameter to kubelet and specify the limit using the
`--pod-max-pids` parameter.
The limit set is the total count of all processes running in all
containers in the pod.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#43783
**Special notes for your reviewer**:
**Release note**:
```release-note
New alpha feature to limit the number of processes running in a pod. Cluster administrators will be able to place limits by using the new kubelet command line parameter --pod-max-pids. Note that since this is a alpha feature they will need to enable the "SupportPodPidsLimit" feature.
```
This feature gate was redundant with the `--config` flag, which already
enables/disables loading Kubelet config from a file.
Since the gate guarded an alpha feature, removing it is not a violation
of our API guidelines.
Some stuff in `kubernetes/test-infra` currently sets the gate,
so removing will be a 3 step process:
1. This PR, which makes the gate a no-op.
2. Stop setting the gate in `kubernetes/test-infra`.
3. Completely remove the gate.
Add a new Alpha Feature to set a maximum number of pids per Pod.
This is to allow the use case where cluster administrators wish
to limit the pids consumed per pod (example when running a CI system).
By default, we do not set any maximum limit, If an administrator wants
to enable this, they should enable `SupportPodPidsLimit=true` in the
`--feature-gates=` parameter to kubelet and specify the limit using the
`--pod-max-pids` parameter.
The limit set is the total count of all processes running in all
containers in the pod.
Automatic merge from submit-queue (batch tested with PRs 57746, 57621, 56839, 57464). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix PodCIDR flag: defaults come from the object, not as literal args to the flag function
The defaulter runs on the object before adding flags. Flags should be registered with defaults sourced from this object, so that the defaulter, not the flag var function, determines the canonical default value.
```release-note
NONE
```
The first call to Set will clear the map before adding entries;
subsequent calls will simply append to the map.
This makes it possible to override default values with a command-line
option rather than appending to defaults,
while still allowing the distribution of key-value pairs across
multiple flag invocations.
For example: `--flag "a:hello" --flag "b:again" --flag "b:beautiful"
--flag "c:world"` results in `{"a": ["hello"], "b": ["again",
"beautiful"], "c": ["world"]}`
- Changes the following KubeletConfiguration fields from `string` to
`map[string]string`:
- `EvictionHard`
- `EvictionSoft`
- `EvictionSoftGracePeriod`
- `EvictionMinimumReclaim`
- Adds flag parsing shims to maintain Kubelet's public flags API, while
enabling structured input in the file API.
- Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag
parsing shim living in the kubeletconfig API group, and replaces it
with the `MapStringString` shim introduced in this PR. Flag parsing
shims belong in a common place, not in the kubeletconfig API.
I manually audited these to ensure that this wouldn't cause errors
parsing the command line for syntax that would have previously been
error free (`kubeletconfig.ConfigurationMap` was unique in that it
allowed keys to be provided on the CLI without values. I believe this was
done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag,
which rightfully accepts value-free keys, and that this shim was then
just copied to `kubeletconfig`). Fortunately, the affected fields
(`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect
non-empty strings in the values of the map, and as a result passing the
empty string is already an error. Thus requiring keys shouldn't break
anyone's scripts.
- Updates code and tests accordingly.
Regarding eviction operators, directionality is already implicit in the
signal type (for a given signal, the decision to evict will be made when
crossing the threshold from either above or below, never both). There is
no need to expose an operator, such as `<`, in the API. By changing
`EvictionHard` and `EvictionSoft` to `map[string]string`, this PR
simplifies the experience of working with these fields via the
`KubeletConfiguration` type. Again, flags stay the same.
Other things:
- There is another flag parsing shim, `flags.ConfigurationMap`, from the
shared flag utility. The `NodeLabels` field still uses
`flags.ConfigurationMap`. This PR moves the allocation of the
`map[string]string` for the `NodeLabels` field from
`AddKubeletConfigFlags` to the defaulter for the external
`KubeletConfiguration` type. Flags are layered on top of an internal
object that has undergone conversion from a defaulted external object,
which means that previously the mere registration of flags would have
overwritten any previously-defined defaults for `NodeLabels` (fortunately
there were none).
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix incorrect parameter tip
**What this PR does / why we need it**:
run kubelet set --init-config-dir=xxx, kubelet not work. see the error log need to open KubeletConfigFile feature gates.
But
kubelet --help
--init-config-dir string The Kubelet will look in this directory for the init configuration. The path may be absolute or relative; relative paths start at the Kubelet's current working directory. Omit this argument to use the built-in default configuration values. Presently, you must also enable the `DynamicKubeletConfig` feature gate to pass this flag.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes ##55666
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52367, 53363, 54989, 54872, 54643). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Lift embedded structure out of ManifestURLHeader field
Related: #53833
```release-note
It is now possible to set multiple manifest url headers via the Kubelet's --manifest-url-header flag. Multiple headers for the same key will be added in the order provided. The ManifestURLHeader field in KubeletConfiguration object (kubeletconfig/v1alpha1) is now a map[string][]string, which facilitates writing JSON and YAML files.
```
Revert "Merge pull request #51857 from kubernetes/revert-51307-kc-type-refactor"
This reverts commit 9d27d92420, reversing
changes made to 2e69d4e625.
See original: #51307
We punted this from 1.8 so it could go through an API review. The point
of this PR is that we are trying to stabilize the kubeletconfig API so
that we can move it out of alpha, and unblock features like Dynamic
Kubelet Config, Kubelet loading its initial config from a file instead
of flags, kubeadm and other install tools having a versioned API to rely
on, etc.
We shouldn't rev the version without both removing all the deprecated
junk from the KubeletConfiguration struct, and without (at least
temporarily) removing all of the fields that have "Experimental" in
their names. It wouldn't make sense to lock in to deprecated fields.
"Experimental" fields can be audited on a 1-by-1 basis after this PR,
and if found to be stable (or sufficiently alpha-gated), can be restored
to the KubeletConfiguration without the "Experimental" prefix.
Command line flag API remains the same. This allows ComponentConfig
structures (e.g. KubeletConfiguration) to express the map structure
behind feature gates in a natural way when written as JSON or YAML.
For example:
KubeletConfiguration Before:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates: "DynamicKubeletConfig=true,Accelerators=true"
```
KubeletConfiguration After:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates:
DynamicKubeletConfig: true
Accelerators: true
```
This is part of the move to external cloud providers. Please see
plan detail in issue 50986. This PR covers step #2:
v1.9 - set no cloud provider as the default in kubelet but still allow
opt in for auto-detect
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix --kube-reserved storage key name and add UTs for node allocatable reservation
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #52463
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @jingxu97
Automatic merge from submit-queue (batch tested with PRs 52485, 52443, 52597, 52450, 51971). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Some kubelet flags do not accept their default values
Correct the flags and add a round trip test that ensure these do not
break again in the future.
@deads2k as observed when we tried to turn flags into args.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
remove duplicated import
**What this PR does / why we need it**:
**Which issue this PR fixes** : fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50381, 51307, 49645, 50995, 51523)
Remove deprecated and experimental fields from KubeletConfiguration
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.
I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP (maybe not v1 in 1.8, given how close we are, but definitely in 1.9).
It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
fixes: #51657
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49971, 51357, 51616, 51649, 51372)
Separate feature gates for dynamic kubelet config vs loading from a file
This makes it so these two features can be turned on independently, rather than bundling both under dynamic kubelet config.
fixes: #51664
```release-note
NONE
```
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.
I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP.
It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)
Deprecation warnings for auto detecting cloud providers
**What this PR does / why we need it**:
Adds deprecation warnings for auto detecting cloud providers. As part of the initiative for out-of-tree cloud providers, this feature is conflicting since we're shifting the dependency of kubernetes core into cAdvisor. In the future kubelets should be using `--cloud-provider=external` or no cloud provider at all.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50986
**Special notes for your reviewer**:
NOTE: I still have to coordinate with sig-node and kubernetes-dev to get approval for this deprecation, I'm only opening this PR since we're close to code freeze and it's something presentable.
**Release note**:
```release-note
Deprecate auto detecting cloud providers in kubelet. Auto detecting cloud providers go against the initiative for out-of-tree cloud providers as we'll now depend on cAdvisor integrations with cloud providers instead of the core repo. In the near future, `--cloud-provider` for kubelet will either be an empty string or `external`.
```
The ReadOnlyPort defaulting prevented passing 0 to diable via
the KubeletConfiguraiton struct.
The HealthzPort defaulting prevented passing 0 to disable via the
KubeletConfiguration struct. The documentation also failed to mention
this, but the check is performed in code.
The CAdvisorPort documentation failed to mention that you can pass 0 to
disable.
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.
This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true
Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
Deprecate keep-terminated-pod-volumes
It was discussed and agreed by sig-storage that
this flag causes unnecessary confusion and is hard to keep
synchornized with controller's attach/detach functionality.
Fixes https://github.com/kubernetes/kubernetes/issues/45615
```release-note
keep-terminated-pod-volumes flag on kubelet is deprecated.
```
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Use presence of kubeconfig file to toggle standalone mode
Fixes#40049
```release-note
The deprecated --api-servers flag has been removed. Use --kubeconfig to provide API server connection information instead. The --require-kubeconfig flag is now deprecated. The default kubeconfig path is also deprecated. Both --require-kubeconfig and the default kubeconfig path will be removed in Kubernetes v1.10.0.
```
/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-misc
Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
It was discussed and agreed by sig-storage that
this flag causes unnecessary confusion and is hard to keep
synchornized with controller's attach/detach functionality.
Automatic merge from submit-queue (batch tested with PRs 46972, 42829, 46799, 46802, 46844)
promote tls-bootstrap to beta
last commit of this PR.
Towards https://github.com/kubernetes/kubernetes/issues/46999
```release-note
Promote kubelet tls bootstrap to beta. Add a non-experimental flag to use it and deprecate the old flag.
```
This PR adds the check for local storage request when admitting pods. If
the local storage request exceeds the available resource, pod will be
rejected.
Automatic merge from submit-queue
kubelet: group all container-runtime-specific flags/options into a separate struct
They don't belong in the KubeletConfig.
This addresses #43253
Automatic merge from submit-queue
fixtypo
**What this PR does / why we need it**:
fix typo seperated -> separated
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
Enable shared PID namespace by default for docker pods
**What this PR does / why we need it**: This PR enables PID namespace sharing for docker pods by default, bringing the behavior of docker in line with the other CRI runtimes when used with docker >= 1.13.1.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref #1615
**Special notes for your reviewer**: cc @dchen1107 @yujuhong
**Release note**:
```release-note
Kubernetes now shares a single PID namespace among all containers in a pod when running with docker >= 1.13.1. This means processes can now signal processes in other containers in a pod, but it also means that the `kubectl exec {pod} kill 1` pattern will cause the pod to be restarted rather than a single container.
```
Automatic merge from submit-queue
Delete "hard-coded" default value in flags usage.
**What this PR does / why we need it**:
Some flags of kubernetes components have "hard-coded" default values in their usage info. In fact, [pflag pkg](https://github.com/kubernetes/kubernetes/blob/master/vendor/github.com/spf13/pflag/flag.go#L602-L608) has already added a string `(default value)` automatically in the usage info if the flag is initialized. Then we don't need to hard-code the default value in usage info. After this PR, if we want to update the default value of a flag, we only need to update the flag where it is initialized. `pflag` will update the usage info for us. This will avoid inconsistency.
For example:
Before
```
kubelet -h
...
--node-status-update-frequency duration Specifies how often kubelet posts node status to master. Note: be cautious when changing the constant, it must work with nodeMonitorGracePeriod in nodecontroller. Default: 10s (default 10s)
...
```
After
```
kubelet -h
...
--node-status-update-frequency duration Specifies how often kubelet posts node status to master. Note: be cautious when changing the constant, it must work with nodeMonitorGracePeriod in nodecontroller. (default 10s)
...
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
This PR doesn't delete some "hard-coded" default values because they are not explicitly initialized. We still need to hard-code them to give users friendly info.
```
--allow-privileged If true, allow containers to request privileged mode. [default=false]
```
**Release note**:
```release-note
None
```
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration. This is a preleminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
Automatic merge from submit-queue
kubelet: change image-gc-high-threshold below docker dm.min_free_space
docker dm.min_free_space defaults to 10%, which "specifies the min free space percent in a thin pool require for new device creation to succeed....Whenever a new a thin pool device is created (during docker pull or during container creation), the Engine checks if the minimum free space is available. If sufficient space is unavailable, then device creation fails and any relevant docker operation fails." [1]
This setting is preventing the storage usage to cross the 90% limit. However, image GC is expected to kick in only beyond image-gc-high-threshold. The image-gc-high-threshold has a default value of 90%, and hence GC never triggers. If image-gc-high-threshold is set to a value lower than (100 - dm.min_free_space)%, GC triggers.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1408309
```release-note
changed kubelet default image-gc-high-threshold to 85% to resolve a conflict with default settings in docker that prevented image garbage collection from resolving low disk space situations when using devicemapper storage.
```
@derekwaynecarr @sdodson @rhvgoyal
Automatic merge from submit-queue
Use realistic value for the memory example of kube-reserved and system-reserved
Use realistic value for the memory example of kube-reserved and system-reserved
Currently, kublet help shows the memory example of
kube-reserved and system-reserved as 150G. This 150G is not realistic
value and it leads misconfiguration or confusion. This patch changes
to example value as 500Mi.
Before(same with system-reserved):
```
--kube-reserved value A set of ResourceName=ResourceQuantity (e.g. cpu=200m,memory=150G) pairs that describe resources reserved for kubernetes system components. Currently only cpu and memory are supported. See http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md for more detail. [default=none]
```
After(same with system-reserved):
```
--kube-reserved value A set of ResourceName=ResourceQuantity (e.g. cpu=200m,memory=500Mi) pairs that describe resources reserved for kubernetes system components. Currently only cpu and memory are supported. See http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md for more detail. [default=none]
```
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
enable cgroups tiers and node allocatable enforcement on pods by default.
```release-note
Pods are launched in a separate cgroup hierarchy than system services.
```
Depends on #41753
cc @derekwaynecarr