Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)
Deprecation warnings for auto detecting cloud providers
**What this PR does / why we need it**:
Adds deprecation warnings for auto detecting cloud providers. As part of the initiative for out-of-tree cloud providers, this feature is conflicting since we're shifting the dependency of kubernetes core into cAdvisor. In the future kubelets should be using `--cloud-provider=external` or no cloud provider at all.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50986
**Special notes for your reviewer**:
NOTE: I still have to coordinate with sig-node and kubernetes-dev to get approval for this deprecation, I'm only opening this PR since we're close to code freeze and it's something presentable.
**Release note**:
```release-note
Deprecate auto detecting cloud providers in kubelet. Auto detecting cloud providers go against the initiative for out-of-tree cloud providers as we'll now depend on cAdvisor integrations with cloud providers instead of the core repo. In the near future, `--cloud-provider` for kubelet will either be an empty string or `external`.
```
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)
Dynamic Flexvolume plugin discovery, probing with filesystem watch.
**What this PR does / why we need it**: Enables dynamic Flexvolume plugin discovery. This model uses a filesystem watch (fsnotify library), which notifies the system that a probe is necessary only if something changes in the Flexvolume plugin directory.
This PR uses the dependency injection model in https://github.com/kubernetes/kubernetes/pull/49668.
**Release Note**:
```release-note
Dynamic Flexvolume plugin discovery. Flexvolume plugins can now be discovered on the fly rather than only at system initialization time.
```
/sig-storage
/assign @jsafrane @saad-ali
/cc @bassam @chakri-nelluri @kokhang @liggitt @thockin
Automatic merge from submit-queue (batch tested with PRs 50889, 51347, 50582, 51297, 51264)
Change eviction manager to manage one single local storage resource
**What this PR does / why we need it**:
We decided to manage one single resource name, eviction policy should be modified too.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #50818
**Special notes for your reviewer**:
**Release note**:
```release-note
Change eviction manager to manage one single local ephemeral storage resource
```
/assign @jingxu97
The ReadOnlyPort defaulting prevented passing 0 to diable via
the KubeletConfiguraiton struct.
The HealthzPort defaulting prevented passing 0 to disable via the
KubeletConfiguration struct. The documentation also failed to mention
this, but the check is performed in code.
The CAdvisorPort documentation failed to mention that you can pass 0 to
disable.
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.
This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)
Fail on swap enabled and deprecate experimental-fail-swap-on flag
**What this PR does / why we need it**:
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true
Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#34726
**Special notes for your reviewer**:
**Release note**:
```release-note
Kubelet will by default fail with swap enabled from now on. The experimental flag "--experimental-fail-swap-on" has been deprecated, please set the new "--fail-swap-on" flag to false if you wish to run with /proc/swaps on.
```
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true
Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
After the kubelet rotates its client cert, it will keep connections
to the API server open indefinitely, causing it to use its old
credentials instead of the new certs
When the kubelet rotates its cert, close down existing connections
to force a new TLS handshake.
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Use the core client with version
**What this PR does / why we need it**:
Replace the **deprecated** `clientSet.Core()` with `clientSet.CoreV1()`.
**Which issue this PR fixes**: fixes#49535
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
Deprecate keep-terminated-pod-volumes
It was discussed and agreed by sig-storage that
this flag causes unnecessary confusion and is hard to keep
synchornized with controller's attach/detach functionality.
Fixes https://github.com/kubernetes/kubernetes/issues/45615
```release-note
keep-terminated-pod-volumes flag on kubelet is deprecated.
```
Automatic merge from submit-queue (batch tested with PRs 49238, 49595, 43494, 47897, 48905)
Should not set struct pointer directly to interface which may cause potential panic
fix https://github.com/kubernetes/kubernetes/issues/43127 to avoid potential kubelet panic.
In our old implemention, interface `kubeDeps.EventClient ` (interface) will never equals to `nil` even if `eventClient `(struct pointer )was set to `nil`. `kubeDeps.ExternalKubeClient` and `kubeDeps.KubeClient` also have same potential risk.
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Use presence of kubeconfig file to toggle standalone mode
Fixes#40049
```release-note
The deprecated --api-servers flag has been removed. Use --kubeconfig to provide API server connection information instead. The --require-kubeconfig flag is now deprecated. The default kubeconfig path is also deprecated. Both --require-kubeconfig and the default kubeconfig path will be removed in Kubernetes v1.10.0.
```
/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-misc
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Remove duplicated import and wrong alias name of api package
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes#48975
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove flags low-diskspace-threshold-mb and outofdisk-transition-frequency
issue: #48843
This removes two flags replaced by the eviction manager. These have been depreciated for two releases, which I believe correctly follows the kubernetes depreciation guidelines.
```release-note
Remove depreciated flags: --low-diskspace-threshold-mb and --outofdisk-transition-frequency, which are replaced by --eviction-hard
```
cc @mtaufen since I am changing kubelet flags
cc @vishh @derekwaynecarr
/sig node
Automatic merge from submit-queue (batch tested with PRs 48636, 49088, 49251, 49417, 49494)
Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
This PR fixes issue #47809
Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
This is used by integrators that want to perform partial overrides of
key interfaces. Refactors the run() method to fit the existing style and
preserve the existing behavior, but allow (for instance) client
bootstrap and cert refresh even when some dependencies are injected.
Automatic merge from submit-queue (batch tested with PRs 47694, 47772, 47783, 47803, 47673)
Make different container runtimes constant
Make different container runtimes constant to avoid hardcode
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47915, 47856, 44086, 47575, 47475)
kubelet should resume csr bootstrap
Right now the kubelet creates a new csr object with the same key every
time it restarts during the bootstrap process. It should resume with the
old csr object if it exists. To do this the name of the csr object must
be stable.
Issue https://github.com/kubernetes/kubernetes/issues/47855
Automatic merge from submit-queue (batch tested with PRs 47922, 47195, 47241, 47095, 47401)
Run cAdvisor on the same interface as kubelet
**What this PR does / why we need it**:
cAdvisor currently binds to all interfaces. Currently the only
solution is to use iptables to block access to the port. We
are better off making cAdvisor to bind to the interface that
kubelet uses for better security.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#11710
**Special notes for your reviewer**:
**Release note**:
```release-note
cAdvisor binds only to the interface that kubelet is running on instead of all interfaces.
```
Right now the kubelet creates a new csr object with the same key every
time it restarts during the bootstrap process. It should resume with the
old csr object if it exists. To do this the name of the csr object must
be stable. Also using a list watch here eliminates a race condition
where a watch event is missed and the kubelet stalls.
Automatic merge from submit-queue (batch tested with PRs 46327, 47166)
Fixed typo in comments.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes # N/A
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46327, 47166)
mark --network-plugin-dir deprecated for kubelet
**What this PR does / why we need it**:
**Which issue this PR fixes** : fixes#43967
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
It was discussed and agreed by sig-storage that
this flag causes unnecessary confusion and is hard to keep
synchornized with controller's attach/detach functionality.
cAdvisor currently binds to all interfaces. Currently the only
solution is to use iptables to block access to the port. We
are better off making cAdvisor to bind to the interface that
kubelet uses for better security.
Fixes#11710
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
Consolidate sysctl commands for kubelet
**What this PR does / why we need it**:
These commands are important enough to be in the Kubelet itself.
By default, Ubuntu 14.04 and Debian Jessie have these set to 200 and
20000. Without this setting, nodes are limited in the number of
containers that they can start.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#26005
**Special notes for your reviewer**:
I had a difficult time writing tests for this. It is trivial to create a fake sysctl for testing, but the Kubelet does not have any tests for the prior settings.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46972, 42829, 46799, 46802, 46844)
promote tls-bootstrap to beta
last commit of this PR.
Towards https://github.com/kubernetes/kubernetes/issues/46999
```release-note
Promote kubelet tls bootstrap to beta. Add a non-experimental flag to use it and deprecate the old flag.
```
Automatic merge from submit-queue
Add local storage (scratch space) allocatable support
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.
This feature is part of local storage capacity isolation and described in the proposal https://github.com/kubernetes/community/pull/306
**Release note**:
```release-note
This feature exposes local storage capacity for the primary partitions, and supports & enforces storage reservation in Node Allocatable
```
Automatic merge from submit-queue (batch tested with PRs 46726, 41912, 46695, 46034, 46551)
Rotate kubelet client certificate.
Changes the kubelet so it bootstraps off the cert/key specified in the
config file and uses those to request new cert/key pairs from the
Certificate Signing Request API, as well as rotating client certificates
when they approach expiration.
Default behavior is for client certificate rotation to be disabled. If enabled
using a command line flag, the kubelet exits each time the certificate is
rotated. I tried to use `GetCertificate` in [tls.Config](https://golang.org/pkg/crypto/tls/#Config) but it is only called
on the server side of connections. Then I tried `GetClientCertificate`,
but it is new in 1.8.
**Release note**
```release-note
With --feature-gates=RotateKubeletClientCertificate=true set, the kubelet will
request a client certificate from the API server during the boot cycle and pause
waiting for the request to be satisfied. It will continually refresh the certificate
as the certificates expiration approaches.
```
Automatic merge from submit-queue (batch tested with PRs 46801, 45184, 45930, 46192, 45563)
adds log when --kubeconfig with wrong config
**What this PR does / why we need it**:
easy for troubleshooting
I have set --kubeconfig==/etc/kubernetes/kubelet.conf when copy & paste(the file path is wrong “==/etc/kubernetes/kubelet.conf”), but kubelet start with no error log. I don't know what happend.
**Release note**:
```release-note
NONE
```
This PR adds the check for local storage request when admitting pods. If
the local storage request exceeds the available resource, pod will be
rejected.
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.
Changes the kubelet so it bootstraps off the cert/key specified in the
config file and uses those to request new cert/key pairs from the
Certificate Signing Request API, as well as rotating client certificates
when they approach expiration.
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)
Local storage plugin
**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes. Scheduler predicate will direct already-bound PVCs to the node that the local PV is at. PVC binding still happens independently.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #43640
**Release note**:
```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume. These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
Automatic merge from submit-queue (batch tested with PRs 46635, 45619, 46637, 45059, 46415)
Certificate rotation for kubelet server certs.
Replaces the current kubelet server side self signed certs with certs signed by
the Certificate Request Signing API on the API server. Also renews expiring
kubelet server certs as expiration approaches.
Two Points:
1. With `--feature-gates=RotateKubeletServerCertificate=true` set, the kubelet will
request a certificate during the boot cycle and pause waiting for the request to
be satisfied.
2. In order to have the kubelet's certificate signing request auto approved,
`--insecure-experimental-approve-all-kubelet-csrs-for-group=` must be set on
the cluster controller manager. There is an improved mechanism for auto
approval [proposed](https://github.com/kubernetes/kubernetes/issues/45030).
**Release note**:
```release-note
With `--feature-gates=RotateKubeletServerCertificate=true` set, the kubelet will
request a server certificate from the API server during the boot cycle and pause
waiting for the request to be satisfied. It will continually refresh the certificate as
the certificates expiration approaches.
```
Automatic merge from submit-queue
kubelet: group all container-runtime-specific flags/options into a separate struct
They don't belong in the KubeletConfig.
This addresses #43253
Automatic merge from submit-queue
fixtypo
**What this PR does / why we need it**:
fix typo seperated -> separated
**Release note**:
```release-note
None
```
Replaces the current kubelet server side self signed certs with certs
signed by the Certificate Request Signing API on the API server. Also
renews expiring kubelet server certs as expiration approaches.
Automatic merge from submit-queue
Enable shared PID namespace by default for docker pods
**What this PR does / why we need it**: This PR enables PID namespace sharing for docker pods by default, bringing the behavior of docker in line with the other CRI runtimes when used with docker >= 1.13.1.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref #1615
**Special notes for your reviewer**: cc @dchen1107 @yujuhong
**Release note**:
```release-note
Kubernetes now shares a single PID namespace among all containers in a pod when running with docker >= 1.13.1. This means processes can now signal processes in other containers in a pod, but it also means that the `kubectl exec {pod} kill 1` pattern will cause the pod to be restarted rather than a single container.
```
Automatic merge from submit-queue (batch tested with PRs 45453, 45307, 44987)
Migrate the docker client code from dockertools to dockershim
Move docker client code from dockertools to dockershim/libdocker. This includes
DockerInterface (renamed to Interface), FakeDockerClient, etc.
This is part of #43234
Automatic merge from submit-queue (batch tested with PRs 45362, 45159, 45321, 45238)
expose kubelet authentication and authorization builders
The kubelet authentication and authorization builder methods are useful for consumers.
@liggitt
Automatic merge from submit-queue
kubelet/get-pods-from-path: correct description of implemention
**What this PR does / why we need it**:
I find this description does not follow the current implementation, it should be describe like this according to my understanding of the source code.
These commands are important enough to be in the Kubelet itself.
By default, Ubuntu 14.04 and Debian Jessie have these set to 200 and
20000. Without this setting, nodes are limited in the number of
containers that they can start.
Automatic merge from submit-queue
Delete "hard-coded" default value in flags usage.
**What this PR does / why we need it**:
Some flags of kubernetes components have "hard-coded" default values in their usage info. In fact, [pflag pkg](https://github.com/kubernetes/kubernetes/blob/master/vendor/github.com/spf13/pflag/flag.go#L602-L608) has already added a string `(default value)` automatically in the usage info if the flag is initialized. Then we don't need to hard-code the default value in usage info. After this PR, if we want to update the default value of a flag, we only need to update the flag where it is initialized. `pflag` will update the usage info for us. This will avoid inconsistency.
For example:
Before
```
kubelet -h
...
--node-status-update-frequency duration Specifies how often kubelet posts node status to master. Note: be cautious when changing the constant, it must work with nodeMonitorGracePeriod in nodecontroller. Default: 10s (default 10s)
...
```
After
```
kubelet -h
...
--node-status-update-frequency duration Specifies how often kubelet posts node status to master. Note: be cautious when changing the constant, it must work with nodeMonitorGracePeriod in nodecontroller. (default 10s)
...
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
This PR doesn't delete some "hard-coded" default values because they are not explicitly initialized. We still need to hard-code them to give users friendly info.
```
--allow-privileged If true, allow containers to request privileged mode. [default=false]
```
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
Add dockershim only mode
This PR added a `experimental-dockershim` hidden flag in kubelet to run dockershim only.
We introduce this flag mainly for cri validation test. In the future we should compile dockershim into another binary.
@yujuhong @feiskyer @xlgao-zju
/cc @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue
add rancher credential provider
This adds rancher as a credential provider in kubernetes.
@erictune This might be a good opportunity to discuss adding a provision for people to have their own credential providers that is similar to the new cloud provider changes (https://github.com/kubernetes/community/pull/128). WDYT?
```
release-note
Added Rancher Credential Provider to use Rancher Registry credentials when running in a Rancher cluster
```
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration. This is a preleminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
Automatic merge from submit-queue
kubelet: change image-gc-high-threshold below docker dm.min_free_space
docker dm.min_free_space defaults to 10%, which "specifies the min free space percent in a thin pool require for new device creation to succeed....Whenever a new a thin pool device is created (during docker pull or during container creation), the Engine checks if the minimum free space is available. If sufficient space is unavailable, then device creation fails and any relevant docker operation fails." [1]
This setting is preventing the storage usage to cross the 90% limit. However, image GC is expected to kick in only beyond image-gc-high-threshold. The image-gc-high-threshold has a default value of 90%, and hence GC never triggers. If image-gc-high-threshold is set to a value lower than (100 - dm.min_free_space)%, GC triggers.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1408309
```release-note
changed kubelet default image-gc-high-threshold to 85% to resolve a conflict with default settings in docker that prevented image garbage collection from resolving low disk space situations when using devicemapper storage.
```
@derekwaynecarr @sdodson @rhvgoyal
Automatic merge from submit-queue
Use realistic value for the memory example of kube-reserved and system-reserved
Use realistic value for the memory example of kube-reserved and system-reserved
Currently, kublet help shows the memory example of
kube-reserved and system-reserved as 150G. This 150G is not realistic
value and it leads misconfiguration or confusion. This patch changes
to example value as 500Mi.
Before(same with system-reserved):
```
--kube-reserved value A set of ResourceName=ResourceQuantity (e.g. cpu=200m,memory=150G) pairs that describe resources reserved for kubernetes system components. Currently only cpu and memory are supported. See http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md for more detail. [default=none]
```
After(same with system-reserved):
```
--kube-reserved value A set of ResourceName=ResourceQuantity (e.g. cpu=200m,memory=500Mi) pairs that describe resources reserved for kubernetes system components. Currently only cpu and memory are supported. See http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md for more detail. [default=none]
```
Automatic merge from submit-queue
Eviction Manager Enforces Allocatable Thresholds
This PR modifies the eviction manager to enforce node allocatable thresholds for memory as described in kubernetes/community#348.
This PR should be merged after #41234.
cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-node-feature-requests @vishh
** Why is this a bug/regression**
Kubelet uses `oom_score_adj` to enforce QoS policies. But the `oom_score_adj` is based on overall memory requested, which means that a Burstable pod that requested a lot of memory can lead to OOM kills for Guaranteed pods, which violates QoS. Even worse, we have observed system daemons like kubelet or kube-proxy being killed by the OOM killer.
Without this PR, v1.6 will have node stability issues and regressions in an existing GA feature `out of Resource` handling.
Automatic merge from submit-queue (batch tested with PRs 42443, 38924, 42367, 42391, 42310)
Dell EMC ScaleIO Volume Plugin
**What this PR does / why we need it**
This PR implements the Kubernetes volume plugin to allow pods to seamlessly access and use data stored on ScaleIO volumes. [ScaleIO](https://www.emc.com/storage/scaleio/index.htm) is a software-based storage platform that creates a pool of distributed block storage using locally attached disks on every server. The code for this PR supports persistent volumes using PVs, PVCs, and dynamic provisioning.
You can find examples of how to use and configure the ScaleIO Kubernetes volume plugin in [examples/volumes/scaleio/README.md](examples/volumes/scaleio/README.md).
**Special notes for your reviewer**:
To facilitate code review, commits for source code implementation are separated from other artifacts such as generated, docs, and vendored sources.
```release-note
ScaleIO Kubernetes Volume Plugin added enabling pods to seamlessly access and use data stored on ScaleIO volumes.
```
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
enable cgroups tiers and node allocatable enforcement on pods by default.
```release-note
Pods are launched in a separate cgroup hierarchy than system services.
```
Depends on #41753
cc @derekwaynecarr
Automatic merge from submit-queue
Extend experimental support to multiple Nvidia GPUs
Extended from #28216
```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with support for multiple Nvidia GPUs.
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```
1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```
TODO:
- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
- Add a new type PortworxVolumeSource
- Implement the kubernetes volume plugin for Portworx Volumes under pkg/volume/portworx
- The Portworx Volume Driver uses the libopenstorage/openstorage specifications and apis for volume operations.
Changes for k8s configuration and examples for portworx volumes.
- Add PortworxVolume hooks in kubectl, kube-controller-manager and validation.
- Add a README for PortworxVolume usage as PVs, PVCs and StorageClass.
- Add example spec files
Handle code review comments.
- Modified READMEs to incorporate to suggestions.
- Add a test for ReadWriteMany access mode.
- Use util.UnmountPath in TearDown.
- Add ReadOnly flag to PortworxVolumeSource
- Use hostname:port instead of unix sockets
- Delete the mount dir in TearDown.
- Fix link issue in persistentvolumes README
- In unit test check for mountpath after Setup is done.
- Add PVC Claim Name as a Portworx Volume Label
Generated code and documentation.
- Updated swagger spec
- Updated api-reference docs
- Updated generated code under pkg/api/v1
Godeps update for Portworx Volume Driver
- Adds github.com/libopenstorage/openstorage
- Adds go.pedge.io/pb/go/google/protobuf
- Updates Godep Licenses
Some imports dont exist yet (or so it seems) in client-go (examples
being:
- "k8s.io/kubernetes/pkg/api/validation"
- "k8s.io/kubernetes/pkg/util/initsystem"
- "k8s.io/kubernetes/pkg/util/node"
one change in kubelet to import to client-go
Automatic merge from submit-queue
Allow multipe DNS servers as comma-seperated argument for kubelet --dns
This PR explores how kubectls "--dns" could be extended to specify multiple DNS servers for in-cluster PODs. Testing on the local libvirt-coreos cluster shows that multiple DNS server are injected without issues.
Specifying multiple DNS servers increases resilience against
- Packet drops
- Single server failure
I am debugging services that do 50+ DNS requests for a single incoming interactive request, thus highly increase the chance of a slowdown (+5s) due to a single packet drop. Switching to two DNS servers will reduce the impact of the issues (roughly +1s on glibc, 0s on musl, error-rate goes down to error-rate^2).
Note that there is no need to change any runtime related code as far as I know. In the case of "default" dns the /etc/resolv.conf is parsed and multiple DNS server are send to the backend anyway. This only adds the same capability for the clusterFirst case.
I've heard from @thockin that multiple DNS entries are somehow considered. I've no idea what was considered, though. This is what I would like to see for our production use, though.
```release-note
NONE
```
Automatic merge from submit-queue
Switch resourcequota controller to shared informers
Originally part of #40097
I have had some issues with this change in the past, when I updated `pkg/quota` to use the new informers while `pkg/controller/resourcequota` remained on the old informers. In this PR, both are switched to using the new informers. The issues in the past were lots of flakey test failures in the ResourceQuota e2es, where it would randomly fail to see deletions and handle replenishment. I am hoping that now that everything here is consistently using the new informers, there won't be any more of these flakes, but it's something to keep an eye out for.
I also think `pkg/controller/resourcequota` could be cleaned up. I don't think there's really any need for `replenishment_controller.go` any more since it's no longer running individual controllers per kind to replenish. It instead just uses the shared informer and adds event handlers to it. But maybe we do that in a follow up.
cc @derekwaynecarr @smarterclayton @wojtek-t @deads2k @sttts @liggitt @timothysc @kubernetes/sig-scalability-pr-reviews
This change makes kubelet to use the CRI implementation by default,
unless the users opt out explicitly by using --enable-cri=false.
For the rkt integration, the --enable-cri flag will have no effect
since rktnetes does not use CRI.
Also, mark the original --experimental-cri flag hidden and deprecated,
so that we can remove it in the next release.
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)
Promote TokenReview to v1
Peer to https://github.com/kubernetes/kubernetes/pull/40709
We have multiple features that depend on this API:
- [webhook authentication](https://kubernetes.io/docs/admin/authentication/#webhook-token-authentication)
- [kubelet delegated authentication](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authentication)
- add-on API server delegated authentication
The API has been in use since 1.3 in beta status (v1beta1) with negligible changes:
- Added a status field for reporting errors evaluating the token
This PR promotes the existing v1beta1 API to v1 with no changes
Because the API does not persist data (it is a query/response-style API), there are no data migration concerns.
This positions us to promote the features that depend on this API to stable in 1.7
cc @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-misc
```release-note
The authentication.k8s.io API group was promoted to v1
```
Automatic merge from submit-queue (batch tested with PRs 41121, 40048, 40502, 41136, 40759)
Remove deprecated kubelet flags that look safe to remove
Removes:
```
--config
--auth-path
--resource-container
--system-container
```
which have all been marked deprecated since at least 1.4 and look safe to remove.
```release-note
The deprecated flags --config, --auth-path, --resource-container, and --system-container were removed.
```
Automatic merge from submit-queue (batch tested with PRs 41103, 41042, 41097, 40946, 40770)
Use Clientset interface in KubeletDeps
**What this PR does / why we need it**:
This replaces the Clientset struct with the equivalent interface for the KubeClient injected via KubeletDeps. This is useful for testing and for accessing the Node and Pod status event stream without an API server.
**Special notes for your reviewer**:
Follow up to #4907
**Release note**:
`NONE`
Depending on an exact cluster setup multiple dns may make sense.
Comma-seperated lists of DNS server are quite common as DNS servers
are always plain IPs.
Automatic merge from submit-queue
error strings should not end with punctuation
**What this PR does / why we need it**:
Delete the end punctuation of error strings
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
https://github.com/golang/go/wiki/CodeReviewComments#error-strings
**Release note**:
```release-note
```
Signed-off-by: yupeng <yu.peng36@zte.com.cn>
Automatic merge from submit-queue (batch tested with PRs 40251, 40171)
Mark --docker-exec-handler deprecated
We plan to drop support for the nsenter exec handler in the future. Marking this flag as deprecated to warn the users.
Automatic merge from submit-queue (batch tested with PRs 37228, 40146, 40075, 38789, 40189)
kubelet: storage: teardown terminated pod volumes
This is a continuation of the work done in https://github.com/kubernetes/kubernetes/pull/36779
There really is no reason to keep volumes for terminated pods attached on the node. This PR extends the removal of volumes on the node from memory-backed (the current policy) to all volumes.
@pmorie raised a concern an impact debugging volume related issues if terminated pod volumes are removed. To address this issue, the PR adds a `--keep-terminated-pod-volumes` flag the kubelet and sets it for `hack/local-up-cluster.sh`.
For consideration in 1.6.
Fixes#35406
@derekwaynecarr @vishh @dashpole
```release-note
kubelet tears down pod volumes on pod termination rather than pod deletion
```
Automatic merge from submit-queue
make client-go more authoritative
Builds on https://github.com/kubernetes/kubernetes/pull/40103
This moves a few more support package to client-go for origination.
1. restclient/watch - nodep
1. util/flowcontrol - used interface
1. util/integer, util/clock - used in controllers and in support of util/flowcontrol
Automatic merge from submit-queue
Curating Owners: cmd/kubelet
cc @yujuhong @dchen1107 @vishh
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the future in
the **reviewers** section. You probably do NOT need to modify the **approvers**
section. Names asre sorted by relevance, using some secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
Automatic merge from submit-queue
log cfgzErr if err happened
We need to log err info when err info returned by initConfigz(),no matter what the result of utilconfig.DefaultFeatureGate.DynamicKubeletConfig() is and
whether s.RunOnce is true or not.
We should log the initKubeletConfigSync() err info too.
We need to log err info when err info returned by initConfigz(),no matter what the result of utilconfig.DefaultFeatureGate.DynamicKubeletConfig() is and
whether s.RunOnce is true or not.
We should log the initKubeletConfigSync() err info too.
Automatic merge from submit-queue (batch tested with PRs 37094, 37663, 37442, 37808, 37826)
fix if condition question in kubelet run() function
Here variable err returned by function NewForConfig(&eventClientConfig) if CreateAPIServerClientConfig() function runs correctly . And we should not print "invalid kubeconfig" info.
Should we use else instead of if.
here variable err returned by function NewForConfig(&eventClientConfig) if CreateAPIServerClientConfig() function is executed correctly. We should use else instead of if.
Or put block (if err != nil) to block (if err == nil) above
Automatic merge from submit-queue
Kubelet: Fix the description of MaxContainers kubelet flag.
Found this during code review.
The default number has been changed to `-1` and `1`. 82c488bd6e/pkg/apis/componentconfig/v1alpha1/defaults.go (L279-L285)
@yujuhong
/cc @saad-ali This PR fixed incorrect doc.
Automatic merge from submit-queue
Support persistent volume usage for kubernetes running on Photon Controller platform
**What this PR does / why we need it:**
Enable the persistent volume usage for kubernetes running on Photon platform.
Photon Controller: https://vmware.github.io/photon-controller/
_Only the first commit include the real code change.
The following commits are for third-party vendor dependency and auto-generated code/docs updating._
Two components are added:
pkg/cloudprovider/providers/photon: support Photon Controller as cloud provider
pkg/volume/photon_pd: support Photon persistent disk as volume source for persistent volume
Usage introduction:
a. Photon Controller is supported as cloud provider.
When choosing to use photon controller as a cloud provider, "--cloud-provider=photon --cloud-config=[path_to_config_file]" is required for kubelet/kube-controller-manager/kube-apiserver. The config file of Photon Controller should follow the following usage:
```
[Global]
target = http://[photon_controller_endpoint_IP]
ignoreCertificate = true
tenant = [tenant_name]
project = [project_name]
overrideIP = true
```
b. Photon persistent disk is supported as volume source/persistent volume source.
yaml usage:
```
volumes:
- name: photon-storage-1
photonPersistentDisk:
pdID: "643ed4e2-3fcc-482b-96d0-12ff6cab2a69"
```
pdID is the persistent disk ID from Photon Controller.
c. Enable Photon Controller as volume provisioner.
yaml usage:
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: gold_sc
provisioner: kubernetes.io/photon-pd
parameters:
flavor: persistent-disk-gold
```
The flavor "persistent-disk-gold" needs to be created by Photon platform admin before hand.
Provides an opt-in flag, --experimental-fail-swap-on (and corresponding
KubeletConfiguration value, ExperimentalFailSwapOn), which is false by default.
Automatic merge from submit-queue
improve and modify log
1, the content of a unified writing, compared to the following line of failure (314th lines)
2, “instance” should be “node”
Automatic merge from submit-queue
[Kubelet] Use the custom mounter script for Nfs and Glusterfs only
This patch reduces the scope for the containerized mounter to NFS and GlusterFS on GCE + GCI clusters
This patch also enabled the containerized mounter on GCI nodes
Shepherding multiple PRs through the submit queue is painful. Hence I combined them into this PR. Please review each commit individually.
cc @jingxu97 @saad-ali
https://github.com/kubernetes/kubernetes/pull/35652 has also been reverted as part of this PR
This change add a container manager inside the dockershim to move docker daemon
and associated processes to a specified cgroup. The original kubelet container
manager will continue checking the name of the cgroup, so that kubelet know how
to report runtime stats.
In order to be able to use new mounter library, this PR adds the
mounterPath flag to kubelet which passes the flag to the mount
interface. If flag is empty, mount uses default mount path.