Commit Graph

6877 Commits (8060d565946848abe4f67ffffb810dd3505fe4cb)

Author SHA1 Message Date
Krzysztof Jastrzebski 138a3c7172 Add "only_cpu_and_memory" GET parameter to /stats/summary http handler in kubelet. If parameter is true then only cpu and memory will be present in response. The parameter will be used by Metric Server to avoid sending/decoding unneeded data. 2018-09-06 21:49:00 +02:00
WanLinghao 794e665d7b Currently, token manager use keyFunc like: `fmt.Sprintf("%q/%q/%#v", name, namespace, tr.Spec)`.
Since tr.Spec contains point fields, new token request would not reuse
the cache at all.  This patch fix this, also adds unit test.

Signed-off-by: Mike Danese <mikedanese@google.com>
2018-09-06 09:03:26 -07:00
Renaud Gaubert 8dd1d27c03 Updated the device manager pluginwatcher handler 2018-09-06 15:34:46 +02:00
Renaud Gaubert 78b55eb5bf Updated the CSI pluginwatcher handler 2018-09-06 15:34:46 +02:00
Renaud Gaubert 29d225e90c Update pluginwatcher tests 2018-09-06 14:44:03 +02:00
Renaud Gaubert 4d18aa63cd Refactor pluginwatcher to use the new API 2018-09-06 14:42:21 +02:00
Renaud Gaubert 2eb91e89c0 Update the plugin watcher interface 2018-09-06 14:42:21 +02:00
Lucas Käldström 83d53ea1c2
Standardize componentconfig code/comment patterns 2018-09-06 13:42:02 +03:00
Kubernetes Submit Queue 4bc9e94fee
Merge pull request #67690 from feiskyer/iptables-cross
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Kubelet: only sync iptables on linux

**What this PR does / why we need it**:

Iptables is only supported on Linux, kubelet should only sync NAT rules on Linux.

Without this PR, Kubelet on Windows would logs following errors on each `syncNetworkUtil()`:

```
kubelet.err.log:4692:E0711 22:03:42.103939    2872 kubelet_network.go:102] Failed to ensure that nat chain KUBE-MARK-DROP exists: error creating chain "KUBE-MARK-DROP": executable file
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65713

**Special notes for your reviewer**:

**Release note**:

```release-note
Kubelet now only sync iptables on Linux.
```
2018-09-05 22:55:15 -07:00
wangqingcan 6506e0c51a Simple code and typo fixed in in kubelet 2018-09-06 09:12:39 +08:00
Kubernetes Submit Queue 0df5d8d205
Merge pull request #67909 from tallclair/runtimeclass-kubelet
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Dynamic RuntimeClass implementation

**What this PR does / why we need it**:

Implement RuntimeClass using the dynamic client to break the dependency on https://github.com/kubernetes/kubernetes/pull/67791

Once (if) https://github.com/kubernetes/kubernetes/pull/67791 merges, I will migrate to the typed client.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
For https://github.com/kubernetes/features/issues/585

**Release note**:
Covered by #67737
```release-note
NONE
```

/sig node
/kind feature
/priority important-soon
/milestone v1.12
2018-09-05 14:51:47 -07:00
Kubernetes Submit Queue 70a0089ae6
Merge pull request #68200 from RenaudWasTaken/pluginwatcher-beta
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

KubeletPluginsWatcher feature is beta in 1.12 release

*What this PR does / why we need it:*
Graduates DevicePlugins feature to beta.

*Which issue(s) this PR fixes:*
Related but does not fix: https://github.com/kubernetes/features/issues/595 as well as https://github.com/kubernetes/kubernetes/issues/65773

*Special notes for your reviewer:*
Includes upgrading the gRPC pluginwatcher API to beta. Based on the [device plugin model](https://github.com/kubernetes/kubernetes/pull/59588).

*Depends on https://github.com/kubernetes/kubernetes/pull/64621 being merged* 

Release note:

```release-note
KubeletPluginsWatcher feature graduates to beta.
```

/sig node
/sig storage

/cc @vladimirvivien @sbezverk @vikaschoudhary16 @saad-ali @vishh @jiayingz
2018-09-05 13:18:39 -07:00
wangqingcan b0c308f082 Simple code and typo fixed in in pkg/kubelet 2018-09-05 21:51:32 +08:00
Kubernetes Submit Queue 743e4fba63
Merge pull request #67709 from feiskyer/inodes-clean
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

 Kubelet: only apply default hard evictions of nodefs.inodesFree on Linux

**What this PR does / why we need it**:

Kubelet sets default hard evictions of `nodefs.inodesFree ` for all platforms today. This will cause errors on Windows and a lot `no observation found for eviction signal nodefs.inodesFree` errors will be logs for kubelet.

```
kubelet.err.log:4961:W0711 22:21:12.378789    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
kubelet.err.log:4967:W0711 22:21:30.411371    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
kubelet.err.log:4974:W0711 22:21:48.446456    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
kubelet.err.log:4978:W0711 22:22:06.482441    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
```

This PR updates the default hard eviction value and only apply nodefs.inodesFree on Linux.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66088

**Special notes for your reviewer**:

**Release note**:

```release-note
Kubelet only applies default hard evictions of nodefs.inodesFree on Linux
```
2018-09-04 23:08:30 -07:00
Kubernetes Submit Queue 8f906fefae
Merge pull request #66427 from feiskyer/win-pods-stats
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Add kubelet stats for windows system container "pods"

**What this PR does / why we need it**:

This PR adds kubelet stats for windows system container "pods". Without this, kubelet will always logs error: 

```
kubelet.err.log:4832:E0711 22:12:49.241358    2872 helpers.go:735] eviction manager: failed to construct signal: "allocatableMemory.available" error: system container "pods" not found
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66087

**Special notes for your reviewer**:

/sig windows
/sig node

**Release note**:

```release-note
Add kubelet stats for windows system container "pods"
```
2018-09-04 21:59:49 -07:00
Pengfei Ni 376b45cb64 Fix unit tests for Windows
* TestMakeBlockVolume is moved to Linux only.
* TestMakeMounts are running on both Linux and Windows
2018-09-05 10:22:53 +08:00
Pengfei Ni aeea967149 Kubelet: only sync iptables on linux 2018-09-05 10:22:48 +08:00
Tim Allclair 63f3bc1b7e
Implement RuntimeClass support for the Kubelet & CRI 2018-09-04 13:45:11 -07:00
Renaud Gaubert 44dd0672b6 Add pluginwatcher generated files 2018-09-04 20:22:59 +02:00
Renaud Gaubert f8e80e45e7 Create pkg/kubelet/apis/pluginregistration/v1beta1 directory 2018-09-04 20:22:59 +02:00
Pengfei Ni 8255318b96 Kubelet: do not report used inodes on Windows 2018-09-03 16:42:33 +08:00
Pengfei Ni e1fdaa177f Kubelet: only apply default hard evictions of nodefs.inodesFree on Linux 2018-09-03 16:42:30 +08:00
Lucas Käldström 8b6a7ee075
autogenerated go code, godeps, bazel and gofmt 2018-09-02 14:38:59 +03:00
Lucas Käldström 15760506c2
Move the kubelet's external types to k8s.io/kubelet 2018-09-02 14:19:38 +03:00
Lucas Käldström 0707b1274f
Automated package reference rename 2018-09-02 14:15:38 +03:00
Sandor Szücs 588d2808b7
fix #51135 make CFS quota period configurable, adds a cli flag and config option to kubelet to be able to set cpu.cfs_period and defaults to 100ms as before.
It requires to enable feature gate CustomCPUCFSQuotaPeriod.

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
2018-09-01 20:19:59 +02:00
Kubernetes Submit Queue 33cca5251c
Merge pull request #67255 from bertinatto/promote_mount_propagation
Automatic merge from submit-queue (batch tested with PRs 65251, 67255, 67224, 67297, 68105). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Promote mount propagation to GA

**What this PR does / why we need it**:

This PR promotes mount propagation to GA.

Website PR: https://github.com/kubernetes/website/pull/9823

**Release note**:

```release-note
Mount propagation has promoted to GA. The `MountPropagation` feature gate is deprecated and will be removed in 1.13.
```
2018-08-31 19:25:30 -07:00
Kubernetes Submit Queue 85300f4f5d
Merge pull request #67803 from saad-ali/csiClusterReg3
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

CSI Cluster Registry and Node Info CRDs

**What this PR does / why we need it**:
Introduces the new `CSIDriver` and `CSINodeInfo` API Object as proposed in https://github.com/kubernetes/community/pull/2514 and https://github.com/kubernetes/community/pull/2034

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/features/issues/594

**Special notes for your reviewer**:
Per the discussion in https://groups.google.com/d/msg/kubernetes-sig-storage-wg-csi/x5CchIP9qiI/D_TyOrn2CwAJ the API is being added to the staging directory of the `kubernetes/kubernetes` repo because the consumers will be attach/detach controller and possibly kubelet, but it will be installed as a CRD (because we want to move in the direction where the API server is Kubernetes agnostic, and all Kubernetes specific types are installed).

**Release note**:

```release-note
Introduce CSI Cluster Registration mechanism to ease CSI plugin discovery and allow CSI drivers to customize Kubernetes' interaction with them.
```

CC @jsafrane
2018-08-31 16:46:41 -07:00
Kubernetes Submit Queue 39004e852b
Merge pull request #64283 from jessfraz/ProcMountType
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Add a ProcMount option to the SecurityContext & AllowedProcMountTypes to PodSecurityPolicy

So there is a bit of a chicken and egg problem here in that the CRI runtimes will need to implement this for there to be any sort of e2e testing.

**What this PR does / why we need it**: This PR implements design proposal https://github.com/kubernetes/community/pull/1934. This adds a ProcMount option to the SecurityContext and AllowedProcMountTypes to PodSecurityPolicy

Relies on https://github.com/google/cadvisor/pull/1967

**Release note**:

```release-note
ProcMount added to SecurityContext and AllowedProcMounts added to PodSecurityPolicy to allow paths in the container's /proc to not be masked.
```

cc @Random-Liu @mrunalp
2018-08-31 16:46:33 -07:00
Jan Safranek 7d673cb8f0 Pass new CSI API Client and informer to Volume Plugins 2018-08-31 12:25:59 -07:00
Fabio Bertinatto b87a57a111 Promote mount propagation to GA 2018-08-31 10:04:51 +02:00
Kubernetes Submit Queue c1e37a5f16
Merge pull request #66056 from mikedanese/fixhang
Automatic merge from submit-queue (batch tested with PRs 67349, 66056). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

wait until apiserver connection before starting kubelet tls bootstrap

I wonder if this helps with sometimes slow network programming

cc @mwielgus @awly
2018-08-30 20:16:32 -07:00
Jess Frazelle 1a4cf7a36e
make update
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 18:24:23 -04:00
Mike Danese 2cf1c75e07 wait until apiserver connection before starting kubelet tls bootstrap 2018-08-30 11:37:05 -07:00
Jess Frazelle 20cc40a5dc
ProcMount: add dockershim support
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:06 -04:00
Jess Frazelle 31ffd9f881
vendor: update docker cadvisor winterm
This vendor change was purely for the changes in docker to allow for
setting the Masked and Read-only paths.

See: moby/moby#36644

But because of the docker dep update it also needed cadvisor to be
updated and winterm due to changes in pkg/tlsconfig in docker

See: google/cadvisor#1967

Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:05 -04:00
Jess Frazelle dbf7186bee
update jsonlog path for updated vendor
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:05 -04:00
Jess Frazelle 30dcca6233
ProcMount: add api options and feature gate
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:02 -04:00
Jess Frazelle 6b7c39a4f8
pkg/kubelet/apis/cri/runtime: add masked_paths and readonly_paths
generate runtime protobufs

Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:39:18 -04:00
Pingan2017 2f1284bc34 cleanup unneeded if block 2018-08-30 17:18:56 +08:00
Lucas Käldström 844487aea4
autogenerated 2018-08-29 20:21:17 +03:00
Lucas Käldström 994ac98586
Update api violations, golint failures and gofmt 2018-08-29 20:21:09 +03:00
Lucas Käldström 7a840cb4c8
automated: Rename all package references 2018-08-29 19:07:52 +03:00
Lucas Käldström 62bfe29ce4
automated, boring: Rename pkg/kubelet/apis/{kubelet,}config 2018-08-29 18:59:05 +03:00
Kubernetes Submit Queue cd06419973
Merge pull request #67369 from tianshapjq/should-not-eventf-directly
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

should not event directly

**What this PR does / why we need it**:
should not event directly, using recordContainerEvent() to generate ref and deduplicate events instead.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2018-08-28 16:18:13 -07:00
Kubernetes Submit Queue a26e1ddacc
Merge pull request #67739 from liggitt/hostname-override
Automatic merge from submit-queue (batch tested with PRs 67739, 65222). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Honor --hostname-override, report compatible hostname addresses with cloud provider

xref #67714

7828e5d made cloud providers authoritative for the addresses reported on Node objects, so that the addresses used by the node (and requested as SANs in serving certs) could be verified via cloud provider metadata.

This had the effect of no longer reporting addresses of type Hostname for Node objects for some cloud providers. Cloud providers that have the instance hostname available in metadata should add a `type: Hostname` address to node status. This is being tracked in #67714

This PR does a couple other things to ease the transition to authoritative cloud providers:
* if `--hostname-override` is set on the kubelet, make the kubelet report that `Hostname` address. if it can't be verified via cloud-provider metadata (for cert approval, etc), the kubelet deployer is responsible for fixing the situation by adjusting the kubelet configuration (as they were in 1.11 and previously)
* if `--hostname-override` is not set, *and* the cloud provider didn't report a Hostname address, *and* the auto-detected hostname matches one of the addresses the cloud provider *did* report, make the kubelet report that as a Hostname address. That lets the addresses remain verifiable via cloud provider metadata, while still including a `Hostname` address whenever possible.

/sig node
/sig cloud-provider

/cc @mikedanese

fyi @hh

```release-note
NONE
```
2018-08-28 12:31:00 -07:00
Jordan Liggitt e309bd3abf
Remove deprecated feature flags 2018-08-28 15:25:46 -04:00
Jordan Liggitt 2857de73ce
Honor --hostname-override, report compatible hostname addresses with cloud provider 2018-08-28 11:21:01 -04:00
Kubernetes Submit Queue 2eb14e3007
Merge pull request #64973 from nokia/k8s-sctp
Automatic merge from submit-queue (batch tested with PRs 67694, 64973, 67902). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

SCTP support implementation for Kubernetes

**What this PR does / why we need it**: This PR adds SCTP support to Kubernetes, including Service, Endpoint, and NetworkPolicy.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #44485

**Special notes for your reviewer**:

**Release note**:

```release-note

SCTP is now supported as additional protocol (alpha) alongside TCP and UDP in Pod, Service, Endpoint, and NetworkPolicy.  

```
2018-08-28 07:21:18 -07:00
Tim Allclair 62d56060b7 Remove unused kubelet dependency 2018-08-27 16:48:12 -07:00
tianshapjq 9daaf12397 use podPrefix as it's defined 2018-08-27 14:32:26 +08:00
Laszlo Janosi cbe94df8c6 gofmt update 2018-08-27 05:59:50 +00:00
Laszlo Janosi e466bdc67e Changes according to the approved KEP. SCTP is supported for HostPort and LoadBalancer. Alpha feature flag SCTPSupport controls the support of SCTP. Kube-proxy config parameter is removed. 2018-08-27 05:58:36 +00:00
Laszlo Janosi a6da2b1472 K8s SCTP support implementation for the first pull request
The requested Service Protocol is checked against the supported protocols of GCE Internal LB. The supported protocols are TCP and UDP.

SCTP is not supported by OpenStack LBaaS. If SCTP is requested in a Service with type=LoadBalancer, the request is rejected. Comment style is also corrected.

SCTP is not allowed for LoadBalancer Service and for HostPort. Kube-proxy can be configured not to start listening on the host port for SCTP: see the new SCTPUserSpaceNode parameter

changed the vendor github.com/nokia/sctp to github.com/ishidawataru/sctp. I.e. from now on we use the upstream version.

netexec.go compilation fixed. Various test cases fixed

SCTP related conformance tests removed. Netexec's pod definition and Dockerfile are updated to expose the new SCTP port(8082)

SCTP related e2e test cases are removed as the e2e test systems do not support SCTP

sctp related firewall config is removed from cluster/gce/util.sh. Variable name sctp_addr is corrected to sctpAddr in pkg/proxy/ipvs/proxier.go

cluster/gce/util.sh is copied from master
2018-08-27 05:56:27 +00:00
Michael Taufen 1b7d06e025 Kubelet creates and manages node leases
This extends the Kubelet to create and periodically update leases in a
new kube-node-lease namespace. Based on [KEP-0009](https://github.com/kubernetes/community/blob/master/keps/sig-node/0009-node-heartbeat.md),
these leases can be used as a node health signal, and will allow us to
reduce the load caused by over-frequent node status reporting.

- add NodeLease feature gate
- add kube-node-lease system namespace for node leases
- add Kubelet option for lease duration
- add Kubelet-internal lease controller to create and update lease
- add e2e test for NodeLease feature
- modify node authorizer and node restriction admission controller
to allow Kubelets access to corresponding leases
2018-08-26 16:03:36 -07:00
Kubernetes Submit Queue 83030032ad
Merge pull request #67425 from Lion-Wei/kubelet-ipv6
Automatic merge from submit-queue (batch tested with PRs 65247, 63633, 67425). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix kubelet iptclient in ipv6 cluster

**What this PR does / why we need it**:
Kubelet uses "iptables" instead of "ip6tables" in an ipv6-only cluster. This causes failed traffic for type: LoadBalancer services (and probably a lot of other problems).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #67398 

**Special notes for your reviewer**:


**Release note**:
```release-note
NONE
```
2018-08-23 14:15:12 -07:00
Kubernetes Submit Queue d67a03183a
Merge pull request #67687 from Lion-Wei/remote-reschrduler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove rescheduler since scheduling DS pods by default scheduler is moving to beta

**What this PR does / why we need it**:

remove rescheduler since scheduling DS pods by default scheduler is moving to beta

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64725

**Special notes for your reviewer**:

**Release note**:
```release-note
Remove rescheduler since scheduling DS pods by default scheduler is moving to beta.
```
2018-08-23 12:32:17 -07:00
Kubernetes Submit Queue e46203c40d
Merge pull request #67031 from krzysztof-jastrzebski/node_startup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce latency to node ready after CIDR is assigned.

This adds code to execute an immediate runtime and node status update when the Kubelet sees that it has a CIDR, which significantly decreases the latency to node ready.

```release-note
Speed up kubelet start time by executing an immediate runtime and node status update when the Kubelet sees that it has a CIDR.
```
2018-08-23 10:37:30 -07:00
liangwei 67f4be87c0 fix kubelet iptclient in ipv6 cluster 2018-08-23 15:08:51 +08:00
Krzysztof Jastrzebski 7ffa4e17e0 Reduce latency to node ready after CIDR is assigned. 2018-08-22 10:43:58 +02:00
Kubernetes Submit Queue c491d48cde
Merge pull request #67430 from choury/cpumanager
Automatic merge from submit-queue (batch tested with PRs 67430, 67550). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

cpumanager: rollback state if updateContainerCPUSet failed

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63018

If `updateContainerCPUSet`  failed, the container will start failed. We should rollback the state to avoid CPU leak.
**Special notes for your reviewer**:

**Release note**:

```release-note
cpumanager: rollback state if updateContainerCPUSet failed
```
2018-08-21 23:20:58 -07:00
Kubernetes Submit Queue 444373b404
Merge pull request #67599 from neolit123/owners-kubelet
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add labels to kubelet OWNERS files

**What this PR does / why we need it**:

This change makes it possible to automatically add the two labels: `area/kubelet` to PRs that touch the paths in question.

this already exists for kubeadm:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/OWNERS#L17-L19

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
refs https://github.com/kubernetes/community/issues/1808

**Special notes for your reviewer**:
none

**Release note**:

```release-note
NONE
```
/area kubelet
@kubernetes/sig-node-pr-reviews
2018-08-21 21:10:28 -07:00
liangwei 5ea138f4e9 remove rescheduler 2018-08-22 11:49:14 +08:00
Kubernetes Submit Queue 7cd140aa4f
Merge pull request #67518 from tallclair/runtimeclass-cri
Automatic merge from submit-queue (batch tested with PRs 67298, 67518, 67635, 67673). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add RuntimeHandler to the CRI RunPodSandboxRequest

**What this PR does / why we need it**:

Adds the CRI portion of the [RuntimeClass](https://github.com/kubernetes/community/blob/master/keps/sig-node/0014-runtime-class.md#runtime-handler) API.

**Which issue(s) this PR fixes**:
For https://github.com/kubernetes/features/issues/585

**Special notes for your reviewer**:
The Kubernetes API is still blocked on a decision about alpha field usage, see [discussion on sig-architecture](https://groups.google.com/forum/#!topic/kubernetes-sig-architecture/y9FulL9Uq6A). I'd like to start with the CRI piece so we can unblock work on the CRI implementation side to have support ready when Kubernetes support is there.

**Release note**:
```release-note
[CRI] Adds a "runtime_handler" field to RunPodSandboxRequest, for selecting the runtime configuration to run the sandbox with (alpha feature).
```

/sig node
/milestone v1.12
/priority important-soon
/kind api-change
2018-08-21 18:33:04 -07:00
Lubomir I. Ivanov 1a1d236f61 Add labels to kubelet OWNERS files 2018-08-22 00:43:32 +03:00
Kubernetes Submit Queue c94ececccc
Merge pull request #67672 from dims/add-labels-to-owners-files
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add Labels to various OWNERS files

**What this PR does / why we need it**:

Will reduce the burden of manually adding labels. Information pulled
from:
https://github.com/kubernetes/community/blob/master/sigs.yaml

Change-Id: I17e661e37719f0bccf63e41347b628269cef7c8b

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-08-21 14:37:21 -07:00
Kubernetes Submit Queue 473ebb21d1
Merge pull request #67632 from feiskyer/verbose-fix
Automatic merge from submit-queue (batch tested with PRs 67661, 67497, 66523, 67622, 67632). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce verbose logs of node addresses requesting

**What this PR does / why we need it**:

Kubelet build from the master branch is flushing node addresses requesting logs, which is too verbose:

```sh
Aug 16 10:09:40 node-1 kubelet[24217]: I0816 10:09:40.658479   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:09:40 node-1 kubelet[24217]: I0816 10:09:40.666114   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:09:50 node-1 kubelet[24217]: I0816 10:09:50.666357   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:09:50 node-1 kubelet[24217]: I0816 10:09:50.674322   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:10:01 node-1 kubelet[24217]: I0816 10:10:00.674644   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:10:01 node-1 kubelet[24217]: I0816 10:10:00.682794   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:10:10 node-1 kubelet[24217]: I0816 10:10:10.683002   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:10:10 node-1 kubelet[24217]: I0816 10:10:10.689641   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:10:20 node-1 kubelet[24217]: I0816 10:10:20.690006   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:10:20 node-1 kubelet[24217]: I0816 10:10:20.696545   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
```

This PR sets them to level 5.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/cc @ingvagabund
2018-08-21 13:00:13 -07:00
Davanum Srinivas 9b43d97cd4
Add Labels to various OWNERS files
Will reduce the burden of manually adding labels. Information pulled
from:
https://github.com/kubernetes/community/blob/master/sigs.yaml

Change-Id: I17e661e37719f0bccf63e41347b628269cef7c8b
2018-08-21 13:59:08 -04:00
Ismo Puustinen dd3eeb3f46 device manager: don't do operations on nil pointer.
If grpc.DialContext() fails, a nil connection is returned. Check the
error before calling conn.Close().
2018-08-21 15:20:36 +03:00
Kubernetes Submit Queue d017bebf6b
Merge pull request #67145 from jiayingz/reboot-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fail container start if its requested device plugin resource is unknown.

With the change, Kubelet device manager now checks whether it has cached option state for the requested device plugin resource to make sure the resource is in ready state when we start the container.



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/67107

**Special notes for your reviewer**:

**Release note**:

```release-note
Fail container start if its requested device plugin resource hasn't registered after Kubelet restart.
```
2018-08-21 01:48:54 -07:00
Pengfei Ni 2d82cd811f Reduce verbose logs of node addresses requesting 2018-08-21 13:23:01 +08:00
Tim Allclair e6eb2e7dea Add RuntimeHandler to the CRI RunPodSandboxRequest 2018-08-17 10:56:49 -07:00
choury 36b92b9b29 cpumanager: rollback state if updateContainerCPUSet failed 2018-08-17 18:08:58 +08:00
Kubernetes Submit Queue 4819c65028
Merge pull request #67380 from tianshapjq/nits-in-manager.go
Automatic merge from submit-queue (batch tested with PRs 66209, 67380, 67499, 67437, 67498). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

nits in manager.go

**What this PR does / why we need it**:
just found some nits in the manager.go

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-08-17 03:01:09 -07:00
Kubernetes Submit Queue da3f1a3ea1
Merge pull request #64445 from squeed/more-cni-capabilities
Automatic merge from submit-queue (batch tested with PRs 64445, 67459, 67434). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim/network: pass ipRange CNI capabilities

**What this PR does / why we need it**:
Updates the dynamic (capability args) passed from Kubernetes to the CNI plugin. This means CNI plugin authors can offer more features and / or reduce their dependency on the APIServer.

Currently, we only pass the `portMappings` capability. CNI now supports `bandwidth` for bandwidth limiting and `ipRanges` for preferred IP blocks. This PR adds support for these two new capabilities.

Bandwidth limits are provided - as implemented in kubenet - via the pod annotations `kubernetes.io/ingress-bandwidth` and `kubernetes.io/egress-bandwidth`.

The ipRanges field simply passes the PodCIDR. This does mean that we need to change the NodeReady algorithm. Previously, we would only set NodeNotReady on missing PodCIDR when using Kubenet. Now, if the CNI configuration includes the `ipRanges` capability, we need to do the same.

**Which issue(s) this PR fixes**:
Fixes #64393

**Release note**:

```release-note
The dockershim now sets the "bandwidth" and "ipRanges" CNI capabilities (dynamic parameters). Plugin authors and administrators can now take advantage of this by updating their CNI configuration file. For more information, see the [CNI docs](https://github.com/containernetworking/cni/blob/master/CONVENTIONS.md#dynamic-plugin-specific-fields-capabilities--runtime-configuration)
```
2018-08-15 22:54:07 -07:00
Kubernetes Submit Queue cffa2aed0e
Merge pull request #64601 from hzxuzhonghu/cm-dynamic-loglevel-set
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Other components support set log level dynamically

**What this PR does / why we need it**:

#63777 introduced a way to set glog.logging.verbosity dynamically. 
We should enable this for all other components, which is specially useful in debugging. 


**Release note**:

```release-note
Expose `/debug/flags/v` to allow kubelet dynamically set glog logging level.  If want to change glog level to 3, you only have to send a PUT request like `curl -X PUT http://127.0.0.1:8080/debug/flags/v -d "3"`.
```
2018-08-15 21:32:46 -07:00
Kubernetes Submit Queue b904a3dc48
Merge pull request #67109 from MHBauer/error-typo
Automatic merge from submit-queue (batch tested with PRs 65561, 67109, 67450, 67456, 67402). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

error text refers to wrong stream type

**What this PR does / why we need it**:
clarify error text

**Special notes for your reviewer**:
I think this was a copy and paste error.

**Release note**:
```release-note
NONE
```
2018-08-15 18:15:10 -07:00
Kubernetes Submit Queue 6faf115870
Merge pull request #65561 from k82cn/k8s_65372_1
Automatic merge from submit-queue (batch tested with PRs 65561, 67109, 67450, 67456, 67402). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Compared preemption by priority in Kubelet

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65372 

**Release note**:
```release-note
None
```
2018-08-15 18:15:06 -07:00
Casey Callendrello 5d9ec20d7e kubelet/dockershim/network: pass ipRange dynamically to the CNI plugin
CNI now supports passing ipRanges dynamically. Pass podCIDR so that
plugins no longer have to look it up.
2018-08-15 17:41:09 +02:00
Kubernetes Submit Queue c5e74d128d
Merge pull request #66884 from NickrenREN/attacher-detacher-refactor
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Attacher/Detacher refactor for local storage

Proposal link: https://github.com/kubernetes/community/pull/2438

**What this PR does / why we need it**:

Attacher/Detacher refactor for the plugins which just need to mount device, but do not need to attach, such as local storage plugin.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

```release-note
Attacher/Detacher refactor for local storage
```

/sig storage
/kind feature
2018-08-15 07:03:48 -07:00
xuzhonghu 815799638b run update all 2018-08-15 17:18:27 +08:00
xuzhonghu c867bf9cab kubelet support dynamically set glog log level --v 2018-08-15 17:18:25 +08:00
Kubernetes Submit Queue c65f65cf6a
Merge pull request #65065 from sjenning/reduce-backoff-logging
Automatic merge from submit-queue (batch tested with PRs 66177, 66185, 67136, 67157, 65065). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: reduce logging for backoff situations

xref https://bugzilla.redhat.com/show_bug.cgi?id=1555057#c6

Pods that are in `ImagePullBackOff` or `CrashLoopBackOff` currently generate a lot of logging at the `glog.Info()` level.  This PR moves some of that logging to `V(3)` and avoids logging in situations where the `SyncPod` only fails because pod are in a BackOff error condition.

@derekwaynecarr @liggitt
2018-08-15 02:09:20 -07:00
Kubernetes Submit Queue fba4cf6f4c
Merge pull request #67334 from fqsghostcloud/indent-error-flow
Automatic merge from submit-queue (batch tested with PRs 67294, 67320, 67335, 67334, 67325). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

indent error flow
2018-08-15 00:07:18 -07:00
Kubernetes Submit Queue b4bfb1847c
Merge pull request #66446 from bertinatto/metrics_volume_manager
Automatic merge from submit-queue (batch tested with PRs 61212, 66369, 66446, 66895, 66969). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add more metrics for Volume Manager

**What this PR does / why we need it**:

This PR adds a few metrics described in the [Metrics Spec](https://docs.google.com/document/d/1Fh0T60T_y888LsRwC51CQHO75b2IZ3A34ZQS71s_F0g/edit#heading=h.ys6pjpbasqdu):

* Number of volumes in ActualStateofWorld and DesiredStateofWorld
* Number of times ReconstructVolume Spec on kubelet failed

**Release note**:

```release-note
NONE
```
2018-08-14 21:18:12 -07:00
Kubernetes Submit Queue 1f86c1cf26
Merge pull request #61212 from charrywanganthony/duplicated_import
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove duplicated import

**Release note**:

```release-note
NONE
```
2018-08-14 20:18:00 -07:00
Kubernetes Submit Queue 99053fbf33
Merge pull request #64877 from AdamDang/patch-11
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Typo fix in returned message: utilites->utilities

Line 250: utilites->utilities
2018-08-14 18:57:50 -07:00
Kubernetes Submit Queue af2f72af47
Merge pull request #66587 from feiskyer/revert-63905
Automatic merge from submit-queue (batch tested with PRs 66491, 66587, 66856, 66657, 66923). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert #63905: Setup dns servers and search domains for Windows Pods

**What this PR does / why we need it**:

From https://github.com/kubernetes/kubernetes/pull/63905#issuecomment-396709775:

> I don't think this change does anything on Windows. On windows, the network endpoint configuration is taken care of completely by CNI. If you would like to pass on the custom dns polices from the pod spec, it should be dynamically going to the cni configuration that gets passed to CNI. From there, it would be passed down to platform and would be taken care of appropriately by HNS.

> etc\resolve.conf is very specific to linux and that should remain linux speicfic implementation. We should be trying to move away from platform specific code in Kubelet.
Docker is not managing the networking here for windows. So it doens't really care about any network settings. So passing it to docker shim's hostconfig also doens;t make sense here.

DNS for Windows containers will be set by CNI plugins.  And this change also introduced two endpoints for sandbox container.  So this PR reverts #63905 .


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

The PR should also be cherry-picked to release-1.11.

Also, https://github.com/kubernetes/kubernetes/issues/66588 is opened to track the process of pushing this to CNI.

**Release note**:

```release-note
Revert #63905: Setup dns servers and search domains for Windows Pods. DNS for Windows containers will be set by CNI plugins.
```

/sig windows
/sig node
/kind bug
2018-08-14 17:55:07 -07:00
tianshapjq 81081dc9e7 nits in manager.go 2018-08-15 08:16:04 +08:00
tianshapjq 27c5ced809 should not event directly 2018-08-14 14:35:47 +08:00
NickrenREN c7e4466873 attacher/detacher refactor 2018-08-14 11:12:41 +08:00
Fabio Bertinatto 376a94e039 Add more metrics for Volume Manager
Specifically:

* Number of volumes in ActualStateofWorld and DesiredStateofWorld
* Number of times ReconstructVolume Spec on kubelet failed
2018-08-13 17:36:36 +02:00
fqsghostcloud 21f9ac0e7e
indent error flow
indent error flow
2018-08-13 17:31:31 +08:00
Yu-Ju Hong 390b158db9 kubelet: plumb context for log requests
This allows kubelets to stop the necessary work when the context has
been canceled (e.g., connection closed), and not leaking a goroutine
and inotify watcher waiting indefinitely.
2018-08-10 17:35:46 -07:00
Kubernetes Submit Queue 57bb26911d
Merge pull request #53042 from chentao1596/support-unit-test-case-for-pod-format
Automatic merge from submit-queue (batch tested with PRs 67177, 53042). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding unit tests to methods of pod's format

What this PR does / why we need it:

Add unit test cases, thank you!
2018-08-08 23:49:06 -07:00
Jiaying Zhang 7b1ae66432 Fail container start if its requested device plugin resource doesn't
have cached option state to make sure the device plugin resource is
in ready state when we start the container.
2018-08-08 13:11:36 -07:00
Morgan Bauer 0b709dcf7d
error text refers to wrong stream type 2018-08-07 18:20:24 -07:00
Kubernetes Submit Queue 60ac433922
Merge pull request #66946 from LinEricYang/unused-variable
Automatic merge from submit-queue (batch tested with PRs 66512, 66946, 66083). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet/cm/cpumanager: Fix unused variable "skipIfPermissionsError"

The variable "skipIfPermissionsError" is not needed even when
permission error happened.
2018-08-06 19:44:04 -07:00
Kubernetes Submit Queue d114692a58
Merge pull request #58058 from tianshapjq/cleanup-useless-var-deviceplugin/types.go
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

clean up useless variables in deviceplugin/types.go

**What this PR does / why we need it**:
some variables is useless for reasons, I think we need a clean up.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note

```NONE
2018-08-06 16:33:54 -07:00
Da K. Ma a75d625cc3 Compared preemption by priority.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-08-04 11:33:07 +08:00
Kubernetes Submit Queue cb1ef9f7e8
Merge pull request #64815 from dixudx/hostname_empty
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

error out empty hostname

**What this PR does / why we need it**:
For linux, the hostname is read from file `/proc/sys/kernel/hostname` directly, which can be overwritten with whitespaces.

Should error out such invalid hostnames.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#835

**Special notes for your reviewer**:
/cc luxas timothysc 

**Release note**:

```release-note
nodes: improve handling of erroneous host names
```
2018-08-03 17:13:32 -07:00
Kubernetes Submit Queue 6a33d1ba10
Merge pull request #66938 from sjenning/avoid-mount-delay
Automatic merge from submit-queue (batch tested with PRs 62901, 66562, 66938, 66927, 66926). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: volumemanager: poll immediate when waiting for volume attachment

Currently, `WaitForAttachAndMount()` introduces a 300ms minimum delay by using `wait.Poll()` rather than `wait.PollImmediate()`.  This wait constitutes >99% of the total processing time for `syncPod()`.  Changing this reduced `syncPod()` processing time for a simple busybox pod with one emptyDir volume from 302ms to 2ms.

@derekwaynecarr @pmorie @smarterclayton @jsafrane 

/sig node
/release-note-none
2018-08-02 19:57:15 -07:00
Lin Yang b7e1f0bf17 kubelet/cm/cpumanager: Fix unused variable "skipIfPermissionsError"
The variable "skipIfPermissionsError" is not needed even when
permission error happened.
2018-08-02 17:24:33 -07:00
Kubernetes Submit Queue 266cf70ac0
Merge pull request #66617 from pravisankar/fix-pod-cgroup-parent
Automatic merge from submit-queue (batch tested with PRs 66190, 66871, 66617, 66293, 66891). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Do not set cgroup parent when --cgroups-per-qos is disabled

When --cgroups-per-qos=false (default is true), kubelet sets pod
container management to podContainerManagerNoop implementation and
GetPodContainerName() returns '/' as cgroup parent (default cgroup root).

(1) In case of 'systemd' cgroup driver, '/' is invalid parent as
docker daemon expects '.slice' suffix and throws this error:
'cgroup-parent for systemd cgroup should be a valid slice named as \"xxx.slice\"'
(5fc12449d8/daemon/daemon_unix.go (L618))
'/' corresponds to '-.slice' (root slice) in systemd but I don't think
we want to assign root slice instead of runtime specific default value.
In case of docker runtime, this will be 'system.slice'
(e2593239d9/daemon/oci_linux.go (L698))

(2) In case of 'cgroupfs' cgroup driver, '/' is valid parent but I don't
think we want to assign root instead of runtime specific default value.
In case of docker runtime, this will be '/docker'
(e2593239d9/daemon/oci_linux.go (L695))

Current fix will not set the cgroup parent when --cgroups-per-qos is disabled.

```release-note
Fix pod launch by kubelet when --cgroups-per-qos=false and --cgroup-driver="systemd"
```
2018-08-02 15:42:16 -07:00
Kubernetes Submit Queue 2f21394859
Merge pull request #66190 from linyouchong/issue-66189
Automatic merge from submit-queue (batch tested with PRs 66190, 66871, 66617, 66293, 66891). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix nil pointer dereference in node_container_manager#enforceExisting

**What this PR does / why we need it**:
fix nil pointer dereference in node_container_manager#enforceExisting

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66189

**Special notes for your reviewer**:
NONE

**Release note**:
```release-note
kubelet: fix nil pointer dereference while enforce-node-allocatable flag is not config properly
```
2018-08-02 15:42:09 -07:00
Seth Jennings 0413850d14 kubelet: volumemanager: poll immediate when waiting for volume attachment 2018-08-02 16:41:15 -05:00
Kubernetes Submit Queue c2536e2b0d
Merge pull request #61159 from linyouchong/linyouchong-20180314
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Skip checking when failSwapOn=false

**What this PR does / why we need it**:
Skip checking when failSwapOn=false

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
NONE
**Release note**:
```
NONE
```
2018-08-02 14:09:39 -07:00
Kubernetes Submit Queue 4a54f3f0d6
Merge pull request #66779 from deads2k/api-05-easy-unit
Automatic merge from submit-queue (batch tested with PRs 66850, 66902, 66779, 66864, 66912). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add methods to apimachinery to easy unit testing

When unit testing, you often want a selective scheme and codec factory.  Rather than writing the vars and the init function and the error handling, you can simply do

`scheme, codecs := testing.SchemeForInstallOrDie(install.Install)`

@kubernetes/sig-api-machinery-misc 
@sttts 

```release-note
NONE
```
2018-08-02 10:03:16 -07:00
Kubernetes Submit Queue 94c2c6c842
Merge pull request #66510 from sjenning/add-image-gc-validation
Automatic merge from submit-queue (batch tested with PRs 65730, 66615, 66684, 66519, 66510). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: add image-gc low/high validation check

Currently, there is no protection against setting the high watermark <= the low watermark for image GC

This PR adds a validation rule for that.

@smarterclayton
2018-08-01 15:52:20 -07:00
Kubernetes Submit Queue 7ac32a4f7a
Merge pull request #61983 from mikedanese/closur
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

volumemanager: remove unneccesary closure

```release-note
NONE
```
2018-08-01 14:26:12 -07:00
David Eads d3bd0eb1d5 make package name match all the import aliases 2018-08-01 15:31:12 -04:00
Kubernetes Submit Queue 0a284c1cde
Merge pull request #66082 from sjenning/fix-is-critical-checks
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

move feature gate checks inside IsCriticalPod

Currently `IsCriticalPod()` calls throughout the code are protected by `utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation)`.

However, with Pod Priority, this gate could be disabled which skips the priority check inside IsCriticalPod().

This PR moves the feature gate checking inside `IsCriticalPod()` and handles both situations properly.

@aveshagarwal @ravisantoshgudimetla @derekwaynecarr 
/sig node
/sig scheduling
/king bug
2018-08-01 11:47:08 -07:00
Di Xu b3dfe0c652 nodes: improve handling of erroneous host names 2018-08-01 14:57:25 +08:00
Chao Wang 39a4730db6 remove duplicated import 2018-08-01 13:27:42 +08:00
Mike Danese f3922dff19 volumemanager: remove unneccesary closure 2018-07-31 18:48:15 -07:00
Kubernetes Submit Queue c0bf2e680f
Merge pull request #66270 from Pingan2017/delevent
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

delete unused events

**What this PR does / why we need it**:
 events (HostNetworkNotSupported, UndefinedShaper) is unused since #47058
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-31 12:14:06 -07:00
Kubernetes Submit Queue f2c6473e25
Merge pull request #66718 from ipuustin/cpu-manager-validate-offline
Automatic merge from submit-queue (batch tested with PRs 66623, 66718). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

cpumanager: validate topology in static policy

**What this PR does / why we need it**:

This patch adds a check for the static policy state validation. The check fails if the CPU topology obtained from cadvisor doesn't match with the current topology in the state file.

If the CPU topology has changed in a node, cpumanager static policy might try to assign non-present cores to containers.

For example in my test case, static policy had the default CPU set of `0-1,4-7`. Then kubelet was shut down and CPU 7 was offlined. After restarting the kubelet, CPU manager tries to assign the non-existent CPU 7 to containers which don't have exclusive allocations assigned to them:

    Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6)

This breaks the exclusivity, since the CPUs from the shared pool don't get assigned to non-exclusive containers, meaning that they can execute on the exclusive CPUs.

**Release note**:

```release-note
Added CPU Manager state validation in case of changed CPU topology.
```
2018-07-31 08:05:06 -07:00
Ismo Puustinen 3bb5ca9257 cpumanager: add test for available CPUs in static policy.
Test the cases where the number of CPUs available in the system is
smaller or larger than the number of CPUs known in the state, which
should lead to a panic. This covers both CPU onlining and offlining. The
case where the number of CPUs matches is already covered by the
"non-corrupted state" test.
2018-07-31 10:20:37 +03:00
Kubernetes Submit Queue 2bee858a7b
Merge pull request #66284 from stewart-yu/stewart-sharedtype-move
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move the` k8s.io/kubernetes/pkg/util/pointer` package to` k8s.io/utils/pointer`

**What this PR does / why we need it**:
Move `k8s.io/kubernetes/pkg/util/pointer` to  `shared utils` directory, so that we can use it  easily.
Close #66010 accidentally, and can't reopen it, so the same as #66010 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-30 19:50:36 -07:00
Ismo Puustinen 4f604eb73c cpumanager: validate topology in static policy.
This patch adds a check for the static policy state validation. The
check fails if the CPU topology obtained from cadvisor doesn't match
with the current topology in the state file.

If the CPU topology has changed in a node, cpu manager static policy
might try to assign non-present cores to containers.

For example in my test case, static policy had the default CPU set of
0-1,4-7. Then kubelet was shut down and CPU 7 was offlined. After
restarting the kubelet, CPU manager tries to assign the non-existent CPU
7 to containers which don't have exclusive allocations assigned to them:

 Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6)

This breaks the exclusivity, since the CPUs from the shared pool don't
get assigned to non-exclusive containers, meaning that they can execute
on the exclusive CPUs.
2018-07-30 08:49:13 +03:00
hui luo 7101c17498 While reviewing devicemanager code, found
the caching layer on endpoint is redundant.

Here are the 3 related objects in picture:
devicemanager <-> endpoint <-> plugin

Plugin is the source of truth for devices
and device health status.

devicemanager maintain healthyDevices,
unhealthyDevices, allocatedDevices based on updates
from plugin.

So there is no point for endpoint caching devices,
this patch is removing this caching layer on endpoint,

Also removing the Manager.Devices() since i didn't
find any caller of this other than test, i am adding a
notification channel to facilitate testing,

If we need to get all devices from manager in future,
it just need to return healthyDevices + unhealthyDevices,
we don't have to call endpoint after all.

This patch makes code more readable, data model been simplified.
2018-07-29 21:07:14 -07:00
Kubernetes Submit Queue 8e2a444b6d
Merge pull request #66593 from stewart-yu/stewart-kubelet-commentclean
Automatic merge from submit-queue (batch tested with PRs 66593, 66727, 66558). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove the outdate comments in tryRegisterWithAPIServer

**What this PR does / why we need it**:
some judgement about ExternalID removed in #61877, so remove the outdate comments in tryRegisterWithAPIServer


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-27 18:05:00 -07:00
stewart-yu f1343af5d7 auto-generated file 2018-07-28 07:54:17 +08:00
Kubernetes Submit Queue 32e38b6659
Merge pull request #58755 from vikaschoudhary16/probing-mode
Automatic merge from submit-queue (batch tested with PRs 58755, 66414). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use probe based plugin watcher mechanism in Device Manager

**What this PR does / why we need it**:
Uses this probe based utility in the device plugin manager.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56944 

**Notes For Reviewers**:
Changes are backward compatible and existing device plugins will continue to work. At the same time, any new plugins that has required support for probing model (Identity service implementation), will also work. 


**Release note**
```release-note
Add support kubelet plugin watcher in device manager.
```
/sig node
/area hw-accelerators
/cc /cc @jiayingz @RenaudWasTaken @vishh @ScorpioCPH @sjenning @derekwaynecarr @jeremyeder @lichuqiang @tengqm @saad-ali @chakri-nelluri @ConnorDoyle
2018-07-27 15:20:06 -07:00
Kubernetes Submit Queue 2630d09c84
Merge pull request #66596 from BSWANG/master
Automatic merge from submit-queue (batch tested with PRs 66665, 66707, 66596). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix kubelet npe panic on device plugin return zero container

Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>



**What this PR does / why we need it**:
Fix kubelet panic when device plugin return zero containers. Panic logs like follows:
```
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: /workspace/anago-v1.10.4-beta.0.68+5ca598b4ba5abb/src/k8s.io/kubernetes/_output/dockerized/go/src/
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: /workspace/anago-v1.10.4-beta.0.68+5ca598b4ba5abb/src/k8s.io/kubernetes/_output/dockerized/go/src/
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: /workspace/anago-v1.10.4-beta.0.68+5ca598b4ba5abb/src/k8s.io/kubernetes/_output/dockerized/go/src/
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: E0717 12:50:24.726856   25815 runtime.go:66] Observed a panic: "index out of range" (runtime error
: index out of range)
```

**Release note**:

```
NONE
```
2018-07-27 12:57:11 -07:00
stewart-yu 55251c716a update the import file for move util/pointer to k8s.io/utils 2018-07-27 19:47:02 +08:00
Kubernetes Submit Queue ed58d0dfd4
Merge pull request #63955 from k82cn/k8s_63897
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Taint node when initializing node.

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63897 

**Release note**:
```release-note
If `TaintNodesByCondition` enabled, taint node with `TaintNodeUnschedulable` when
initializing node to avoid race condition.
```
2018-07-26 21:01:16 -07:00
Kubernetes Submit Queue cef2d325ee
Merge pull request #66395 from awly/fix-kubelet-exec-plugin-startup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update http.Transport if it already exists in ExecProvider

**What this PR does / why we need it**:
This unbreaks ExecPlugin. Without the change, we hit this error
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/transport/transport.go#L32

**Release note**:
```release-note
Fix kubelet startup failure when using ExecPlugin in kubeconfig
```
2018-07-26 10:47:05 -07:00
Andrew Lytvynov 3357b5ecf4 Set connrotation dialer via restclient.Config.Dialer
Instead of Transport. This fixes ExecPlugin, which fails if
restclient.Config.Transport is set.
2018-07-25 16:23:57 -07:00
stewart-yu ffbd7b22b3 remove the unnecessary comments in tryRegisterWithAPIServer for externalID removed in PR#61877 2018-07-25 11:23:56 +08:00
bingshen.wbs b1bdd043c4 fix kubelet npe on device plugin return zero container
Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>
2018-07-25 10:15:30 +08:00
Pengfei Ni cfb776dcdd Revert #63905: Setup dns servers and search domains for Windows Pods 2018-07-25 09:58:47 +08:00
Seth Jennings b1ec6da4c7 kubelet: add image-gc low/high validation check 2018-07-23 13:14:31 -05:00
Da K. Ma aac9f1cbaa Taint node when initializing node.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-07-23 12:52:05 +08:00
Lee Verberne 7c558fb7bb Remove kubelet-level docker shared pid flag
The --docker-disable-shared-pid flag has been deprecated since 1.10 and
has been superceded by ShareProcessNamespace in the pod API, which is
scheduled for beta in 1.12.
2018-07-22 16:54:44 +02:00
Kubernetes Submit Queue 53ee0c8652
Merge pull request #65660 from mtaufen/incremental-refactor-kubelet-node-status
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor kubelet node status setters, add test coverage

This internal refactor moves the node status setters to a new package, explicitly injects dependencies to facilitate unit testing, and adds individual unit tests for the setters.

I gave each setter a distinct commit to facilitate review.

Non-goals:
- I intentionally excluded the class of setters that return a "modified" boolean, as I want to think more carefully about how to cleanly handle the behavior, and this PR is already rather large.
- I would like to clean up the status update control loops as well, but that belongs in a separate PR.

```release-note
NONE
```
2018-07-20 12:12:24 -07:00
Ravi Sankar Penta 0282720e29 Do not set cgroup parent when --cgroups-per-qos is disabled
When --cgroups-per-qos=false (default is true), kubelet sets pod
container management to podContainerManagerNoop implementation and
GetPodContainerName() returns '/' as cgroup parent (default cgroup root).

(1) In case of 'systemd' cgroup driver, '/' is invalid parent as
docker daemon expects '.slice' suffix and throws this error:
'cgroup-parent for systemd cgroup should be a valid slice named as \"xxx.slice\"'
(5fc12449d8/daemon/daemon_unix.go (L618))
'/' corresponds to '-.slice' (root slice) in systemd but I don't think
we want to assign root slice instead of runtime specific default value.
In case of docker runtime, this will be 'system.slice'
(e2593239d9/daemon/oci_linux.go (L698))

(2) In case of 'cgroupfs' cgroup driver, '/' is valid parent but I don't
think we want to assign root instead of runtime specific default value.
In case of docker runtime, this will be '/docker'
(e2593239d9/daemon/oci_linux.go (L695))

Current fix will not set the cgroup parent when --cgroups-per-qos is disabled.
2018-07-20 10:25:50 -07:00
Pengfei Ni 4272c0fde6 Add unit tests for windows stats 2018-07-20 13:01:23 +08:00
Kubernetes Submit Queue d2cc34fb07
Merge pull request #65771 from smarterclayton/untyped
Automatic merge from submit-queue (batch tested with PRs 65771, 65849). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add a new conversion path to replace GenericConversionFunc

reflect.Call is very expensive. We currently use a switch block as part of AddGenericConversionFunc to avoid the bulk of top level a->b conversion for our primary types which is hand-written. Instead of having these be handwritten, we should generate them.

The pattern for generating them looks like:

```
scheme.AddConversionFunc(&v1.Type{}, &internal.Type{}, func(a, b interface{}, scope conversion.Scope) error {
  return Convert_v1_Type_to_internal_Type(a.(*v1.Type), b.(*internal.Type), scope)
})
```

which matches AddDefaultObjectFunc (which proved out the approach last year). The
conversion machinery should then do a simple map lookup based on the incoming types and invoke the function.  Like defaulting, it's up to the caller to match the types to arguments, which we do by generating this code.  This bypasses reflect.Call and in the future allows Golang mid-stack inlining to optimize this code.

As part of this change I strengthened registration of custom functions to be generated instead of hand registered, and also strengthened error checking of the generator when it sees a manual conversion to error out.  Since custom functions are automatically used by the generator, we don't really have a case for not registering the functions.

Once this is fully tested out, we can remove the reflection based path and the old registration methods, and all conversion will work from point to point methods (whether generated or custom).

Much of the need for the reflection path has been removed by changes to generation (to omit fields) and changes to Go (to make assigning equivalent structs easy).

```release-note
NONE
```
2018-07-19 09:29:00 -07:00
Pengfei Ni a2fe1ab059 Add stats for system containers "pods" 2018-07-19 22:20:24 +08:00
Kubernetes Submit Queue afcc156806
Merge pull request #66350 from aveshagarwal/master-rhbz-1601378
Automatic merge from submit-queue (batch tested with PRs 66175, 66324, 65828, 65901, 66350). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Start cloudResourceSyncsManager before getNodeAnyWay (initializeModules) to avoid kubelet getting stuck in retrieving node addresses from a cloudprovider.

**What this PR does / why we need it**:
This PR starts cloudResourceSyncsManager before getNodeAnyWay (initializeModules) otherwise kubelet gets stuck in setNodeAddress->kl.cloudResourceSyncManager.NodeAddresses() (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L470) forever retrieving node addresses from a cloud provider, and due to this cloudResourceSyncsManager will not be started at all.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```

@ingvagabund @derekwaynecarr @sjenning @kubernetes/sig-node-bugs
2018-07-18 16:42:22 -07:00
Avesh Agarwal 6c33ca13e9 Start cloudResourceSyncsManager before getNodeAnyWay (initializeModules)
so that kubelet does not get stuck in retriving node addresses from a cloudprovider.
2018-07-18 15:15:03 -04:00
Clayton Coleman ef561ba8b5
generated: Avoid use of reflect.Call in conversion code paths 2018-07-17 23:02:16 -04:00
Shimin Guo e8cd28ae57 fix a panic due to assignment to nil map 2018-07-17 12:34:20 -07:00
vikaschoudhary16 a5842503eb Use probe based plugin discovery mechanism in device manager 2018-07-17 04:02:31 -04:00
chentao1596 ce3f5002dd Add unit tests for methods of pod's format 2018-07-17 15:37:13 +08:00
chentao1596 9319be121e Change the method name from PodsWithDeletiontimestamps to PodsWithDeletionTimestamps 2018-07-17 15:34:32 +08:00
Pingan2017 45fac6f469 delete unused events 2018-07-17 14:19:50 +08:00
Michael Taufen 0f8976bf84 eliminate unnecessary helper 2018-07-16 09:09:48 -07:00
Michael Taufen 0170826542 port setVolumeLimits to Setter abstraction, add test 2018-07-16 09:09:48 -07:00
Michael Taufen c5a5e21639 port setNodeStatusGoRuntime to Setter abstraction 2018-07-16 09:09:48 -07:00
Michael Taufen 8e217f7102 port setNodeStatusImages to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen b7ec333f01 port setNodeStatusDaemonEndpoints to Setter abstraction 2018-07-16 09:09:47 -07:00
Michael Taufen 59bb21051e port setNodeStatusVersionInfo to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen 596fa89af0 port setNodeStatusMachineInfo to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen aa94a3ba4e lift node-info setters into defaultNodeStatusFuncs
Instead of hiding these behind a helper, we just register them in a
uniform way. We are careful to keep the call-order of the setters the
same, though we can consider re-ordering in a future PR to achieve
fewer appends.
2018-07-16 09:09:47 -07:00
Michael Taufen 2df7e1ad5c port setNodeVolumesInUseStatus to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen 3e03e0611e port setNodeReadyCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen e0b6ae219f port setNodePIDPressureCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen b26e4dfa7f port setNodeDiskPressureCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen f057c9a4ae port setNodeMemoryPressureCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen c33f321acd port setNodeOODCondition to Setter abstraction 2018-07-16 09:09:47 -07:00
Michael Taufen 15b03b8c0c port setNodeAddress to Setter abstraction, port test
also put cloud_request_manager.go in its own package
2018-07-16 09:09:47 -07:00
Michael Taufen a3cbbbd931 move call to defaultNodeStatusFuncs to after the rest of the Kubelet is constructed 2018-07-16 09:03:13 -07:00
Michael Taufen 08c94e0616 add nodestatus package with Setter abstraction for composable node constructors 2018-07-16 09:03:13 -07:00
Michael Taufen d245e72bae remove incorrect comment referencing removed functionality
The cbr0 configuration behavior this comment references was removed in #34906
2018-07-16 09:03:13 -07:00
linyouchong 6ff285bce3 fix nil pointer dereference in node_container_manager#enforceExistingCgroup 2018-07-14 10:42:42 +08:00
Joonyoung Park e6d02e9410 fix metrics help comment
pod_start_latency_microseconds is not broken down by podname.
2018-07-13 10:26:35 +09:00
Kubernetes Submit Queue 337dfe0a9c
Merge pull request #65594 from liggitt/node-csr-addresses-2
Automatic merge from submit-queue (batch tested with PRs 65052, 65594). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Derive kubelet serving certificate CSR template from node status addresses

xref https://github.com/kubernetes/features/issues/267
fixes #55633

Builds on https://github.com/kubernetes/kubernetes/pull/65587

* Makes the cloud provider authoritative when recording node status addresses
* Makes the node status addresses authoritative for the kube-apiserver determining how to speak to a kubelet (stops paying attention to the hostname label when determining how to reach a kubelet, which was only done to support kubelets < 1.5)
* Updates kubelet certificate rotation to be driven from node status
  * Avoids needing to compute node addresses a second time, and differently, in order to request serving certificates.
  * Allows the kubelet to react to changes in its status addresses by updating its serving certificate
  * Allows the kubelet to be driven by external cloud providers recording node addresses on the node status

test procedure:
```sh
# setup
export FEATURE_GATES=RotateKubeletServerCertificate=true
export KUBELET_FLAGS="--rotate-server-certificates=true --cloud-provider=external"

# cleanup from previous runs
sudo rm -fr /var/lib/kubelet/pki/

# startup
hack/local-up-cluster.sh

# wait for a node to register, verify it didn't set addresses
kubectl get nodes 
kubectl get node/127.0.0.1 -o jsonpath={.status.addresses}

# verify the kubelet server isn't available, and that it didn't populate a serving certificate
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
ls -la /var/lib/kubelet/pki

# set an address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
  -H "Content-Type: application/merge-patch+json" \
  --data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"}]}}'

# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...

# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname, but NOT the IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki

# set an hostname and IP address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
  -H "Content-Type: application/merge-patch+json" \
  --data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"},{"type":"InternalIP","address":"127.0.0.1"}]}}'

# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...

# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname AND IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki
```

```release-note
* kubelets that specify `--cloud-provider` now only report addresses in Node status as determined by the cloud provider
* kubelet serving certificate rotation now reacts to changes in reported node addresses, and will request certificates for addresses set by an external cloud provider
```
2018-07-11 22:25:07 -07:00
Seth Jennings f2a7654978 move feature gate checks inside IsCriticalPod 2018-07-11 16:10:05 -05:00
Kubernetes Submit Queue 0972ce1acc
Merge pull request #65649 from rsc/fix-printf
Automatic merge from submit-queue (batch tested with PRs 66076, 65792, 65649). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubernetes: fix printf format errors

These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.

```release-note
NONE
```
2018-07-11 14:09:08 -07:00
jiaxuanzhou 6ac4a8588e fix bug for garbage collection 2018-07-11 09:33:08 +08:00
Russ Cox 2bd91dda64 kubernetes: fix printf format errors
These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.

Lubomir I. Ivanov <neolit123@gmail.com>
applied ammend for:
  pkg/cloudprovider/provivers/vsphere/nodemanager.go
2018-07-11 00:10:15 +03:00
Kubernetes Submit Queue 421789328f
Merge pull request #65997 from tallclair/writer
Automatic merge from submit-queue (batch tested with PRs 66030, 65997). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove unused io util writer & volume host GetWriter()

Cleanup unused code.
Fixes https://github.com/kubernetes/kubernetes/issues/16971

**Release note**:
```release-note
NONE
```

/kind cleanup
/sig storage
2018-07-10 12:46:09 -07:00
Jordan Liggitt 7828e5d0f9
Make cloud provider authoritative for node status address reporting 2018-07-10 14:33:48 -04:00
Jordan Liggitt db9d3c2d10
Derive kubelet serving certificate CSR template from node status addresses 2018-07-10 14:33:48 -04:00
Kubernetes Submit Queue 13f9c26fd7
Merge pull request #65902 from wojtek-t/kube_proxy_less_allocations_2
Automatic merge from submit-queue (batch tested with PRs 65902, 65781). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Avoid unnecessary allocations in kube-proxy
2018-07-09 23:07:01 -07:00
Kubernetes Submit Queue 55620e2be6
Merge pull request #65987 from Random-Liu/fix-pod-worker-deadlock
Automatic merge from submit-queue (batch tested with PRs 65987, 65962). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix pod worker deadlock.

Preemption will stuck forever if `killPodNow` timeout once. The sequence is:
* `killPodNow` create the response channel (size 0) and send it to pod worker.
* `killPodNow` timeout and return.
*  Pod worker finishes killing the pod, and tries to send back response via the channel.

However, because the channel size is 0, and the receiver has exited, the pod worker will stuck forever.

In @jingxu97's case, this causes a critical system pod (apiserver) unable to come up, because the csi pod can't be preempted.

I checked the history, and the bug was introduced 2 years ago 6fefb428c1.

I think we should at least cherrypick this to `1.11` since preemption is beta and enabled by default in 1.11.

@kubernetes/sig-node-bugs @derekwaynecarr @dashpole @yujuhong 
Signed-off-by: Lantao Liu <lantaol@google.com>

```release-note
none
```
2018-07-09 16:53:59 -07:00
Tim Allclair b1012b2543
Remove unused io util writer & volume host GetWriter() 2018-07-09 14:09:48 -07:00
Lantao Liu 0f4c739b2c Fix pod worker deadlock.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-07-09 11:45:26 -07:00
Kubernetes Submit Queue f70410959d
Merge pull request #65226 from ingvagabund/store-cloud-provider-latest-node-addresses
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Store the latest cloud provider node addresses

**What this PR does / why we need it**:
Buffer the recently retrieved node address so they can be used as soon as the next node status update is run.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65814

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
2018-07-09 10:47:07 -07:00
Kubernetes Submit Queue e943d09fa3
Merge pull request #63194 from m1093782566/cni-ts
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding traffic shaping support for CNI network driver

**What this PR does / why we need it**:

Adding traffic shaping support for CNI network driver - it's also a sub-task of kubenet deprecation work.

Design document is available here: https://github.com/kubernetes/community/pull/1893

**Which issue(s) this PR fixes**:
Fixes #

**Special notes for your reviewer**:

/cc @freehan @jingax10 @caseydavenport @dcbw 

/sig network
/sig node

**Release note**:

```release-note
Support traffic shaping for CNI network driver
```
2018-07-08 23:54:25 -07:00
liangwei 34d848eb1a add cni bandwidth test 2018-07-09 09:51:33 +08:00
m1093782566 8038a0dfa6 add traffic shaping support for CNI network driver 2018-07-08 22:22:25 +08:00
wojtekt 6e50f39dbd Avoid allocations when parsing iptables 2018-07-08 10:55:19 +02:00
Kubernetes Submit Queue 097f300a4d
Merge pull request #65707 from dims/remove-deprecated-cadvisor-port
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove --cadvisor-port - has been deprecated since v1.10

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56523

**Special notes for your reviewer**:
- Deprecated in https://github.com/kubernetes/kubernetes/pull/59827 (v1.10)
- Disabled in https://github.com/kubernetes/kubernetes/pull/63881 (v1.11)

**Release note**:

```release-note
[action required] The formerly publicly-available cAdvisor web UI that the kubelet started using `--cadvisor-port` is now entirely removed in 1.12. The recommended way to run cAdvisor if you still need it, is via a DaemonSet.
```
2018-07-07 05:28:13 -07:00
Lantao Liu 3193a4a469 Fix RunAsGroup. 2018-07-06 15:42:26 -07:00
choury 8e4b62a74b
Remove duplicate check line
There is a same [line](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cpumanager/policy_static.go#L81).
2018-07-05 11:07:56 +08:00
Jordan Liggitt b7b4b84afe
Add healthz check to ensure logging is not blocked 2018-07-03 22:27:23 -04:00
Jan Chaloupka 9d9fb4de29 Put all the node address cloud provider retrival complex logic into cloudResourceSyncManager 2018-07-03 20:11:35 +02:00
Davanum Srinivas 5feab86329
Remove --cadvisor-port - has been deprecated since v1.10
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2018-07-02 08:54:14 -04:00
wojtekt e50c0b904f Speed up cluster startup in GCE 2018-07-02 10:22:32 +02:00
Kubernetes Submit Queue b265f7c682
Merge pull request #65582 from dtaniwaki/fix-test-failure-of-truncated-time
Automatic merge from submit-queue (batch tested with PRs 65582, 65480, 65310, 65644, 65645). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix test failure of truncated time

**What this PR does / why we need it**:

The test of `TestFsStoreAssignedModified` in `pkg/kubelet/kubeletconfig/checkpoint/store` fails in my environment like below.

```
$ make test WHAT=./pkg/kubelet/kubeletconfig/checkpoint/store/
Running tests for APIVersion: v1,admissionregistration.k8s.io/v1alpha1,admissionregistration.k8s.io/v1beta1,admission.k8s.io/v1beta1,apps/v1beta1,apps/v1beta2,apps/v1,authentication.k8s.io/v1,authentication.k8s.io/v1beta1,authorization.k8s.io/v1,authorization.k8s.io/v1beta1,autoscaling/v1,autoscaling/v2beta1,batch/v1,batch/v1beta1,batch/v2alpha1,certificates.k8s.io/v1beta1,coordination.k8s.io/v1beta1,extensions/v1beta1,events.k8s.io/v1beta1,imagepolicy.k8s.io/v1alpha1,networking.k8s.io/v1,policy/v1beta1,rbac.authorization.k8s.io/v1,rbac.authorization.k8s.io/v1beta1,rbac.authorization.k8s.io/v1alpha1,scheduling.k8s.io/v1alpha1,scheduling.k8s.io/v1beta1,settings.k8s.io/v1alpha1,storage.k8s.io/v1beta1,storage.k8s.io/v1,storage.k8s.io/v1alpha1,
+++ [0628 22:53:39] Running tests without code coverage
--- FAIL: TestFsStoreAssignedModified (0.00s)
        fsstore_test.go:316: expect "2018-06-28T22:53:43+09:00" but got "2018-06-28T22:53:43+09:00"
FAIL
FAIL    k8s.io/kubernetes/pkg/kubelet/kubeletconfig/checkpoint/store    0.236s
make: *** [test] Error 1
```

My environment is
OS: macOS Sierra Version 10.12.6
File System: Journaled HFS+

The error message confused me because the comparing times looked the same in the error log. If we know certain systems truncate times, I think we can just compare less precise times to avoid confusions in tests.

**Special notes for your reviewer**:
N/A

**Release note**:

```release-note
NONE
```
2018-06-29 20:14:06 -07:00
Daisuke Taniwaki 7d4c85b02c
Fix test failure of truncated time 2018-06-30 01:14:44 +09:00
Kubernetes Submit Queue 93f3249e3c
Merge pull request #65595 from sjenning/feature-gate-lsi-capacity
Automatic merge from submit-queue (batch tested with PRs 60150, 65467, 65487, 65595, 65374). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: feature gate LSI capacity calculation

Currently if `cm.cadvisorInterface.RootFsInfo()` fails, the whole kubelet bails.  If `/var/lib/kubelet` is on a tmpfs or bindmount, this can happen (this is the case for some of our CI envs https://github.com/openshift/origin/issues/19948).

We would be able to workaround this, in the short term, by disabling the LSI feature gate if the capacity calculate was protected by the gate, but currently it isn't.

This PR adds the gate check around setting the ephemeral storage capacity.

@liggitt @derekwaynecarr @dashpole 

It might be a different discussion about whether or not this should be fatal.  If it isn't fatal, seems that it would just prevent pods that had a ephemeral storage request from being scheduled.

/sig node
2018-06-28 19:15:15 -07:00
Kubernetes Submit Queue c57cdc1d35
Merge pull request #65587 from liggitt/node-csr-addresses-1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "certs: only append locally discovered addresses when we got none from the cloudprovider"

This reverts commit 7354bbe5ac.

https://github.com/kubernetes/kubernetes/pull/61869 caused a mismatch between the requested CSR and the addresses in node status.

Instead of computing addresses in two places, the cert manager should derive its CSR request from the addresses in node status. This would enable the kubelet to react to address changes, as well as be driven by an external cloud provider.

/cc @mikedanese

```release-note
NONE
```
2018-06-28 17:36:45 -07:00
Kubernetes Submit Queue 44073e6f43
Merge pull request #64660 from figo/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add support for plugin directory hierarchy

**What this PR does / why we need it**:

Add hierarchy support for plugin directory, it traverses and 
watch plugin directory and its sub directory recursively.

plugin socket file only need be unique within one directory,
``` 
 plugin socket directory  
    |  
    ---->sub directory 1
    |              |  
    |              ----->  socket1,  socket2 ...
    ----->sub directory 2
                  |
                  ------> socket1, socket2 ...  
```
the design itself allow sub directory be anything,
but in practical, each plugin type could just use one sub directory.

**Which issue(s) this PR fixes**:
Fixes #64003

**Special notes for your reviewer**:

twos bonus changes added as below

1) propose to let pluginWatcher bookkeeping registered plugins,
to make sure plugin name is unique within one plugin type.  
arguably, we could let each handler do the same work, but it requires
every handler repeat the same thing.    
 
2) extract example handler out from test, it is easier to read the code with the
seperation.  


**Release note**:

```release-note
N/A
```

/sig node
/cc @vikaschoudhary16  @jiayingz @RenaudWasTaken @vishh @derekwaynecarr  @saad-ali @vladimirvivien @dchen1107 @yujuhong @tallclair @Random-Liu @anfernee @akutz
2018-06-28 14:53:44 -07:00
Seth Jennings 3234b0fa5b feature gate LSI capacity calculation 2018-06-28 14:01:08 -05:00
Jordan Liggitt f1adf74b4e
Revert "certs: only append locally discovered addresses when we got none from the cloudprovider"
This reverts commit 7354bbe5ac.
2018-06-28 12:36:24 -04:00
Kubernetes Submit Queue 270b675c61
Merge pull request #65513 from tallclair/test-cleanup2
Automatic merge from submit-queue (batch tested with PRs 65453, 65523, 65513, 65560). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cleanup verbose cAdvisor mocking in Kubelet unit tests

These tests had a lot of duplicate code to set up the cAdvisor mock, but weren't really depending on the mock functionality. By moving the tests to use the fake cAdvisor, most of the setup can be cleaned up.

/kind cleanup
/sig node

```release-note
NONE
```
2018-06-27 22:30:12 -07:00
ceshihao 3b9ed9afff pod status should contain ContainerStatuses after eviction 2018-06-28 11:52:08 +08:00
Tim Allclair 5955b839ff
Cleanup verbose cAdvisor mocking in Kubelet unit tests 2018-06-27 11:53:41 -07:00
stewart-yu d5513c6d14 fix wrong output messages about EnforceNodeAllocatable 2018-06-27 15:31:32 +08:00
Kubernetes Submit Queue 991a84758f
Merge pull request #59214 from kdembler/cpumanager-checkpointing
Automatic merge from submit-queue (batch tested with PRs 59214, 65330). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Migrate cpumanager to use checkpointing manager

**What this PR does / why we need it**:
This PR migrates `cpumanager` to use new kubelet level node checkpointing feature (#56040) to decrease code redundancy and improve consistency.

**Which issue(s) this PR fixes**:
Fixes #58339

**Notes**:
At point of submitting PR the most straightforward approach was used - `state_checkpoint` implementation of `State` interface was added. However, with checkpointing implementation there might be no point to keep `State` interface and just use single implementation with checkpoint backend and in case of different backend than filestore needed just supply `cpumanager` with custom `CheckpointManager` implementation.

/kind feature
/sig node
cc @flyingcougar @ConnorDoyle
2018-06-25 18:19:00 -07:00
hui luo d04f596829 Add hierarchy support for plugin directory
it traverses and watch plugin directory and its sub directory recursively,
plugin socket file only need be unique within one directory,

- plugin socket directory
-    |
-    ---->sub directory 1
-    |              |
-    |              ----->  socket1,  socket2 ...
-    ----->sub directory 2
-                  |
-                  ------> socket1, socket2 ...

the design itself allow sub directory be anything,
but in practical, each plugin type could just use one sub directory.

four bonus changes added as below

1. extract example handler out from test, it is easier to read the code
with the seperation.

2. there are two variables here: "Watcher" and "watcher".
"Watcher" is the plugin watcher, and "watcher" is the fsnotify watcher.
so rename the "watcher" to "fsWatcher" to make code easier to
understand.

3. change RegisterCallbackFn() return value order, it is
conventional to return error last, after this change,
the pkg/volume/csi is compliance with golint, so remove it
from hack/.golint_failures

4. refactor errors handling at invokeRegistrationCallbackAtHandler()
to make error message more clear.
2018-06-25 17:32:18 -07:00
Jeff Grafton 23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Jeff Grafton a725660640 Update to gazelle 0.12.0 and run hack/update-bazel.sh 2018-06-22 16:22:18 -07:00
Jeff Grafton 01f94051c8 Remove the go_default_library_protos filegroups using buildozer 2018-06-22 16:22:18 -07:00
Kubernetes Submit Queue f09a938bcd
Merge pull request #64675 from yue9944882/fix-data-race-cli-file-linux
Automatic merge from submit-queue (batch tested with PRs 61330, 64793, 64675, 65059, 65368). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixes data races for pkg/kubelet/config/file_linux_test.go

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64655

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-22 14:52:37 -07:00
Kubernetes Submit Queue 1ca851baec
Merge pull request #64860 from wgliang/master.kubelet-check-limit
Automatic merge from submit-queue (batch tested with PRs 65290, 65326, 65289, 65334, 64860). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

checkLimitsForResolvConf for the  pod create and update events instead of checking period

**What this PR does / why we need it**:

- Check for the same at pod create and update events instead of checking continuously for every 30 seconds.
- Increase the logging level to 4 or higher since the event is not catastrophic to cluster health .


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64849

**Special notes for your reviewer**:
@ravisantoshgudimetla 

**Release note**:

```release-note
checkLimitsForResolvConf for the  pod create and update events instead of checking period
```
2018-06-22 04:43:16 -07:00
Kubernetes Submit Queue 96c7f3a34a
Merge pull request #64752 from wojtek-t/default_to_watching_managers
Automatic merge from submit-queue (batch tested with PRs 65187, 65206, 65223, 64752, 65238). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Kubelet watches necessary secrets/configmaps instead of periodic polling
2018-06-21 19:48:14 -07:00
Kubernetes Submit Queue 02dba36128
Merge pull request #65019 from mirake/fix-typo-toto
Automatic merge from submit-queue (batch tested with PRs 65265, 64822, 65026, 65019, 65077). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Typo fix: toto -> to
2018-06-21 11:25:16 -07:00
Kubernetes Submit Queue d1f5cb2348
Merge pull request #65050 from sttts/sttts-deepcopy-update
Automatic merge from submit-queue (batch tested with PRs 64895, 64938, 63700, 65050, 64957). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Bump gengo to include uniform pointer deepcopy

This bumps k8s.io/gengo with uniform pointer support in deepcopy-gen.

Fixes https://github.com/kubernetes/code-generator/issues/45.
2018-06-21 04:15:16 -07:00
Kubernetes Submit Queue 332da0a943
Merge pull request #64491 from hzxuzhonghu/kubelet-node-schedule-event-record
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

move oldNodeUnschedulable pkg var to kubelet struct

**What this PR does / why we need it**:

move oldNodeUnschedulable pkg var to kubelet struct


**Release note**:

```release-note
NONE
```
2018-06-20 23:02:52 -07:00
Kubernetes Submit Queue ce09da5653
Merge pull request #64880 from dixudx/manifest_file_not_found
Automatic merge from submit-queue (batch tested with PRs 58690, 64773, 64880, 64915, 64831). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

ignore not found file error when watching manifests

**What this PR does / why we need it**:
An alternative of #63910.

When using vim to create a new file in manifest folder, a temporary file, with an arbitrary number (like 4913) as its name, will be created to check if a directory is writable and see the resulting ACL.

These temporary files will be deleted later, which should by ignored when watching the manifest folder.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55928, #59009, #48219

**Special notes for your reviewer**:
/cc dims luxas yujuhong liggitt tallclair

**Release note**:

```release-note
ignore not found file error when watching manifests
```
2018-06-20 14:21:17 -07:00
Kubernetes Submit Queue aa25539ef6
Merge pull request #64451 from wgliang/master.remove-kubelet
Automatic merge from submit-queue (batch tested with PRs 64688, 64451, 64504, 64506, 56358). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

cleanup some dead kubelet code

**Release note**:

```release-note
NONE
```
2018-06-20 05:48:11 -07:00
Kubernetes Submit Queue a622f1404c
Merge pull request #64672 from mcluseau/wip-remote-grpc-message-size
Automatic merge from submit-queue (batch tested with PRs 65032, 63471, 64104, 64672, 64427). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

pkg: kubelet: remote: increase grpc client default size to 16MiB

**What this PR does / why we need it**:

Increase the gRPC max message size to 16MB in the remote container runtime. I've seen sizes over 8MB in clusters with big (256GB RAM) nodes.

**Release note**:
```release-note
Increase the gRPC max message size to 16MB in the remote container runtime.
```
2018-06-20 04:23:21 -07:00
Kubernetes Submit Queue 381b663b66
Merge pull request #63580 from dixudx/fix_cni_flag_binding
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

bind alpha feature network plugin flags correctly

**What this PR does / why we need it**:
When working #63542, I found the flags, like `--cni-conf-dir` and `cni-bin-dir`, were not correctly bound.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
/cc kubernetes/sig-node-pr-reviews

**Release note**:

```release-note
None
```
2018-06-20 01:26:52 -07:00
Kubernetes Submit Queue 148350d3c4
Merge pull request #64426 from cofyc/remove_unnecessary_fakemounters
Automatic merge from submit-queue (batch tested with PRs 64142, 64426, 62910, 63942, 64548). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Clean up fake mounters.

**What this PR does / why we need it**:

Fixes https://github.com/kubernetes/kubernetes/issues/61502

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

list of fake mounters:

- (keep) pkg/util/mount.FakeMounter
- (removed) pkg/kubelet/cm.fakeMountInterface:
- (inherit from mount.FakeMounter) pkg/util/mount.fakeMounter
- (inherit from mount.FakeMounter) pkg/util/removeall.fakeMounter
- (removed) pkg/volume/host_path.fakeFileTypeChecker

**Release note**:

```release-note
NONE
```
2018-06-20 00:05:10 -07:00
Kubernetes Submit Queue c399c306e2
Merge pull request #59174 from tianshapjq/todo-already-done
Automatic merge from submit-queue (batch tested with PRs 65230, 57355, 59174, 63698, 63659). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

TODO has already been implemented

**What this PR does / why we need it**:
TODO has already been implemented, remove the TODO tag.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note

```NONE
2018-06-19 20:19:17 -07:00
wojtekt 72a0f4d167 Enable watching secret and configmap manager 2018-06-19 22:13:18 +02:00
wojtekt ffb32472bb Kubelet manager configuration 2018-06-19 22:12:55 +02:00
vikaschoudhary16 e8119dc134 Start plugin watcher after initialization of all kubelet components 2018-06-14 01:03:37 -04:00
Andrew Lytvynov 2c0f043957 Re-use private key after failed CSR
If we create a new key on each CSR, if CSR fails the next attempt will
create a new one instead of reusing previous CSR.

If approver/signer don't handle CSRs as quickly as new nodes come up,
they can pile up and approver would keep handling old abandoned CSRs and
Nodes would keep timing out on startup.
2018-06-13 13:12:43 -07:00
Seth Jennings f1551798e4 reduce logging for backoff situations 2018-06-13 13:25:20 -05:00
Dr. Stefan Schimanski 1208437f84 Update generated files 2018-06-13 12:35:13 +02:00
Kubernetes Submit Queue bb7e14429d
Merge pull request #64922 from dcbw/dcbw-dockershim-network-approver
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim/network: add dcbw to OWNERS as an approver

I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.

I've also been on the CNI maintainers team for a couple years.

```release-note
NONE
```
@freehan @thockin @kubernetes/sig-network-pr-reviews
2018-06-12 13:31:15 -07:00
Kubernetes Submit Queue 67ebbc675a
Merge pull request #64862 from feiskyer/win-cni
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert #64189: Fix Windows CNI for the sandbox case

**What this PR does / why we need it**:

This reverts PR #64189, which breaks DNS for Windows containers.

Refer https://github.com/kubernetes/kubernetes/pull/64189#issuecomment-395248704

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64861

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

cc @madhanrm @PatrickLang @alinbalutoiu @dineshgovindasamy
2018-06-12 11:18:01 -07:00
ruicao 95c232ee07 Typo fix: toto -> to 2018-06-12 23:12:39 +08:00
Kubernetes Submit Queue 8e03228c1a
Merge pull request #64643 from dashpole/memcg_poll
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use unix.EpollWait to determine when memcg events are available to be Read

**What this PR does / why we need it**:
This fixes a file descriptor leak introduced in https://github.com/kubernetes/kubernetes/pull/60531 when the `--experimental-kernel-memcg-notification` kubelet flag is enabled.  The root of the issue is that `unix.Read` blocks indefinitely when reading from an event file descriptor and there is nothing to read.  Since we refresh the memcg notifications, these reads accumulate until the memcg threshold is crossed, at which time all reads complete.  However, if the node never comes under memory pressure, the node can run out of file descriptors.

This PR changes the eviction manager to use `unix.EpollWait` to wait, with a 10 second timeout, for events to be available on the eventfd.  We only read from the eventfd when there is an event available to be read, preventing an accumulation of `unix.Read` threads, and allowing the event file descriptors to be reclaimed by the kernel.

This PR also breaks the creation, and updating of the memcg threshold into separate portions, and performs creation before starting the periodic synchronize calls.  It also moves the logic of configuring memory thresholds into memory_threshold_notifier into a separate file.

This also reverts https://github.com/kubernetes/kubernetes/pull/64582, as the underlying leak that caused us to disable it for testing is fixed here.

Fixes #62808

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority critical-urgent
2018-06-11 17:29:19 -07:00
David Ashpole b7deb6d9e0 fix eviction event formatting 2018-06-11 11:38:00 -07:00
David Ashpole 93b6d026d9 fix memcg fd leak 2018-06-11 11:37:50 -07:00
Dan Williams 37792076b4 dockershim/network: add dcbw to OWNERS as an approver
I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.

I've also been on the CNI maintainers team for a couple years.
2018-06-08 10:06:19 -05:00
yue9944882 d467b29c5c remove duplicated cleaning up func 2018-06-08 14:28:19 +08:00
WanLinghao 52140ea1d3 fix a bug of wrong parameters which could cause token projection failure 2018-06-08 12:00:58 +08:00
Kubernetes Submit Queue 38beee65d3
Merge pull request #63905 from feiskyer/win-dns
Automatic merge from submit-queue (batch tested with PRs 63905, 64855). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Setup dns servers and search domains for Windows Pods

**What this PR does / why we need it**:

Kubelet is depending on docker container's ResolvConfPath (e.g. /var/lib/docker/containers/439efe31d70fc17485fb6810730679404bb5a6d721b10035c3784157966c7e17/resolv.conf) to setup dns servers and search domains. While this is ok for Linux containers, ResolvConfPath is always an empty string for windows containers. So that the DNS setting for windows containers is always not set.

This PR setups DNS for Windows sandboxes. In this way, Windows Pods could also use kubernetes dns policies.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #61579

**Special notes for your reviewer**:

Requires Docker EE version >= 17.10.0.

**Release note**:

```release-note
Setup dns servers and search domains for Windows Pods in dockershim. Docker EE version >= 17.10.0 is required for propagating DNS to containers.
```

/cc @PatrickLang @taylorb-microsoft @michmike @JiangtianLi
2018-06-07 11:40:11 -07:00
Di Xu 6d14771fd8 ignore not found file error when watching manifests 2018-06-07 22:02:53 +08:00
AdamDang 37a6fbfd6a
Typo fix in returned message: utilites->utilities
utilites->utilities
2018-06-07 21:49:38 +08:00
Klaudiusz Dembler a9df2acc4b Typo fix 2018-06-07 12:08:48 +02:00
Pengfei Ni d0cd1d17ae Add clarification for Windows DNS setup flow 2018-06-07 16:26:13 +08:00
Di Xu 8285a26589 add missing LastTransitionTime of ContainerReady condition 2018-06-07 14:51:14 +08:00
yue9944882 a221218681 fixes data races 2018-06-07 11:24:35 +08:00
Guoliang Wang 4f9d2047dd checkLimitsForResolvConf for the pod create and update events instead of checking period 2018-06-07 10:14:22 +08:00
Pengfei Ni 10b6f405e1 Revert "Fix Windows CNI for the sandbox case"
This reverts commit 49e762ab3a.
2018-06-07 09:56:13 +08:00
Kubernetes Submit Queue 8013bdb180
Merge pull request #64749 from Random-Liu/fix-standalone-dockershim
Automatic merge from submit-queue (batch tested with PRs 64749, 64797). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix standalone dockershim.

Ref https://github.com/kubernetes-incubator/cri-tools/pull/320#issuecomment-394554484.

This PR fixes a bug that standalone dockershim exits immediately.

This PR:
1) Changes standalone dockershim to wait on `stopCh`, so that it won't exit immediately.
2) Removes `stopCh` from dockershim internal. It doesn't help much for graceful stop, because kubelet will exit immediately anyway. https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/server.go#L748

@kubernetes/sig-node-pr-reviews @yujuhong @feiskyer 

**Release note**:

```release-note
none
```
2018-06-06 10:08:12 -07:00
Kubernetes Submit Queue f54593b740
Merge pull request #64795 from mikedanese/fixgke
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

auth: standalone kubelets shouldn't start a token manager

fixes https://github.com/kubernetes/kubernetes/issues/64789
2018-06-06 06:58:28 -07:00
Kubernetes Submit Queue f4668d281c
Merge pull request #64800 from dashpole/cadvisor_godep
Automatic merge from submit-queue (batch tested with PRs 63717, 64646, 64792, 64784, 64800). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update cadvisor godeps to v0.30.0

**What this PR does / why we need it**:
cAdvisor godep update corresponding to 1.11

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63204

**Release note**:
```release-note
Use IONice to reduce IO priority of du and find
cAdvisor ContainerReference no longer contains Labels. Use ContainerSpec instead.
Fix a bug where cadvisor failed to discover a sub-cgroup that was created soon after the parent cgroup.
```

/sig node
/kind bug
/priority critical-urgent

/assign @dchen1107
2018-06-06 01:24:26 -07:00
Kubernetes Submit Queue a32e5b6a59
Merge pull request #64784 from jiayingz/status-ready
Automatic merge from submit-queue (batch tested with PRs 63717, 64646, 64792, 64784, 64800). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reconcile extended resource capacity after kubelet restart.

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/64632

**Special notes for your reviewer**:

**Release note**:

```release-note
Kubelet will set extended resource capacity to zero after it restarts. If the extended resource is exported by a device plugin, its capacity will change to a valid value after the device plugin re-connects with the Kubelet. If the extended resource is exported by an external component through direct node status capacity patching, the component should repatch the field after kubelet becomes ready again. During the time gap, pods previously assigned with such resources may fail kubelet admission but their controller should create new pods in response to such failures.
```
2018-06-06 01:24:21 -07:00
Kubernetes Submit Queue 0b8394a1f4
Merge pull request #64646 from freehan/pod-ready-plus2-new
Automatic merge from submit-queue (batch tested with PRs 63717, 64646, 64792, 64784, 64800). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add ContainersReady condition into Pod Status

**Last 3 commits are new**

Follow up PR of: https://github.com/kubernetes/kubernetes/pull/64057 and https://github.com/kubernetes/kubernetes/pull/64344

Have a single PR for adding ContainersReady per https://github.com/kubernetes/kubernetes/pull/64344#issuecomment-394038384

```release-note
Introduce ContainersReady condition in Pod Status
```


/assign yujuhong for review
/assign thockin for the tiny API change
2018-06-06 01:24:14 -07:00
Kubernetes Submit Queue b6f75ac30e
Merge pull request #63717 from ingvagabund/promote-sysctl-annotations-to-fields
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Promote sysctl annotations to fields

#


**What this PR does / why we need it**:

Promoting experimental sysctl feature from annotations to API fields.

**Special notes for your reviewer**:

Following sysctl KEP: https://github.com/kubernetes/community/pull/2093

**Release note**:

```release-note
The Sysctls experimental feature has been promoted to beta (enabled by default via the `Sysctls` feature flag). PodSecurityPolicy and Pod objects now have fields for specifying and controlling sysctls. Alpha sysctl annotations will be ignored by 1.11+ kubelets. All alpha sysctl annotations in existing deployments must be converted to API fields to be effective.
```

**TODO**:

* [x] - Promote sysctl annotation in Pod spec
* [x] - Promote sysctl annotation in PodSecuritySpec spec
* [x] - Feature gate the sysctl
* [x] - Promote from alpha to beta
* [x] - docs PR - https://github.com/kubernetes/website/pull/8804
2018-06-06 00:47:36 -07:00