Commit Graph

5508 Commits (ee4d2d0a941b4298a3e07aab8fef5b3c5b85b27d)

Author SHA1 Message Date
Kubernetes Submit Queue 03adf92aa9 Merge pull request #53753 from derekwaynecarr/log-spam
Automatic merge from submit-queue (batch tested with PRs 53119, 53753, 53795, 52981). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce log spam in qos container manager

**What this PR does / why we need it**:
excessive log stmts make it hard to debug actual problems.

**Release note**:
```release-note
NONE
```
2017-10-12 08:28:36 -07:00
yanxuean 8adb2181eb remove redundancy code in setCPUCgroupConfig
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-12 18:42:18 +08:00
Kubernetes Submit Queue 0515895c08 Merge pull request #53684 from dashpole/feature_gate_allocatable_eviction
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add feature gate for allocatable disk eviction

Issue: #52336 
This PR adds the local storage feature gate to local storage allocatable eviction.

cc @kubernetes/sig-node-bugs 
/assign @jingxu97 @dchen1107 

we should target this for 1.7 if possible.

```release-note
fix a bug where disk pressure could trigger prematurely
```
2017-10-11 20:39:32 -07:00
Kubernetes Submit Queue eabc7a3553 Merge pull request #53700 from euank/swapReader
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet/cm: remove unneeded fork of 'cat'

Reading a file in Go is perfectly possible without invoking cat.

I also removed an outdated comment.

This is meant to be a trivial/minor code cleanup, nothing more.

```release-note
NONE
```
2017-10-11 17:54:08 -07:00
Di Xu 811447ea0a avoid kubelet converts and validates pods multiple times 2017-10-12 08:10:09 +08:00
Derek Carr 328a12d160 Reduce log spam in qos container manager 2017-10-11 19:47:40 -04:00
David Ashpole 8659676408 feature gate local storage allocatable eviction 2017-10-11 09:53:56 -07:00
Michael Taufen 8180536bed Mulligan: Remove deprecated and experimental fields from KubeletConfiguration
Revert "Merge pull request #51857 from kubernetes/revert-51307-kc-type-refactor"

This reverts commit 9d27d92420, reversing
changes made to 2e69d4e625.

See original: #51307

We punted this from 1.8 so it could go through an API review. The point
of this PR is that we are trying to stabilize the kubeletconfig API so
that we can move it out of alpha, and unblock features like Dynamic
Kubelet Config, Kubelet loading its initial config from a file instead
of flags, kubeadm and other install tools having a versioned API to rely
on, etc.

We shouldn't rev the version without both removing all the deprecated
junk from the KubeletConfiguration struct, and without (at least
temporarily) removing all of the fields that have "Experimental" in
their names. It wouldn't make sense to lock in to deprecated fields.
"Experimental" fields can be audited on a 1-by-1 basis after this PR,
and if found to be stable (or sufficiently alpha-gated), can be restored
to the KubeletConfiguration without the "Experimental" prefix.
2017-10-11 09:52:39 -07:00
Kubernetes Submit Queue df072ca97e Merge pull request #53025 from mtaufen/feature-gate-map
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make feature gates loadable from a map[string]bool

Command line flag API remains the same. This allows ComponentConfig             
structures (e.g. KubeletConfiguration) to express the map structure             
behind feature gates in a natural way when written as JSON or YAML.             
                                                                                
For example:                                                                    
                                                                                
KubeletConfiguration Before:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates: "DynamicKubeletConfig=true,Accelerators=true"
```

KubeletConfiguration After:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates:
  DynamicKubeletConfig: true
  Accelerators: true
```

Fixes: #53024

```release-note
The Kubelet's feature gates are now specified as a map when provided via a JSON or YAML KubeletConfiguration, rather than as a string of key-value pairs.
```

/cc @mikedanese @jlowdermilk @smarterclayton
2017-10-11 09:05:33 -07:00
Euan Kemp 7aa88b5103 kubelet/cm: remove unneeded fork of 'cat'
Reading a file in Go is perfectly possible without invoking cat.

I also removed an outdated comment.
2017-10-10 21:53:35 -07:00
chenguoyan01 b88cf9435e add instrumented serivce unit test of version
Change-Id: I21b65cd3a03528a1ea14a77d71feb7d2bf7b097e
2017-10-11 11:31:29 +08:00
Kubernetes Submit Queue ec116fdc73 Merge pull request #53328 from intelsdi-x/lscpu_fix
Automatic merge from submit-queue (batch tested with PRs 53297, 53328). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cpu Manager - make CoreID's platform unique

**What this PR does / why we need it**:
Cpu Manager uses topology from cAdvisor(`/proc/cpuinfo`) where coreID's are socket unique - not platform unique - this causes problems on multi-socket platforms.

All code assumes unique coreID's (on platform) -  `Discovery` function has been changed to assign CoreID as the lowest cpuID from all cpus belonging to the same core. This can be expressed as:
`CoreID=min(cpuID's on the same core)`

Since cpuID's are platform unique - above gives us guarantee that CoreID's will also be platform unique.



**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53323
2017-10-10 11:20:37 -07:00
Kubernetes Submit Queue b543f67fc8 Merge pull request #53297 from x1957/code_format
Automatic merge from submit-queue (batch tested with PRs 53297, 53328). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

format some code in dockershim

**What this PR does / why we need it**:
format some code in dockershim

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
2017-10-10 11:20:34 -07:00
Michael Taufen 131b419596 Make feature gates loadable from a map[string]bool
Command line flag API remains the same. This allows ComponentConfig
structures (e.g. KubeletConfiguration) to express the map structure
behind feature gates in a natural way when written as JSON or YAML.

For example:

KubeletConfiguration Before:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates: "DynamicKubeletConfig=true,Accelerators=true"
```

KubeletConfiguration After:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates:
  DynamicKubeletConfig: true
  Accelerators: true
```
2017-10-10 09:37:51 -07:00
Kubernetes Submit Queue aaf14d4619 Merge pull request #53525 from sttts/sttts-scheme-copier-romoval
Automatic merge from submit-queue (batch tested with PRs 53525, 53652). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apimachinery: remove ObjectCopier interface(s)

The big commit is a mechanical, transitive removal of the copier interfaces in all structs and function calls.
2017-10-10 08:31:41 -07:00
Szymon Scharmach b86dc9c054 Make CoreID's platform unique 2017-10-10 10:45:44 +02:00
Kubernetes Submit Queue d6cabc7e99 Merge pull request #53444 from msau42/make-mounts
Automatic merge from submit-queue (batch tested with PRs 53444, 52067, 53571, 53182). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Don't skip mounts if we can't find the volume

**What this PR does / why we need it**:
Return an error instead of skipping the volume while constructing the list of volume mounts for the container runtime.  This prevents the scenario of a container writing data to an ephemeral volume when it expects the volume to be persistent.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53421

**Release note**:

NONE

@kubernetes/sig-storage-pr-reviews
2017-10-10 00:33:20 -07:00
Michelle Au 266120c189 Don't skip mounts if we can't find the volume 2017-10-09 14:00:23 -07:00
Kubernetes Submit Queue c12dab37e7 Merge pull request #53547 from jiayingz/deviceplugin-fix
Automatic merge from submit-queue (batch tested with PRs 52662, 53547, 53588, 53573, 53599). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

In DevicePluginHandlerImpl.Allocate(), skips untracked extended resou…

…rces.

Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53548

**Special notes for your reviewer**:

**Release note**:

```release-note
Ignore extended resources that are not registered with kubelet
```
2017-10-09 12:51:17 -07:00
Kubernetes Submit Queue 85b252d47e Merge pull request #51771 from dixudx/refactor_nsenter
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor nsenter

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51273

**Special notes for your reviewer**:
/assign @jsafrane 

**Release note**:

```release-note
None
```
2017-10-08 23:27:32 -07:00
Renaud Gaubert d2f08c94a9 Device Plugin now closes client connexion 2017-10-08 20:02:29 +02:00
Yuhao Fang c1c89d986b format some code in dockershim 2017-10-08 22:30:37 +08:00
Dr. Stefan Schimanski ecb65a6a71 Update generated files 2017-10-07 11:28:47 +02:00
Kubernetes Submit Queue f321a16af4 Merge pull request #49654 from jcbsmpsn/move-certificate-manager
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move certificate manager to client.

Fixes https://github.com/kubernetes/kubernetes/issues/53452

**What this PR does / why we need it**:
Migrate the certificate_manager to a location where it can be shared.

```release-note
NONE
```
2017-10-06 15:00:07 -07:00
Jiaying Zhang ee1ffa619b In DevicePluginHandlerImpl.Allocate(), skips untracked extended resources.
Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.
2017-10-06 13:57:53 -07:00
Dr. Stefan Schimanski ed586da147 apimachinery: remove Scheme.DeepCopy 2017-10-06 14:59:17 +02:00
Dr. Stefan Schimanski 19285b7357 apimachinery: remove Scheme.Copy 2017-10-06 14:24:05 +02:00
Kubernetes Submit Queue 16e42282d3 Merge pull request #53028 from jiayingz/flaky-test
Automatic merge from submit-queue (batch tested with PRs 53044, 52956, 53512, 53028). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixes the flaky TestDevicePluginReRegistration.

In the current test, there is a race that the new device plugin endpoint
may not be added to the device plugin manager endpoints at the time when
we call manager.Devices(). Added the checking and waiting for endpoint
updates before calling manager.Devices() in the test.

Tested:
go test -race -count 500 k8s.io/kubernetes/pkg/kubelet/deviceplugin -run
TestDevicePluginReRegistration -timeout 5h



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/52560

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-05 18:29:44 -07:00
Jacob Simpson 415c4d2c3a Move certificate manager to client. 2017-10-05 12:54:38 -07:00
Kubernetes Submit Queue eaaa93c70c Merge pull request #53446 from sjenning/network-plugin-metrics
Automatic merge from submit-queue (batch tested with PRs 53454, 53446, 52935, 53443, 52917). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: add latency metrics to network plugin manager

This PR adds latency metrics to the network plugin operations, namely `GetPodNetworkStatus()`, `SetUpPod()`, and `TearDownPod()`.

I recently had to debug and issue where a PLEG relist hang was occurring due to a hang in a CNI plugin and it would have been really nice to have these.  Between the these new metrics and `docker_operations_latency_microseconds`, we will be able to account for nearly all the time consuming routines in the PLEG relist.

@derekwaynecarr @smarterclayton @eparis @vishh 

```release-note
Metrics were added to network plugin to report latency of CNI operations
```
/sig node
2017-10-05 05:06:25 -07:00
Kubernetes Submit Queue fd2f3fa521 Merge pull request #53261 from x1957/fixcomment
Automatic merge from submit-queue (batch tested with PRs 51754, 53261, 53450). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix comment

**What this PR does / why we need it**:
fix method name in comment

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
2017-10-04 13:13:18 -07:00
Seth Jennings 607fddf984 kubelet: add metrics to network plugin manager 2017-10-04 12:13:35 -05:00
Kubernetes Submit Queue 94fbe2ba99 Merge pull request #53353 from jiayingz/node-status-fix
Automatic merge from submit-queue (batch tested with PRs 53228, 53232, 53353). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixes a regression introduced by PR 52290 that extended resource

capacity may temporarily drop to zero after kubelet restarts and PODs restarted during
that time window could fail to be scheduled.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53342

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-03 19:27:20 -07:00
Kubernetes Submit Queue 93862282a4 Merge pull request #53233 from dashpole/kubelet_gc_faster
Automatic merge from submit-queue (batch tested with PRs 53403, 53233). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove containers from deleted pods once containers have exited

Issue #51899 
Since container deletion is currently done through periodic garbage collection every 30 seconds, it takes a long time for pods to be deleted, and causes the kubelet to send all delete pod requests at the same time, which has performance issues.  This PR makes the kubelet actively remove containers of deleted pods rather than wait for them to be removed in periodic garbage collection.

/release-note-none
2017-10-03 17:21:15 -07:00
Jiaying Zhang 6fecd04924 Fixes a regression introduced by PR 52290 that extended resource
capacity may temporarily drop to zero after kubelet restarts and
PODs restarted during that time window could fail to be scheduled.
2017-10-03 10:26:53 -07:00
Di Xu 32199cb95b don't recreate static pods when node gets deleted 2017-10-03 10:28:08 +08:00
Kubernetes Submit Queue f0a061e361 Merge pull request #51152 from bobbypage/cri
Automatic merge from submit-queue (batch tested with PRs 50555, 51152). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Implement CRI stats in Docker Shim

**What this PR does / why we need it**:
This PR implements CRI Stats in the Docker Shim. It is needed to enable CRI stats for Docker and ongoing /stats/summary API changes in moving to use CRI.

Related issues:
#46984 (CRI: instruct kubelet to (optionally) consume container stats from CRI)
#45614 (CRI: add methods for container stats) 

This PR is also a followup to my original PR (https://github.com/kubernetes/kubernetes/pull/50396) to implement Windows Container Stats. The plan is that Windows Stats will use a hybrid model: pod and container level stats will come from CRI (via dockershim) and that node level stats will come from a "winstats" package that exports cadvisor like datastructures using windows specific perf counters from the node. I will update that PR to only export node level stats. 

@yujuhong @yguo0905 @dchen1107 @jdumars @anhowe @michmike

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-02 14:49:12 -07:00
David Ashpole 1eddab3313 remove containers of deleted pods once all containers have exited 2017-10-02 10:15:21 -07:00
Kubernetes Submit Queue c6a3f26988 Merge pull request #52395 from dixudx/fix_apparmor_annotation_unconfined
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

enable to specific unconfined AppArmor profile

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52370

**Special notes for your reviewer**:
/assign @tallclair @liggitt 

**Release note**:

```release-note
enable to specific unconfined AppArmor profile
```
2017-10-02 08:03:50 -07:00
Kubernetes Submit Queue 6ed207374f Merge pull request #53318 from sjenning/fix-http-probe-conn-pools
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

create separate transports for liveness and readiness probes

There is currently an issue with the http2 connection pools in golang such that two GETs to the same host:port using the same Transport can collide and one gets rejected with `http2: no cached connection was available`.  This happens with readiness and liveness probes if the intervals line up such that worker goroutines invoke the two probes at the exact same time.

The result is a transient probe error that appears in the events.  If the failureThreshold is 1, which is kinda crazy, it would cause a pod restart.

The PR creates a separate `httprobe` instance for readiness and liveness probes so that they don't share a Transport and connection pool.

Fixes https://github.com/kubernetes/kubernetes/issues/49740

@smarterclayton @jhorwit2
2017-10-01 21:45:50 -07:00
David Porter 5eae7eb166 Implement CRI stats in dockershim for Windows
Implement CRI stats for dockershim using docker stats. This enables use
of the summary api to get container metrics on Windows where CRI stats
are enabled.
2017-10-02 04:10:48 +00:00
Seth Jennings 343036e350 create separate transports for liveness and readiness probes 2017-10-01 21:45:43 -05:00
Kubernetes Submit Queue 5e2ce3aaf2 Merge pull request #53122 from resouer/fix-cpu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Eliminate extra CRI call during processing cpu set

**What this PR does / why we need it**:

Encountered this during `kubernetes/frakti` node e2e test.

When cpuset is not set, there's still plenty of `runtime.UpdateContainerResources` been called, which seems unnecessary.

cc @ConnorDoyle Make sense? Fixes: #53304

**Special notes for your reviewer**:

**Release note**:

```release-note
Only do UpdateContainerResources when cpuset is set 
```
2017-10-01 15:30:56 -07:00
Harry Zhang 282973d87d Elimenate extra CRI call 2017-09-30 16:51:32 +08:00
Kubernetes Submit Queue 68d2722be0 Merge pull request #53107 from Random-Liu/fix-cri-stats
Automatic merge from submit-queue (batch tested with PRs 53234, 53252, 53267, 53276, 53107). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix imagefs stats

Without this CRI stats based summary api won't work:
```console
$ curl localhost:10255/stats/summary
Internal Error: failed to get root cgroup stats: failed to get imageFs info: no imagefs label for configured runtime
```
With this PR, we could get summary api from cri-containerd now:
```console
$ curl localhost:10255/stats/summary
{
  "node": {
   "nodeName": "127.0.0.1",
   "startTime": "2017-09-23T06:26:49Z",
   "cpu": {
    "time": "2017-09-27T05:12:08Z",
    "usageNanoCores": 275510572,
    "usageCoreNanoSeconds": 11924595625329
   },
   "memory": {
    "time": "2017-09-27T05:12:08Z",
    "availableBytes": 27737075712,
    "usageBytes": 6028234752,
    "workingSetBytes": 3884470272,
    "rssBytes": 652304384,
    "pageFaults": 98472,
    "majorPageFaults": 87
   },
   "fs": {
    "time": "2017-09-27T05:12:08Z",
    "availableBytes": 75281231872,
    "capacityBytes": 104022159360,
    "usedBytes": 28724150272,
    "inodesFree": 12003204,
    "inodes": 12800000,
    "inodesUsed": 796796
   },
   "runtime": {
    "imageFs": {
     "time": "2017-09-27T05:12:00Z",
     "availableBytes": 75281231872,
     "capacityBytes": 104022159360,
     "usedBytes": 247732356,
     "inodesFree": 12003204,
     "inodes": 12800000,
     "inodesUsed": 6103
    }
   }
  },
  "pods": [
   {
    "podRef": {
     "name": "kube-dns-7797cb8758-qxkrz",
     "namespace": "kube-system",
     "uid": "4425b069-a342-11e7-ac90-42010af00002"
    },
    "startTime": "2017-09-27T05:11:23Z",
    "containers": [
     {
      "name": "kubedns",
      "startTime": "2017-09-27T05:11:24Z",
      "cpu": {
       "time": "1970-01-01T00:00:01Z",
       "usageCoreNanoSeconds": 154194917
      },
      "memory": {
       "time": "1970-01-01T00:00:01Z",
       "workingSetBytes": 7643136
      },
      "rootfs": {
       "time": "2017-09-27T05:12:00Z",
       "availableBytes": 75281231872,
       "capacityBytes": 104022159360,
       "usedBytes": 9,
       "inodesFree": 12003204,
       "inodes": 12800000,
       "inodesUsed": 32768
      },
      "logs": {
       "time": "2017-09-27T05:12:08Z",
       "availableBytes": 75281231872,
       "capacityBytes": 104022159360,
       "inodesFree": 12003204,
       "inodes": 12800000
      },
      "userDefinedMetrics": null
     },
     {
      "name": "dnsmasq",
      "startTime": "2017-09-27T05:11:24Z",
      "cpu": {
       "time": "1970-01-01T00:00:01Z",
       "usageCoreNanoSeconds": 114482989
      },
      "memory": {
       "time": "1970-01-01T00:00:01Z",
       "workingSetBytes": 7966720
      },
      "rootfs": {
       "time": "2017-09-27T05:12:00Z",
       "availableBytes": 75281231872,
       "capacityBytes": 104022159360,
       "usedBytes": 9,
       "inodesFree": 12003204,
       "inodes": 12800000,
       "inodesUsed": 28675
      },
      "logs": {
       "time": "2017-09-27T05:12:08Z",
       "availableBytes": 75281231872,
       "capacityBytes": 104022159360,
       "inodesFree": 12003204,
       "inodes": 12800000
      },
      "userDefinedMetrics": null
     },
     {
      "name": "sidecar",
      "startTime": "2017-09-27T05:11:24Z",
      "cpu": {
       "time": "1970-01-01T00:00:01Z",
       "usageCoreNanoSeconds": 140797580
      },
      "memory": {
       "time": "1970-01-01T00:00:01Z",
       "workingSetBytes": 7430144
      },
      "rootfs": {
       "time": "2017-09-27T05:12:00Z",
       "availableBytes": 75281231872,
       "capacityBytes": 104022159360,
       "usedBytes": 8,
       "inodesFree": 12003204,
       "inodes": 12800000,
       "inodesUsed": 28672
      },
      "logs": {
       "time": "2017-09-27T05:12:08Z",
       "availableBytes": 75281231872,
       "capacityBytes": 104022159360,
       "inodesFree": 12003204,
       "inodes": 12800000
      },
      "userDefinedMetrics": null
     }
    ],
    "volume": [
     {
      "time": "2017-09-27T05:12:03Z",
      "availableBytes": 15810760704,
      "capacityBytes": 15810772992,
      "usedBytes": 12288,
      "inodesFree": 3860043,
      "inodes": 3860052,
      "inodesUsed": 9,
      "name": "kube-dns-token-l2blr"
     }
    ]
   }
  ]
 }
```
Signed-off-by: Lantao Liu <lantaol@google.com>

```release-note
Fix the bug that query Kubelet's stats summary with CRI stats enabled results in error.
```
2017-09-29 20:17:45 -07:00
Kubernetes Submit Queue 57688bb64b Merge pull request #52894 from huzhengchuan/fix/incorrect_links_kubelet
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix broken links in kubelet after moving proposals to subdirs

**What this PR does / why we need it**:
fix incorrect links in kubelet after  kubernetes/community#1010

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes  kubernetes/community#918

**Special notes for your reviewer**:
CC @bgrant0607
**Release note**:

```
NONE
```
2017-09-29 15:36:42 -07:00
Lantao Liu f6be138821 Fix imagefs stats. 2017-09-29 22:15:48 +00:00
Kubernetes Submit Queue a0b7d467e2 Merge pull request #53094 from yguo0905/fix
Automatic merge from submit-queue (batch tested with PRs 51021, 53225, 53094, 53219). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Change ImageGCManage to consume ImageFS stats from StatsProvider

Fixes #53083.

**Release note**:

```
Change ImageGCManage to consume ImageFS stats from StatsProvider
```

/assign @Random-Liu
2017-09-29 12:38:22 -07:00
x1957 f28140429e fix comment 2017-09-30 01:00:24 +08:00
zhengchuan hu f4df66aa17 Fix broken links in kubelet 2017-09-29 19:22:23 +08:00
Kubernetes Submit Queue 6fcf841d69 Merge pull request #52692 from wackxu/fbc
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix the bad code comment and make the format unify

**What this PR does / why we need it**:

Fix the bad code comment and make the format unify

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #


**Release note**:

```release-note
NONE
```
2017-09-28 21:15:43 -07:00
Kubernetes Submit Queue dcaf8e8203 Merge pull request #53167 from dashpole/fix_init_container
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Do not GC exited containers in running pods

This fixes a regression introduced by #45896, and was identified by #52462.
This bug causes the kubelet to garbage collect exited containers in a running pod.
This manifests in strange and confusing state when viewing the cluster.  For example, it can show running pods as having no init container (see #52462), if that container has exited and been removed.

This PR solves this problem by only removing containers and sandboxes from terminated pods.
The important line change is:
` if cgc.podDeletionProvider.IsPodDeleted(podUID) || evictNonDeletedPods {` ---> 
`if cgc.podStateProvider.IsPodDeleted(podUID) || (cgc.podStateProvider.IsPodTerminated(podUID) && evictTerminatedPods) {`

cc @MrHohn @yujuhong @kubernetes/sig-node-bugs 

```release-note
BugFix: Exited containers are not Garbage Collected by the kubelet while the pod is running
```
2017-09-28 21:15:41 -07:00
Kubernetes Submit Queue 8ba5ff9a0b Merge pull request #52708 from NickrenREN/kubereserved-localephemeral
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix --kube-reserved storage key name and add UTs for node allocatable reservation

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #52463

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```

/assign @jingxu97
2017-09-28 21:15:36 -07:00
Kubernetes Submit Queue 69b2e73d5f Merge pull request #44596 from yanxuean/bugfix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Caller of HandlePodSyncs should be  handler in kubelet syncLoopIteration
2017-09-28 21:15:13 -07:00
Kubernetes Submit Queue 05200a4c23 Merge pull request #52529 from hzxuzhonghu/cert-manager
Automatic merge from submit-queue (batch tested with PRs 50280, 52529, 53093, 53108, 53168). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove certificate manager unused code

**What this PR does / why we need it**:
remove unused const
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-28 14:59:23 -07:00
Kubernetes Submit Queue 22ae750803 Merge pull request #49249 from orkun1675/patch-1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix typo in config_test.go
2017-09-28 13:55:48 -07:00
Yang Guo f6c36474f2 Change ImageGCManage to consume ImageFS stats from StatsProvider 2017-09-28 10:27:22 -07:00
David Ashpole 4300c75d48 fix #52462. Do not GC exited containers in running pods 2017-09-28 09:37:21 -07:00
Kubernetes Submit Queue d0233d1a50 Merge pull request #53157 from MrHohn/revert-kubelet-touch-lock
Automatic merge from submit-queue (batch tested with PRs 53157, 52628). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "Make kubelet touch iptables lock file during initialization"

**What this PR does / why we need it**: Revert #47212. #36485 is fixed so this is no longer needed.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE

**Special notes for your reviewer**:
/assign @yujuhong @dchen1107 

**Release note**:

```release-note
NONE
```
2017-09-27 22:54:12 -07:00
Kubernetes Submit Queue 85c37d76a5 Merge pull request #53161 from dims/fix-repotags
Automatic merge from submit-queue (batch tested with PRs 52634, 53121, 53161). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Normalize RepoTags before checking for match

**What this PR does / why we need it**:

on projectatomic-based docker, we get "docker.io/library/busybox:latest"
when someone uses an unqualified name like "busybox". Though when we
inspect, the RepoTag will still say "docker.io/busybox:latest", So
we have reparse the tag, normalize it and try again. Please see the
additional test case.

Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

Fixes #52110

**Special notes for your reviewer**:

**Release note**:

```release-note
Fixes an issue pulling pod specs referencing unqualified images from docker.io on centos/fedora/rhel
```
2017-09-27 20:35:31 -07:00
Kubernetes Submit Queue 8be101ecb7 Merge pull request #52634 from FengyunPan/improve-containerGC
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve codes which checks whether sandbox contains containers

Currently evictSandboxes() checks whether sandbox contains
containers, it traverses all the containers for every sandbox,
but when cluster has many containres, it wastes a lot of time.
It is better to use sets in this case.

**Release note**:
```release-note
NONE
```
2017-09-27 20:10:24 -07:00
Di Xu 5e96f7cae9 enable to specific unconfined AppArmor profile 2017-09-28 10:06:36 +08:00
Andy Goldstein 95f373fde6 Normalize RepoTags before checking for match
on projectatomic-based docker, we get "docker.io/library/busybox:latest"
when someone uses an unqualified name like "busybox". Though when we
inspect, the RepoTag will still say "docker.io/busybox:latest", So
we have reparse the tag, normalize it and try again. Please see the
additional test case.

Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
2017-09-27 20:51:31 -04:00
Zihong Zheng 69b5e0ab67 Revert "Make kubelet touch iptables lock file during initialization" 2017-09-27 13:34:43 -07:00
Kubernetes Submit Queue 0ea979a2f2 Merge pull request #50509 from feiskyer/link-logs
Automatic merge from submit-queue (batch tested with PRs 50988, 50509, 52660, 52663, 52250). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Create container log symlink for all containers

**What this PR does / why we need it**:

dockershim only makes  log symlink for running containers now, we should also create the log symlink for failed containers.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50499

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-27 05:32:23 -07:00
Kubernetes Submit Queue c4d87032c8 Merge pull request #50988 from feiskyer/typo
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix typo in docs of remote package

**What this PR does / why we need it**:

Fix typo in docs of kubelet/remote package

**Which issue this PR fixes**: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-27 04:45:56 -07:00
Kubernetes Submit Queue 5a721f5a02 Merge pull request #53065 from msau42/add-reviewers
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add more reviewers for volume components

**Release note**:

NONE
2017-09-26 23:52:02 -07:00
Kubernetes Submit Queue 80fee4d399 Merge pull request #53069 from derekwaynecarr/imagefs-eviction
Automatic merge from submit-queue (batch tested with PRs 52990, 53064, 52686, 52221, 53069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Align imagefs eviction defaults with image gc defaults

**What this PR does / why we need it**:
If a node is configured to use an imagefs for container storage, we should align the default imagefs eviction threshold with the default image-gc threshold.  This PR updates the default imagesfs.available threshold to trigger when below 15% available space, which is same as default image-gc high threshold for 85%.

Fixes https://github.com/kubernetes/kubernetes/issues/53074

**Special notes for your reviewer**:
none, this only impacts nodes running an imagefs otherwise its ignored.

**Release note**:
```release-note
NONE
```
2017-09-26 23:12:32 -07:00
Kubernetes Submit Queue 631bc37cf6 Merge pull request #52686 from yujuhong/stream
Automatic merge from submit-queue (batch tested with PRs 52990, 53064, 52686, 52221, 53069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

CRI: Allow configuring stdout/stderr streams for Exec/Attach requests

Add stdout/stderr to exec and attach requests. Also check the request to
ensure it meets the requirements.

**Which issue this PR fixes**: fixes #44448

```release-note
CRI: Add stdout/stderr fields to Exec and Attach requests.
```
2017-09-26 23:12:27 -07:00
Kubernetes Submit Queue 751bcc473c Merge pull request #51975 from mindprince/deviceplugin-gpu-reviewers
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add an OWNERS file for deviceplugin package. Update OWNERS file for gpu package.

**Release note**:
```release-note
NONE
```
2017-09-26 21:01:26 -07:00
Kubernetes Submit Queue 65a2f15e06 Merge pull request #52493 from mtaufen/fix-file-leak
Automatic merge from submit-queue (batch tested with PRs 52721, 53057, 52493, 52998, 52896). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix a potential file leak

Previously, if a write or sync error occurred, we would not have called
Close(). This commit refactors ReplaceFile() so that we are sure to call
Close(), and also attempts to delete the temporary file if errors occur.

See: https://github.com/kubernetes/kubernetes/pull/52119#discussion_r137916659
Fixes: #53060

```release-note
NONE
```

@yujuhong @ash2k
2017-09-26 15:51:19 -07:00
Derek Carr b6db700880 Align imagefs eviction defaults with image gc defaults 2017-09-26 13:57:49 -04:00
Michelle Au e6687ad5c6 Add more reviewers for volume components 2017-09-26 10:24:21 -07:00
Joel Smith d53d29faf7 Get fallback termination msg from docker when using journald log driver
When using the legacy docker container runtime and when a container has
terminationMessagePolicy=FallbackToLogsOnError and when docker is
configured with a log driver other than json-log (such as journald),
the kubelet should not try to get the container's log from the
json log file (since it's not there) but should instead ask docker for
the logs.
2017-09-26 07:14:15 -06:00
hzxuzhonghu 00d703d4dc remove unused code 2017-09-26 16:39:21 +08:00
Michael Taufen 62fecfb0f4 Fix a potential file leak
Previously, if a write or sync error occurred, we would not have called
Close(). This commit refactors ReplaceFile() so that we are sure to call
Close(), and also attempts to delete the temporary file if errors occur.
2017-09-25 20:45:52 -07:00
Di Xu 57ead4898b use GetFileType per mount.Interface to check hostpath type 2017-09-26 09:57:06 +08:00
NickrenREN 7f9696201e Fix --kube-reserved storage key name and add test cases for node allocatable reservation 2017-09-26 09:32:21 +08:00
Jiaying Zhang 5953a182cf Fixes the flaky TestDevicePluginReRegistration.
In the current test, there is a race that the new device plugin endpoint
may not be added to the device plugin manager endpoints at the time when
we call manager.Devices(). Added the checking and waiting for endpoint
updates before calling manager.Devices() in the test.

Tested:
go test -race -count 500 k8s.io/kubernetes/pkg/kubelet/deviceplugin -run
TestDevicePluginReRegistration -timeout 5h
2017-09-25 16:55:29 -07:00
Kubernetes Submit Queue 69011d10c2 Merge pull request #52319 from yujuhong/docker-metrics
Automatic merge from submit-queue (batch tested with PRs 51067, 52319, 52803, 52961, 51972). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Move prometheus metrics for docker operations into dockershim
2017-09-25 14:50:51 -07:00
Kubernetes Submit Queue af411e387a Merge pull request #52287 from yujuhong/rm-nsenter
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

kubelet: remove the --docker-exec-handler flag

Stop supporting the "nsenter" exec handler. Only the Docker native exec
handler is supported.

The flag was deprecated in Kubernetes 1.6 and is safe to remove
in Kubernetes 1.9 according to the deprecation policy.

**What this PR does / why we need it**:

**Which issue this PR fixes** : fixes #40229

**Special notes for your reviewer**:
N/A

**Release note**:

```release-note
Remove the --docker-exec-handler flag. Only native exec handler is supported.
```
2017-09-25 12:22:57 -07:00
Yu-Ju Hong 331628b7dc Move prometheus metrics for docker operations into dockershim 2017-09-25 10:03:17 -07:00
Kubernetes Submit Queue fc8a647f78 Merge pull request #52864 from dcbw/dockershim-fix-net-teardown
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

dockershim: fine-tune network-ready handling on sandbox teardown and removal

If sandbox teardown results in an error, GC will periodically attempt
to again remove the sandbox.  Until the sandbox is removed, pod
sandbox status calls will attempt to enter the pod's namespace and
retrieve the pod IP, but the first teardown attempt may have already
removed the network namespace, resulting in a pointless log error
message that the network namespace doesn't exist, or that nsenter
can't find eth0.

The network-ready mechanism originally attempted to suppress those
messages by ensuring that pod sandbox status skipped network checks
when networking was already torn down, but unfortunately the ready
value was cleared too early.

Also, don't tear down the pod network multiple times if the first
time we tore it down, it succeeded.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-24 04:32:12 -07:00
Kubernetes Submit Queue 7c9e614cbb Merge pull request #52873 from ixdy/bazel-cleanup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

bazel: build/test almost everything

**What this PR does / why we need it**: Miscellaneous cleanups and bug fixes. The main motivating idea here was to make `bazel build //...` and `bazel test //...` mostly work. (There's a few reasons these still don't work, but we're a lot closer.)

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/assign @BenTheElder @mikedanese @spxtr
2017-09-24 00:04:36 -07:00
Kubernetes Submit Queue cece399058 Merge pull request #52567 from smarterclayton/fix_fallback_to_logs
Automatic merge from submit-queue (batch tested with PRs 50890, 52484, 52542, 52567, 50672). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Do not set message when terminationMessagePath not found

If terminationMessagePath is set to a file that does not exist, we should not log an error message and instead try falling back to logs (based on the user's request).

This also slightly simplifies the terminationMessagePath processing.

Seen in #50499

```release-note
If a container does not create a file at the `terminationMessagePath`, no message should be output about being unable to find the file.
```
2017-09-23 16:26:54 -07:00
Kubernetes Submit Queue 441f674c60 Merge pull request #50396 from bobbypage/stats
Automatic merge from submit-queue (batch tested with PRs 52168, 48939, 51889, 52051, 50396). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Add Windows Server Containers Stats and Metrics to Kubelet

**What this PR does / why we need it**:

This PR implements stats for Windows Server Containers. This adds the ability to monitor Windows Server containers via the existing stats/summary endpoint inside the kubelet. Windows metrics can now be ingested into heapster and monitored using existing tools (like Grafana). 

Previously, the /stats/summary api would consistently crash the kubelet on Windows server containers. This PR implements a new package "winstats" which reads windows server metrics from a combination of windows specific perf counters as well as docker stats. The "winstats" package exports functions that return CAdvisor data structures, which the existing summary api can read. 


**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #49398

This PR addresses my plan to implement windows server container stats https://github.com/kubernetes/kubernetes/issues/49398 .


**Release note**:

```release-note
Add monitoring of Windows Server containers metrics in the kubelet via the stats/summary endpoint.
```
2017-09-23 13:40:56 -07:00
Kubernetes Submit Queue 5e3b681caa Merge pull request #48939 from verb/nit-expetected
Automatic merge from submit-queue (batch tested with PRs 52168, 48939, 51889, 52051, 50396). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Fix typo in kubelet kuberuntime container test

Changes "Expetected" to "Expected"

**What this PR does / why we need it**: Fixes a typo in a test

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-23 13:40:47 -07:00
Kubernetes Submit Queue 2c5413b379 Merge pull request #50422 from karataliu/apid
Automatic merge from submit-queue (batch tested with PRs 50294, 50422, 51757, 52379, 52014). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Fix AnnotationProvidedIPAddr annotation for externalCloudProvider

**What this PR does / why we need it**:
In #44258, it introduced `AnnotationProvidedIPAddr`. When kubelet has 'node-ip' parameter set, and cloud provider not set, this annotation would be populated, and then will be validated by cloud-controller-manager:
https://github.com/kubernetes/kubernetes/pull/44258/files#diff-6b0808bd1afb15f9f77986f4459601c2R465

Later with #47152, externalCloudProvider is checked and func returns before that annotation got set. In this case, that annotation will not get populated.

This fix is to bring that annotation assignment to a proper location.

Please correct me if I have any misunderstanding.
@wlan0 @ublubu 

**Which issue this PR fixes**

**Special notes for your reviewer**:

**Release note**:
2017-09-23 11:40:47 -07:00
Kubernetes Submit Queue 7485aad067 Merge pull request #52235 from xiangpengzhao/remove-hostportChainName
Automatic merge from submit-queue (batch tested with PRs 52109, 52235, 51809, 52161, 50080). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Remove backward compatibility of hostportChainName

**What this PR does / why we need it**:
fix TODO.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
/assign @freehan 

**Release note**:

```release-note
NONE
```
2017-09-23 10:26:47 -07:00
Kubernetes Submit Queue ffe122d89c Merge pull request #52220 from yujuhong/rm-legacy-code
Automatic merge from submit-queue (batch tested with PRs 52240, 48145, 52220, 51698, 51777). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

dockershim: remove support for legacy containers

The code was first introduced in 1.6 to help pre-CRI-kubelet upgrade to
using the CRI implementation. They can safely be removed now.
2017-09-23 09:14:00 -07:00
Kubernetes Submit Queue d4ac62cea4 Merge pull request #51031 from jcbsmpsn/metric-certificate-expiration-on-kubelet
Automatic merge from submit-queue (batch tested with PRs 51031, 51705, 51888, 51727, 51684). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Add a kubelet metric to track certificate expiration.

Fix https://github.com/kubernetes/kubernetes/issues/51964

```release-note
Add a metric to the kubelet to monitor remaining lifetime of the certificate that
authenticates the kubelet to the API server.
```
2017-09-23 01:46:58 -07:00
Kubernetes Submit Queue 28df7a1cae Merge pull request #47806 from dcbw/fix-pod-ip-race
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

kubelet: fix inconsistent display of terminated pod IPs

PLEG and kubelet race when reading and sending pod status to the apiserver.  PLEG
inserts status into a cache, and then signals kubelet.  Kubelet then eventually
reads the status out of that cache, but in the mean time the status could have
been changed by PLEG.

When a pod exits, pod status will no longer include the pod's IP address because
the network plugin/runtime will report "" for terminated pod IPs.  If this status
gets inserted into the PLEG cache before kubelet gets the status out of the cache,
kubelet will see a blank pod IP address.  This happens in about 1/5 of cases when
pods are short-lived, and somewhat less frequently for longer running pods.

To ensure consistency for properties of dead pods, copy an old status update's
IP address over to the new status update if (a) the new status update's IP is
missing and (b) all sandboxes of the pod are dead/not-ready (eg, no possibility
for a valid IP from the sandbox).

Fixes: https://github.com/kubernetes/kubernetes/issues/47265
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1449373

@eparis @freehan @kubernetes/rh-networking @kubernetes/sig-network-misc
2017-09-22 21:01:50 -07:00
Yu-Ju Hong 3837a016ef kubelet: remove the --docker-exec-handler flag
Stop supporting the "nsenter" exec handler. Only the Docker native exec
handler is supported.

The flag was deprecated in Kubernetes 1.6 and is safe to remove
in Kubernetes 1.9 according to the deprecation policy.
2017-09-22 12:13:31 -07:00
Jeff Grafton 02fb4200dc Use buildozer to delete licenses() rules 2017-09-21 15:53:22 -07:00
Jeff Grafton 532bd482df Use buildozer to remove deprecated automanaged tags 2017-09-21 15:53:22 -07:00
Kubernetes Submit Queue a284c1e7a9 Merge pull request #51985 from DiamantiCom/fix-to-mount-on-reboot-pr
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Fix volume remount on reboot

**What this PR does / why we need it**:
Check the mount is actually attached & mounted before marking actual state of world of Kubelet reconciler.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51982  

**Special notes for your reviewer**:
Added explicit check to make sure volumes are attached and are mounted before marking the state in actual state of world.

**Release note**:
NONE
2017-09-21 14:19:43 -07:00
Dan Williams ddb5075842 dockershim: fine-tune network-ready handling on sandbox teardown and removal
If sandbox teardown results in an error, GC will periodically attempt
to again remove the sandbox.  Until the sandbox is removed, pod
sandbox status calls will attempt to enter the pod's namespace and
retrieve the pod IP, but the first teardown attempt may have already
removed the network namespace, resulting in a pointless log error
message that the network namespace doesn't exist, or that nsenter
can't find eth0.

The network-ready mechanism originally attempted to suppress those
messages by ensuring that pod sandbox status skipped network checks
when networking was already torn down, but unfortunately the ready
value was cleared too early.

Also, don't tear down the pod network multiple times if the first
time we tore it down, it succeeded.
2017-09-21 14:53:50 -05:00
Yu-Ju Hong 478b7f8ab0 CRI: Allow configuring stdout/stderr streams for Exec/Attach requests
Add stdout/stderr to exec and attach requests. Also check the request to
ensure it meets the requirements.
2017-09-20 16:40:15 -07:00
Kubernetes Submit Queue 14b32888de Merge pull request #52635 from Random-Liu/fix-cri-stats
Automatic merge from submit-queue (batch tested with PRs 51337, 47080, 52646, 52635, 52666). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Fix CRI container/imagefs stats.

`ContainerStats`, `ListContainerStats` and `ImageFsInfo` are returning `not implemented` error now.

This PR fixes it.

@yujuhong @feiskyer @yguo0905
2017-09-19 17:31:11 -07:00
Kubernetes Submit Queue 0bd2ed16a0 Merge pull request #47080 from jingxu97/May/allocatable
Automatic merge from submit-queue (batch tested with PRs 51337, 47080, 52646, 52635, 52666). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Map a resource to multiple signals in eviction manager

It is possible to have multiple signals that point to the same type of
resource, e.g., both SignalNodeFsAvailable and
SignalAllocatableNodeFsAvailable refer to the same resource NodeFs.
Change the map from map[v1.ResourceName]evictionapi.Signal to
map[v1.ResourceName][]evictionapi.Signal



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52661

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-19 17:31:07 -07:00
Kubernetes Submit Queue 08486ab4aa Merge pull request #52561 from jiayingz/deviceplugin-failure
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Fixes a race in deviceplugin/manager_test.go and a race in deviceplug…

…in/manager.go.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/52560

**Special notes for your reviewer**:
Tested with  go test -count 50 -race k8s.io/kubernetes/pkg/kubelet/deviceplugin and all runs passed.

**Release note**:

```release-note
```
2017-09-19 13:35:44 -07:00
Kubernetes Submit Queue f80999f438 Merge pull request #48970 from caseydavenport/fix-kubelet-restart
Automatic merge from submit-queue (batch tested with PRs 48970, 52497, 51367, 52549, 52541). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Recreate pod sandbox when the sandbox does not have an IP address.

**What this PR does / why we need it**:

Attempts to fix a bug where Pods do not receive networking when the kubelet restarts during pod creation.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:

fixes # https://github.com/kubernetes/kubernetes/issues/48510

**Release note**:

```release-note
NONE
```
2017-09-19 01:17:39 -07:00
wackxu d8aa0ca82a fix the bad code comment and make the format unify 2017-09-19 11:15:10 +08:00
Chakravarthy Nelluri b8d1c3bcd8 Fix volume remount on reboot 2017-09-18 16:28:21 -04:00
Jiaying Zhang 34dccc5d2a Fixes some races in deviceplugin manager_test.go and manager.go. 2017-09-18 13:19:51 -07:00
Lantao Liu d387eab817 Fix CRI container/imagefs stats. 2017-09-18 07:48:20 +00:00
FengyunPan bfc171ccaa Improve codes which checks whether sandbox contains containers
Currently when evictSandboxes() checks whether sandbox contains
containers, it traverses all the containers for every sandbox,
but when cluster has many containres, it wastes a lot of time.
It is better to use sets in this case.
2017-09-18 14:34:34 +08:00
Kubernetes Submit Queue 3277de69b4 Merge pull request #52176 from liggitt/heartbeat-timeout
Automatic merge from submit-queue (batch tested with PRs 52176, 43152). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Eliminate hangs/throttling of node heartbeat

Fixes https://github.com/kubernetes/kubernetes/issues/48638
Fixes #50304

Stops kubelet from wedging when updating node status if unable to establish tcp connection.

 Notes that this only affects the node status loop. The pod sync loop would still hang until the dead TCP connections timed out,  so more work is needed to keep the sync loop responsive in the face of network issues, but this change lets existing pods coast without the node controller trying to evict them

```release-note
kubelet to master communication when doing node status updates now has a timeout to prevent indefinite hangs
```
2017-09-16 09:45:29 -07:00
supereagle 87c29a08e1 fix typos: remove duplicated word in comments 2017-09-16 14:38:10 +08:00
David Porter aee1e58d58 Handle nil WritableLayer 2017-09-16 00:13:17 +00:00
David Porter 0b1f806557 Fix nil dereference if storage id is nil 2017-09-16 00:13:04 +00:00
Clayton Coleman eb0cab5b18
Do not set message when terminationMessagePath not found
If terminationMessagePath is set to a file that does not exist, we
should not log an error message and instead try falling back to logs
(based on the user's request).
2017-09-15 16:27:36 -04:00
Casey Davenport 94bf2b0ccf Attempt at fixing UTs 2017-09-15 09:23:52 -07:00
Casey Davenport be5cd7fed2 Recreate pod sandbox when the sandbox does not have an IP address. 2017-09-15 09:23:52 -07:00
Kubernetes Submit Queue b5fbd71bbc Merge pull request #52290 from jiayingz/deviceplugin-failure
Automatic merge from submit-queue (batch tested with PRs 52452, 52115, 52260, 52290)

Fixes device plugin re-registration handling logic to make sure:

- If a device plugin exits, its exported resource will be removed.
- No capacity change if a new device plugin instance comes up to replace the old instance.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/52510

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-15 02:00:08 -07:00
Kubernetes Submit Queue 86dc5fceda Merge pull request #52451 from yujuhong/enable-cri-stats
Automatic merge from submit-queue (batch tested with PRs 51824, 50476, 52451, 52009, 52237)

kubelet: enable CRI container metrics

Fixes #46984
2017-09-15 01:08:05 -07:00
Kubernetes Submit Queue ce5c41ab0f Merge pull request #52363 from balajismaniam/fix-cpuman-restartpol-never-bug
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)

Make CPU manager release CPUs when Pod enters completed phase. 

**What this PR does / why we need it**: When CPU manager is enabled, this PR releases allocated CPUs when container is not running and is non-restartable. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52351

**Special notes for your reviewer**:
This bug is only reproduced for pods with `restartPolicy` = `Never` or `OnFailure`.  The following output is from a 4 CPU node. This bug can be reproduced as long >= half the cores are requested. 

pod1.yaml:
```
apiVersion: v1
kind: Pod
metadata:
  name: test-pod1
spec:
  containers:
  - image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "sleep 5"]
    name: test-container1
    resources:
      requests:
        cpu: 2
        memory: 100Mi
      limits:
        cpu: 2
        memory: 100Mi
  restartPolicy: "Never"
```

pod2.yaml:
```
apiVersion: v1
kind: Pod
metadata:
  name: test-pod2
spec:
  containers:
  - image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "sleep 5"]
    name: test-container1
    resources:
      requests:
        cpu: 2
        memory: 100Mi
      limits:
        cpu: 2
        memory: 100Mi
  restartPolicy: "Never"
```
Run a local Kubernetes cluster with CPU manager enabled. 
```sh
KUBELET_FLAGS='--feature-gates=CPUManager=true --cpu-manager-policy=static --cpu-manager-reconcile-period=1s --kube-reserved=cpu=500m' ./hack/local-up-cluster.sh
```
_Before:_
Create `test-pod1` using pod1.yaml. 
```
./cluster/kubectl.sh create -f pod1.yaml
```
Wait for the pod to complete and wait another 90 seconds (give enough time for GC to kick-in). 

Create `test-pod2` using pod2.yaml. 
```
./cluster/kubectl.sh create -f pod2.yaml
```

Get all pods in the cluster. 
```
./cluster/kubectl.sh get pods -a
NAME        READY     STATUS                                         RESTARTS   AGE
test-pod1   0/1       Completed                                      0          1m
test-pod2   0/1       not enough cpus available to satisfy request   0          9s
```

_After:_
Create `test-pod1` using pod1.yaml. 
```
./cluster/kubectl.sh create -f pod1.yaml
```
Wait for the pod to complete and wait another 90 seconds (give enough time for GC to kick-in). 

Create `test-pod2` using pod2.yaml. 
```
./cluster/kubectl.sh create -f pod2.yaml
```

Get all pods in the cluster. 
```
./cluster/kubectl.sh get pods -a
NAME        READY     STATUS      RESTARTS   AGE
test-pod1   0/1       Completed    0          1m
test-pod2   0/1       Completed    0          9s
```
2017-09-15 00:11:14 -07:00
Kubernetes Submit Queue 20a4112e88 Merge pull request #46542 from derekwaynecarr/quota-ignore-pod-whose-node-lost
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)

Ignore pods for quota marked for deletion whose node is unreachable

**What this PR does / why we need it**:
Traditionally, we charge to quota all pods that are in a non-terminal phase.  We have a user report that noted the behavior change in kube 1.5 for the node controller to no longer force delete pods whose nodes have been lost.  Instead, the pod is marked for deletion, and the reason is updated to state that the node is unreachable.  The user expected the quota to be released.  If the user was at their quota limit, their application may not be able to create a new replica given the current behavior.  As a result, this PR ignores pods marked for deletion that have exceeded their grace period.

**Which issue this PR fixes**
xref https://bugzilla.redhat.com/show_bug.cgi?id=1455743
fixes https://github.com/kubernetes/kubernetes/issues/52436

**Release note**:
```release-note
Ignore pods marked for deletion that exceed their grace period in ResourceQuota
```
2017-09-15 00:11:10 -07:00
yanxuean 3150d3ebcb improve the relation of ExecInContainer and Exec
keep the relation between ExecInContainer and Exec be consistence with PortForward in streaming server

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-09-15 09:29:46 +08:00
Jiaying Zhang 5cac9fc984 Fixes device plugin re-registration handling logic to make sure:
- If a device plugin exits, its exported resource will be removed.
- No capacity change if a new device plugin instance comes up to replace the old instance.
2017-09-14 15:24:46 -07:00
Jordan Liggitt f8f57d8959
Use separate client for node status loop 2017-09-14 15:56:22 -04:00
David Porter a854ddb358 Implement metrics for Windows Nodes
This implements stats for windows nodes in a new package, winstats.
WinStats exports methods to get cadvisor like datastructures, however
with windows specific metrics. WinStats only gets node level metrics and
information, container stats will go via the CRI. This enables the
use of the summary api to get metrics for windows nodes.
2017-09-14 06:32:51 +00:00
Yu-Ju Hong 2c415cc506 kubelet: enable CRI container metrics 2017-09-13 15:09:35 -07:00
Lee Verberne e2e6a8cd85 Fix typo in kubelet kuberuntime container test
Changes "Expetected" to "Expected"
2017-09-13 14:32:48 +02:00
Kubernetes Submit Queue c6a9b1e198 Merge pull request #52125 from yujuhong/fix-file-sync
Automatic merge from submit-queue (batch tested with PRs 52339, 52343, 52125, 52360, 52301)

dockershim: check if f.Sync() returns an error and surface it

```release-note
dockershim: check the error when syncing the checkpoint.
```
2017-09-12 21:45:56 -07:00
Balaji Subramaniam e2e356964a Make CPU manager release allocated CPUs when container enters completed phase. 2017-09-12 21:01:01 -07:00
Kubernetes Submit Queue b04f81d342 Merge pull request #52344 from smarterclayton/no_log_pull
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

Log at higher verbosity levels some common SyncPod errors

This log message was 90% of all glog.Errorf level statements reported on a production cluster, hiding other more impactful errors. We already log it in start container, but for extra caution we continue to log it at v(3) here (the downside of not logging a start container error is worse than some log spam at higher levels).

HandleError() is intended only for unknown and unexpected errors.

```release-note
NONE
```

@derekwaynecarr @sjenning
2017-09-12 19:40:03 -07:00
Kubernetes Submit Queue 32f1521cc2 Merge pull request #52046 from dashpole/soft_eviction
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

[BugFix] Soft Eviction timer works correctly

fixes #51516

thresholdsMet should not exclude previously met thresholds when we do not have new stats for a threshold.

/assign @vishh @derekwaynecarr 
cc @kubernetes/sig-node-bugs
2017-09-12 19:39:55 -07:00
Kubernetes Submit Queue 8e95e39c15 Merge pull request #52297 from derekwaynecarr/code-hygiene
Automatic merge from submit-queue (batch tested with PRs 51041, 52297, 52296, 52335, 52338)

Use cAdvisor constant for crio imagefs

**What this PR does / why we need it**:
code hygiene to use a constant from cAdvisor

**Release note**:
```release-note
NONE
```
2017-09-12 11:10:10 -07:00
Clayton Coleman a5ac80cbce
Log at higher verbosity levels some common SyncPod errors 2017-09-12 10:52:31 -04:00
Kubernetes Submit Queue d8847a8f1d Merge pull request #52119 from mtaufen/sync-files
Automatic merge from submit-queue

fsync config checkpoint files after writing

@yujuhong brought up that it's possible for a hard reboot to result in empty checkpoint files, if they haven't been synced to disk yet. This PR ensures that Kubelet configuration checkpoints are synced after writing to avoid this issue.

fixes #52222

**Release note**:
```release-note
NONE
```
2017-09-12 05:41:25 -07:00
Kubernetes Submit Queue 01154dd3cf Merge pull request #51870 from feiskyer/sandbox-creds
Automatic merge from submit-queue (batch tested with PRs 52264, 51870)

Use credentials from providers for docker sandbox image

**What this PR does / why we need it**:

Sandbox image lookup uses creds from docker config only; other credential providers are ignored. This is a regression introduced in dockershim.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51293

**Special notes for your reviewer**:

Should also cherry-pick this to release-1.6 and release-1.7.

**Release note**:

```release-note
Fix credentials providers for docker sandbox image.
```
2017-09-12 02:10:24 -07:00
yanxuean 799d0e5a6e correct to handler 2017-09-12 13:47:08 +08:00
Derek Carr cf2c688385 Use cAdvisor constant for crio imagefs 2017-09-11 14:08:00 -04:00
Derek Carr da01c6d3a2 Ignore pods for quota that exceed deletion grace period 2017-09-11 13:31:52 -04:00
Yu-Ju Hong aaf26b2eaa dockershim: remove support for legacy containers
The code was first introduced in 1.6 to help pre-CRI-kubelet upgrade to
using the CRI implementation. They can safely be removed now.
2017-09-11 08:44:27 -07:00
xiangpengzhao 0484a1c2c5 Remove backward compatibility of hostportChainName 2017-09-10 00:24:00 +08:00
Kubernetes Submit Queue d6df4a5127 Merge pull request #52063 from mtaufen/dkcfg-e2enode
Automatic merge from submit-queue (batch tested with PRs 52047, 52063, 51528)

Improve dynamic kubelet config e2e node test and fix bugs

Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.

Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.

This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.

Related issue: #50217

I had been sitting on this until the cAdvisor fix merged in #51751, as these tests fail without that fix.

**Release note**:

```release-note
NONE
```
2017-09-08 16:06:56 -07:00
Pengfei Ni 4d5d97438b Use credentials from providers for docker sandbox image 2017-09-09 07:02:04 +08:00
Kubernetes Submit Queue 943817f57b Merge pull request #52047 from balajismaniam/cpuman-large-topo-test
Automatic merge from submit-queue

Added large topology tests for static policy in CPU Manager.

**What this PR does / why we need it**: This PR adds a very large topology test case for the CPU Manager feature.

Related to #51180. 

CC @ConnorDoyle
2017-09-08 15:57:41 -07:00
Kevin f50761c9d4 fix prober ticking shift for kubelet restarted cases 2017-09-08 17:31:02 +08:00
Yu-Ju Hong a850614613 dockershim: check if f.Sync() returns an error and surface it 2017-09-07 16:05:02 -07:00
Michael Taufen a846ba191c Improve dynamic kubelet config e2e node test and fix bugs
Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.

Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.

This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.

Related issue: #50217
2017-09-07 15:50:17 -07:00
Michael Taufen 47beb80368 fsync config checkpoint files after writing 2017-09-07 14:42:18 -07:00
Kubernetes Submit Queue ae6b329368 Merge pull request #51644 from sjenning/init-container-status-fix
Automatic merge from submit-queue (batch tested with PRs 51239, 51644, 52076)

do not update init containers status if terminated

fixes #29972 #41580

This fixes an issue where, if a completed init container is removed while the pod or subsequent init containers are still running, the status for that init container will be reset to `Waiting` with `PodInitializing`.  

This can manifest in a number of ways.

If the init container is removed why the main pod containers are running, the status will be reset with no functional problem but the status will be reported incorrectly in `kubectl get pod` for example

If the init container is removed why a subsequent init container is running, the init container will be **re-executed** leading to all manner of badness.

@derekwaynecarr @bparees
2017-09-07 14:31:23 -07:00
Derek Carr 27365eb900 Fix cross-build 2017-09-07 09:53:52 -04:00
Kubernetes Submit Queue a51eb2ac4e Merge pull request #49202 from cbonte/node-addresses
Automatic merge from submit-queue (batch tested with PRs 51728, 49202)

Fix setNodeAddress when a node IP and a cloud provider are set

**What this PR does / why we need it**:
When a node IP is set and a cloud provider returns the same address with
several types, only the first address was accepted. With the changes made
in PR #45201, the vSphere cloud provider returned the ExternalIP first,
which led to a node without any InternalIP.

The behaviour is modified to return all the address types for the
specified node IP.

**Which issue this PR fixes**: fixes #48760

**Special notes for your reviewer**:
* I'm not a golang expert, is it possible to mock `kubelet.validateNodeIP()` to avoid the need of real host interface addresses in the test ?
* It would be great to have it backported for a next 1.6.8 release.

**Release note**:
```release-note
NONE
```
2017-09-06 20:01:00 -07:00
Kubernetes Submit Queue b6545a086c Merge pull request #51728 from derekwaynecarr/cadvisor-stats
Automatic merge from submit-queue (batch tested with PRs 51728, 49202)

Enable CRI-O stats from cAdvisor

**What this PR does / why we need it**:
cAdvisor may support multiple container runtimes (docker, rkt, cri-o, systemd, etc.)

As long as the kubelet continues to run cAdvisor, runtimes with native cAdvisor support may not want to run multiple monitoring agents to avoid performance regression in production.  Pending kubelet running a more light-weight monitoring solution, this PR allows remote runtimes to have their stats pulled from cAdvisor when cAdvisor is registered stats provider by introspection of the runtime endpoint.

See issue https://github.com/kubernetes/kubernetes/issues/51798

**Special notes for your reviewer**:
cAdvisor will be bumped to pick up https://github.com/google/cadvisor/pull/1741

At that time, CRI-O will support fetching stats from cAdvisor.

**Release note**:
```release-note
NONE
```
2017-09-06 20:00:57 -07:00
Joel Smith 58ae5a78f9 Clean up kublet secret and configmap unit test
* Expected value comes before actual value in assert.Equal()
* Use assert.Equal() instead of assert.True() when possible
* Add a unit test that verifies no-op pod updates to the
  secret_manager and the configmap_manager
* Add a clarifying comment about why it's good to seemingly
  delete a secret on updates.
* Fix (for now, non-buggy) variable shadowing issue
2017-09-06 16:38:01 -06:00
Balaji Subramaniam e2cb80db4a Added large topology tests for static policy in CPU Manager.
- Added comments for tests cases.
2017-09-06 13:15:22 -07:00