Commit Graph

5475 Commits (f34a24e98e7c837b567b78be3af958ac1156cd80)

Author SHA1 Message Date
Kubernetes Submit Queue fc8a647f78 Merge pull request #52864 from dcbw/dockershim-fix-net-teardown
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

dockershim: fine-tune network-ready handling on sandbox teardown and removal

If sandbox teardown results in an error, GC will periodically attempt
to again remove the sandbox.  Until the sandbox is removed, pod
sandbox status calls will attempt to enter the pod's namespace and
retrieve the pod IP, but the first teardown attempt may have already
removed the network namespace, resulting in a pointless log error
message that the network namespace doesn't exist, or that nsenter
can't find eth0.

The network-ready mechanism originally attempted to suppress those
messages by ensuring that pod sandbox status skipped network checks
when networking was already torn down, but unfortunately the ready
value was cleared too early.

Also, don't tear down the pod network multiple times if the first
time we tore it down, it succeeded.
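
The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not dockershim's actual code: `readyTracker`, `stopSandbox`, and `errTeardown` are hypothetical names, and the sketch assumes the ready flag is cleared only after a successful teardown.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errTeardown simulates a failing network plugin in the demo below.
var errTeardown = errors.New("network teardown failed")

// readyTracker is a hypothetical stand-in for dockershim's per-sandbox
// network-ready bookkeeping; the real names and structure may differ.
type readyTracker struct {
	mu    sync.Mutex
	ready map[string]bool
}

func newReadyTracker() *readyTracker {
	return &readyTracker{ready: make(map[string]bool)}
}

func (r *readyTracker) set(id string, ready bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ready[id] = ready
}

func (r *readyTracker) get(id string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.ready[id]
}

// stopSandbox tears the network down only while it is still marked ready,
// and clears the flag only after the teardown has succeeded.
func (r *readyTracker) stopSandbox(id string, teardown func(string) error) error {
	if !r.get(id) {
		return nil // already torn down successfully; nothing to do
	}
	if err := teardown(id); err != nil {
		return err // flag stays set, so a later GC pass retries
	}
	r.set(id, false)
	return nil
}

func main() {
	r := newReadyTracker()
	r.set("sandbox-1", true)
	calls, fail := 0, true
	teardown := func(string) error {
		calls++
		if fail {
			return errTeardown
		}
		return nil
	}
	fmt.Println(r.stopSandbox("sandbox-1", teardown)) // first attempt fails
	fail = false
	fmt.Println(r.stopSandbox("sandbox-1", teardown)) // retry succeeds
	fmt.Println(r.stopSandbox("sandbox-1", teardown)) // no further teardown
	fmt.Println("teardown calls:", calls)
}
```

A status call could consult `get` in the same way to decide whether to skip the pod IP lookup.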



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-24 04:32:12 -07:00
Kubernetes Submit Queue 7c9e614cbb Merge pull request #52873 from ixdy/bazel-cleanup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

bazel: build/test almost everything

**What this PR does / why we need it**: Miscellaneous cleanups and bug fixes. The main motivating idea here was to make `bazel build //...` and `bazel test //...` mostly work. (There's a few reasons these still don't work, but we're a lot closer.)

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/assign @BenTheElder @mikedanese @spxtr
2017-09-24 00:04:36 -07:00
Kubernetes Submit Queue cece399058 Merge pull request #52567 from smarterclayton/fix_fallback_to_logs
Automatic merge from submit-queue (batch tested with PRs 50890, 52484, 52542, 52567, 50672). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Do not set message when terminationMessagePath not found

If terminationMessagePath is set to a file that does not exist, we should not log an error message and instead try falling back to logs (based on the user's request).

This also slightly simplifies the terminationMessagePath processing.
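
A minimal sketch of the intended behavior, assuming a hypothetical helper `terminationMessage` and a caller-supplied `tailLogs` function (neither is the kubelet's real API). The real kubelet applies further conditions before falling back to logs; this sketch only shows that a missing file produces no error message.

```go
package main

import (
	"fmt"
	"os"
)

// terminationMessage is a hypothetical helper. It returns the file contents
// when the file exists and is non-empty. A missing file is not treated as an
// error: the function either falls back to the log tail or returns "".
func terminationMessage(path string, fallbackToLogs bool, tailLogs func() string) string {
	data, err := os.ReadFile(path)
	switch {
	case err == nil && len(data) > 0:
		return string(data)
	case err != nil && !os.IsNotExist(err):
		// Only unexpected read errors are surfaced as a message.
		return fmt.Sprintf("error reading termination message file %s: %v", path, err)
	}
	if fallbackToLogs {
		return tailLogs()
	}
	return ""
}

func main() {
	tail := func() string { return "last log lines" }
	fmt.Printf("%q\n", terminationMessage("/nonexistent/termination-log", true, tail))
	fmt.Printf("%q\n", terminationMessage("/nonexistent/termination-log", false, tail))
}
```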

Seen in #50499

```release-note
If a container does not create a file at the `terminationMessagePath`, no message should be output about being unable to find the file.
```
2017-09-23 16:26:54 -07:00
Kubernetes Submit Queue 441f674c60 Merge pull request #50396 from bobbypage/stats
Automatic merge from submit-queue (batch tested with PRs 52168, 48939, 51889, 52051, 50396). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Add Windows Server Containers Stats and Metrics to Kubelet

**What this PR does / why we need it**:

This PR implements stats for Windows Server Containers. This adds the ability to monitor Windows Server containers via the existing stats/summary endpoint inside the kubelet. Windows metrics can now be ingested into heapster and monitored using existing tools (like Grafana). 

Previously, the /stats/summary api would consistently crash the kubelet on Windows server containers. This PR implements a new package "winstats" which reads windows server metrics from a combination of windows specific perf counters as well as docker stats. The "winstats" package exports functions that return CAdvisor data structures, which the existing summary api can read. 


**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #49398

This PR addresses my plan to implement windows server container stats https://github.com/kubernetes/kubernetes/issues/49398 .


**Release note**:

```release-note
Add monitoring of Windows Server containers metrics in the kubelet via the stats/summary endpoint.
```
2017-09-23 13:40:56 -07:00
Kubernetes Submit Queue 5e3b681caa Merge pull request #48939 from verb/nit-expetected
Automatic merge from submit-queue (batch tested with PRs 52168, 48939, 51889, 52051, 50396). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix typo in kubelet kuberuntime container test

Changes "Expetected" to "Expected"

**What this PR does / why we need it**: Fixes a typo in a test

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-23 13:40:47 -07:00
Kubernetes Submit Queue 2c5413b379 Merge pull request #50422 from karataliu/apid
Automatic merge from submit-queue (batch tested with PRs 50294, 50422, 51757, 52379, 52014). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix AnnotationProvidedIPAddr annotation for externalCloudProvider

**What this PR does / why we need it**:
#44258 introduced `AnnotationProvidedIPAddr`. When the kubelet has the 'node-ip' parameter set and the cloud provider is not set, this annotation is populated and then validated by cloud-controller-manager:
https://github.com/kubernetes/kubernetes/pull/44258/files#diff-6b0808bd1afb15f9f77986f4459601c2R465

Later, #47152 added a check for externalCloudProvider that makes the function return before the annotation is set, so in that case the annotation is never populated.

This fix moves the annotation assignment to the proper location.

Please correct me if I have any misunderstanding.
@wlan0 @ublubu 

**Which issue this PR fixes**

**Special notes for your reviewer**:

**Release note**:
2017-09-23 11:40:47 -07:00
Kubernetes Submit Queue 7485aad067 Merge pull request #52235 from xiangpengzhao/remove-hostportChainName
Automatic merge from submit-queue (batch tested with PRs 52109, 52235, 51809, 52161, 50080). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Remove backward compatibility of hostportChainName

**What this PR does / why we need it**:
fix TODO.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
/assign @freehan 

**Release note**:

```release-note
NONE
```
2017-09-23 10:26:47 -07:00
Kubernetes Submit Queue ffe122d89c Merge pull request #52220 from yujuhong/rm-legacy-code
Automatic merge from submit-queue (batch tested with PRs 52240, 48145, 52220, 51698, 51777). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

dockershim: remove support for legacy containers

The code was first introduced in 1.6 to help pre-CRI-kubelet upgrade to
using the CRI implementation. It can safely be removed now.
2017-09-23 09:14:00 -07:00
Kubernetes Submit Queue d4ac62cea4 Merge pull request #51031 from jcbsmpsn/metric-certificate-expiration-on-kubelet
Automatic merge from submit-queue (batch tested with PRs 51031, 51705, 51888, 51727, 51684). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Add a kubelet metric to track certificate expiration.

Fix https://github.com/kubernetes/kubernetes/issues/51964

```release-note
Add a metric to the kubelet to monitor remaining lifetime of the certificate that
authenticates the kubelet to the API server.
```
2017-09-23 01:46:58 -07:00
Kubernetes Submit Queue 28df7a1cae Merge pull request #47806 from dcbw/fix-pod-ip-race
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

kubelet: fix inconsistent display of terminated pod IPs

PLEG and kubelet race when reading and sending pod status to the apiserver.  PLEG
inserts status into a cache, and then signals kubelet.  Kubelet then eventually
reads the status out of that cache, but in the meantime the status could have
been changed by PLEG.

When a pod exits, pod status will no longer include the pod's IP address because
the network plugin/runtime will report "" for terminated pod IPs.  If this status
gets inserted into the PLEG cache before kubelet gets the status out of the cache,
kubelet will see a blank pod IP address.  This happens in about 1/5 of cases when
pods are short-lived, and somewhat less frequently for longer running pods.

To ensure consistency for properties of dead pods, copy an old status update's
IP address over to the new status update if (a) the new status update's IP is
missing and (b) all sandboxes of the pod are dead/not-ready (e.g., no possibility
for a valid IP from the sandbox).
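
The rule in (a) and (b) can be sketched as below. The types `podStatus` and `sandboxState` and the function `preserveIP` are simplified, hypothetical stand-ins for the kubelet's real status types, assumed here only for illustration.

```go
package main

import "fmt"

type sandboxState int

const (
	sandboxReady sandboxState = iota
	sandboxNotReady
)

// podStatus is a simplified stand-in for the cached pod status.
type podStatus struct {
	IP        string
	Sandboxes []sandboxState
}

// preserveIP returns the IP to record for the updated status. The old IP is
// carried over only when the update has no IP and no sandbox is ready.
func preserveIP(old, updated *podStatus) string {
	if updated.IP != "" || old == nil {
		return updated.IP
	}
	for _, s := range updated.Sandboxes {
		if s == sandboxReady {
			// A live sandbox could still report a valid IP; trust the update.
			return updated.IP
		}
	}
	return old.IP
}

func main() {
	old := &podStatus{IP: "10.0.0.5"}
	dead := &podStatus{Sandboxes: []sandboxState{sandboxNotReady}}
	live := &podStatus{Sandboxes: []sandboxState{sandboxReady}}
	fmt.Printf("%q\n", preserveIP(old, dead))
	fmt.Printf("%q\n", preserveIP(old, live))
}
```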

Fixes: https://github.com/kubernetes/kubernetes/issues/47265
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1449373

@eparis @freehan @kubernetes/rh-networking @kubernetes/sig-network-misc
2017-09-22 21:01:50 -07:00
Yu-Ju Hong 3837a016ef kubelet: remove the --docker-exec-handler flag
Stop supporting the "nsenter" exec handler. Only the Docker native exec
handler is supported.

The flag was deprecated in Kubernetes 1.6 and is safe to remove
in Kubernetes 1.9 according to the deprecation policy.
2017-09-22 12:13:31 -07:00
Jeff Grafton 02fb4200dc Use buildozer to delete licenses() rules 2017-09-21 15:53:22 -07:00
Jeff Grafton 532bd482df Use buildozer to remove deprecated automanaged tags 2017-09-21 15:53:22 -07:00
Kubernetes Submit Queue a284c1e7a9 Merge pull request #51985 from DiamantiCom/fix-to-mount-on-reboot-pr
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix volume remount on reboot

**What this PR does / why we need it**:
Check that the volume is actually attached and mounted before marking it in the actual state of world of the Kubelet reconciler.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51982  

**Special notes for your reviewer**:
Added explicit check to make sure volumes are attached and are mounted before marking the state in actual state of world.

**Release note**:
NONE
2017-09-21 14:19:43 -07:00
Dan Williams ddb5075842 dockershim: fine-tune network-ready handling on sandbox teardown and removal
If sandbox teardown results in an error, GC will periodically attempt
to again remove the sandbox.  Until the sandbox is removed, pod
sandbox status calls will attempt to enter the pod's namespace and
retrieve the pod IP, but the first teardown attempt may have already
removed the network namespace, resulting in a pointless log error
message that the network namespace doesn't exist, or that nsenter
can't find eth0.

The network-ready mechanism originally attempted to suppress those
messages by ensuring that pod sandbox status skipped network checks
when networking was already torn down, but unfortunately the ready
value was cleared too early.

Also, don't tear down the pod network multiple times if the first
time we tore it down, it succeeded.
2017-09-21 14:53:50 -05:00
Yu-Ju Hong 478b7f8ab0 CRI: Allow configuring stdout/stderr streams for Exec/Attach requests
Add stdout/stderr to exec and attach requests. Also check the request to
ensure it meets the requirements.
2017-09-20 16:40:15 -07:00
Kubernetes Submit Queue 14b32888de Merge pull request #52635 from Random-Liu/fix-cri-stats
Automatic merge from submit-queue (batch tested with PRs 51337, 47080, 52646, 52635, 52666). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix CRI container/imagefs stats.

`ContainerStats`, `ListContainerStats` and `ImageFsInfo` are returning `not implemented` error now.

This PR fixes it.

@yujuhong @feiskyer @yguo0905
2017-09-19 17:31:11 -07:00
Kubernetes Submit Queue 0bd2ed16a0 Merge pull request #47080 from jingxu97/May/allocatable
Automatic merge from submit-queue (batch tested with PRs 51337, 47080, 52646, 52635, 52666). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Map a resource to multiple signals in eviction manager

It is possible to have multiple signals that point to the same type of
resource, e.g., both SignalNodeFsAvailable and
SignalAllocatableNodeFsAvailable refer to the same resource NodeFs.
Change the map from map[v1.ResourceName]evictionapi.Signal to
map[v1.ResourceName][]evictionapi.Signal
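
A small sketch of the new map shape. The types and string values below are local, assumed stand-ins; only the identifiers NodeFs, SignalNodeFsAvailable, and SignalAllocatableNodeFsAvailable come from the description above.

```go
package main

import "fmt"

// Local stand-ins for v1.ResourceName and evictionapi.Signal.
type ResourceName string
type Signal string

// The string values are assumptions for this sketch.
const (
	resourceNodeFs                   ResourceName = "nodefs"
	signalNodeFsAvailable            Signal       = "nodefs.available"
	signalAllocatableNodeFsAvailable Signal       = "allocatableNodeFs.available"
)

// One resource may now be backed by several signals.
var resourceToSignals = map[ResourceName][]Signal{
	resourceNodeFs: {signalNodeFsAvailable, signalAllocatableNodeFsAvailable},
}

// signalsFor returns every signal that refers to the given resource.
func signalsFor(r ResourceName) []Signal {
	return resourceToSignals[r]
}

func main() {
	for _, s := range signalsFor(resourceNodeFs) {
		fmt.Println(s)
	}
}
```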



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52661

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-19 17:31:07 -07:00
Kubernetes Submit Queue 08486ab4aa Merge pull request #52561 from jiayingz/deviceplugin-failure
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fixes a race in deviceplugin/manager_test.go and a race in deviceplugin/manager.go.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/52560

**Special notes for your reviewer**:
Tested with  go test -count 50 -race k8s.io/kubernetes/pkg/kubelet/deviceplugin and all runs passed.

**Release note**:

```release-note
```
2017-09-19 13:35:44 -07:00
Kubernetes Submit Queue f80999f438 Merge pull request #48970 from caseydavenport/fix-kubelet-restart
Automatic merge from submit-queue (batch tested with PRs 48970, 52497, 51367, 52549, 52541). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Recreate pod sandbox when the sandbox does not have an IP address.

**What this PR does / why we need it**:

Attempts to fix a bug where Pods do not receive networking when the kubelet restarts during pod creation.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:

fixes # https://github.com/kubernetes/kubernetes/issues/48510

**Release note**:

```release-note
NONE
```
2017-09-19 01:17:39 -07:00
wackxu d8aa0ca82a fix the bad code comment and make the format unify 2017-09-19 11:15:10 +08:00
Chakravarthy Nelluri b8d1c3bcd8 Fix volume remount on reboot 2017-09-18 16:28:21 -04:00
Jiaying Zhang 34dccc5d2a Fixes some races in deviceplugin manager_test.go and manager.go. 2017-09-18 13:19:51 -07:00
Lantao Liu d387eab817 Fix CRI container/imagefs stats. 2017-09-18 07:48:20 +00:00
FengyunPan bfc171ccaa Improve the code that checks whether a sandbox contains containers
Currently, when evictSandboxes() checks whether a sandbox contains
containers, it traverses all the containers for every sandbox,
which wastes a lot of time when the cluster has many containers.
It is better to use sets in this case.
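
A minimal sketch of the set-based approach, assuming simplified hypothetical types (`container`, `emptySandboxes`) rather than the kubelet's real ones. The set is built once in a single pass, so each sandbox check becomes a map lookup instead of a scan over all containers.

```go
package main

import "fmt"

// container is a simplified stand-in that records its owning sandbox.
type container struct {
	ID        string
	SandboxID string
}

// sandboxesInUse builds the set of sandbox IDs that still own a container.
func sandboxesInUse(containers []container) map[string]struct{} {
	set := make(map[string]struct{}, len(containers))
	for _, c := range containers {
		set[c.SandboxID] = struct{}{}
	}
	return set
}

// emptySandboxes returns the sandboxes that contain no containers.
func emptySandboxes(sandboxIDs []string, containers []container) []string {
	inUse := sandboxesInUse(containers)
	var empty []string
	for _, id := range sandboxIDs {
		if _, ok := inUse[id]; !ok {
			empty = append(empty, id)
		}
	}
	return empty
}

func main() {
	containers := []container{
		{ID: "c1", SandboxID: "a"},
		{ID: "c2", SandboxID: "a"},
		{ID: "c3", SandboxID: "c"},
	}
	fmt.Println(emptySandboxes([]string{"a", "b", "c"}, containers))
}
```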
2017-09-18 14:34:34 +08:00
Kubernetes Submit Queue 3277de69b4 Merge pull request #52176 from liggitt/heartbeat-timeout
Automatic merge from submit-queue (batch tested with PRs 52176, 43152). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Eliminate hangs/throttling of node heartbeat

Fixes https://github.com/kubernetes/kubernetes/issues/48638
Fixes #50304

Stops kubelet from wedging when updating node status if unable to establish tcp connection.

Note that this only affects the node status loop. The pod sync loop would still hang until the dead TCP connections timed out, so more work is needed to keep the sync loop responsive in the face of network issues, but this change lets existing pods coast without the node controller trying to evict them.
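
The general idea of a dedicated heartbeat client with a timeout can be sketched with the standard library alone. This is an assumption-laden illustration: the kubelet builds its clients through client-go, and `kubeletClients`, `newClients`, and `defaultHeartbeatTimeout` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// defaultHeartbeatTimeout is an assumed value for this sketch only.
const defaultHeartbeatTimeout = 10 * time.Second

// kubeletClients keeps the heartbeat client separate from the general one,
// so a timeout on node status updates does not affect other requests.
type kubeletClients struct {
	general   *http.Client
	heartbeat *http.Client
}

func newClients(heartbeatTimeout time.Duration) kubeletClients {
	return kubeletClients{
		general: &http.Client{}, // no overall timeout (e.g. for long-lived requests)
		// Timeout bounds the whole request, including connection setup.
		heartbeat: &http.Client{Timeout: heartbeatTimeout},
	}
}

func main() {
	c := newClients(defaultHeartbeatTimeout)
	fmt.Println("heartbeat timeout:", c.heartbeat.Timeout)
	fmt.Println("general timeout:", c.general.Timeout)
}
```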

```release-note
kubelet to master communication when doing node status updates now has a timeout to prevent indefinite hangs
```
2017-09-16 09:45:29 -07:00
supereagle 87c29a08e1 fix typos: remove duplicated word in comments 2017-09-16 14:38:10 +08:00
David Porter aee1e58d58 Handle nil WritableLayer 2017-09-16 00:13:17 +00:00
David Porter 0b1f806557 Fix nil dereference if storage id is nil 2017-09-16 00:13:04 +00:00
Clayton Coleman eb0cab5b18 Do not set message when terminationMessagePath not found
If terminationMessagePath is set to a file that does not exist, we
should not log an error message and instead try falling back to logs
(based on the user's request).
2017-09-15 16:27:36 -04:00
Casey Davenport 94bf2b0ccf Attempt at fixing UTs 2017-09-15 09:23:52 -07:00
Casey Davenport be5cd7fed2 Recreate pod sandbox when the sandbox does not have an IP address. 2017-09-15 09:23:52 -07:00
Kubernetes Submit Queue b5fbd71bbc Merge pull request #52290 from jiayingz/deviceplugin-failure
Automatic merge from submit-queue (batch tested with PRs 52452, 52115, 52260, 52290)

Fixes device plugin re-registration handling logic to make sure:

- If a device plugin exits, its exported resource will be removed.
- No capacity change if a new device plugin instance comes up to replace the old instance.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/52510

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-15 02:00:08 -07:00
Kubernetes Submit Queue 86dc5fceda Merge pull request #52451 from yujuhong/enable-cri-stats
Automatic merge from submit-queue (batch tested with PRs 51824, 50476, 52451, 52009, 52237)

kubelet: enable CRI container metrics

Fixes #46984
2017-09-15 01:08:05 -07:00
Kubernetes Submit Queue ce5c41ab0f Merge pull request #52363 from balajismaniam/fix-cpuman-restartpol-never-bug
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)

Make CPU manager release CPUs when Pod enters completed phase. 

**What this PR does / why we need it**: When CPU manager is enabled, this PR releases allocated CPUs when container is not running and is non-restartable. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52351

**Special notes for your reviewer**:
This bug is only reproduced for pods with `restartPolicy` = `Never` or `OnFailure`. The following output is from a 4 CPU node. This bug can be reproduced as long as at least half the cores are requested.

pod1.yaml:
```
apiVersion: v1
kind: Pod
metadata:
  name: test-pod1
spec:
  containers:
  - image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "sleep 5"]
    name: test-container1
    resources:
      requests:
        cpu: 2
        memory: 100Mi
      limits:
        cpu: 2
        memory: 100Mi
  restartPolicy: "Never"
```

pod2.yaml:
```
apiVersion: v1
kind: Pod
metadata:
  name: test-pod2
spec:
  containers:
  - image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "sleep 5"]
    name: test-container1
    resources:
      requests:
        cpu: 2
        memory: 100Mi
      limits:
        cpu: 2
        memory: 100Mi
  restartPolicy: "Never"
```
Run a local Kubernetes cluster with CPU manager enabled. 
```sh
KUBELET_FLAGS='--feature-gates=CPUManager=true --cpu-manager-policy=static --cpu-manager-reconcile-period=1s --kube-reserved=cpu=500m' ./hack/local-up-cluster.sh
```
_Before:_
Create `test-pod1` using pod1.yaml. 
```
./cluster/kubectl.sh create -f pod1.yaml
```
Wait for the pod to complete and wait another 90 seconds (give enough time for GC to kick in).

Create `test-pod2` using pod2.yaml. 
```
./cluster/kubectl.sh create -f pod2.yaml
```

Get all pods in the cluster. 
```
./cluster/kubectl.sh get pods -a
NAME        READY     STATUS                                         RESTARTS   AGE
test-pod1   0/1       Completed                                      0          1m
test-pod2   0/1       not enough cpus available to satisfy request   0          9s
```

_After:_
Create `test-pod1` using pod1.yaml. 
```
./cluster/kubectl.sh create -f pod1.yaml
```
Wait for the pod to complete and wait another 90 seconds (give enough time for GC to kick in).

Create `test-pod2` using pod2.yaml. 
```
./cluster/kubectl.sh create -f pod2.yaml
```

Get all pods in the cluster. 
```
./cluster/kubectl.sh get pods -a
NAME        READY     STATUS      RESTARTS   AGE
test-pod1   0/1       Completed    0          1m
test-pod2   0/1       Completed    0          9s
```
2017-09-15 00:11:14 -07:00
Kubernetes Submit Queue 20a4112e88 Merge pull request #46542 from derekwaynecarr/quota-ignore-pod-whose-node-lost
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)

Ignore pods for quota marked for deletion whose node is unreachable

**What this PR does / why we need it**:
Traditionally, we charge to quota all pods that are in a non-terminal phase.  We have a user report that noted the behavior change in kube 1.5 for the node controller to no longer force delete pods whose nodes have been lost.  Instead, the pod is marked for deletion, and the reason is updated to state that the node is unreachable.  The user expected the quota to be released.  If the user was at their quota limit, their application may not be able to create a new replica given the current behavior.  As a result, this PR ignores pods marked for deletion that have exceeded their grace period.
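
A rough sketch of the check described above, using a simplified hypothetical `pod` type. It assumes the grace period is measured from the deletion timestamp; the real quota evaluator may differ in detail.

```go
package main

import (
	"fmt"
	"time"
)

// pod is a simplified stand-in carrying only the fields this check needs.
type pod struct {
	Phase                      string
	DeletionTimestamp          *time.Time
	DeletionGracePeriodSeconds *int64
}

// at is a small demo helper that builds a time from Unix seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// countsAgainstQuota reports whether the pod should be charged to quota.
func countsAgainstQuota(p pod, now time.Time) bool {
	// Pods in a terminal phase are never charged.
	if p.Phase == "Succeeded" || p.Phase == "Failed" {
		return false
	}
	// Pods marked for deletion stop counting once the grace period is over.
	if p.DeletionTimestamp != nil && p.DeletionGracePeriodSeconds != nil {
		grace := time.Duration(*p.DeletionGracePeriodSeconds) * time.Second
		if now.After(p.DeletionTimestamp.Add(grace)) {
			return false
		}
	}
	return true
}

func main() {
	grace := int64(30)
	deleted := at(100)
	p := pod{Phase: "Running", DeletionTimestamp: &deleted, DeletionGracePeriodSeconds: &grace}
	fmt.Println(countsAgainstQuota(p, at(120))) // still within the grace period
	fmt.Println(countsAgainstQuota(p, at(131))) // grace period exceeded
}
```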

**Which issue this PR fixes**
xref https://bugzilla.redhat.com/show_bug.cgi?id=1455743
fixes https://github.com/kubernetes/kubernetes/issues/52436

**Release note**:
```release-note
Ignore pods marked for deletion that exceed their grace period in ResourceQuota
```
2017-09-15 00:11:10 -07:00
Jiaying Zhang 5cac9fc984 Fixes device plugin re-registration handling logic to make sure:
- If a device plugin exits, its exported resource will be removed.
- No capacity change if a new device plugin instance comes up to replace the old instance.
2017-09-14 15:24:46 -07:00
Jordan Liggitt f8f57d8959 Use separate client for node status loop 2017-09-14 15:56:22 -04:00
David Porter a854ddb358 Implement metrics for Windows Nodes
This implements stats for Windows nodes in a new package, winstats.
WinStats exports methods that return cAdvisor-like data structures
populated with Windows-specific metrics. WinStats only gets node-level
metrics and information; container stats will go via the CRI. This
enables the use of the summary API to get metrics for Windows nodes.
2017-09-14 06:32:51 +00:00
Yu-Ju Hong 2c415cc506 kubelet: enable CRI container metrics 2017-09-13 15:09:35 -07:00
Lee Verberne e2e6a8cd85 Fix typo in kubelet kuberuntime container test
Changes "Expetected" to "Expected"
2017-09-13 14:32:48 +02:00
Kubernetes Submit Queue c6a9b1e198 Merge pull request #52125 from yujuhong/fix-file-sync
Automatic merge from submit-queue (batch tested with PRs 52339, 52343, 52125, 52360, 52301)

dockershim: check if f.Sync() returns an error and surface it

```release-note
dockershim: check the error when syncing the checkpoint.
```
2017-09-12 21:45:56 -07:00
Balaji Subramaniam e2e356964a Make CPU manager release allocated CPUs when container enters completed phase. 2017-09-12 21:01:01 -07:00
Kubernetes Submit Queue b04f81d342 Merge pull request #52344 from smarterclayton/no_log_pull
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

Log at higher verbosity levels some common SyncPod errors

This log message was 90% of all glog.Errorf level statements reported on a production cluster, hiding other more impactful errors. We already log it in start container, but for extra caution we continue to log it at v(3) here (the downside of not logging a start container error is worse than some log spam at higher levels).

HandleError() is intended only for unknown and unexpected errors.

```release-note
NONE
```

@derekwaynecarr @sjenning
2017-09-12 19:40:03 -07:00
Kubernetes Submit Queue 32f1521cc2 Merge pull request #52046 from dashpole/soft_eviction
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

[BugFix] Soft Eviction timer works correctly

fixes #51516

thresholdsMet should not exclude previously met thresholds when we do not have new stats for a threshold.
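
One way to picture the rule is the conceptual sketch below. It is not the eviction manager's code: the `threshold` type, the `previouslyMet` map, and the "met when the observed value is below the minimum" comparison are all assumptions made for illustration.

```go
package main

import "fmt"

// threshold is a simplified stand-in: it is met when the observed
// value for its signal drops below Min.
type threshold struct {
	Signal string
	Min    int64
}

// thresholdsMet keeps a previously met threshold when there is no new
// observation for its signal, instead of silently dropping it.
func thresholdsMet(thresholds []threshold, observations map[string]int64, previouslyMet map[string]bool) []threshold {
	var met []threshold
	for _, t := range thresholds {
		value, found := observations[t.Signal]
		switch {
		case !found && previouslyMet[t.Signal]:
			met = append(met, t) // no new stats: carry the earlier result forward
		case found && value < t.Min:
			met = append(met, t)
		}
	}
	return met
}

func main() {
	thresholds := []threshold{
		{Signal: "memory.available", Min: 100},
		{Signal: "nodefs.available", Min: 50},
	}
	observations := map[string]int64{"nodefs.available": 80} // no memory stats this round
	previouslyMet := map[string]bool{"memory.available": true}
	fmt.Println(thresholdsMet(thresholds, observations, previouslyMet))
}
```

Carrying the earlier result forward is what would let a soft-eviction grace period keep counting across rounds that have no fresh stats.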

/assign @vishh @derekwaynecarr 
cc @kubernetes/sig-node-bugs
2017-09-12 19:39:55 -07:00
Kubernetes Submit Queue 8e95e39c15 Merge pull request #52297 from derekwaynecarr/code-hygiene
Automatic merge from submit-queue (batch tested with PRs 51041, 52297, 52296, 52335, 52338)

Use cAdvisor constant for crio imagefs

**What this PR does / why we need it**:
code hygiene to use a constant from cAdvisor

**Release note**:
```release-note
NONE
```
2017-09-12 11:10:10 -07:00
Clayton Coleman a5ac80cbce Log at higher verbosity levels some common SyncPod errors 2017-09-12 10:52:31 -04:00
Kubernetes Submit Queue d8847a8f1d Merge pull request #52119 from mtaufen/sync-files
Automatic merge from submit-queue

fsync config checkpoint files after writing

@yujuhong brought up that it's possible for a hard reboot to result in empty checkpoint files, if they haven't been synced to disk yet. This PR ensures that Kubelet configuration checkpoints are synced after writing to avoid this issue.
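
A minimal sketch of writing a file and syncing it before close, using only the Go standard library. `writeAndSync` and `roundTrip` are hypothetical helpers for illustration, not the Kubelet's checkpoint code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAndSync writes data to path and flushes it to stable storage
// before closing, surfacing any Sync error to the caller.
func writeAndSync(path string, data []byte) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0600)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	// Without Sync, a hard reboot could leave an empty or partial file.
	if err := f.Sync(); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// roundTrip writes data to a temporary checkpoint file and reads it back.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "checkpoint")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "kubelet-config")
	if err := writeAndSync(path, []byte(data)); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := roundTrip("example checkpoint")
	fmt.Println(got, err)
}
```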

fixes #52222

**Release note**:
```release-note
NONE
```
2017-09-12 05:41:25 -07:00
Kubernetes Submit Queue 01154dd3cf Merge pull request #51870 from feiskyer/sandbox-creds
Automatic merge from submit-queue (batch tested with PRs 52264, 51870)

Use credentials from providers for docker sandbox image

**What this PR does / why we need it**:

Sandbox image lookup uses creds from docker config only; other credential providers are ignored. This is a regression introduced in dockershim.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51293

**Special notes for your reviewer**:

Should also cherry-pick this to release-1.6 and release-1.7.

**Release note**:

```release-note
Fix credentials providers for docker sandbox image.
```
2017-09-12 02:10:24 -07:00
yanxuean 799d0e5a6e correct to handler 2017-09-12 13:47:08 +08:00
Derek Carr cf2c688385 Use cAdvisor constant for crio imagefs 2017-09-11 14:08:00 -04:00
Derek Carr da01c6d3a2 Ignore pods for quota that exceed deletion grace period 2017-09-11 13:31:52 -04:00
Yu-Ju Hong aaf26b2eaa dockershim: remove support for legacy containers
The code was first introduced in 1.6 to help pre-CRI-kubelet upgrade to
using the CRI implementation. It can safely be removed now.
2017-09-11 08:44:27 -07:00
xiangpengzhao 0484a1c2c5 Remove backward compatibility of hostportChainName 2017-09-10 00:24:00 +08:00
Kubernetes Submit Queue d6df4a5127 Merge pull request #52063 from mtaufen/dkcfg-e2enode
Automatic merge from submit-queue (batch tested with PRs 52047, 52063, 51528)

Improve dynamic kubelet config e2e node test and fix bugs

Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.

Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.

This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.

Related issue: #50217

I had been sitting on this until the cAdvisor fix merged in #51751, as these tests fail without that fix.

**Release note**:

```release-note
NONE
```
2017-09-08 16:06:56 -07:00
Pengfei Ni 4d5d97438b Use credentials from providers for docker sandbox image 2017-09-09 07:02:04 +08:00
Kubernetes Submit Queue 943817f57b Merge pull request #52047 from balajismaniam/cpuman-large-topo-test
Automatic merge from submit-queue

Added large topology tests for static policy in CPU Manager.

**What this PR does / why we need it**: This PR adds a very large topology test case for the CPU Manager feature.

Related to #51180. 

CC @ConnorDoyle
2017-09-08 15:57:41 -07:00
Kevin f50761c9d4 fix prober ticking shift for kubelet restarted cases 2017-09-08 17:31:02 +08:00
Yu-Ju Hong a850614613 dockershim: check if f.Sync() returns an error and surface it 2017-09-07 16:05:02 -07:00
Michael Taufen a846ba191c Improve dynamic kubelet config e2e node test and fix bugs
Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.

Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.

This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.

Related issue: #50217
2017-09-07 15:50:17 -07:00
Michael Taufen 47beb80368 fsync config checkpoint files after writing 2017-09-07 14:42:18 -07:00
Kubernetes Submit Queue ae6b329368 Merge pull request #51644 from sjenning/init-container-status-fix
Automatic merge from submit-queue (batch tested with PRs 51239, 51644, 52076)

do not update init containers status if terminated

fixes #29972 #41580

This fixes an issue where, if a completed init container is removed while the pod or subsequent init containers are still running, the status for that init container will be reset to `Waiting` with `PodInitializing`.  

This can manifest in a number of ways.

If the init container is removed while the main pod containers are running, the status will be reset with no functional problem, but it will be reported incorrectly, for example in `kubectl get pod`.

If the init container is removed while a subsequent init container is running, the init container will be **re-executed**, leading to all manner of badness.

@derekwaynecarr @bparees
2017-09-07 14:31:23 -07:00
Derek Carr 27365eb900 Fix cross-build 2017-09-07 09:53:52 -04:00
Kubernetes Submit Queue a51eb2ac4e Merge pull request #49202 from cbonte/node-addresses
Automatic merge from submit-queue (batch tested with PRs 51728, 49202)

Fix setNodeAddress when a node IP and a cloud provider are set

**What this PR does / why we need it**:
When a node IP is set and a cloud provider returns the same address with
several types, only the first address was accepted. With the changes made
in PR #45201, the vSphere cloud provider returned the ExternalIP first,
which led to a node without any InternalIP.

The behaviour is modified to return all the address types for the
specified node IP.
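
A minimal sketch of the behaviour described above, not the actual `setNodeAddress` code: `NodeAddress` and `addressesForNodeIP` are hypothetical names. Every cloud-provider entry that matches the node IP is kept, whatever its type, so an `InternalIP` entry is not lost when an `ExternalIP` entry with the same address comes first.

```go
package main

import "fmt"

// NodeAddress is a simplified, hypothetical stand-in for the API type.
type NodeAddress struct {
	Type    string
	Address string
}

// addressesForNodeIP returns every cloud-provider entry whose address
// equals nodeIP, regardless of its type, instead of only the first match.
func addressesForNodeIP(nodeIP string, cloud []NodeAddress) []NodeAddress {
	var out []NodeAddress
	for _, a := range cloud {
		if a.Address == nodeIP {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	cloud := []NodeAddress{
		{Type: "ExternalIP", Address: "10.0.0.1"},
		{Type: "InternalIP", Address: "10.0.0.1"},
		{Type: "InternalIP", Address: "10.0.0.2"},
	}
	for _, a := range addressesForNodeIP("10.0.0.1", cloud) {
		fmt.Println(a.Type, a.Address)
	}
}
```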

**Which issue this PR fixes**: fixes #48760

**Special notes for your reviewer**:
* I'm not a golang expert, is it possible to mock `kubelet.validateNodeIP()` to avoid the need of real host interface addresses in the test ?
* It would be great to have it backported for a next 1.6.8 release.

**Release note**:
```release-note
NONE
```
2017-09-06 20:01:00 -07:00
Kubernetes Submit Queue b6545a086c Merge pull request #51728 from derekwaynecarr/cadvisor-stats
Automatic merge from submit-queue (batch tested with PRs 51728, 49202)

Enable CRI-O stats from cAdvisor

**What this PR does / why we need it**:
cAdvisor may support multiple container runtimes (docker, rkt, cri-o, systemd, etc.)

As long as the kubelet continues to run cAdvisor, runtimes with native cAdvisor support may not want to run multiple monitoring agents, to avoid performance regression in production.  Pending kubelet running a more light-weight monitoring solution, this PR allows remote runtimes to have their stats pulled from cAdvisor when cAdvisor is the registered stats provider, as determined by introspection of the runtime endpoint.

See issue https://github.com/kubernetes/kubernetes/issues/51798

**Special notes for your reviewer**:
cAdvisor will be bumped to pick up https://github.com/google/cadvisor/pull/1741

At that time, CRI-O will support fetching stats from cAdvisor.

**Release note**:
```release-note
NONE
```
2017-09-06 20:00:57 -07:00
Joel Smith 58ae5a78f9 Clean up kubelet secret and configmap unit test
* Expected value comes before actual value in assert.Equal()
* Use assert.Equal() instead of assert.True() when possible
* Add a unit test that verifies no-op pod updates to the
  secret_manager and the configmap_manager
* Add a clarifying comment about why it's good to seemingly
  delete a secret on updates.
* Fix (for now, non-buggy) variable shadowing issue
2017-09-06 16:38:01 -06:00
Balaji Subramaniam e2cb80db4a Added large topology tests for static policy in CPU Manager.
- Added comments for tests cases.
2017-09-06 13:15:22 -07:00
David Ashpole d60d4a4420 soft eviction timer works 2017-09-06 13:01:49 -07:00
Yang Guo dfea03d920 Implement StatsProvider using CRI stats 2017-09-06 09:11:56 -07:00
Kubernetes Submit Queue dcc1aa0628 Merge pull request #51928 from mindprince/pr-45724-fix-build
Automatic merge from submit-queue

Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.

This was broken in #45724

**Release note**:
```release-note
NONE
```
/sig storage
/sig node

/cc @jsafrane, @vishh
2017-09-05 19:44:54 -07:00
Kubernetes Submit Queue e8d99f5839 Merge pull request #51645 from jingxu97/Aug/nameserver
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)

Set up DNS server in containerized mounter path

An NFS/GlusterFS mount requires a DNS server in order to resolve the
service name. This PR gets the DNS server IP from the kubelet and
adds it to the containerized mounter path, so that service names can
be resolved during mount when the containerized mounter is used.
**Release note**:

```release-note
Allow DNS resolution of service name for COS using containerized mounter.  It fixed the issue with DNS resolution of NFS and Gluster services.
```
2017-09-05 17:30:09 -07:00
Kubernetes Submit Queue 99aa992ce8 Merge pull request #51751 from dashpole/update_cadvisor_godep
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)

Update Cadvisor Dependency

Fixes: https://github.com/kubernetes/kubernetes/issues/51832
This is the worst dependency update ever... 
The root of the problem is the [name change of Sirupsen -> sirupsen](https://github.com/sirupsen/logrus/issues/570#issuecomment-313933276).  This means that in order to update cadvisor, which vendors the lowercase version, we need to update all dependencies to use the lower-cased version.  With that being said, this PR updates the following packages:

`github.com/docker/docker`
- `github.com/docker/distribution`
  - `github.com/opencontainers/go-digest`
  - `github.com/opencontainers/image-spec`
  - `github.com/opencontainers/runtime-spec`
  - `github.com/opencontainers/selinux`
  - `github.com/opencontainers/runc`
    - `github.com/mrunalp/fileutils`
  - `golang.org/x/crypto`
    - `golang.org/x/sys`
- `github.com/docker/go-connections`
- `github.com/docker/go-units`
- `github.com/docker/libnetwork`
- `github.com/docker/libtrust`
- `github.com/sirupsen/logrus`
- `github.com/vishvananda/netlink`

`github.com/google/cadvisor`
- `github.com/euank/go-kmsg-parser`

`github.com/json-iterator/go`

Fixed https://github.com/kubernetes/kubernetes/issues/51832

```release-note
Fix journalctl leak on kubelet restart
Fix container memory rss
Add hugepages monitoring support
Fix incorrect CPU usage metrics with 4.7 kernel
Add tmpfs monitoring support
```
2017-09-05 17:30:06 -07:00
Kubernetes Submit Queue 78c820803c Merge pull request #50350 from dashpole/eviction_container_deletion
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)

Wait for container cleanup before deletion

We should wait to delete pod API objects until the pod's containers have been cleaned up. See issue: #50268 for background.

This changes the kubelet container gc, which deletes containers belonging to pods considered "deleted".
It adds two conditions under which a pod is considered "deleted", allowing containers to be deleted:
- Pods where deletionTimestamp is set, and containers are not running
- Pods that are evicted

This PR also changes the function PodResourcesAreReclaimed by making it return false if containers still exist.
The eviction manager will wait for containers of previous evicted pod to be deleted before evicting another pod.
The status manager will wait for containers to be deleted before removing the pod API object.
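
The two "deleted" conditions listed above can be sketched roughly as follows. This is an illustration under assumptions, not the kubelet's code: `podState` and `consideredDeleted` are hypothetical names, and the real check works on API objects and runtime status.

```go
package main

import "fmt"

// podState is a hypothetical, flattened view of what the container GC
// would need to know about a pod.
type podState struct {
	DeletionTimestampSet bool // the pod API object has a deletionTimestamp
	RunningContainers    int  // containers still running for this pod
	Evicted              bool // the pod was evicted
}

// consideredDeleted reports whether the pod's containers may be removed:
// either the pod is evicted, or it is being deleted and nothing is running.
func consideredDeleted(p podState) bool {
	if p.Evicted {
		return true
	}
	return p.DeletionTimestampSet && p.RunningContainers == 0
}

func main() {
	fmt.Println(consideredDeleted(podState{DeletionTimestampSet: true, RunningContainers: 1}))
	fmt.Println(consideredDeleted(podState{DeletionTimestampSet: true}))
	fmt.Println(consideredDeleted(podState{Evicted: true, RunningContainers: 2}))
}
```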

/assign @vishh
2017-09-05 17:30:03 -07:00
Rohit Agarwal 18d25bf4ba Add an OWNERS file for deviceplugin package. Update OWNERS file for gpu package. 2017-09-05 13:46:13 -07:00
Kubernetes Submit Queue 8b9e8cf80a Merge pull request #51744 from jiayingz/deviceplugin-checkpoint
Automatic merge from submit-queue (batch tested with PRs 50072, 51744)

Deviceplugin checkpoint

**What this PR does / why we need it**:
Extends on top of PR 51209 to checkpoint device to pod allocation information on Kubelet to recover from Kubelet restarts.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-05 13:33:01 -07:00
David Ashpole e5a6a79fd7 update cadvisor, docker, and runc godeps 2017-09-05 12:38:57 -07:00
Jing Xu 3d4bc931d3 Set up DNS server in containerized mounter path
An NFS/GlusterFS mount requires a DNS server in order to resolve the
service name. This PR gets the DNS server IP from the kubelet and
adds it to the containerized mounter path, so that service names can
be resolved during mount when the containerized mounter is used.
2017-09-05 11:40:23 -07:00
Jiaying Zhang 3b2bc58c11 Extends device_plugin_handler to checkpoint device to container allocation information. 2017-09-05 09:52:14 -07:00
Derek Carr 38d5dee677 Node validation restricts pre-allocated hugepages to single page size 2017-09-05 10:34:30 -04:00
Derek Carr 1ec2a69d9a Kubelet changes to support hugepages 2017-09-05 09:46:08 -04:00
Rohit Agarwal 08ea02b9a5 Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.
This was broken in #45724
2017-09-04 21:48:55 -07:00
saadali 3b834cf665 Modify VolumeZonePredicate to handle multi-zone PV
Modifies the VolumeZonePredicate to handle a PV that belongs to more
than one zone or region. This is indicated by the zone or region label
value containing a comma separated list.
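
A rough sketch of the matching idea, taking the comma separator from this message at face value (the merged code may use a different delimiter). `zoneMatches` is a hypothetical helper, not the predicate's real API.

```go
package main

import (
	"fmt"
	"strings"
)

// zoneMatches reports whether nodeZone appears in the PV's zone (or region)
// label value, which this sketch assumes is a comma-separated list.
func zoneMatches(pvZoneLabel, nodeZone string) bool {
	if nodeZone == "" {
		return false
	}
	for _, z := range strings.Split(pvZoneLabel, ",") {
		if strings.TrimSpace(z) == nodeZone {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(zoneMatches("us-east-1a,us-east-1b", "us-east-1b"))
	fmt.Println(zoneMatches("us-east-1a", "us-east-1c"))
}
```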
2017-09-04 20:13:32 -07:00
David Ashpole 9ac30e2c28 wait for container cleanup before deletion 2017-09-04 17:38:09 -07:00
Balaji Subramaniam 5b5958ecec Add tests for the static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle d0bcbbb437 Added static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle e03a6435bb Added cpu assignment helpers. 2017-09-04 07:24:59 -07:00
Szymon Scharmach 242439c9d7 Add topology helper and tests to cpumanager. 2017-09-04 07:24:59 -07:00
Connor Doyle e4d5565228 Fix Start signature in container_manager_windows. 2017-09-04 07:24:59 -07:00
Connor Doyle 81ccd396d7 Fixed nil InternalContainerLifecycle in cm stubs. 2017-09-04 07:24:59 -07:00
Connor Doyle ec706216e6 Un-revert "CPU manager wiring and `none` policy"
This reverts commit 8d2832021a.
2017-09-04 07:24:59 -07:00
Hemant Kumar e78d433150 Implement necessary API changes
Introduce feature gate for expanding PVs
Add a field to SC
Add new Conditions and feature tag pvc update
Add tests for size update via feature gate
register the resize admission plugin
Update golint failures
2017-09-04 09:02:34 +02:00
Kubernetes Submit Queue 034c40be6f Merge pull request #51864 from jiayingz/fix-51863
Automatic merge from submit-queue (batch tested with PRs 51845, 51868, 51864)

Fixes a cross-build failure introduced in PR 51209. FYI, issue 51863.

fixes #51863
2017-09-03 21:32:00 -07:00
Kubernetes Submit Queue 6ec80eac1b Merge pull request #51816 from liggitt/xiangpengzhao-remove-initc-anno
Automatic merge from submit-queue

Remove deprecated init-container in annotations

fixes #50655
fixes #51816 
closes #41004

Builds on #50654 and drops the initContainer annotations on conversion to prevent bypassing API server validation/security and targeting version-skewed kubelets that still honor the annotations

```release-note
The deprecated alpha and beta initContainer annotations are no longer supported. Init containers must be specified using the initContainers field in the pod spec.
```
2017-09-03 17:35:11 -07:00
Kubernetes Submit Queue f07279ada2 Merge pull request #51474 from verult/ProberTest
Automatic merge from submit-queue (batch tested with PRs 51805, 51725, 50925, 51474, 51638)

Flexvolume dynamic plugin discovery: Prober unit tests and basic e2e test.

**What this PR does / why we need it**: Tests for changes introduced in PR #50031 .
As part of the prober unit test, I mocked filesystem, filesystem watch, and Flexvolume plugin initialization.
Moved the filesystem event goroutine to watcher implementation.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51147

**Special notes for your reviewer**:
First commit contains added functionality of the mock filesystem.
Second commit is the refactor for moving mock filesystem into a common util directory.
Third commit is the unit and e2e tests.

**Release note**:

```release-note
NONE
```
/release-note-none
/sig storage
/assign @saad-ali @liggitt 
/cc @mtaufen @chakri-nelluri @wongma7
2017-09-03 11:10:05 -07:00
Kubernetes Submit Queue 0554520495 Merge pull request #50938 from cblecker/threshold-crossbuild
Automatic merge from submit-queue (batch tested with PRs 51666, 49829, 51058, 51004, 50938)

Fix threshold notifier build tags

**What this PR does / why we need it**:
Cross building from darwin is currently broken on the following error:
```
# k8s.io/kubernetes/pkg/kubelet/eviction
pkg/kubelet/eviction/threshold_notifier_unsupported.go:25: NewMemCGThresholdNotifier redeclared in this block
        previous declaration at pkg/kubelet/eviction/threshold_notifier_linux.go:38
```
It looks like #49300 broke the build tags introduced in #38630 and #37384. This fixes the build tag on `threshold_notifier_unsupported.go` as the cgo requirement was removed from `threshold_notifier_linux.go`.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50935

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-09-02 22:52:11 -07:00
Jiaying Zhang 29d178fbc3 Fixes a cross-build failure introduced in PR 51209. FYI, issue 51863. 2017-09-02 21:56:39 -07:00
Kubernetes Submit Queue 578195873a Merge pull request #51553 from wongma7/pvc-prometheus
Automatic merge from submit-queue

Expose PVC metrics via kubelet prometheus

This depends on https://github.com/kubernetes/kubernetes/pull/51448, opening early though. second commit is mine and mostly a copy/paste job.

implements metrics listed in here https://github.com/kubernetes/community/pull/855 following method here https://github.com/kubernetes/community/pull/930#issuecomment-325509736

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: https://github.com/kubernetes/features/issues/363

**Special notes for your reviewer**:

**Release note**:

```release-note
PersistentVolumeClaim metrics like "volume_stats_inodes" and "volume_stats_capacity_bytes" are now reported via kubelet prometheus
```
2017-09-02 21:22:43 -07:00
Kubernetes Submit Queue 11a836078d Merge pull request #46444 from jsafrane/node-mount-propagation
Automatic merge from submit-queue (batch tested with PRs 45724, 48051, 46444, 51056, 51605)

Mount propagation in kubelet

Together with #45724 it implements mount propagation as proposed in https://github.com/kubernetes/community/pull/659

There is:

- New alpha annotation that allows user to explicitly set propagation mode for each `VolumeMount` in pod containers (to be replaced with real `VolumeMount.Propagation` field during beta) + validation + tests. "Private" is the default one (= no change to existing pods).

  I know about the proposal for real API fields for alpha features in https://docs.google.com/document/d/1wuoSqHkeT51mQQ7dIFhUKrdi3-1wbKrNWeIL4cKb9zU/edit, but it seems it's not implemented yet. It would save me quite a lot of code and an ugly annotation.

- Updated CRI API to transport chosen propagation to Docker.

- New `kubelet --experimental-mount-propagation` option to enable the previous bullet without modifying types.go (worked around with changing `KubeletDeps`... not nice, but it's better than adding a parameter to `NewMainKubelet` and removing it in the next release...)

```release-note
kubelet has alpha support for mount propagation. It is disabled by default and it is there for testing only. This feature may be redesigned or even removed in a future release.
```

@derekwaynecarr @dchen1107 @kubernetes/sig-node-pr-reviews
2017-09-02 12:11:07 -07:00
Kubernetes Submit Queue 917f9f02ef Merge pull request #45724 from jsafrane/mount-propagation2
Automatic merge from submit-queue

Make /var/lib/kubelet as shared during startup

This is part of ~~https://github.com/kubernetes/community/pull/589~~ https://github.com/kubernetes/community/pull/659

We'd like kubelet to be able to consume mounts from containers in the future, therefore kubelet should make sure that `/var/lib/kubelet` has shared mount propagation to be able to see these mounts. 

On most distros, root directory is already mounted with shared mount propagation and this code will not do anything. On older distros such as Debian Wheezy, this code detects that `/var/lib/kubelet` is a directory on `/` which has private mount propagation and kubelet bind-mounts `/var/lib/kubelet` as rshared.

Both "regular" linux mounter and `NsenterMounter` are updated here.

@kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews 
@vishh 

Release note:
```release-note
Kubelet re-binds /var/lib/kubelet directory with rshared mount propagation during startup if it is not shared yet.
```
2017-09-02 12:00:30 -07:00
Kubernetes Submit Queue ddef5f1ef9 Merge pull request #51575 from derekwaynecarr/fix-stats
Automatic merge from submit-queue (batch tested with PRs 51590, 48217, 51209, 51575, 48627)

Skip system container cgroup stats if undefined

**What this PR does / why we need it**:
the kubelet /stats/summary endpoint tried to look up cgroup stats for containers that are not required.  this polluted logs with messages about not finding stats for "" container.  this pr skips cgroup stats if the cgroup name is not specified (they are optional anyway)

**Special notes for your reviewer**:
i think this was a regression from recent refactor.

**Release note**:
```release-note
NONE
```
2017-09-02 11:12:13 -07:00
Kubernetes Submit Queue 139e52744a Merge pull request #51209 from jiayingz/deviceplugin-jiayingz
Automatic merge from submit-queue (batch tested with PRs 51590, 48217, 51209, 51575, 48627)

Deviceplugin jiayingz

**What this PR does / why we need it**:
This PR implements the kubelet Device Plugin Manager.
It includes four commits implemented by @RenaudWasTaken and a commit that supports allocation.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Design document: kubernetes/community#695
PR tracking: kubernetes/features#368

**Special notes for your reviewer**:

**Release note**:
Extending Kubelet to support device plugin

```release-note
```
2017-09-02 11:12:10 -07:00
Shyam JVS 3bba914496 Revert "Remove deprecated and experimental fields from KubeletConfiguration" 2017-09-02 16:30:56 +02:00
Kubernetes Submit Queue 9b535b06a6 Merge pull request #51307 from mtaufen/kc-type-refactor
Automatic merge from submit-queue (batch tested with PRs 50381, 51307, 49645, 50995, 51523)

Remove deprecated and experimental fields from KubeletConfiguration

As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.

I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP (maybe not v1 in 1.8, given how close we are, but definitely in 1.9).

It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.

fixes: #51657

**Release note**:
```release-note
NONE
```
2017-09-01 16:33:59 -07:00
Kubernetes Submit Queue 0955f3602e Merge pull request #50381 from sczizzo/bugfix-issue-47800
Automatic merge from submit-queue (batch tested with PRs 50381, 51307, 49645, 50995, 51523)

Bugfix: Use local JSON log buffer in parseDockerJSONLog.

**What this PR does / why we need it**:
The issue described in #47800 is due to a race condition in `ReadLogs`: Because the JSON log buffer (`dockerJSONLog`) is package-scoped, any two goroutines modifying the buffer could race and overwrite the other's changes. In particular, one goroutine could unmarshal a JSON log line into the buffer, then another goroutine could `Reset()` the buffer, and the resulting `Stream` would be empty (`""`). This empty `Stream` is caught in a `case` block and raises an `unexpected stream type` error.

This PR creates a new buffer for each execution of `parseDockerJSONLog`, so each goroutine is guaranteed to have a local instance of the buffer.
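
A minimal sketch of the idea, not the actual kubelet code: `jsonLog` and `parseLine` are hypothetical stand-ins for `dockerJSONLog` and `parseDockerJSONLog`. Each call unmarshals into its own local value, so no other goroutine can reset it between the unmarshal and the stream check.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// jsonLog is a hypothetical stand-in for one Docker JSON log entry.
type jsonLog struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
}

// parseLine unmarshals one log line into a buffer that is local to this
// call. Nothing is shared at package scope, so concurrent callers cannot
// overwrite each other's data.
func parseLine(line []byte) (stream, msg string, err error) {
	var entry jsonLog // local, not package-scoped
	if err := json.Unmarshal(line, &entry); err != nil {
		return "", "", err
	}
	switch entry.Stream {
	case "stdout", "stderr":
		return entry.Stream, entry.Log, nil
	default:
		return "", "", fmt.Errorf("unexpected stream type %q", entry.Stream)
	}
}

func main() {
	line := []byte(`{"log":"hello\n","stream":"stdout"}`)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			stream, msg, err := parseLine(line)
			fmt.Printf("%s %q %v\n", stream, msg, err)
		}()
	}
	wg.Wait()
}
```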

**Which issue this PR fixes**: fixes #47800

**Release note**:
```release-note
Fixed an issue (#47800) where `kubectl logs -f` failed with `unexpected stream type ""`.
```
2017-09-01 16:33:56 -07:00
Jing Xu 8f98230f20 Map a resource to multiple signals in eviction manager
It is possible to have multiple signals that point to the same type of
resource, e.g., both SignalNodeFsAvailable and
SignalAllocatableNodeFsAvailable refer to the same resource NodeFs.
Change the map from map[v1.ResourceName]evictionapi.Signal to
map[v1.ResourceName][]evictionapi.Signal
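
A small sketch of the new map shape, using simplified stand-in types rather than the real `v1.ResourceName` and `evictionapi.Signal`; the constant values and the `signalsFor` helper are illustrative assumptions.

```go
package main

import "fmt"

// ResourceName and Signal are simplified stand-ins for the real types.
type ResourceName string
type Signal string

// Illustrative values only; the real constants live in the kubelet packages.
const (
	resourceNodeFs                   ResourceName = "nodefs"
	signalNodeFsAvailable            Signal       = "nodefs.available"
	signalAllocatableNodeFsAvailable Signal       = "allocatableNodeFs.available"
)

// resourceToSignals maps one resource to every signal that refers to it,
// instead of to a single signal.
var resourceToSignals = map[ResourceName][]Signal{
	resourceNodeFs: {signalNodeFsAvailable, signalAllocatableNodeFsAvailable},
}

// signalsFor returns all signals for a resource (nil if none are known).
func signalsFor(r ResourceName) []Signal {
	return resourceToSignals[r]
}

func main() {
	for _, s := range signalsFor(resourceNodeFs) {
		fmt.Println(s)
	}
}
```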
2017-09-01 12:54:37 -07:00
Jan Safranek 03b753daad Implement mount propagation in kubelet 2017-09-01 21:36:33 +02:00
Jan Safranek 0c767355d8 Implement mount propagation in docker shim 2017-09-01 21:36:33 +02:00
Jan Safranek 9a7465a4e2 Add mount propagation to CRI protocol
CRI will blindly obey the Kubelet's decision about what propagation should be
used when.
2017-09-01 21:36:33 +02:00
Jiaying Zhang 02001af752 Kubelet side extension to support device allocation 2017-09-01 11:56:35 -07:00
Renaud Gaubert 7a8ad491ef Alpha feature integration 2017-09-01 11:47:16 -07:00
Renaud Gaubert f7f4515e43 Testing 2017-09-01 11:47:16 -07:00
Renaud Gaubert c4a1c97329 Device Plugin Kubelet integration 2017-09-01 11:47:09 -07:00
Renaud Gaubert b563101efb Added Device Plugin Manager 2017-09-01 11:40:52 -07:00
Matthew Wong dac2068bbd Expose PVC metrics via kubelet prometheus 2017-09-01 12:50:17 -04:00
Shyam JVS 8d2832021a Revert "CPU manager wiring and `none` policy" 2017-09-01 18:17:36 +02:00
Kubernetes Submit Queue c65ab61b3f Merge pull request #51372 from mtaufen/feature-gate-file
Automatic merge from submit-queue (batch tested with PRs 49971, 51357, 51616, 51649, 51372)

Separate feature gates for dynamic kubelet config vs loading from a file

This makes it so these two features can be turned on independently, rather than bundling both under dynamic kubelet config.

fixes: #51664

```release-note
NONE
```
2017-09-01 01:12:47 -07:00
Kubernetes Submit Queue 08ad0127ac Merge pull request #51357 from ConnorDoyle/cpu-manager-wiring-and-nonepolicy
Automatic merge from submit-queue (batch tested with PRs 49971, 51357, 51616, 51649, 51372)

CPU manager wiring and `none` policy

Blocker for CPU manager #49186 (4 of 6)

* Previous PR in this series: #51140
* Next PR in this series: #51180

cc @balajismaniam @derekwaynecarr @sjenning 

**Release note**:

```release-note
NONE
```

TODO:
- [X] In-memory CPU manager state
- [x] Kubelet config value
- [x] Feature gate
- [X] None policy
- [X] Unit tests
- [X] CPU manager instantiation
- [x] Calls into CPU manager from Kubelet container runtime
2017-09-01 01:12:39 -07:00
Kubernetes Submit Queue aa50c0f54c Merge pull request #51490 from NickrenREN/eviction-podLocalEphemeralStorageUsage
Automatic merge from submit-queue (batch tested with PRs 51628, 51637, 51490, 51279, 51302)

Fix pod local ephemeral storage usage calculation

We use podDiskUsage to calculate pod local ephemeral storage, which is not correct because podDiskUsage also includes HostPath volumes, which are considered persistent storage.
This PR fixes it.
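
As a rough illustration of the accounting change (not the eviction manager's code), the sketch below leaves HostPath volumes out of the sum. `volumeUsage` and `localEphemeralUsage` are hypothetical names, and the real fix may select volumes differently.

```go
package main

import "fmt"

// volumeUsage is a hypothetical summary of one pod volume.
type volumeUsage struct {
	Name      string
	HostPath  bool   // true if the volume is backed by a HostPath
	UsedBytes uint64 // bytes currently used by the volume
}

// localEphemeralUsage sums container usage plus volume usage, skipping
// HostPath volumes because they count as persistent storage.
func localEphemeralUsage(containerBytes uint64, vols []volumeUsage) uint64 {
	total := containerBytes
	for _, v := range vols {
		if v.HostPath {
			continue
		}
		total += v.UsedBytes
	}
	return total
}

func main() {
	vols := []volumeUsage{
		{Name: "scratch", UsedBytes: 100},
		{Name: "host-logs", HostPath: true, UsedBytes: 5000},
	}
	fmt.Println(localEphemeralUsage(50, vols))
}
```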
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51489

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```

/assign @jingxu97  @vishh 
cc @ddysher
2017-09-01 00:11:17 -07:00
Kubernetes Submit Queue 17dffc1ef5 Merge pull request #51448 from kastenhq/pvc_ref_volstats
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)

Add PVCRef to VolumeStats

**What this PR does / why we need it**:
For pod volumes that reference a PVC, add a PVCRef to the corresponding
volume stat. This allows metrics to be indexed/queried by PVC name
which is more user-friendly than Pod reference

**Which issue this PR fixes** : [#363](https://github.com/kubernetes/features/issues/363)

**Special notes for your reviewer**:

**Release note**:
```
`VolumeStats` reported by the kubelet stats summary API 
(http://<node>:10255/stats/summary) now include a PVCRef
field describing the PVC referenced by the volume (if any). 
```
2017-08-31 22:09:20 -07:00
Kubernetes Submit Queue b7381c3b03 Merge pull request #51515 from jianglingxia/jlx82918
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)

fix typo about volumes

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-08-31 22:09:13 -07:00
Kubernetes Submit Queue d56b676100 Merge pull request #51408 from feiskyer/magic
Automatic merge from submit-queue (batch tested with PRs 50719, 51216, 50212, 51408, 51381)

Use constants instead of magic string for runtime names

**What this PR does / why we need it**:

Use constants instead of magic string for runtime names.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51678

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-08-31 21:09:17 -07:00
Derek Carr 566f411b08 Support remote runtimes with native cAdvisor support 2017-08-31 16:41:53 -04:00
Connor Doyle 50674ec614 Added cpu-manager-reconcile-period config.
- Defaults to sync-frequency.
2017-08-30 23:42:32 -07:00
Michael Taufen 0e25cbd6a0 Separate feature gates for dynamic kubelet config vs loading from a file 2017-08-30 14:52:37 -07:00
Michael Taufen c18626de4a Remove deprecated and experimental fields from KubeletConfiguration
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.

I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP.

It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
2017-08-30 11:54:21 -07:00
Jing Xu 4d6da1fd9a Change SizeLimit to a pointer
This PR fixes issue #50121
2017-08-30 11:50:35 -07:00
Seth Jennings 3b80b9d518 do not update init containers status if terminated 2017-08-30 13:55:17 -04:00
Jacob Simpson f1fef11b37 Add a kubelet metric to track certificate expiration. 2017-08-30 09:55:40 -07:00
Connor Doyle 7c6e31617d CPU Manager initialization and lifecycle calls. 2017-08-30 08:50:41 -07:00
Connor Doyle 5dee682796 CPU manager config and feature gate. 2017-08-30 08:27:23 -07:00
Balaji Subramaniam 7567f1765f Added CPU manager unit tests (none policy) 2017-08-30 08:26:22 -07:00
Seth Jennings ff471913f9 Added none policy for CPU manager. 2017-08-30 08:26:21 -07:00
Connor Doyle 01d1d8f23f Added in-memory CPU manager state. 2017-08-30 08:26:21 -07:00
Jan Safranek d9500105d8 Share /var/lib/kubernetes on startup
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation so that containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
2017-08-30 16:45:04 +02:00
Kubernetes Submit Queue 99c5295fdd Merge pull request #51140 from ConnorDoyle/cpu-manager-interfaces
Automatic merge from submit-queue (batch tested with PRs 51439, 51361, 51140, 51539, 51585)

CPU manager interfaces.

Please review / merge #51132 first.
Blocker for CPU manager #49186 (3 of 6)

@sjenning @derekwaynecarr
2017-08-30 03:59:32 -07:00
Vaibhav Kamra 1ac56d8cbb Add PVCRef to VolumeStats
For pod volumes that reference a PVC, add a PVCRef to the corresponding
volume stat. This allows metrics to be indexed/queried by PVC name
which is more user-friendly than Pod reference
2017-08-29 23:12:20 -07:00
NickrenREN 9fadd3bd9a Fix pod local ephemeral storage usage 2017-08-30 13:53:54 +08:00
Kubernetes Submit Queue 759ba487b3 Merge pull request #51377 from Random-Liu/streaming-server-stop
Automatic merge from submit-queue

Implement stop function in streaming server.

Implement streaming server stop, so that we could properly stop streaming server.

We need this to properly stop cri-containerd.
2017-08-29 22:33:31 -07:00
Kubernetes Submit Queue aa9417ce91 Merge pull request #49927 from huangjiuyuan/fix-kubelet-option-validation
Automatic merge from submit-queue (batch tested with PRs 49961, 50005, 50738, 51045, 49927)

adding validations on kubelet starting configurations

**What this PR does / why we need it**:
I found some validations of kubelet starting options were missing when I was creating a custom cluster from scratch. The kubelet does not check invalid configurations on `--cadvisor-port`, `--event-burst`, `--image-gc-high-threshold`, etc. I have added some validations in kubelet like validations in `cmd/kube-apiserver/app/options/validation.go`.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
Adds additional validation for kubelet in `pkg/kubelet/apis/kubeletconfig/validation`.
```
2017-08-29 21:43:42 -07:00
Derek Carr ef9b398f4c Skip system container cgroup stats if undefined 2017-08-29 20:34:50 -04:00
Cyril Bonté 2b2a5c6500 Fix setNodeAddress when a node IP and a cloud provider are set
When a node IP is set and a cloud provider returns the same address with
several types, only the first address was accepted. With the changes made
in PR #45201, the vSphere cloud provider returned the ExternalIP first,
which led to a node without any InternalIP.

The behaviour is modified to return all the address types for the
specified node IP.

Issue #48760
2017-08-29 17:09:25 +02:00
Kubernetes Submit Queue 611036c8c3 Merge pull request #51404 from feiskyer/nonewprivs
Automatic merge from submit-queue (batch tested with PRs 51425, 51404, 51459, 51504, 51488)

Admit NoNewPrivs for remote and rkt runtimes

**What this PR does / why we need it**:

#51347 aimed to admit NoNewPrivs for remote container runtimes, but it didn't actually solve the problem. See @miaoyq's comments [here](https://github.com/kubernetes/kubernetes/pull/51347#discussion_r135379446).

This PR always admits NoNewPrivs for runtimes other than docker, which should fix the problem.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

Fixes #51319.

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-08-29 04:15:28 -07:00
jianglingxia 437f4640ca fix typo about volumes 2017-08-29 18:57:57 +08:00
Connor Doyle 726bd8e27b Add CPU manager interfaces. 2017-08-29 03:42:17 -07:00
Kubernetes Submit Queue cc557e61cc Merge pull request #51473 from bboreham/cadvisor-consistent-labels
Automatic merge from submit-queue (batch tested with PRs 51471, 50561, 50435, 51473, 51436)

Fix inconsistent Prometheus cAdvisor metrics

**What this PR does / why we need it**:

We need this because otherwise kubelet is exposing different sets of Prometheus metrics that randomly include or do not include containers.

See also https://github.com/google/cadvisor/issues/1704; quoting here:

Prometheus requires that all metrics in the same family have the same labels, so we arrange to supply blank strings for missing labels

The function `containerPrometheusLabels()` conditionally adds various metric labels from container labels - pod name, image, etc. However, when it receives the metrics, Prometheus [checks](https://github.com/prometheus/client_golang/blob/master/prometheus/registry.go#L665) that all metrics in the same family have the same label set, and [rejects](https://github.com/prometheus/client_golang/blob/master/prometheus/registry.go#L497) those that do not.

Since containers are collected in (somewhat) random order, depending on which kind is seen first you get one set of metrics or the other.

Changing the container labels function to always add the same set of labels, adding `""` when it doesn't have a real value, eliminates the issue in my testing.
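
A minimal sketch of the "always emit the same label set" idea follows. The label names and the `consistentLabels` helper are illustrative assumptions, not the actual `containerPrometheusLabels()` implementation.

```go
package main

import "fmt"

// baseLabelNames is the fixed label set every metric must carry.
// The names here are illustrative only.
var baseLabelNames = []string{"id", "name", "image", "pod_name", "namespace"}

// consistentLabels (hypothetical helper) returns a map containing every name
// in baseLabelNames, using "" where the container supplied no value.
func consistentLabels(known map[string]string) map[string]string {
	out := make(map[string]string, len(baseLabelNames))
	for _, name := range baseLabelNames {
		out[name] = known[name] // a missing key yields the empty string
	}
	return out
}

func main() {
	// A raw cgroup with no pod or image information.
	raw := consistentLabels(map[string]string{"id": "/system.slice"})
	// A pod container with full information.
	pod := consistentLabels(map[string]string{
		"id": "/kubepods/pod1/c1", "name": "c1", "image": "nginx",
		"pod_name": "web-0", "namespace": "default",
	})
	fmt.Println(len(raw) == len(pod)) // both carry the same label names
}
```

Because both kinds of container now produce the same label names, the order in which they are collected no longer decides which label set the metric family ends up with.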

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

Fixes #50151

**Special notes for your reviewer**:

I have made the same fix in two places.  I am 98% sure the one in `cadvisor_linux.go` isn't used and indeed cannot be used, but have not gone fully down that rabbit-hole.

**Release note**:
```release-note
Fix inconsistent Prometheus cAdvisor metrics
```
2017-08-29 02:22:16 -07:00
Kubernetes Submit Queue 7c70decd27 Merge pull request #51312 from andrewsykim/50986
Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)

Deprecation warnings for auto detecting cloud providers

**What this PR does / why we need it**:
Adds deprecation warnings for auto detecting cloud providers. As part of the initiative for out-of-tree cloud providers, this feature is conflicting since we're shifting the dependency of kubernetes core into cAdvisor. In the future kubelets should be using `--cloud-provider=external` or no cloud provider at all. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50986

**Special notes for your reviewer**:
NOTE: I still have to coordinate with sig-node and kubernetes-dev to get approval for this deprecation, I'm only opening this PR since we're close to code freeze and it's something presentable.  

**Release note**:
```release-note
Deprecate auto detecting cloud providers in kubelet. Auto detecting cloud providers goes against the initiative for out-of-tree cloud providers as we'll now depend on cAdvisor integrations with cloud providers instead of the core repo. In the near future, `--cloud-provider` for kubelet will either be an empty string or `external`. 
```
2017-08-29 01:17:37 -07:00
Kubernetes Submit Queue c27cdb11a9 Merge pull request #50932 from yguo0905/stats-cadvisor
Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)

Implement StatsProvider interface using cadvisor

Ref: https://github.com/kubernetes/kubernetes/issues/46984

- This PR changes the `StatsProvider` interface in `pkg/kubelet/server/stats` so that it can provide container stats from either cadvisor or CRI, and the summary API can consume the stats without knowing how they are provided.
- The `StatsProvider` struct in the newly added package `pkg/kubelet/stats` implements part of the `StatsProvider` interface in `pkg/kubelet/server/stats`.
- In `pkg/kubelet/stats`,
    - `stats_provider.go`: implements the node level stats and provides the entry point for this package.
    - `cadvisor_stats_provider.go`: implements the container level stats using cadvisor.
    - `cri_stats_provider.go`: implements the container level stats using CRI.
    - `helper.go`: utility functions shared by the above three components.
- There should be no user-visible behavior change in this PR.
- A follow up PR will implement the StatsProvider interface using CRI.

**Release note**:
```
None
```

/assign @yujuhong 
/assign @WIZARD-CXY
2017-08-29 01:17:29 -07:00
Pengfei Ni fc8736fd97 Admit NoNewPrivs for remote and rkt runtimes 2017-08-29 08:48:30 +08:00
Cheng Xing 8618e28194 Refactoring for filesystem mock move 2017-08-28 16:17:15 -07:00
Bryan Boreham c193bbc7c2 Make Prometheus cAdvisor metrics labels consistent
Prometheus requires that all metrics in the same family have the same
labels, so we arrange to supply blank strings for missing labels

See https://github.com/google/cadvisor/issues/1704
2017-08-28 19:53:18 +00:00
Cheng Xing fde9541c80 Moving filesystem mock to pkg/util, and added some functionality 2017-08-28 11:33:26 -07:00
Kubernetes Submit Queue b8fde17fc2 Merge pull request #48589 from yiqinguo/yiqinguo_add_event
Automatic merge from submit-queue

Record an event when pod sandbox creation fails.

Pods I created failed because their sandbox could not be created, but there was no clear message telling me what the failure was, so I wanted to record an event when sandbox creation fails.

**Release note**:
```release-note
NONE
```
2017-08-28 10:59:53 -07:00
Kubernetes Submit Queue c17d70c240 Merge pull request #47044 from kubermatic/kubelet-update-default-labels
Automatic merge from submit-queue

Always check if default labels on node need to be updated in kubelet

**What this PR does / why we need it**:
Nodes join again but maybe OS/Arch/Instance-Type has changed in the meantime.
In this case the kubelet needs to check if the default labels are still correct and if not it needs to update them.
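
The check described above could look roughly like this sketch. `reconcileDefaultLabels` is a hypothetical helper, and plain maps stand in for the node object; the label keys are only examples.

```go
package main

import "fmt"

// reconcileDefaultLabels (hypothetical helper) adds or overwrites any default
// label whose value on the existing node differs from the freshly detected
// one. It reports whether the node labels were changed.
func reconcileDefaultLabels(existing, defaults map[string]string) bool {
	changed := false
	for key, want := range defaults {
		if got, ok := existing[key]; !ok || got != want {
			existing[key] = want
			changed = true
		}
	}
	return changed
}

func main() {
	// Labels stored on the node object from a previous registration.
	existing := map[string]string{
		"beta.kubernetes.io/os":   "linux",
		"beta.kubernetes.io/arch": "amd64",
		"team":                    "storage", // user label, left untouched
	}
	// Values detected at startup after the machine was rebuilt.
	detected := map[string]string{
		"beta.kubernetes.io/os":   "linux",
		"beta.kubernetes.io/arch": "arm64",
	}
	fmt.Println("needs update:", reconcileDefaultLabels(existing, detected))
	fmt.Println("arch is now:", existing["beta.kubernetes.io/arch"])
}
```

Only keys present in the detected defaults are touched, so labels added by users survive the reconciliation.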

```release-note
Kubelet updates default labels if those are deprecated
```
2017-08-28 08:20:19 -07:00
Kubernetes Submit Queue d5a811a1c8 Merge pull request #51380 from mtaufen/dkcfg-test-file-load
Automatic merge from submit-queue (batch tested with PRs 49861, 50933, 51380, 50688, 51305)

Test loading Kubelet config from a file

**What this PR does / why we need it**:
Adds a test for loading kubelet config from a file, part of improving https://github.com/kubernetes/kubernetes/issues/50217

**Release note**:
```release-note
NONE
```
2017-08-27 22:20:51 -07:00
Kubernetes Submit Queue cbe5f38ed2 Merge pull request #49849 from dixudx/stable_sort_volumesInUse
Automatic merge from submit-queue (batch tested with PRs 49849, 50334, 51414)

make volumesInUse sorted in node status updates

**What this PR does / why we need it**:

`volumesInUse` is not sent in a stable sort order. This will make node status patch requests larger than needed, and makes debugging nodes harder than necessary.
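
As a rough illustration of why the order matters: if the names come from a map, Go iterates them in an unspecified order, so two otherwise identical status updates can differ. Sorting before sending makes the output stable. The `sortedVolumes` helper and the volume names below are assumptions for the sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedVolumes (hypothetical helper) flattens a set of volume names into a
// slice with a stable, sorted order.
func sortedVolumes(inUse map[string]bool) []string {
	names := make([]string, 0, len(inUse))
	for name := range inUse {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	inUse := map[string]bool{
		"kubernetes.io/gce-pd/disk-b": true,
		"kubernetes.io/gce-pd/disk-a": true,
		"kubernetes.io/gce-pd/disk-c": true,
	}
	// Repeated calls always yield the same order, so an unchanged set of
	// volumes produces an unchanged list in the status update.
	fmt.Println(sortedVolumes(inUse))
}
```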

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #49731

**Special notes for your reviewer**:

/cc @derekwaynecarr @jboyd01

**Release note**:

```release-note
make volumesInUse sorted in node status updates
```
2017-08-26 18:09:27 -07:00
Kubernetes Submit Queue 217513e27a Merge pull request #45294 from liggitt/proto-slices
Automatic merge from submit-queue

Remove null -> [] slice hack

Closes #44593

When 1.6 added protobuf storage, the storage layer lost the ability to persist slice fields with empty but non-null values.

As a workaround, we tried to convert empty slice fields to `[]`, rather than `null`. Compressing `null` -> `[]` was just as much of an API breakage as `[]` -> `null`, but was hoped to cause fewer problems in clients that don't do null checks.

Because of conversion optimizations around converting lists of objects, the `null` -> `[]` hack was discovered to only apply to individual get requests, not to a list of objects. 1.6 and 1.7 were released with this behavior, and the world didn't explode. 1.7 documented the breaking API change that `null` and `[]` should be considered equivalent, unless otherwise noted on a particular field.

This PR:

* Reverts the earlier attempt (https://github.com/kubernetes/kubernetes/pull/43422) at ensuring non-null json slice output in conversion
* Makes results of `get` consistent with the results of `list` (which helps naive clients that do deepequal comparisons of objects obtained via list/watch and get), and allows empty slice fields to be returned as `null`

```release-note
Protobuf serialization does not distinguish between `[]` and `null`.
API fields previously capable of storing and returning either `[]` or `null` via JSON API requests (for example, the Endpoints `subsets` field) can now store only `null` when created using the protobuf content-type or stored in etcd using protobuf serialization (the default in 1.6+). JSON API clients should tolerate `null` values for such fields, and treat `null` and `[]` as equivalent in meaning unless specifically documented otherwise for a particular field.
```
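
For a Go client, tolerating both forms mostly means checking length rather than comparing against `nil`. The sketch below uses a simplified `Endpoints` struct (a stand-in, not the real API type) to show that `null` and `[]` decode to different slice values that are nonetheless both empty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Endpoints is a simplified stand-in with a single slice field.
type Endpoints struct {
	Subsets []string `json:"subsets"`
}

// decode parses a JSON document into the simplified Endpoints type.
func decode(raw string) (Endpoints, error) {
	var e Endpoints
	err := json.Unmarshal([]byte(raw), &e)
	return e, err
}

// hasSubsets treats null and [] identically by looking only at the length.
func hasSubsets(e Endpoints) bool {
	return len(e.Subsets) > 0
}

func main() {
	for _, raw := range []string{`{"subsets":null}`, `{"subsets":[]}`} {
		e, err := decode(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s nil=%v hasSubsets=%v\n", raw, e.Subsets == nil, hasSubsets(e))
	}
}
```

A comparison such as `e.Subsets == nil` or a deep-equality check would distinguish the two responses; `len` does not, which is the behaviour the release note asks clients to adopt.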
2017-08-26 13:35:29 -07:00
Michael Taufen 251e8f5f1f Test loading Kubelet config from a file 2017-08-26 12:53:59 -07:00
Kubernetes Submit Queue 9188043c6e Merge pull request #49599 from tcharding/kubelet_test_mock
Automatic merge from submit-queue (batch tested with PRs 51391, 51338, 51340, 50773, 49599)

Remove duplicate code

This PR cleans up Kubelet test code. Adds a function enabling the removal of duplicate code for Mock chaining. Also adds a function to check the pod status, again enabling removal of duplicate code.

Fixes #22470

**Special notes for your reviewer**:

This is my first PR for the Kubernetes project. Keeping it simple.
2017-08-26 08:49:29 -07:00
Kubernetes Submit Queue 98fb8cacf9 Merge pull request #50773 from huzhengchuan/bug/50770
Automatic merge from submit-queue (batch tested with PRs 51391, 51338, 51340, 50773, 49599)

Delete "hugetlb" from whitelistControllers

**What this PR does / why we need it**:
Delete "hugetlb" from whitelistControllers

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50770

**Special notes for your reviewer**:

**Release note**:

```
NONE
```
2017-08-26 08:49:26 -07:00
Pengfei Ni 9dd589c035 Use constants instead of magic string for runtime names 2017-08-26 22:44:27 +08:00
huangjiuyuan 39c61b0967 adding validations on kubelet starting configurations 2017-08-26 22:28:14 +08:00
Kubernetes Submit Queue 932e07af53 Merge pull request #50031 from verult/ConnectedProbe
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)

Dynamic Flexvolume plugin discovery, probing with filesystem watch.

**What this PR does / why we need it**: Enables dynamic Flexvolume plugin discovery. This model uses a filesystem watch (fsnotify library), which notifies the system that a probe is necessary only if something changes in the Flexvolume plugin directory.

This PR uses the dependency injection model in https://github.com/kubernetes/kubernetes/pull/49668.

**Release Note**:
```release-note
Dynamic Flexvolume plugin discovery. Flexvolume plugins can now be discovered on the fly rather than only at system initialization time.
```

/sig-storage

/assign @jsafrane @saad-ali 
/cc @bassam @chakri-nelluri @kokhang @liggitt @thockin
2017-08-26 02:05:34 -07:00
Kubernetes Submit Queue d660a41f36 Merge pull request #51101 from zhangxiaoyu-zidif/refactor-kubelet-kuberuntime-test
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)

Refactor kuberuntime test case with sets.String

**What this PR does / why we need it**:
change to make got and want use sets.String instead, since that is both safe and more clearly shows the intent.

ref: https://github.com/kubernetes/kubernetes/pull/50554

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/51396

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-08-26 02:05:29 -07:00
Kubernetes Submit Queue ea206bbe29 Merge pull request #51347 from Random-Liu/fix-no-new-privs
Automatic merge from submit-queue (batch tested with PRs 50889, 51347, 50582, 51297, 51264)

Fix NoNewPrivs and also allow remote runtime to provide the support.

Fixes https://github.com/kubernetes/kubernetes/issues/51319.

This PR:
1) Lets kubelet admit the remote runtime as a `NoNewPrivs` container runtime.
2) Fixes a `NoNewPrivs` bug that checked the wrong runtime type.

/cc @kubernetes/sig-node-bugs @jessfraz
2017-08-25 22:43:28 -07:00
Kubernetes Submit Queue 76c520cea3 Merge pull request #50889 from NickrenREN/local-storage-eviction
Automatic merge from submit-queue (batch tested with PRs 50889, 51347, 50582, 51297, 51264)

Change eviction manager to manage one single local storage resource

**What this PR does / why we need it**:
We decided to manage one single resource name, so the eviction policy should be modified too.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:  part of #50818

**Special notes for your reviewer**:

**Release note**:
```release-note
Change eviction manager to manage one single local ephemeral storage resource
```

/assign @jingxu97
2017-08-25 22:43:26 -07:00
andrewsykim fd86022714 add deprecation warnings for auto detecting cloud providers 2017-08-25 19:30:52 -04:00
Lantao Liu a0ae7fac2b Implement stop function in streaming server.
Signed-off-by: Lantao Liu <lantaol@google.com>
2017-08-25 23:24:30 +00:00
Lantao Liu b760fa95e5 Fix NoNewPrivs and also allow remote runtime to provide the support. 2017-08-25 21:32:33 +00:00
NickrenREN 27901ad5df Change eviction policy to manage one single local storage resource 2017-08-26 05:14:49 +08:00
Michael Taufen 6918ab1d70 fix ReadOnlyPort, HealthzPort, CAdvisorPort defaulting/documentation
The ReadOnlyPort defaulting prevented passing 0 to disable via
the KubeletConfiguration struct.

The HealthzPort defaulting prevented passing 0 to disable via the
KubeletConfiguration struct. The documentation also failed to mention
this, but the check is performed in code.

The CAdvisorPort documentation failed to mention that you can pass 0 to
disable.
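
The underlying problem is that a defaulting rule of the form "if the port is 0, set the default" cannot tell "unset" apart from "explicitly disabled". One common way to keep 0 meaningful, shown here purely as an illustration and not necessarily the approach this commit took, is to default only when a pointer field is nil. The `config` type and `applyDefaults` function are hypothetical.

```go
package main

import "fmt"

// defaultReadOnlyPort is the port used when the field is left unset.
const defaultReadOnlyPort int32 = 10255

// config is a hypothetical, trimmed-down configuration struct.
type config struct {
	ReadOnlyPort *int32 // nil means "not set"; 0 means "disabled"
}

// applyDefaults fills in the port only when the caller did not set it,
// so an explicit 0 survives defaulting.
func applyDefaults(c *config) {
	if c.ReadOnlyPort == nil {
		p := defaultReadOnlyPort
		c.ReadOnlyPort = &p
	}
}

func main() {
	var unset config
	applyDefaults(&unset)
	fmt.Println("unset:", *unset.ReadOnlyPort)

	zero := int32(0)
	disabled := config{ReadOnlyPort: &zero}
	applyDefaults(&disabled)
	fmt.Println("disabled:", *disabled.ReadOnlyPort)
}
```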
2017-08-25 13:15:36 -07:00
Yang Guo f9767d2f71 Change StatsProvider interface to provide container stats from either cadvisor or CRI and implement this interface using cadvisor 2017-08-25 13:11:26 -07:00
Jordan Liggitt c7defb806f Generated files 2017-08-25 15:01:08 -04:00
Cheng Xing 396c3c7c6f Adding dynamic Flexvolume plugin discovery capability, using filesystem watch. 2017-08-25 11:42:32 -07:00
Kubernetes Submit Queue fe0c519f49 Merge pull request #51132 from ConnorDoyle/cpuset-helpers
Automatic merge from submit-queue (batch tested with PRs 50033, 49988, 51132, 49674, 51207)

Add cpuset helper library.

Blocker for CPU manager #49186 (1 of 6)

@sjenning @derekwaynecarr 

```release-note
NONE
```
2017-08-25 11:07:12 -07:00
xiangpengzhao 8719b4a8ea Remove deprecated init-container in annotations 2017-08-25 13:39:29 +08:00
Serguei Bezverkhi 1be99dd78e Adding fsGroup check before mounting a volume
fsGroup check will be enforcing that if a volume has already been
mounted by one pod and another pod wants to mount it but has a different
fsGroup value, this mount operation will not be allowed.
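
The rule can be sketched as follows, assuming a hypothetical `checkFsGroup` helper and a plain map that records the fsGroup each volume was first mounted with. The real check lives in the volume manager and works on richer state.

```go
package main

import "fmt"

// checkFsGroup (hypothetical helper) enforces that a volume already mounted
// by one pod is only mounted again with the same fsGroup. The mounted map
// records the fsGroup used by the first mount of each volume.
func checkFsGroup(mounted map[string]int64, volume string, fsGroup int64) error {
	existing, ok := mounted[volume]
	if !ok {
		mounted[volume] = fsGroup // first mount: remember its fsGroup
		return nil
	}
	if existing != fsGroup {
		return fmt.Errorf("volume %q is already mounted with fsGroup %d; refusing mount with fsGroup %d",
			volume, existing, fsGroup)
	}
	return nil
}

func main() {
	mounted := map[string]int64{}
	fmt.Println(checkFsGroup(mounted, "data", 1000)) // first mount is recorded
	fmt.Println(checkFsGroup(mounted, "data", 1000)) // same fsGroup is allowed
	fmt.Println(checkFsGroup(mounted, "data", 2000)) // different fsGroup is rejected
}
```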
2017-08-24 17:33:51 -04:00
Kubernetes Submit Queue 9537241702 Merge pull request #47115 from zhangxiaoyu-zidif/add-check-err-for-kubelet
Automatic merge from submit-queue (batch tested with PRs 47115, 51196, 51204, 51208, 51206)

Delete redundant err definition

**What this PR does / why we need it**:
Delete redundant err definition
Line 307 has err definition and initialization.


**Release note**:

```release-note
NONE
```
2017-08-24 07:20:03 -07:00
Kubernetes Submit Queue 73a6ee1dcc Merge pull request #51146 from mtaufen/remove-crashloop-detection
Automatic merge from submit-queue

Remove crash loop "detection" from the dynamic kubelet config feature

**What this PR does / why we need it**:
The subfeature was a cool idea, but in the end it is very complex to
separate Kubelet restarts into crash-loops caused by config vs.
crash-loops caused by other phenomena, like admin-triggered node restarts,
kernel panics, and process babysitter behavior. Dynamic kubelet config
will be better off without the potential for false positives here.

Removing this subfeature also simplifies dynamic configuration by
reducing persistent state:
- we no longer need to track bad config in a file
- we no longer need to track kubelet startups in a file

**Which issue this PR fixes**: fixes #50216 

**Release note**:

```release-note
NONE
```
2017-08-24 05:34:32 -07:00
Kubernetes Submit Queue c041567b5a Merge pull request #46597 from dixudx/implement_proposal_34058
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)

implement proposal 34058: hostPath volume type

**What this PR does / why we need it**:
implement proposal #34058

**Which issue this PR fixes** : fixes #46549

**Special notes for your reviewer**:
cc @thockin @luxas @euank PTAL
2017-08-23 23:16:27 -07:00
Kubernetes Submit Queue ef1b835220 Merge pull request #50646 from rickypai/rpai/hostalias_hostnetwork
Automatic merge from submit-queue

Support HostAlias for HostNetwork Pods

**What this PR does / why we need it**: Currently, HostAlias does not support HostNetwork pods because historically, kubelet only manages hosts file for non-HostNetwork pods. With the recent change in https://github.com/kubernetes/kubernetes/pull/49140, kubelet now manages hosts file for all Pods, which enables HostAlias support also.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48398

**Special notes for your reviewer**: might be easier to review commit-by-commit

**Release note**:

```release-note
HostAlias is now supported for both non-HostNetwork Pods and HostNetwork Pods.
```

@yujuhong @hongchaodeng @thockin
2017-08-23 22:06:27 -07:00
Kubernetes Submit Queue c23e5b604e Merge pull request #51022 from wackxu/fixcodeanno
Automatic merge from submit-queue (batch tested with PRs 50489, 51070, 51011, 51022, 51141)

Fixed code comments that were not updated

**What this PR does / why we need it**:

The comment on the `KubeReserved` arg is out of date and inconsistent with the command-line messages.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #


**Release note**:

```
NONE
```
2017-08-23 19:54:30 -07:00
Kubernetes Submit Queue 178a5ff314 Merge pull request #50665 from xiangpengzhao/hardcode-to-const
Automatic merge from submit-queue (batch tested with PRs 50257, 50247, 50665, 50554, 51077)

Replace hard-code "cpu" and "memory" to consts

**What this PR does / why we need it**:
There are many places using hard-coded "cpu" and "memory" as resource names. This PR replaces them with consts.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
/kind cleanup

**Release note**:

```release-note
NONE
```
2017-08-23 02:35:09 -07:00
Di Xu 5c45db564f implement proposal 34058: hostPath volume type 2017-08-23 14:05:21 +08:00
Connor Doyle 515d86faa0 Add CPUSetBuilder, make CPUSet immutable. 2017-08-22 22:33:04 -07:00
Connor Doyle e686ecb6ea Renamed CPUSet.AsSlice() => CPUSet.ToSlice() 2017-08-22 21:21:26 -07:00
Michael Taufen 76c41a252c Remove crash loop detection from the dynamic kubelet config feature
The subfeature was a cool idea, but in the end it is very complex to
separate Kubelet restarts into crash-loops caused by config vs.
crash-loops caused by other phenomena, like admin-triggered node restarts,
kernel panics, and process babysitter behavior. Dynamic kubelet config
will be better off without the potential for false positives here.

Removing this subfeature also simplifies dynamic configuration by
reducing persistent state:
- we no longer need to track bad config in a file
- we no longer need to track kubelet startups in a file
2017-08-22 12:37:22 -07:00
Kubernetes Submit Queue 09bb8d367a Merge pull request #50712 from dims/create-cadvisor-directory-if-necessary
Automatic merge from submit-queue (batch tested with PRs 51102, 50712, 51037, 51044, 51059)

Create the directory for cadvisor if needed

**What this PR does / why we need it**:

In 6c7245d464, code was added to
bail out if the directory that cadvisor monitored did not exist.

However, this breaks the earlier assumption that kubelet created
directories when needed in pkg/kubelet/kubelet.go's setupDataDirs()
method. setupDataDirs() happens much later, so basically kubelet
exits now.

So since cadvisor really needs this directory, let us just create
it
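
In Go, "create it if needed" is usually a single `os.MkdirAll` call, which creates missing parents and returns nil when the directory already exists. The sketch below uses hypothetical `ensureDir` and `dirExists` helpers and a temporary directory rather than the kubelet's real root path.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDir (hypothetical helper) creates path and any missing parents.
// It is a no-op, returning nil, when the directory already exists.
func ensureDir(path string) error {
	return os.MkdirAll(path, 0750)
}

// dirExists reports whether path exists and is a directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	root, err := os.MkdirTemp("", "kubelet-root-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	dir := filepath.Join(root, "nested", "data")
	if err := ensureDir(dir); err != nil {
		panic(err)
	}
	fmt.Println("exists:", dirExists(dir))
}
```

Creating the directory up front avoids depending on a later setup step having run before the monitored path is first checked.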

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

Fixes #50709

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-08-22 12:27:59 -07:00
Connor Doyle 8f38abb350 Add cpuset helper library. 2017-08-22 11:42:01 -07:00
Kubernetes Submit Queue c6980e7247 Merge pull request #51033 from mtaufen/revert-51008-revert-50789-fix-scheme
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)

Revert "Merge pull request #51008 from kubernetes/revert-50789-fix-scheme"

I'm spinning up a cluster right now to test this fix, but I'm pretty sure this was the problem.
There doesn't seem to be a way to confirm from logs, because AFAICT the logs from the hollow kubelet containers are not collected as part of the kubemark test.

**What this PR does / why we need it**:

This reverts commit f4afdecef8, reversing
changes made to e633a1604f.

This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.

**Which issue this PR fixes**: fixes #51007

**Release note**:

```release-note
NONE
```

/cc @shyamjvs @wojtek-t
2017-08-22 10:48:21 -07:00
zhangxiaoyu-zidif e4ac711dfc Refactor kuberuntime test case with sets.String 2017-08-22 19:43:18 +08:00
Henrik Schmidt 80156474cf Always check if default labels on node need to be updated in kubelet 2017-08-22 12:54:07 +02:00
Kubernetes Submit Queue 198e83588b Merge pull request #46458 from jsafrane/mount-prep
Automatic merge from submit-queue (batch tested with PRs 46458, 50934, 50766, 50970, 47698)

Prepare VolumeHost for running mount tools in containers

This is the first part of implementation of https://github.com/kubernetes/features/issues/278 - running mount utilities in containers.

It updates `VolumeHost` interface:

*  `GetMounter()` now requires the volume plugin name, as it is going to return a different mounter to different volume plugins, because mount utilities for these plugins can be in different places.
* New `GetExec()` method that volume plugins should use to execute any utilities. This new `Exec` interface will execute them in the proper place.
* `SafeFormatAndMount` is updated to the new `Exec` interface.

This is just a preparation: `GetExec` right now leads to simple `os.Exec`, and mount utilities are executed in the same place as before. Also, the volume plugins will be updated in subsequent PRs (split into separate PRs, as some plugins required a lot of changes).

```release-note
NONE
```

@kubernetes/sig-storage-pr-reviews 
@rootfs @gnufied
2017-08-21 18:11:16 -07:00
Kubernetes Submit Queue 0b6bd601ae Merge pull request #50853 from dcbw/cni-conf
Automatic merge from submit-queue (batch tested with PRs 50531, 50853, 49976, 50939, 50607)

cni: print better error when a CNI .configlist is put into a .config

If the admin mistakenly puts a CNI configlist into a "conf" file, that's not correct, but kubelet will still read the config file and then fail to start the pod because "type=".  Be a bit smarter about that.  Should also be fixed in CNI, which I'm doing a PR for as well.
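
One way to give the better error is to look at the parsed JSON before handing it on: a single CNI network config carries a top-level `type`, while a config list carries a `plugins` array. The sketch below is an assumption about how such a check might look, with hypothetical `netConf` and `validateConf` names; it is not the code from this PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// netConf holds only the fields needed to tell a single config from a list.
type netConf struct {
	Name    string            `json:"name"`
	Type    string            `json:"type"`
	Plugins []json.RawMessage `json:"plugins"`
}

// validateConf (hypothetical helper) checks data that was loaded as a single
// network config and explains the likely mistake when "type" is missing.
func validateConf(data []byte) error {
	var c netConf
	if err := json.Unmarshal(data, &c); err != nil {
		return fmt.Errorf("parsing network config: %v", err)
	}
	if c.Type == "" {
		if len(c.Plugins) > 0 {
			return errors.New(`file has a "plugins" list and no "type": it looks like a config list saved as a single config`)
		}
		return errors.New(`network config has no "type"`)
	}
	return nil
}

func main() {
	single := []byte(`{"name":"net","type":"bridge"}`)
	list := []byte(`{"name":"net","plugins":[{"type":"bridge"}]}`)
	fmt.Println(validateConf(single))
	fmt.Println(validateConf(list))
}
```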

@squeed @thockin @freehan
2017-08-21 15:46:17 -07:00
Michael Taufen a90d81620b Revert "Merge pull request #51008 from kubernetes/revert-50789-fix-scheme"
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.

This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
2017-08-21 11:28:05 -07:00
xswack 7bc8411f62 Fixed code comments that were not updated 2017-08-21 20:19:59 +08:00
Shyam JVS 5591914d62 Revert "Don't register the kubeletconfig group with the default Scheme" 2017-08-21 11:15:27 +02:00
Pengfei Ni 4180b79f04 Fix typo in docs of remote package 2017-08-21 09:38:57 +08:00
Davanum Srinivas ca2d5178aa Create the directory for cadvisor if needed
In 6c7245d464, code was added to
bail out if the directory that cadvisor monitored did not exist.

However, this breaks the earlier assumption that kubelet created
directories when needed in pkg/kubelet/kubelet.go's setupDataDirs()
method. setupDataDirs() happens much later, so basically kubelet
exits now.

So since cadvisor really needs this directory, let us just create
it

Fixes #50709
2017-08-19 20:42:50 -04:00
Kubernetes Submit Queue e633a1604f Merge pull request #50789 from mtaufen/fix-scheme
Automatic merge from submit-queue

Don't register the kubeletconfig group with the default Scheme

See https://github.com/kubernetes/kubernetes/pull/49051#discussion_r132527078
2017-08-19 12:15:48 -07:00
Kubernetes Submit Queue b59ad9cbff Merge pull request #50146 from gmarek/deepcopyinto
Automatic merge from submit-queue (batch tested with PRs 46512, 50146)

Make metav1.(Micro)?Time functions take pointers

Is there any reason for those functions not to be on pointers?
2017-08-19 11:28:15 -07:00