Commit Graph

27332 Commits (efa63b7a5877252a3e542bc866cd0fe637933aed)

Author SHA1 Message Date
Kubernetes Submit Queue aec91e112b
Merge pull request #60050 from obnoxxx/glusterfs-comment-typo
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

glusterfs: improve a few comments

**What this PR does / why we need it**:

This PR fixes a couple of comments in the glusterfs module:
* fixes a typo in a comment
* removes an outdated comment that is no longer correct
* updates one comment to refer to upstream Gluster documentation instead of downstream Red Hat product documentation
2018-02-20 03:55:20 -08:00
Kubernetes Submit Queue 5b98dbcfe5
Merge pull request #60008 from k82cn/k8s_54313_2
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Taint node when it is under PID pressure.

Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:
```release-note
If TaintNodesByCondition is enabled, taint the node when it is under PID pressure
```
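
For illustration, a rough Go sketch of what tainting under PID pressure could look like with the core v1 types; the taint key follows the node-condition taint naming convention and is an assumption here, not taken from this PR's code:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// Assumed taint key, following the node-condition taint naming convention.
const pidPressureTaintKey = "node.kubernetes.io/pid-pressure"

// taintForPIDPressure builds the taint a controller could add when the node
// reports the PIDPressure condition as True.
func taintForPIDPressure() v1.Taint {
	return v1.Taint{
		Key:    pidPressureTaintKey,
		Effect: v1.TaintEffectNoSchedule,
	}
}

// hasPIDPressure reports whether the node's status carries the PIDPressure
// condition with status True.
func hasPIDPressure(node *v1.Node) bool {
	for _, c := range node.Status.Conditions {
		if c.Type == v1.NodePIDPressure && c.Status == v1.ConditionTrue {
			return true
		}
	}
	return false
}

func main() {
	node := &v1.Node{}
	node.Status.Conditions = append(node.Status.Conditions, v1.NodeCondition{
		Type:   v1.NodePIDPressure,
		Status: v1.ConditionTrue,
	})
	if hasPIDPressure(node) {
		fmt.Printf("would add taint: %+v\n", taintForPIDPressure())
	}
}
```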
2018-02-20 03:13:28 -08:00
Kubernetes Submit Queue 96ec318718
Merge pull request #59842 from ixdy/update-rules_go-02-2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

 Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies

**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm splitting this effort into two PRs, since updating rules_go requires a large cleanup that removes an attribute from most build rules.

**Release note**:

```release-note
NONE
```
2018-02-19 22:23:05 -08:00
Kubernetes Submit Queue 236fa894df
Merge pull request #57802 from dashpole/allocatable_monitoring
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Monitor the /kubepods cgroup for allocatable metrics

**What this PR does / why we need it**:
The current implementation of allocatable memory evictions sums the usage of pods in order to compute the total usage by user processes.
This PR changes this to instead monitor the `/kubepods` cgroup, which contains all pods, and use this value directly.  This is more accurate than summing pod usage, as it is measured at a single point in time.
This also collects metrics from this cgroup on-demand.
This PR is a precursor to memcg notifications on the `/kubepods` cgroup.
This removes the dependency the eviction manager has on the container manager, and adds a dependency for the summary collector on the container manager (to get the cgroup root).
It also changes the way the allocatable memory eviction signal and threshold are added, bringing them in line with the memory eviction signal, to address #53902.
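
For illustration, a minimal sketch of reading the pods cgroup usage directly, assuming a default cgroup v1 layout; the kubelet itself obtains this value via cAdvisor rather than reading the file:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readKubepodsMemoryUsage reads the current memory usage of the /kubepods
// cgroup from the cgroup v1 memory controller. Reading this single file is a
// point-in-time measurement, unlike summing per-pod usage.
func readKubepodsMemoryUsage() (uint64, error) {
	// Path assumes a default cgroup v1 hierarchy; the kubelet gets this value
	// through cAdvisor instead of reading the file directly.
	data, err := os.ReadFile("/sys/fs/cgroup/memory/kubepods/memory.usage_in_bytes")
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
}

func main() {
	usage, err := readKubepodsMemoryUsage()
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read /kubepods usage:", err)
		return
	}
	fmt.Printf("pods cgroup memory usage: %d bytes\n", usage)
}
```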

**Which issue(s) this PR fixes**:
Fixes #55638
Fixes #53902

**Special notes for your reviewer**:
I have tested this, and can confirm that it works when CgroupsPerQos is set to false.  In this case, it returns node metrics, as it is monitoring the `/` cgroup, rather than the `/kubepods` cgroup (which doesn't exist).

**Release note**:
```release-note
Expose total usage of pods through the "pods" SystemContainer in the Kubelet Summary API
```
cc @sjenning @derekwaynecarr @vishh @kubernetes/sig-node-pr-reviews
2018-02-19 15:13:31 -08:00
Michael Adam 6bf35daca7 glusterfs: refer to upstream gluster documentation
Do not refer to downstream Red Hat documentation
in the upstream kubernetes code, if there is upstream
documentation to refer to.

Signed-off-by: Michael Adam <obnox@redhat.com>
2018-02-19 21:30:02 +01:00
Michael Adam 67adc29f8f glusterfs: fix a comment typo
Signed-off-by: Michael Adam <obnox@redhat.com>
2018-02-19 21:30:02 +01:00
Michael Adam 2ec681632b glusterfs: Remove an outdated comment about GB vs GiB
This was originally added due to a misunderstanding
of the documentation of Heketi (using a different
convention). Heketi's documentation has meanwhile
been clarified.

Signed-off-by: Michael Adam <obnox@redhat.com>
2018-02-19 21:30:02 +01:00
Kubernetes Submit Queue 8d9d0317fc
Merge pull request #60017 from sbezverk/csi_e2e_tests
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fixing CSI e2e test

The current e2e test had some inconsistencies that prevented it from running successfully on a local cluster.
```release-note
Making sure CSI E2E test runs on a local cluster
```
Closes #60016
2018-02-19 04:20:00 -08:00
Kubernetes Submit Queue e267f46c8e
Merge pull request #59986 from nicksardo/mockproject
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

GCE: Fix SelfLink of cloudprovider mocks

**What this PR does / why we need it**:
Allows the user to pass in a ProjectRouter to the mocked services

**Special notes for your reviewer**:
/assign bowei
/cc agau4779  

**Release note**:
```release-note
NONE
```
2018-02-18 18:39:14 -08:00
Serguei Bezverkhi 348a02395d Fixing CSI E2E test 2018-02-17 18:13:06 -05:00
David Ashpole 960856f4e8 collect metrics on the /kubepods cgroup on-demand 2018-02-17 12:32:40 -08:00
Kubernetes Submit Queue 220bdf26b3
Merge pull request #59209 from sbezverk/csi_0.2.0_breaking_changes
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

CSI 0.2.0 breaking changes

Refactor kubernetes CSI bits to support CSI version 0.2.0

```release-note
Addressing breaking changes introduced by new 0.2.0 release of CSI spec
```
2018-02-16 21:27:58 -08:00
Kubernetes Submit Queue 6d0b71740f
Merge pull request #59968 from kubernetes/revert-59323-nodetaint
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Revert "add node shutdown taint"

Reverts kubernetes/kubernetes#59323

Node becomes unready, but is never removed. I've found the following in [kube-controller-manager.log](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-autoscaling/6055/artifacts/bootstrap-e2e-master/cluster-autoscaler.log) from a test run for one such node:

`E0216 01:14:27.084923       1 node_lifecycle_controller.go:686] Error determining if node bootstrap-e2e-minion-group-01b1 shutdown in cloud: failed to get instance ID from cloud provider: instance not found`

This goes on for the rest of the run (~6h). Looks like the node is stuck in Unready state because of this check: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/nodelifecycle/node_lifecycle_controller.go#L684. Previously, there was no such check and the node was removed.

Reverting as this would affect all users attempting to resize their node groups on GCE.

```release-note
NONE
```
2018-02-16 20:12:56 -08:00
Da K. Ma 6bda1bec6e Taint node when it is under PID pressure.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-02-17 10:55:29 +08:00
Da K. Ma 4df591fc5d Updated comments to reference the correct taint flag.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-02-17 10:01:25 +08:00
Kubernetes Submit Queue 270ed995f4
Merge pull request #59841 from dashpole/metrics_after_reclaim
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Reevaluate eviction thresholds after reclaim functions

**What this PR does / why we need it**:
When the node comes under `DiskPressure` due to inodes or disk space, the eviction manager runs garbage collection functions to clean up dead containers and unused images.
Currently, we use the strategy of trying to measure the disk space and inodes freed by garbage collection.  However, as #46789 and #56573 point out, there are gaps in the implementation that can cause extra evictions even when they are not required.  Furthermore, for nodes which frequently cycle through images, it results in a large number of evictions, as running out of inodes always causes an eviction.

This PR changes this strategy to call the garbage collection functions and ignore the results.  Then, it triggers another collection of node-level metrics, and sees if the node is still under DiskPressure.
This way, we can simply observe the decrease in disk or inode usage, rather than trying to measure how much is freed.
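
A minimal sketch of the "reclaim, re-observe, re-check" flow described above; the types are illustrative stand-ins for the eviction manager's real signal observations and reclaim functions:

```go
package main

import "fmt"

// Illustrative stand-ins for the eviction manager's observations and reclaim
// functions.
type signalObservations map[string]int64 // signal name -> available bytes/inodes

type nodeReclaimFunc func() // e.g. dead-container GC or image GC

// reclaimNodeLevelResources runs the reclaim functions, ignores how much they
// report freeing, takes a fresh set of observations, and reports success only
// if the thresholds are no longer exceeded.
func reclaimNodeLevelResources(
	reclaimFuncs []nodeReclaimFunc,
	observe func() signalObservations,
	thresholdsExceeded func(signalObservations) bool,
) bool {
	for _, reclaim := range reclaimFuncs {
		reclaim()
	}
	return !thresholdsExceeded(observe())
}

func main() {
	free := int64(100)
	observe := func() signalObservations {
		return signalObservations{"nodefs.available": free}
	}
	imageGC := func() { free += 900 } // pretend garbage collection freed space
	exceeded := func(o signalObservations) bool { return o["nodefs.available"] < 500 }

	if reclaimNodeLevelResources([]nodeReclaimFunc{imageGC}, observe, exceeded) {
		fmt.Println("pressure relieved by reclaim; no eviction needed")
	} else {
		fmt.Println("still under pressure; proceed to evict pods")
	}
}
```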

**Which issue(s) this PR fixes**:
Fixes #46789
Fixes #56573
Related PR #56575

**Special notes for your reviewer**:
This will look cleaner after #57802  removes arguments from [makeSignalObservations](https://github.com/kubernetes/kubernetes/pull/57802/files#diff-9e5246d8c78d50ce4ba440f98663f3e9R719).

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority important-soon
cc @kubernetes/sig-node-pr-reviews
2018-02-16 16:31:33 -08:00
Kubernetes Submit Queue df92baf6e4
Merge pull request #59874 from dims/log-command-line-flags
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Log the command line flags

**What this PR does / why we need it**:

With d7ddcca231, we lost the logging
of the flags. We should at least log the command line flags
that were used to start processes, as those are incredibly useful for troubleshooting.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
/assign @deads2k 
/assign @liggitt 

**Release note**:

```release-note
NONE
```
2018-02-16 14:22:25 -08:00
Nick Sardo 2410b6576e Pass ProjectRouter to mocks 2018-02-16 13:47:12 -08:00
Jeff Grafton ef56a8d6bb Autogenerated: hack/update-bazel.sh 2018-02-16 13:43:01 -08:00
Kubernetes Submit Queue 930f86574f
Merge pull request #57885 from cimomo/kubelet-fixes
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Improve comments for kubelet

**What this PR does / why we need it**:
Improve comments and fix typos for kubelet.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 13:38:49 -08:00
Kubernetes Submit Queue 244549f02a
Merge pull request #59769 from dashpole/capacity_ephemeral_storage
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Collect ephemeral storage capacity on initialization

**What this PR does / why we need it**:
We have had some node e2e flakes where a pod can be rejected if it requests ephemeral storage.  This is because we don't set capacity and allocatable for ephemeral storage on initialization.
This PR causes cAdvisor to do one round of stats collection during initialization, which will allow it to get the disk capacity when it first sets the node status.
It also sets the node to NotReady if capacities have not been initialized yet.
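
A small sketch of the readiness gating idea, with hypothetical names and an illustrative status message (not the kubelet's actual types):

```go
package main

import "fmt"

// nodeCapacity is a stand-in for the capacity portion of node status.
type nodeCapacity struct {
	ephemeralStorageBytes int64
	initialized           bool
}

// readyCondition reports whether the node should advertise Ready. A node whose
// capacities have not been initialized yet (the first stats round has not
// completed) is reported NotReady, so pods requesting ephemeral storage are
// not placed on it prematurely. The reason string is illustrative.
func readyCondition(c nodeCapacity) (ready bool, reason string) {
	if !c.initialized || c.ephemeralStorageBytes == 0 {
		return false, "missing node capacity for ephemeral-storage"
	}
	return true, ""
}

func main() {
	before := nodeCapacity{}
	after := nodeCapacity{ephemeralStorageBytes: 100 << 30, initialized: true}
	for _, c := range []nodeCapacity{before, after} {
		ready, reason := readyCondition(c)
		fmt.Printf("ready=%v reason=%q\n", ready, reason)
	}
}
```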

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
/assign @jingxu97 @Random-Liu 

/sig node
/kind bug
/priority important-soon
2018-02-16 11:17:02 -08:00
Kubernetes Submit Queue eac5bc0035
Merge pull request #57136 from k82cn/k8s_54313
Automatic merge from submit-queue (batch tested with PRs 57136, 59920). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Updated PID pressure node condition.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:

```release-note
Updated PID pressure node condition
```
2018-02-16 10:35:33 -08:00
Serguei Bezverkhi ea4df51b3b Refactor k8s core csi bits for CSI Spec 0.2.0 2018-02-16 13:29:34 -05:00
Kubernetes Submit Queue d594a13d69
Merge pull request #59954 from msau42/index-sc
Automatic merge from submit-queue (batch tested with PRs 57700, 59954). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Index PVs by StorageClass in assume cache

**What this PR does / why we need it**:
Performance optimization for delayed binding in the scheduler to only search for PVs with a matching StorageClass name.  This means that if you prebind the PV to a PVC, the PV must have a matching StorageClass name.  This behavior is different from when you prebind with immediate binding, which doesn't care about StorageClass.
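
A minimal sketch of the indexing idea, using a simplified stand-in for the scheduler's assume cache:

```go
package main

import "fmt"

// pvAssumeCache is a simplified stand-in for the scheduler's PV assume cache,
// keeping a secondary index from StorageClass name to the PVs using it so that
// delayed binding only has to look at PVs of the matching class.
type pvAssumeCache struct {
	byClass map[string][]string // StorageClass name -> PV names
}

func newPVAssumeCache() *pvAssumeCache {
	return &pvAssumeCache{byClass: map[string][]string{}}
}

func (c *pvAssumeCache) add(pvName, storageClassName string) {
	c.byClass[storageClassName] = append(c.byClass[storageClassName], pvName)
}

// listByStorageClass returns only the PVs of the requested class, instead of
// filtering the whole PV list on every scheduling attempt.
func (c *pvAssumeCache) listByStorageClass(storageClassName string) []string {
	return c.byClass[storageClassName]
}

func main() {
	cache := newPVAssumeCache()
	cache.add("pv-fast-1", "fast")
	cache.add("pv-fast-2", "fast")
	cache.add("pv-slow-1", "slow")

	fmt.Println(cache.listByStorageClass("fast")) // [pv-fast-1 pv-fast-2]
}
```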

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56102

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 09:24:30 -08:00
David Ashpole e0830d0b71 reevaluate eviction thresholds after reclaim functions 2018-02-16 08:35:24 -08:00
Kubernetes Submit Queue 11ecad2629
Merge pull request #59914 from iliastsi/feature-csi-stale-path
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

csi: Remove stale volume path

**What this PR does / why we need it**:

The CSI mounter creates the following paths during SetUp():

   * `.../pods/<podID>/volumes/kubernetes.io~csi/<specVolId>/mount/`
   * `.../pods/<podID>/volumes/kubernetes.io~csi/<specVolId>/volume_data.json`

During TearDown(), it does not remove the `.../kubernetes.io~csi/<specVolId>/`
directory, leaving behind orphaned volume directories; the method cleanupOrphanedPodDirs()
then complains with 'Orphaned pod found, but volume paths are still present
on disk'.

Fix that by removing the above directory in removeMountDir().
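
A rough sketch of the teardown fix, assuming the paths above and that the mount directory is already unmounted and empty; this is not the PR's actual removeMountDir() implementation:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeMountDir removes the volume's mount directory and, once its siblings
// are gone, the parent .../<specVolId>/ directory as well, so no orphaned
// directories are left behind. It assumes the mount dir is already unmounted
// and empty.
func removeMountDir(mountPath string) error {
	if err := os.Remove(mountPath); err != nil && !os.IsNotExist(err) {
		return err
	}
	specVolDir := filepath.Dir(mountPath) // .../kubernetes.io~csi/<specVolId>/
	// volume_data.json lives next to the mount dir; drop it before the parent.
	if err := os.Remove(filepath.Join(specVolDir, "volume_data.json")); err != nil && !os.IsNotExist(err) {
		return err
	}
	return os.Remove(specVolDir)
}

func main() {
	base, _ := os.MkdirTemp("", "csi-demo")
	mount := filepath.Join(base, "vol-1", "mount")
	_ = os.MkdirAll(mount, 0o755)
	_ = os.WriteFile(filepath.Join(base, "vol-1", "volume_data.json"), []byte("{}"), 0o644)

	if err := removeMountDir(mount); err != nil {
		fmt.Println("teardown failed:", err)
		return
	}
	fmt.Println("mount dir and its parent removed")
}
```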

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 08:04:32 -08:00
Kubernetes Submit Queue 0e81651e77
Merge pull request #59909 from jsafrane/volumemanager-approvers
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Add jsafrane as AWS approver.

**What this PR does / why we need it**:
I contributed several PRs in AWS storage and I'm willing to share review/approval duty.

**Release note**:

```release-note
NONE
```

/assign @justinsb
2018-02-16 06:02:01 -08:00
Kubernetes Submit Queue 1db96b0e9b
Merge pull request #59668 from brycecarman/ccm-iam-role
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Add AWS cloud provider option for IAM role

**What this PR does / why we need it**:
Adds the option to provide, in the AWS cloud provider config file, an IAM role ARN that should be assumed when communicating with the AWS APIs.
For example, this allows running the Controller Manager in an account separate from the worker nodes, while still allowing all resources created to interact with the workers. ELBs created would be in the same account as the worker nodes, for instance.
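
A hedged sketch of how role assumption could be wired with aws-sdk-go's stscreds helper; the role ARN is a placeholder, and the config option name introduced by this PR is not shown here:

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials/stscreds"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	// The role ARN would come from the cloud provider config file; this value
	// is a placeholder.
	roleARN := "arn:aws:iam::123456789012:role/example-k8s-cloud-provider"

	// Start from the EC2 instance role (the default credential chain), then
	// assume the configured role for all subsequent AWS API calls.
	sess := session.Must(session.NewSession())
	creds := stscreds.NewCredentials(sess, roleARN)

	ec2Client := ec2.New(sess, &aws.Config{Credentials: creds})
	_ = ec2Client // subsequent EC2/ELB calls are made with the assumed role
	fmt.Println("AWS clients configured to assume:", roleARN)
}
```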

**Which issue(s) this PR fixes** *(optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged)*:
Fixes #59526

**Special notes for your reviewer**:
None

**Release note**:

```release-note
Add AWS cloud provider option to use an assumed IAM role 
```
2018-02-16 06:01:45 -08:00
Aleksandra Malinowska 2d54ba3e0f
Revert "add node shutdown taint" 2018-02-16 12:24:27 +01:00
Kubernetes Submit Queue 01ec7a9eb8
Merge pull request #59809 from phsiao/59733_port_forward_with_target_port
Automatic merge from submit-queue (batch tested with PRs 59809, 59955). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

kubectl port-forward should resolve service port to target port

**What this PR does / why we need it**:

Continuing the work in #59705, this PR adds additional support for looking up the targetPort for a service, as well as enabling the use of svc/name to select a pod.

**Which issue(s) this PR fixes**:
Fixes #15180
Fixes #59733

**Special notes for your reviewer**:

I decided to create pkg/kubectl/util/service_port.go to contain two functions that might be re-usable.
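
A simplified sketch of the kind of lookup such a helper performs (resolving a numeric or named targetPort against the selected pod); the function name and shape here are assumptions, not the PR's exact API:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// lookupContainerPortNumberByServicePort resolves a Service port number to the
// pod container port it targets.
func lookupContainerPortNumberByServicePort(svc v1.Service, pod v1.Pod, port int32) (int32, error) {
	for _, sp := range svc.Spec.Ports {
		if sp.Port != port {
			continue
		}
		if sp.TargetPort.Type == intstr.Int {
			return sp.TargetPort.IntVal, nil
		}
		// Named target port: resolve the name against the pod's containers.
		for _, c := range pod.Spec.Containers {
			for _, cp := range c.Ports {
				if cp.Name == sp.TargetPort.StrVal {
					return cp.ContainerPort, nil
				}
			}
		}
	}
	return 0, fmt.Errorf("service port %d not found", port)
}

func main() {
	svc := v1.Service{Spec: v1.ServiceSpec{Ports: []v1.ServicePort{
		{Port: 443, TargetPort: intstr.FromString("https")},
	}}}
	pod := v1.Pod{Spec: v1.PodSpec{Containers: []v1.Container{
		{Ports: []v1.ContainerPort{{Name: "https", ContainerPort: 8443}}},
	}}}

	target, err := lookupContainerPortNumberByServicePort(svc, pod, 443)
	fmt.Println(target, err) // 8443 <nil>
}
```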

**Release note**:
```release-note
`kubectl port-forward` now supports specifying a service to port forward to: `kubectl port-forward svc/myservice 8443:443`
```
2018-02-15 22:42:30 -08:00
Bryce Carman 3b99e1b487 Add AWS cloud provider option for IAM role
Currently the AWS cloud provider uses the EC2 instance role when
interacting with AWS APIs. This change gives the option to provide an IAM
role that the cloud provider will assume before calling the APIs. All
resources created by the role will be owned by that account instead of
the account where the EC2 instance is running.
2018-02-15 21:14:58 -08:00
Kubernetes Submit Queue c105796e4b
Merge pull request #59953 from Random-Liu/fix-pod-scheduled
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix pod scheduled.

Fix `PodScheduled` condition.

The test `[k8s.io] EquivalenceCache [Serial] validates pod affinity works properly when new replica pod is scheduled` for cri-containerd is flaky.
The reason is that it assumes all existing pods should have the `PodScheduled` condition, but that is not the case:
```
Feb 15 15:31:01.359: INFO: with-label-390d246e-1265-11e8-beb8-0a580a3c7b55       bootstrap-e2e-minion-group-l6qw  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:30:59 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:31:00 +0000 UTC  } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:30:59 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-7mzxc                                     bootstrap-e2e-minion-group-hztx  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 14:17:05 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 14:17:59 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-kvrsx                                     bootstrap-e2e-minion-group-l6qw  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:24:54 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:25:20 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-llwjh        
```

I'm not sure why this doesn't happen with docker. One theory is that we don't prepull images in cri-containerd, and we start pods a bit faster with cri-containerd, which exposes the race condition.

/cc @kubernetes/sig-node-bugs 
Signed-off-by: Lantao Liu <lantaol@google.com>



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2018-02-15 20:16:44 -08:00
Kubernetes Submit Queue 99c87cf679
Merge pull request #59923 from jsafrane/volumemanager-logs
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Rework volume manager log levels

- all normal logs go to level 4
- too frequent / duplicate logs go to level 5 (e.g. when something else logged a similar message not far away).

I checked that there is no excessive spam in the log - reconciler runs every 100ms, but it does not log anything if there is nothing to do.
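
For reference, a tiny glog sketch of the level split (message text and the exact call sites are illustrative, not taken from this PR):

```go
package main

import (
	"flag"

	"github.com/golang/glog"
)

func main() {
	flag.Parse() // glog registers -v; e.g. run with -logtostderr -v=4 or -v=5

	// Normal reconciler-style message: visible at -v=4 and above.
	glog.V(4).Infof("Reconciler: start to sync state")

	// High-frequency / near-duplicate message: only at -v=5 and above.
	glog.V(5).Infof("Volume %q already mounted, nothing to do", "vol-1")
}
```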

**What this PR does / why we need it**:
This will help us debug flakes. E2e test runs do not capture log levels 10-12, which the volume manager used previously.

**Release note**:

```release-note
NONE
```

/sig storage
/sig node
cc: @jingxu97 @sjenning
2018-02-15 20:16:38 -08:00
Kubernetes Submit Queue c7c5d89e32
Merge pull request #59873 from jsafrane/fix-downward-flake
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix DownwardAPI refresh race.

WaitForAttachAndMount should only mark the pod in the DesiredStateOfWorldPopulator (DSWP), and the DSWP should mark the volume to be remounted only when the new pod has been processed.

Otherwise the DSWP and the reconciler race over which gets the new pod first. If it's the reconciler, the DownwardAPI and Projected volumes of the pod are not refreshed with new content; they are only updated after the next periodic sync (60-90 seconds).

Fixes #59813 

/assign @jingxu97 @saad-ali 
/sig storage
/sig node

```release-note
None
```
2018-02-15 20:16:32 -08:00
Kubernetes Submit Queue bfdd94c6a0
Merge pull request #59170 from cofyc/fix_kubelet_volume_metrics
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix kubelet PVC stale metrics

**What this PR does / why we need it**:

The set of volumes on each node changes, so we should not just add PVC metrics into a
gauge vector. It's better to use a collector to collect metrics from internal
stats.

Currently, if a PV (bound to a PVC `testpvc`) is attached and used by node A, then later migrated to node B or simply deleted from node A, the `testpvc` metrics will not disappear from the kubelet on node A. After a long running time, the `kubelet` process will keep a lot of stale volume metrics in memory.

For these dynamic metrics, it's better to use a collector to collect metrics from a data source (`StatsProvider` here), like [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) scraping metrics from kube-apiserver.
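
A minimal sketch of such a collector with Prometheus client_golang; the metric name and the stats interface are stand-ins, not the kubelet's actual volume stats collector:

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// volumeStatsProvider is a stand-in for the kubelet's StatsProvider.
type volumeStatsProvider interface {
	ListVolumeUsedBytes() map[string]float64 // "namespace/pvc" -> used bytes
}

// volumeStatsCollector regenerates metrics from the provider on every scrape,
// so PVCs no longer mounted on this node simply stop appearing - unlike a
// GaugeVec, which keeps stale series until they are deleted manually.
type volumeStatsCollector struct {
	provider volumeStatsProvider
	usedDesc *prometheus.Desc
}

func newVolumeStatsCollector(p volumeStatsProvider) *volumeStatsCollector {
	return &volumeStatsCollector{
		provider: p,
		usedDesc: prometheus.NewDesc(
			"volume_stats_used_bytes", // illustrative metric name
			"Bytes used by the volume backing the PVC.",
			[]string{"persistentvolumeclaim"}, nil,
		),
	}
}

func (c *volumeStatsCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.usedDesc
}

func (c *volumeStatsCollector) Collect(ch chan<- prometheus.Metric) {
	for pvc, used := range c.provider.ListVolumeUsedBytes() {
		ch <- prometheus.MustNewConstMetric(c.usedDesc, prometheus.GaugeValue, used, pvc)
	}
}

type fakeProvider map[string]float64

func (f fakeProvider) ListVolumeUsedBytes() map[string]float64 { return f }

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(newVolumeStatsCollector(fakeProvider{"default/testpvc": 1 << 20}))

	mfs, _ := reg.Gather()
	for _, mf := range mfs {
		fmt.Println(mf.GetName(), len(mf.GetMetric()), "series")
	}
}
```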

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/57686

**Special notes for your reviewer**:

**Release note**:

```release-note
Fix kubelet PVC stale metrics
```
2018-02-15 18:44:08 -08:00
David Ashpole b259543985 collect ephemeral storage capacity on initialization 2018-02-15 17:33:22 -08:00
Michelle Au 5271edd9e2 Index PVs by StorageClass in assume cache 2018-02-15 17:12:32 -08:00
Lantao Liu f69b4e9262 Fix pod scheduled.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-02-16 00:51:20 +00:00
Kubernetes Submit Queue 271c267fff
Merge pull request #59830 from khenidak/az-ratelimit
Automatic merge from submit-queue (batch tested with PRs 59939, 59830). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Azure - ARM Read/Write rate limiting

**What this PR does / why we need it**:

Azure cloud provider currently runs with:
1. A single ARM rate limiter for both `read [put/post/delete]` and `write` operations, while ARM provides [different rates for read/write](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits). This causes write operations to stop even if write request quota is still available.
2. The cloud provider uses the rate limiter's `Accept()` instead of `TryAccept()`. This causes control loops to wait for a prolonged time, when no request quota is available, for **all** requests, even those that do not require ARM interaction. For example, the `Service` control loop will wait for a prolonged time trying to create a `LoadBalancer` service even though it could fail fast and work on the next service, which is `ClusterIP`. This PR moves the cloud provider to `TryAccept()` (see the sketch after this list).
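
A minimal sketch of the read/write split and `TryAccept()` behavior using client-go's token bucket rate limiter; the QPS/burst numbers are placeholders, not the provider's configuration:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/flowcontrol"
)

func main() {
	// Separate limiters for reads and writes, mirroring ARM's separate quotas.
	readLimiter := flowcontrol.NewTokenBucketRateLimiter(10, 100)
	writeLimiter := flowcontrol.NewTokenBucketRateLimiter(5, 50)

	// TryAccept returns immediately instead of blocking like Accept, so a
	// control loop can skip or requeue an ARM call rather than stalling every
	// other reconcile behind an exhausted quota.
	if !writeLimiter.TryAccept() {
		fmt.Println("write quota exhausted, requeue the operation")
		return
	}
	_ = readLimiter // reads would be gated the same way before GET calls
	fmt.Println("quota available, issue the ARM write request")
}
```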

**Which issue(s) this PR fixes**:
Fixes https://github.com/kubernetes/kubernetes/issues/58770

**Special notes for your reviewer**:
`n/a`

**Release note**:

```release-note
- Separate current ARM rate limiter into read/write
- Improve control over how ARM rate limiter is used within Azure cloud provider
```

cc @jackfrancis (need your help carefully reviewing this one) @brendanburns @jdumars
2018-02-15 16:43:37 -08:00
Kubernetes Submit Queue 281cb00776
Merge pull request #59939 from dims/avoid-calls-to-cloud-instances-unless-taint-present
Automatic merge from submit-queue (batch tested with PRs 59939, 59830). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Avoid call to get cloud instances

**What this PR does / why we need it**:

if a node does not have the taint, we really don't need to make calls
to get the list of instances from the cloud provider
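
A small sketch of the short-circuit, assuming the cloud-provider uninitialized-node taint key:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// Assumed: the taint the cloud controller manager removes once it has
// initialized a node.
const cloudTaint = "node.cloudprovider.kubernetes.io/uninitialized"

func hasCloudTaint(node *v1.Node) bool {
	for _, t := range node.Spec.Taints {
		if t.Key == cloudTaint {
			return true
		}
	}
	return false
}

func main() {
	node := &v1.Node{}
	if !hasCloudTaint(node) {
		// No taint means the node is already initialized; skip the
		// cloud-provider instances lookup entirely.
		fmt.Println("node already initialized, skipping cloud instance lookup")
		return
	}
	fmt.Println("node is tainted, fetch instance data from the cloud provider")
}
```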

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
Found when reviewing code for #59887

**Release note**:

```release-note
NONE
```
2018-02-15 16:43:34 -08:00
Davanum Srinivas 265e5ae085 Log the command line flags
With d7ddcca231, we lost the logging
of the flags. We should at least log the command line flags
that were used to start processes, as those are incredibly useful for troubleshooting.
2018-02-15 18:04:04 -05:00
JulienBalestra 493f335830 kubelet: revert the get pod status 2018-02-15 22:24:35 +01:00
Kubernetes Submit Queue 01517e530f
Merge pull request #59887 from dims/process-cloud-nodes-in-ccm-before-creating-shared-informer-handler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Process existing cloud nodes in CCM

**What this PR does / why we need it**:

This is a timing issue. If kubelet(s) are started before the CCM is
started, the shared informer event handler does not process them at
all, so we should loop through the existing nodes first. We run this in a
`wait.Until` loop to tolerate errors when listing the nodes and to
give any scripts that may need to set up RBAC roles etc. an
opportunity to do so.
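
A minimal sketch of the retry loop with `wait.Until`; the node-listing function is a stand-in for the real client call:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// listExistingNodes stands in for listing nodes from the API server; the
// controller would use the Kubernetes client here.
func listExistingNodes() ([]string, error) {
	return []string{"node-a", "node-b"}, nil
}

func main() {
	stopCh := make(chan struct{})

	// Retry in a wait.Until loop: if listing fails (e.g. RBAC roles are not
	// set up yet), try again on the next tick instead of giving up, and stop
	// once the pre-existing nodes have been processed.
	wait.Until(func() {
		nodes, err := listExistingNodes()
		if err != nil {
			fmt.Println("failed to list nodes, will retry:", err)
			return
		}
		for _, n := range nodes {
			fmt.Println("processing pre-existing node:", n)
		}
		close(stopCh) // done; nodes added later are handled by the informer
	}, 5*time.Second, stopCh)
}
```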


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58613

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-15 12:17:26 -08:00
Khaled Henidak(Kal) 38a9fc33db code review: create err chan via helper 2018-02-15 20:11:40 +00:00
Davanum Srinivas 84d171fe86 Avoid call to get cloud instances
if a node does not have the taint, we really don't need to make calls
to get the list of instances from the cloud provider
2018-02-15 14:50:25 -05:00
Kubernetes Submit Queue c03edcc58e
Merge pull request #53833 from mtaufen/kubeletconfig-to-beta
Automatic merge from submit-queue (batch tested with PRs 59353, 59905, 53833). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Graduate kubeletconfig API group to beta

Regarding https://github.com/kubernetes/features/issues/281, this PR moves the kubeletconfig API group to beta. 

After #53088, the KubeletConfiguration type should not contain any deprecated or experimental fields, and we should not have to remove any more fields from the type before graduating it to beta. 

We need the community to double check for two things, however:
1. Are there any fields currently in the KubeletConfiguration type that you were going to mark deprecated this quarter, but haven't yet?
2. Are there any fields currently in the KubeletConfiguration type that are experimental or alpha, but were not explicitly denoted as such?

Please comment on this PR if you can answer "yes" to either of those two questions. Please cc anyone with a stake in the kubeletconfig API, so we get as much coverage as possible.

/cc @thockin @dchen1107 @Random-Liu @yujuhong @dashpole @tallclair @vishh @abw @freehan @dnardo @bowei @MrHohn @luxas @liggitt @ncdc @derekwaynecarr @mikedanese 

@kubernetes/sig-network-pr-reviews, @kubernetes/sig-node-pr-reviews 

```release-note
action required: The `kubeletconfig` API group has graduated from alpha to beta, and the name has changed to `kubelet.config.k8s.io`. Please use `kubelet.config.k8s.io/v1beta1`, as `kubeletconfig/v1alpha1` is no longer available. 
```

**TODO:**
- [x] Move experimental/non-gated-alpha/soon-to-be-deprecated fields to `KubeletFlags`
  - [x] #53088
  - [x] #54154
  - [x] #54160
  - [x] #55562
  - [x] #55983
  - [x] #57851
- [x] Lift embedded structure out of strings
  - [x] #53025
  - [x] #54643
  - [x] #54823
  - [x] #55254
- [x] Resolve relative paths against the location config files are loaded from
  - [x] #55648 
- [x] Rename to `kubelet.config.k8s.io`
- [x] Comments
  - [x] Make sure existing comments at least read sensibly.
  - [x] Note default values in comments on the versioned struct.
  - [x] Remove any reference to default values in comments on the internal struct.
- [x] Most fields should be `+optional` and `omitempty`. Add where necessary. ~Where omitted, explicitly comment.~ Edit: We should not distinguish between nil and empty, see below items.
- [x] Ensure defaults are specified via `pkg/kubelet/apis/kubelet.config.k8s.io/v1beta1/defaults.go`, not `cmd/kubelet/app/options/options.go`.
  - [x] #57770
- [x] Ensure kubeadm does not persist v1alpha1 KubeletConfiguration objects (or feature-gates this functionality)
- [x] Don't make a distinction between empty and nil, because of #43203.
  - [x] #59515
  - [x] #59681
- [x] Take the opportunity to fix insecure Kubelet defaults @tallclair 
  - [x] #59666
- [x] Remove CAdvisorPort from KubeletConfiguration wrt #56523.
  - [x] #59580
- [x] Hide `ConfigTrialDuration` until we're more sure what to do with it.
   - [x] #59628
- [x] Fix `// default: x` comments after rebasing on recent changes.
2018-02-15 11:06:40 -08:00
Kubernetes Submit Queue b099e91920
Merge pull request #59905 from mtaufen/dkcfg-config-ok-kubelet-config-ok
Automatic merge from submit-queue (batch tested with PRs 59353, 59905, 53833). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Rename ConfigOK to KubeletConfigOk

This is a more accurate name for the condition, as it describes the
status of the Kubelet's configuration.

Also cleans up capitalization of internal names.

```release-note
The ConfigOK node condition has been renamed to KubeletConfigOk.
```
2018-02-15 11:06:36 -08:00
Kubernetes Submit Queue bb500a73b6
Merge pull request #59353 from juanvallejo/jvallejo/update-name-printer-output
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

update name printer output to kind.group/name

**Release note**:
```release-note
NONE
```

Followup to https://github.com/kubernetes/kubernetes/pull/59227

Updates output via `-o name` to be pipeable.

cc @deads2k
2018-02-15 10:37:19 -08:00
Kubernetes Submit Queue 27daaab224
Merge pull request #59323 from zetaab/nodetaint
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

add node shutdown taint

**What this PR does / why we need it**: we need a node-stopped taint in order to detach volumes immediately without waiting for a timeout. More info in issue #58635

**Which issue(s) this PR fixes** 
Fixes #58635

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-02-15 09:52:10 -08:00
Kubernetes Submit Queue 9a8b675d2c
Merge pull request #59630 from k82cn/k8s_59194_0
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Task 0: Added Alpha flag for NoDaemonSetScheduler feature.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #59194

**Release note**:
```release-note
None
```
2018-02-15 09:10:58 -08:00