Commit Graph

61923 Commits (be04e7c1b1393b895a7da23185d050db27294c4e)

Author SHA1 Message Date
Kubernetes Submit Queue 99c87cf679
Merge pull request #59923 from jsafrane/volumemanager-logs
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Rework volume manager log levels

- all normal logs to go to level 4
- too frequent / duplicate logs go to level 5 (e.g. when something else logged a similar message not too far away).

I checked that there is no excessive spam in the log - reconciler runs every 100ms, but it does not log anything if there is nothing to do.

**What this PR does / why we need it**:
This will help us debug flakes. E2e tests do not log at levels 10-12, which the volume manager uses.

**Release note**:

```release-note
NONE
```

/sig storage
/sig node
cc: @jingxu97 @sjenning
2018-02-15 20:16:38 -08:00
Kubernetes Submit Queue 72e1cf21c4
Merge pull request #59933 from mikedanese/rm-cert-controller
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

gke-certificates-controller: rm -rf

Fixes https://github.com/kubernetes/kubernetes/issues/53439

```release-note
NONE
```
2018-02-15 20:16:36 -08:00
Kubernetes Submit Queue c7c5d89e32
Merge pull request #59873 from jsafrane/fix-downward-flake
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix DownwardAPI refresh race.

WaitForAttachAndMount should only mark the pod in DesiredStateOfWorldPopulator (DSWP), and DSWP should mark the volume to be remounted only when the new pod has been processed.

Otherwise DSWP and the reconciler race to get the new pod first. If the reconciler wins, the DownwardAPI and Projected volumes of the pod are not refreshed with new content and are only updated after the next periodic sync (60-90 seconds).

Fixes #59813 

/assign @jingxu97 @saad-ali 
/sig storage
/sig node

```release-note
None
```
2018-02-15 20:16:32 -08:00
Kubernetes Submit Queue bfdd94c6a0
Merge pull request #59170 from cofyc/fix_kubelet_volume_metrics
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix kubelet PVC stale metrics

**What this PR does / why we need it**:

Volumes on each node change, so we should not only add PVC metrics into
a gauge vector. It's better to use a collector to collect metrics from internal
stats.

Currently, if a PV (bound to a PVC `testpvc`) is attached and used by node A, then later migrated to node B or just deleted from node A, the `testpvc` metrics will not disappear from the kubelet on node A. After running for a long time, the `kubelet` process will keep a lot of stale volume metrics in memory.

For these dynamic metrics, it's better to use a collector to collect metrics from a data source (`StatsProvider` here), like [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) scraping metrics from kube-apiserver.
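The difference between the two patterns can be illustrated with a minimal, dependency-free sketch. It does not use the Prometheus client or the kubelet's real `StatsProvider`; the types `volumeStats`, `statsProvider`, `nodeStats` and `gaugeVec` and the function `collect` are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// volumeStats is a hypothetical stand-in for per-PVC stats tracked on a node.
type volumeStats struct {
	PVC       string
	UsedBytes int64
}

// statsProvider is a hypothetical data source that reports the volumes
// currently present on the node.
type statsProvider interface {
	ListVolumeStats() []volumeStats
}

// nodeStats is a toy in-memory provider used for the demonstration.
type nodeStats struct {
	volumes map[string]int64
}

func (n *nodeStats) ListVolumeStats() []volumeStats {
	out := make([]volumeStats, 0, len(n.volumes))
	for pvc, used := range n.volumes {
		out = append(out, volumeStats{PVC: pvc, UsedBytes: used})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].PVC < out[j].PVC })
	return out
}

// gaugeVec mimics the "set and keep" pattern: series are added or updated,
// but nothing ever removes them.
type gaugeVec struct {
	series map[string]int64
}

func (g *gaugeVec) Set(pvc string, v int64) { g.series[pvc] = v }

// collect mimics the collector pattern: the series are rebuilt from the
// provider on every scrape, so removed volumes simply stop appearing.
func collect(p statsProvider) map[string]int64 {
	out := map[string]int64{}
	for _, s := range p.ListVolumeStats() {
		out[s.PVC] = s.UsedBytes
	}
	return out
}

func main() {
	node := &nodeStats{volumes: map[string]int64{"testpvc": 1024}}
	gauge := &gaugeVec{series: map[string]int64{}}
	for _, s := range node.ListVolumeStats() {
		gauge.Set(s.PVC, s.UsedBytes)
	}

	// The volume is migrated away from this node.
	delete(node.volumes, "testpvc")

	fmt.Println("gauge series after removal:", len(gauge.series))
	fmt.Println("collector series after removal:", len(collect(node)))
}
```

In this sketch the gauge map still holds the `testpvc` series after the volume is gone, while the collector output is derived from current state only.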

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/57686

**Special notes for your reviewer**:

**Release note**:

```release-note
Fix kubelet PVC stale metrics
```
2018-02-15 18:44:08 -08:00
nikhiljindal d2fe556309 Updating kubemci e2e test to not add kubeconfig flag for get-status 2018-02-15 18:23:57 -08:00
mlmhl dcbd1ae3cf wait for bound pvc metric updated before validating 2018-02-16 09:57:30 +08:00
David Ashpole b259543985 collect ephemeral storage capacity on initialization 2018-02-15 17:33:22 -08:00
Michelle Au 5271edd9e2 Index PVs by StorageClass in assume cache 2018-02-15 17:12:32 -08:00
Lantao Liu f69b4e9262 Fix pod scheduled.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-02-16 00:51:20 +00:00
Kubernetes Submit Queue 271c267fff
Merge pull request #59830 from khenidak/az-ratelimit
Automatic merge from submit-queue (batch tested with PRs 59939, 59830). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Azure - ARM Read/Write rate limiting

**What this PR does / why we need it**:

Azure cloud provider currently runs with:
1. A single ARM rate limiter for both `read` and `write [put/post/delete]` operations, while ARM provides [different rates for read/write](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits). This causes write operations to stop even when write request quota is still available.
2. The cloud provider uses the rate limiter's `Accept()` instead of `TryAccept()`. This causes the control loop to wait for a prolonged time (when no request quota is available) for **all** requests, even for those that do not require ARM interaction. For example, the `Service` control loop will wait for a prolonged time trying to create a `LoadBalancer` service even though it could fail and move on to the next service, which is `ClusterIP`. This PR moves the cloud provider to `TryAccept()`.
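The two points above can be sketched with a small self-contained token bucket. This is not the Azure cloud provider code and does not use the client-go rate limiter; `tokenBucket`, `armClient`, `errRateLimited` and the chosen QPS and burst values are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// tokenBucket is a minimal illustrative rate limiter.
type tokenBucket struct {
	mu     sync.Mutex
	tokens float64
	burst  float64
	qps    float64
	last   time.Time
}

func newTokenBucket(qps float64, burst int) *tokenBucket {
	return &tokenBucket{tokens: float64(burst), burst: float64(burst), qps: qps, last: time.Now()}
}

// refill adds tokens for the elapsed time; the caller must hold mu.
func (b *tokenBucket) refill(now time.Time) {
	b.tokens += now.Sub(b.last).Seconds() * b.qps
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
}

// TryAccept takes a token if one is available and returns immediately.
func (b *tokenBucket) TryAccept() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.refill(time.Now())
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// Accept blocks until a token is available. This is the behavior that can
// stall a whole control loop when the quota is exhausted.
func (b *tokenBucket) Accept() {
	for !b.TryAccept() {
		time.Sleep(10 * time.Millisecond)
	}
}

var errRateLimited = errors.New("rate limited")

// armClient is a hypothetical client with separate read and write limiters.
type armClient struct {
	reads  *tokenBucket
	writes *tokenBucket
}

// Get fails fast instead of blocking when the read quota is used up.
func (c *armClient) Get(name string) error {
	if !c.reads.TryAccept() {
		return errRateLimited
	}
	return nil
}

// Put draws from the write quota only, so reads cannot starve it.
func (c *armClient) Put(name string) error {
	if !c.writes.TryAccept() {
		return errRateLimited
	}
	return nil
}

func main() {
	c := &armClient{reads: newTokenBucket(0.001, 1), writes: newTokenBucket(0.001, 1)}
	fmt.Println("first read: ", c.Get("lb"))
	fmt.Println("second read:", c.Get("lb"))
	fmt.Println("first write:", c.Put("lb"))
}
```

With a failing `TryAccept()` the caller gets an error right away and can move on to work that needs no ARM call, and exhausting the read bucket leaves the write bucket untouched.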

**Which issue(s) this PR fixes**:
Fixes https://github.com/kubernetes/kubernetes/issues/58770

**Special notes for your reviewer**:
`n/a`

**Release note**:

```release-note
- Separate current ARM rate limiter into read/write
- Improve control over how ARM rate limiter is used within Azure cloud provider
```

cc @jackfrancis (need your help carefully reviewing this one) @brendanburns @jdumars
2018-02-15 16:43:37 -08:00
Kubernetes Submit Queue 281cb00776
Merge pull request #59939 from dims/avoid-calls-to-cloud-instances-unless-taint-present
Automatic merge from submit-queue (batch tested with PRs 59939, 59830). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Avoid call to get cloud instances

**What this PR does / why we need it**:

if a node does not have the taint, we really don't need to make calls
to get the list of instances from the cloud provider
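A rough sketch of that short-circuit, under stated assumptions: the taint key constant, the `node` and `taint` types, `fakeCloud` and `initializeNode` are hypothetical stand-ins and not the cloud controller manager's actual code.

```go
package main

import "fmt"

// cloudTaintKey is assumed here for illustration only.
const cloudTaintKey = "node.cloudprovider.kubernetes.io/uninitialized"

type taint struct{ Key string }

type node struct {
	Name   string
	Taints []taint
}

// fakeCloud is a hypothetical provider that counts how often it is queried.
type fakeCloud struct{ calls int }

func (c *fakeCloud) InstanceID(nodeName string) string {
	c.calls++
	return "instance-" + nodeName
}

func hasCloudTaint(n node) bool {
	for _, t := range n.Taints {
		if t.Key == cloudTaintKey {
			return true
		}
	}
	return false
}

// initializeNode checks the taint first and only then talks to the cloud.
// The boolean reports whether a cloud lookup happened.
func initializeNode(n node, cloud *fakeCloud) (string, bool) {
	if !hasCloudTaint(n) {
		return "", false
	}
	return cloud.InstanceID(n.Name), true
}

func main() {
	cloud := &fakeCloud{}
	initializeNode(node{Name: "ready"}, cloud)
	initializeNode(node{Name: "new", Taints: []taint{{Key: cloudTaintKey}}}, cloud)
	fmt.Println("cloud calls:", cloud.calls)
}
```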

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
Found when reviewing code for #59887

**Release note**:

```release-note
NONE
```
2018-02-15 16:43:34 -08:00
Davanum Srinivas 265e5ae085 Log the command line flags
With d7ddcca231, we lost the logging
of the flags. We should at least log which command line flags
were used to start processes, as those are incredibly useful for troubleshooting.
2018-02-15 18:04:04 -05:00
Kubernetes Submit Queue cdecea5455
Merge pull request #59948 from JulienBalestra/revert-kubelet-pod-status
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: revert the get pod with updated status

**What this PR does / why we need it**:

Following #59892, this PR finishes reverting #57106.

The first revert didn't solve the reboot test in the [test grid](https://k8s-testgrid.appspot.com/google-gce#gci-gce-reboot).


**Special notes for your reviewer**:

cc @dashpole  @Random-Liu 

**Release note**:
```release-note
None
```
2018-02-15 14:51:53 -08:00
Nick Sardo 911a082d65 Add cloud-provider policies to be applied via addon mgr 2018-02-15 14:49:33 -08:00
Stephen Augustus 3a8948c027 cluster/images/hyperkube: Fix typo in Dockerfile for aggregator symlink 2018-02-15 17:44:02 -05:00
Christoph Blecker 6fb2304f2a
Re-add OWNERS files to Godeps/vendor dirs 2018-02-15 13:31:02 -08:00
Christoph Blecker a11ea4e45f
Enforce OWNERS file in Godeps and vendor dirs 2018-02-15 13:30:30 -08:00
Christoph Blecker d3bcd367ec
Add cblecker to dep approvers 2018-02-15 13:29:57 -08:00
JulienBalestra 493f335830 kubelet: revert the get pod status 2018-02-15 22:24:35 +01:00
Avesh Agarwal 9b19141281 Update reviewers for sig-scheduling. 2018-02-15 16:08:52 -05:00
Kubernetes Submit Queue f88ff2ab41
Merge pull request #59937 from MrHohn/addon-manager-reviewer
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add a reviewer to addon-manager

**What this PR does / why we need it**:
Would like to keep an eye on this until it goes away.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE 

**Special notes for your reviewer**:
/assign @mikedanese 

**Release note**:

```release-note
NONE
```
2018-02-15 12:17:42 -08:00
Kubernetes Submit Queue 01517e530f
Merge pull request #59887 from dims/process-cloud-nodes-in-ccm-before-creating-shared-informer-handler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Process existing cloud nodes in CCM

**What this PR does / why we need it**:

This is a timing issue. If kubelet(s) get started before the CCM is
started, the shared informer event handler does not process them at
all. So we should loop through these nodes first. We run this in a
go wait.Until loop to tolerate errors when listing the nodes and
to give any scripts that may need to set up RBAC roles etc. an
opportunity to run.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58613

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-15 12:17:26 -08:00
Khaled Henidak(Kal) 38a9fc33db code review: create err chan via helper 2018-02-15 20:11:40 +00:00
Mike Danese b973840481 gke-certificates-controller: rm -rf 2018-02-15 12:01:00 -08:00
Davanum Srinivas 84d171fe86 Avoid call to get cloud instances
if a node does not have the taint, we really don't need to make calls
to get the list of instances from the cloud provider
2018-02-15 14:50:25 -05:00
Chao Xu 9cfd20ef1c enable mutating and validating admission webhook by default on gce and centos
clusters set up by kube/cluster-up.sh
2018-02-15 11:19:53 -08:00
Kubernetes Submit Queue c03edcc58e
Merge pull request #53833 from mtaufen/kubeletconfig-to-beta
Automatic merge from submit-queue (batch tested with PRs 59353, 59905, 53833). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Graduate kubeletconfig API group to beta

Regarding https://github.com/kubernetes/features/issues/281, this PR moves the kubeletconfig API group to beta. 

After #53088, the KubeletConfiguration type should not contain any deprecated or experimental fields, and we should not have to remove any more fields from the type before graduating it to beta. 

We need the community to double check for two things, however:
1. Are there any fields currently in the KubeletConfiguration type that you were going to mark deprecated this quarter, but haven't yet?
2. Are there any fields currently in the KubeletConfiguration type that are experimental or alpha, but were not explicitly denoted as such?

Please comment on this PR if you can answer "yes" to either of those two questions. Please cc anyone with a stake in the kubeletconfig API, so we get as much coverage as possible.

/cc @thockin @dchen1107 @Random-Liu @yujuhong @dashpole @tallclair @vishh @abw @freehan @dnardo @bowei @MrHohn @luxas @liggitt @ncdc @derekwaynecarr @mikedanese 

@kubernetes/sig-network-pr-reviews, @kubernetes/sig-node-pr-reviews 

```release-note
action required: The `kubeletconfig` API group has graduated from alpha to beta, and the name has changed to `kubelet.config.k8s.io`. Please use `kubelet.config.k8s.io/v1beta1`, as `kubeletconfig/v1alpha1` is no longer available. 
```

**TODO:**
- [x] Move experimental/non-gated-alpha/soon-to-be-deprecated fields to `KubeletFlags`
  - [x] #53088
  - [x] #54154
  - [x] #54160
  - [x] #55562
  - [x] #55983
  - [x] #57851
- [x] Lift embedded structure out of strings
  - [x] #53025
  - [x] #54643
  - [x] #54823
  - [x] #55254
- [x] Resolve relative paths against the location config files are loaded from
  - [x] #55648 
- [x] Rename to `kubelet.config.k8s.io`
- [x] Comments
  - [x] Make sure existing comments at least read sensibly.
  - [x] Note default values in comments on the versioned struct.
  - [x] Remove any reference to default values in comments on the internal struct.
- [x] Most fields should be `+optional` and `omitempty`. Add where necessary. ~Where omitted, explicitly comment.~ Edit: We should not distinguish between nil and empty, see below items.
- [x] Ensure defaults are specified via `pkg/kubelet/apis/kubelet.config.k8s.io/v1beta1/defaults.go`, not `cmd/kubelet/app/options/options.go`.
  - [x] #57770
- [x] Ensure kubeadm does not persist v1alpha1 KubeletConfiguration objects (or feature-gates this functionality)
- [x] Don't make a distinction between empty and nil, because of #43203.
  - [x] #59515
  - [x] #59681
- [x] Take the opportunity to fix insecure Kubelet defaults @tallclair 
  - [x] #59666
- [x] Remove CAdvisorPort from KubeletConfiguration wrt #56523.
  - [x] #59580
- [x] Hide `ConfigTrialDuration` until we're more sure what to do with it.
   - [x] #59628
- [x] Fix `// default: x` comments after rebasing on recent changes.
2018-02-15 11:06:40 -08:00
Kubernetes Submit Queue b099e91920
Merge pull request #59905 from mtaufen/dkcfg-config-ok-kubelet-config-ok
Automatic merge from submit-queue (batch tested with PRs 59353, 59905, 53833). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Rename ConfigOK to KubeletConfigOk

This is a more accurate name for the condition, as it describes the
status of the Kubelet's configuration.

Also cleans up capitalization of internal names.

```release-note
The ConfigOK node condition has been renamed to KubeletConfigOk.
```
2018-02-15 11:06:36 -08:00
Zihong Zheng d8f5eafd86 Add a reviewer to addon-manager 2018-02-15 10:40:02 -08:00
Kubernetes Submit Queue bb500a73b6
Merge pull request #59353 from juanvallejo/jvallejo/update-name-printer-output
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

update name printer output to kind.group/name

**Release note**:
```release-note
NONE
```

Followup to https://github.com/kubernetes/kubernetes/pull/59227

Updates output via `-o name` to be pipeable.
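As a rough illustration of the `kind.group/name` shape, here is a small formatting sketch. `nameOutput` is a hypothetical helper, not the kubectl printer, and lowercasing the kind and dropping the group separator for an empty group are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// nameOutput builds a "kind.group/name" string. When the group is empty,
// it falls back to "kind/name".
func nameOutput(kind, group, name string) string {
	k := strings.ToLower(kind)
	if group == "" {
		return k + "/" + name
	}
	return k + "." + group + "/" + name
}

func main() {
	fmt.Println(nameOutput("Deployment", "apps", "nginx"))
	fmt.Println(nameOutput("Pod", "", "web"))
}
```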

cc @deads2k
2018-02-15 10:37:19 -08:00
Kubernetes Submit Queue 27daaab224
Merge pull request #59323 from zetaab/nodetaint
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add node shutdown taint

**What this PR does / why we need it**: we need a node stopped taint in order to detach volumes immediately, without waiting for the timeout. More info in issue ticket #58635 

**Which issue(s) this PR fixes** 
Fixes #58635

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-02-15 09:52:10 -08:00
Jan Safranek 746d1dd99d Enable mount propagation tests by default
MountPropagation is enabled by default now, so the test should be too.
2018-02-15 18:12:54 +01:00
Kubernetes Submit Queue 9a8b675d2c
Merge pull request #59630 from k82cn/k8s_59194_0
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Task 0: Added Alpha flag for NoDaemonSetScheduler feature.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #59194

**Release note**:
```release-note
None
```
2018-02-15 09:10:58 -08:00
Kubernetes Submit Queue 8a67b71ac6
Merge pull request #59921 from x13n/fluentd-version-bump
Automatic merge from submit-queue (batch tested with PRs 56394, 59921). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix fluentd-gcp-scaler to look at correct fluentd-gcp version

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/assign @crassirostris
2018-02-15 08:54:31 -08:00
Khaled Henidak(Kal) 53036bf755 Code review + resync VMSS changes 2018-02-15 16:49:47 +00:00
Kubernetes Submit Queue 7e8de5422c
Merge pull request #56394 from porridge/fetch-token-retry-longer
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Try longer to fetch initial token.

**What this PR does / why we need it**:
Step towards fixing #56293

**Special notes for your reviewer**:
/kind bug
/priority critical-urgent
@kubernetes/sig-scalability-bugs
/cc @shyamjvs please add to v1.9

**Release note**:
```release-note
NONE
```
2018-02-15 08:31:03 -08:00
Jan Safranek e260096392 Rework volume manager log levels
- all normal logs to go to level 4
- too frequent / duplicate logs go to level 5 (e.g. when something else logged a similar message not too far away).
2018-02-15 16:33:17 +01:00
juanvallejo 765f9ec68b
update -o name format to kind.group/name 2018-02-15 10:33:06 -05:00
Kubernetes Submit Queue d336607679
Merge pull request #59871 from wojtek-t/cache_fields_and_labels
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Store labels and fields with object

We are already computing labels and fields before putting objects in watchcache.
And my tests show that `PodToSelectableFields` is responsible for ~10% of memory allocations.
This PR is supposed to fix that - let's double check by running kubemark-big on it.
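The idea of keeping the precomputed values next to the object can be sketched as follows. The `pod` and `cacheEntry` types, the lowercase `podToSelectableFields` helper and the counter are hypothetical and much simpler than the real watch cache.

```go
package main

import "fmt"

type pod struct {
	Name, Namespace, NodeName string
	Labels                    map[string]string
}

// fieldComputations counts how often the field map is rebuilt.
var fieldComputations int

// podToSelectableFields allocates a new map on every call, which is what
// makes repeated calls expensive.
func podToSelectableFields(p *pod) map[string]string {
	fieldComputations++
	return map[string]string{
		"metadata.name":      p.Name,
		"metadata.namespace": p.Namespace,
		"spec.nodeName":      p.NodeName,
	}
}

// cacheEntry stores the labels and fields computed at insertion time
// together with the object.
type cacheEntry struct {
	Obj    *pod
	Labels map[string]string
	Fields map[string]string
}

func newCacheEntry(p *pod) *cacheEntry {
	return &cacheEntry{Obj: p, Labels: p.Labels, Fields: podToSelectableFields(p)}
}

// matchesField reuses the stored map instead of recomputing it per watcher.
func (e *cacheEntry) matchesField(key, value string) bool {
	return e.Fields[key] == value
}

func main() {
	entry := newCacheEntry(&pod{Name: "web", Namespace: "default", NodeName: "node-a"})
	matched := 0
	for i := 0; i < 1000; i++ {
		if entry.matchesField("spec.nodeName", "node-a") {
			matched++
		}
	}
	fmt.Println("matches:", matched, "field computations:", fieldComputations)
}
```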
2018-02-15 07:20:51 -08:00
Daniel Kłobuszewski 6db742fc10 fix fluentd-gcp-scaler to look at correct fluentd-gcp version 2018-02-15 16:15:41 +01:00
Konstantinos Tsakalozos e2399de900 Clean-up not needed method. 2018-02-15 17:01:52 +02:00
Beata Skiba 329feee0e9 Fix cluster autoscaler test to support regional clusters. 2018-02-15 15:57:49 +01:00
Wojciech Tyczynski 87a65b6c93 Store labels and fields with object 2018-02-15 14:56:33 +01:00
Kubernetes Submit Queue d3dc4584f9
Merge pull request #59899 from mikedanese/authz
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apiserver: fix some typos from refactor

introduced in #59582

```release-note
NONE
```
2018-02-15 04:36:39 -08:00
Kubernetes Submit Queue d3bacb914c
Merge pull request #59657 from x13n/manual-fluentd-gcp-scaler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable scaling fluentd-gcp resources using ScalingPolicy.

See https://github.com/justinsb/scaler for more details about ScalingPolicy resource.

**What this PR does / why we need it**:
This is adding a way to override fluentd-gcp resources in a running cluster. The resources syncing for fluentd-gcp is decoupled from addon manager.

**Special notes for your reviewer**:

**Release note**:
```release-note
fluentd-gcp resources can be modified via a ScalingPolicy
```

cc @kawych @justinsb
2018-02-15 03:42:14 -08:00
Ilias Tsitsimpis 6590c3fe35 csi: Remove stale volume path
The CSI mounter creates the following paths during SetUp():

   * .../pods/<podID>/volumes/kubernetes.io~csi/<specVolId>/mount/
   * .../pods/<podID>/volumes/kubernetes.io~csi/<specVolId>/volume_data.json

During TearDown(), it does not remove the `.../kubernetes.io~csi/<specVolId>/`
directory, leaving behind orphan volumes: method cleanupOrphanedPodDirs()
complains with 'Orphaned pod found, but volume paths are still present
on disk'.

Fix that by removing the above directory in removeMountDir().

Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
2018-02-15 13:30:36 +02:00
Davanum Srinivas c423be11d5 Process existing cloud nodes in CCM
Existing nodes are sent via update and not via the add function,
so let's add an UpdateCloudNode and just forward it to the
AddCloudNode. This works fine as all we do is look for the cloud
taint and bail out if it is not present.
2018-02-15 06:29:48 -05:00
Marcin Owsiany 3a5d48700c Try longer to fetch initial token. 2018-02-15 10:57:35 +01:00
Kubernetes Submit Queue 7377c5911a
Merge pull request #59892 from JulienBalestra/revert-host-ip
Automatic merge from submit-queue (batch tested with PRs 59877, 59886, 59892). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: revert the status HostIP behavior

**What this PR does / why we need it**:

This PR partially reverts #57106 to fix #59889.

The PR #57106 changed the behavior of `generateAPIPodStatus` when a **kubeClient** is nil.

**Release note**:
```release-note
NONE
```
2018-02-15 00:01:35 -08:00
Kubernetes Submit Queue be5a9d34a2
Merge pull request #59886 from cblecker/fix-codegen
Automatic merge from submit-queue (batch tested with PRs 59877, 59886, 59892). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Partial revert of #59797

**What this PR does / why we need it**:
Need to do a partial revert of #59797. This PR introduced two problems:
- `INTERNAL_DIRS` has an unbound variable error
- The script introduces the use of mapfile which doesn't exist in bash versions prior to bash 4.x. Darwin (OS X) uses bash 3.x by default.

**Release note**:
```release-note
NONE
```
2018-02-15 00:01:33 -08:00