Commit Graph

48691 Commits (bbfda9cdfe54aee7391bac8b908af5bde4f06fe3)

Author SHA1 Message Date
Kubernetes Submit Queue f499606bfe Merge pull request #45346 from codablock/fix_double_attach
Automatic merge from submit-queue

Don't try to attach volumes which are already attached to other nodes

This PR is a replacement for https://github.com/kubernetes/kubernetes/pull/40148. I was not able to push fixes and rebases to the original branch as I don't have access to the Github organization anymore.

CC @saad-ali You probably have to update the PR link in [Q2 2017 (v1.7)](https://docs.google.com/spreadsheets/d/1t4z5DYKjX2ZDlkTpCnp18icRAQqOE85C1T1r2gqJVck/edit#gid=14624465)

I assume the PR will need a new "ok to test" 

**ORIGINAL PR DESCRIPTION**

This PR fixes an issue with the attach/detach volume controller. There are cases where the `desiredStateOfWorld` contains the same volume for multiple nodes, resulting in the attach/detach controller attaching this volume to multiple nodes. This of course fails for volumes like AWS EBS, Azure Disks, ...

I observed this situation on Azure when using Azure Disks and replication controllers which start to reschedule PODs. When you delete a POD that belongs to a RC, the RC will immediately schedule a new POD on another node. This results in a short time (max a few seconds) where you have 2 PODs which try to attach/mount the same volume on different nodes. As the old POD is still alive, the attach/detach controller does not try to detach the volume and starts to attach the volume to the new POD immediately.

This behavior was probably not noticed before on other clouds as the bogus attempt to attach probably fails pretty fast and thus is unnoticed. As the situation with the 2 PODs disappears after a few seconds, a detach for the old POD is initiated and thus the new POD can attach successfully.

On Azure however, attaching and detaching takes quite long, resulting in the first bogus attach attempt to already eat up much time.
When attaching fails on Azure and reports that it is already attached somewhere else, the cloud provider immediately does a detach call for the same volume+node it tried to attach to. This is done to make sure the failed attach request is aborted immediately. You can find this here: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure_storage.go#L74

The complete flow of attach->fail->abort eats up valuable time and the attach/detach controller can not proceed with other work while this is happening. This means, if the old POD disappears in the meantime, the controller can't even start the detach for the volume which delays the whole process of rescheduling and reattaching.

Also, I and other people have observed very strange behavior where disks ended up being "attached" to multiple VMs at the same time as reported by Azure Portal. This results in the controller to fail reattaching forever. It's hard to figure out why and when this happens and there is no reproducer known yet. I can imagine however that the described behavior correlates with what I described above.

I was not sure if there are actually cases where it is perfectly fine to have a volume mounted to multiple PODs/nodes. At least technically, this should be possible with network based volumes, e.g. nfs. Can someone with more knowledge about volumes help me here? I may need to add a check before skipping attaching in `reconcile`.

CC @colemickens @rootfs

-->
```release-note
Don't try to attach volume to new node if it is already attached to another node and the volume does not support multi-attach.
```
2017-05-19 21:54:42 -07:00
shashidharatd b4792014d1 Updated test/test_owners.csv for federation test cases 2017-05-20 08:43:30 +05:30
Kubernetes Submit Queue 2473c24f81 Merge pull request #45979 from bowei/owners
Automatic merge from submit-queue

Add bowei to OWNERS: e2e/test dns,network; cloud route, node, service…
2017-05-19 19:39:05 -07:00
Kubernetes Submit Queue a9d0403858 Merge pull request #38169 from caseydavenport/calico-daemonset
Automatic merge from submit-queue

Update Calico add-on

**What this PR does / why we need it:**

Updates Calico to the latest version using self-hosted install as a DaemonSet, removes Calico's dependency on etcd.

- [x] Remove [last bits of Calico salt](175fe62720/cluster/saltbase/salt/calico/master.sls (L3))
- [x] Failing on the master since no kube-proxy to access API.
- [x] Fix outgoing NAT
- [x] Tweak to work on both debian / GCI (not just GCI)
- [x] Add the portmap plugin for host port support

Maybe:
- [ ] Add integration test

**Which issue this PR fixes:**

https://github.com/kubernetes/kubernetes/issues/32625

**Try it out**

Clone the PR, then:

```
make quick-release
export NETWORK_POLICY_PROVIDER=calico
export NODE_OS_DISTRIBUTION=gci
export MASTER_SIZE=n1-standard-4
./cluster/kube-up.sh 
```

**Release note:**

```release-note
The Calico version included in kube-up for GCE has been updated to v2.2.
```
2017-05-19 19:38:59 -07:00
Kubernetes Submit Queue 83a1a863ad Merge pull request #45564 from whitlockjc/admission-api-group
Automatic merge from submit-queue (batch tested with PRs 45996, 46121, 45707, 46011, 45564)

add "admission" API group

This commit is an initial pass at providing an admission API group.
The API group is required by the webhook admission controller being
developed as part of https://github.com/kubernetes/community/pull/132
and could be used more as that proposal comes to fruition.

**Note:** This PR was created by following the [Adding an API Group](https://github.com/kubernetes/community/blob/master/contributors/devel/adding-an-APIGroup.md) documentation.

cc @smarterclayton
2017-05-19 18:57:38 -07:00
Kubernetes Submit Queue 73e7ef1f8c Merge pull request #46011 from MrHohn/e2e-fix-return-podnames
Automatic merge from submit-queue (batch tested with PRs 45996, 46121, 45707, 46011, 45564)

Fix waitForNPods in restart.go

From https://github.com/kubernetes/kubernetes/issues/45991#issuecomment-302292404.

Don't redefine `pods` so we can return real pod names instead of empty array.

/assign @dchen1107 @bowei 

**Release note**:

```release-note
NONE
```
2017-05-19 18:57:36 -07:00
Kubernetes Submit Queue e2a9327999 Merge pull request #45707 from Crazykev/cri-experimental
Automatic merge from submit-queue (batch tested with PRs 45996, 46121, 45707, 46011, 45564)

Remove flag `experimental-cri` in e2e-node test

Signed-off-by: Crazykev <crazykev@zju.edu.cn>



**What this PR does / why we need it**: 
This patch remove deprecated flag in node e2e test script, cause kubelet already remove this. Leave this will make kubelet start failed.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**: /cc @feiskyer 

**Release note**:

```release-note
NONE
```
2017-05-19 18:57:34 -07:00
Kubernetes Submit Queue 1c8d255819 Merge pull request #46121 from Random-Liu/fix-kuberuntime-getpods
Automatic merge from submit-queue (batch tested with PRs 45996, 46121, 45707, 46011, 45564)

Fix kuberuntime GetPods.

The `ImageID` is not populated from `GetPods` in kuberuntime.

Image garbage collector is using this field, https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/images/image_gc_manager.go#L204.

Without this fix, image garbage collector will try to garbage collect all images every time. Because docker will not allow that, it should be fine. However, I'm not sure whether the unnecessary remove will cause any problem, e.g. overload docker image management system and make docker hang.

@dchen1107 @yujuhong @feiskyer Do you think we should cherry-pick this?
2017-05-19 18:57:33 -07:00
ymqytw dd80b915e0 improve type assertion error 2017-05-19 18:07:59 -07:00
Jonathan MacMillan af2a8f7e8a [Federation] Use service accounts instead of the user's credentials when accessing joined clusters' API servers. 2017-05-19 18:05:09 -07:00
Kubernetes Submit Queue ee9bab1111 Merge pull request #45996 from cblecker/hack-owner
Automatic merge from submit-queue

Add cblecker to hack/ reviewers

**What this PR does / why we need it**:
I've done a number of reviews in this part of the code base, and would like to continue helping out and formally be assigned PRs that change things in hack/

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-05-19 16:06:27 -07:00
mbohlool 601570a079 Fix hack/update-federation-openapi-spec.sh flakyness 2017-05-19 15:39:08 -07:00
mbohlool 4d4abf3ba6 Update bazel 2017-05-19 15:39:08 -07:00
mbohlool 4b0fbfe1ee bugfix: form parameters should have type in OpenAPI spec 2017-05-19 15:39:08 -07:00
mbohlool 161b480107 Add protobuf binary version of openapi spec 2017-05-19 15:39:08 -07:00
mbohlool 67025046a5 Add gnostic to Godep 2017-05-19 15:39:08 -07:00
Fabiano Franz 18cb56bf78 kubectl plugins have access config, global flags and environment 2017-05-19 19:17:43 -03:00
Ahmet Alp Balkan 8604ed6d99
clientgo/examples/in-cluster: add instructions to run the example
This patch adds instructions for how to run the in-cluster client-go example.
To make this example executable, providing a Dockerfile and build steps so
that it can directly be run on minikube.

This is part of the body of work improving the client library samples.

Signed-off-by: Ahmet Alp Balkan <ahmetb@google.com>
2017-05-19 14:55:10 -07:00
Bowei Du 3af1c0efcb Add bowei to OWNERS: e2e/test dns,network; cloud route, node, service controller 2017-05-19 14:49:43 -07:00
Alexander Campbell 5e1a3fd020 kubectl: improve deprecatedAlias unit test
Two new behaviors are tested:

 1. The output message that deprecatedAlias gives when it is called must
    include the word "deprecatated" and the name of the new function
    that the user should use instead.
 2. The correct function must be called by the alias (alias should "fall
    back" to the functionality of the original.
2017-05-19 14:49:03 -07:00
Alexander Campbell 487398ebf9 Merge branch 'master' into correct-deprecation-errors 2017-05-19 14:20:05 -07:00
Fabiano Franz da85262f70 Adds support to a tree hierarchy of kubectl plugins 2017-05-19 18:06:15 -03:00
emaildanwilson 2cef454fd3 fed cluster selector integration test
updates from review
2017-05-19 13:47:52 -07:00
Kubernetes Submit Queue 8d0cce3f91 Merge pull request #46052 from davidewatson/spelling
Automatic merge from submit-queue

Correct spelling in comment

**What this PR does / why we need it**:

Corrects two misspelled names in a comment.

**Which issue this PR fixes**:

N/A

**Special notes for your reviewer**:

NONE

**Release note**:

NONE
2017-05-19 13:42:31 -07:00
Alexander Campbell 4055425e62 Merge branch 'master' into correct-deprecation-errors 2017-05-19 13:14:56 -07:00
Alexander Campbell 213f3c7e6e Merge branch 'master' into correct-deprecation-errors 2017-05-19 12:58:42 -07:00
Wojciech Tyczynski a3da8d7300 Fix naming and comments in kube-proxy. 2017-05-19 21:34:05 +02:00
Anthony Yeh 8594af7676
Update CHANGELOG.md for v1.6.4. 2017-05-19 12:30:25 -07:00
Jago Macleod 0cab203fbd Added deprecation notice and guidance for cloud providers. 2017-05-19 12:17:09 -07:00
Kubernetes Submit Queue effca9605b Merge pull request #46068 from zjj2wry/ccc
Automatic merge from submit-queue

fix changelog link not work

**What this PR does / why we need it**:
i read other pr has fix it, but now reappear. how can i got generate tool ?

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-05-19 11:52:18 -07:00
Kubernetes Submit Queue d3aa925c01 Merge pull request #46038 from dnardo/ip-masq-agent
Automatic merge from submit-queue (batch tested with PRs 44606, 46038)

Add ip-masq-agent addon to the addons folder. 

This also ensures that under gce we add this DaemonSet if the non-masq-cidr
is set to 0/0.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
Add ip-masq-agent addon to the addons folder which is used in GCE if  --non-masquerade-cidr is set to 0/0
```
2017-05-19 11:52:09 -07:00
Kubernetes Submit Queue 69ce7d8475 Merge pull request #44606 from ivan4th/fix-serialization-of-enforce-node-allocatable
Automatic merge from submit-queue (batch tested with PRs 44606, 46038)

Fix serialization of EnforceNodeAllocatable

EnforceNodeAllocatable being `nil` and `[]` are treated in different
ways by kubelet. Namely, `nil` is replaced with `[]string{"pods"}` by
the defaulting mechanism.

E.g. if you run kubelet in Docker-in-Docker environment
you may need to run it with the following options:
`--cgroups-per-qos=false --enforce-node-allocatable=`
(this corresponds to EnforceNodeAllocatable being empty array and not
null) If you then grab kubelet configuration via /configz and try to
reuse it for dynamic kubelet config, kubelet will think that
EnforceNodeAllocatable is null, failing to run in the
Docker-in-Docker environment.

Encountered this while updating Virtlet for Kubernetes 1.6
(the dev environment is based on kubeadm-dind-cluster)
2017-05-19 11:52:03 -07:00
Random-Liu 4935e119da Fix kuberuntime GetPods. 2017-05-19 11:47:45 -07:00
Wojciech Tyczynski 7d44f83441 Descrese logs verbosity for iptables 2017-05-19 20:44:26 +02:00
Wojciech Tyczynski e3bb755270 Reuse buffers for generated iptables rules 2017-05-19 20:44:26 +02:00
Wojciech Tyczynski 4d29c8608f Avoid strings.Join which is expensive 2017-05-19 20:44:25 +02:00
Wojciech Tyczynski 5464c39333 Reuse buffer for getting iptables contents 2017-05-19 20:44:25 +02:00
Wojciech Tyczynski bcfae7e1ed Extend Iptables interface with SaveInto 2017-05-19 20:44:25 +02:00
Wojciech Tyczynski 028ac8034b Remove SaveAll from iptables interface 2017-05-19 20:44:25 +02:00
Andy Goldstein f51c2c445c Restore kube-proxy --version 2017-05-19 14:40:35 -04:00
Alexander Campbell ca899ec048 kubectl: write unit test for deprecatedAlias() 2017-05-19 11:25:11 -07:00
Kubernetes Submit Queue 65f5bff1df Merge pull request #46104 from liggitt/node-admission
Automatic merge from submit-queue (batch tested with PRs 46028, 46104)

Use name from node object on create

GetName() isn't populated in admission attributes on create unless the rest storage is a NamedCreator (which only specific subresources are today)

Fixes #46085
2017-05-19 10:58:07 -07:00
Kubernetes Submit Queue ea828d05b8 Merge pull request #46028 from humblec/iscsi-describe
Automatic merge from submit-queue (batch tested with PRs 46028, 46104)

Add missing parameters of iscsi volume source to describe printer.
2017-05-19 10:58:04 -07:00
Alexander Campbell 2f0faedbee kubectl: make "Deprecated" a private function
There's no reason to export this function, so I've made it private.
2017-05-19 10:45:49 -07:00
Alexander Campbell bf2fc62144 kubectl: renmae deprecatedCmd -> deprecatedAlias
This will clear up some of the confusion around the deprecatedCmd /
Deprecated function.
2017-05-19 10:40:21 -07:00
Kubernetes Submit Queue 4d89212d26 Merge pull request #44898 from xingzhou/kube-44697
Automatic merge from submit-queue (batch tested with PRs 45908, 44898)

While calculating pod's cpu limits, need to count in init-container.

Need to count in init-container when calculating a pod's cpu limits.
Otherwise, may cause pod start failure due to "invalid argument"
error while trying to write "cpu.cfs_quota_us" file.

Fixed #44697 

Release note:
```
NONE
```
2017-05-19 09:39:04 -07:00
Jeremy Whitlock 1b59dd887d add "admission" API group
This commit is an initial pass at providing an admission API group.
The API group is required by the webhook admission controller being
developed as part of https://github.com/kubernetes/community/pull/132
and could be used more as that proposal comes to fruition.
2017-05-19 10:17:37 -06:00
Kubernetes Submit Queue 9a5694b4c4 Merge pull request #45908 from ncdc/kube-proxy-write-config
Automatic merge from submit-queue

kube-proxy: add --write-config-to flag

Add --write-config-to flag to kube-proxy to write the default configuration
values to the specified file location.

@deads2k suggested I create my own scheme for this, so I followed the example he shared with me. The only bit currently still referring to `api.Scheme` is where we create the event broadcaster recorder. In order to use the custom private scheme, I either have to pass it in to `NewProxyServer()`, or I have to make `NewProxyServer()` a member of the `Options` struct. If the former, then I probably need to export `Options.scheme`. Thoughts?

cc @mikedanese @sttts @liggitt @deads2k @smarterclayton @timothysc @kubernetes/sig-network-pr-reviews @kubernetes/sig-api-machinery-pr-reviews 

```release-note
Add --write-config-to flag to kube-proxy to allow users to write the default configuration settings to a file.
```
2017-05-19 09:01:04 -07:00
Alexander Campbell 699a3e20ff .generated_docs: update to include "new" commands 2017-05-19 08:56:53 -07:00
Alexander Campbell acfdafb1fb Merge branch 'master' into correct-deprecation-errors 2017-05-19 08:55:17 -07:00