Automatic merge from submit-queue
cinder: Add support for the KVM virtio-scsi driver
**What this PR does / why we need it**:
The VirtIO SCSI driver for KVM changes the way disks appear in /dev/disk/by-id.
This adds support for the new format.
Without this, volume attaching on an OpenStack cluster that uses this KVM driver does not work.
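To illustrate the idea of probing more than one naming scheme under /dev/disk/by-id, here is a minimal sketch. The function name `candidateDevicePaths`, the two name patterns, and the 20-character truncation of the volume ID are assumptions made for illustration; they are not taken from this PR's diff.

```go
package main

import (
	"fmt"
	"path"
)

// candidateDevicePaths returns the /dev/disk/by-id entries worth probing for
// a Cinder volume. The name patterns and the 20-character truncation are
// assumptions for illustration, not copied from the PR.
func candidateDevicePaths(volumeID string) []string {
	short := volumeID
	if len(short) > 20 {
		short = short[:20]
	}
	names := []string{
		"virtio-" + short,                   // assumed virtio-blk naming
		"scsi-0QEMU_QEMU_HARDDISK_" + short, // assumed virtio-scsi naming
	}
	paths := make([]string, 0, len(names))
	for _, n := range names {
		paths = append(paths, path.Join("/dev/disk/by-id", n))
	}
	return paths
}

func main() {
	// The volume ID below is made up.
	for _, p := range candidateDevicePaths("0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9") {
		fmt.Println(p)
	}
}
```

A caller would check each candidate in order and use the first one that exists on the node.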
**Special notes for your reviewer**:
Does this need e2e tests? I couldn't find anywhere to add another openstack configuration used in the e2e tests.
Wiki page about this: https://wiki.openstack.org/wiki/Virtio-scsi-for-bdm
**Release note**:
```release-note
cinder: Add support for the KVM virtio-scsi driver
```
Automatic merge from submit-queue
Update kubelet to use the network-plugin-dir if the cni-bin-dir flag is not set.
This is a fix for the regression raised in issue #44683.
Automatic merge from submit-queue
Mark Stackdriver Logging e2e tests with a feature
Makes the Stackdriver Logging e2e tests, except for the most basic one, run in separate test suites prepared by https://github.com/kubernetes/test-infra/pull/2542
Automatic merge from submit-queue
De-Flake Volume E2E: force GCEPD detach to prevent timeout
**What this PR does / why we need it**:
Fix flake `[k8s.io] Volumes [Volume] [k8s.io] PD should be mountable [Flaky] 5m38s`.
Flake occurs as a result of an automated detach taking longer than 5 minutes, which exceeds the timeout limit of the delete function.
This PR adds explicit detach and wait func calls before the deletion. By forcing the detach and giving GCE an appropriate timeout limit, this should squash the timeout flake. This also significantly shortens cleanup time.
This PR does not remove the [Flaky] tag. Once this PR is merged, I'll keep an eye on the test grid for ~1 week. If no flakes surface, I'll submit a PR to pull the tag off.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43977
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44722, 44704, 44681, 44494, 39732)
Fix issue #34242: Attach/detach should recover from a crash
When the attach/detach controller crashes and a pod with an attached PV is deleted afterwards, the controller never detaches the pod's attached volumes. To prevent this, the controller should try to recover its state from the node status and figure out which volumes to detach. This requires some changes in the volume providers too: the only information available from the nodes is the volume name and the device path, so the controller needs to find the correct volume plugin and reconstruct the volume spec from the name alone. This also required a small change in the volume plugin interface.
Fixes Issue #34242.
cc: @jsafrane @jingxu97
Automatic merge from submit-queue (batch tested with PRs 44722, 44704, 44681, 44494, 39732)
Don't rebuild endpoints map in iptables kube-proxy all the time.
@thockin - I think this PR should help with yours, https://github.com/kubernetes/kubernetes/pull/41030: besides the performance improvements, it clearly defines when an update is needed because of endpoints changes. If we do the same for services (I'm happy to help with that), I think it should be much simpler.
Please take a look and check whether it makes sense from your perspective too.
Automatic merge from submit-queue (batch tested with PRs 44722, 44704, 44681, 44494, 39732)
prevent installation of docker from upstream
**What this PR does / why we need it**: Disallows installation of upstream docker from PPA in the Juju kubernetes-worker charm.
**Release note**:
```release-note
Disallows installation of upstream docker from PPA in the Juju kubernetes-worker charm.
```
Automatic merge from submit-queue (batch tested with PRs 44594, 44651)
Remove strings.Compare(), use native string operators
I noticed that some code uses strings.Compare(); we can remove it and use the native comparison operators instead.
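As a minimal sketch of the kind of replacement meant here (the helper names are hypothetical and only serve to put the two forms side by side):

```go
package main

import (
	"fmt"
	"strings"
)

// equalWithCompare shows the pattern being removed.
func equalWithCompare(a, b string) bool {
	return strings.Compare(a, b) == 0
}

// equalNative is the equivalent check with the built-in operator.
func equalNative(a, b string) bool {
	return a == b
}

// lessNative replaces strings.Compare(a, b) < 0.
func lessNative(a, b string) bool {
	return a < b
}

func main() {
	fmt.Println(equalWithCompare("kube", "kube"), equalNative("kube", "kube"))
	fmt.Println(strings.Compare("a", "b") < 0, lessNative("a", "b"))
}
```

The built-in operators give the same results and read more directly, which is also what the Go documentation for `strings.Compare` recommends.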
Automatic merge from submit-queue
Delete deprecated node phase in kubectl describe node.
**What this PR does / why we need it**:
Since NodePhase is no longer used, delete it from the `kubectl describe node` output.
**Special notes for your reviewer**:
ref: https://github.com/kubernetes/kubernetes/pull/44388
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42177, 42176, 44721)
Removed fluentd-gcp manifest pod
```release-note
Fluentd manifest pod is no longer created on non-registered master when creating clusters using kube-up.sh.
```
Automatic merge from submit-queue (batch tested with PRs 42177, 42176, 44721)
Job: Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings Job into full compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
This ensures that Job does not fight with other controllers over control of Pods.
Ref: #24433
**Release note**:
```release-note
Job controller now respects ControllerRef to avoid fighting over Pods.
```
cc @erictune @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42177, 42176, 44721)
CronJob: Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings CronJob into compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
This ensures that other controllers do not fight over control of objects that a CronJob owns.
**Release note**:
```release-note
CronJob controller now respects ControllerRef to avoid fighting with other controllers.
```
cc @erictune @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 44555, 44238)
openstack: remove field flavor_to_resource
I believe `flavor_to_resource` is not used anywhere, and I don't think there is any need to build that information either.
cc @anguslees
**Release note:**
```
NONE
```
Automatic merge from submit-queue
Remove the old docker-multinode files that were built into the hyperkube image
**What this PR does / why we need it**:
ref: https://goo.gl/VxSaKx
**Release note**:
```release-note
The hyperkube image has been slimmed down and no longer includes addon manifests and other various scripts. These were introduced for the now removed docker-multinode setup system.
```
cc @jbeda @brendandburns @bgrant0607 @justinsb @mikedanese
Automatic merge from submit-queue
Switch to pointer to policy rule, visit and short circuit during authorization
Ref #40015
* Switches policy rule helper methods to work with pointers
* Switches authorization to use a short-circuiting visitor
Best-case, authorization short-circuits early and avoids accumulating rules it never needs to check
Worst-case (a forbidden request), it still checks all the applicable rules, but requires less allocation to do so
$ go test ./plugin/pkg/auth/authorizer/rbac/... -bench=. -benchmem -run Bench
on master:
```
BenchmarkAuthorize/allow_list_pods-8 300000 4373 ns/op 3840 B/op 26 allocs/op
BenchmarkAuthorize/allow_update_pods/status-8 300000 5121 ns/op 3840 B/op 26 allocs/op
BenchmarkAuthorize/forbid_educate_dolphins-8 300000 4706 ns/op 3840 B/op 26 allocs/op
```
with short-circuiting and policy rule pointer changes:
```
BenchmarkAuthorize/allow_list_pods-8 2000000 930 ns/op 64 B/op 2 allocs/op
BenchmarkAuthorize/allow_update_pods/status-8 1000000 1656 ns/op 64 B/op 2 allocs/op
BenchmarkAuthorize/forbid_educate_dolphins-8 500000 3395 ns/op 1488 B/op 25 allocs/op
```
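The short-circuiting visitor described above can be sketched roughly as follows. `PolicyRule` here is a simplified, hypothetical stand-in and `visitRules`/`authorize` are illustrative names, not the real RBAC types or helpers.

```go
package main

import "fmt"

// PolicyRule is a simplified, hypothetical stand-in for an RBAC policy rule.
type PolicyRule struct {
	Verbs     []string
	Resources []string
}

// contains reports whether list holds s or the wildcard "*".
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == "*" || v == s {
			return true
		}
	}
	return false
}

// allows works on a pointer so that rules are not copied on every check.
func (r *PolicyRule) allows(verb, resource string) bool {
	return contains(r.Verbs, verb) && contains(r.Resources, resource)
}

// visitRules hands each rule to visit by pointer and stops as soon as
// visit returns false.
func visitRules(rules []PolicyRule, visit func(*PolicyRule) bool) {
	for i := range rules {
		if !visit(&rules[i]) {
			return
		}
	}
}

// authorize returns whether the request is allowed and how many rules
// were inspected before a decision was reached.
func authorize(rules []PolicyRule, verb, resource string) (allowed bool, visited int) {
	visitRules(rules, func(r *PolicyRule) bool {
		visited++
		if r.allows(verb, resource) {
			allowed = true
			return false // short-circuit: no need to look at further rules
		}
		return true
	})
	return allowed, visited
}

func main() {
	rules := []PolicyRule{
		{Verbs: []string{"list"}, Resources: []string{"pods"}},
		{Verbs: []string{"update"}, Resources: []string{"pods/status"}},
	}
	fmt.Println(authorize(rules, "list", "pods"))
	fmt.Println(authorize(rules, "educate", "dolphins"))
}
```

An allowed request stops at the first matching rule, while a forbidden one still walks every rule, which matches the best-case and worst-case behavior described above.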
Automatic merge from submit-queue
Refactor the reorganize-taints function in kubectl to expose operations
**What this PR does / why we need it**:
This adds some UX functionality when specifying taints using kubectl.
For example:
```
./kubectl.sh taint nodes XYZ dedicated1=abca2:NoSchedule
node "XYZ" tainted
./kubectl.sh taint nodes XYZ dedicated1=abca1:NoSchedule --overwrite=True
node "XYZ" overwritten
./kubectl.sh taint nodes XYZ dedicated1-
node "XYZ" untainted
./kubectl.sh taint nodes XYZ dedicated=abca1:NoSchedule dedicated1-
node "XYZ" modified
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43167
**Release note**:
```
Fixed the output of kubectl taint node command with minor improvements.
```
Automatic merge from submit-queue
adding test for volume fstype validation
**What this PR does / why we need it**:
This PR adds a test for volume fstype validation. The test verifies that the fstype specified in the StorageClass is honored after volume creation.
Steps:
1. Create StorageClass with fstype set to valid type (default case included).
2. Create PVC which uses the StorageClass created in step 1.
3. Wait for PV to be provisioned.
4. Wait for PVC's status to become Bound.
5. Create pod using PVC on specific node.
6. Wait for Disk to be attached to the node.
7. Execute command in the pod to get fstype.
8. Delete pod and Wait for Volume Disk to be detached from the Node.
9. Delete PVC, PV and Storage Class.
**Release note**:
```release-note
None
```
cc: @jeffvance @tusharnt
Automatic merge from submit-queue (batch tested with PRs 42272, 44696)
e2e test fix: Wait longer when first creating ELB
On any cloud (GCE or AWS), a lag between creating the LoadBalancer and
having it actually start serving traffic is expected. On AWS the lag is
larger, and we weren't correctly using the longer wait on our first
request.
Use a longer wait period on our first request.
Fix #44695
```release-note
NONE
```
Automatic merge from submit-queue
apiserver: Update genericapiserver to panic on listener error
Previously runServer would try to listen again if a listener error occurred. This commit changes the response to a panic to allow a process manager (systemd/kubelet/etc) to react to the failure.
**Release note**:
```release-note
The Kubernetes API server now exits if it encounters a networking failure (e.g. the networking interface hosting its address goes away) to allow a process manager (systemd/kubelet/etc) to react to the problem. Previously the server would log the failure and try again to bind to its configured address:port.
```
cc: @liggitt @sttts @deads2k @derekwaynecarr
Automatic merge from submit-queue
e2e: Prefer kubeconfig host to default
Previously it was necessary to pass ``-host`` to ``e2e.test`` even if ``-kubeconfig`` was specified since otherwise a localhost default would be used. This change ensures that the default is only used when kubeconfig is not set.
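The intended precedence can be sketched as below. `resolveHost` and `defaultHost` are hypothetical names, and the default address is an assumed placeholder rather than the value used by `e2e.test`.

```go
package main

import "fmt"

// defaultHost is an assumed placeholder for the localhost default.
const defaultHost = "http://127.0.0.1:8080"

// resolveHost decides which host value to pass on:
//   - an explicit host flag always wins;
//   - with a kubeconfig and no host flag, the host stays empty so that the
//     server entry from the kubeconfig applies;
//   - only when neither is set does the localhost default apply.
func resolveHost(hostFlag, kubeconfigPath string) string {
	switch {
	case hostFlag != "":
		return hostFlag
	case kubeconfigPath != "":
		return ""
	default:
		return defaultHost
	}
}

func main() {
	fmt.Printf("%q\n", resolveHost("", "/home/user/.kube/config"))
	fmt.Printf("%q\n", resolveHost("", ""))
	fmt.Printf("%q\n", resolveHost("https://10.0.0.1:6443", "/home/user/.kube/config"))
}
```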
cc: @jayunit100
Automatic merge from submit-queue (batch tested with PRs 44687, 44689, 44661)
Fix panic when using `kubeadm init` with vsphere cloud-provider
**What this PR does / why we need it**:
Check if the reference is nil when finding machine reference by UUID.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #44603
**Special notes for your reviewer**:
This is just a quick fix for the panic.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44687, 44689, 44661)
Retry in get-kube.sh to avoid download flakes.
GCS has up to 2% 5xx rates, so retrying is critical.
This is currently failing about 8 times per day [according to the dashboard](https://storage.googleapis.com/k8s-gubernator/triage/index.html?test=Extract#be2f33fb1e6dd2389d12). It could be backported to reduce the flake rate.
Release note:
```release-note
NONE
```
Automatic merge from submit-queue
Add hack/lib to kubernetes release tarball
**What this PR does / why we need it**:
Add hack/lib to kubernetes release tarball, to fix an issue with https://get.k8s.io/ script introduced in #42748.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/pull/42748#issuecomment-295412268
**Special notes for your reviewer**:
I'm new to bazel, so hopefully I'm not off-base here :)
**Release note**:
```release-note
NONE
```
cc: @ixdy @dcbw @smarterclayton
Automatic merge from submit-queue
Implement LRU for AWS device allocator
On failure to attach, do not use the device from the pool.
In an AWS environment, when an attach fails on the node,
let's not use that device from the pool. This makes sure
that a bigger pool of devices is available.
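One way to picture a least-recently-used device allocator is the sketch below. The type and method names are hypothetical and the device names are made up; this is an illustration of the LRU idea, not the allocator from this PR.

```go
package main

import "fmt"

// deviceAllocator hands out device names in least-recently-used order.
// The front of free is the name that has gone unused the longest.
type deviceAllocator struct {
	free  []string
	inUse map[string]bool
}

func newDeviceAllocator(names []string) *deviceAllocator {
	free := make([]string, len(names))
	copy(free, names)
	return &deviceAllocator{free: free, inUse: map[string]bool{}}
}

// Allocate returns the least recently used free device name.
func (a *deviceAllocator) Allocate() (string, error) {
	if len(a.free) == 0 {
		return "", fmt.Errorf("no free device names")
	}
	name := a.free[0]
	a.free = a.free[1:]
	a.inUse[name] = true
	return name, nil
}

// Release puts a name at the back of the queue, so a device whose attach
// just failed is the last one to be tried again.
func (a *deviceAllocator) Release(name string) {
	if !a.inUse[name] {
		return
	}
	delete(a.inUse, name)
	a.free = append(a.free, name)
}

func main() {
	alloc := newDeviceAllocator([]string{"ba", "bb", "bc"})
	first, _ := alloc.Allocate()
	alloc.Release(first) // pretend the attach failed
	next, _ := alloc.Allocate()
	fmt.Println(first, next)
}
```

Because a released name goes to the back of the queue, a retry after a failed attach picks a different device instead of immediately reusing the one that just failed.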
Automatic merge from submit-queue
select one api endpoint at random when deploying kubernetes-core charm
**What this PR does / why we need it**: Fixes a bug in the kubernetes-worker Juju charm code that attempted to give kube-proxy more than one api endpoint.
**Special notes for your reviewer**: https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/255
**Release note**:
```release-note
Fixes a bug in the kubernetes-worker Juju charm code that attempted to give kube-proxy more than one api endpoint.
```