Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies
**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm separating this effort into two separate PRs, since updating rules_go requires a large cleanup, removing an attribute from most build rules.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add node shutdown taint
**What this PR does / why we need it**: we need node stopped taint in order to detach volumes immediately without waiting timeout. More info in issue ticket #58635
**Which issue(s) this PR fixes**
Fixes#58635
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
PanFengyun <pan_feng_yun@163.com>'s previous github id was @FengyunPan
Due to some problem with github, he lost access to @FengyunPan and
is not using @FengyunPan2. So let's switch over to the new id. Github
has promised to release the previous id back in 6 months, so we may
have to switch it back later.
shutdowned -> stopped
use shutdown everywhere
use patch in taints api call
use notimplemented in clouds use AddOrUpdateTaintOnNode
correct log text
add fake cloud
try to fix bazel
add shutdown tests
add context
This adds context to all the relevant cloud provider interface signatures.
Callers of those APIs are currently satisfied using context.TODO().
There will be follow on PRs to push the context through the stack.
For an idea of the full scope of this change please look at PR #58532.
Since we are transitioning to external cloud provider, we need a way
to use the existing cinder volume plugin (from kubelet). With external
cloud manager kubelet will be run with --cloud=provider=external and
no --cloud-config file will be in the command line. So we need a way
to load the openstack config file from somewhere.
Taking a cue from kubeadm, which currently is picking up "/etc/kubernetes/cloud-config"
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/controlplane/manifests.go#L44
let's support the scenario where we fall back to this static location if
there is no cloud provider specified in the command line.
This has been tested with local-up-cluster using the following params:
EXTERNAL_CLOUD_PROVIDER=true
CLOUD_PROVIDER=openstack
CLOUD_CONFIG=/etc/kubernetes/cloud-config
Automatic merge from submit-queue (batch tested with PRs 55925, 55999, 55944, 55992, 56196). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
probeAttachedVolume improvement in Cinder
What this PR does / why we need it:
During OpenStack Cinder volume attachment, kubelet creates too many udev "change" events.
If volume attachement takes long, then udev event queue gets full,therefore
udev does not answer requests from docker, and makes Kubernetes node unhealthy.
This patch will extend the udevadm trigger period,
and use the udevadm settle to ensure that any device nodes
have been created successfully before proceeding.
Fixes#55007
Release note:
```release-note
None
```
/cc @dims @kabakaev
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Rename Detach() parameter.
`Detach()` does not get device name, it gets volume name. Parameters named `deviceMountPath` or `deviceName` just confuses developers.
Note that this PR just renames parameters here and there, there should be no behavior change.
@kubernetes/sig-storage-pr-reviews
/assign @gnufied @jingxu97
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update volume OWNERS to reflect active sig-storage reviewers
**What this PR does / why we need it**:
Update sig-storage reviewers to add new members and remove those that don't have as much time to review storage PRs. Approvers are unchanged.
**Special notes for your reviewer**:
For all those that have been removed, please approve. If you want to remain as a reviewer, let me know and I will add you back.
**Release note**:
NONE
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix incorrect log
**What this PR does / why we need it**:
fix incorrect log in nfs_test.go
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
NONE
This change is prerequisite for implementing iSCSI attacher
and detacher.
In order to use chap authentication at iSCSI plugin after
implementing attacher and detacher, secret is needed at
AttachDisk() which is called from WaitForAttach().
To obtain secret, pod information is required, but
WaitForAttach() doesn't pass pod information inside.
This patch adds 'pod' as an argument of WaitForAttach()
and adds changes to drivers who implements WaitForAttach().
Fixes#48953
Automatic merge from submit-queue (batch tested with PRs 51174, 51363, 51087, 51382, 51388)
Add InstanceExistsByProviderID to cloud provider interface for CCM
**What this PR does / why we need it**:
Currently, [`MonitorNode()`](02b520f0a4/pkg/controller/cloud/nodecontroller.go (L240)) in the node controller checks with the CCM if a node still exists by calling `ExternalID(nodeName)`. `ExternalID` is supposed to return the provider id of a node which is not supported on every cloud. This means that any clouds who cannot infer the provider id by the node name from a remote location will never remove nodes that no longer exist.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50985
**Special notes for your reviewer**:
We'll want to create a subsequent issue to track the implementation of these two new methods in the cloud providers.
**Release note**:
```release-note
Adds `InstanceExists` and `InstanceExistsByProviderID` to cloud provider interface for the cloud controller manager
```
/cc @wlan0 @thockin @andrewsykim @luxas @jhorwit2
/area cloudprovider
/sig cluster-lifecycle
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)
Dynamic Flexvolume plugin discovery, probing with filesystem watch.
**What this PR does / why we need it**: Enables dynamic Flexvolume plugin discovery. This model uses a filesystem watch (fsnotify library), which notifies the system that a probe is necessary only if something changes in the Flexvolume plugin directory.
This PR uses the dependency injection model in https://github.com/kubernetes/kubernetes/pull/49668.
**Release Note**:
```release-note
Dynamic Flexvolume plugin discovery. Flexvolume plugins can now be discovered on the fly rather than only at system initialization time.
```
/sig-storage
/assign @jsafrane @saad-ali
/cc @bassam @chakri-nelluri @kokhang @liggitt @thockin
Automatic merge from submit-queue (batch tested with PRs 51235, 50819, 51274, 50972, 50504)
Clean cinder detachlogerror
**What this PR does / why we need it**:
**Which issue this PR fixes** : fixes#50441
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 38947, 50239, 51115, 51094, 51116)
Mark the volumes as detached when node does not exist
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and do not return err.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#50200
**Release note**:
```release-note
NONE
```
If node doesn't exist, OpenStack Nova will assume the volumes
are not attached to it. So mark the volumes as detached and
return false without error.
Fix: #50200
Automatic merge from submit-queue
Remove repeated reviewer's names
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Tune Cinder approvers
I don't want to be single approver for cinder PRs, @anguslees is OpenStack maintainer and should be able to help with Cinder.
Any other volunteers from @kubernetes/sig-storage-pr-reviews or @k8s-sig-openstack-pr-reviews?
Note: @justinsb **is** still reviewer, he was just listed twice.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45345, 49470, 49407, 49448, 49486)
Support "fstype" parameter in dynamically provisioned PVs
This PR is a replacement for https://github.com/kubernetes/kubernetes/pull/40805. I was not able to push fixes and rebases to the original branch as I don't have access to the Github organization anymore.
I assume the PR will need a new "ok to test"
**ORIGINAL PR DESCRIPTION**
**What this PR does / why we need it**: This PR allows specifying the desired FSType when dynamically provisioning volumes with storage classes. The FSType can now be set as a parameter:
```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: test
provisioner: kubernetes.io/azure-disk
parameters:
fstype: xfs
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#37801
**Special notes for your reviewer**:
The PR also implicitly adds checks for unsupported parameters.
**Release note**:
```release-note
Support specifying of FSType in StorageClass
```
When volume's status is 'detaching', controllermanager will detach
it again and return err. It is necessary to wait for detach
operation to complete within the alloted time.
When volume's status is 'attaching', its attachments will be None,
controllermanager can't get device path and make some failed event.
But it is normal, let's fix it.
Automatic merge from submit-queue
Statefulsets for cinder: allow multi-AZ deployments, spread pods across zones
**What this PR does / why we need it**: Currently if we do not specify availability zone in cinder storageclass, the cinder is provisioned to zone called nova. However, like mentioned in issue, we have situation that we want spread statefulset across 3 different zones. Currently this is not possible with statefulsets and cinder storageclass. In this new solution, if we leave it empty the algorithm will choose the zone for the cinder drive similar style like in aws and gce storageclass solutions.
**Which issue this PR fixes** fixes#44735
**Special notes for your reviewer**:
example:
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: all
provisioner: kubernetes.io/cinder
---
apiVersion: v1
kind: Service
metadata:
annotations:
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
name: galera
labels:
app: mysql
spec:
ports:
- port: 3306
name: mysql
clusterIP: None
selector:
app: mysql
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: mysql
spec:
serviceName: "galera"
replicas: 3
template:
metadata:
labels:
app: mysql
annotations:
pod.alpha.kubernetes.io/initialized: "true"
spec:
containers:
- name: mysql
image: adfinissygroup/k8s-mariadb-galera-centos:v002
imagePullPolicy: Always
ports:
- containerPort: 3306
name: mysql
- containerPort: 4444
name: sst
- containerPort: 4567
name: replication
- containerPort: 4568
name: ist
volumeMounts:
- name: storage
mountPath: /data
readinessProbe:
exec:
command:
- /usr/share/container-scripts/mysql/readiness-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
volumeClaimTemplates:
- metadata:
name: storage
annotations:
volume.beta.kubernetes.io/storage-class: all
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 12Gi
```
If this example is deployed it will automatically create one replica per AZ. This helps us a lot making HA databases.
Current storageclass for cinder is not perfect in case of statefulsets. Lets assume that cinder storageclass is defined to be in zone called nova, but because labels are not added to pv - pods can be started in any zone. The problem is that at least in our openstack it is not possible to use cinder drive located in zone x from zone y. However, should we have possibility to choose between cross-zone cinder mounts or not? Imo it is not good way of doing things that they mount volume from another zone where the pod is located(means more network traffic between zones)? What you think? Current new solution does not allow that anymore (should we have possibility to allow it? it means removing the labels from pv).
There might be some things that needs to be fixed still in this release and I need help for that. Some parts of the code is not perfect.
Issues what i am thinking about (I need some help for these):
1) Can everybody see in openstack what AZ their servers are? Can there be like access policy that do not show that? If AZ is not found from server specs, I have no idea how the code behaves.
2) In GetAllZones() function, is it really needed to make new serviceclient using openstack.NewComputeV2 or could I somehow use existing one
3) This fetches all servers from some openstack tenant(project). However, in some cases kubernetes is maybe deployed only to specific zone. If kube servers are located for instance in zone 1, and then there are another servers in same tenant in zone 2. There might be usecase that cinder drive is provisioned to zone-2 but it cannot start pod, because kubernetes does not have any nodes in zone-2. Could we have better way to fetch kubernetes nodes zones? Currently that information is not added to kubernetes node labels automatically in openstack (which should I think). I have added those labels manually to nodes. If that zone information is not added to nodes, the new solution does not start stateful pods at all, because it cannot target pods.
cc @rootfs @anguslees @jsafrane
```release-note
Default behaviour in cinder storageclass is changed. If availability is not specified, the zone is chosen by algorithm. It makes possible to spread stateful pods across many zones.
```
The cloudprovider is being refactored out of kubernetes core. This is being
done by moving all the cloud-specific calls from kube-apiserver, kubelet and
kube-controller-manager into a separately maintained binary(by vendors) called
cloud-controller-manager. The Kubelet relies on the cloudprovider to detect information
about the node that it is running on. Some of the cloudproviders worked by
querying local information to obtain this information. In the new world of things,
local information cannot be relied on, since cloud-controller-manager will not
run on every node. Only one active instance of it will be run in the cluster.
Today, all calls to the cloudprovider are based on the nodename. Nodenames are
unqiue within the kubernetes cluster, but generally not unique within the cloud.
This model of addressing nodes by nodename will not work in the future because
local services cannot be queried to uniquely identify a node in the cloud. Therefore,
I propose that we perform all cloudprovider calls based on ProviderID. This ID is
a unique identifier for identifying a node on an external database (such as
the instanceID in aws cloud).
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
Automatic merge from submit-queue (batch tested with PRs 41246, 39998)
Cinder volume attacher: use instanceID instead of NodeID when verifying attachment
**What this PR does / why we need it**: Cinder volume attacher incorrectly uses NodeID instead of openstack instance id, so that reconciliation fails.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#39978
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 38772, 38797, 40732, 40740)
Synchronous spellcheck for pkg/volume/*
**What this PR does / why we need it**: Increase code readability
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: Minor contribution
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40299, 40311)
move authoritative client-go util out of pkg
Move `client-go/pkg/util` which are authoritative to `client-go/util` to make it easier to reason about what comes from where.
Automatic merge from submit-queue
Adding rescan scsi controller for cinder
For lsilogic scsi controller, attached cinder volume does not
appear under /dev/ automatically unless do a rescan.
This approach was used in vSphere volume provider before PR #27496
dropped support for lsilogic scsi controller.
Automatic merge from submit-queue
Curating Owners: pkg/volume
cc @jsafrane @spothanis @agonzalezro @justinsb @johscheuer @simonswine @nelcy @pmorie @quofelix @sdminonne @thockin @saad-ali @rootfs
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the
future in the **reviewers** section. You probably do NOT need to modify
the **approvers** section. Names asre sorted by relevance, using some
secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
Automatic merge from submit-queue (batch tested with PRs 39152, 39142, 39055)
openstack: Forcibly detach an attached cinder volume before attaching elsewhere
Fixes#33288
**What this PR does / why we need it**:
Without this fix, we can't preemptively reschedule pods with persistent volumes to other hosts (for rebalancing or hardware failure recovery).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#33288
**Special notes for your reviewer**:
(This is a resurrection/cleanup of PR #33734, originally authored by @Rotwang)
**Release note**: