Automatic merge from submit-queue (batch tested with PRs 45283, 45289, 45248, 44295)
Azure disk: dealing with missing disk probe
**What this PR does / why we need it**:
While Azure disks are expected to attach to SCSI host 3 and above on general purpose instances, on certain Azure instances disks are under SCSI host 2.
This fix searches all LUNs but excludes those used by Azure sys disks, based on udev rules [here](https://raw.githubusercontent.com/Azure/WALinuxAgent/master/config/66-azure-storage.rules)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
azure disk: add logging on disk attach
**What this PR does / why we need it**:
While we were debugging a failed azure disk attach, we were missing logging information to identify the root cause. This fix logs information at each stage of attach to help identify where problem is once it happens again.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
NONE
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
Automatic merge from submit-queue (batch tested with PRs 39826, 40030)
azure disk: restrict length of name
**What this PR does / why we need it**:
Fixes dynamic disk provisioning on Azure by properly truncating the disk name to conform to the Azure API spec.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
n/a
**Special notes for your reviewer**:
n/a
**Release note**:
```release-note
azure disk: restrict name length for Azure specifications
```
cc: @rootfs
Automatic merge from submit-queue
Curating Owners: pkg/volume
cc @jsafrane @spothanis @agonzalezro @justinsb @johscheuer @simonswine @nelcy @pmorie @quofelix @sdminonne @thockin @saad-ali @rootfs
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the
future in the **reviewers** section. You probably do NOT need to modify
the **approvers** section. Names asre sorted by relevance, using some
secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
This PR is to fix the issue in converting aws volume id from mount
paths. Currently there are three aws volume id formats supported. The
following lists example of those three formats and their corresponding
global mount paths:
1. aws:///vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/vol-123456)
2. aws://us-east-1/vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
3. vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
For the first two cases, we need to check the mount path and convert
them back to the original format.
Automatic merge from submit-queue
Remove unused WaitForDetach from Detacher interface and plugins
See issue #33128 and PR #33270
We can't rely on the device name provided by OpenStack Cinder, and thus
must perform detection based on the drive serial number (aka It's cinder ID)
on the kubelet itself.
This needs to be removed now, as part of #33128, as the code can't be
updated to attempt device detection and fallback through to the Cinder
provided deviceName, as detection "fails" when the device is gone, and
if cinder has reported a deviceName that another volume has used in
relaity, then this will block forever (or until the other, unreleated,
volume has been detached)
This has been unused since 542f2dc7, and relies on deviceName, which
can no longer be relied upon (see issue #33128).
This needs to be removed now, as part of #33128, as the code can't be
updated to attempt device detection and fallback through to the Cinder
provided deviceName, as detection "fails" when the device is gone, and
if cinder has reported a deviceName that another volume has used in
relaity, then this will block forever (or until the other, unreleated,
volume has been detached)
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.
This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If the volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, reconciler will take actions if needed.
To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.
More details are explained in PR #33760
We had another bug where we confused the hostname with the NodeName.
To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.
A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName
Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.