Automatic merge from submit-queue
Fix Multizone pv creation on GCE
When Multizone is enabled static PV creation on GCE
fails because Cloud provider configuration is not
available in admission plugins.
cc @derekwaynecarr @childsb
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
We are more liberal in what we accept as a volume id in k8s, and indeed
we ourselves generate names that look like `aws://<zone>/<id>` for
dynamic volumes.
This volume id (hereafter a KubernetesVolumeID) cannot directly be
compared to an AWS volume ID (hereafter an awsVolumeID).
We introduce types for each, to prevent accidental comparison or
confusion.
Issue #35746
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.
This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If the volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, reconciler will take actions if needed.
To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.
More details are explained in PR #33760
We had another bug where we confused the hostname with the NodeName.
To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.
A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName
Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
We had a long-lasting bug which prevented creation of volumes in
non-master zones, because the cloudprovider in the volume label
admission controller is not initialized with the multizone setting
(issue #27656).
This implements a simple workaround: if the volume is created with the
failure-domain zone label, we look for the volume in that zone. This is
more efficient, avoids introducing a new semantic, and allows users (and
the dynamic provisioner) to create volumes in non-master zones.
Fixes#27657
This is a first-aid bandage to let admission controller ignore persistent
volumes that are being provisioned right now and thus may not exist in
external cloud infrastructure yet.
We are (sadly) using a copy-and-paste of the GCE PD code for AWS EBS.
This code hasn't been updated in a while, and it seems that the GCE code
has some code to make volume mounting more robust that we should copy.
For AWS EBS, a volume can only be attached to a node in the same AZ.
The scheduler must therefore detect if a volume is being attached to a
pod, and ensure that the pod is scheduled on a node in the same AZ as
the volume.
So that the scheduler need not query the cloud provider every time, and
to support decoupled operation (e.g. bare metal) we tag the volume with
our placement labels. This is done automatically by means of an
admission controller on AWS when a PersistentVolume is created backed by
an EBS volume.
Support for tagging GCE PVs will follow.
Pods that specify a volume directly (i.e. without using a
PersistentVolumeClaim) will not currently be scheduled correctly (i.e.
they will be scheduled without zone-awareness).