Automatic merge from submit-queue
Add SuccessfulMountVolume message to the events of pod
**What this PR does / why we need it:**
When creating a pod with volume, the volume mount may failed at first, but eventually succeed after retry several times. kubectl describe pod can only see the failed messages, so i think it will be better to add the SuccessfulMountVolume message to the pod events too.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#42867
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
move labels to components which own the APIs
During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery. They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet. This pull moves the labels into their owning components and out of API machinery.
@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers
@derekwaynecarr since most of these are related to the kubelet
Automatic merge from submit-queue
Log out from multiple target portals when using iscsi storage plugin
**What this PR does / why we need it**:
When using iscsi storage with multiple target portal (TP) addresses
and multipathing the volume manager logs on to the IQN for all
portal addresses, but when a pod gets destroyed the volume manager
only logs out for the primary TP and sessions for another TPs are
always remained.
This patch adds mount points for all TPs, and then log out from all
TPs when a pod is destroyed. If a TP is referred from another pods,
the connection will be remained as usual.
**Which issue this PR fixes**
fixes#45394
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue
Migrate kubelet to ConfigMapManager interface and use TTL-based caching manager
Fixes#41379
Sometime ago we moved to a secret manager interface for kubelet to manage secrets.
This PR's first commit moves config map management also to a similar interface.
The second commit adds TTL-based CachingConfigMapManager (similar to CachingSecretManager) and makes kubelet use it.
/cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-scalability-misc @wojtek-t @dchen1107
When using iscsi storage with multiple target portal (TP)
addresses and multipathing the volume manager logs on to
the IQN for all portal addresses, but when a pod gets
destroyed the volume manager only logs out for the primary
TP and sessions for another TPs are always remained.
This patch adds methods to store and load iscsi disk
configrations, then uses the stored config at DetachDisk
path.
Fix#45394
Automatic merge from submit-queue
Check volume's status before detaching volume
When volume's status is 'detaching', controllermanager will detach
it again and return err. It is necessary to check volume's status
before detaching volume.
same issue: #44536
Automatic merge from submit-queue (batch tested with PRs 46661, 46562, 46657, 46655, 46640)
remove redundant carriage return for readable
**What this PR does / why we need it**:
remove redundant carriage to make it more readable.
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)
Local storage plugin
**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes. Scheduler predicate will direct already-bound PVCs to the node that the local PV is at. PVC binding still happens independently.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #43640
**Release note**:
```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume. These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
When volume's status is 'detaching', controllermanager will detach
it again and return err. It is necessary to wait for detach
operation to complete within the alloted time.
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
Move MountVolume.SetUp succeeded to debug level
This message is verbose and repeated over and over again in log files
creating a lot of noise. Leave the message in, but require a -v in
order to actually log it.
**What this PR does / why we need it**: Moves a verbose log message to actually be verbose.
**Which issue this PR fixes** fixes#46364Fixes#29059
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)
Node status updater now deletes the node entry in attach updates...
… when node is missing in NodeInformer cache.
- Added RemoveNodeFromAttachUpdates as part of node status updater operations.
**What this PR does / why we need it**: Fixes issue of unnecessary node status updates when node is deleted.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#42438
**Special notes for your reviewer**: Unit tested added, but a more comprehensive test involving the attach detach controller requires certain testing functionality that is currently absent, and will require larger effort. Will be added at a later time.
There is an edge case caused by the following steps:
1) A node is deleted and restarted. The node exists, but is not yet recognized by Kubernetes.
2) A pod requiring a volume attach with nodeName specifically set to this node.
This would make the pod stuck in ContainerCreating state. This is low-pri since it's a specific edge case that can be avoided.
**Release note**:
```release-note
NONE
```
This message is verbose and repeated over and over again in log files
creating a lot of noise. Leave the messsage in, but require a -v in
order to actually log it.
Fixes#29059
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)
Implement FakeVolumePlugin's ConstructVolumeSpec method according to interface expectation.
This fixes#45803 and #46204.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45518, 46127, 46146, 45932, 45003)
Remove requirement to run the Portworx volume driver on master node
**What this PR does / why we need it**:
This change removes requirement to run the Portworx volume driver on Kubernetes master node.
**Special notes for your reviewer**:
Before this pull request, in order to use a Portworx volume, users had to run the Portworx container on the master node. Since it isn't ideal (and impossible on GKE) to schedule any pods on the master node, this PR removes that requirement.
Automatic merge from submit-queue
fix regression in UX experience for double attach volume
send event when volume is not allowed to multi-attach
Fixes#46012
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)
Only retrieve relevant volumes
**What this PR does / why we need it**:
Improves performance for Cinder volume attach/detach calls.
Currently when Cinder volumes are attached or detached, functions try to retrieve details about the volume from the Nova API. Because some only have the volume name not its UUID, they use the list function in gophercloud to iterate over all volumes to find a match. This incurs severe performance problems on OpenStack projects with lots of volumes (sometimes thousands) since it needs to send a new request when the current page does not contain a match. A better way of doing this is use the `?name=XXX` query parameter to refine the results.
**Which issue this PR fixes**:
https://github.com/kubernetes/kubernetes/issues/26404
**Special notes for your reviewer**:
There were 2 ways of addressing this problem:
1. Use the `name` query parameter
2. Instead of using the list function, switch to using volume UUIDs and use the GET function instead. You'd need to change the signature of a few functions though, such as [`DeleteVolume`](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/cinder/cinder.go#L49), so I'm not sure how backwards compatible that is.
Since #1 does effectively the same as #2, I went with it because it ensures BC.
One assumption that is made is that the `volumeName` being retrieved matches exactly the name of the volume in Cinder. I'm not sure how accurate that is, but I see no reason why cloud providers would want to append/prefix things arbitrarily.
**Release note**:
```release-note
Improves performance of Cinder volume attach/detach operations
```
An admin wants to specify in which AWS availability zone(s) users may create persistent volumes using dynamic provisioning.
That's why the admin can now configure in StorageClass object a comma separated list of zones. Dynamically created PVs for PVCs that use the StorageClass are created in one of the configured zones.
An admin wants to specify in which GCE availability zone(s) users may create persistent volumes using dynamic provisioning.
That's why the admin can now configure in StorageClass object a comma separated list of zones. Dynamically created PVs for PVCs that use the StorageClass are created in one of the configured zones.
The zone parameter provided in a Storage Class may erroneously be an empty string or contain only spaces and tab characters. Such situation shall be detected and reported as an error.
That's why the func ValidateZone was added.
An admin shall be able to configure a comma separated list of zones for a StorageClass.
That's why the func ZonesToSet (string) (set.String, error) is added. The func ZonesToSet converts a string containing a comma separated list of zones to a set. In case the list contains an empty zone an error is returned.
Automatic merge from submit-queue
Add `auto_unmount` mount option for glusterfs fuse mount.
libfuse has an auto_unmount option which, if enabled, ensures that
the file system is unmounted at FUSE server termination by running a
separate monitor process that performs the unmount when that occurs.
(This feature would probably better be called "robust auto-unmount",
as FUSE servers usually do try to unmount their file systems upon
termination, it's just this mechanism is not crash resilient.)
This change implements that option and behavior for glusterfs.
This option will be only supported for clients with version >3.11.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
libfuse has an auto_unmount option which, if enabled, ensures that
the file system is unmounted at FUSE server termination by running a
separate monitor process that performs the unmount when that occurs.
(This feature would probably better be called "robust auto-unmount",
as FUSE servers usually do try to unmount their file systems upon
termination, it's just this mechanism is not crash resilient.)
This change implements that option and behavior for glusterfs.
This option will be only supported for clients with version >3.11.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Modprobe is a kernel operation that should only be done once to load the
RBD module. The admin could've done this on the Kubernetes nodes. The
RBD plugin can still try to load the module but it shouldnt fail the
workflow if it doesnt succeed.
Partially addresses #45190
Automatic merge from submit-queue (batch tested with PRs 45374, 44537, 45739, 44474, 45888)
Fix attach volume to instance repeatedly
1.When volume's status is 'attaching', controllermanager will attach
it again and return err. So it is necessary to check volume's
status before attach/detach volume.
2. When volume's status is 'attaching', its attachments will be None,
controllermanager can't get device path and make some failed event.
But it is normal, so don't return err when attachments is None
Fix bug: #44536
Automatic merge from submit-queue (batch tested with PRs 45408, 45355, 45528)
Make createEndpointService() and deleteEndpointService() plugin interface methods.
Why this change?
In some setups, after creation of dynamic PVs and before mounting/using these PVs in a pod, the endpoint/service got mistakenly deleted by the user/developer. By making these methods 'plugin' specific, we can call it from mounter if there are scenarios where the endpoint and service got wiped in between accidentally.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
In some setups, after creation of dynamic PVs and before mounting/using
these PVs in a pod, the endpoint/service got mistakenly deleted by the
user/developer. By making these methods 'plugin' specific, we can call
it from mounter if there are scenarios where the endpoint and service
got wiped in between accidentally.
Signed-off-by: Humble Chirammal hchiramm@redhat.com
When volume's status is 'attaching', its attachments will be None,
controllermanager can't get device path and make some failed event.
But it is normal, let's fix it.
Automatic merge from submit-queue (batch tested with PRs 44798, 45537, 45448, 45432)
nfs.go: cleancode err
**What this PR does / why we need it**:
The modification makes code clean, simple, and easy to inspect.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Statefulsets for cinder: allow multi-AZ deployments, spread pods across zones
**What this PR does / why we need it**: Currently if we do not specify availability zone in cinder storageclass, the cinder is provisioned to zone called nova. However, like mentioned in issue, we have situation that we want spread statefulset across 3 different zones. Currently this is not possible with statefulsets and cinder storageclass. In this new solution, if we leave it empty the algorithm will choose the zone for the cinder drive similar style like in aws and gce storageclass solutions.
**Which issue this PR fixes** fixes#44735
**Special notes for your reviewer**:
example:
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: all
provisioner: kubernetes.io/cinder
---
apiVersion: v1
kind: Service
metadata:
annotations:
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
name: galera
labels:
app: mysql
spec:
ports:
- port: 3306
name: mysql
clusterIP: None
selector:
app: mysql
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: mysql
spec:
serviceName: "galera"
replicas: 3
template:
metadata:
labels:
app: mysql
annotations:
pod.alpha.kubernetes.io/initialized: "true"
spec:
containers:
- name: mysql
image: adfinissygroup/k8s-mariadb-galera-centos:v002
imagePullPolicy: Always
ports:
- containerPort: 3306
name: mysql
- containerPort: 4444
name: sst
- containerPort: 4567
name: replication
- containerPort: 4568
name: ist
volumeMounts:
- name: storage
mountPath: /data
readinessProbe:
exec:
command:
- /usr/share/container-scripts/mysql/readiness-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
volumeClaimTemplates:
- metadata:
name: storage
annotations:
volume.beta.kubernetes.io/storage-class: all
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 12Gi
```
If this example is deployed it will automatically create one replica per AZ. This helps us a lot making HA databases.
Current storageclass for cinder is not perfect in case of statefulsets. Lets assume that cinder storageclass is defined to be in zone called nova, but because labels are not added to pv - pods can be started in any zone. The problem is that at least in our openstack it is not possible to use cinder drive located in zone x from zone y. However, should we have possibility to choose between cross-zone cinder mounts or not? Imo it is not good way of doing things that they mount volume from another zone where the pod is located(means more network traffic between zones)? What you think? Current new solution does not allow that anymore (should we have possibility to allow it? it means removing the labels from pv).
There might be some things that needs to be fixed still in this release and I need help for that. Some parts of the code is not perfect.
Issues what i am thinking about (I need some help for these):
1) Can everybody see in openstack what AZ their servers are? Can there be like access policy that do not show that? If AZ is not found from server specs, I have no idea how the code behaves.
2) In GetAllZones() function, is it really needed to make new serviceclient using openstack.NewComputeV2 or could I somehow use existing one
3) This fetches all servers from some openstack tenant(project). However, in some cases kubernetes is maybe deployed only to specific zone. If kube servers are located for instance in zone 1, and then there are another servers in same tenant in zone 2. There might be usecase that cinder drive is provisioned to zone-2 but it cannot start pod, because kubernetes does not have any nodes in zone-2. Could we have better way to fetch kubernetes nodes zones? Currently that information is not added to kubernetes node labels automatically in openstack (which should I think). I have added those labels manually to nodes. If that zone information is not added to nodes, the new solution does not start stateful pods at all, because it cannot target pods.
cc @rootfs @anguslees @jsafrane
```release-note
Default behaviour in cinder storageclass is changed. If availability is not specified, the zone is chosen by algorithm. It makes possible to spread stateful pods across many zones.
```
Automatic merge from submit-queue
add rootfs gnufied and childsb to volume approver
**What this PR does / why we need it**:
add me and @gnufied @childsb to volume approver
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove unnecessary constants and add type to secret
**What this PR does / why we need it**:
Adds the type field to the secret for the `persistent-volume-provisioning` example of Quobyte. Also remove unnecessary constants in Quobyte Code base.
FYI
@rootfs @saad-ali @quolix
Automatic merge from submit-queue (batch tested with PRs 44590, 44969, 45325, 45208, 44714)
Use dedicated UnixUserID and UnixGroupID types
**What this PR does / why we need it**:
DRYs up type definitions by using the dedicated types in apimachinery
**Which issue this PR fixes**
#38120
**Release note**:
```release-note
UIDs and GIDs now use apimachinery types
```
Automatic merge from submit-queue (batch tested with PRs 44590, 44969, 45325, 45208, 44714)
Refactor volume operation log and error messages
What this PR does / why we need it:
Adds wrappers for volume-specific error and log messages. Each message has a simple version that can be displayed to the user and a detailed version that can be used in logs. The messages that are used for events was also cleaned up. @msau42
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes#40905
Special notes for your reviewer:
pkg/kubelet/volumemanager/reconciler/reconciler.go can be refactored. I can do that refactoring after this one.
Release note:
NONE
Automatic merge from submit-queue (batch tested with PRs 45283, 45289, 45248, 44295)
Azure disk: dealing with missing disk probe
**What this PR does / why we need it**:
While Azure disks are expected to attach to SCSI host 3 and above on general purpose instances, on certain Azure instances disks are under SCSI host 2.
This fix searches all LUNs but excludes those used by Azure sys disks, based on udev rules [here](https://raw.githubusercontent.com/Azure/WALinuxAgent/master/config/66-azure-storage.rules)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Log node name when error attaching volume
Helps with debugging to know immediately which node the volume failed to atach to. Went through all plugins, added this to 3. @gnufied
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44601, 44842, 44893, 44491, 44588)
Define const annotation variable once
We do not need to define the const annotation var twice in pkg/volume and pkg/volume/validation
**Release note**:
```release-note
NONE
```
When the attach/detach controller crashes and a pod with attached PV is deleted
afterwards the controller will never detach the pod's attached volumes. To
prevent this the controller should try to recover the state from the nodes
status.
Automatic merge from submit-queue (batch tested with PRs 40055, 42085, 44509, 44568, 43956)
Fix gofmt errors
**What this PR does / why we need it**:
There were some gofmt errors on master. Ran the following to fix:
```
hack/verify-gofmt.sh | grep ^diff | awk '{ print $2 }' | xargs gofmt -w -s
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: none
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40777, 43673)
remove an unnecassary variable assignment in glusterfs_test
**What this PR does / why we need it**:
`path` is exactly the same variable as `volumePath`, which is defined in line 122 . So no needs to assign it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 44362, 44421, 44468, 43878, 44480)
Delete EmptyDir volume directly instead of renaming the directory.
**What this PR does / why we need it**:
The volume operation executor can handle duplicate requests on the same volume now, so it is not necessary to rename the directory anymore. This change can cause pod deletion to take longer for large emptydir volumes because now the pod waits for the volume to be deleted until it continues pod cleanup. But this is actually required for local disk scheduling so that we don't schedule new pods that need emptydir volumes on the node if the previous emptydir has not be fully reclaimed yet.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#43534
**Special notes for your reviewer**:
**Release note**:
NONE
cc @kubernetes/sig-storage-pr-reviews
Automatic merge from submit-queue
Update owners to include kerneltime
**What this PR does / why we need it**: Update owners to include kerneltime to help with PRs
Automatic merge from submit-queue
Catch error when failed to make directory in NFS volume plugin
NFS: Catch error when failed to make directory
Currently, NFS volume plugin doesn't catch the error from
os.MkdirAll. That makes it difficult to debug why failed to make the
directory. This patch adds error catch to os.MkdirAll.
Automatic merge from submit-queue
azure disk: add logging on disk attach
**What this PR does / why we need it**:
While we were debugging a failed azure disk attach, we were missing logging information to identify the root cause. This fix logs information at each stage of attach to help identify where problem is once it happens again.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
NONE
Automatic merge from submit-queue (batch tested with PRs 44119, 42538, 43802, 42336, 43396)
iSCSI CHAP support
**What this PR does / why we need it**:
To support CHAP authentication in a multi-tenant setup
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Support iSCSI CHAP authentication
```