Automatic merge from submit-queue
Add gnufied as reviewer for pkg/volume
I have helped review and contributed code to this
area already.
cc @saad-ali @jsafrane @childsb
Automatic merge from submit-queue
fc: Drop multipath.conf snippet
**What this PR does / why we need it**:
Removes multipath.conf - The code does not make use of it - or ensure s that it's getting used - and it should in addition be handled elsewehre.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
A minimalistic multipath.conf got written, but it was useless, as
it is unclear if multipathd is running and there was also no
config reload triggered.
This patch drops this snippet. In general it's probably a better idea
to leave the multipath.conf to the component managing the host.
Signed-off-by: Fabian Deutsch <fabiand@fedoraproject.org>
Auto-generated via:
git grep -l [Ss]uccesfully | xargs sed -ri 's/([sS])uccesfully/\1uccessfully/g'
I noticed this when running kube-scheduler with --v4 and it is annoying.
Then manually reverted changed to the vendored bits.
Automatic merge from submit-queue
recycle pod can't get the event since channel closed
What this PR does / why we need it:
We create a hostPath type PV with "Recycle" persistentVolumeReclaimPolicy, and bind a PVC to it, but after deleted the PVC, the PV cannot become to available status. This is happened after we upgrade etcd to 3.0. The reason is:
If the channel used to get the pod message and events been abnormal closed(for example, the event channel maybe closed because of "required revision has been compacted" error), the function internalRecycleVolumeByWatchingPodUntilCompletion will stuck in a loop, and the recycle pod will not been deleted, the PV can not become into available status
Special notes for your reviewer:
None
Release note:
This commit updates the code to set the default value of the readOnly attribute to false.
It also updates the example docs to add full list of supported plugin attributes and doc.
Automatic merge from submit-queue (batch tested with PRs 42369, 42375, 42397, 42435, 42455)
[Bug Fix]: Avoid evicting more pods than necessary by adding Timestamps for fsstats and ignoring stale stats
Continuation of #33121. Credit for most of this goes to @sjenning. I added volume fs timestamps.
**why is this a bug**
This PR attempts to fix part of https://github.com/kubernetes/kubernetes/issues/31362 which results in multiple pods getting evicted unnecessarily whenever the node runs into resource pressure. This PR reduces the chances of such disruptions by avoiding reacting to old/stale metrics.
Without this PR, kubernetes nodes under resource pressure will cause unnecessary disruptions to user workloads.
This PR will also help deflake a node e2e test suite.
The eviction manager currently avoids evicting pods if metrics are old. However, timestamp data is not available for filesystem data, and this causes lots of extra evictions.
See the [inode eviction test flakes](https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e) for examples.
This should probably be treated as a bugfix, as it should help mitigate extra evictions.
cc: @kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews @vishh @derekwaynecarr @sjenning
Automatic merge from submit-queue (batch tested with PRs 41306, 42187, 41666, 42275, 42266)
Implement bulk polling of volumes
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
cc @justinsb
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
Automatic merge from submit-queue (batch tested with PRs 41597, 42185, 42075, 42178, 41705)
force rbd image unlock if the image is not used
**What this PR does / why we need it**:
Ceph RBD image could be locked if the host that holds the lock is down. In such case, the image cannot be used by other Pods.
The fix is to detect the orphaned locks and force unlock.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#31790
**Special notes for your reviewer**:
Note, previously, RBD volume plugin maps the image, mount it, and create a lock on the image. Since the proposed fix uses `rbd status` output to determine if the image is being used, the sequence has to change to: rbd lock checking (through `rbd lock list`), mapping check (through `rbd status`), forced unlock if necessary (through `rbd lock rm`), image lock, image mapping, and mount.
**Release note**:
```release-note
force unlock rbd image if the image is not used
```
- Add a new type PortworxVolumeSource
- Implement the kubernetes volume plugin for Portworx Volumes under pkg/volume/portworx
- The Portworx Volume Driver uses the libopenstorage/openstorage specifications and apis for volume operations.
Changes for k8s configuration and examples for portworx volumes.
- Add PortworxVolume hooks in kubectl, kube-controller-manager and validation.
- Add a README for PortworxVolume usage as PVs, PVCs and StorageClass.
- Add example spec files
Handle code review comments.
- Modified READMEs to incorporate to suggestions.
- Add a test for ReadWriteMany access mode.
- Use util.UnmountPath in TearDown.
- Add ReadOnly flag to PortworxVolumeSource
- Use hostname:port instead of unix sockets
- Delete the mount dir in TearDown.
- Fix link issue in persistentvolumes README
- In unit test check for mountpath after Setup is done.
- Add PVC Claim Name as a Portworx Volume Label
Generated code and documentation.
- Updated swagger spec
- Updated api-reference docs
- Updated generated code under pkg/api/v1
Godeps update for Portworx Volume Driver
- Adds github.com/libopenstorage/openstorage
- Adds go.pedge.io/pb/go/google/protobuf
- Updates Godep Licenses
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Add support for attacher/detacher interface in Flex volume
Add support for attacher/detacher interface in Flex volume
This change breaks backward compatibility and requires to be release noted.
```release-note
Flex volume plugin is updated to support attach/detach interfaces. It broke backward compatibility. Please update your drivers and implement the new callouts.
```
Add some lines about how to enable multipath for block storage.
A new README was added, because multipath is relevant for at least
FC and iSCSI.
Signed-off-by: Fabian Deutsch <fabiand@fedoraproject.org>
A minimalistic multipath.conf got written, but it was useless, as
it is unclear if multipathd is running and there was also no
config reload triggered.
This patch drops this snippet. In general it's probably a better idea
to leave the multipath.conf to the component managing the host.
Signed-off-by: Fabian Deutsch <fabiand@fedoraproject.org>
is contacted if there is a failure in the connected server.
Mount option becomes:
mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/glustermount/glusterpod-glusterfs.log,backup-volfile-servers=192.168.100.0:192.168.200.0:192.168.43.149 ..
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Automatic merge from submit-queue
Fix for Support selection of datastore for dynamic provisioning in vS…
Fixes#40558
Current vSphere Cloud provider doesn't allow a user to select a datastore for dynamic provisioning. All the volumes are created in default datastore provided by the user in the global vsphere configuration file.
With this fix, the user will be able to provide the datastore in the storage class definition. This will allow the volumes to be created in the datastore specified by the user in the storage class definition. This field is optional. If no datastore is specified, the volume will be created in the default datastore specified in the global config file.
For example:
User creates a storage class with the datastore
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: slow
provisioner: kubernetes.io/vsphere-volume
parameters:
diskformat: thin
datastore: VMFSDatastore
Now the volume will be created in the datastore - "VMFSDatastore" specified by the user.
If the user creates a storage class without any datastore
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: slow
provisioner: kubernetes.io/vsphere-volume
parameters:
diskformat: thin
Now the volume will be created in the datastore which in the global configuration file (vsphere.conf)
@pdhamdhere @kerneltime
Automatic merge from submit-queue (batch tested with PRs 41667, 41820, 40910, 41645, 41361)
Allow multiple mounts in StatefulSet volume zone placement
We have some heuristics that ensure that volumes (and hence stateful set
pods) are spread out across zones. Sadly they forgot to account for
multiple mounts. This PR updates the heuristic to ignore the mount name
when we see something that looks like a statefulset volume, thus
ensuring that multiple mounts end up in the same AZ.
Fix#35695
```release-note
Fix zone placement heuristics so that multiple mounts in a StatefulSet pod are created in the same zone
```
Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782)
changes to cleanup the volume plugin for recycle
**What this PR does / why we need it**:
Code cleanup. Changing from creating a new interface from the plugin, that then calls a function to recycle a volume, to adding the function to the plugin itself.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#26230
**Special notes for your reviewer**:
Took same approach from closed PR #28432.
Do you want the approach to be the same for NewDeleter(), NewMounter(), NewUnMounter() and should they be in this same PR or submit different PR's for those?
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 39373, 41585, 41617, 41707, 39958)
Fix ConfigMaps for Windows
**What this PR does / why we need it**: ConfigMaps were broken for Windows as the existing code used linux specific file paths. Updated the code in `kubelet_getters.go` to use `path/filepath` to get the directories. Also reverted back the code in `secret.go` as updating `kubelet_getters.go` to use `path/filepath` also fixes `secrets`
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/39372
```release-note
Fix ConfigMap for Windows Containers.
```
cc: @pires
We have some heuristics that ensure that volumes (and hence stateful set
pods) are spread out across zones. Sadly they forgot to account for
multiple mounts. This PR updates the heuristic to ignore the mount name
when we see something that looks like a statefulset volume, thus
ensuring that multiple mounts end up in the same AZ.
Fix#35695
Automatic merge from submit-queue (batch tested with PRs 41531, 40417, 41434)
Always detach volumes in operator executor
**What this PR does / why we need it**:
Instead of marking a volume as detached immediately in Kubelet's
reconciler, delegate the marking asynchronously to the operator
executor. This is necessary to prevent race conditions with other
operations mutating the same volume state.
An example of one such problem:
1. pod is created, volume is added to desired state of the world
2. reconciler process starts
3. reconciler starts MountVolume, which is kicked off asynchronously via
operation_executor.go
4. MountVolume mounts the volume, but hasn't yet marked it as mounted
5. pod is deleted, volume is removed from desired state of the world
6. reconciler reaches detach volume section, detects volume is no longer in desired state of world,
removes it from volumes in use
7. MountVolume tries to mark mount, throws an error because
volume is no longer in actual state of world list. After this, kubelet isn't aware of the mount
so doesn't try to unmount again.
8. controller-manager tries to detach the volume, this fails because it
is still mounted to the OS.
9. EBS gets stuck indefinitely in busy state trying to detach.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#32881, fixes ##37854 (maybe)
**Special notes for your reviewer**:
**Release note**:
```release-note
```
To safely mark a volume detached when the volume controller manager is used.
An example of one such problem:
1. pod is created, volume is added to desired state of the world
2. reconciler process starts
3. reconciler starts MountVolume, which is kicked off asynchronously via
operation_executor.go
4. MountVolume mounts the volume, but hasn't yet marked it as mounted
5. pod is deleted, volume is removed from desired state of the world
6. reconciler detects volume is no longer in desired state of world,
removes it from volumes in use
7. MountVolume tries to mark volume in use, throws an error because
volume is no longer in actual state of world list.
8. controller-manager tries to detach the volume, this fails because it
is still mounted to the OS.
9. EBS gets stuck indefinitely in busy state trying to detach.
Automatic merge from submit-queue (batch tested with PRs 41246, 39998)
Cinder volume attacher: use instanceID instead of NodeID when verifying attachment
**What this PR does / why we need it**: Cinder volume attacher incorrectly uses NodeID instead of openstack instance id, so that reconciliation fails.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#39978
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Add gnufied as reviewer for aws and gce volumes
Adding myself as reviewer for aws and gce volume plugins. I understand the code well enough and have helped with review in those areas already.
cc @childsb @justinsb @saad-ali
This fix prevents the PV controller from forcefully overwriting the provisioned volume's name with the generated PV name. Instead, it allows dynamic provisioner implementers to set the name of the volume to a value that they choose.
Automatic merge from submit-queue (batch tested with PRs 38772, 38797, 40732, 40740)
Synchronous spellcheck for pkg/volume/*
**What this PR does / why we need it**: Increase code readability
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: Minor contribution
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40299, 40311)
move authoritative client-go util out of pkg
Move `client-go/pkg/util` which are authoritative to `client-go/util` to make it easier to reason about what comes from where.
Automatic merge from submit-queue
Adding rescan scsi controller for cinder
For lsilogic scsi controller, attached cinder volume does not
appear under /dev/ automatically unless do a rescan.
This approach was used in vSphere volume provider before PR #27496
dropped support for lsilogic scsi controller.
Automatic merge from submit-queue
Optional configmaps and secrets
Allow configmaps and secrets for environment variables and volume sources to be optional
Implements approved proposal c9f881b7bb
Release note:
```release-note
Volumes and environment variables populated from ConfigMap and Secret objects can now tolerate the named source object or specific keys being missing, by adding `optional: true` to the volume or environment variable source specifications.
```
Automatic merge from submit-queue (batch tested with PRs 37228, 40146, 40075, 38789, 40189)
Cleanup temp dirs
So funny story my /tmp ran out of space running the unit tests so I am cleaning up all the temp dirs we create.
Automatic merge from submit-queue (batch tested with PRs 39826, 40030)
azure disk: restrict length of name
**What this PR does / why we need it**:
Fixes dynamic disk provisioning on Azure by properly truncating the disk name to conform to the Azure API spec.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
n/a
**Special notes for your reviewer**:
n/a
**Release note**:
```release-note
azure disk: restrict name length for Azure specifications
```
cc: @rootfs
Automatic merge from submit-queue
Curating Owners: pkg/volume
cc @jsafrane @spothanis @agonzalezro @justinsb @johscheuer @simonswine @nelcy @pmorie @quofelix @sdminonne @thockin @saad-ali @rootfs
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the
future in the **reviewers** section. You probably do NOT need to modify
the **approvers** section. Names asre sorted by relevance, using some
secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
fix bug not using volumetype config in create volume
fixes#39843
@humblec
we are building the volumetype config but I don't see where we are using it in the CreateVolume for dyn provisioning, this is why volumetype parameter from the Storage Class was being overlooked because we are hard coding constants like replicaCount which is always 3.
unless I'm missing something?
Automatic merge from submit-queue (batch tested with PRs 39486, 37288, 39477, 39455, 39542)
Fix wc zombie goroutine issue in volume util
See [Cadvisor #1558](https://github.com/google/cadvisor/pull/1558). This should solve problems for those using images that do not support "wc".
cc: @timstclair
Automatic merge from submit-queue (batch tested with PRs 39433, 39413)
"Attach" function records information collation
In the "attach" function, the log information, for the variable "instanceid", has been described as "node", as well as recorded as "instance", recorded as "instance" should be better.
Automatic merge from submit-queue
Add unit tests for operation_executor
Add unit test for `Unmount operations should start in parallel for all volume plugins`
cc: @saad-ali
Automatic merge from submit-queue (batch tested with PRs 39092, 39126, 37380, 37093, 39237)
Improve error reporting in Ceph RBD provisioner.
- We should report an error when user references a secret that cannot be found
- We should report output of rbd create/delete commands, logging "exit code 1"
is not enough.
Before:
```
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
33m 33m 1 {persistentvolume-controller } Warning ProvisioningFailed Failed to provision volume with StorageClass "cephrbdprovisioner": rbd: create volume failed, err: exit status 1
```
After:
```
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
33m 33m 1 {persistentvolume-controller } Warning ProvisioningFailed Failed to provision volume with StorageClass "cephrbdprovisioner": failed to create rbd image: exit status 1, command output: rbd: couldn't connect to the cluster
```
@rootfs, PTAL
Automatic merge from submit-queue (batch tested with PRs 37959, 36221)
Recycle Pod Template Check
The kube-controller-manager has two command line arguments (--pv-recycler-pod-template-filepath-hostpath and --pv-recycler-pod-template-filepath-nfs) that specify a recycle pod template. The recycle pod template may not contain the volume that shall be recycled.
A check is added to make sure that the recycle pod template contains at least a volume.
cc: @jsafrane
Automatic merge from submit-queue
Refactor operation_executor to make it testable
**What this PR does / why we need it**:
To refactor operation_executor to make it unit testable
**Release note**:
`NONE`
Automatic merge from submit-queue (batch tested with PRs 39152, 39142, 39055)
openstack: Forcibly detach an attached cinder volume before attaching elsewhere
Fixes#33288
**What this PR does / why we need it**:
Without this fix, we can't preemptively reschedule pods with persistent volumes to other hosts (for rebalancing or hardware failure recovery).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#33288
**Special notes for your reviewer**:
(This is a resurrection/cleanup of PR #33734, originally authored by @Rotwang)
**Release note**:
Automatic merge from submit-queue (batch tested with PRs 39029, 39014)
[Glusterfs Vol Plugin]: Check kube client is invalid and return error
Fixes: #38939
In volume plugins, we need to create a kube client to make api call. And this kube client can be nil when, for example, wrong api-server configuration, but kubelet should not crash in this case.
I have also checked other plugins and found only glusterfs need this fix.
The kube-controller-manager has two command line arguments (--pv-recycler-pod-template-filepath-hostpath and --pv-recycler-pod-template-filepath-nfs) that specify a recycle pod template. The recycle pod template may not contain the volume that shall be recycled.
A check is added to make sure that the recycle pod template contains at least a volume.
Automatic merge from submit-queue (batch tested with PRs 36888, 38180, 38855, 38590)
wrong pod reference in error message for volume attach timeout
**What this PR does / why we need it**:
when a disk mount times out you get the following error:
```
Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nginx"/"default". list of unattached/unmounted volumes=[data]
```
where the pod is referenced by "podname"/"namespace", but should be "namespace"/"podname".
**Which issue this PR fixes**
no issue number
**Special notes for your reviewer**:
untested :(
Automatic merge from submit-queue (batch tested with PRs 38284, 38403, 38265, 38378)
glusterfs: properly check gidMin and gidMax values from SC individually
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
This fixes a misleading debug message, and also prevents the glusterfs provisioner from adapting a misconfiguration of the gid-range in the storage class. Instead it will fail with proper error messages.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://bugzilla.redhat.com/show_bug.cgi?id=1402286
**Special notes for your reviewer**:
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
```
Don't override explict out-of max-range configuration, but
fail with an error message instead.
Signed-off-by: Michael Adam <obnox@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 36071, 32752, 37998, 38350, 38401)
Pass addressable values to DeepCopy
Extracted from https://github.com/kubernetes/kubernetes/pull/35728
These are the places we are currently calling DeepCopy incorrectly, and we need to fix, even if we don't pick up the changes to DeepCopy in #35728:
* creating a new cloner means we have no generated functions registered
* passing non-addressable values doesn't pick up generated deep copy functions, and forces us into reflective mode
Automatic merge from submit-queue
Fix unmountDevice issue caused by shared mount in GCI
This is a fix on top #38124. In this fix, we move the logic to filter
out shared mount references into operation_executor's UnmountDevice
function to avoid this part is being used by other types volumes such as
rdb, azure etc. This filter function should be only needed during
unmount device for GCI image.
Automatic merge from submit-queue (batch tested with PRs 36310, 37349, 38319, 38402, 38338)
Fix space issue in volumePath with vSphere Cloud Provider
I tried to create a kubernetes deployment with vSphere volume with volume path
"[datastore] kubevols/redis-master".
In this case the cloud provider queries the getDeviceNameFromMount() to return the path of the volume mounted. Since getDeviceNameFromMount() queries the filesystem to get the mount references, it returns a volume path "[datastore]\\040kubevols/redis-master". Later the kubelet searches for this volume path in both the actual and desired states. Th actual and desired states contains volume with path "[datastore] kubevols/redis-master". So, it couldn't find such volume path and therefore kubernetes stalls unable to make any progress further similar to one described in #37022.
This PR will fix the space issue in volume path by replacing \\040 to empty space. This fixes#37712.
Also fixes#38148
@kerneltime @pdhamdhere
This is a fix on top #38124. In this fix, we move the logic to filter
out shared mount references into operation_executor's UnmountDevice
function to avoid this part is being used by other types volumes such as
rdb, azure etc. This filter function should be only needed during
unmount device for GCI image.
Automatic merge from submit-queue (batch tested with PRs 38294, 37009, 36778, 38130, 37835)
fix permissions when using fsGroup
Currently, when an fsGroup is specified, the permissions of the defaultMode are not respected and all files created by the atomic writer have mode 777. This is because in `SetVolumeOwnership()` the `filepath.Walk` includes the symlinks created by the atomic writer. The symlinks have mode 777 when read from `info.Mode()`. However, when the are chmod'ed later, the chmod applies to the file the symlink points to, not the symlink itself, resulting in the wrong mode for the underlying file.
This PR skips chmod/chown for symlinks in the walk since those operations are carried out on the underlying file which will be included elsewhere in the walk.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1384458
@derekwaynecarr @pmorie
this is a workaround for the unmount device issue caused by gci mounter. In GCI cluster, if gci mounter is used for mounting, the container started by mounter script will cause additional mounts created in the container. Since these mounts are irrelavant to the original mounts, they should be not considered when checking the mount references. By comparing the mount path prefix, those additional mounts can be filtered out.
Plan to work on better approach to solve this issue.
Automatic merge from submit-queue (batch tested with PRs 38076, 38137, 36882, 37634, 37558)
glusterfs: Fix all gid types to int to prevent failures on 32bit systems
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
The glusterfs dynamic provisioner with GID security has an issue on 32 bit systems.
This fixes that issue by forcing all gid types to int internally.
<!--
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
-->
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
Fix the glusterfs dynamic provisioner for 32bit systems by limiting the gids to type int internally, and allowing 2147483647 as the highest GID.
```
This makes all types int until we hand the GID to heketi/gluster,
at which point it's converted to int64.
It also limits the maximum usable GID ti math.MaxInt32 = 2147483647.
Signed-off-by: Michael Adam <obnox@redhat.com>
This makes all types int until we hand the GID to heketi/gluster,
at which point it's converted to int64.
It also limits the maximum usable GID ti math.MaxInt32 = 2147483647.
Signed-off-by: Michael Adam <obnox@redhat.com>
Automatic merge from submit-queue
Implement GID security for the GlusterFS dynamic provisioner.
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
This PR implements GID security for the glusterfs dynamic provisioner.
It is a reworked version of PR #37549 .
<!--
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
-->
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
The glusterfs dynamic volume provisioner will now choose a unique GID for new persistent volumes from a range that can be configured in the storage class with the "gidMin" and "gidMax" parameters. The default range is 2000 - 4294967295 (max uint32).
```
An allocator of integers that allows for changing the range.
Previously allocated numbers are not lost, and can be
released later even if they have fallen outside of the range.
Signed-off-by: Michael Adam <obnox@redhat.com>
Automatic merge from submit-queue
Add `clusterid`, an optional parameter to storageclass.
At present, admin doesn't have the privilege to chose the
trusted storage pool from which persistent gluster volume
has to be provided.
This patch introduce a new storage class parameter which allows
the admin to specify storage pool/cluster if required.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Automatic merge from submit-queue
kubernetes attempts to unmount a wrong vSphere volume and stops making any progress after that
This is in reference to the bug #37332 which was accidentally closed. So created this new PR.
The code is already reviewed as part of PR #37332
Fixes issue #37022
@saad-ali @jingxu97 @abrarshivani @kerneltime
Automatic merge from submit-queue
Fix photon controller plugin to construct with correct PdID
**What this PR does / why we need it**:
This PR is to fix a mismatching of unmount path in photon volume plugin, which is resulted from the assigning volume spec name to persistent disk ID. Without this path, unmounting process is stalling in reconciler when a pod is deleted. Restart the same pod will see a mount failure because the previous unmounting is still going on.
The input variable of function ConstructVolumeSpec is the volume spec name instead of persistent disk ID. Previously the function directly construct new volume spec by assigning volume spec name to persistent disk ID, which will result in mismatching of mount path. The fix will find the pdID according to mount path and construct volume spec with the correct pdID.
I have tested the patch with back-to-back pod creation/deletion and mounting/unmounting of photon persistent disk volume source performs normal now.
This need to be cherry-picked to 1.5 release branch.
Automatic merge from submit-queue
Reduce verbosity of volume reconciler
**What this PR does / why we need it**:
It reduces the log verbosity for attaching of volumes
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Reduce verbosity of volume reconciler when attaching volumes
```
Set logging level for information about attaching of volumes to from 1 to 4
Otherwise the log is spammed with one line per 100ms while attaching is
in progress and afterwards as long as the volume is attached.
- We should report an error when user references a secret that cannot be found
- We should report output of rbd create/delete commands, logging "exit code 1"
is not enough.
Before:
```
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
33m 33m 1 {persistentvolume-controller } Warning ProvisioningFailed Failed to provision volume with StorageClass "cephrbdprovisioner": rbd: create volume failed, err: exit status 1
```
After:
```
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
33m 33m 1 {persistentvolume-controller } Warning ProvisioningFailed Failed to provision volume with StorageClass "cephrbdprovisioner": failed to create rbd image: exit status 1, command output: rbd: couldn't connect to the cluster
```
The input variable of function ConstructVolumeSpec is the volume spec
name instead of persistent disk ID. Previously the function directly
construct new volume spec by assigning volume spec name to persistent
disk ID, which will result in mismatching of mount path. The fix will
find the pdID according to mount path and construct volume spec with the
correct pdID.
trusted storage pool from which persistent gluster volume
has to be provided.
This patch introduce a new storage class parameter which allows
the admin to specify storage pool/cluster if required.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Automatic merge from submit-queue
Add more logging around Pod deletion
After this PR we'll have at least V(2) level log near all Pod deletions.
@saad-ali - this is required by GKE to help with diagnosing possible problem.
cc @dchen1107 @wojtek-t
For lsilogic scsi controller, attached cinder volume does not
appear under /dev/ automatically unless do a rescan.
This approach was used in vSphere volume provider before PR #27496
dropped support for lsilogic scsi controller.
Automatic merge from submit-queue
Revert "Use Gid when provisioning Gluster Volumes."
On further inspection the design in #35460 was not secure enough. This PR reverts the change.
This reverts commit 7a0d219d12.
Automatic merge from submit-queue
fix issue in converting aws volume id from mount paths
This PR is to fix the issue in converting aws volume id from mount
paths. Currently there are three aws volume id formats supported. The
following lists example of those three formats and their corresponding
global mount paths:
1. aws:///vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/vol-123456)
2. aws://us-east-1/vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
3. vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
For the first two cases, we need to check the mount path and convert
them back to the original format.
This PR fixes#36269
Automatic merge from submit-queue
Fix recycler pod deletion race.
We should use clone of recycler pod template instead of reusing the same
one for two or more recyclers running in parallel.
Also add some logs to relevant places to spot the error easily next time.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1392338
This PR is to fix the issue in converting aws volume id from mount
paths. Currently there are three aws volume id formats supported. The
following lists example of those three formats and their corresponding
global mount paths:
1. aws:///vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/vol-123456)
2. aws://us-east-1/vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
3. vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
For the first two cases, we need to check the mount path and convert
them back to the original format.
We should use clone of recycler pod template instead of reusing the same
one for two or more recyclers running in parallel.
Also add some logs to relevant places to spot the error easily next time.
Automatic merge from submit-queue
Avoid setting S_ISGID on files in volumes
Some applications are having issues with setting the S_ISGID bit on files in volumes. We intend to do this for directories so that the group ID is inherited, but not files for which S_ISGID indicates madatory file locking https://linux.die.net/man/2/stat
xref https://bugzilla.redhat.com/show_bug.cgi?id=1387306
@ncdc @derekwaynecarr @pmorie
Automatic merge from submit-queue
Make a consistent name ( GlusterFS instead of Gluster) in variables a…
Signed-off-by: Humble Chirammal hchiramm@redhat.com
Directories in volumes are set S_ISGID to ensure files created inside
them inherit group ownership. Currently, files are also set S_ISGID
however this is not relevant to the original intent, and indicates
'mandatory file locking' (stat(2)).
With this commit, only directories are set S_ISGID.
struct hostPathPlugin contains newRecyclerFunc, newDeleterFunc and newProvisionerFunc items that have only one instance, i.e. newRecycler, newDeleter or newProvisioner function.
That's why the newRecyclerFunc, newDeleterFunc and newProvisionerFunc items are removed and the newRecycler, newDeleter or newProvisioner functions are called directly.
In addition, the TestRecycler tests whether NewFakeRecycler function is called and returns nil. This is no longer needed so this particular part of the test is removed. In addition, the no longer used NewFakeRecycler function is removed also.
Similarly for the NFS plugin, struct nfsPlugin contains newRecyclerFunc item that has only one instance, i.e. newRecycler function. That's why the newRecyclerFunc item is removed and the newRecycler function is called directly. In addition, the TestRecycler tests whether newMockRecycler function is called and returns nil. This is no longer needed so this particular part of the test is removed. In addition, the no longer used newMockRecycler function is removed also.
Automatic merge from submit-queue
Remove PV annotations for quobyte provisioner
This is the last provisioner that uses annotations to pass secrets from provisioner to deleter.
Fixes#34822
@johscheuer, I don't have access to quobyte, please take look and retest the plugin. An e2e test for quobyte would be nice!
@kubernetes/sig-storage
Automatic merge from submit-queue
Remove unused WaitForDetach from Detacher interface and plugins
See issue #33128 and PR #33270
We can't rely on the device name provided by OpenStack Cinder, and thus
must perform detection based on the drive serial number (aka It's cinder ID)
on the kubelet itself.
This needs to be removed now, as part of #33128, as the code can't be
updated to attempt device detection and fallback through to the Cinder
provided deviceName, as detection "fails" when the device is gone, and
if cinder has reported a deviceName that another volume has used in
relaity, then this will block forever (or until the other, unreleated,
volume has been detached)
Automatic merge from submit-queue
Remove GetRootContext method from VolumeHost interface
Remove the `GetRootContext` call from the `VolumeHost` interface, since Kubernetes no longer needs to know the SELinux context of the Kubelet directory.
Per #33951 and #35127.
Depends on #33663; only the last commit is relevant to this PR.
Automatic merge from submit-queue
Initial work on running windows containers on Kubernetes
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
This is the first stab at getting the Kubelet running on Windows (fixes#30279), and getting it to deploy network-accessible pods that consist of Windows containers. Thanks @csrwng, @jbhurat for helping out.
The main challenge with Windows containers at this point is that container networking is not supported. In other words, each container in the pod will get it's own IP address. For this reason, we had to make a couple of changes to the kubelet when it comes to setting the pod's IP in the Pod Status. Instead of using the infra-container's IP, we use the IP address of the first container.
Other approaches we investigated involved "disabling" the infra container, either conditionally on `runtime.GOOS` or having a separate windows-docker container runtime that re-implemented some of the methods (would require some refactoring to avoid maintainability nightmare).
Other changes:
- The default docker endpoint was removed. This results in the docker client using the default for the specific underlying OS.
More detailed documentation on how to setup the Windows kubelet can be found at https://docs.google.com/document/d/1IjwqpwuRdwcuWXuPSxP-uIz0eoJNfAJ9MWwfY20uH3Q.
cc: @ikester @brendandburns @jstarks
Automatic merge from submit-queue
Per Volume Inode Accounting
Collects volume inode stats using the same find command as cadvisor. The command is "find _path_ -xdev -printf '.' | wc -c". The output is passed to the summary api, and will be consumed by the eviction manager.
This cannot be merged yet, as it depends on changes adding the InodesUsed field to the summary api, and the eviction manager consuming this. Expect tests to fail until this happens.
DEPENDS ON #35137