Commit Graph

58 Commits (e9afbd5cdf4501e3ba4377d4032afc28b03eac93)

Author SHA1 Message Date
Kubernetes Submit Queue 60dc2fa5d8 Merge pull request #35675 from liggitt/pv-secrets
Automatic merge from submit-queue

Require PV provisioner secrets to match type

In 1.5, PV provisioners allow targeting namespaced secrets via StorageClass params. This adds a requirement that those secrets' type match the volume provisioner plugin name, to prevent targeting and extraction of arbitrary secrets.

Helps limit secret targeting issues mentioned in https://github.com/kubernetes/kubernetes/issues/34822
2016-10-30 02:41:05 -07:00
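The rule described in the commit above is simple to sketch. Below is a hedged, minimal Go illustration with a simplified stand-in `Secret` type and a hypothetical `verifyProvisionerSecretType` helper; the real enforcement lives in the individual provisioner plugins and operates on the API secret object.

```
package main

import "fmt"

// Secret is a simplified stand-in for the API secret object; only the fields
// needed for the check are shown.
type Secret struct {
	Name string
	Type string
}

// verifyProvisionerSecretType illustrates the rule the commit adds: a secret
// referenced through StorageClass parameters may only be used if its type
// equals the provisioner plugin name.
func verifyProvisionerSecretType(secret Secret, pluginName string) error {
	if secret.Type != pluginName {
		return fmt.Errorf("secret %q has type %q, expected %q",
			secret.Name, secret.Type, pluginName)
	}
	return nil
}

func main() {
	s := Secret{Name: "gluster-rest-secret", Type: "kubernetes.io/glusterfs"}
	fmt.Println(verifyProvisionerSecretType(s, "kubernetes.io/glusterfs")) // <nil>
	fmt.Println(verifyProvisionerSecretType(s, "kubernetes.io/rbd"))       // mismatch error
}
```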
Jing Xu abbde43374 Add sync state loop in master's volume reconciler
In the master's volume reconciler, the information about which volumes are
attached to nodes is cached in the actual state of the world. However, this
information might be out of date if a node is terminated (its volumes are
detached automatically). In this situation, the reconciler assumes the volume
is still attached and will not issue an attach operation when the node comes
back. Pods created on those nodes will fail to mount.

This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If a volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, the reconciler will take action if needed.

To avoid issuing many concurrent operations against the cloud provider, this PR
adds a batch operation that checks whether a list of volumes is attached to a
node, instead of issuing one request per volume.

More details are explained in PR #33760
2016-10-28 09:24:53 -07:00
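A rough Go sketch of the batched sync pass described in the commit above; `attachedChecker` and `syncAttachedVolumes` are assumed names standing in for the cloud-provider bulk check and the reconciler's actual-state cache update, not the actual implementation.

```
package main

import "fmt"

// attachedChecker stands in for the batched cloud-provider call: one request
// that reports which of the listed volumes are still attached to the node.
type attachedChecker func(node string, volumes []string) (map[string]bool, error)

// syncAttachedVolumes is a minimal sketch of the periodic sync pass: for each
// node, verify the cached "attached" volumes in a single batch and drop the
// ones that are no longer attached, so the reconciler can re-attach them later.
func syncAttachedVolumes(actualState map[string][]string, check attachedChecker) {
	for node, vols := range actualState {
		attached, err := check(node, vols)
		if err != nil {
			continue // keep the cache unchanged on error
		}
		var still []string
		for _, v := range vols {
			if attached[v] {
				still = append(still, v)
			}
		}
		actualState[node] = still
	}
}

func main() {
	actualState := map[string][]string{"node-1": {"vol-a", "vol-b"}}
	check := func(node string, vols []string) (map[string]bool, error) {
		return map[string]bool{"vol-a": true}, nil // pretend vol-b was detached with the node
	}
	syncAttachedVolumes(actualState, check)
	fmt.Println(actualState) // map[node-1:[vol-a]]
}
```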
Jordan Liggitt 1dd73c59f3
Require PV provisioner secrets to match type 2016-10-27 02:45:05 -04:00
Jing Xu b02481708a Fix volume states out of sync problem after kubelet restarts
When kubelet restarts, all the information about the volumes is gone from the
actual/desired states. When node status is updated with mounted volumes, the
volume list might be empty even though volumes are still mounted, in turn
causing the master to detach those volumes because they are not in the mounted
volumes list. This fix makes sure the mounted volumes list is only updated
after the reconciler starts its sync-states process. This sync process scans
the existing volume directories and reconstructs actual states if they are
missing.

This PR also fixes a problem with orphaned pods' directories. In case a pod
directory is unmounted but has not yet been deleted (e.g., the removal was
interrupted by a kubelet restart), the cleanup routine deletes the directory
so that the pod directory can be cleaned up (it is safe to delete the
directory since it is no longer mounted).

The third issue this PR fixes is that, when reconstructing a volume in the
actual state, the mounter cannot be nil since it is required for creating
container.VolumeMap. If it is nil, it might cause a nil pointer exception in
kubelet.

Details are in proposal PR #33203
2016-10-25 12:29:12 -07:00
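The gating of node-status updates described in the commit above can be sketched as follows; `volumeManager`, `reconstructed`, and `GetVolumesInUse` are simplified stand-ins for the kubelet types, not the actual implementation.

```
package main

import "fmt"

// volumeManager is a simplified stand-in for kubelet's volume manager; the
// real one reconstructs state by scanning pod volume directories on disk.
type volumeManager struct {
	reconstructed bool     // set once the reconciler has run its sync-states pass
	mounted       []string // actual-state-of-world contents
}

// GetVolumesInUse mirrors the gating the commit adds: until reconstruction has
// happened, report nothing rather than an empty (and misleading) list that
// would cause the master to detach still-mounted volumes.
func (vm *volumeManager) GetVolumesInUse() ([]string, bool) {
	if !vm.reconstructed {
		return nil, false // not ready; caller should skip the node-status update
	}
	return vm.mounted, true
}

func main() {
	vm := &volumeManager{}
	if _, ok := vm.GetVolumesInUse(); !ok {
		fmt.Println("skipping node status update until states are synced")
	}
	vm.reconstructed = true
	vm.mounted = []string{"kubernetes.io/gce-pd/disk-1"}
	vols, _ := vm.GetVolumesInUse()
	fmt.Println(vols)
}
```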
Mike Danese 3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
Jan Safranek 2b2508ba15 Remove PV annotations for Gluster provisioner.
Don't store Gluster StorageClass parameters in annotations, it's insecure.
Instead, expect that the StorageClass is available at the time it's needed by
the Gluster deleter.
2016-10-18 09:54:35 +02:00
Jedrzej Nowak f0988b95e7 Typos and englishify pkg/volume 2016-10-03 22:39:33 +02:00
Justin Santa Barbara 54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit, therefore, changing all uses of the
node name to types.NodeName.

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
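A minimal illustration of the typedef described above and how it changes call sites; the function shown is hypothetical, for illustration only.

```
package main

import "fmt"

// NodeName mirrors the typedef the commit introduces: a distinct string type
// so hostnames, instance IDs, and Node.Name values cannot be mixed up silently.
type NodeName string

// volumesAttachedTo takes a NodeName rather than a bare string; passing a
// string variable holding, say, a hostname now needs an explicit
// NodeName(...) conversion, which makes the intent visible at the call site.
func volumesAttachedTo(node NodeName) []string {
	return []string{"vol-a"} // placeholder result
}

func main() {
	hostname := "ip-10-0-0-1.ec2.internal" // on AWS this is NOT the node name
	fmt.Println(volumesAttachedTo(NodeName(hostname)))
}
```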
Johannes Scheuermann 0b7cb5f2ae Initial Quobyte dynamic provisioning 2016-09-16 13:26:18 +02:00
Kubernetes Submit Queue 6a9a93d469 Merge pull request #32242 from jingxu97/bug-wrongvolume-9-2
Automatic merge from submit-queue

Fix race condition in updating attached volume between master and node

This PR tries to fix issue #29324. The cause of this issue is a race
condition that happens when marking volumes as attached in the node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow is as follows:
1. When a volume is attached successfully, the volume and node info are
added to nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When a detach request comes in, it checks whether it is safe to
detach now. If the check passes, the volume is removed from volumesToReportAsAttached
to indicate it is no longer considered attached.
Afterwards, the reconciler tries to update the node status and trigger the detach
operation. If any of these operations fails, the volume is added back to
the volumesToReportAsAttached list, showing that it is still attached.

These steps should make sure that kubelet gets the right (possibly
outdated) information about which volumes are attached. It also
guarantees that if a detach operation is pending, kubelet will not
trigger any mount operations.
2016-09-12 15:29:38 -07:00
Jing Xu efaceb28cc Fix race condition in updating attached volume between master and node
This PR tries to fix issue #29324. The cause of this issue is a race
condition that happens when marking volumes as attached in the node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow is as follows:
1. When a volume is attached successfully, the volume and node info are
added to nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When a detach request comes in, it checks whether it is safe to
detach now. If the check passes, the volume is removed from volumesToReportAsAttached
to indicate it is no longer considered attached.
Afterwards, the reconciler tries to update the node status and trigger the detach
operation. If any of these operations fails, the volume is added back to
the volumesToReportAsAttached list, showing that it is still attached.

These steps should make sure that kubelet gets the right (possibly
outdated) information about which volumes are attached. It also
guarantees that if a detach operation is pending, kubelet will not
trigger any mount operations.
2016-09-12 13:51:08 -07:00
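A minimal sketch of the add/remove bookkeeping described in the two commits above, with assumed names (`reporter`, `markAttached`, and so on); the real data structure lives in the attach/detach controller's actual state of the world.

```
package main

import "fmt"

// reporter tracks the set of volumes the controller will report as attached
// in each node's status.
type reporter struct {
	volumesToReportAsAttached map[string]map[string]bool // node -> volume set
}

func newReporter() *reporter {
	return &reporter{volumesToReportAsAttached: map[string]map[string]bool{}}
}

// markAttached is step 1: record the volume as attached once the attach succeeds.
func (r *reporter) markAttached(node, vol string) {
	if r.volumesToReportAsAttached[node] == nil {
		r.volumesToReportAsAttached[node] = map[string]bool{}
	}
	r.volumesToReportAsAttached[node][vol] = true
}

// removeForDetach is step 2: once it is safe to detach, stop reporting the volume.
func (r *reporter) removeForDetach(node, vol string) {
	delete(r.volumesToReportAsAttached[node], vol)
}

// addBackOnFailure restores the volume if the status update or detach fails,
// so kubelet keeps seeing it as attached and never starts a mount mid-detach.
func (r *reporter) addBackOnFailure(node, vol string) {
	r.markAttached(node, vol)
}

func main() {
	r := newReporter()
	r.markAttached("node-1", "vol-a")
	r.removeForDetach("node-1", "vol-a")
	r.addBackOnFailure("node-1", "vol-a") // pretend the detach call failed
	fmt.Println(r.volumesToReportAsAttached)
}
```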
Kubernetes Submit Queue 34141a794d Merge pull request #31251 from rootfs/rbd-prov3
Automatic merge from submit-queue

support storage class in Ceph RBD volume

replace WIP PR #30959, using PV annotation idea from @jsafrane 

@kubernetes/sig-storage @johscheuer @elsonrodriguez
2016-09-10 07:03:14 -07:00
Jing Xu b9157b7524 Post event message for volume attachment
This PR adds an event message when attaching a volume fails, to help users
debug. Detach failures may be addressed in a different PR since they require
more data structure changes.
2016-09-01 16:24:36 -07:00
Huamin Chen 0c3b2f44a4 review feedbacks
Signed-off-by: Huamin Chen <hchen@redhat.com>
2016-08-25 15:32:26 -04:00
Kubernetes Submit Queue d0cca393d7 Merge pull request #31034 from jingxu97/unmount-8-19
Automatic merge from submit-queue

Add ismounted check in unmountpath function

This change fixes PR #30930. The function should check whether the mount path
is still mounted. If it is not, it should continue with removing the directory
instead of returning an error.
2016-08-19 22:18:28 -07:00
Jing Xu cafd126ecd Add ismounted check in unmountpath function
This change fixes PR #30930. The function should check whether the mount path
is still mounted. If it is not, it should continue with removing the directory
instead of returning an error.
2016-08-19 17:15:30 -07:00
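A hedged sketch of the fix described above; `isNotMountPoint` stands in for the mount checker kubelet uses, and the directory removal only proceeds once the path is confirmed to be unmounted.

```
package main

import (
	"fmt"
	"os"
)

// unmountPath sketches the fix: before treating a lingering directory as an
// error, check whether it is still a mount point; if it is not, just remove it.
func unmountPath(dir string, isNotMountPoint func(string) (bool, error)) error {
	notMnt, err := isNotMountPoint(dir)
	if err != nil {
		return err
	}
	if !notMnt {
		return fmt.Errorf("%s is still mounted; refusing to remove", dir)
	}
	return os.Remove(dir) // safe: nothing is mounted there anymore
}

func main() {
	dir, _ := os.MkdirTemp("", "vol")
	err := unmountPath(dir, func(string) (bool, error) { return true, nil })
	fmt.Println(err) // <nil>
}
```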
Kubernetes Submit Queue 6ce405c6ee Merge pull request #27778 from screeley44/k8-vol-executor
Automatic merge from submit-queue

Add Events for operation_executor to show status of mounts, failed/successful to show in describe events

Fixes #27590 
@saad-ali @pmorie @erinboyd

After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy it.  The refactoring broke my previously merged UXP PRs that correctly showed failed mount errors in the describe events.  I'm not sure I implemented this correctly, but it tested out and seems to be working; let me know what I missed or if this is not the correct approach.

```
Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  2m		2m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned nfs-bb-pod1 to 127.0.0.1
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedMount	Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  38s		38s		1	{kubelet }				Warning		FailedMount	Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32
Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs []
Output: mount.nfs: Connection timed out

Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable.
Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3).
Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity.
```
2016-08-19 08:27:48 -07:00
Kubernetes Submit Queue 6824f4c08a Merge pull request #28936 from rata/secret-configmap-file-mode
Automatic merge from submit-queue

Allow setting permission mode bits on secrets, configmaps and downwardAPI files

cc @thockin @pmorie 

Here is the first round to implement: https://github.com/kubernetes/kubernetes/pull/28733.

I made two commits: one with the actual change and the other with the auto-generated code. I think it's easier to review this way, but let me know if you prefer in some other way.

I haven't written any tests yet; I wanted to have a first glance and not write them till this (and the API) are closer to the "LGTM" :)

There are some things:
 * I'm not sure where to do the "AND 0777". I'll try to look more closely at the code base, but suggestions are always welcome :)
 * The write permission for group and others is not set when you do an `ls -l` in the running container. It does work with write permissions for the owner. Debugging seems to show that it is something happening after this is correctly set on creation. Will look closer.
 * The default permissions (when the new fields are not specified) are the same as on Kubernetes v1.3
 * I do realize there are conflicts with master, but I think this is good enough to have a look. The conflicts are with the auto-generated code, so the actual code is the same (and it takes like ~30 minutes to generate it here)
 * I didn't generate the docs (`generated-docs` and `generated-swagger-docs` from `hack/update-all.sh`) because my machine runs out of mem. So that's why it isn't in this first PR, will try to investigate and see why it happens.

Other than that, this works fine here with some silly scripts I did to create a secret&configmap&downwardAPI, a pod and check the file permissions. Tested the "defaultMode" and "mode" for all. But of course, will write tests once this is looking fine :)


Thanks a lot again!
Rodrigo
2016-08-18 05:59:48 -07:00
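A rough Go sketch of the new fields discussed above, using simplified stand-in structs rather than the real API types; it only illustrates how `DefaultMode` and a per-item `Mode` (masked with 0777, as discussed) fit together.

```
package main

import "fmt"

// KeyToPath and SecretVolumeSource are simplified stand-ins for the API types
// the PR extends; the real mode fields are *int32 pointers on the secret,
// configMap, and downwardAPI volume sources.
type KeyToPath struct {
	Key  string
	Path string
	Mode *int32 // per-file override
}

type SecretVolumeSource struct {
	SecretName  string
	DefaultMode *int32 // applied to every projected file unless overridden
	Items       []KeyToPath
}

func modePtr(m int32) *int32 { return &m }

func main() {
	src := SecretVolumeSource{
		SecretName:  "my-secret",
		DefaultMode: modePtr(0400), // read-only for the owner
		Items:       []KeyToPath{{Key: "cert", Path: "tls.crt", Mode: modePtr(0444)}},
	}
	// Mode values are masked with 0777 before use.
	fmt.Printf("default %o, cert %o\n", *src.DefaultMode&0777, *src.Items[0].Mode&0777)
}
```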
Rodrigo Campos 568f4c2e63 Add mode permission bits to configmap, secrets and downwardAPI
This implements the proposal in:
docs/proposals/secret-configmap-downwarapi-file-mode.md

Fixes: #28317.

The mounttest image is updated so it returns the permissions of the linked file
and not the symlink itself.
2016-08-17 14:44:41 -04:00
Scott Creeley 782d7d9815 Add Events for operation_executor to show status of mounts, failed or successful 2016-08-17 09:53:47 -04:00
saadali 0c72568247 Skip safe to detach if node api obj doesn't exist 2016-08-16 21:30:51 -07:00
saadali e73c516366 Prevent device unmount from deleting dir on err
Prevent device unmount from deleting dir unless volume is successfully
unmounted first.
2016-08-15 16:58:31 -07:00
Kubernetes Submit Queue 79ed7064ca Merge pull request #27970 from jingxu97/restartKubelet-6-22
Automatic merge from submit-queue

Add volume reconstruct/cleanup logic in kubelet volume manager

Currently kubelet volume management works on the concept of a desired
and an actual world of states. The volume manager periodically compares the
two worlds and performs volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the caches of those two worlds are
gone. Although the desired world can be recovered through the apiserver, the
actual world cannot, which may mean some volumes cannot be cleaned
up if their information is deleted by the apiserver. This change adds
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both the desired
world and the actual world if it cannot be found in either world. The rest of
the logic is the same as before: the desired world populator may clean up
the volume entry if it is no longer in the apiserver, and then the volume
manager should invoke unmount to clean it up.

Fixes https://github.com/kubernetes/kubernetes/issues/27653
2016-08-15 13:48:43 -07:00
Jing Xu f19a1148db This change supports robust kubelet volume cleanup
Currently kubelet volume management works on the concept of a desired
and an actual world of states. The volume manager periodically compares the
two worlds and performs volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the caches of those two worlds are
gone. Although the desired world can be recovered through the apiserver, the
actual world cannot, which may mean some volumes cannot be cleaned
up if their information is deleted by the apiserver. This change adds
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both the desired
world and the actual world if it cannot be found in either world. The rest of
the logic is the same as before: the desired world populator may clean up
the volume entry if it is no longer in the apiserver, and then the volume
manager should invoke unmount to clean it up.
2016-08-15 11:29:15 -07:00
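A minimal sketch of the reconstruction pass described above, assuming the usual per-pod volume directory layout; the names and layout here are illustrative, not the actual kubelet code.

```
package main

import (
	"fmt"
	"path/filepath"
)

// reconstructedVolume is a minimal record for a volume found on disk.
type reconstructedVolume struct {
	PodUID, PluginName, VolumeName string
}

// reconstructVolumes walks the per-pod volume directories after a kubelet
// restart and rebuilds a record for each volume found, so the manager can
// later unmount anything the apiserver no longer knows about. The layout
// assumed here is <podsDir>/<podUID>/volumes/<pluginName>/<volumeName>.
func reconstructVolumes(podsDir string) []reconstructedVolume {
	var out []reconstructedVolume
	dirs, _ := filepath.Glob(filepath.Join(podsDir, "*", "volumes", "*", "*"))
	for _, d := range dirs {
		plugin := filepath.Base(filepath.Dir(d))
		podUID := filepath.Base(filepath.Dir(filepath.Dir(filepath.Dir(d))))
		out = append(out, reconstructedVolume{
			PodUID:     podUID,
			PluginName: plugin,
			VolumeName: filepath.Base(d),
		})
	}
	return out
}

func main() {
	fmt.Println(reconstructVolumes("/var/lib/kubelet/pods"))
}
```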
Jess Frazelle 7e9d82129e
fix go vet errors
Signed-off-by: Jess Frazelle <jessfraz@google.com>

fix composites

Signed-off-by: Jess Frazelle <me@jessfraz.com>
2016-08-10 16:45:41 -07:00
k8s-merge-robot 5760acf603 Merge pull request #29596 from matttproud/fix/time-leaks/remainder
Automatic merge from submit-queue

pkg/various: plug leaky time.New{Timer,Ticker}s

According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are not reclaimed by the garbage collector; they
leak until explicitly stopped.  This commit ensures that all remaining
instances are stopped upon departure from their respective scopes.

Similar efforts were incrementally done in #29439 and #29114.

```release-note
* pkg/various: plugged various time.Ticker and time.Timer leaks.
```
2016-07-29 14:06:47 -07:00
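The pattern these commits apply, in a minimal self-contained form; the function and names below are illustrative only.

```
package main

import (
	"fmt"
	"time"
)

// pollUntil shows the pattern: a time.Ticker is not reclaimed by the garbage
// collector while it keeps ticking, so it must be stopped explicitly when the
// surrounding scope exits.
func pollUntil(done <-chan struct{}, interval time.Duration, work func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop() // the fix: without this, the ticker leaks

	for {
		select {
		case <-done:
			return
		case <-ticker.C:
			work()
		}
	}
}

func main() {
	done := make(chan struct{})
	go pollUntil(done, 10*time.Millisecond, func() { fmt.Println("tick") })
	time.Sleep(35 * time.Millisecond)
	close(done)
	time.Sleep(10 * time.Millisecond)
}
```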
k8s-merge-robot 62e7c57acc Merge pull request #29598 from matttproud/refactor/simplify/goroutinemap
Automatic merge from submit-queue

pkg/util/goroutinemap: apply idiomatic Go cleanups

Package goroutinemap can be structurally simplified to be more
idiomatic, concise, and free of error potential.  No structural changes
are made.

It is unconventional to declare `sync.Mutex` directly as a pointerized
field in a parent structure.  The `sync.Mutex` operates on pointer
receivers of itself; by relying on that, the types that contain
those fields can be safely constructed using the
https://golang.org/ref/spec#The_zero_value semantics.

The duration constants are already of type `time.Duration`, so
re-declaring that is redundant.

/CC: @saad-ali
2016-07-28 04:44:26 -07:00
Paul Morie c884297990 Fix collisions issues / timeouts for mounts
For non-attachable volumes, do not call GetVolumeName on the plugin and instead
generate a unique name based on the identity of the pod and the name of the volume
within the pod.
2016-07-27 17:53:50 -04:00
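A hedged sketch of the naming scheme described above; the helper name and exact format string are illustrative only, not the controller's actual code.

```
package main

import "fmt"

// getUniqueVolumeNameForNonAttachable derives the name from the pod's identity
// and the volume's name inside the pod spec, instead of asking the plugin for
// a volume name (which can collide across pods).
func getUniqueVolumeNameForNonAttachable(podUID, pluginName, volumeSpecName string) string {
	return fmt.Sprintf("%s/%s-%s", pluginName, podUID, volumeSpecName)
}

func main() {
	fmt.Println(getUniqueVolumeNameForNonAttachable(
		"a94f64f1-37c9-11e6-9aa5-52540073d346", "kubernetes.io/secret", "token"))
}
```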
Matt T. Proud 4e0a1858f9 pkg/util/goroutinemap: apply idiomatic Go cleanups
Package goroutinemap can be structurally simplified to be more
idiomatic, concise, and free of error potential.  No structural changes
are made.

It is unconventional to declare `sync.Mutex` directly as a pointerized
field in a parent structure.  The `sync.Mutex` operates on pointer
receivers of itself; by relying on that, the types that contain
those fields can be safely constructed using the
https://golang.org/ref/spec#The_zero_value semantics.

The duration constants are already of type `time.Duration`, so
re-declaring that is redundant.
2016-07-26 07:00:26 +02:00
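A minimal example of the zero-value-friendly layout the commit refers to, using an assumed `goRoutineMap` shape rather than the real package's fields.

```
package main

import (
	"fmt"
	"sync"
)

// goRoutineMap embeds the sync.Mutex by value, not as *sync.Mutex. Mutex
// methods already use pointer receivers, so the zero value of the struct is
// immediately usable and no constructor has to allocate the lock.
type goRoutineMap struct {
	mu         sync.Mutex
	operations map[string]bool
}

// run records an operation if it is not already in flight.
func (g *goRoutineMap) run(name string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.operations == nil {
		g.operations = map[string]bool{}
	}
	if g.operations[name] {
		return false // already running
	}
	g.operations[name] = true
	return true
}

func main() {
	var g goRoutineMap // zero value works; no constructor needed for the mutex
	fmt.Println(g.run("attach-vol-a")) // true
	fmt.Println(g.run("attach-vol-a")) // false
}
```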
Matt T. Proud 5c6292c074 pkg/various: plug leaky time.New{Timer,Ticker}s
According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are not reclaimed by the garbage collector; they
leak until explicitly stopped. This commit ensures that all remaining
instances are stopped upon departure from their respective scopes.
2016-07-26 06:20:31 +02:00
saadali 88d495026d Allow mounts to run in parallel for non-attachable
Allow mount volume operations to run in parallel for non-attachable
volume plugins.

Allow unmount volume operations to run in parallel for all volume
plugins.
2016-07-19 21:54:26 -07:00
Cindy Wang e13c678e3b Make volume unmount more robust using exclusive mount w/ O_EXCL 2016-07-18 16:20:08 -07:00
Davanum Srinivas 2b0ed014b7 Use Go canonical import paths
Add canonical imports only in existing doc.go files.
https://golang.org/doc/go1.4#canonicalimports

Fixes #29014
2016-07-16 13:48:21 -04:00
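For reference, a canonical import comment in a doc.go looks like the sketch below; the path shown illustrates the mechanism, while the actual files touched are listed in the commit itself.

```
// Package volume contains volume-related code (illustrative doc.go).
package volume // import "k8s.io/kubernetes/pkg/volume"
```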
bin liu 426fdc431a Merge branch 'master' into fix-typos 2016-07-04 11:20:47 +08:00
saadali 0dd17fff22 Reorganize volume controllers and manager 2016-07-01 18:50:25 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
saadali e06b32b1ef Mark VolumeInUse before checking if it is Attached
Ensure that kubelet marks VolumeInUse before checking if it is Attached.
Also ensure that the attach/detach controller always fetches a fresh
copy of the node object before detach (instead of relying on the node
informer cache).
2016-06-28 14:05:59 -07:00
saadali dfe8e606c1 Fix device path used by volume WaitForAttach 2016-06-22 12:56:58 -07:00
bin liu fd27cd47f7 fix some typos
Signed-off-by: bin liu <liubin0329@gmail.com>
2016-06-22 18:14:26 +08:00
saadali e716ddc771 Controller wait for attach and exponential backoff
Modify attach/detach controller to keep track of volumes to report
attached in Node VolumeToAttach status.

Modify kubelet volume manager to wait for volume to show up in Node
VolumeToAttach status.

Implement exponential backoff for errors in the volume manager and
attach/detach controller
2016-06-20 18:19:55 -07:00
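A minimal, self-contained sketch of the exponential backoff behaviour described above; the helper name and parameters are assumptions, not the controller's actual code.

```
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryWithExponentialBackoff doubles the wait after each consecutive failure,
// up to a cap, so a persistently failing attach/detach or mount does not
// hot-loop against the API server or cloud provider.
func retryWithExponentialBackoff(initial, max time.Duration, attempts int, op func() error) error {
	delay := initial
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		time.Sleep(delay)
		delay *= 2
		if delay > max {
			delay = max
		}
	}
	return err
}

func main() {
	calls := 0
	err := retryWithExponentialBackoff(10*time.Millisecond, 80*time.Millisecond, 5, func() error {
		calls++
		if calls < 3 {
			return errors.New("attach failed")
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```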
saadali d72f88bf3a Modify Attach method to return device path 2016-06-19 23:54:02 -07:00
saadali 542f2dc708 Introduce new kubelet volume manager
This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if the attach/detach controller
is not enabled).

This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume cleanup never blocks the
`syncPod` loop.
2016-06-15 09:34:08 -07:00
saadali 9b6a505f8a Rename UniqueDeviceName to UniqueVolumeName
Rename UniqueDeviceName to UniqueVolumeName and move helper functions
from attacherdetacher to volumehelper package.
Introduce UniquePodName alias
2016-06-15 09:32:12 -07:00
Saad Ali 9dbe943491 Attach/Detach Controller Kubelet Changes
This PR contains Kubelet changes to enable attach/detach controller control.
* It introduces a new "enable-controller-attach-detach" kubelet flag to
  enable control by the controller. Enabled by default.
* It removes all references to the "SafeToDetach" annotation from the controller.
* It adds the new VolumesInUse field to the Node Status API object.
* It modifies the controller to use VolumesInUse instead of the SafeToDetach
  annotation to gate detachment.
* There is a bug in node-problem-detector that causes VolumesInUse to
  get reset every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9
  opened to fix that.
2016-06-02 16:47:11 -07:00
Vishnu Kannan baa8ac4d6b Add metrics support for a few network based volumes.
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2016-05-23 09:33:12 -07:00
Clayton Coleman 5e4308f91d
Update use of Quantity in other classes 2016-05-19 08:41:43 -04:00
Tim Hockin 527cb50583 Demand at least go1.6 2016-05-08 20:30:37 -07:00
tobad357 1811ded396 This is an update that allows the iscsi volume to check if an iscsi device belongs to an mpio device
If it does, we make sure we mount the mpio device instead of
the raw device.

Heuristics:
Log in to /dev/disk/by-path/iqn-example.com.2999 -> /dev/sde
Check if sde exists in /sys/block/[dm-X]/slaves/xx
If it does, mount /dev/[dm-X], which will look like /dev/mapper/mpiodevicename in mount

examples/iscsi has more details
2016-04-20 09:42:11 +08:00
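A hedged Go sketch of the slaves-directory heuristic above; paths and names are illustrative, and the real plugin does considerably more validation.

```
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findMultipathDevice implements the heuristic: given the raw device a portal
// resolved to (e.g. "sde"), scan /sys/block/dm-*/slaves; if some dm-X lists
// the device as a slave, that dm device is the multipath device and should be
// mounted instead of the raw one.
func findMultipathDevice(sysBlock, rawDev string) (string, bool) {
	matches, _ := filepath.Glob(filepath.Join(sysBlock, "dm-*"))
	for _, dm := range matches {
		if _, err := os.Stat(filepath.Join(dm, "slaves", rawDev)); err == nil {
			return "/dev/" + filepath.Base(dm), true
		}
	}
	return "/dev/" + rawDev, false
}

func main() {
	dev, isMpio := findMultipathDevice("/sys/block", "sde")
	fmt.Println(dev, isMpio)
}
```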
Paul Morie e838ff2893 Make ConfigMap volume readable as non-root 2016-04-05 12:20:52 -04:00
Paul Morie 26471d5723 Remove sentinel file from atomic writer 2016-02-27 16:09:06 -05:00