Commit Graph

63 Commits (e6dbe5f57eecca4d804a9679014d5cbfd821f383)

Author SHA1 Message Date
Kubernetes Submit Queue 3c606cdd20 Merge pull request #41456 from dashpole/pod_volume_cleanup
Automatic merge from submit-queue (batch tested with PRs 41466, 41456, 41550, 41238, 41416)

Delay Deletion of a Pod until volumes are cleaned up

#41436 fixed the bug that caused #41095 and #40239 to have to be reverted.  Now that the bug is fixed, this shouldn't cause problems.

 @vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc
2017-02-16 10:14:05 -08:00
David Ashpole 1d38818326 Revert "Merge pull request #41202 from dashpole/revert-41095-deletion_pod_lifecycle"
This reverts commit ff87d13b2c, reversing
changes made to 46becf2c81.
2017-02-15 08:44:03 -08:00
James Ravn 9992bd23c2 Mark detached only if no pending operations
To safely mark a volume detached when the volume controller manager is used.

An example of one such problem:

1. pod is created, volume is added to desired state of the world
2. reconciler process starts
3. reconciler starts MountVolume, which is kicked off asynchronously via
   operation_executor.go
4. MountVolume mounts the volume, but hasn't yet marked it as mounted
5. pod is deleted, volume is removed from desired state of the world
6. reconciler detects volume is no longer in desired state of world,
   removes it from volumes in use
7. MountVolume tries to mark volume in use, throws an error because
   volume is no longer in actual state of world list.
8. controller-manager tries to detach the volume, this fails because it
   is still mounted to the OS.
9. EBS gets stuck indefinitely in busy state trying to detach.
2017-02-13 11:51:44 +00:00
David Ashpole b224f83c37 Revert "[Kubelet] Delay deletion of pod from the API server until volumes are deleted" 2017-02-09 08:45:18 -08:00
David Ashpole 67cb2704c5 delete volumes before pod deletion 2017-02-08 07:34:49 -08:00
deads2k 8a12000402 move client/record 2017-01-31 19:14:13 -05:00
Kubernetes Submit Queue 564016bd9f Merge pull request #38443 from NickrenREN/vol-mgr-getVolumeInUse
Automatic merge from submit-queue (batch tested with PRs 38443, 40145, 40701, 40682)

fix GetVolumeInUse() function

Since we just want to get volume name info, each volume name just need to added once.  desiredStateOfWorld.GetVolumesToMount() will return volume and pod binding info,
if one volume is mounted to several pods, the volume name will be return several times. That is not what we want in this function.
We can add a new function to only get the volume name info  or judge whether the volume name is added to the desiredVolumesMap array.
2017-01-30 20:59:38 -08:00
Aleksandra Malinowska 74e1d8078e Revert "Delay deletion of pod from the API server until volumes are deleted" 2017-01-27 13:31:02 +01:00
Kubernetes Submit Queue aace5a7b87 Merge pull request #40449 from deads2k/client-15-types
Automatic merge from submit-queue (batch tested with PRs 40239, 40397, 40449, 40448, 40360)

move the discovery and dynamic clients

Moved the dynamic client, discovery client, testing/core, and testing/cache to `client-go`.  Dependencies on api groups we don't have generated clients for have dropped out, so federation, kubeadm, and imagepolicy.

@caesarxuchao @sttts 

approved based on https://github.com/kubernetes/kubernetes/issues/40363
2017-01-26 14:23:42 -08:00
deads2k 9488e2ba30 move testing/core to client-go 2017-01-26 13:54:40 -05:00
David Ashpole 9094b57570 cleanup volumes before deleting from the api server 2017-01-25 10:21:15 -08:00
deads2k 5a8f075197 move authoritative client-go utils out of pkg 2017-01-24 08:59:18 -05:00
Kubernetes Submit Queue dcf14add92 Merge pull request #37228 from sjenning/teardown-terminated-volumes
Automatic merge from submit-queue (batch tested with PRs 37228, 40146, 40075, 38789, 40189)

kubelet: storage: teardown terminated pod volumes

This is a continuation of the work done in https://github.com/kubernetes/kubernetes/pull/36779

There really is no reason to keep volumes for terminated pods attached on the node.  This PR extends the removal of volumes on the node from memory-backed (the current policy) to all volumes.

@pmorie raised a concern an impact debugging volume related issues if terminated pod volumes are removed.  To address this issue, the PR adds a `--keep-terminated-pod-volumes` flag the kubelet and sets it for `hack/local-up-cluster.sh`.

For consideration in 1.6.

Fixes #35406

@derekwaynecarr @vishh @dashpole

```release-note
kubelet tears down pod volumes on pod termination rather than pod deletion
```
2017-01-20 12:34:52 -08:00
Seth Jennings e2750a305a reclaim terminated pod volumes 2017-01-20 11:08:35 -06:00
deads2k ee6752ef20 find and replace 2017-01-20 08:04:53 -05:00
Wojciech Tyczynski 09e4de385c Enable nontrivial secret manager 2017-01-19 19:47:33 +01:00
Antoine Pelisse 3d82265340 Update OWNERS approvers and reviewers: pkg/kubelet 2017-01-18 10:27:11 -08:00
Clayton Coleman 9a2a50cda7
refactor: use metav1.ObjectMeta in other types 2017-01-17 16:17:19 -05:00
NickrenREN a12dea14e0 fix redundant alias clientset 2017-01-12 10:21:05 +08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Kubernetes Submit Queue 402abd23ef Merge pull request #39493 from sjenning/fix-null-deref
Automatic merge from submit-queue (batch tested with PRs 39493, 39496)

kubelet: fix nil deref in volume type check

An attempt to address memory exhaustion through a build up of terminated pods with memory backed volumes on the node in PR https://github.com/kubernetes/kubernetes/pull/36779 introduced this.

For the `VolumeSpec`, either the `Volume` or `PersistentVolume` field is set, not both.  This results in a situation where there is a nil deref on PVs.  Since PVs are inherently not memory-backend, only local/temporal volumes should be considered.

This needs to go into 1.5 as well.

Fixes #39480

@saad-ali @derekwaynecarr @grosskur @gnufied

```release-note
fixes nil dereference when doing a volume type check on persistent volumes
```
2017-01-06 08:44:18 -08:00
Jeff Grafton 20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Seth Jennings c4e6725236 fix nil deref 2017-01-05 15:36:42 -06:00
NickrenREN 3bb75064fb fix GetVolumeInUse() function
Since we just want to get volume name info, each volume name just need to added once. desiredStateOfWorld.GetVolumesToMount() will return volume and pod binding info,
if one volume is mounted to several pods, the volume name will be return several times. That is not what we want in this function.
We can add a new function to only get the volume name info or judge whether the volume name is added to the desiredVolumesMap array.
2017-01-03 14:04:15 +08:00
Mike Danese 161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
rkouj e7e3c55ad7 Add unit tests for MountVolume() of operation executor 2016-12-27 16:07:06 -08:00
rkouj d5f7610b82 Refactor operation_executor to make it unit testable 2016-12-27 15:12:16 -08:00
Kubernetes Submit Queue f1a763e7d7 Merge pull request #38180 from NickrenREN/vmgr-actual-state
Automatic merge from submit-queue (batch tested with PRs 36888, 38180, 38855, 38590)

fix function notes
2016-12-20 20:33:54 -08:00
Kubernetes Submit Queue 1abb8498aa Merge pull request #36888 from linki/patch-1
Automatic merge from submit-queue (batch tested with PRs 36888, 38180, 38855, 38590)

wrong pod reference in error message for volume attach timeout

**What this PR does / why we need it**:
when a disk mount times out you get the following error:

```
Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nginx"/"default". list of unattached/unmounted volumes=[data]
```

where the pod is referenced by "podname"/"namespace", but should be "namespace"/"podname".

**Which issue this PR fixes**
no issue number

**Special notes for your reviewer**:
untested :(
2016-12-20 20:33:52 -08:00
NickrenREN dc7a0bf65e fix function notes
change the function notes according to the implementations of the function AddPodToVolume()
2016-12-21 10:02:20 +08:00
Chao Xu 03d8820edc rename /release_1_5 to /clientset 2016-12-14 12:39:48 -08:00
Mike Danese c87de85347 autoupdate BUILD files 2016-12-12 13:30:07 -08:00
Wojciech Tyczynski aa7da5231f Update bazel files 2016-12-09 09:42:02 +01:00
Wojciech Tyczynski e8d1cba875 GetOptions in client calls 2016-12-09 09:42:01 +01:00
Jordan Liggitt 6819706adf
Pass addressable values to DeepCopy 2016-12-08 14:16:01 -05:00
Martin Linkhorst 001e75ba8c
fix(kubelet): reference pod by namespace/name 2016-12-01 11:22:55 +01:00
Kubernetes Submit Queue ef079a316e Merge pull request #37535 from yarntime/fix_typo_in_volume_manager
Automatic merge from submit-queue

fix typo in volume_manager

fix typo in volume_manager.
2016-11-30 01:26:36 -08:00
yarntime 8447b9f940 fix typo in volumeme_manager 2016-11-28 11:46:22 +08:00
Chao Xu bcc783c594 run hack/update-all.sh 2016-11-23 15:53:09 -08:00
Chao Xu 5e1adf91df cmd/kubelet 2016-11-23 15:53:09 -08:00
Seth Jennings b80bea4a62 fix leaking memory backed volumes of terminated pods 2016-11-16 10:17:22 -06:00
Jing Xu c124830278 fix issue in reconstruct volume data when kubelet restarts
During state reconstruction when kubelet restarts, outerVolueSpecName
cannot be recovered by scanning the disk directories. But this
information is used by volume manager to check whether pod's volume is
mounted or not. There are two possible cases:
1. pod is not deleted during kubelet restarts so that desired state
should have the information. reconciler.updateState() will use this
inforamtion to update.
2. pod is deleted during this period, reconciler has to use
InnerVolumeSpecName, but it should be ok since this information will not
be used for volume cleanup (umount)
2016-11-10 16:23:55 -08:00
Rajat Ramesh Koujalagi d81e216fc6 Better messaging for missing volume components on host to perform mount 2016-11-09 15:16:11 -08:00
Paul Morie 4722cb299b Remove GetRootContext from VolumeHost 2016-11-03 12:21:19 -04:00
Jing Xu b02481708a Fix volume states out of sync problem after kubelet restarts
When kubelet restarts, all the information about the volumes will be
gone from actual/desired states. When update node status with mounted
volumes, the volume list might be empty although there are still volumes
are mounted and in turn causing master to detach those volumes since
they are not in the mounted volumes list. This fix is to make sure only
update mounted volumes list after reconciler starts sync states process.
This sync state process will scan the existing volume directories and
reconstruct actual states if they are missing.

This PR also fixes the problem during orphaned pods' directories. In
case of the pod directory is unmounted but has not yet deleted (e.g.,
interrupted with kubelet restarts), clean up routine will delete the
directory so that the pod directoriy could be cleaned up (it is safe to
delete directory since it is no longer mounted)

The third issue this PR fixes is that during reconstruct volume in
actual state, mounter could not be nil since it is required for creating
container.VolumeMap. If it is nil, it might cause nil pointer exception
in kubelet.

Details are in proposal PR #33203
2016-10-25 12:29:12 -07:00
Mike Danese 3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
Seth Jennings da3683e2b7 kubelet: storage: don't hang kubelet on unresponsive nfs 2016-10-18 08:45:40 -05:00
Jing Xu 9e8edf6baf Fix issue in updating device path when volume is attached multiple times
When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.

This PR partially fixes issue #29324
2016-10-03 17:14:23 -07:00
Justin Santa Barbara 54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
Kubernetes Submit Queue 6a9a93d469 Merge pull request #32242 from jingxu97/bug-wrongvolume-9-2
Automatic merge from submit-queue

Fix race condition in updating attached volume between master and node

This PR tries to fix issue #29324. The cause of this issue is that a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 15:29:38 -07:00