Commit Graph

220 Commits (c3463d737efe553f4818ff6495f93856eac4557b)

Author SHA1 Message Date
Kubernetes Submit Queue ab794c6128 Merge pull request #40918 from k82cn/pv_ctrl_typo
Automatic merge from submit-queue

Fixed typo in pv_controller.go

fixes #40916
2017-02-03 07:37:25 -08:00
Klaus Ma ef5f838c23 Fixed typo in pv_controller.go 2017-02-03 20:55:15 +08:00
Kubernetes Submit Queue 9e427c88c4 Merge pull request #40859 from jsafrane/ps-scheduler-event
Automatic merge from submit-queue (batch tested with PRs 40855, 40859)

PV binding: send an event when there are no PVs to bind

This is similar to scheduler that says "no nodes available to schedule pods"
when it can't schedule a pod.

@kubernetes/sig-storage-pr-reviews
2017-02-02 09:01:48 -08:00
Jan Safranek 13546e5ea4 PV binding: send an event when there are no PVs to bind
This is similar to scheduler that says "no nodes available to schedule pods"
when it can't schedule a pod.
2017-02-02 13:30:53 +01:00
Vladimir Vivien 8ebed57767 Prevent pv controller from forcefully overwrite provisioned volume name
This fix prevents the PV controller from forcefully overwriting the provisioned volume's name with the generated PV name.  Instead, it allows dynamic provisioner implementers to set the name of the volume to a value that they choose.
2017-02-01 12:19:20 -05:00
Jan Safranek 587eb199e0 Remove alpha provisioning 2017-02-01 14:51:54 +01:00
deads2k 8a12000402 move client/record 2017-01-31 19:14:13 -05:00
deads2k 2c1c0f3f72 move workqueue to client-go 2017-01-30 09:08:21 -05:00
Dr. Stefan Schimanski 44ea6b3f30 Update generated files 2017-01-29 21:41:45 +01:00
Dr. Stefan Schimanski bc6fdd925d pkg/api/resource: move to apimachinery 2017-01-29 21:41:44 +01:00
Kubernetes Submit Queue 88890f586c Merge pull request #40126 from resouer/return-value
Automatic merge from submit-queue (batch tested with PRs 40126, 40565, 38777, 40564, 40572)

Do not swallow error in asw.updateNodeStatusUpdateNeeded

Ref #39056

Bubble the error up to `SetNodeUpdateStatusNeeded` and log it out.

NOTE: This does not modify interface of `SetNodeUpdateStatusNeeded`
2017-01-27 01:34:16 -08:00
deads2k 9488e2ba30 move testing/core to client-go 2017-01-26 13:54:40 -05:00
Dr. Stefan Schimanski a0137e9b28 Update generated files 2017-01-25 19:49:45 +01:00
Dr. Stefan Schimanski d7eb3b6870 pkg/util: move uuid and strategicpatch into k8s.io/apimachinery 2017-01-25 19:45:09 +01:00
Harry Zhang 70941f65bf Do not swallow error in volume 2017-01-25 21:29:48 +08:00
deads2k b0b156b381 make tools/cache authoritative 2017-01-25 08:29:45 -05:00
Kubernetes Submit Queue 373e7ef0c0 Merge pull request #40294 from tsmetana/persistent-volume-test-refactor
Automatic merge from submit-queue (batch tested with PRs 39064, 40294)

Refactor persistent volume tests

This is an attempt to make the binder tests a bit more concise. The PVCs are being created by a "templating" function. There is also a handful of PVs in the tests but those vary quite more and I don't think similar approach would save us much code.

Reference:
https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29006#-KPJuVeDE0O6TvDP9jia

@jsafrane: I hope this is what you have on mind.
2017-01-25 02:05:57 -08:00
Clayton Coleman 469df12038
refactor: move ListOptions references to metav1 2017-01-23 17:52:46 -05:00
Wojciech Tyczynski bf7138652f SecretVolume using secret manager 2017-01-23 16:10:01 +01:00
Tomas Smetana 382f7bc9cc Refactor persistent volume tests 2017-01-23 10:43:44 +01:00
Wojciech Tyczynski d08abdb187 Allow for returning map[string]interface{} from patch. 2017-01-18 11:53:30 +01:00
Clayton Coleman 9a2a50cda7
refactor: use metav1.ObjectMeta in other types 2017-01-17 16:17:19 -05:00
Clayton Coleman 36acd90aba
Move APIs and core code to use metav1.ObjectMeta 2017-01-17 16:17:18 -05:00
Kubernetes Submit Queue f74b4bbbad Merge pull request #38094 from yarntime/fix_update_typo
Automatic merge from submit-queue

fix typos

fix typos.
2017-01-16 18:22:33 -08:00
deads2k 77b4d55982 mechanical 2017-01-16 09:35:12 -05:00
Klaus Ma 25fe1e0d82 Made cache.Controller to be interface. 2017-01-13 13:33:23 +08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
yarntime@163.com f7c737e8a9 fix typos 2017-01-11 16:08:20 +08:00
Kubernetes Submit Queue 7c3fff1a95 Merge pull request #39551 from chrislovecnm/reconciler-time-increases
Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)

Increasing times on reconciling volumes fixing impact to AWS.

#**What this PR does / why we need it**:

We are currently blocked by API timeouts with PV volumes.  See https://github.com/kubernetes/kubernetes/issues/39526.  This is a workaround, not a fix.

**Special notes for your reviewer**:

A second PR will be dropped with CLI cobra options in it, but we are starting with increasing the reconciliation periods.  I am dropping this without major testing and will test on our AWS account. Will be marked WIP until I run smoke tests.

**Release note**:

```release-note
Provide kubernetes-controller-manager flags to control volume attach/detach reconciler sync.  The duration of the syncs can be controlled, and the syncs can be shut off as well. 
```
2017-01-10 11:54:15 -08:00
chrislovecnm ac49139c9f updates from review 2017-01-09 17:20:19 -07:00
chrislovecnm a973c38c7d The capability to control duration via controller-manager flags,
and the option to shut off reconciliation.
2017-01-09 16:47:13 -07:00
NickrenREN 639572ac68 fix redundant alias and remove unused function 2017-01-09 17:13:09 +08:00
Jeff Grafton 20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Jan Safranek 0fd5f2028d Add work queues to PV controller
PV controller should not use Controller.Requeue, as as it is not available in
shared informers. We need to implement our own work queues instead where we
can enqueue volumes/claims as we want.
2017-01-02 15:17:24 +01:00
Mike Danese 161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
rkouj e7e3c55ad7 Add unit tests for MountVolume() of operation executor 2016-12-27 16:07:06 -08:00
rkouj d5f7610b82 Refactor operation_executor to make it unit testable 2016-12-27 15:12:16 -08:00
Hemant Kumar 7b423085fa Fix variable shadowing in exponential backoff when deleting volumes
Also fix pv_controller unit tests to behave more accurately
in light of exponential backoffs
2016-12-20 21:31:12 -05:00
Chao Xu 03d8820edc rename /release_1_5 to /clientset 2016-12-14 12:39:48 -08:00
Kubernetes Submit Queue 8abbedae54 Merge pull request #38315 from mikedanese/pin-gazel
Automatic merge from submit-queue

Pin gazel to a version and support cgo

This fixes the bazel build.

@krousey who is buildcop
2016-12-12 19:32:29 -08:00
Kubernetes Submit Queue f45e918b8b Merge pull request #35833 from apelisse/owners-pkg-controller
Automatic merge from submit-queue

Curating Owners: pkg/controller

cc @jsafrane @mikedanese @bprashanth @derekwaynecarr @thockin @saad-ali

In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone **lgtms** and then someone
experienced in the project **approves**), we are adding new reviewers to
existing owners files.
## If You Care About the Process:

We did this by algorithmically figuring out who’s contributed code to
the project and in what directories.  Unfortunately, that doesn’t work
perfectly: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.

Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).

At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
## TLDR:

As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the OWNERS file to add
   the names of people that should be reviewing code in the future in the **reviewers** section. You probably do NOT need to modify the **approvers** section.
3. Notify me if you want some OWNERS file to be removed.  Being an approver or reviewer
   of a parent directory makes you a reviewer/approver of the subdirectories too, so not all
   OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
   over again (don't hesitate to ask me for help, or use the pull-request
   above as an example)
2016-12-12 18:51:33 -08:00
Mike Danese c87de85347 autoupdate BUILD files 2016-12-12 13:30:07 -08:00
Kubernetes Submit Queue 43233caaf0 Merge pull request #37871 from Random-Liu/use-patch-in-kubelet
Automatic merge from submit-queue (batch tested with PRs 36692, 37871)

Use PatchStatus to update node status in kubelet.

Fixes https://github.com/kubernetes/kubernetes/issues/37771.

This PR changes kubelet to update node status with `PatchStatus`.

@caesarxuchao @ymqytw told me that there is a limitation of current `CreateTwoWayMergePatch`, it doesn't support primitive type slice which uses strategic merge.
* I checked the node status, the only primitive type slices in NodeStatus are as follows, they are not using strategic merge:
  * [`ContainerImage.Names`](https://github.com/kubernetes/kubernetes/blob/master/pkg/api/v1/types.go#L2963)
  * [`VolumesInUse`](https://github.com/kubernetes/kubernetes/blob/master/pkg/api/v1/types.go#L2909)
* Volume package is already [using `CreateStrategicMergePath` to generate node status update patch](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/statusupdater/node_status_updater.go#L111), and till now everything is fine. 

@yujuhong @dchen1107 
/cc @kubernetes/sig-node
2016-12-09 11:29:11 -08:00
Wojciech Tyczynski e8d1cba875 GetOptions in client calls 2016-12-09 09:42:01 +01:00
Random-Liu beba1ebbf8 Use PatchStatus to update node status in kubelet. 2016-12-08 17:13:59 -08:00
Jordan Liggitt 6819706adf
Pass addressable values to DeepCopy 2016-12-08 14:16:01 -05:00
Kubernetes Submit Queue 8f11cc78a8 Merge pull request #38339 from gnufied/backoff-on-volume-delete
Automatic merge from submit-queue (batch tested with PRs 38377, 36365, 36648, 37691, 38339)

Exponential back off when volume delete fails

**What this PR does / why we need it**:

This PR implements ability in pv_controller to back off when deleting a volume fails from plugin API. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

Partly fixes #38295 , but I think volume delete is most problematic thing happening in pv_controller without any sort of backoff. 

After this change the attempts of volume deletion look like:

```
controller : I1208 00:18:35.532061   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:20:50.578325   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:23:05.563488   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:25:20.599158   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:27:35.560009   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:29:50.594967   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:32:05.539168   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
controller : I1208 00:34:20.581665   16388 aws_util.go:55] Error deleting EBS Disk volume aws://us-east-1d/vol-abcdefg: VolumeInUse: Volume vol-abcdefg is currently attached to i-1234567
```
2016-12-08 10:52:03 -08:00
Hemant Kumar caf867a402 Exponential back off when volume delete fails
This implements pv_controller to exponentially backoff
when deleting a volume fails in Cloud API. It ensures that
we aren't making too many calls to Cloud API
2016-12-07 19:25:36 -05:00
Alejandro Escobar 759530536f type found with controller comment. 2016-12-07 10:55:02 -08:00
Hemant Kumar fcf5d79be7 Add integration tests for desire state of world populator
This adds tests for code introduced here :
https://github.com/kubernetes/kubernetes/issues/26994

Via integration test we can now verify that if pod delete
event is somehow missed by AttachDetach controller - it still
get cleaned up by Desired State of World populator.
2016-12-06 06:52:52 -05:00
Clayton Coleman 3454a8d52c
refactor: update bazel, codec, and gofmt 2016-12-03 19:10:53 -05:00
Clayton Coleman 5df8cc39c9
refactor: generated 2016-12-03 19:10:46 -05:00
Kubernetes Submit Queue 6fd00e9f56 Merge pull request #37678 from tsmetana/issue_37377
Automatic merge from submit-queue (batch tested with PRs 37608, 37103, 37320, 37607, 37678)

Fix issue #37377: Report an event on successful PVC provisioning

This is a simple patch to fix the issue #37377: On a successful PVC provisioning an event is emitted so it's clear the provisioning actually succeeded.

cc: @jsafrane
2016-12-02 23:32:50 -08:00
Kubernetes Submit Queue fb7e9d901d Merge pull request #37939 from yarntime/fix_typo_in_node_status_updater
Automatic merge from submit-queue (batch tested with PRs 37997, 37939, 37990, 36700, 37258)

fix typo in node_status_updater

fix typo.
2016-12-02 19:26:47 -08:00
Kubernetes Submit Queue c552f8918b Merge pull request #37727 from rkouj/bug-fix-upgrade-test
Automatic merge from submit-queue

SetNodeUpdateStatusNeeded whenever nodeAdd event is received

**What this PR does / why we need it**:
Bug fix and SetNodeStatusUpdateNeeded for a node whenever its api object is added. This is to ensure that we don't lose the attached list of volumes in the node when its api object is deleted and recreated.

fixes https://github.com/kubernetes/kubernetes/issues/37586
         https://github.com/kubernetes/kubernetes/issues/37585


**Special notes for your reviewer**:


<!--  Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access) 
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`. 
-->
2016-12-02 05:44:57 -08:00
yarntime@163.com df6e9db9d9 fix typo 2016-12-02 17:33:45 +08:00
rkouj 638ef1b977 SetNodeUpdateStatusNeeded whenever nodeAdd event is received 2016-11-30 21:12:34 -08:00
Kubernetes Submit Queue 66fe55f5ad Merge pull request #37238 from deads2k/controller-02-minor-fixes
Automatic merge from submit-queue

controller manager refactors

The controller manager needs some significant cleanup.  This starts us down the patch by respecting parameters like `stopCh`, simplifying discovery checks, removing unnecessary parameters, preventing unncessary fatals, and using our client builder.

@sttts @ncdc
2016-11-30 20:08:19 -08:00
Tomas Smetana a02ee64d00 Fix issue #37377: Report an event on successful PVC provisioning
cc: @jsafrane
2016-11-30 10:27:22 +01:00
Pengfei Ni f584ed4398 Fix package aliases to follow golang convention 2016-11-30 15:40:50 +08:00
deads2k d973158a4e make controller manager use specified stop channel 2016-11-28 15:02:21 -05:00
Chao Xu bcc783c594 run hack/update-all.sh 2016-11-23 15:53:09 -08:00
Chao Xu 7eeb71f698 cmd/kube-controller-manager 2016-11-23 15:53:09 -08:00
ymqytw 3cc294b1e0 Revert "support patch list of primitives"
This reverts commit 34891ad9f6.
2016-11-22 21:06:36 -08:00
ymqytw d248843b65 Revert "try old patch after new patch fails"
This reverts commit f32696e734.
2016-11-22 21:02:30 -08:00
Kubernetes Submit Queue b9d2d74a94 Merge pull request #37038 from ymqytw/retry_old_patch_after_new_patch_fail
Automatic merge from submit-queue

Fix kubectl Stratigic Merge Patch compatibility

As @smarterclayton pointed out in [comment1](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290820) and [comment2](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290847) in PR #35647,
we cannot assume the API servers publish version and they shares the same version.

This PR removes all the calls of GetServerSupportedSMPatchVersion().
Change the behavior of `apply` and `edit` to:
Retrying with the old patch version, if the new version fails.
Default other usage of SMPatch to the new version, since they don't update list of primitives.

fixes #36916

cc: @pwittrock @smarterclayton
2016-11-19 01:02:47 -08:00
ymqytw f32696e734 try old patch after new patch fails 2016-11-17 14:28:09 -08:00
Jing Xu 3d3e44e77e fix issue in converting aws volume id from mount paths
This PR is to fix the issue in converting aws volume id from mount
paths. Currently there are three aws volume id formats supported. The
following lists example of those three formats and their corresponding
global mount paths:
1. aws:///vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/vol-123456)
2. aws://us-east-1/vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
3. vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)

For the first two cases, we need to check the mount path and convert
them back to the original format.
2016-11-16 22:35:48 -08:00
Kubernetes Submit Queue 3e169be887 Merge pull request #35647 from ymqytw/patch_primitive_list
Automatic merge from submit-queue

Fix strategic patch for list of primitive type with merge sementic

Fix strategic patch for list of primitive type when the patch strategy is `merge`.
Before: we cannot replace or delete an item in a list of primitive, e.g. string, when the patch strategy is `merge`. It will always append new items to the list.
This patch will generate a map to update the list of primitive type.
The server with this patch will accept either a new patch or an old patch.
The client will found out the APIserver version before generate the patch.

Fixes #35163, #32398

cc: @pwittrock @fabianofranz 

``` release-note
Fix strategic patch for list of primitive type when patch strategy is `merge` to remove deleted objects.
```
2016-11-11 14:36:44 -08:00
Kubernetes Submit Queue 0f082c6663 Merge pull request #36280 from rkouj/better-mount-error
Automatic merge from submit-queue

Better messaging for missing volume binaries on host

**What this PR does / why we need it**:
When mount binaries are not present on a host, the error returned is a generic one.
This change is to check the mount binaries before the mount and return a user-friendly error message.

This change is specific to GCI and the flag is experimental now.

https://github.com/kubernetes/kubernetes/issues/36098

**Release note**:
Introduces a flag `check-node-capabilities-before-mount` which if set, enables a check (`CanMount()`) prior to mount operations to verify that the required components (binaries, etc.) to mount the volume are available on the underlying node. If the check is enabled and `CanMount()` returns an error, the mount operation fails. Implements the `CanMount()` check for NFS.















Sample output post change :


rkouj@rkouj0:~/go/src/k8s.io/kubernetes$ kubectl describe pods
Name:		sleepyrc-fzhyl
Namespace:	default
Node:		e2e-test-rkouj-minion-group-oxxa/10.240.0.3
Start Time:	Mon, 07 Nov 2016 21:28:36 -0800
Labels:		name=sleepy
Status:		Pending
IP:		
Controllers:	ReplicationController/sleepyrc
Containers:
  sleepycontainer1:
    Container ID:	
    Image:		gcr.io/google_containers/busybox
    Image ID:		
    Port:		
    Command:
      sleep
      6000
    QoS Tier:
      cpu:	Burstable
      memory:	BestEffort
    Requests:
      cpu:		100m
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Environment Variables:
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  data:
    Type:	NFS (an NFS mount that lasts the lifetime of a pod)
    Server:	127.0.0.1
    Path:	/export
    ReadOnly:	false
  default-token-d13tj:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-d13tj
Events:
  FirstSeen	LastSeen	Count	From						SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----						-------------	--------	------		-------
  7s		7s		1	{default-scheduler }						Normal		Scheduled	Successfully assigned sleepyrc-fzhyl to e2e-test-rkouj-minion-group-oxxa
  6s		3s		4	{kubelet e2e-test-rkouj-minion-group-oxxa}			Warning		FailedMount	Unable to mount volume kubernetes.io/nfs/32c7ef16-a574-11e6-813d-42010af00002-data (spec.Name: data) on pod sleepyrc-fzhyl (UID: 32c7ef16-a574-11e6-813d-42010af00002). Verify that your node machine has the required components before attempting to mount this volume type. Required binary /sbin/mount.nfs is missing
2016-11-09 18:51:00 -08:00
Kubernetes Submit Queue 6a8edf72e1 Merge pull request #35957 from jsafrane/implement-external-provisioner
Automatic merge from submit-queue

Implement external provisioning proposal

In other words, add "provisioned-by" annotation to all PVCs that should be provisioned dynamically.

Most of the changes are actually in tests.

@kubernetes/sig-storage
2016-11-09 18:12:56 -08:00
Rajat Ramesh Koujalagi d81e216fc6 Better messaging for missing volume components on host to perform mount 2016-11-09 15:16:11 -08:00
ymqytw 34891ad9f6 support patch list of primitives 2016-11-09 11:46:59 -08:00
Paul Morie 4722cb299b Remove GetRootContext from VolumeHost 2016-11-03 12:21:19 -04:00
Jan Safranek 2224e80dd7 Fix race when two provisioner create two PVs for a single claim. 2016-11-03 16:58:25 +01:00
Saad Ali eac6809845 Add reviewers for `controller/volume` dir 2016-11-02 16:19:19 -07:00
Antoine Pelisse db35acde19 Update OWNERS: Remove reviewers: pkg/controller 2016-11-02 16:19:19 -07:00
Antoine Pelisse c695a54c1c Update OWNERS approvers and reviewers: pkg/controller 2016-11-02 16:19:18 -07:00
Jan Safranek 18de83c641 Implement external provisioning proposal
In other words, add "provisioned-by" annotation to all PVCs
that should be provisioned dynamically.
2016-11-02 14:13:34 +01:00
Chao Xu 850729bfaf include multiple versions in clientset
update client-gen to use the term "internalversion" rather than "unversioned";
leave internal one unqualified;
cleanup client-gen
2016-10-29 13:30:47 -07:00
Jing Xu abbde43374 Add sync state loop in master's volume reconciler
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.

This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If the volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, reconciler will take actions if needed.

To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.

More details are explained in PR #33760
2016-10-28 09:24:53 -07:00
Kubernetes Submit Queue 453bfa1f0f Merge pull request #34368 from jingxu97/Oct/statusupdate-10-7
Automatic merge from submit-queue

Node status updater should SetNodeStatusUpdateNeeded if it fails to

update status

When volume controller tries to update the node status, if it fails to
update the nodes status, it should call SetNodeStatusUpdateNeeded so
that the volume list could be updated next time.
2016-10-26 11:09:16 -07:00
Jan Safranek ad946f4fcc Fixed mutation warning in Attach/Detach controller
Objects from shared informer must not be changed, they are shared among all
controllers.

This fixes CacheMutationDetector panic with this output:

CACHE *api.Node[5] ALTERED!
{"metadata":{"name":"ip-172-18-8-71.ec2.internal","selfLink":"/api/v1/nodes/ip-172-18-8-71.ec2.internal","uid":"73d07d16-976e-11e6-8225-0e2f14b56070","resourceVersion":"136","creationTimestamp":"2016-10-21T09:12:12Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/instance-type":"t2.medium","beta.kubernetes.io/os":"linux","failure-domain.beta.kubernetes.io/region":"us-east-1","failure-domain.beta.kubernetes.io/zone":"us-east-1d","kubernetes.io/hostname":"ip-172-18-8-71.ec2.internal"},"annotations":{"volumes.kubernetes.io/controller-managed-attach-detach":"true"}},"spec":{"externalID":"i-9cb6180f","providerID":"aws:///us-east-1d/i-9cb6180f"},"status":{"capacity":{"alpha.kubernetes.io/nvidia-gpu":"0","cpu":"2","memory":"4045568Ki","pods":"110"},"allocatable":{"alpha.kubernetes.io/nvidia-gpu":"0","cpu":"2","memory":"4045568Ki","pods":"110"},"conditions":[{"type":"OutOfDisk","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasSufficientDisk","message":"kubelet has sufficient disk space available"},{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasNoDiskPressure","message":"kubelet has no disk pressure"},{"type":"InodePressure","status":"False","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:12Z","reason":"KubeletHasNoInodePressure","message":"kubelet has no inode pressure"},{"type":"Ready","status":"True","lastHeartbeatTime":"2016-10-21T09:12:52Z","lastTransitionTime":"2016-10-21T09:12:22Z","reason":"KubeletReady","message":"kubelet is posting ready status"}],"addresses":[{"type":"InternalIP","address":"172.18.8.71"},{"type":"LegacyHostIP","address":"172.18.8.71"},{"type":"ExternalIP","address":"54.85.104.236"}],"daemonEndpoints":{"kubeletEndpoint":{"Port":10250}},"nodeInfo":{"machineID":"78a79498db8e4fdc9ac24b5e436a982c","systemUUID":"EC2BB406-5467-4ABE-B54D-D9993C45714F","bootID":"2553d6b8-1ddb-4ef0-902a-d09a807b89ba","kernelVersion":"4.6.7-300.fc24.x86_64","osImage":"Fedora 24 (Cloud Edition)","containerRuntimeVersion":"docker://1.10.3","kubeletVersion":"v1.5.0-alpha.1.726+5aac5eddb809e4","kubeProxyVersion":"v1.5.0-alpha.1.726+5aac5eddb809e4","operatingSystem":"linux","architecture":"amd64"},"images":[{"names":["openshift/origin-release:latest"],"sizeBytes":714569002},{"names":["openshift/origin-haproxy-router-base:latest"],"sizeBytes":294417608},{"names":["openshift/origin-base:latest"],"sizeBytes":275310761},{"names":["docker.io/centos@sha256:2ae0d2c881c7123870114fb9cc7afabd1e31f9888dac8286884f6cf59373ed9b","docker.io/centos:centos7"],"sizeBytes":196744353},{"names":["gcr.io/google_containers/busybox@sha256:4bdd623e848417d96127e16037743f0cd8b528c026e9175e22a84f639eca58ff","gcr.io/google_containers/busybox:1.24"],"sizeBytes":1113554},{"names":["gcr.io/google_containers/pause-amd64@sha256:163ac025575b775d1c0f9bf0bdd0f086883171eb475b5068e7defa4ca9e76516","gcr.io/google_containers/pause-amd64:3.0"],"sizeBytes":746888}],"volumesInUse":["kubernetes.io/aws-ebs/aws://us-east-1d/vol-f4bd0352"]

A: ,"volumesAttached":[{"name":"kubernetes.io/aws-ebs/aws://us-east-1d/vol-f4bd0352","devicePath":"/dev/xvdba"}]}}

B: }}
2016-10-25 14:28:10 +02:00
Jing Xu 70efadc2f4 Node status updater should SetNodeStatusUpdateNeeded if it fails to
update status

When volume controller tries to update the node status, if it fails to
update the nodes status, it should call SetNodeStatusUpdateNeeded so
that the volume list could be updated next time.
2016-10-24 13:59:39 -07:00
Mike Danese 3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
Scott Creeley 86f1a94be5 Adding default StorageClass annotation printout for resource_printer 2016-10-19 10:59:07 -04:00
Jan Safranek 101602ab11 Pass whole PVC to provisioner plugin
Gluster provisioner is interested in pvc.Namespace and I don't want to add
at as a new field in VolumeOptions - it would contain almost whole PVC.

Let's pass direct reference to PVC instead and let the provisioner to pick
information it is interested in.
2016-10-12 12:22:01 +02:00
Jing Xu 9e8edf6baf Fix issue in updating device path when volume is attached multiple times
When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.

This PR partially fixes issue #29324
2016-10-03 17:14:23 -07:00
Kubernetes Submit Queue 1854bdcb0c Merge pull request #29048 from justinsb/volumes_nodename_not_hostname
Automatic merge from submit-queue

Use strongly-typed types.NodeName for a node name

We had another bug where we confused the hostname with the NodeName.

Also, if we want to use different values for the Node.Name (which is
an important step for making installation easier), we need to keep
better control over this.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName
2016-09-27 17:58:41 -07:00
Justin Santa Barbara 54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
Jan Safranek a54c9e2887 Refactor volume controller parameters into a structure
persistentvolumecontroller.NewPersistentVolumeController has 11 arguments now,
put them into a structure.

Also, rename NewPersistentVolumeController to NewController, persistentvolume
is already name of the package.

Fixes #30219
2016-09-26 14:15:25 +02:00
Jan Safranek 5ff1597cf9 Rename controller*.go to pv_controller*.go
To make log filtering easier. controller.go is used by several controllers and
matching logs for "pv_controller.*" is much better.
2016-09-26 12:26:58 +02:00
Kubernetes Submit Queue 4785f6f517 Merge pull request #31978 from jsafrane/detach-before-delete
Automatic merge from submit-queue

Do not report error when deleting an attached volume

Persistent volume controller should not send warning events to a PV and mark the PV as failed when the volume is still attached.

This happens when a user quickly deletes a pod and associated PVC - PV is slowly detaching, while the PVC is already deleted and the PV enters Failed phase.

`Deleter.Deleter` can now return `tryAgainError`, which is sent as INFO to the PV to let the user know we did not forget to delete the PV, however the PV stays in Released state. The controller tries again in the next sync (15 seconds by default).

Fixes #31511
2016-09-25 18:55:32 -07:00
Kubernetes Submit Queue 0a4316f11e Merge pull request #32807 from jingxu97/stateupdateNeeded-9-15
Automatic merge from submit-queue

Fix race condition in setting node statusUpdateNeeded flag

This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.
2016-09-23 11:25:16 -07:00
Jing Xu 14cad206f5 Fix race conditino in setting node statusUpdateNeeded flag
This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.

This PR also changes a unit test due to the workflow changes
2016-09-22 14:02:30 -07:00
Kubernetes Submit Queue e9f4db2748 Merge pull request #27714 from jsafrane/event-recycle
Automatic merge from submit-queue

Send recycle events from pod to pv.

This allows users to diagnose what's wrong with recycler. Recycler pods are started automatically with a cryptic name and they are deleted immediately when they finish.

e.g, `kubectl describe pv` could show that NFS cannot be mounted (and how many pods have tried it):

```
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason          Message
  ---------     --------        -----   ----                            -------------   --------        ------          -------
  59m           59m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(5421800e-347b-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  53m           53m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(3c9809e5-347c-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  46m           46m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(250dd2a2-347d-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  40m           40m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(0d84ea33-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  33m           33m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(f5fb63bf-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  27m           27m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(de7128fd-347f-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  1h            3m              75      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Successfully assigned recycler-for-nfs to 127.0.0.1
  1h            3m              76      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Pod was active on the node longer than specified deadline
  1h            1m              12      {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  20m           1m              4       {persistentvolume-controller }                  Warning         RecyclerPod     (events with common reason combined)
```

These steps were necessary:

- added event watcher to volume.RecycleVolumeByWatchingPodUntilCompletion
- pass all these events through volume plugins to volume controller
- rework volume.RecycleVolumeByWatchingPodUntilCompletion unit tests to a table (too much copy-paste)
- fix all unit tests along the way
2016-09-22 12:18:53 -07:00
Mike Danese a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Kubernetes Submit Queue 843d7cd24c Merge pull request #32576 from wongma7/revert-30825-pv-controller-informer
Automatic merge from submit-queue

Revert "Use PV shared informer in PV controller"

Fixes #32497 

Reverts kubernetes/kubernetes#30825
2016-09-15 04:37:29 -07:00
Jan Safranek 9903b389b3 Update cloud providers 2016-09-15 10:33:57 +02:00
Jan Safranek a24e6a90bd Add new error 2016-09-15 09:39:30 +02:00