github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Jan Safranek	169076e7da	Fix initialization of volume controller caches. Fix PersistentVolumeController.initializeCaches() to pass pointers to volume or claim to storeObjectUpdate() and add extra functions to enforce that the right types are checked in the future. Fixes #28076	2016-06-27 13:08:02 +02:00
Angus Salkeld	b4f7e67d25	Fix startup type error in initializeCaches The following error was getting logged: PersistentVolumeController can't initialize caches, expected list of volumes, got: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink:/api/v1/persistentvolumes ResourceVersion:11} Items:[]}	2016-06-25 10:15:27 +10:00
k8s-merge-robot	07471cf90f	Merge pull request #27553 from justinsb/pvc_zone_spreading_2 Automatic merge from submit-queue AWS/GCE: Spread PetSet volume creation across zones, create GCE volumes in non-master zones Long term we plan on integrating this into the scheduler, but in the short term we use the volume name to place it onto a zone. We hash the volume name so we don't bias to the first few zones. If the volume name "looks like" a PetSet volume name (ending with -<number>) then we use the number as an offset. In that case we hash the base name.	2016-06-22 01:22:16 -07:00
k8s-merge-robot	ec518005a8	Merge pull request #27567 from saad-ali/blockKubeletOnAttachController Automatic merge from submit-queue Kubelet Volume Manager Wait For Attach Detach Controller and Backoff on Error * Closes https://github.com/kubernetes/kubernetes/issues/27483 * Modified Attach/Detach controller to report `Node.Status.AttachedVolumes` on successful attach (unique volume name along with device path). * Modified Kubelet Volume Manager to wait for Attach/Detach controller to report success before proceeding with attach. * Closes https://github.com/kubernetes/kubernetes/issues/27492 * Implemented an exponential backoff mechanism for volume manager and attach/detach controller to prevent operations (attach/detach/mount/unmount/wait for controller attach/etc) from executing back to back unchecked. * Closes https://github.com/kubernetes/kubernetes/issues/26679 * Modified volume `Attacher.WaitForAttach()` methods to use the device path reported by the Attach/Detach controller in `Node.Status.AttachedVolumes` instead of calling out to cloud providers.	2016-06-20 20:36:08 -07:00
saadali	e716ddc771	Controller wait for attach and exponential backoff Modify attach/detach controller to keep track of volumes to report attached in Node VolumeToAttach status. Modify kubelet volume manager to wait for volume to show up in Node VolumeToAttach status. Implement exponential backoff for errors in volume manager and attach detach controller	2016-06-20 18:19:55 -07:00
k8s-merge-robot	d8b463dfd2	Merge pull request #27128 from markturansky/disable_provisioning Automatic merge from submit-queue Allow disabling of dynamic provisioning Allow administrators to opt-out of dynamic provisioning. Provisioning is still on by default, which is the current behavior. Per a conversation with @jsafrane, a boolean toggle was added and plumbed through into the controller. Deliberate disabling will simply return nil from `provisionClaim` whereas a misconfigured provisioner will continue on and generate error events for the PVC. @kubernetes/rh-storage @saad-ali @thockin @abhgupta	2016-06-20 02:10:43 -07:00
k8s-merge-robot	0730ffbff7	Merge pull request #27434 from jsafrane/pv-events-message Automatic merge from submit-queue Fill PV.Status.Message with deleter/recycler errors. Instead of empty `Message` `kubectl describe pv` now shows: ``` Name: nfs Labels: <none> Status: Failed Claim: default/nfs Reclaim Policy: Recycle Access Modes: RWX Capacity: 1Mi Message: Recycler failed: Pod was active on the node longer than specified deadline Source: Type: NFS (an NFS mount that lasts the lifetime of a pod) Server: 10.999.999.999 Path: / ReadOnly: false ``` This is actually a regression since 1.2 @kubernetes/sig-storage	2016-06-20 01:36:28 -07:00
markturansky	16ec36c591	added toggle to disable dynamic provisioning	2016-06-20 01:15:23 -04:00
Justin Santa Barbara	e711cbf912	GCE/AWS: Spread PetSet volume creation across zones Long term we plan on integrating this into the scheduler, but in the short term we use the volume name to place it onto a zone. We hash the volume name so we don't bias to the first few zones. If the volume name "looks like" a PetSet volume name (ending with -<number>) then we use the number as an offset. In that case we hash the base name. Fixes #27256	2016-06-17 23:27:31 -04:00
goltermann	218645b346	Fix several spelling errors in comments.	2016-06-17 10:41:18 -07:00
saadali	542f2dc708	Introduce new kubelet volume manager This commit adds a new volume manager in kubelet that synchronizes volume mount/unmount (and attach/detach, if attach/detach controller is not enabled). This eliminates the race conditions between the pod creation loop and the orphaned volumes loops. It also removes the unmount/detach from the `syncPod()` path so volume clean up never blocks the `syncPod` loop.	2016-06-15 09:34:08 -07:00
saadali	9b6a505f8a	Rename UniqueDeviceName to UniqueVolumeName Rename UniqueDeviceName to UniqueVolumeName and move helper functions from attacherdetacher to volumehelper package. Introduce UniquePodName alias	2016-06-15 09:32:12 -07:00
Jan Safranek	449e9f49d3	Fill PV.Status.Message with deleter/recycler errors.	2016-06-15 14:56:31 +02:00
Jan Safranek	6081bd61f0	Rework PV controller to use util/goroutinemap	2016-06-09 13:49:04 +02:00
k8s-merge-robot	bd2bc25308	Merge pull request #25865 from jsafrane/devel/pv-convert-from-12 Automatic merge from submit-queue volume controller: Convert PersistentVolumes from Kubernetes 1.2 In Kubernetes 1.2 we used template PersistentVolume for provisioning. When a claim for dynamic volume was detected, Kubernetes did: - create template PV for the claim with dummy pointer to storage asset - allocate storage asset such as AWS EBS - fill real pointer to the created storage asset to the template PV In refactored volume provisioner, Kubernetes allocates the storage asset first and then creates a Kubernetes PV instance already with the correct pointer to the storage asset. To support seamless upgrade from 1.2 to 1.3 we need to remove these unprovisioned template PVs. The new controller does not use them, it will see PVC for dynamic provisioning and create real PV instead. See https://github.com/pmorie/pv-haxxz/pull/3 for pseudocode.	2016-06-03 23:27:13 -07:00
k8s-merge-robot	4877153727	Merge pull request #26772 from jsafrane/flake-controller-cache-empty Automatic merge from submit-queue Wait for all volumes/claims to get synced in unit test. Controller.HasSynced() returns true when all initial claims/volumes were sent to appropriate goroutines, not when the goroutine has actually processed them. Fixes #26712	2016-06-03 17:05:22 -07:00
Jan Safranek	27b11c5342	Convert PersistentVolumes from Kubernetes 1.2 In Kubernetes 1.2 we used template PersistentVolume for provisioning. When a claim for dynamic volume was detected, Kubernetes did: - create template PV for the claim with dummy pointer to storage asset - allocate storage asset such as AWS EBS - fill real pointer to the created storage asset to the template PV In refactored volume provisioner, Kubernetes allocates the storage asset first and then creates a Kubernetes PV instance already with the correct pointer to the storage asset. To support seamless upgrade from 1.2 to 1.3 we need to remove these unprovisioned template PVs. The new controller does not use them, it will see PVC for dynamic provisioning and create real PV instead.	2016-06-03 14:26:06 +02:00
k8s-merge-robot	59e008dbcb	Merge pull request #26733 from pmorie/pv-controller-typos Automatic merge from submit-queue Fix typo and linewrap comments in PV controller Fix some typos and linewrap long comments that I found while going over this code investigating something.	2016-06-03 04:51:08 -07:00
Jan Safranek	962505ad01	Wait for all volumes/claims to get synced in unit test. Controller.HasSynced() returns true when all initial claims/volumes were sent to appropriate goroutines, not when the goroutine has actually processed them.	2016-06-03 10:53:56 +02:00
k8s-merge-robot	a41d84408c	Merge pull request #26518 from jsafrane/initial-sync Automatic merge from submit-queue Fill controller caches on startup The controller needs to fill its caches before it starts binding/recycling/ deleting or provisioning volumes and claims. This was done using blocking initial 'xxx added' from going through syncClaim/syncVolume. However, when the caches were full, the controller waited for the next sync period to do actual binding/recycling etc. In this patch, the controller fills its caches directly from etcd and then processes initial 'xxx added' events to reconcile the world and bind/recycle/ delete/provision stuff, resulting in faster binding after startup. Fixes #25967 (properly)	2016-06-02 21:44:56 -07:00
Paul Morie	277c0a4e90	Fix typo and linewrap comments in PV controller	2016-06-02 15:50:07 -04:00
k8s-merge-robot	335da9b125	Merge pull request #26410 from jsafrane/fix-test-race Automatic merge from submit-queue Fix data race in volume controller unit test. Reactor must be locked when fiddling with reactor.volumes and reactor.claims. Therefore add new functions to add/delete volume/claim with sending an event. Fixes #26345	2016-06-02 04:25:08 -07:00
Jan Safranek	ee74cc4354	Fix fake event recorder race Event recorder should wait for some time to get all expected events, the event may be written by another goroutine that has just finished. It should not slow down the test in most cases, only when there is a bug and expected event is not sent.	2016-06-01 10:16:35 +02:00
Jan Safranek	2d43e4549e	Fix data race in volume controller unit test. Reactor must be locked when fiddling with reactor.volumes and reactor.claims. Therefore add new functions to add/delete volume/claim with sending an event.	2016-06-01 08:35:33 +02:00
k8s-merge-robot	04f77dd602	Merge pull request #26556 from jsafrane/fix-format Automatic merge from submit-queue Fix log arguments. 'i' is not printed. @kubernetes/sig-storage	2016-05-31 21:24:50 -07:00
k8s-merge-robot	38d5be4f36	Merge pull request #26555 from jsafrane/stabilize-test-flakes Automatic merge from submit-queue Stabilize controller unit tests. Remove test "5-1", it's flaky as it depends on order of execution of goroutines. When the controller starts, existing claim is enqueued as "initial sync event" and a new volume is enqueued to separate goroutine. It is not deterministic which goroutine processes its events first and there is no way how to tell that the claim event was processed. Also, force resync of the controllers after the test to make sure all events are processed. Fixes unit test flakes. @kubernetes/sig-storage	2016-05-31 17:06:12 -07:00
Jan Safranek	21059e8b6d	Fix log arguments. 'i' is not printed.	2016-05-31 12:12:15 +02:00
Jan Safranek	011eac7c8b	Stabilize controller unit tests. Remove test "5-1", it's flaky as it depends on order of execution of goroutines. When the controller starts, existing claim is enqueued as "initial sync event" and a new volume is enqueued to separate goroutine. It is not deterministic which goroutine processes its events first and there is no way how to tell that the claim event was processed. Also, force resync of the controllers after the test to make sure all events are processed.	2016-05-31 12:07:47 +02:00
Paul Morie	4ffa3c6754	Add label selector to match criteria for claims to volumes	2016-05-30 12:11:12 -04:00
Paul Morie	faa112bad1	Add selector to PersistentVolumeClaim	2016-05-30 12:09:50 -04:00
k8s-merge-robot	9aeeef1d81	Merge pull request #26414 from jsafrane/reduce-sync-period Automatic merge from submit-queue Reduce volume controller sync period fixes #24236 and most probably also fixes #25294. Needs #25881! With the cache, binder is not affected by sync period. Without the cache, binding of 1000 PVCs takes more than 5 minutes (instead of ~70 seconds). 15 seconds were chosen by fair 2d10 roll :-)	2016-05-30 05:54:51 -07:00
Jan Safranek	df161c3a7e	Fill controller caches on startup The controller needs to fill its caches before it starts binding/recycling/ deleting or provisioning volumes and claims. This was done using blocking initial 'xxx added' from going through syncClaim/syncVolume. However, when the caches were full, the controller waited for the next sync period to do actual binding/recycling etc. In this patch, the controller fills its caches directly from etcd and then processes initial 'xxx added' events to reconcile the world and bind/recycle/ delete/provision stuff, resulting in faster binding after startup. Fixes #25967 (properly)	2016-05-30 13:16:45 +02:00
k8s-merge-robot	5643b7498f	Merge pull request #25881 from jsafrane/devel/pv-add-cache Automatic merge from submit-queue volume controller: Add cache with the latest version of PVs and PVCs When the controller binds a PV to PVC, it saves both objects to etcd. However, there is still an old version of these objects in the controller Informer cache. So, when a new PVC comes, the PV is still seen as available and may get bound to the new PVC. This will be blocked by etcd, still, it creates unnecessary traffic that slows everything down. To make everything worse, when periodic sync with the old PVC is performed, this PVC is seen by the controller as Pending (while it's already Bound on etcd) and will be bound to a different PV. Writing to this PV won't be blocked by etcd, only subsequent write of the PVC fails. So, the controller will need to roll back the PV in another transaction(s). The controller can keep itself pretty busy this way. Also, we save bound PVs (and PVCs) as two transactions - we save say PV.Spec first and then .Status. The controller gets "PV.Spec updated" event from etcd and tries to fix the Status, as it seems to the controller it's outdated. This write again fails - there already is a correct version in etcd. As we can't influence the Informer cache, it is read-only to the controller, this patch introduces second cache in the controller, which holds latest and greatest version on PVs and PVCs to prevent these useless writes to etcd. It gets updated with events from etcd and after etcd confirms successful save of PV/PVC modified by the controller. The cache stores only pointers to PVs/PVCs, so in ideal case it shares the actual object data with the informer cache. They will diverge only for a short time when the controller modifies something and the informer cache did not get update events yet. @kubernetes/sig-storage	2016-05-30 04:13:18 -07:00
Jan Safranek	2aa9f1dd8f	Reduce volume controller sync period	2016-05-30 09:59:31 +02:00
Alex Mohr	9803393a67	Merge pull request #25960 from jsafrane/do-not-sort-bind volume controller: Speed up binding by not sorting volumes	2016-05-26 15:47:14 -07:00
k8s-merge-robot	da7d3c189a	Merge pull request #25869 from jsafrane/devel/operation-logs Automatic merge from submit-queue volume controller: use better operation names Using volume/claim.UID in the operation name is not really useful, as UIDs are not logged by rest of the controller. On the other hand, volume.Name and claim.Namespace/Name is logged pretty often and it would help to log these also in operation name. Still, I'd prefer to have the operation name really unique to be protected from users deleting a volume and quickly creating another one with the same name, so UID is still part of the operation name. This has been already proven to be very useful in controller debugging.	2016-05-25 17:58:07 -07:00
Brendan Burns	88663fc58b	Add some extra checking in the tests to prevent flakes.	2016-05-23 16:25:02 -07:00
k8s-merge-robot	62a8394eb4	Merge pull request #25263 from jsafrane/devel/adopt-recycle-pod Automatic merge from submit-queue volume recycler: Don't start a new recycler pod if one already exists. Recycling is a long duration process and when the recycler controller is restarted in the meantime, it should not start a new recycler pod if there is one already running. This means that the recycler pod must have deterministic name based on name of the recycled PV, we then get name conflicts when creating the pod. Two things need to be changed: - recycler controller and recycler plugins must pass the PV.Name to place, where the pod is created. This is most of the patch and it should be pretty straightforward. - create recycler pod with deterministic name and check "already exists" error. When at it, remove useless 'resourceVersion' argument and make log messages starting with lowercase. There is a unit test to check the behavior + there is an e2e test that checks that regular recycling is not broken (it does not try to run two recycler pods in parallel as the recycler is single-threaded now).	2016-05-21 02:28:26 -07:00
Jan Safranek	c7da3abd5b	volume controller: Speed up binding by not sorting volumes The binder sorts all available volumes first, then it filters out volumes that cannot be bound by processing each volume in a loop and then finds the smallest matching volume by binary search. So, if we process every available volume in a loop, we can also remember the smallest matching one and save us potentially long sorting (and quick binary search).	2016-05-20 12:26:39 +02:00
Jan Safranek	0279232360	volume controller: Add cache with the latest version of PVs and PVCs When the controller binds a PV to PVC, it saves both objects to etcd. However, there is still an old version of these objects in the controller Informer cache. So, when a new PVC comes, the PV is still seen as available and may get bound to the new PVC. This will be blocked by etcd, still, it creates unnecessary traffic that slows everything down. Also, we save bound PV/PVC as two transactions - we save PV/PVC.Spec first and then .Status. The controller gets "PV/PVC.Spec updated" event from etcd and tries to fix the Status, as it seems to the controller it's outdated. This write again fails - there already is a correct version in etcd. We can't influence the Informer cache, it is read-only to the controller. To prevent these useless writes to etcd, this patch introduces second cache in the controller, which holds latest and greatest version on PVs and PVCs. It gets updated with events from etcd and after etcd confirms successful save of PV/PVC modified by the controller. The cache stores only pointers to PVs/PVCs, so in ideal case it shares the actual object data with the informer cache. They will diverge only when the controller modifies something and the informer cache did not get update events yet.	2016-05-19 16:09:06 +02:00
Jan Safranek	e9a6ec29a0	volume controller: use better operation names Using volume/claim.UID in the operation name is not really useful, as UIDs are not logged by rest of the controller. On the other hand, volume.Name and claim.Namespace/Name is logged pretty often and it would help to log these also in operation name. This has been already proven to be very useful in controller debugging.	2016-05-19 14:19:33 +02:00
Jan Safranek	0ee9160f88	volume recycler: Don't start a new recycler pod if one already exists. Recycling is a long duration process and when the recycler controller is restarted in the meantime, it should not start a new recycler pod if there is one already running. This means that the recycler pod must have deterministic name based on name of the recycled PV, we then get name conflicts when creating the pod. Two things need to be changed: - recycler controller and recycler plugins must pass the PV.Name to place, where the pod is created. - create recycler pod with deterministic name and check "already exists" error. When at it, remove useless 'resourceVersion' argument and make log messages starting with lowercase.	2016-05-19 12:58:25 +02:00
Jan Safranek	61d630ddf7	volume controller: Fix method name in a log message It's deleteVolume, not deleteClaim.	2016-05-19 12:54:17 +02:00
Jan Safranek	01b20d8e77	Generate shorter provisioned PV names. GCE PD names are generated out of provisioned PV.Name, therefore it should be as short as possible and still unique.	2016-05-18 10:06:51 +02:00
Jan Safranek	79b91b9ee0	Refactor persistent volume initialization There should be only one initialization function, shared by the real controller and unit tests.	2016-05-18 10:06:51 +02:00
Jan Safranek	7f549511e2	Big move and rename - remove persistentvolume_ prefix from all files - split controller.go into controller.go and controller_base.go (to have them under 1500 lines for github)	2016-05-18 10:06:51 +02:00
Jan Safranek	c5fe1f943c	Fixed binder logging - we need the original volume/claim in error paths - don't report version conflicts as errors (they happen pretty often and we recover from them)	2016-05-18 10:06:51 +02:00
Jan Safranek	41adcc5496	Speed up binding of provisioned volumes This fixes e2e test for provisioning - it expects that provisioned volumes are bound quickly. Majority of this patch is update of test framework needs to initialize the controller appropriately.	2016-05-18 10:06:51 +02:00
Jan Safranek	c6f05c8056	provisioning: Add unit tests for provisioning errors.	2016-05-18 10:06:51 +02:00
Jan Safranek	c24b33793c	unit test: Add possibility to inject kubeclient errors.	2016-05-18 10:06:51 +02:00
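
The zone-spreading heuristic described in commits 07471cf90f and e711cbf912 (hash the volume name; if the name ends with -<number>, hash the base name and use the number as an offset) might look roughly like the sketch below. It is a Python illustration only, not the Kubernetes code: the name `choose_zone`, the use of CRC32 as the hash, and the sorting of the zone list are all assumptions made for the example.

```python
import re
import zlib

# Hypothetical pattern for names that "look like" PetSet volumes: <base>-<number>.
_PETSET_SUFFIX = re.compile(r"^(.*)-(\d+)$")


def choose_zone(volume_name, zones):
    """Pick a zone from the volume name alone (illustrative sketch)."""
    if not zones:
        raise ValueError("no zones available")
    ordered = sorted(zones)  # assumption: a stable order so all callers agree
    match = _PETSET_SUFFIX.match(volume_name)
    if match:
        # PetSet-like name: hash only the base, use the ordinal as an offset.
        base, offset = match.group(1), int(match.group(2))
    else:
        base, offset = volume_name, 0
    digest = zlib.crc32(base.encode("utf-8"))  # assumption: any stable hash works
    return ordered[(digest + offset) % len(ordered)]
```

With three zones, the names `data-web-0`, `data-web-1`, and `data-web-2` share a base hash and differ only by offset, so under this sketch they land in three different zones.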

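Commit c7da3abd5b replaces sort-then-binary-search with a single loop that remembers the smallest matching volume. A minimal sketch of that idea follows, in Python rather than the controller's own code; the dictionary shape (`name`, `capacity`) and the `matches` predicate are hypothetical stand-ins for the real volume objects and binding checks.

```python
def find_smallest_match(volumes, requested, matches=lambda vol: True):
    """Return the smallest volume that satisfies the request, or None.

    One pass over the volumes: skip those that cannot be bound and keep
    track of the smallest acceptable one, so no sorting is needed.
    """
    best = None
    for vol in volumes:
        if vol["capacity"] < requested or not matches(vol):
            continue  # too small or filtered out
        if best is None or vol["capacity"] < best["capacity"]:
            best = vol
    return best
```

The loop is linear in the number of volumes, whereas sorting first costs O(n log n) before the filtering loop even starts.
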

163 Commits (7e670fb847430f91319ebefd67deb870b9970428)