Commit Graph

1157 Commits (eeda4c79155a2b90fdb45848c133428a19502ef7)

Author SHA1 Message Date
k8s-merge-robot d36375954e Merge pull request #27733 from caesarxuchao/gc-parametercodec
Automatic merge from submit-queue

let dynamic client handle non-registered ListOptions

And register v1.ListOptions in the policy group.

Fix #27622

@lavalamp @smarterclayton @krousey
2016-06-22 17:36:16 -07:00
Chao Xu d9f07925be let dynamic client handle non-registered ListOptions;
register ListOptions for apis/policy
2016-06-22 13:18:50 -07:00
saadali dfe8e606c1 Fix device path used by volume WaitForAttach 2016-06-22 12:56:58 -07:00
k8s-merge-robot 5289de0501 Merge pull request #27837 from saad-ali/blockKubeletDetachFix
Automatic merge from submit-queue

Prevent detach before node status update

The PR prevents the attach/detach controller from starting a detach operation before updating the node status (to remove the volume from the list of attached volumes).

Fixes https://github.com/kubernetes/kubernetes/issues/27836
2016-06-22 10:10:58 -07:00
saadali 773ac20880 Prevent detach before node status update 2016-06-22 04:45:50 -07:00
k8s-merge-robot 07471cf90f Merge pull request #27553 from justinsb/pvc_zone_spreading_2
Automatic merge from submit-queue

AWS/GCE: Spread PetSet volume creation across zones, create GCE volumes in non-master zones

Long term we plan on integrating this into the scheduler, but in the
short term we use the volume name to place it onto a zone.
    
We hash the volume name so we don't bias to the first few zones.
    
If the volume name "looks like" a PetSet volume name (ending with
-<number>) then we use the number as an offset.  In that case we hash
the base name.
2016-06-22 01:22:16 -07:00
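The placement rule described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `chooseZone`, the FNV hash, and the sorted zone list are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
	"strings"
)

// chooseZone is a hypothetical helper: it maps a volume name onto one of the
// given zones. If the name ends in "-<number>", the base name is hashed and
// the number is used as an offset, so consecutive PetSet volumes land in
// consecutive zones.
func chooseZone(zones []string, volumeName string) string {
	if len(zones) == 0 {
		return ""
	}
	// Sort a copy so the result does not depend on the caller's ordering.
	sorted := append([]string(nil), zones...)
	sort.Strings(sorted)

	base := volumeName
	var offset uint64
	if i := strings.LastIndex(volumeName, "-"); i >= 0 {
		if n, err := strconv.ParseUint(volumeName[i+1:], 10, 32); err == nil {
			base = volumeName[:i]
			offset = n
		}
	}

	// Hash the (base) name so placement is not biased to the first zones.
	h := fnv.New32a()
	h.Write([]byte(base))
	idx := (uint64(h.Sum32()) + offset) % uint64(len(sorted))
	return sorted[idx]
}

func main() {
	zones := []string{"zone-a", "zone-b", "zone-c"}
	for _, name := range []string{"www-web-0", "www-web-1", "www-web-2", "scratch"} {
		fmt.Printf("%s -> %s\n", name, chooseZone(zones, name))
	}
}
```

Because the three `www-web-*` names share a base name, they differ only in their offset and therefore spread across all three zones.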
Jing Xu 0fefb23f94 implement desiredWorld populator to sync up with informer
This change implements the desiredStateOfWorld populator to sync up with
the pod informer. It periodically checks each pod in the
desiredStateOfWorld and verifies whether it is still in the pod informer
cache. If it is not, the pod is removed from the desiredStateOfWorld.
2016-06-21 17:09:35 -07:00
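One pass of the populator described in the commit above might look like the sketch below. The types and names (`desiredStateOfWorld`, `removeStalePods`, the `inInformerCache` callback) are hypothetical stand-ins, not the real controller types.

```go
package main

import (
	"fmt"
	"sync"
)

// desiredStateOfWorld is a simplified stand-in that only tracks pod names.
type desiredStateOfWorld struct {
	mu   sync.Mutex
	pods map[string]bool
}

func (d *desiredStateOfWorld) addPod(name string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.pods[name] = true
}

func (d *desiredStateOfWorld) deletePod(name string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	delete(d.pods, name)
}

// podNames returns a snapshot so callers can iterate without holding the lock.
func (d *desiredStateOfWorld) podNames() []string {
	d.mu.Lock()
	defer d.mu.Unlock()
	names := make([]string, 0, len(d.pods))
	for name := range d.pods {
		names = append(names, name)
	}
	return names
}

// removeStalePods performs one sync pass: every pod that the informer cache
// no longer knows about is dropped. It returns the number of pods removed.
func removeStalePods(dsw *desiredStateOfWorld, inInformerCache func(string) bool) int {
	removed := 0
	for _, name := range dsw.podNames() {
		if !inInformerCache(name) {
			dsw.deletePod(name)
			removed++
		}
	}
	return removed
}

func main() {
	dsw := &desiredStateOfWorld{pods: map[string]bool{}}
	dsw.addPod("default/web-0")
	dsw.addPod("default/web-1")

	// Assumed informer cache contents: web-1 has been deleted.
	cache := map[string]bool{"default/web-0": true}
	removed := removeStalePods(dsw, func(name string) bool { return cache[name] })
	fmt.Println("removed:", removed, "remaining:", dsw.podNames())
}
```

In a real controller this pass would run periodically, for example from a `time.Ticker` loop.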
k8s-merge-robot 459757cf08 Merge pull request #27728 from janetkuo/deployment-cleanup-unhealthy
Automatic merge from submit-queue

Deployment controller's cleanupUnhealthyReplicas should respect minReadySeconds

```release-note
Fixed an issue where a Deployment could be scaled down further than allowed by maxUnavailable when minReadySeconds is set.
```

Fixes #26834

Detected by a flake in deployment rollover e2e test (the only test that specifies `minReadySeconds`).

cc @kubernetes/deployment @pwittrock 
cc @mqliang who first added `cleanupUnhealthyReplicas` in deployment controller 

2016-06-21 11:46:12 -07:00
k8s-merge-robot ec518005a8 Merge pull request #27567 from saad-ali/blockKubeletOnAttachController
Automatic merge from submit-queue

Kubelet Volume Manager Wait For Attach Detach Controller and Backoff on Error

* Closes https://github.com/kubernetes/kubernetes/issues/27483
  * Modified Attach/Detach controller to report `Node.Status.AttachedVolumes` on successful attach (unique volume name along with device path).
  * Modified Kubelet Volume Manager to wait for Attach/Detach controller to report success before proceeding with attach.
* Closes https://github.com/kubernetes/kubernetes/issues/27492
  * Implemented an exponential backoff mechanism for volume manager and attach/detach controller to prevent operations (attach/detach/mount/unmount/wait for controller attach/etc) from executing back to back unchecked.
* Closes https://github.com/kubernetes/kubernetes/issues/26679
  * Modified volume `Attacher.WaitForAttach()` methods to use the device path reported by the Attach/Detach controller in `Node.Status.AttachedVolumes` instead of calling out to cloud providers.
2016-06-20 20:36:08 -07:00
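The exponential backoff mentioned in the PR above could work along these lines. This is a sketch only: the `backoff` type and the `initialDelay` and `maxDelay` values are assumptions for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed values for illustration only.
const (
	initialDelay = 500 * time.Millisecond
	maxDelay     = 2 * time.Minute
)

// backoff tracks the last failure of one operation (for example, attaching
// one volume) and how long to wait before the operation may be retried.
type backoff struct {
	lastErrorTime time.Time
	delay         time.Duration
}

// recordError doubles the wait after each failure, capped at maxDelay.
func (b *backoff) recordError(now time.Time) {
	switch {
	case b.delay == 0:
		b.delay = initialDelay
	case b.delay*2 > maxDelay:
		b.delay = maxDelay
	default:
		b.delay *= 2
	}
	b.lastErrorTime = now
}

// allowed reports whether the operation may run again at the given time.
func (b *backoff) allowed(now time.Time) bool {
	return b.delay == 0 || !now.Before(b.lastErrorTime.Add(b.delay))
}

// reset clears the backoff state after a successful operation.
func (b *backoff) reset() { *b = backoff{} }

func main() {
	b := &backoff{}
	now := time.Now()
	for i := 1; i <= 5; i++ {
		b.recordError(now)
		fmt.Printf("failure %d: retry allowed after %v\n", i, b.delay)
	}
	fmt.Println("allowed immediately:", b.allowed(now))
	b.reset()
	fmt.Println("allowed after reset:", b.allowed(now))
}
```

Keeping one such record per operation means a failing volume backs off on its own without delaying unrelated volumes.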
saadali e716ddc771 Controller wait for attach and exponential backoff
Modify attach/detach controller to keep track of volumes to report
attached in Node VolumeToAttach status.

Modify kubelet volume manager to wait for volume to show up in Node
VolumeToAttach status.

Implement exponential backoff for errors in volume manager and attach
detach controller
2016-06-20 18:19:55 -07:00
Janet Kuo 726ba45b59 Deployment controller's cleanupUnhealthyReplicas should respect minReadySeconds 2016-06-20 15:03:57 -07:00
k8s-merge-robot d19c8ed825 Merge pull request #27609 from ZTE-PaaS/zhangke-patch-001
Automatic merge from submit-queue

EndpointController syncService log error

The key param here should be the service, not the rc.
2016-06-20 13:06:44 -07:00
k8s-merge-robot d8b463dfd2 Merge pull request #27128 from markturansky/disable_provisioning
Automatic merge from submit-queue

Allow disabling of dynamic provisioning

Allow administrators to opt-out of dynamic provisioning.  Provisioning is still on by default, which is the current behavior.

Per a conversation with @jsafrane, a boolean toggle was added and plumbed through into the controller.  Deliberate disabling will simply return nil from `provisionClaim` whereas a misconfigured provisioner will continue on and generate error events for the PVC.

@kubernetes/rh-storage @saad-ali @thockin  @abhgupta
2016-06-20 02:10:43 -07:00
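The difference between deliberately disabled and misconfigured provisioning, as described in the PR above, can be illustrated with a small sketch. The `controller` struct, its fields, and the returned error (standing in for the error event on the PVC) are assumptions, not the actual controller code.

```go
package main

import (
	"errors"
	"fmt"
)

// controller is a hypothetical, heavily simplified PV controller.
type controller struct {
	enableDynamicProvisioning bool
	// provisioner creates a volume for a claim; nil models a misconfiguration.
	provisioner func(claim string) (volume string, err error)
}

// provisionClaim returns nil without doing anything when provisioning is
// deliberately disabled, and reports an error when it is enabled but no
// provisioner is configured.
func (c *controller) provisionClaim(claim string) error {
	if !c.enableDynamicProvisioning {
		return nil // opt-out: silently skip
	}
	if c.provisioner == nil {
		return errors.New("no provisioner configured for claim " + claim)
	}
	_, err := c.provisioner(claim)
	return err
}

func main() {
	disabled := &controller{enableDynamicProvisioning: false}
	fmt.Println("disabled:", disabled.provisionClaim("default/data"))

	misconfigured := &controller{enableDynamicProvisioning: true}
	fmt.Println("misconfigured:", misconfigured.provisionClaim("default/data"))
}
```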
k8s-merge-robot 0730ffbff7 Merge pull request #27434 from jsafrane/pv-events-message
Automatic merge from submit-queue

Fill PV.Status.Message with deleter/recycler errors.

Instead of an empty `Message`, `kubectl describe pv` now shows:

```
Name:		nfs
Labels:		<none>
Status:		Failed
Claim:		default/nfs
Reclaim Policy:	Recycle
Access Modes:	RWX
Capacity:	1Mi
Message:	Recycler failed: Pod was active on the node longer than specified deadline
Source:
    Type:	NFS (an NFS mount that lasts the lifetime of a pod)
    Server:	10.999.999.999
    Path:	/
    ReadOnly:	false
```

This is actually a regression since 1.2

@kubernetes/sig-storage
2016-06-20 01:36:28 -07:00
saadali 926bb4cca0 Add patch status to Node internalclientset 2016-06-19 23:54:02 -07:00
markturansky 16ec36c591 added toggle to disable dynamic provisioning 2016-06-20 01:15:23 -04:00
Justin Santa Barbara e711cbf912 GCE/AWS: Spread PetSet volume creation across zones
Long term we plan on integrating this into the scheduler, but in the
short term we use the volume name to place it onto a zone.

We hash the volume name so we don't bias to the first few zones.

If the volume name "looks like" a PetSet volume name (ending with
-<number>) then we use the number as an offset.  In that case we hash
the base name.

Fixes #27256
2016-06-17 23:27:31 -04:00
goltermann 218645b346 Fix several spelling errors in comments. 2016-06-17 10:41:18 -07:00
Ke Zhang c8471f2c3e EndpointController syncService log error 2016-06-17 17:05:50 +08:00
k8s-merge-robot 646a872f15 Merge pull request #27415 from caesarxuchao/fix-oldrc
Automatic merge from submit-queue

fix updatePod() of RS and RC controllers

Fix updatePod of replication controller manager and replica set controller to handle pod label updates that match no RC or RS.

Fix #27405
2016-06-16 17:09:53 -07:00
Chao Xu 63fb075f0a fix updatePod of replication controller manager and replica set controller to
handle pod label updates that match no rc or rs
2016-06-15 10:34:26 -07:00
saadali 542f2dc708 Introduce new kubelet volume manager
This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if attach/detach controller
is not enabled).

This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume clean up never blocks the
`syncPod` loop.
2016-06-15 09:34:08 -07:00
saadali 9b6a505f8a Rename UniqueDeviceName to UniqueVolumeName
Rename UniqueDeviceName to UniqueVolumeName and move helper functions
from attacherdetacher to volumehelper package.
Introduce UniquePodName alias
2016-06-15 09:32:12 -07:00
Jan Safranek 449e9f49d3 Fill PV.Status.Message with deleter/recycler errors. 2016-06-15 14:56:31 +02:00
k8s-merge-robot 2b9670b77b Merge pull request #27190 from caesarxuchao/remove-debugging-log
Automatic merge from submit-queue

Fix a debugging line

A trivial update. @k8s-oncall can we manually merge it?
2016-06-14 16:53:09 -07:00
Wojciech Tyczynski 5d702a32c1 Fix race in informer 2016-06-14 16:40:12 +02:00
k8s-merge-robot f97bca37a5 Merge pull request #27127 from jsafrane/refactor-binder-operations
Automatic merge from submit-queue

Rework PV controller to use util/goroutinemap


@kubernetes/sig-storage
2016-06-12 23:44:28 -07:00
k8s-merge-robot 628af356b8 Merge pull request #26980 from hongchaodeng/fix
Automatic merge from submit-queue

processor listener: fix locking in pop()

Currently the lock in processorListener is used to guard pendingNotifications. But in pop, the lock is also held around the select on channels, which blocks the goroutine while it holds the lock.

This PR changes the lock to guard the correct section only.
2016-06-12 17:59:09 -07:00
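The locking pattern described in the PR above, where the lock guards only the pending queue and is released before blocking on a channel, can be sketched as follows. The `listener` type and `popOne` are simplified, hypothetical stand-ins for the real processorListener.

```go
package main

import (
	"fmt"
	"sync"
)

// listener is a simplified stand-in: mu guards only the pending slice.
type listener struct {
	mu      sync.Mutex
	pending []string
	out     chan string
}

func newListener() *listener {
	return &listener{out: make(chan string)}
}

// add queues a notification; it only needs the lock briefly.
func (l *listener) add(n string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.pending = append(l.pending, n)
}

func (l *listener) pendingLen() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return len(l.pending)
}

// popOne takes the next notification under the lock, then releases the lock
// BEFORE the potentially blocking send, so add() is never stuck behind a
// slow receiver. It returns false if nothing was pending or stop was closed.
func (l *listener) popOne(stop <-chan struct{}) bool {
	l.mu.Lock()
	if len(l.pending) == 0 {
		l.mu.Unlock()
		return false
	}
	n := l.pending[0]
	l.pending = l.pending[1:]
	l.mu.Unlock()

	select {
	case l.out <- n:
		return true
	case <-stop:
		return false
	}
}

func main() {
	l := newListener()
	stop := make(chan struct{})
	l.add("pod added")
	l.add("pod updated")

	go func() {
		for l.popOne(stop) {
		}
		close(l.out)
	}()
	for n := range l.out {
		fmt.Println(n)
	}
}
```

Had `popOne` kept the mutex while blocked in the `select`, any concurrent `add` would stall until a receiver showed up.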
Chao Xu c15c10f312 fix a log line 2016-06-10 09:58:27 -07:00
Janet Kuo 764df2e096 Listing pods only once when getting pods for RS in deployment 2016-06-10 09:55:28 -07:00
Jan Safranek 6081bd61f0 Rework PV controller to use util/goroutinemap 2016-06-09 13:49:04 +02:00
Hongchao Deng d4eb48c0bb add TestPopReleaseLock 2016-06-08 11:34:35 -07:00
Hongchao Deng 308201acb0 processor listener: fix locking in pop() 2016-06-08 11:34:35 -07:00
k8s-merge-robot 707cc2bbb8 Merge pull request #26493 from caesarxuchao/fix-gc-flake
Automatic merge from submit-queue

Fixes 25890 flake. Let GC convert ListOptions to v1 before passing it to the dynamic client

GC's ListWatcher directly passed the api.ListOptions to the dynamic client, but the parameter codec of dynamic client converts the options to queries based on the tags in the struct, which are not present in api.ListOptions, so the queries are not sent to the server. As a result, the Watch request was sent without a resourceVersion, causing missed events. Flake #25890 is caused by the missed deletion events.

This PR converts the api.ListOptions to v1.ListOptions before the GC passes it to the dynamic codec. The flaky test has successfully passed 79 times ([log](https://00e9e64bacd064560a027fbee9c5a373a1614f3a56e652ae40-apidata.googleusercontent.com/download/storage/v1_internal/b/kubernetes-jenkins/o/pr-logs%2Fpull%2F25923%2Fkubernetes-pull-test-unit-integration%2F28364%2Fbuild-log.txt?qk=AD5uMEv72OjSUqDyk5i-ZLurcmM4i7gket1c7WaqR7yuIYz7WhPYT7ewVBafijV0ymnPTYqxRYt1kp6S9YQv7chPwC-3UtrKetKfhYnvAFrPGXAIBxHytTmpFohRAYgsARN1B6j1f9vyK5lM-8jyzRGhCK3sCRsAPnbDBWIWFlbH4b1n3vUET3P71QamHrF5itYyaqRU5pMZV3Cwwr81X8q7h5hCzm3Ip78RpMzfjEqTG0RcM2TLGccUrlkWVBLh4hn0NFpUIkzVFugFA5ooJffo-0AdJnO3mGWEOnXNVFWftJbK8cKnTns0DISrYFOyH_PlOe_YHCxgIXIT-dW8G-nbqoUjn5SBqunr36rcpaYCIwe2va4W_AcLCT43xiEAezRER_U9AuIqi_22KMd6SuHTyljhmWFPvPk8-gpjthLWXhcE7LPO5dV41hnZHnbI4n_9eI1nSVm7q9XdSvX1sWKV1GCwn8oj017AnxVvl9bScultko_0dTC747UqJ6UTFakLuFcHFe-F5Tz7ItDWlBVPoXeC7gTpyuicFKLsdqGlW9F5X6kIwNrBRj9uRsS-QuzSER-fVkQCn4dUTcokttRH_0bYvyfr9oqiDXmywMgOp-L0sKayk8JOVynh2q0Tju9sdkvFr0PxoAjhofomfIC1SZ_JkOzwAT1TUW8dLjPHluMct34xW_-qna1AmkoxM4bZQLhllap96NTC-0IdtzeKDrTul8p7u3WXSJjjEMSijibTNMlnkB0AluT1_RNO94OnzuFv4YlcV24FPhJzchhbyKREkOb_wzgcnSbRwGHjIcfRgkX-IzoXHVBcMYFUrPmsXrnRcfad4XwjkUOgvivkURW2_EwnzgrLDh-IKek51_0FpT1MnFCSG0gQbVSs_iMVPr6UXNAw62LGbKVtl3ZMXyapEpcO8azNbn6Wvd550R704JXxYlU)).

@lavalamp @krousey @smarterclayton
2016-06-04 01:52:31 -07:00
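The failure mode described in the PR above, where a tag-driven encoder silently drops fields of a struct that has no tags, can be reproduced with a toy encoder. This is a simplified illustration that assumes `json` tags drive the encoding; `toQuery` and both option structs are hypothetical and are not the dynamic client's actual parameter codec.

```go
package main

import (
	"fmt"
	"net/url"
	"reflect"
	"strings"
)

// internalListOptions mimics an internal type: its fields carry no tags.
type internalListOptions struct {
	ResourceVersion string
	Watch           bool
}

// versionedListOptions mimics a versioned type with tagged fields.
type versionedListOptions struct {
	ResourceVersion string `json:"resourceVersion,omitempty"`
	Watch           bool   `json:"watch,omitempty"`
}

// toQuery is a toy encoder: it emits only fields that carry a json tag,
// so untagged structs produce an empty query string. obj must be a struct.
func toQuery(obj interface{}) url.Values {
	values := url.Values{}
	v := reflect.ValueOf(obj)
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		name := strings.Split(t.Field(i).Tag.Get("json"), ",")[0]
		if name == "" || name == "-" {
			continue // no usable tag: the field is silently dropped
		}
		values.Set(name, fmt.Sprint(v.Field(i).Interface()))
	}
	return values
}

func main() {
	internal := internalListOptions{ResourceVersion: "42", Watch: true}
	versioned := versionedListOptions{ResourceVersion: "42", Watch: true}

	fmt.Printf("internal:  %q\n", toQuery(internal).Encode())
	fmt.Printf("versioned: %q\n", toQuery(versioned).Encode())
}
```

With the untagged struct the request carries no `resourceVersion`, which matches the missed-event symptom the PR describes. Converting to the tagged type first restores the query parameters.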
k8s-merge-robot bd2bc25308 Merge pull request #25865 from jsafrane/devel/pv-convert-from-12
Automatic merge from submit-queue

volume controller: Convert PersistentVolumes from Kubernetes 1.2

In Kubernetes 1.2 we used template PersistentVolume for provisioning. When a claim for dynamic volume was detected, Kubernetes did:

- create template PV for the claim with dummy pointer to storage asset
- allocate storage asset such as AWS EBS
- fill real pointer to the created storage asset to the template PV

In refactored volume provisioner, Kubernetes allocates the storage asset first and then creates a Kubernetes PV instance already with the correct pointer to the storage asset.

To support a seamless upgrade from 1.2 to 1.3 we need to remove these unprovisioned template PVs. The new controller does not use them; it will see the PVC for dynamic provisioning and create a real PV instead.

See https://github.com/pmorie/pv-haxxz/pull/3 for pseudocode.
2016-06-03 23:27:13 -07:00
k8s-merge-robot 4877153727 Merge pull request #26772 from jsafrane/flake-controller-cache-empty
Automatic merge from submit-queue

Wait for all volumes/claims to get synced in unit test.

Controller.HasSynced() returns true when all initial claims/volumes were sent
to appropriate goroutines, not when the goroutine has actually processed them.

Fixes #26712
2016-06-03 17:05:22 -07:00
k8s-merge-robot a00dbea133 Merge pull request #26758 from mqliang/lookupcache-threadsafe
Automatic merge from submit-queue

bugfix:lookupcache's Get method can not be called concurrently

ref https://github.com/kubernetes/kubernetes/issues/26376

@lavalamp @therc @mikedanese
2016-06-03 12:46:13 -07:00
Chao Xu 06f49f7ca7 Let the dynamic client take a customized parameter codec for List, Watch, and DeleteCollection.
Let the gc's ListWatcher use api.ParameterCodec. Fixes 25890.
2016-06-03 11:22:51 -07:00
mqliang 9a0ff5a9e8 bugfix:lookupcache's Get method can not be called concurrently 2016-06-04 02:21:25 +08:00
Jan Safranek 27b11c5342 Convert PersistentVolumes from Kubernetes 1.2
In Kubernetes 1.2 we used template PersistentVolume for provisioning. When a
claim for dynamic volume was detected, Kubernetes did:
- create template PV for the claim with dummy pointer to storage asset
- allocate storage asset such as AWS EBS
- fill real pointer to the created storage asset to the template PV

In refactored volume provisioner, Kubernetes allocates the storage asset first
and then creates a Kubernetes PV instance already with the correct pointer
to the storage asset.

To support a seamless upgrade from 1.2 to 1.3 we need to remove these
unprovisioned template PVs. The new controller does not use them; it will see
the PVC for dynamic provisioning and create a real PV instead.
2016-06-03 14:26:06 +02:00
k8s-merge-robot 3157e87cb2 Merge pull request #26768 from wojtek-t/routecontroller_logs
Automatic merge from submit-queue

Improve logging in routecontroller

@zmerlynn
2016-06-03 04:51:12 -07:00
k8s-merge-robot 59e008dbcb Merge pull request #26733 from pmorie/pv-controller-typos
Automatic merge from submit-queue

Fix typo and linewrap comments in PV controller

Fix some typos and linewrap long comments that I found while going over this code investigating something.
2016-06-03 04:51:08 -07:00
Wojciech Tyczynski de1d35a66d Improve logging in routecontroller 2016-06-03 12:05:12 +02:00
Jan Safranek 962505ad01 Wait for all volumes/claims to get synced in unit test.
Controller.HasSynced() returns true when all initial claims/volumes were sent
to appropriate goroutines, not when the goroutine has actually processed them.
2016-06-03 10:53:56 +02:00
k8s-merge-robot 75ef1ca270 Merge pull request #26351 from saad-ali/attachDetachControllerKubeletChanges
Automatic merge from submit-queue

Attach/Detach Controller Kubelet Changes

This PR contains changes to enable attach/detach controller proposed in #20262.

Specifically it:
* Introduces a new `enable-controller-attach-detach` kubelet flag to enable control by attach/detach controller. Default enabled.
* Removes all references to the `SafeToDetach` annotation from controller.
* Adds the new `VolumesInUse` field to the Node Status API object.
* Modifies the controller to use `VolumesInUse` instead of `SafeToDetach` annotation to gate detachment.
* Modifies kubelet to set `VolumesInUse` before Mount and after Unmount.
  * There is a bug in the `node-problem-detector` binary that causes `VolumesInUse` to get reset to nil every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9#issuecomment-221770924 opened to fix that.
  * There is a bug here in the mount/unmount code that prevents resetting `VolumesInUse` in some cases; this will be fixed by the mount/unmount refactor.
* Have controller process detaches before attaches so that volumes referenced by pods that are rescheduled to a different node are detached first.
* Fix misc bugs in controller.
* Modify GCE attacher to: remove retries, remove mutex, and not fail if volume is already attached or already detached.

Fixes #14642, #19953

```release-note
Kubernetes v1.3 introduces a new Attach/Detach Controller. This controller manages attaching and detaching volumes on-behalf of nodes that have the "volumes.kubernetes.io/controller-managed-attach-detach" annotation.

A kubelet flag, "enable-controller-attach-detach" (default true), controls whether a node sets the "controller-managed-attach-detach" annotation or not.
```
2016-06-02 23:30:32 -07:00
k8s-merge-robot a41d84408c Merge pull request #26518 from jsafrane/initial-sync
Automatic merge from submit-queue

Fill controller caches on startup

The controller needs to fill its caches before it starts binding/recycling/deleting or provisioning volumes and claims. This was done by blocking the initial 'xxx added' events from going through syncClaim/syncVolume. However, when the caches were full, the controller waited for the next sync period to do actual binding/recycling etc.

In this patch, the controller fills its caches directly from etcd and then processes initial 'xxx added' events to reconcile the world and bind/recycle/delete/provision stuff, resulting in faster binding after startup.

Fixes #25967 (properly)
2016-06-02 21:44:56 -07:00
Saad Ali 9dbe943491 Attach/Detach Controller Kubelet Changes
This PR contains Kubelet changes to enable attach/detach controller control.
* It introduces a new "enable-controller-attach-detach" kubelet flag to
  enable control by controller. Default enabled.
* It removes all references to the "SafeToDetach" annotation from controller.
* It adds the new VolumesInUse field to the Node Status API object.
* It modifies the controller to use VolumesInUse instead of SafeToDetach
  annotation to gate detachment.
* There is a bug in node-problem-detector that causes VolumesInUse to
  get reset every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9
  opened to fix that.
2016-06-02 16:47:11 -07:00
Paul Morie 277c0a4e90 Fix typo and linewrap comments in PV controller 2016-06-02 15:50:07 -04:00
Janet Kuo 36f704c975 List RSes only once when getting old+new RSes in deployment controller 2016-06-02 11:24:43 -07:00
k8s-merge-robot 335da9b125 Merge pull request #26410 from jsafrane/fix-test-race
Automatic merge from submit-queue

Fix data race in volume controller unit test.

Reactor must be locked when fiddling with reactor.volumes and reactor.claims. Therefore, add new functions that add/delete a volume/claim and send an event.

Fixes #26345
2016-06-02 04:25:08 -07:00