Automatic merge from submit-queue
Check node conditions for DaemonSets when updating nodes.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #45628
**Release note**:
```release-note
NONE
```
If an error occurred during the UpdateNodeStatuses loop, some code paths
did not call SetNodeStatusUpdateNeeded, leaking state. Add the call to
all paths by introducing a helper function.
Part of #40583
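A minimal sketch of the pattern (names other than SetNodeStatusUpdateNeeded are hypothetical):

```go
package main

import "fmt"

// actualState stands in for the attach/detach controller's actual state of
// world; only SetNodeStatusUpdateNeeded matters for this sketch.
type actualState interface {
	SetNodeStatusUpdateNeeded(nodeName string)
}

// updateNodeStatus funnels every failure through one place, so no error
// path can return without re-marking the node as needing an update.
func updateNodeStatus(asw actualState, nodeName string, update func() error) error {
	if err := update(); err != nil {
		asw.SetNodeStatusUpdateNeeded(nodeName) // never leak the flag
		return fmt.Errorf("failed to update status for node %q: %v", nodeName, err)
	}
	return nil
}
```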
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)
Local storage plugin
**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes. A scheduler predicate directs pods with already-bound PVCs to the node where the local PV resides. PVC binding still happens independently.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #43640
**Release note**:
```release-note
Alpha feature: the local volume plugin allows local directories to be created and consumed as a Persistent Volume. These volumes have node affinity, and pods will only be scheduled to the node where the volume resides.
```
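A hedged Go sketch of such a volume (the alpha node-affinity annotation key and the `Local` source are assumptions based on this description; check the feature docs for the exact shape):

```go
package main

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// localPV builds a local PersistentVolume pinned to one node. The
// node-affinity annotation key is an assumption from the alpha feature.
func localPV(name, path, node string) *v1.PersistentVolume {
	affinity := `{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[` +
		`{"matchExpressions":[{"key":"kubernetes.io/hostname","operator":"In","values":["` + node + `"]}]}]}}`
	return &v1.PersistentVolume{
		ObjectMeta: metav1.ObjectMeta{
			Name: name,
			// JSON-encoded NodeAffinity; pods using this PV schedule onto `node`.
			Annotations: map[string]string{"volume.alpha.kubernetes.io/node-affinity": affinity},
		},
		Spec: v1.PersistentVolumeSpec{
			Capacity:    v1.ResourceList{v1.ResourceStorage: resource.MustParse("10Gi")},
			AccessModes: []v1.PersistentVolumeAccessMode{v1.ReadWriteOnce},
			PersistentVolumeSource: v1.PersistentVolumeSource{
				Local: &v1.LocalVolumeSource{Path: path},
			},
		},
	}
}
```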
Automatic merge from submit-queue (batch tested with PRs 46635, 45619, 46637, 45059, 46415)
fix a comment and log message in the nodecontroller
I was poking around in the nodecontroller code and this looked wrong.
Automatic merge from submit-queue (batch tested with PRs 43275, 45014, 46449, 46488, 46525)
fix typo in taint_controller
**What this PR does / why we need it**:
fix typo in taint_controller
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Automatic merge from submit-queue
avoiding an unnecessary loop to copy the listed pods
**What this PR does / why we need it**: avoids an unnecessary loop that copied the listed pods
**Which issue this PR fixes** : fixes #46433
**Release note**:
```release-note
```
/assign @wojtek-t
adding comments stating that returned pods should be used as read-only objects
fixing typo
avoiding an unnecessary loop to copy the listed pods; see #46433
fixing fmt
Automatic merge from submit-queue
Optimize provisioner plugin result check logic
If err is not returned by findProvisionablePlugin(...), storageClass is certainly not nil, so the extra nil check can be dropped.
**Release note**:
```release-note
NONE
```
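The contract in sketch form (types and bodies are stand-ins; only the err/nil invariant matters):

```go
package main

import "errors"

type plugin struct{}
type storageClass struct{ Provisioner string }

// findProvisionablePlugin mimics the helper's contract: it returns either
// an error, or a non-nil plugin and storage class (names hypothetical).
func findProvisionablePlugin(className string) (*plugin, *storageClass, error) {
	if className == "" {
		return nil, nil, errors.New("claim has no storage class")
	}
	return &plugin{}, &storageClass{Provisioner: "kubernetes.io/example"}, nil
}

func provision(className string) error {
	p, sc, err := findProvisionablePlugin(className)
	if err != nil {
		return err
	}
	// No `sc == nil` check needed: a nil error guarantees a non-nil class.
	_ = p
	_ = sc.Provisioner
	return nil
}
```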
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)
Node status updater now deletes the node entry in attach updates when the node is missing in the NodeInformer cache.
- Added RemoveNodeFromAttachUpdates as part of node status updater operations.
**What this PR does / why we need it**: Fixes issue of unnecessary node status updates when node is deleted.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42438
**Special notes for your reviewer**: Unit test added, but a more comprehensive test involving the attach/detach controller requires testing functionality that is currently absent and will require a larger effort; it will be added at a later time.
There is an edge case caused by the following steps:
1) A node is deleted and restarted. The node exists, but is not yet recognized by Kubernetes.
2) A pod requiring a volume attach is created with nodeName explicitly set to this node.
This would make the pod stuck in ContainerCreating state. This is low-pri since it's a specific edge case that can be avoided.
**Release note**:
```release-note
NONE
```
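A sketch of the updater change (the RemoveNodeFromAttachUpdates name comes from this PR; everything else is illustrative):

```go
package main

import "fmt"

// actualStateOfWorld mirrors the state the updater consults; only the new
// method matters for this sketch.
type actualStateOfWorld interface {
	RemoveNodeFromAttachUpdates(nodeName string) error
}

// updateNodeStatus shows the fix: when the node is gone from the informer
// cache, drop its pending attach update instead of retrying forever.
func updateNodeStatus(asw actualStateOfWorld, getNode func(string) (bool, error), nodeName string) error {
	found, err := getNode(nodeName)
	if err != nil {
		return err
	}
	if !found {
		// Node was deleted; stop generating status updates for it.
		return asw.RemoveNodeFromAttachUpdates(nodeName)
	}
	fmt.Printf("updating attached-volume status for node %q\n", nodeName)
	return nil
}
```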
Automatic merge from submit-queue (batch tested with PRs 45518, 46127, 46146, 45932, 45003)
PodDisruptionBudget should use ControllerRef
Fixes https://github.com/kubernetes/kubernetes/issues/42284
```release-note
PodDisruptionBudget now uses ControllerRef to decide which controller owns a given Pod, so it doesn't get confused by controllers with overlapping selectors.
```
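The core of the ControllerRef approach, sketched with the apimachinery helper:

```go
package main

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ownerOf follows the pod's single ControllerRef back to its owning
// controller, instead of matching every controller whose selector happens
// to cover the pod (a sketch of the lookup, not the PDB's exact code).
func ownerOf(pod *v1.Pod) (string, bool) {
	ref := metav1.GetControllerOf(pod)
	if ref == nil {
		return "", false // pod has no controller; counted on its own
	}
	// Only the controller named by the ControllerRef is consulted, so
	// controllers with overlapping selectors no longer confuse the PDB.
	return ref.Kind + "/" + ref.Name, true
}
```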
Automatic merge from submit-queue
fix regression in UX experience for double attach volume
send event when volume is not allowed to multi-attach
Fixes #46012
**Release note**:
```release-note
NONE
```
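A sketch of the event being sent (the reason string is illustrative, not necessarily the controller's exact wording):

```go
package main

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
)

// reportMultiAttach surfaces the multi-attach rejection as a pod event
// rather than only a controller log line, which is the UX fix described.
func reportMultiAttach(recorder record.EventRecorder, pod *v1.Pod, volume, otherNode string) {
	recorder.Eventf(pod, v1.EventTypeWarning, "FailedAttachVolume",
		"Multi-Attach error for volume %q: volume is already exclusively attached to node %q",
		volume, otherNode)
}
```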
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)
break the loop when found true
Automatic merge from submit-queue (batch tested with PRs 45891, 46147)
fix typo
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 45514, 45635)
refactor certificate controller to break it into two parts
Break pkg/controller/certificates into:
* pkg/controller/certificates/approver: containing the group approver
* pkg/controller/certificates/signer: containing the local signer
* pkg/controller/certificates: containing shared infrastructure
```release-note
Break the 'certificatesigningrequests' controller into a 'csrapprover' controller and 'csrsigner' controller.
```
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
GC: update required verbs for deletable resources, allow list of ignored resources to be customized
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
Also allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
cc @caesarxuchao @deads2k @smarterclayton @mfojtik @liggitt @sttts @kubernetes/sig-api-machinery-pr-reviews
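The verb criterion in sketch form, using client-go's discovery filter:

```go
package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/discovery"
)

// deletableResources keeps only resources that support every verb the
// garbage collector actually uses (a sketch of the updated criterion).
func deletableResources(all []*metav1.APIResourceList) []*metav1.APIResourceList {
	return discovery.FilteredBy(discovery.SupportsAllVerbs{
		Verbs: []string{"get", "list", "watch", "patch", "update", "delete"},
	}, all)
}
```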
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)
Use shared informers in gc controller if possible
Modify the garbage collector controller to try to use shared informers for resources, if possible, to reduce the number of unique reflectors listing and watching the same thing.
cc @kubernetes/sig-api-machinery-pr-reviews @caesarxuchao @deads2k @liggitt @sttts @smarterclayton @timothysc @soltysh @kargakis @kubernetes/rh-cluster-infra @derekwaynecarr @wojtek-t @gmarek
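A sketch of the "shared if possible" lookup (fallback to a dedicated reflector not shown):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/tools/cache"
)

// informerFor asks the shared factory for an informer for the resource; the
// caller falls back to a private list/watch only when the factory does not
// know the type, so fewer reflectors watch the same thing.
func informerFor(factory informers.SharedInformerFactory, gvr schema.GroupVersionResource) (cache.SharedIndexInformer, error) {
	shared, err := factory.ForResource(gvr)
	if err != nil {
		return nil, fmt.Errorf("no shared informer for %s: %v", gvr, err)
	}
	return shared.Informer(), nil
}
```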
Automatic merge from submit-queue (batch tested with PRs 38990, 45781, 46225, 44899, 43663)
Support parallel scaling on StatefulSets
Fixes #41255
```release-note
StatefulSets now include an alpha scaling feature accessible by setting the `spec.podManagementPolicy` field to `Parallel`. The controller will not wait for pods to be ready before adding the other pods, and will replace deleted pods as needed. Since parallel scaling creates pods out of order, you cannot depend on predictable membership changes within your set.
```
Created OWNERS_ALIASES called sig-apps-reviewers from the union of reviewers in:
pkg/controller/{cronjob,deployment,daemon,job,replicaset,statefulset}/OWNERS
except that inactive user bprashanth was removed.
Created OWNERS_ALIASES called sig-apps-api-reviewers as the intersection
of sig-apps-reviewers and the approvers from pkg/api/OWNERS.
Used those OWNERS_ALIASES as the reviewers/approvers for the disruption controller and its API.
Automatic merge from submit-queue (batch tested with PRs 46164, 45471, 46037)
NS controller: don't stop deleting GVRs on error
**What this PR does / why we need it**:
If the namespace controller encounters an error trying to delete a
single GroupVersionResource, add the error to an aggregated list of
errors and continue attempting to delete all the GroupVersionResources
instead of stopping at the first error. Return the aggregated error list
(if any) when done. This allows us to delete as much of the content in
the namespace as we can in each pass.
**Special notes for your reviewer**:
This may help with some of the namespace deletions taking too long in our e2e tests.
**Release note**:
```release-note
```
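A compact sketch of that aggregation pattern (deleteGVR is a hypothetical stand-in), using the apimachinery aggregate-error helper:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/runtime/schema"
	utilerrors "k8s.io/apimachinery/pkg/util/errors"
)

// deleteAllContent keeps deleting every GroupVersionResource even when one
// fails, then returns all failures at once.
func deleteAllContent(gvrs []schema.GroupVersionResource, deleteGVR func(schema.GroupVersionResource) error) error {
	var errs []error
	for _, gvr := range gvrs {
		if err := deleteGVR(gvr); err != nil {
			errs = append(errs, fmt.Errorf("%v: %v", gvr, err))
			continue // keep going; delete as much as we can this pass
		}
	}
	return utilerrors.NewAggregate(errs) // nil when errs is empty
}
```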
The alpha field podManagementPolicy defines how pods are created,
deleted, and replaced. The new `Parallel` policy will replace pods
as fast as possible, not waiting for the pod to be `Ready` or providing
an order. This allows for advanced clustered software to take advantage
of rapid changes in scale.
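Opting in, sketched against the apps/v1beta1 types of that release (treat names as version-specific):

```go
package main

import (
	appsv1beta1 "k8s.io/api/apps/v1beta1"
)

// parallel opts a StatefulSet into the alpha behavior described above:
// pods are created and deleted without waiting for predecessors to be
// Ready and without ordinal ordering guarantees.
func parallel(set *appsv1beta1.StatefulSet) {
	set.Spec.PodManagementPolicy = appsv1beta1.ParallelPodManagement
}
```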
Tokens controller previously needed a bit of extra help in order to be
safe for concurrent use. The new MutationCache allows it to keep a local
cache and still use a shared informer. The filtering event handler lets
it only see changes to secrets it cares about.
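A sketch of such a filtering handler with client-go's cache package (the token-secret filter is illustrative):

```go
package main

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/cache"
)

// secretsHandler wraps the real handler so the shared secrets informer,
// which delivers everything, only surfaces the service-account token
// secrets the controller cares about.
func secretsHandler(onUpdate func(*v1.Secret)) cache.ResourceEventHandler {
	return cache.FilteringResourceEventHandler{
		FilterFunc: func(obj interface{}) bool {
			secret, ok := obj.(*v1.Secret)
			return ok && secret.Type == v1.SecretTypeServiceAccountToken
		},
		Handler: cache.ResourceEventHandlerFuncs{
			UpdateFunc: func(_, newObj interface{}) { onUpdate(newObj.(*v1.Secret)) },
		},
	}
}
```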
Automatic merge from submit-queue
Don't try to attach volumes which are already attached to other nodes
This PR is a replacement for https://github.com/kubernetes/kubernetes/pull/40148. I was not able to push fixes and rebases to the original branch as I don't have access to the Github organization anymore.
CC @saad-ali You probably have to update the PR link in [Q2 2017 (v1.7)](https://docs.google.com/spreadsheets/d/1t4z5DYKjX2ZDlkTpCnp18icRAQqOE85C1T1r2gqJVck/edit#gid=14624465)
I assume the PR will need a new "ok to test"
**ORIGINAL PR DESCRIPTION**
This PR fixes an issue with the attach/detach volume controller. There are cases where the `desiredStateOfWorld` contains the same volume for multiple nodes, resulting in the attach/detach controller attaching this volume to multiple nodes. This of course fails for volumes like AWS EBS, Azure Disks, ...
I observed this situation on Azure when using Azure Disks and replication controllers that start to reschedule pods. When you delete a pod that belongs to an RC, the RC will immediately schedule a new pod on another node. This results in a short window (at most a few seconds) where two pods try to attach/mount the same volume on different nodes. As the old pod is still alive, the attach/detach controller does not try to detach the volume and starts attaching the volume to the new pod immediately.
This behavior was probably not noticed before on other clouds because the bogus attach attempt fails quickly and thus goes unnoticed. As the situation with the two pods disappears after a few seconds, a detach for the old pod is initiated and the new pod can then attach successfully.
On Azure, however, attaching and detaching take quite long, so the first bogus attach attempt already eats up significant time.
When attaching fails on Azure and reports that it is already attached somewhere else, the cloud provider immediately does a detach call for the same volume+node it tried to attach to. This is done to make sure the failed attach request is aborted immediately. You can find this here: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure_storage.go#L74
The complete attach->fail->abort flow eats up valuable time, and the attach/detach controller cannot proceed with other work while this is happening. This means that if the old pod disappears in the meantime, the controller cannot even start the detach for the volume, which delays the whole process of rescheduling and reattaching.
Also, I and other people have observed very strange behavior where disks ended up "attached" to multiple VMs at the same time, as reported by the Azure Portal. This causes the controller to fail reattaching forever. It is hard to figure out why and when this happens, and there is no known reproducer yet, but I can imagine the described behavior correlates with what I described above.
I was not sure whether there are cases where it is perfectly fine to have a volume mounted on multiple pods/nodes. At least technically, this should be possible with network-based volumes, e.g. NFS. Can someone with more knowledge about volumes help me here? I may need to add a check before skipping attaching in `reconcile`.
CC @colemickens @rootfs
```release-note
Don't try to attach volume to new node if it is already attached to another node and the volume does not support multi-attach.
```
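In sketch form, the check this PR adds to the reconciler (all names illustrative):

```go
package main

// volumeToAttach is a stand-in for the desired-state entry; only the
// fields needed for this sketch are shown.
type volumeToAttach struct {
	VolumeName    string
	NodeName      string
	MultiAttachOK bool
}

// shouldAttach returns false when the volume is already attached elsewhere
// and does not support multi-attach, so the reconciler skips the doomed
// attach call instead of issuing it.
func shouldAttach(v volumeToAttach, attachedNodes []string) bool {
	for _, node := range attachedNodes {
		if node != v.NodeName && !v.MultiAttachOK {
			return false // leave the volume alone until it is detached
		}
	}
	return true
}
```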
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)
Refactor reconciler volume log and error messages
**What this PR does / why we need it**:
Utilizes volume-specific error and log messages introduced in #44969, inside files that also log volume information.
Specifically:
- pkg/kubelet/volumemanager/reconciler/reconciler.go,
- pkg/controller/volume/attachdetach/reconciler/reconciler.go, and
- pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go
**Which issue this PR fixes** : fixes #40905
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Move all API related annotations into annotation_key_constants.go
Separate from #45869. See https://github.com/kubernetes/kubernetes/pull/45869#discussion_r116839411 for details.
This PR does nothing but move constants around :)
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45709, 41939)
delete err when return _
Signed-off-by: yupengzte <yu.peng36@zte.com.cn>
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 45247, 45810, 45034, 45898, 45899)
Apiregistration v1alpha1→v1beta1
Promoting apiregistration api from v1alpha1 to v1beta1.
API Registration is responsible for registering an API `Group`/`Version` with
another Kubernetes-like API server. The `APIService` holds information
about the other API server in the `APIServiceSpec` type, as well as general
`TypeMeta` and `ObjectMeta`. The `APIServiceSpec` type has the main
configuration needed to do the aggregation. Any request coming in for the
specified `Group`/`Version` will be directed to the service defined by
`ServiceReference` (on port 443) after validating the target using the provided
`CABundle`, or skipping validation if the development flag `InsecureSkipTLSVerify`
is set. `Priority` controls the order of this API group in the overall
discovery document.
The returned status is a set of conditions for this aggregation. Currently
there is only one condition, named "Available"; if it is true, requests for
the API will be redirected to the specified API server.
```release-note
API Registration is now in beta.
```
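An illustrative Go construction of an `APIService` following the description above (the `Priority` field matches this description; later releases split it, so treat field names as version-specific):

```go
package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	apiregistration "k8s.io/kube-aggregator/pkg/apis/apiregistration/v1beta1"
)

// exampleAPIService registers a hypothetical metrics.example.com group
// with the aggregator; all names here are illustrative.
func exampleAPIService() *apiregistration.APIService {
	return &apiregistration.APIService{
		ObjectMeta: metav1.ObjectMeta{Name: "v1beta1.metrics.example.com"},
		Spec: apiregistration.APIServiceSpec{
			Group:   "metrics.example.com",
			Version: "v1beta1",
			Service: &apiregistration.ServiceReference{ // requests go here on port 443
				Namespace: "kube-system",
				Name:      "metrics-server",
			},
			CABundle: []byte("-----BEGIN CERTIFICATE-----..."),
			Priority: 100, // field name per the description above
		},
	}
}
```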
Automatic merge from submit-queue (batch tested with PRs 45664, 45861)
Fix #45213: syncing jobs would return an error when podController hits an exception
**What this PR does / why we need it**:
Jobcontroller: syncing jobs would return an error when podController hits an exception
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes #45213
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 44337, 45775, 45832, 45574, 45758)
daemoncontroller.go: format `for` statements
**What this PR does / why we need it**:
Format `for` statements.
Delete a redundant parameter.
Make the code cleaner.
**Release note**:
```release-note
NONE
```
We have two node selection functions, includeNodeFromNodeList and
getNodeConditionPredicate, and their logic differs.
The logic should be the same, so remove includeNodeFromNodeList and just
use getNodeConditionPredicate everywhere.
Fix #45772
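For illustration, a predicate of the kind getNodeConditionPredicate applies (a sketch, not the controller's exact code):

```go
package main

import v1 "k8s.io/api/core/v1"

// nodeIsReady admits a node only if its NodeReady condition is True, which
// is the sort of condition check the shared predicate centralizes.
func nodeIsReady(node *v1.Node) bool {
	for _, cond := range node.Status.Conditions {
		if cond.Type == v1.NodeReady {
			return cond.Status == v1.ConditionTrue
		}
	}
	return false
}
```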
Automatic merge from submit-queue (batch tested with PRs 44748, 45692)
Limiting client-go package visibility, round 3
Continue the work in the merged PR https://github.com/kubernetes/kubernetes/pull/45258
These packages in client-go will be gone after #44065 is fixed:
pkg/api/helper, pkg/api/util, the internal versions of API groups, and the API install packages.
This PR removes the dependency on these packages and adds Bazel visibility rules to prevent relapse.
Automatic merge from submit-queue (batch tested with PRs 45685, 45572, 45624, 45723, 45733)
resource quota full resync was removed in error
**What this PR does / why we need it**:
the quota controller should have had a full resync interval, and it was inadvertently removed in the move to shared informers.
**Which issue this PR fixes**
This fixes quota recalculation happening at the specified interval.
**Special notes for your reviewer**:
**Release note**:
```release-note
the resource quota controller was not adding quota to be resynced at the proper interval
```
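A sketch of the restored behavior (function names hypothetical): a full-resync loop alongside the event-driven path:

```go
package main

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// startQuotaResync re-enqueues every quota on a fixed interval, in addition
// to reacting to informer events; this is the loop that was lost in the
// move to shared informers.
func startQuotaResync(resyncPeriod time.Duration, enqueueAll func(), stop <-chan struct{}) {
	go wait.Until(enqueueAll, resyncPeriod, stop)
}
```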
change import of client-go/api/helper to kubernetes/api/helper
remove unnecessary use of client-go/api.registry
change use of client-go/pkg/util to kubernetes/pkg/util
remove dependency on client-go/pkg/apis/extensions
remove unnecessary invocation of k8s.io/client-go/extension/install
change use of k8s.io/client-go/pkg/apis/authentication to v1
The new fake client properly represents the resource of `PodMetrics` as
"pods" and the resource of `NodeMetrics` as "nodes". Previously, it
used "podmetricses" and "nodemetrics", respectively.
This fixes up `horizontal_test.go` and `replica_calc_test.go` to use the
new names.
Automatic merge from submit-queue (batch tested with PRs 45304, 45006, 45527)
increase the QPS for namespace controller
The namespace controller is really chatty. Especially to discovery since that involves two requests for every API version available. This bumps the QPS and burst on the namespace controller to avoid being stuck waiting.
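A sketch of the knob being turned (the concrete numbers are illustrative, not the PR's values):

```go
package main

import "k8s.io/client-go/rest"

// bumpNamespaceClientRate gives the namespace controller's client a higher
// rate limit so discovery-heavy deletion loops aren't stuck waiting.
func bumpNamespaceClientRate(config *rest.Config) {
	config.QPS = 50    // steady-state requests per second
	config.Burst = 100 // short bursts above QPS
}
```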
Automatic merge from submit-queue (batch tested with PRs 45508, 44258, 44126, 45441, 45320)
cloud initialize node in external cloud controller
@thockin This PR adds support in the `cloud-controller-manager` to initialize nodes (instead of the kubelet, which did it previously).
This also adds support in the kubelet to skip node cloud initialization when `--cloud-provider=external`
Specifically,
Kubelet
1. The kubelet has a new flag called `--provider-id` which uniquely identifies a node in an external DB
2. The kubelet sets a node taint, "ExternalCloudProvider=true:NoSchedule", if cloudprovider == "external"
Cloud-Controller-Manager
1. The cloud-controller-manager listens for "AddNode" events and then processes nodes that carry the above taint. It performs the cloud node initialization steps that were previously done by the kubelet.
2. On addition of a node, it figures out the zone, region, and instance-type, removes the above taint, and updates the node.
3. It then periodically queries the cloudprovider for node addresses (which was previously done by the kubelet) and updates the node if there are new addresses.
```release-note
NONE
```
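For illustration, the taint from step 2 above expressed with the core API types (key/value exactly as quoted above; the upstream key may differ):

```go
package main

import v1 "k8s.io/api/core/v1"

// externalCloudTaint is the taint the kubelet applies when started with
// --cloud-provider=external; the cloud-controller-manager removes it once
// node initialization finishes.
func externalCloudTaint() v1.Taint {
	return v1.Taint{
		Key:    "ExternalCloudProvider",
		Value:  "true",
		Effect: v1.TaintEffectNoSchedule,
	}
}
```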
Automatic merge from submit-queue
fix the typos of e.g.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 43732, 45413)
Extend timeouts in timed_workers_test
Fix #45375
If that's not enough, I'll rewrite it to allow injectable timers.
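A sketch of the injectable-timer alternative mentioned above, using the apimachinery clock interface (import path varies across releases):

```go
package main

import (
	"time"

	"k8s.io/apimachinery/pkg/util/clock"
)

// worker depends on clock.Clock so tests can substitute a fake clock and
// step time forward deterministically instead of relying on long real
// timeouts.
type worker struct {
	clock clock.Clock
}

func (w *worker) waitUntil(deadline time.Time) {
	// With clock.RealClock this sleeps; with a fake clock the test controls
	// exactly when the timer fires via Step(...).
	<-w.clock.After(deadline.Sub(w.clock.Now()))
}
```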
Automatic merge from submit-queue (batch tested with PRs 43732, 45413)
Handle maxUnavailable larger than spec.replicas
**What this PR does / why we need it**:
Handle maxUnavailable larger than spec.replicas
**Which issue this PR fixes**
fixes #42479
**Special notes for your reviewer**:
None
**Release note**:
```release-note
NONE
```
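The essence of the fix, as a sketch (real code works on resolved intstr values):

```go
package main

// allowedUnavailable clamps a resolved maxUnavailable that exceeds
// spec.replicas; without the clamp the rollout math can go negative.
func allowedUnavailable(maxUnavailable, replicas int32) int32 {
	if maxUnavailable > replicas {
		return replicas
	}
	return maxUnavailable
}
```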
Automatic merge from submit-queue
stateful_pod_control.go: format the code
**What this PR does / why we need it**:
1. Improve the quality of the code.
2. Reduce redundant parameters.
3. Add one comma.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Update token controller test to test async retry
Fixes #44819
https://github.com/kubernetes/kubernetes/pull/44625 changed the token controller to queue a retry if the live service account's resourceVersion did not match our cache.
This updates the unit test that was testing that condition to test the async queue behavior (which this condition now drives).
Automatic merge from submit-queue (batch tested with PRs 42477, 44462)
Use storage.v1 instead of v1beta1
storage.v1beta1 was used to work around GKE which did not expose v1. Now that GKE is updated, we can switch everything to v1.
This is a simple sed of v1beta1 -> v1, plus enabling a new test and changing the preference of exposed interfaces in `storage/install/install.go`.
@msau42, PTAL and let me know when GKE is updated with storage v1 API and this PR can be actually merged.
@kubernetes/sig-storage-pr-reviews
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44741, 44853, 44572, 44797, 44439)
controller: fix saturation check in Deployments
Fixes https://github.com/kubernetes/kubernetes/issues/44436
@kubernetes/sig-apps-bugs
I'll cherry-pick this back to 1.6 and 1.5
Automatic merge from submit-queue (batch tested with PRs 40060, 44860, 44865, 44825, 44162)
servicecontroller: remove unused zone field
The zone field was unused, and it complicated e.g. #39996.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44862, 42241, 42101, 43181, 44147)
Feature/hpa upscale downscale delay configurable
**What this PR does / why we need it**:
Makes "upscale forbidden window" and "downscale forbidden window" duration configurable in arguments of kube-controller-manager. Those are options of horizontal pod autoscaler.
**Special notes for your reviewer**:
Please have a look @DirectXMan12 , the PR as discussed in Slack.
**Release note**:
```release-note
Make the "upscale forbidden window" and "downscale forbidden window" durations configurable via kube-controller-manager arguments. These are options of the horizontal pod autoscaler; they are currently hardcoded to 3 minutes for upscale and 5 minutes for downscale, but a cluster administrator might want to change them.
```
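For reference, a hedged example of the resulting kube-controller-manager invocation (flag names are my assumption of what this PR added; verify against the released docs):

```
kube-controller-manager \
  --horizontal-pod-autoscaler-upscale-delay=3m0s \
  --horizontal-pod-autoscaler-downscale-delay=5m0s \
  ...
```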
Automatic merge from submit-queue (batch tested with PRs 43575, 44672)
Update deployment and daemonset completeness checks
maxUnavailable being taken into account for Deployment completeness has caused a lot of confusion (https://github.com/kubernetes/kubernetes/issues/44395, https://github.com/kubernetes/kubernetes/issues/44657, https://github.com/kubernetes/kubernetes/issues/40496, and others as well, I am sure), so I am willing to stop using it and require all of the new Pods for a Deployment to be available for the Deployment to be considered complete. Hence both `rollout status` and ProgressDeadlineSeconds will not report success in cases where a 1-pod Deployment never becomes successful because its Pod never transitions to ready.
@kubernetes/sig-apps-api-reviews thoughts?
```release-note
Deployments and DaemonSets are now considered complete once all of the new pods are up and running - affects `kubectl rollout status` (and ProgressDeadlineSeconds for Deployments)
```
Fixes https://github.com/kubernetes/kubernetes/issues/44395
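In sketch form, the updated completeness check (field names simplified; the real check also consults observedGeneration):

```go
package main

// deploymentComplete shows the new criterion: every updated pod must be
// available; maxUnavailable no longer buys any slack.
func deploymentComplete(specReplicas, updatedReplicas, availableReplicas int32) bool {
	return updatedReplicas == specReplicas && availableReplicas == updatedReplicas
}
```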
Automatic merge from submit-queue
Exclude master from LoadBalancer / NodePort
The servicecontroller documents that the master is excluded from the
LoadBalancer / NodePort, but this is broken for clusters where we are
using taints for the master (as introduced in 1.6), instead of marking
the master as unschedulable.
This restores the desired documented behaviour, by excluding nodes that
are labeled as masters with the new 1.6 labels, even if they use the new
1.6 taints.
Fix #33884
```release-note
Exclude nodes labeled as master from LoadBalancer / NodePort; restores documented behaviour
```
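A sketch of the exclusion predicate (using the 1.6 master role label; the exact label key used by the controller is an assumption here):

```go
package main

import v1 "k8s.io/api/core/v1"

// isMasterNode treats a node as a master if it carries the 1.6 master role
// label, regardless of whether it is marked unschedulable or merely tainted.
func isMasterNode(node *v1.Node) bool {
	_, ok := node.Labels["node-role.kubernetes.io/master"]
	return ok
}
```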
Automatic merge from submit-queue
Improve Service controller's code coverage a little bit
**What this PR does / why we need it**:
Improves the code coverage for Service Controller
Before
```
go test --cover ./pkg/controller/service
ok k8s.io/kubernetes/pkg/controller/service 0.101s coverage: 23.4% of statements
```
After
```
go test --cover ./pkg/controller/service/
ok k8s.io/kubernetes/pkg/controller/service 0.094s coverage: 62.0% of statements
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
More unit testing
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44625, 43594, 44756, 44730)
Check for terminating Pod prior to launching successor in StatefulSet
Modifies the sync loop of the StatefulSet controller to check whether a Pod is terminating before launching its successor. Fixes #44229. Should be cherry-picked into the 1.6 branch.
**Which issue this PR fixes**
fixes #44229
```release-note
NONE
```
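The guard itself is small; a sketch:

```go
package main

import v1 "k8s.io/api/core/v1"

// isTerminating reports whether a pod is still shutting down; the sync loop
// must not launch the pod's successor until this returns false.
func isTerminating(pod *v1.Pod) bool {
	return pod.DeletionTimestamp != nil
}
```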
Automatic merge from submit-queue (batch tested with PRs 44625, 43594, 44756, 44730)
Retry secret reference addition on conflict
* Tolerates leading or trailing etcd reads when fetching liveServiceAccount - fixes #25416
* Tolerates conflicts when updating the service account with the secret reference (does RetryOnConflict before deleting the token and completely restarting the flow) - fixes #44054
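A sketch of the second bullet using client-go's retry helper (the update closure is a hypothetical stand-in):

```go
package main

import (
	"k8s.io/client-go/util/retry"
)

// addSecretReference retries the ServiceAccount update on conflict instead
// of deleting the token and restarting the whole flow.
func addSecretReference(update func() error) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		// Re-fetch the live ServiceAccount and attempt the update inside
		// the closure so each retry operates on fresh data.
		return update()
	})
}
```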
Automatic merge from submit-queue
More RC/RS controller logging updates
We were comparing the address of the old and new RC.spec.replicas and we
have to compare the values. This only affects logging.
Update RS controller to match RC controller to log when spec.replicas
changes, not status.replicas.
@kargakis @janetkuo @sttts @liggitt
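For clarity, the bug in sketch form:

```go
package main

import "fmt"

// logReplicasChange dereferences before comparing: comparing *int32
// pointers compares addresses, which are virtually always different across
// objects, so the old code logged a change on every sync.
func logReplicasChange(oldReplicas, newReplicas *int32) {
	if oldReplicas != nil && newReplicas != nil && *oldReplicas != *newReplicas {
		fmt.Printf("spec.replicas changed: %d -> %d\n", *oldReplicas, *newReplicas)
	}
}
```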
Automatic merge from submit-queue (batch tested with PRs 41498, 44487)
Use len of pods in stateful set error
**What this PR does / why we need it**:
Syncing a stateful set reports the wrong error; we need to fix it.
**Release note**:
```release-note
NONE
```