The patch removes ExternalID usage from node_controller
and node_lifecycle_oontroller. The code instead uses InstanceID
which returns the cloud provider ID as well.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies
**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm separating this effort into two separate PRs, since updating rules_go requires a large cleanup, removing an attribute from most build rules.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59939, 59830). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Avoid call to get cloud instances
**What this PR does / why we need it**:
if a node does not have the taint, we really don't need to make calls
to get the list of instances from the cloud provider
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
Found when reviewing code for #59887
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Process existing cloud nodes in CCM
**What this PR does / why we need it**:
This is a timing issue. If kubelet(s) get started before the CCM is
started, the shared informer event handler does not process them at
all. So we should loop through these before. We run this in a
go wait.Until loop to tolerate errors when listing the nodes and
giving an opportunity for any scripts that may need to setup RBAC
roles etc.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58613
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Existing nodes are sent via update and not via the add function,
so let's add an UpdateCloudNode and just forward it to the
AddCloudNode. This works fine as all we do is look for the cloud
taint and bail out if it is not present.
shutdowned -> stopped
use shutdown everywhere
use patch in taints api call
use notimplemented in clouds use AddOrUpdateTaintOnNode
correct log text
add fake cloud
try to fix bazel
add shutdown tests
add context
This adds context to all the relevant cloud provider interface signatures.
Callers of those APIs are currently satisfied using context.TODO().
There will be follow on PRs to push the context through the stack.
For an idea of the full scope of this change please look at PR #58532.
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.
Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
Automatic merge from submit-queue (batch tested with PRs 49856, 56257, 57027, 57695, 57432). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Change to pkg/util/node.UpdateNodeStatus
**What this PR does / why we need it**:
> // TODO: Change to pkg/util/node.UpdateNodeStatus.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
/cc @brendandburns @dchen1107 @lavalamp
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Node which is not present not need update NodeAddress
**What this PR does / why we need it**:
when the node is not exist according to cloud provider. there is no need to update node address better.
finally the node will be delete in https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/cloud/node_controller.go#L240
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53946, 53993, 54315, 54143, 54532). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update many misspelled word initializer
**What this PR does / why we need it**:
fix misspelled word initalizer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
``` NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Log when node is initialized in cloud controller manager
**What this PR does / why we need it**:
Always logs when a node is successfully initialized and raises log level for adding node labels to new nodes. This is useful since the only way to know if CCM is working properly is to check for the taint `node.cloudprovider.kubernetes.io/uninitialized`.
**Release note**:
```release-note
Log when node is successfully initialized by Cloud Controller Manager
```
cc @luxas @wlan0 @jhorwit2
Using assertions for unit tests:
1. cmd/kube-controller-manager/app/controller_manager_test.go
2. pkg/controller/bootstrap/jws_test.go
3. pkg/controller/cloud/node_controller_test.go
4. pkg/controller/controller_utils_test.go
Automatic merge from submit-queue (batch tested with PRs 52176, 43152). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Eliminate hangs/throttling of node heartbeat
Fixes https://github.com/kubernetes/kubernetes/issues/48638Fixes#50304
Stops kubelet from wedging when updating node status if unable to establish tcp connection.
Notes that this only affects the node status loop. The pod sync loop would still hang until the dead TCP connections timed out, so more work is needed to keep the sync loop responsive in the face of network issues, but this change lets existing pods coast without the node controller trying to evict them
```release-note
kubelet to master communication when doing node status updates now has a timeout to prevent indefinite hangs
```
We should be able to build a cloud-controller-manager without having to
pull in code specific to GCE and AWS clouds. Note that this is a tactical
fix for now, we should have allow PVLabeler to be passed into the
PersistentVolumeController, maybe come up with better interfaces etc. Since
it is too late to do all that for 1.8, we just move cloud specific code
to where they belong and we check for PVLabeler method and use it where
needed.
Fixes#51629
Automatic merge from submit-queue (batch tested with PRs 51574, 51534, 49257, 44680, 48836)
Add a persistent volume label controller to the cloud-controller-manager
Part of https://github.com/kubernetes/features/issues/88
Outstanding concerns needing input:
- [x] Why 5 threads for controller processing?
- [x] Remove direct linkage to aws/gce cloud providers [#51629]
- [x] Modify shared informers to allow added event handlers ability to include uninitialized objects/using unshared informer #48893
- [x] Use cache.MetaNamespaceKeyFunc in event handler?
I'm willing to work on addressing the removal of the direct linkage to aws/gce after this PR gets in.
Automatic merge from submit-queue (batch tested with PRs 51174, 51363, 51087, 51382, 51388)
Add InstanceExistsByProviderID to cloud provider interface for CCM
**What this PR does / why we need it**:
Currently, [`MonitorNode()`](02b520f0a4/pkg/controller/cloud/nodecontroller.go (L240)) in the node controller checks with the CCM if a node still exists by calling `ExternalID(nodeName)`. `ExternalID` is supposed to return the provider id of a node which is not supported on every cloud. This means that any clouds who cannot infer the provider id by the node name from a remote location will never remove nodes that no longer exist.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50985
**Special notes for your reviewer**:
We'll want to create a subsequent issue to track the implementation of these two new methods in the cloud providers.
**Release note**:
```release-note
Adds `InstanceExists` and `InstanceExistsByProviderID` to cloud provider interface for the cloud controller manager
```
/cc @wlan0 @thockin @andrewsykim @luxas @jhorwit2
/area cloudprovider
/sig cluster-lifecycle
Automatic merge from submit-queue (batch tested with PRs 49850, 47782, 50595, 50730, 51341)
Cloud Controller Manager now sets Node.Spec.ProviderID
**What this PR does / why we need it**:
Cloud Controller Manager now sets `Node.Spec.ProviderID`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/49836.
**Special notes for your reviewer**:
* As part of an effort to move cloud controller manager into beta https://github.com/kubernetes/kubernetes/issues/48690.