The change assumes the compute Alpha object is the superset of the v1
object. By storing the Alpha objects internally in the fake, we can
convert them to Beta and v1 to test different functions.
- Adds AddAliasToInstance() to the GCE cloud provider.
- Adds field "secondary-range-name" to the gce.conf configuration file.
```release-note
NONE
```
Automatic merge from submit-queue
cloudprovider/openstack bug fix: don't try to append pool id if pool doesn't exist
**What this PR does / why we need it**:
This fixes a bug in the OpenStack cloud provider that could cause a panic.
Consider what will happen in the current `LbaasV2.EnsureLoadBalancerDeleted` code if `nil, ErrNotFound` is returned by `getPoolByListenerID`.
Automatic merge from submit-queue (batch tested with PRs 38947, 50239, 51115, 51094, 51116)
Mark the volumes as detached when node does not exist
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and do not return err.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#50200
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)
teach gce cloud to handle alpha/beta operations v2
Alternative to #50704
This one feels cleaner. BUT, type assertion problems cannot be exposed at compile time.
Please let me know what you think. This will set the precedence for consuming GCE alpha/beta API.
cc: @thockin @yujuhong @saad-ali @MrHohn
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50893, 50913, 50963, 50629, 50640)
gce external LB: add a function to verify the requested IP address
Factor out the logic for verifying the user-requested IP for better
readability and testing. Also rename a few variables for clarity.
If user specify floating-network-id by annotation rather than cloud
provider file, openstack cloud provider don't delete floatingip when
deleting LoadBalancer service.
Automatic merge from submit-queue (batch tested with PRs 50255, 50885)
AWS: Arbitrarily choose first (lexicographically) subnet in AZ
When there is more than one subnet for an AZ on AWS choose arbitrarily
chose the first one lexicographically for consistency.
**What this PR does / why we need it**:
If two subnets were to be used appear in the same aws az which one is chosen is currently not consistent. This could lead to difficulty in diagnosing issues.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#45983
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 50303, 50856)
Make route-controller list only relevant routes instead of all of them
Ref https://github.com/kubernetes/kubernetes/issues/50854 (somewhat related issue)
IIUC from the code, route-controller memory is mainly being used in storing routes and nodes (also CIDRs, but that's not much).
This should help reduce that memory usage (particularly when running in a project with large no. of routes), by moving filtering to server-side.
For e.g in kubernetes-scale project we have ~5000 routes (each about 600B) => 3 MB of routes
This doesn't help with reducing time to list the routes as filtering is also linear.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Mark volume as detached when node does not exist for vsphere
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and
return false without error.
Fix#50266
**Special notes for your reviewer**:
/assign @jingxu97
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46317, 48922, 50651, 50230, 47599)
Log name if Azure file share cannot be created
**What this PR does / why we need it**: If the Azure storage provider fails to create a file share, it logs and error message 'failed to create share in account _foo_: _error-msg_'. A user on the Slack azure-sig channel reported an error of "The specified resource name length is not within the permissible limits". This PR adds logging of the name so that this error can be diagnosed in future.
**Which issue this PR fixes**: This was raised on Slack and has not been created as a GitHub issue.
**Special notes for your reviewer**: None
**Release note**:
```release-note
Changed the error log format when creating an Azure file share to include the name of the share.
```
Automatic merge from submit-queue (batch tested with PRs 50061, 48580, 50779, 50722)
Fix for Policy based volume provisioning failure due to long VM Name in vSphere cloud provider
Dummy VM is used for SPBM policy based provisioning feature of vSphere cloud provider.
Dummy VM name is generated based on kubernetes cluster name and pv name. It can easily go beyond
vSphere's limitation of 80 characters for vmName.
To solve the long VM name failure hash is used instead of vSphere-k8s-clusterName-PvName
**Which issue this PR fixes**
https://github.com/vmware/kubernetes/issues/176
**Release note:**
```release-note
None
```
@BaluDontu @divyenpatel @luomiao @tusharnt
Currently if user doesn't specify subnet-id or specify a unsafe
subnet-id, openstack cloud provider can't create a correct LoadBalancer
service.
Actually we can get it automatically. This patch do a improvement.
This is a part of #50726
vSphere has limitation of 80 characters for vmName.
with vsphere-k8s prefix and "vmdisk.volumeOptions.Name" vmName can become easily bigger than 80 chars.
Used hash funciton just of the "vmdisk.volumeOptions.Name" part as cleanup dummyVm logic depends on prefix "vsphere-k8s"
If node doesn't exist, OpenStack Nova will assume the volumes
are not attached to it. So mark the volumes as detached and
return false without error.
Fix: #50200
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and
return false without error.
Fix#50266
Automatic merge from submit-queue (batch tested with PRs 47724, 49984, 49785, 49803, 49618)
Fix conflict about getPortByIp
**What this PR does / why we need it**:
Currently getPortByIp() get port of instance only based on IP.
If there are two instances in diffent network and the CIDR of
their subnet are same, getPortByIp() will be conflict.
My PR gets port based on IP and Name of instance.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fix#43909
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Added jdumars to OWNERS file for Azure cloud provider
**What this PR does / why we need it**:
This PR adds GitHub user jdumars as an approver to pkg/cloudprovider/providers/azure
Jaice Singer DuMars (me) is the program manager at Microsoft tasked with shepherding all upstream contributions from Microsoft into Kubernetes. With the volume of work, and the impending breakout of cloud provider code, this helps distribute the review and approval load more evenly.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
N/A
**Special notes for your reviewer**:
This was discussed with Brendan Burns prior to submitting the pre-approval.
**Release note**:
none
Automatic merge from submit-queue
GCE: Fix lowercase value and alpha-missing annotation for ILB
**What this PR does / why we need it**:
Fixes#50426
Also explicitly sets an annotation as 'alpha'.
/assign @freehan @bowei
**Release note**:
```release-note
NONE
```
Currently getPortByIp() get port of instance only based on IP.
If there are two instances in diffent network and the CIDR of
their subnet are same, getPortByIp() will be conflict.
My PR gets port based on IP and Name of instance.
Automatic merge from submit-queue
Ignore the available volume when calling DetachDisk
Fix#50207
If user detachs the volume by nova in openstack env, volume becomes
available. If nova instance is been deleted, nova will detach it
automatically and become available. So the "available" is fine since that means the
volume is detached from instance already.
**Release note**:
```release-note
NONE
```
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and
return false without error.
Fix#50266
Automatic merge from submit-queue (batch tested with PRs 49524, 46760, 50206, 50166, 49603)
[OpenStack] Add more detail error message
I get same simple error messages "Unable to initialize cinder client
for region: RegionOne" from controller-manager, but I can not find the
reason. We should add more detail message "err" into glog.Errorf.
Currently NewBlockStorageV2() return err when failed to get cinder endpoint, but there is no code to output the message of err.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Azure: Allow VNet to be in a separate Resource Group
**What this PR does / why we need it**:
This PR allows Kubernetes in an Azure context to use a VNet which is not in the same Resource Group as Kubernetes.
We need this because currently Azure Cloud Provider driver assumes that it should have a VNet for himself but if there is one thing that should be shared amongst Azure resources it's a VNet cause, well, things might want to talk to each other in a private network, don't you think ?
I guess this should we backported down to 1.6 branch.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes#49577
**Release note**:
```release-note
NONE
```
@kubernetes/sig-azure
@kubernetes/sig-azure-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50418, 49830, 49206, 49061, 49912)
add LocalZone into gce.conf and refactor gce cloud provider configura…
The main goal of this PR is to make gce cloud provider able to run locally.
1. added a LocalZone parameter into gce.conf.
2. refactor `newGCECloud` to avoid contacting metadata server if configuration is already available.
```release-note
None
```
Automatic merge from submit-queue
VSphere cloud provider code refactoring
The current PR tracks the vSphere Cloud Provider code refactoring which includes the following changes.
- VCLib Package - A framework used by vSphere cloud provider for managing the vSphere entities. VCLib package mainly does the following:
- Volume management on datastore (Create/Delete)
- Volume management on Virtual Machines (Attach/Detach)
- Storage Policy Management
- vSphere Cloud Provider changes to implement the cloud provider interfaces by calling into VCLib package.
- Modifications to e2e tests to accomodate the latest design changes.
@divyenpatel @rohitjogvmw @luomiao
```release-note
vSphere cloud provider: vSphere cloud provider code refactoring
```
Automatic merge from submit-queue (batch tested with PRs 50087, 39587, 50042, 50241, 49914)
AttachDisk should not call detach inside of Cinder volume provider
If use detachs the volume by nova in openstack env, volume becomes
available. If nova instance is been deleted, nova will detach it
automatically. So the "available" is fine since that means the
volume is detached from instance already.
I get same simple error messages "Unable to initialize cinder client
for region: RegionOne" from controller-manager, but I can not find the
reason. We should add more detail message "err" into glog.Errorf.
Automatic merge from submit-queue (batch tested with PRs 49805, 50052)
We never want to modify the globally defined SG for ELBs
**What this PR does / why we need it**:
Fixes a bug where creating or updating an ELB will modify a globally defined security group
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50105
**Special notes for your reviewer**:
**Release note**:
```release-note
fixes a bug around using the Global config ElbSecurityGroup where Kuberentes would modify the passed in Security Group.
```
Automatic merge from submit-queue (batch tested with PRs 47416, 47408, 49697, 49860, 50162)
add possibility to use multiple floatingip pools in openstack loadbalancer
**What this PR does / why we need it**: Currently only one floating pool is supported in kubernetes openstack cloud provider. It is quite big issue for us, because we want run only single kubernetes cluster, but we want that external and internal services can be used. It means that we need possibility to create services with internal and external pools.
**Which issue this PR fixes**: fixes#49147
**Special notes for your reviewer**: service labels is not maybe correct place to define this floatingpool id. However, I did not find any better place easily. I do not want start modifying service api structure.
**Release note**:
```release-note
Add possibility to use multiple floatingip pools in openstack loadbalancer
```
Example how it works:
```
cat /etc/kubernetes/cloud-config
[Global]
auth-url=https://xxxx
username=xxxx
password=xxxx
region=yyy
tenant-id=b23efb65b1d44b5abd561511f40c565d
domain-name=foobar
[LoadBalancer]
lb-version=v2
subnet-id=aed26269-cd01-4d4e-b0d8-9ec726c4c2ba
lb-method=ROUND_ROBIN
floating-network-id=56e523e7-76cb-477f-80e4-2dc8cf32e3b4
create-monitor=yes
monitor-delay=10s
monitor-timeout=2000s
monitor-max-retries=3
```
```
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 1
template:
metadata:
labels:
run: web
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
labels:
run: web-ext
name: web-ext
namespace: default
spec:
selector:
run: web
ports:
- port: 80
name: https
protocol: TCP
targetPort: 80
type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
labels:
run: web-int
floatingPool: a2a84887-4915-42bf-aaff-2b76688a4ec7
name: web-int
namespace: default
spec:
selector:
run: web
ports:
- port: 80
name: https
protocol: TCP
targetPort: 80
type: LoadBalancer
```
```
% kubectl create -f example.yaml
deployment "nginx-deployment" created
service "web-ext" created
service "web-int" created
% kubectl get svc -o wide
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kubernetes 10.254.0.1 <none> 443/TCP 2m <none>
web-ext 10.254.23.153 192.168.1.57,193.xx.xxx.xxx 80:30151/TCP 52s run=web
web-int 10.254.128.141 192.168.1.58,10.222.130.80 80:32431/TCP 52s run=web
```
cc @anguslees @k8s-sig-openstack-feature-requests @dims
Automatic merge from submit-queue (batch tested with PRs 49916, 50050)
GCE: Fix bug by correctly cast port to string
Code is incorrectly casting a port to a string, causing the diff-expression to always return true.
**What this PR does / why we need it**:
Fixes#50049
**Special notes for your reviewer**:
/assign @MrHohn
**Release note**:
```release-note
NONE
```
if not needed here
load network ids from gophercloud api
fix to getnetworkbyname
update godeps, add networks library
fix gofmt and boilerplate
gofmt
use annotations
fix
remove enableflag
add comment to annotationvalue
Automatic merge from submit-queue (batch tested with PRs 49870, 49416, 49872, 49892, 49908)
fix alpha/beta endpoint when api endpoint is specified
fix a bug in alpha/beta compute API endpoint bootstraping when api-endpiont is specified.
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
GCE: Adding unit test for ensureStaticIP
**What this PR does / why we need it**:
Entry into unit testing GCE loadbalancer code by testing `ensureStaticIP` which had a bug in 1.7.0.
@bowei @freehan @MrHohn @dnardo @thockin, any thoughts and comments on how we could unit test LB code moving forward? I think there are many areas we can split functions into smaller ones for easier testing - firewallNeedsUpdate being an example of that. However, it seems to me that we still need to mock our GCP calls for some functions that heavily revolve around API calls. A dream goal would be to have a unit test that can call EnsureLoadBalancer. Now that we have shared resources between different services and ingresses (firewalls, instance groups, [future features]), being able to setup different scenarios without depending on E2E tests would be awesome. However, I'm not sure how reachable that goal would be.
Most importantly, let's not make things worse. If you have advice on anti-patterns to avoid, please speak up.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
GCE: Update vendor of gcfg and filter config parsing errors
**What this PR does / why we need it**:
To utilize new function `FatalOnly` which filters "programmer errors"
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#49660
**Special notes for your reviewer**:
/assign @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49081, 49318, 49219, 48989, 48486)
GCE: Remove resource Get function calls from Create functions
**What this PR does / why we need it**:
Consistency. This PR removes the GetXXX from the CreateXXX functions of the GCE cloudprovider. Consumers (specifically the ingress controller) will need to call the Get resource funcs separately when updating their vendored versions.
**Release note**:
```release-note
NONE
```
/assign @bowei
Automatic merge from submit-queue (batch tested with PRs 49081, 49318, 49219, 48989, 48486)
Better message if we dont find appropriate BlockStorage API
**What this PR does / why we need it**:
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Define a new config VnetResourceGroup in order to be able to use a VNet
which is not in the same resource group as kubernetes.
Signed-off-by: Sylvain Rabot <s.rabot@lectra.com>
Automatic merge from submit-queue
Warn if aws has no cluster id provided
**What this PR does / why we need it**:
we info log a message when no cluster id is provided that should be a warning given its impact.
fixes https://github.com/kubernetes/kubernetes/issues/49568
**Release note**:
```release-note
NONE
```
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
Automatic merge from submit-queue (batch tested with PRs 49420, 49296, 49299, 49371, 46514)
Avoid looking up instance id until we need it
**What this PR does / why we need it**:
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
So let's try to find the instance-id only when we need it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Bump up gce minNodesHealthCheckVersion due to known issues
**What this PR does / why we need it**: There are some known issues in previous 1.7 versions causing kube-proxy not correctly responding healthz traffic.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: From #49263.
**Special notes for your reviewer**:
/assign @nicksardo @freehan
cc @bowei @thockin
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service will have health checks for nodes by default if all nodes have version >= v1.7.2.
```
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
Added logging to AWS api calls. #46969
Additionally logging of when AWS API calls start and end to help diagnose problems with kubelet on cloud provider nodes not reporting node status periodically. There's some inconsistency in logging around this PR we should discuss.
IMO, the API logging should be at a higher level than most other types of logging as you would probably only want it in limited instances. For most cases that is easy enough to do, but there are some calls which have some logging around them already, namely in the instance groups. My preference would be to keep the existing logging as it and just add the new API logs around the API call.
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
Automatic merge from submit-queue (batch tested with PRs 49276, 49235)
Don't fail fast if LoadBalancer section is missing
**What this PR does / why we need it**:
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
Automatic merge from submit-queue
Tolerate Flavor information for computing instance type
**What this PR does / why we need it**:
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api
So let's just inspect what is present and use that to figure out
the instance type.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix on-premises term in error string and comments for aws provider
**What this PR does / why we need it**: fix for correct terminology of "on-premises" over "on-premise"
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a
**Special notes for your reviewer**: Updated error string while doing a scrub for the incorrect term in the docs (kubernetes/kubernetes.github.io#4413).
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49083, 45540, 46862)
Add extra logging to azure API get calls
**What this PR does / why we need it**:
This PR adds extra logging for external calls to the Azure API, specifically get calls.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
This will help troubleshoot problems arising from the usage of this cloudprovider. For example, it looks like #43516 is caused by a call to the cloudprovider taking too much time.
Automatic merge from submit-queue (batch tested with PRs 49218, 48253, 48967, 48460, 49230)
additional backoff in azure cloudprovider
Fixes#48971
**What this PR does / why we need it**:
We want to be able to opt in to backoff retry logic for kubelet-originating request behavior: node IP address resolution and node load balancer pool membership enforcement.
**Special notes for your reviewer**:
The use-case for this is azure cloudprovider clusters with large node counts, especially during cluster installation, or other scenarios when lots of nodes come online at once and attempt to register all resources with the backend API. To allow clusters at scale more control over the API request rate in-cluster, backoff config has the ability to meaningful slow down this rate, when appropriate.
**Release note**:
```additional backoff in azure cloudprovider
```
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api
So let's just inspect what is present and use that to figure out
the instance type.
Automatic merge from submit-queue (batch tested with PRs 48702, 48965, 48740, 48974, 48232)
Rackspace for cloud-controller-manager
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID
methods used by the cloud-controller-manager to the RackSpace provider.
The instance type returned is the flavor name, for consistency
InstanceType has been implemented too returning the same value.
This is part of #47257 cc @wlan0
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48890, 46893, 48872, 48896)
Fix the order of deletion
1. EnsureLoadBalancer can't delete pool without deleting members,
just let EnsureLoadBalancerDeleted do it.
2. Add some friendly error message
**Release note**:
```release-note
NONE
```
EnsureHostInPool() submits a GET to azure API for VM info. We’re seeing this on agent node kubelets and would like to enable configurable backoff engagement for 4xx responses to be able to slow down the rate of reconciliation, when appropriate.
Automatic merge from submit-queue (batch tested with PRs 47066, 48892, 48933, 48854, 48894)
azure: msi: add managed identity field, logic
**What this PR does / why we need it**: Enables managed service identity support for the Azure cloudprovider. "Managed Service Identity" allows us to ask the Azure Compute infra to provision an identity for the VM. Users can then retrieve the identity and assign it RBAC permissions to talk to Azure ARM APIs for the purpose of the cloudprovider needs.
Per the commit text:
```
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a
**Special notes for your reviewer**: none
**Release note**:
```release-note
azure: support retrieving access tokens via managed identity extension
```
cc: @brendandburns @jdumars @anhowe
Automatic merge from submit-queue
Azure PD (Managed/Blob)
This is exactly the same code as this [PR](https://github.com/kubernetes/kubernetes/pull/41950). It has a clean set of generated items. We created a separate PR to accelerate the accept/merge the PR
CC @colemickens
CC @brendandburns
**What this PR does / why we need it**:
1. Adds K8S support for Azure Managed Disks.
2. Adds support for dedicated blob disks (1:1 to storage account) in addition to shared blob disks (n:1 to storage account).
3. Automatically manages the underlying storage accounts. New storage accounts are created at 50% utilization. Max is 100 disks, 60 disks per storage account.
2. Addresses the current issues with Blob Disks:
..* Significantly faster attach process. Disks are now usually available for pods on nodes under 30 sec if formatted, under a min if not formatted.
..* Adds support to move disks between nodes.
..* Adds consistent attach/detach behavior, checks if the disk is leased/attached on a different node before attempting to attach to target nodes.
..* Fixes a random hang behavior on Azure VMs during mount/format (for both blob + managed disks).
..* Fixes a potential conflict by avoiding the use of disk names for mount paths. The new plugin uses hashed disk uri for mount path.
The existing AzureDisk is used as is. Additional "kind" property was added allowing the user to decide if the pd will be shared, dedicated or managed (Azure Managed Disks are used).
Due to the change in mounting paths, existing PDs need to be recreated as PV or PVCs on the new plugin.
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
Automatic merge from submit-queue (batch tested with PRs 48555, 48849)
GCE: Fix panic when service loadbalancer has static IP address
Fixes#48848
```release-note
Fix service controller crash loop when Service with GCP LoadBalancer uses static IP (#48848, @nicksardo)
```
Automatic merge from submit-queue (batch tested with PRs 48781, 48817, 48830, 48829, 48053)
vSphere for cloud-controller-manager
**What this PR does / why we need it**:
This is to implement the `NodeAddressesByProviderID` and `InstanceTypeByProviderID` methods for cloud-controller-manager for vSphere cloud provider.
Currently vSphere cloud provider only supports VMs in the same folder.
Thus `NodeAddressesByProviderID` is similar to `NodeAddresses` with a simple ProviderID to NodeName translation.
`InstanceTypeByProviderID` returns nil as same as `InstanceType`.
**Which issue this PR fixes**
Part of Issue https://github.com/kubernetes/kubernetes/issues/47257
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Add initial support for the Azure instance metadata service.
Part of fixing #46632
@colemickens @rootfs @jdumars @kris-nova
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Fix panic of DeleteRoute()
Fix#48800
It should be 'addr_pairs', not 'routes'.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47948, 48631, 48693, 48549, 47593)
OpenStack for cloud-controller-manager
**What this PR does / why we need it**:
This implements the `NodeAddressesByProviderID` and `InstanceTypeByProviderID` methods used by the cloud-controller-manager to the OpenStack provider. The instance type returned is the flavor name, for consistency `InstanceType` has been implemented too returning the same value.
```release-note
NONE
```
This is part of #47257 cc @wlan0
Automatic merge from submit-queue
Removed mesos as cloud provider from Kubernetes.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47205
**Special notes for your reviewer**:
**Release note**:
```release-note
Move Mesos Cloud Provider out of Kubernetes Repo
```
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID
methods used by the cloud-controller-manager to the RackSpace provider.
The instance type returned is the flavor name, for consistency
InstanceType has been implemented too returning the same value.
This is part of #47257 cc @wlan0
Automatic merge from submit-queue (batch tested with PRs 48497, 48604, 48599, 48560, 48546)
GCE: Use network project id for firewall/route mgmt and zone listing
- Introduces a new environment variable for plumbing the network project id which will be used for firewall and route management. fixes#48515
- onXPN is determined by metadata if config is not specified
- Split `if` conditions: fixes#48521
- Remove `getNetworkNameViaAPICall` which was used as a last resort for the `networkURL` (if empty) which was previously filled with the metadata network project & name.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix deleting empty monitors
Fix#48094
When create-monitor of cloud-config is false, pool has not monitor
and can not delete empty monitor.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47234, 48410, 48514, 48529, 48348)
Check opts of cloud config file
Fix#48347
Check opts when register OpenStack CloudProvider rather than
returning error when use opts to create/use cloud resource.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48292, 48121)
Add Google cloudkms dependency, add cloudkms service to GCE cloud provider
Required to introduce a Google KMS based envelope encryption, which shall allow encrypting secrets at rest using KEK-DEK scheme.
The above requires KMS API to create/delete KeyRings and CryptoKeys, and Encrypt/Decrypt data.
Should target release 1.8
@jcbsmpsn
Update: It appears that Godep only allows dependencies which are in use. We may have to modify this PR to include some Google KMS code.
Progresses #48522
Automatic merge from submit-queue
Use %q formatter for error messages from the AWS SDK. #47789
Error messages from the AWS SDK can have return keys in them, so use %q formatter for those messages.
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
Add ApiEndpoint support to GCE config.
**What this PR does / why we need it**:
Add the ability to change ApiEndpoint for GCE.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 47286, 47729)
Add client certificate authentication to Azure cloud provider
This adds support for client cert authentication in Azure cloud provider. The certificate can be provided in PKCS #12 format with password protection. Not that this authentication will be active only when no client secret is configured.
cc @brendandburns @colemickens
Automatic merge from submit-queue (batch tested with PRs 48139, 48042, 47645, 48054, 48003)
Pipe clusterID into gce_loadbalancer_external.go
**What this PR does / why we need it**: Small cleanup for GCE ELB codes.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48002
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47776, 46220, 46878, 47942, 47947)
fix comment mistake
fix comment mistake
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 47776, 46220, 46878, 47942, 47947)
update openstack metadata-service url
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Ignore ErrNotFound when delete LB resources
IsNotFound error is fine since that means the object is
deleted already, so let's check it before return error.
Automatic merge from submit-queue (batch tested with PRs 47915, 47856, 44086, 47575, 47475)
AWS: Fix suspicious loop comparing permissions
Because we only ever call it with a single UserId/GroupId, this would
not have been a problem in practice, but this fixes the code.
Fix#36902
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47922, 47195, 47241, 47095, 47401)
AWS: Set CredentialsChainVerboseErrors
This avoids a rather confusing error message.
Fix#39374
```release-note
NONE
```
Automatic merge from submit-queue
Add E2E tests for Azure internal loadbalancer support, fix an issue for public IP resource deletion.
**What this PR does / why we need it**:
- Add E2E tests for Azure internal loadbalancer support: https://github.com/kubernetes/kubernetes/pull/43510
- Fix an issue that public IP resource not get deleted when switching from external loadbalancer to internal static loadbalancer.
**Special notes for your reviewer**:
1. Add new Azure resource tag to Public IP resources to indicate kubernetes managed resources.
Currently we determine whether the public IP resource should be deleted by looking at LoadBalancerIp property on spec. In the scenario 'Switching from external loadbalancer to internal loadbalancer with static IP', that value might have been updated for internal loadbalancer. So here we're to add an explicit tag for kubernetes managed resources.
2. Merge cleanupPublicIP logic into cleanupLoadBalancer
**Release note**:
NONE
CC @brendandburns @colemickens
Automatic merge from submit-queue
New annotation to add existing Security Groups to ELBs created by AWS cloudprovider
**What this PR does / why we need it**:
When K8S cluster is deployed in existing VPC there might be a need to attach extra SecurityGroups to ELB created by AWS cloudprovider. Example of it can be cases, where such Security Groups are maintained by another team.
**Special notes for your reviewer**:
For tests to pass depends on https://github.com/kubernetes/kubernetes/pull/45168 and therefore includes it
**Release note**:
```release-note
New 'service.beta.kubernetes.io/aws-load-balancer-extra-security-groups' Service annotation to specify extra Security Groups to be added to ELB created by AWS cloudprovider
```
Automatic merge from submit-queue
AWS: Remove blackhole routes in our managed range
Blackhole routes otherwise acccumulate unboundedly. We also are careful
to ensure that we do so only within the managed range, which requires
enlisting the help of the routecontroller.
Fix#47524
```release-note
AWS: clean up blackhole routes when using kubenet
```
We maintain a cache of all instances, and we invalidate the cache
whenever we see a new instance. For ELBs that should be sufficient,
because our usage is limited to instance ids and security groups, which
should not change.
Fix#45050
Automatic merge from submit-queue (batch tested with PRs 47510, 47516, 47482, 47521, 47537)
Batch AWS getInstancesByNodeNames calls with FilterNodeLimit
We are going to limit the getInstancesByNodeNames call with a batch
size of 150.
Fixes - #47271
```release-note
AWS: Batch DescribeInstance calls with nodeNames to 150 limit, to stay within AWS filter limits.
```
Blackhole routes otherwise acccumulate unboundedly. We also are careful
to ensure that we do so only within the managed range, which requires
enlisting the help of the routecontroller.
Fix#47524
Automatic merge from submit-queue
AWS: Process disk attachments even with duplicate NodeNames
Fix#47404
```release-note
AWS: Process disk attachments even with duplicate NodeNames
```
Automatic merge from submit-queue (batch tested with PRs 47302, 47389, 47402, 47468, 47459)
[GCE] Fix ILB sharing and GC
Fixes#47092
- Users must opt-in for sharing backend services (alpha feature - may be removed in future release)
- Shared backend services use a hash for determining similarity via settings (so far, only sessionaffinity) (again, this may be removed)
- Move resource cleanup to after the ILB setup.
/assign @bowei
**Release note**:
```release-note
NONE
```