Automatic merge from submit-queue
Add e2e test for external provisioners
this is a retry of https://github.com/kubernetes/kubernetes/pull/39545
This time around:
* take advantage of the system:persistent-volume-provisioner bootstrap cluster role to grant the external provisioner pod serviceaccount permissions
* add storageclass suffix so that the first and third test don't conflict when one creates the storageclass with the same name before the other
also tested more thoroughly myself on gce :)
@jsafrane if you would like to re-review
Automatic merge from submit-queue (batch tested with PRs 41248, 41214)
Switch hpa controller to shared informer
**What this PR does / why we need it**: switch the hpa controller to use a shared informer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: Only the last commit is relevant. The others are from #40759, #41114, #41148
**Release note**:
```release-note
```
cc @smarterclayton @deads2k @sttts @liggitt @DirectXMan12 @timothysc @kubernetes/sig-scalability-pr-reviews @jszczepkowski @mwielgus @piosz
Automatic merge from submit-queue (batch tested with PRs 41246, 39998)
Cinder volume attacher: use instanceID instead of NodeID when verifying attachment
**What this PR does / why we need it**: Cinder volume attacher incorrectly uses NodeID instead of openstack instance id, so that reconciliation fails.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#39978
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
add e2e tests for replicasets with weight, min and max replicas
e2e test with weight, min and max replicas set
#31904#32014
@quinton-hoole @nikhiljindal @deepak-vij @kshafiee @mwielgus
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
TaintController
```release-note
This PR adds a manager to NodeController that is responsible for removing Pods from Nodes tainted with NoExecute Taints. This feature is beta (as the rest of taints) and enabled by default. It's gated by controller-manager enable-taint-manager flag.
```
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
ResyncPeriod Comment
ResyncPeriod Comment:
// ResyncPeriod returns a function which generates a duration each time it is
// invoked; this is so that multiple controllers don't get into lock-step and all
// hammer the apiserver with list requests simultaneously.
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)
[Federation][e2e] Fix few flakes in federation e2e tests
**What this PR does / why we need it**:
Fixes few flakes in #37105
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes # partly fixes few test cases in the above mentioned issue.
**Special notes for your reviewer**:
While cleaning up in AfterEach Block some objects are returned while listing, but by the time the object is delete is issued the object is disappearing resulting in this flake occasionally.
To fix this, we need to check if the err is NotFound while deleting, its ok and need not fail the test.
**Release note**: `NONE`
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)
verify: Use macOS compatible copying method
**What this PR does / why we need it**:
Similar to the fix in #34944, this fixes issues in the `make verify` tests, by using a copy method that is compatible with macOS and the bsd version of `cp`.
Before fix:
```
Verifying hack/make-rules/../../hack/verify-codegen.sh
cp: illegal option -- T
usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file target_file
cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file ... target_directory
FAILED hack/make-rules/../../hack/verify-codegen.sh 0s
```
After fix:
```
Verifying hack/make-rules/../../hack/verify-codegen.sh
Building client-gen
Building lister-gen
Building informer-gen
diffing cmd/kube-aggregator/hack/../pkg against freshly generated codegen
cmd/kube-aggregator/hack/../pkg up to date.
+++ [0128 10:06:48] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [0128 10:06:48] Generating bindata:
test/e2e/generated/gobindata_util.go
/opt/gopath/src/k8s.io/kubernetes /opt/gopath/src/k8s.io/kubernetes/test/e2e/generated
/opt/gopath/src/k8s.io/kubernetes/test/e2e/generated
+++ [0128 10:06:49] Building go targets for darwin/amd64:
cmd/libs/go2idl/client-gen
cmd/libs/go2idl/lister-gen
cmd/libs/go2idl/informer-gen
Building client-gen
Building lister-gen
Building informer-gen
SUCCESS hack/make-rules/../../hack/verify-codegen.sh 59s
```
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)
Promote TokenReview to v1
Peer to https://github.com/kubernetes/kubernetes/pull/40709
We have multiple features that depend on this API:
- [webhook authentication](https://kubernetes.io/docs/admin/authentication/#webhook-token-authentication)
- [kubelet delegated authentication](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authentication)
- add-on API server delegated authentication
The API has been in use since 1.3 in beta status (v1beta1) with negligible changes:
- Added a status field for reporting errors evaluating the token
This PR promotes the existing v1beta1 API to v1 with no changes
Because the API does not persist data (it is a query/response-style API), there are no data migration concerns.
This positions us to promote the features that depend on this API to stable in 1.7
cc @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-misc
```release-note
The authentication.k8s.io API group was promoted to v1
```
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)
make round trip testing generic
RoundTrip testing is something associated with a scheme and everyone who writes an API will want to do it. In the end, we should wire each API group separately in a test scheme and have them all call this general function. Once `kubeadm` is out of the main scheme, we'll be able to remove the one really ugly hack.
@luxas @sttts @kubernetes/sig-apimachinery-pr-reviews @smarterclayton
Because deleteAllTestNamespaces deleted all the e2e namespaces
it interefered with other federation namespace tests running in
parallel. This change should mitigate the problem and make the
tests runnable in parallel.
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
StatefulSet hardening
**What this PR does / why we need it**:
This PR contains the following changes to StatefulSet. Only one change effects the semantics of how the controller operates (This is described in #38418), and this change only brings the controller into conformance with its documented behavior.
1. pcb and pcb controller are removed and their functionality is encapsulated in StatefulPodControlInterface. This class modules the design contoller.PodControlInterface and provides an abstraction to clientset.Interface which is useful for testing purposes.
2. IdentityMappers has been removed to clarify what properties of a Pod are mutated by the controller. All mutations are performed in the UpdateStatefulPod method of the StatefulPodControlInterface.
3. The statefulSetIterator and petQueue classes are removed. These classes sorted Pods by CreationTimestamp. This is brittle and not resilient to clock skew. The current control loop, which implements the same logic, is in stateful_set_control.go. The Pods are now sorted and considered by their ordinal indices, as is outlined in the documentation.
4. StatefulSetController now checks to see if the Pods matching a StatefulSet's Selector also match the Name of the StatefulSet. This will make the controller resilient to overlapping, and will be enhanced by the addition of ControllerRefs.
5. The total lines of production code have been reduced, and the total number of unit tests has been increased. All new code has 100% unit coverage giving the module 83% coverage. Tests for StatefulSetController have been added, but it is not practical to achieve greater coverage in unit testing for this code (the e2e tests for StatefulSet cover these areas).
6. Issue #38418 is fixed in that StaefulSet will ensure that all Pods that are predecessors of another Pod are Running and Ready prior to launching a new Pod. This removes the potential for deadlock when a Pod needs to be rescheduled while its predecessor is hung in Pending or Initializing.
7. All reference to pet have been removed from the code and comments.
**Which issue this PR fixes**
fixes #38418,#36859
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes issue #38418 which, under circumstance, could cause StatefulSet to deadlock.
Mediates issue #36859. StatefulSet only acts on Pods whose identity matches the StatefulSet, providing a partial mediation for overlapping controllers.
```
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
HPA v2 (API Changes)
**Release note**:
```release-note
Introduces an new alpha version of the Horizontal Pod Autoscaler including expanded support for specifying metrics.
```
Implements the API changes for kubernetes/features#117.
This implements #34754, which is the new design for the Horizontal Pod Autoscaler. It includes improved support for custom metrics (and/or arbitrary metrics) as well as expanded support for resource metrics. The new HPA object is introduces in the API group "autoscaling/v1alpha1".
Note that the improved custom metric support currently is limited to per pod metrics from Heapster -- attempting to use the new "object metrics" will simply result in an error. This will change once #34586 is merged and implemented.
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
kubeadm: added tests for preflight checks
**What this PR does / why we need it**: There hadn't been much care to add more unit tests as more preflight checks were added. I added tests that increased coverage from ~9% to ~71%
Adding unit tests is a WIP from https://github.com/kubernetes/kubernetes/issues/34136
**Special notes for your reviewer**: /cc @luxas @pires
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
Implement TTL controller and use the ttl annotation attached to node in secret manager
For every secret attached to a pod as volume, Kubelet is trying to refresh it every sync period. Currently Kubelet has a ttl-cache of secrets of its pods and the ttl is set to 1 minute. That means that in large clusters we are targetting (5k nodes, 30pods/node), given that each pod has a secret associated with ServiceAccount from its namespaces, and with large enough number of namespaces (where on each node (almost) every pod is from a different namespace), that resource in ~30 GETs to refresh all secrets every minute from one node, which gives ~2500QPS for GET secrets to apiserver.
Apiserver cannot keep up with it very easily.
Desired solution would be to watch for secret changes, but because of security we don't want a node watching for all secrets, and it is not possible for now to watch only for secrets attached to pods from my node.
So as a temporary solution, we are introducing an annotation that would be a suggestion for kubelet for the TTL of secrets in the cache and a very simple controller that would be setting this annotation based on the cluster size (the large cluster is, the bigger ttl is).
That workaround mean that only very local changes are needed in Kubelet, we are creating a well separated very simple controller, and once watching "my secrets" will be possible it will be easy to remove it and switch to that. And it will allow us to reach scalability goals.
@dchen1107 @thockin @liggitt
This is quite often the case on AWS, and we really don't care if
the hostname is resolvable or not. It's not an easy problem
to ask user to fix, and there is no functional penalty at the
Kubernetes level, also it's possible that users fixes their host
resolution eventually, we don't have to make them do so.
Automatic merge from submit-queue (batch tested with PRs 40917, 41181, 41123, 36592, 41183)
fix ca cert in kubeadm
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
Automatic merge from submit-queue (batch tested with PRs 40917, 41181, 41123, 36592, 41183)
Set all node conditions to Unknown when node is unreachable
**What this PR does / why we need it**:
Sets all node conditions to Unknown when node does not report status/unreachable
**Which issue this PR fixes**
fixes https://github.com/kubernetes/kubernetes/issues/36273
Automatic merge from submit-queue (batch tested with PRs 40917, 41181, 41123, 36592, 41183)
fix scheduler performance test script
**What this PR does / why we need it**:
`test-performance.sh` is in dir `kubernetes/test/integration/scheduler_perf`
the dir `kubernetes/test/component/scheduler/perf` does not exist
Thanks.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40917, 41181, 41123, 36592, 41183)
[Federation] Add override flags options to kubefed init
**What this PR does / why we need it**:
Allows modification of startup flags (of apiserver and controller manager) through kubefed
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/40398
**Special notes for your reviewer**:
I haven't removed the existing redundant flags now (for example --dns-zone-name) intentionally to avoid breaking any existing tests that might use them.
I guess that would be better done as a follow up PR.
@madhusudancs @marun @nikhiljindal
**Release note**:
```
It is now possible for the user to modify any startup flag of federation-apiserver and federation-controller-manager when deployed through kubefed.
There are two new options introduced in kubefed:
--apiserver-arg-overrides and --controllermanager-arg-overrides
Any number of actual federation-apiserver or federation-controller-manager flags can be specified using these options.
Example:
kubefed init "-other options-" ----apiserver-arg-overrides "--flag1=value1,--flag2=value2"
```
Start looking up the virtual machine by it's UUID in vSphere again. Looking up by IP address is problematic and can either not return a VM entirely, or could return the wrong VM.
Retrieves the VM's UUID in one of two methods - either by a `vm-uuid` entry in the cloud config file on the VM, or via sysfs. The sysfs route requires root access, but restores the previous functionality.
Multiple VMs in a vCenter cluster can share an IP address - for example, if you have multiple VM networks, but they're all isolated and use the same address range. Additionally, flannel network address ranges can overlap.
vSphere seems to have a limitation of reporting no more than 16 interfaces from a virtual machine, so it's possible that the IP address list on a VM is completely untrustworthy anyhow - it can either be empty (because the 16 interfaces it found were veth interfaces with no IP address), or it can report the flannel IP.