Automatic merge from submit-queue (batch tested with PRs 63658, 63509, 63800, 63586, 63840). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean stackdriver sinks when reached limit
Workaround for stackdriver sinks leaking during test.
This will automate cleaning when sinks when we reach limit.
It can cause other tests running in parallel to fail, but because it's happens once per month impact is negligible.
Fixes#7295
```release-note
NONE
```
/cc @krzyzacy @x13n @cezarygerard
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
move cached_discovery to client-go/discovery
**Release note**:
```release-note
NONE
```
Moves the cmd/util CachedDiscoveryClient to client-go
cc @soltysh @deads2k
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
scheduler: remove nested retry loops and fix serving metrics under healthz server port
This PR fixes the issues of #63404:
- The Serve func for metrics and healthz does not block. Inside, there is a retry loop. This PR fixes this and gets rid of the error messages every 5 seconds.
- The separate metrics server is only to be started if it is configured on another port. Before this PR we were wrongly checking for the healthz service to be activated and then launched the metrics server instead. Because both server startups are in a go routine, they were racing against each other. If you were unlucky, the metrics endpoint was winning and it returned 404 on /healthz (while it should not have been active in the first place). The kubemark tests run the scheduler with a liveness probe which failed, restarting the scheduler after some minutes.
Automatic merge from submit-queue (batch tested with PRs 63792, 63495, 63742, 63332, 63779). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Actually support service `publishNotReadyAddresses`
This was added and the annotation was deprecated, but it was never
implemented.
xref #63741
**Release note**:
```release-note
The annotation `service.alpha.kubernetes.io/tolerate-unready-endpoints` is deprecated. Users should use Service.spec.publishNotReadyAddresses instead.
```
Automatic merge from submit-queue (batch tested with PRs 63492, 62379, 61984, 63805, 63807). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add e2e test to verify that GPU pool is not scaled up if GPUs are not requested
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55511, 63372, 63400, 63100, 63769). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create pkg/scheduling/apis/v1beta1 and move priorityClass to beta
**What this PR does / why we need it**:
This is for creating pkg/apis/scheduling/v1beta1 so that priorityClasses could be moved to beta.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #57471
**Special notes for your reviewer**:
/cc @bsalamat @aveshagarwal
**Release note**:
```release-note
The `PriorityClass` API is promoted to `scheduling.k8s.io/v1beta1`
```
Automatic merge from submit-queue (batch tested with PRs 55511, 63372, 63400, 63100, 63769). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[GCE] check for new backend naming scheme
**What this PR does / why we need it**:
Checks for both the old Backend naming scheme (Nodeport in the name) and the new naming scheme (same scheme used to name NEGs). We need to check for both, in order for both the tests against Ingress head (once https://github.com/kubernetes/ingress-gce/pull/239 gets merged) and tests against prior Ingress versions to pass.
See https://github.com/kubernetes/ingress-gce/pull/239 .
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63588, 63806). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Run FSGroup tests by default.
There is no special feature flag for FSGroup and the tests can run in all test suites. They're reasonably fast too.
**Release note**:
```release-note
NONE
```
cc: @jeffvance
New pod timeseries was introduced that has same labels for namespace and
pod_name resulting in doubling value in old query. New metric is not
based on containers so filtering on image solves that problem.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Removed unused namespace in UT helper func.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 63367, 63718, 63446, 63723, 63720). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
finish new dynamic client and deprecate old dynamic client
Builds on a couple other pulls. This completes the transition to the new dynamic client.
@kubernetes/sig-api-machinery-pr-reviews
@caesarxuchao @sttts
```release-note
The old dynamic client has been replaced by a new one. The previous dynamic client will exist for one release in `client-go/deprecated-dynamic`. Switch as soon as possible.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
e2e: add a tooling argument to differentiate tooling
We expect lots of tools to be able to install on provider=gce - the cluster
API, kops, kube-up etc.
We introduce a new optional flag to e2e ('tooling') to enable switching on
the tooling, not just the cloud.
This will prove useful for upgrade tests, for example, where the mechanism
will likely vary by tooling, but is currently tightly bound to the provider
(i.e. cloud)
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63049, 59731). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
re-enable nodeipam in kube-controller-manager
**What this PR does / why we need it**:
Re-enables nodeipam controller for external clouds. Also does a small refactor so that we don't need to pass in `allocateNodeCidr` into the controller.
In v1.10 we made a change (9187b343e1 (diff-f11913dc67d80d36b3d06a93f61c49cf) in https://github.com/kubernetes/kubernetes/pull/57492) where nodeipam would be disabled for any cluster that sets `--cloud-provider=external`. The original intention behind this was that the nodeipam controller is cloud specific for some clouds (only GCE at the moment) so it should be moved to the CCM (cloud controller manager). After some discussions with wg-cloud-provider it makes sense to re-enable nodeipam controller in KCM and have GCE CCM enable its own cloud-specific IPAM controller as part of [Initialize()](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go#L33-L35). This would allow for GCE to run nodeipam in both KCM (by setting --cloud-provider=gce and --allocate-node-cidr) and in the CCM (once implemented in `Initialize()`) without disabling nodeipam in the KCM for all external clouds and avoids having to implement nodeipam in CCM.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Re-enable nodeipam controller for external clouds.
```
Automatic merge from submit-queue (batch tested with PRs 63246, 63185). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add scale-up test from 0 for GPU node-pool
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
should use time.Since instead of time.Now().Sub
**What this PR does / why we need it**:
should use time.Since instead of time.Now().Sub
**Special notes for your reviewer**:
Automatic merge from submit-queue (batch tested with PRs 63669, 63511, 63561, 63289). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Cleanup DaemonSet after each integration test.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#63287
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 62665, 62194, 63616, 63672, 63450). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Kubeadm e2e
**What this PR does / why we need it**:
Provides in-tree E2E tests for the Kubeadm subproject
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes [kubeadm#456](https://github.com/kubernetes/kubeadm/issues/456)
**Special notes for your reviewer**:
The weird way tests are executed mirrors `e2e_node`. A future pull request will add a frontend for these tests to kubetest, which will abstract away much of this detail.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove ActiveDeadlineSeconds from Watch e2e Test
**What this PR does / why we need it**:
I originally updated ActiveDeadlineSeconds because it is one of the few fields on Pods which can change, but I didn't consider how it might affect the reliability of the test. This PR replaces the update done to ActiveDeadlineSeconds with a more predictable update to a label.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Might fix#61846
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Stop() for Ticker to enable leak-free code
**What this PR does / why we need it**:
I wanted to use the clock package but the `Ticker` without a `Stop()` method is a deal breaker for me.
**Release note**:
```release-note
NONE
```
/kind enhancement
/sig api-machinery
Automatic merge from submit-queue (batch tested with PRs 63624, 59847). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
explicit kubelet config key in Node.Spec.ConfigSource.ConfigMap
This makes the Kubelet config key in the ConfigMap an explicit part of
the API, so we can stop using magic key names.
As part of this change, we are retiring ConfigMapRef for ConfigMap.
```release-note
You must now specify Node.Spec.ConfigSource.ConfigMap.KubeletConfigKey when using dynamic Kubelet config to tell the Kubelet which key of the ConfigMap identifies its config file.
```
We expect lots of tools to be able to install on provider=gce - the
cluster API, kops, kube-up etc.
We introduce a new optional flag to e2e ('tooling') to enable switching
on the tooling, not just the cloud.
This will prove useful for upgrade tests, for example, where the
mechanism will likely vary by tooling, but is currently tightly bound to
the provider (i.e. cloud)
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix cgroup names in node_container_manager_test.
**What this PR does / why we need it**:
Fix cgroup names in node_container_manager_test.
The names were made invalid for the CgroupName refactor in #62541, so update them here.
Furthermore, as the new names are now compatible with what EnforceNodeAllocatable wants, reuse the constants there as well.
Tested:
```
$ make test-e2e-node REMOTE=true HOSTS=test-cos-beta-67-10575-27-0 FOCUS='Validate Node Allocatable' SKIP='' TEST_ARGS='--feature-gates=DynamicKubeletConfig=true'
• [SLOW TEST:39.488 seconds]
[k8s.io] Node Container Manager [Serial]
Validate Node Allocatable
set's up the node and runs the test
Ran 1 of 261 Specs in 57.348 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 260 Skipped
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#63549
**Special notes for your reviewer**:
/assign @dashpole
**Release note**:
```release-note
NONE
```