Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)
Removes redundant prefix in cluster-lifecycle e2e test names
**What this PR does / why we need it**:
Removes redundant prefix in cluster-lifecycle e2e test names
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Umbrella issue #49161
xref: #50054
**Special notes for your reviewer**:
/cc @jbeda
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50719, 51216, 50212, 51408, 51381)
Make selector immutable for v1beta2 deployment, replicaset and daemonset prior to update
**What this PR does / why we need it**:
This PR ensures that the controller selector is immutable for deployment and replicaset prior to update by ignoring any change to `Spec`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50808
**Special notes for your reviewer**:
This will be a breaking change.
**Release note**:
```release-note
For Deployment, ReplicaSet, and DaemonSet, selectors are now immutable when updating via the new `apps/v1beta2` API. For backward compatibility, selectors can still be changed when updating via `apps/v1beta1` or `extensions/v1beta1`.
```
Automatic merge from submit-queue (batch tested with PRs 51707, 51662, 51723, 50163, 51633)
update GC controller to wait until controllers have been initialized …
fixes #51013
Alternative to https://github.com/kubernetes/kubernetes/pull/51492 which keeps those few controllers (only one) from starting the informers early.
Automatic merge from submit-queue (batch tested with PRs 51707, 51662, 51723, 50163, 51633)
Change SizeLimit to a pointer
This PR fixes issue #50121
```release-note
The `emptyDir.sizeLimit` field is now correctly omitted from API requests and responses when unset.
```
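Why a pointer helps here can be sketched with plain `encoding/json`: `omitempty` never omits a struct value, but it does omit a nil pointer. The `Quantity`, `byValue`, and `byPointer` types below are hypothetical stand-ins for illustration, not the real API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Quantity is a hypothetical stand-in for a struct-typed quantity value.
type Quantity struct {
	Value int64 `json:"value"`
}

// byValue holds the limit as a struct value; omitempty has no effect on structs.
type byValue struct {
	SizeLimit Quantity `json:"sizeLimit,omitempty"`
}

// byPointer holds the limit as a pointer; a nil pointer is omitted.
type byPointer struct {
	SizeLimit *Quantity `json:"sizeLimit,omitempty"`
}

// encode marshals v to a JSON string, panicking on error for brevity.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(byValue{}))   // {"sizeLimit":{"value":0}}
	fmt.Println(encode(byPointer{})) // {}
}
```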
Automatic merge from submit-queue (batch tested with PRs 51707, 51662, 51723, 50163, 51633)
Adding vishh to test/ reviewers and approvers
Rationale: Reviewing/Shepherding lots of features/PRs around node and resource management.
Automatic merge from submit-queue (batch tested with PRs 50775, 51397, 51168, 51465, 51536)
Enable batch/v1beta1.CronJobs by default
This PR moves to CronJobs beta entirely, enabling `batch/v1beta1` by default.
Related issue: #41039
@erictune @janetkuo ptal
```release-note
Promote CronJobs to batch/v1beta1.
```
Automatic merge from submit-queue (batch tested with PRs 50775, 51397, 51168, 51465, 51536)
Allow bearer requests to be proxied by kubectl proxy
Use a fake transport to capture changes to the request and then surface
them back to the end user.
Fixes #50466
@liggitt no tests yet, but works locally
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.
I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP.
It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
Automatic merge from submit-queue
e2e: Add tests for network tiers in GCE
This test depends on #51301, which adds the new feature. Only the `e2e: Add tests for network tiers in GCE` commit is new.
#51301 should pass this new test.
Automatic merge from submit-queue (batch tested with PRs 51439, 51361, 51140, 51539, 51585)
Enable alpha GCE disk API
This PR builds on top of #50467 to allow the GCE disk API to use either the alpha or stable APIs.
CC @freehan
Automatic merge from submit-queue (batch tested with PRs 47054, 50398, 51541, 51535, 51545)
Switch away from gcloud deprecated flags in compute resource listings
**What is fixed**
Remove deprecated `gcloud compute` flags, see linked issue.
**Which issue this PR fixes**:
fixes #49673
**Special notes for your reviewer**:
The change in `gcloudComputeResourceList` in `test/e2e/framework/ingress_utils.go` isn't strictly needed as currently no affected resources are called on within that file, however the function has the _potential_ to access affected resources so I covered it as well. Happy to change if deemed unnecessary.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51228, 50185, 50940, 51544, 51543)
Add upgrades tests for kube-proxy daemonset migration path
**What this PR does / why we need it**:
From #23225, this is part of setting up CIs to validate the kube-proxy migration path (static pods -> daemonset and the reverse).
The other part of the work (adding real CIs that run these tests) will be in a separate PR against [kubernetes/test-infra](https://github.com/kubernetes/test-infra), though that is currently blocked by #50705.
**Special notes for your reviewer**:
cc @roberthbailey @pwittrock
**Release note**:
```release-note
NONE
```
For pod volumes that reference a PVC, add a PVCRef to the corresponding
volume stat. This allows metrics to be indexed/queried by PVC name
which is more user-friendly than Pod reference
Automatic merge from submit-queue (batch tested with PRs 51377, 46580, 50998, 51466, 49749)
Adding e2e SELinux test for local storage
Adding e2e test for SELinux enabled local storage
/sig storage
Closes #45054
Automatic merge from submit-queue (batch tested with PRs 51377, 46580, 50998, 51466, 49749)
Use the pre-built docker binaries on Ubuntu for benchmark tests
- Tested manually.
- The `ubuntu-init-docker.yaml` is copied from `cos-init-docker.yaml` with the following changes needed by Ubuntu. This change is temporary -- we will remove the script and the tests once we know the performance of using the pre-built Docker 1.12 on Ubuntu.
```
71,72c71,72
< mount --bind "${install_location}"/docker-containerd /usr/bin/docker-containerd
< mount --bind "${install_location}"/docker-containerd-shim /usr/bin/docker-containerd-shim
---
> mount --bind "${install_location}"/docker-containerd /usr/bin/containerd
> mount --bind "${install_location}"/docker-containerd-shim /usr/bin/containerd-shim
75c75
< mount --bind "${install_location}"/docker-runc /usr/bin/docker-runc
---
> mount --bind "${install_location}"/docker-runc /usr/sbin/runc
88c88
< local requested_version="$(get_metadata "gci-docker-version")"
---
> local requested_version="$(get_metadata "ubuntu-docker-version")"
93,98d92
< # Check if we have the requested version installed.
< if check_installed /usr/bin/docker "${requested_version}"; then
< echo "Requested version already installed. Exiting."
< exit 0
< fi
<
100c94
< /usr/bin/systemctl stop docker
---
> systemctl stop docker
106c100
< /usr/bin/systemctl start docker && exit $rc
---
> systemctl start docker && exit $rc
```
- Updated all tests to use the latest Ubuntu image.
**Release note**:
```
None
```
/assign @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 49961, 50005, 50738, 51045, 49927)
Add cluster e2es to verify scheduler local storage support
Add cluster e2es to verify scheduler local storage support and remove some unused private functions
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
part of #50818
**Release note**:
```release-note
Add cluster e2es to verify scheduler local ephemeral storage support
```
/assign @jingxu97
/cc @ddysher
Automatic merge from submit-queue (batch tested with PRs 44719, 48454)
check job ActiveDeadlineSeconds
**What this PR does / why we need it**:
enqueue a sync task after ActiveDeadlineSeconds
**Which issue this PR fixes**:
fixes #32149
**Special notes for your reviewer**:
**Release note**:
```release-note
enqueue a sync task to wake up jobcontroller to check job ActiveDeadlineSeconds in time
```
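The idea of waking the controller when the deadline passes can be sketched as computing the remaining time and using it as the delay for a re-sync. This is a minimal sketch; `resyncDelay` is a hypothetical helper name, not the job controller's actual function.

```go
package main

import (
	"fmt"
	"time"
)

// resyncDelay returns how long to wait before re-checking a job that has been
// active for elapsed, given its activeDeadlineSeconds. It returns 0 when the
// deadline has already passed, meaning the job should be checked immediately.
func resyncDelay(elapsed time.Duration, activeDeadlineSeconds int64) time.Duration {
	deadline := time.Duration(activeDeadlineSeconds) * time.Second
	if remaining := deadline - elapsed; remaining > 0 {
		return remaining
	}
	return 0
}

func main() {
	// A job with a 100s deadline that has been running for 40s.
	fmt.Println(resyncDelay(40*time.Second, 100)) // 1m0s
}
```

In a controller, the returned duration would typically be passed to a delayed enqueue on the work queue, so that the sync runs close to the deadline instead of waiting for an unrelated event.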
Automatic merge from submit-queue
Added an end-to-end test ensuring that Cluster Autoscaler does not scale up when all pending pods are unschedulable
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50919, 51410, 50099, 51300, 50296)
GCE: Read networkProjectID param
Fixes #48515
/assign bowei
The first commit is the original PR cherrypicked. The master's kubelet isn't provided a cloud config path, so the project is retrieved via instance metadata. In the GKE case, this project cannot be retrieved by the master and caused an error.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51471, 50561, 50435, 51473, 51436)
Feature gate initializers field
The metadata.initializers field should be feature gated and disabled by default while in alpha, especially since enforcement of initializer permission that keeps users from submitting objects with their own initializers specified is done via an admission plugin most clusters do not enable yet.
Not gating the field and tests caused tests added in https://github.com/kubernetes/kubernetes/issues/51429 to fail on clusters that don't enable the admission plugin.
This PR:
* adds an `Initializers` feature gate, auto-enables the feature gate if the admission plugin is enabled
* clears the `metadata.initializers` field of objects on create/update if the feature gate is not set
* marks the e2e tests as feature-dependent (will follow up with PR to test-infra to enable the feature and opt in for GCE e2e tests)
```release-note
Use of the alpha initializers feature now requires enabling the `Initializers` feature gate. This feature gate is auto-enabled if the `Initializers` admission plugin is enabled.
```
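The gating behavior described above can be sketched as two small steps: decide whether the feature is on (explicitly, or implied by the admission plugin), and clear the field on create/update when it is off. The `ObjectMeta` type and both function names are hypothetical simplifications for illustration.

```go
package main

import "fmt"

// ObjectMeta is a simplified, hypothetical stand-in for object metadata.
type ObjectMeta struct {
	Name         string
	Initializers []string
}

// initializersEnabled reports whether the feature is on, either because the
// gate was set explicitly or because the admission plugin is enabled.
func initializersEnabled(featureGate, admissionPlugin bool) bool {
	return featureGate || admissionPlugin
}

// dropDisabledInitializers clears the field when the feature is off, so that
// user-supplied initializers are ignored on create and update.
func dropDisabledInitializers(meta *ObjectMeta, enabled bool) {
	if !enabled {
		meta.Initializers = nil
	}
}

func main() {
	meta := &ObjectMeta{Name: "demo", Initializers: []string{"example.com/init"}}
	dropDisabledInitializers(meta, initializersEnabled(false, false))
	fmt.Println(len(meta.Initializers)) // 0
}
```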
Automatic merge from submit-queue
Moved node condition filter into a predicate.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50360
**Release note**:
```release-note
A new predicate, named 'CheckNodeCondition', was added to replace the node condition filter. 'NetworkUnavailable', 'OutOfDisk' and 'NotReady' may be reported as reasons when pods fail to schedule.
```
Automatic merge from submit-queue (batch tested with PRs 50953, 51082)
Fix mergekey of initializers; Repair invalid update of initializers
Fix https://github.com/kubernetes/kubernetes/issues/51131
The PR did two things to make parallel patching `metadata.initializers.pending` possible:
* Add mergekey to initializers.pending
* Let the initializer admission plugin set `metadata.initializers` to nil if an update makes both `pending` and `result` nil, instead of returning a validation error. Otherwise, if multiple initializer controllers send the patch removing themselves from `pending` at the same time, one of them will get a validation error.
```release-note
The patch to remove the last initializer from metadata.initializers.pending will result in metadata.initializers being set to nil (assuming metadata.initializers.result is also nil), instead of resulting in a validation error.
```
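The second behavior above can be sketched as follows: removing the last pending initializer collapses the whole structure to nil rather than producing an error. The `Initializers` type and `removePending` helper are hypothetical simplifications, not the admission plugin's actual code.

```go
package main

import "fmt"

// Initializers is a simplified, hypothetical mirror of metadata.initializers.
type Initializers struct {
	Pending []string
	Result  *string
}

// removePending drops name from Pending. When nothing is left pending and no
// result is set, it returns nil instead of an empty (invalid) structure.
func removePending(in *Initializers, name string) *Initializers {
	if in == nil {
		return nil
	}
	kept := make([]string, 0, len(in.Pending))
	for _, p := range in.Pending {
		if p != name {
			kept = append(kept, p)
		}
	}
	if len(kept) == 0 && in.Result == nil {
		return nil
	}
	return &Initializers{Pending: kept, Result: in.Result}
}

func main() {
	in := &Initializers{Pending: []string{"a.example.com", "b.example.com"}}
	in = removePending(in, "a.example.com")
	fmt.Println(in.Pending) // [b.example.com]
	in = removePending(in, "b.example.com")
	fmt.Println(in == nil) // true
}
```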
Automatic merge from submit-queue
Fix forbidden message format
Before this change:
$ kubectl get pods --as=tom
Error from server (Forbidden): pods "" is forbidden: User "tom" cannot list pods in the namespace "default".
After this change:
$ kubectl get pods --as=tom
Error from server (Forbidden): pods is forbidden: User "tom" cannot list pods in the namespace "default".
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
Fix forbidden message format, remove extra ""
```
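The formatting fix can be sketched as a conditional on the resource name: only include the quoted name when there is one. `forbiddenMessage` is a hypothetical helper written for illustration, not the actual apimachinery function.

```go
package main

import "fmt"

// forbiddenMessage builds the error text, omitting the quoted name when the
// request (for example a list) does not target a single named object.
func forbiddenMessage(resource, name, reason string) string {
	if name == "" {
		return fmt.Sprintf("%s is forbidden: %s", resource, reason)
	}
	return fmt.Sprintf("%s %q is forbidden: %s", resource, name, reason)
}

func main() {
	reason := `User "tom" cannot list pods in the namespace "default".`
	fmt.Println(forbiddenMessage("pods", "", reason))
	// pods is forbidden: User "tom" cannot list pods in the namespace "default".
}
```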
Automatic merge from submit-queue
Let the quota evaluator handle mutating specs of pod & pvc
### Background
The final goal is to address https://github.com/kubernetes/kubernetes/issues/47837, which aims to allow more mutation for uninitialized objects.
To do that, we [decided](https://github.com/kubernetes/kubernetes/issues/47837#issuecomment-321462433) to let the admission controllers handle mutation of uninitialized objects.
### Issue
#50399 attempted to fix all admission controllers so that they can handle mutating uninitialized objects. It was incomplete. I didn't realize that although the resourcequota admission plugin handles the update operation, the underlying evaluator didn't. This PR updates the evaluators to handle updates of uninitialized pods/pvc.
### TODO
We still miss another piece. The [quota replenish controller](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/resourcequota/replenishment_controller.go) uses the sharedinformer, which doesn't observe the deletion of uninitialized pods at the moment. So there is a quota leak if a pod is deleted before it's initialized. It will be addressed with https://github.com/kubernetes/kubernetes/issues/48893.
Automatic merge from submit-queue
Make coreos test images sshd not allow password login.
This will prevent security scanners from triggering.
Configuration is verbatim from:
https://coreos.com/os/docs/latest/customizing-sshd.html
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)
Dynamic Flexvolume plugin discovery, probing with filesystem watch.
**What this PR does / why we need it**: Enables dynamic Flexvolume plugin discovery. This model uses a filesystem watch (fsnotify library), which notifies the system that a probe is necessary only if something changes in the Flexvolume plugin directory.
This PR uses the dependency injection model in https://github.com/kubernetes/kubernetes/pull/49668.
**Release Note**:
```release-note
Dynamic Flexvolume plugin discovery. Flexvolume plugins can now be discovered on the fly rather than only at system initialization time.
```
/sig-storage
/assign @jsafrane @saad-ali
/cc @bassam @chakri-nelluri @kokhang @liggitt @thockin
Automatic merge from submit-queue (batch tested with PRs 50889, 51347, 50582, 51297, 51264)
Change eviction manager to manage one single local storage resource
**What this PR does / why we need it**:
We decided to manage one single resource name, so the eviction policy should be modified too.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #50818
**Special notes for your reviewer**:
**Release note**:
```release-note
Change eviction manager to manage one single local ephemeral storage resource
```
/assign @jingxu97
Automatic merge from submit-queue
Fixed gke auth update wait condition.
Lookup whoami on gke using gcloud auth list.
Make sure we do not run the test on any cluster older than 1.7.
**What this PR does / why we need it**: Fixes issue with aggregator e2e test on GKE
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50945
**Special notes for your reviewer**: There is a TODO, follow up will be provided when the immediate problem is resolved.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51134, 51122, 50562, 50971, 51327)
Made the tests ensure that Cluster Autoscaler is on before running.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Configuration is based on:
https://coreos.com/os/docs/latest/customizing-sshd.html
The specific SSHD config is:
# Use most defaults for sshd configuration.
UsePrivilegeSeparation sandbox
Subsystem sftp internal-sftp
ClientAliveInterval 180
UseDNS no
UsePAM yes
PrintLastLog no # handled by PAM
PrintMotd no # handled by PAM
AuthenticationMethods publickey
This will prevent security scanners from triggering.
Automatic merge from submit-queue
AllowedNotReadyNodes allowed to be not ready for absolutely *any* reason
This is as good as allowing that many nodes to not be part of the cluster at all, ever.
Btw - currently our 5k-node correctness test fails if "kubelet stopped posting node status" or "route not created", etc (ref: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-correctness/3/build-log.txt)
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue (batch tested with PRs 51244, 50559, 49770, 51194, 50901)
Distribute pods efficiently in CA scalability tests
**What this PR does / why we need it**:
Instead of using runReplicatedPodOnEachNode method
which is suited to a small number of nodes,
distribute pods on the nodes with desired load
using RCs that eat up all the space we want to be
empty after distribution.
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50213, 50707, 49502, 51230, 50848)
StatefulSet: Deflake e2e `kubectl exec` commands.
This may help with another source of flakiness found while investigating #48031.
We seem to get a lot of flakes due to "connection refused" while running `kubectl exec`. I can't find any reason this would be caused by the test flow, so I'm adding retries to see if that helps.
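The retry approach can be sketched as a small wrapper that re-runs a flaky operation a bounded number of times. This is a generic illustration; `retry` and its parameters are hypothetical and do not reflect the e2e framework's actual helper.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retry calls fn up to attempts times, sleeping delay between failed tries.
// It returns nil on the first success, or an error wrapping the last failure.
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %v", attempts, err)
}

func main() {
	calls := 0
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			// Simulates the transient failure seen with kubectl exec.
			return errors.New("connection refused")
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```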
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)
Enable overlay2 on cos-m60 in node e2e tests
Ref: https://github.com/kubernetes/kubernetes/issues/42926
- Restart docker with `-s overlay2` in cloud-init before running all node e2e tests. I have to copy the systemd unit file to `/etc/systemd/system` because the `/usr/lib/systemd/system/` is read only.
- Updated node e2e tests to use the new cos-m60 image.
- The name of the cloud init file (`cos-init-live-restore.yaml`) does not indicate overlay2 will be enabled, but I can't just change the name in this PR, since it's referenced in test-infra.
**Release note**:
```
None
```
/assign @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)
StatefulSet: Deflake e2e "restart" phase.
This addresses another source of flakiness found while investigating #48031.
The test used to scale the StatefulSet down to 0, wait for ListPods to return 0 matching Pods, and then scale the StatefulSet back up.
This was prone to a race in which StatefulSet was told to scale back up before it had observed its own deletion of the last Pod, as evidenced by logs showing the creation of Pod ss-1 prior to the creation of the replacement Pod ss-0.
Instead, we now wait for the controller to observe all deletions before scaling it back up. This should fix flakes of the form:
```
Too many pods scheduled, expected 1 got 2
```
Automatic merge from submit-queue (batch tested with PRs 51193, 51154, 42689, 51189, 51200)
Re-enable OIR e2e tests.
Re-enabling test skeleton for opaque integer resources originally submitted as part of #41870. The e2e was disabled since it was flaky. This is the first step toward re-enabling them. Currently all cases are skipped, so this exercises only the BeforeEach behavior and the deferred removal of OIRs from a node.
cc @timothysc
Automatic merge from submit-queue (batch tested with PRs 51108, 51035, 50539, 51160, 50947)
Auto-calculate CLUSTER_IP_RANGE based on cluster size
In preparation for eliminating CLUSTER_IP_RANGE env var from job configs, making it less error prone while folks try to start their own large cluster tests (https://github.com/kubernetes/kubernetes/issues/50907).
/cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
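One way to derive a range from cluster size is to pick the smallest prefix that still leaves every node its own subnet. The sketch below assumes each node receives a /24 pod subnet; that assumption and the `clusterCIDRPrefix` name are illustrative only, and the actual script may use a different mapping.

```go
package main

import (
	"fmt"
	"math/bits"
)

// clusterCIDRPrefix returns the longest prefix length whose range can hold one
// /24 subnet per node. The /24-per-node size is an assumption of this sketch.
func clusterCIDRPrefix(numNodes int) int {
	if numNodes < 1 {
		return 24
	}
	// Number of bits needed to index numNodes distinct subnets.
	subnetBits := bits.Len(uint(numNodes - 1))
	return 24 - subnetBits
}

func main() {
	for _, n := range []int{1, 100, 256, 5000} {
		fmt.Printf("%d nodes -> /%d\n", n, clusterCIDRPrefix(n))
	}
	// 1 nodes -> /24
	// 100 nodes -> /17
	// 256 nodes -> /16
	// 5000 nodes -> /11
}
```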
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)
Add statefulset upgrade tests to cluster_upgrade
**What this PR does / why we need it**:
Adds already created statefulset upgrade tests to cluster_upgrade.go. With further test infra changes, this will allow them to be continuously run, giving better signals.
Detect and prevent issues like https://github.com/kubernetes/kubernetes/issues/48327
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)
implement proposal 34058: hostPath volume type
**What this PR does / why we need it**:
implement proposal #34058
**Which issue this PR fixes**: fixes #46549
**Special notes for your reviewer**:
cc @thockin @luxas @euank PTAL
Automatic merge from submit-queue (batch tested with PRs 50489, 51070, 51011, 51022, 51141)
update to rbac v1 in yaml file
**What this PR does / why we need it**:
ref to https://github.com/kubernetes/kubernetes/pull/49642
ref https://github.com/kubernetes/features/issues/2
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
cc @liggitt
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Skip "Simple pod should support exec through kubectl proxy" test
As reported in https://github.com/kubernetes/kubernetes/issues/50466,
this test doesn't work in GKE because it uses a bearer token and the feature only works with client certs.
As the feature that is broken in GKE is new and didn't work before, it
is safe to just ignore the test and consider the feature as "still not
working" in GKE.
**What this PR does / why we need it**: Fixes the broken test in https://k8s-testgrid.appspot.com/release-master-blocking#gke
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: works-around #50466
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50257, 50247, 50665, 50554, 51077)
Replace hard-coded "cpu" and "memory" with consts
**What this PR does / why we need it**:
There are many places using hard-coded "cpu" and "memory" as resource names. This PR replaces them with consts.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
/kind cleanup
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50980, 46902, 51051, 51062, 51020)
Remove seemingly obsolete binaries
It's hard to tell if these are safe to remove. Let CI tell me.
Automatic merge from submit-queue (batch tested with PRs 51039, 50512, 50546, 50965, 50467)
Kubectl: Plumb openapi validation (disabled by default)
**What this PR does / why we need it**: Creates a new flag '--openapi' and plumbs in the validation code so that it can be used by default to validate objects against the openapi schema.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: partially https://github.com/kubernetes/kubectl/issues/49
**Special notes for your reviewer**:
This is not complete, the name of the variable must change for example.
**Release note**:
```release-note
Kubectl uses openapi for validation. If OpenAPI is not available on the server, it defaults back to the old Swagger.
```
Automatic merge from submit-queue
StatefulSet: Deflake e2e "Saturate" phase.
This should reduce one source of flakiness found while investigating #48031.
The "Saturate" phase of StatefulSet e2e tests verifies orderly startup by controlling when each Pod is allowed to report Ready. If a Pod unexpectedly goes down during the test, the replacement Pod created by the controller will forget if it was already allowed to report Ready.
After this change, the signal that allows each Pod to report Ready is persisted in the Pod's PVC. Thus, the replacement Pod will remember that it was already told to proceed to a Ready state.
Automatic merge from submit-queue (batch tested with PRs 51102, 50712, 51037, 51044, 51059)
[sig-network-e2e] Remove redundant sig prefix from tests
**What this PR does / why we need it**:
Remove redundant sig prefix from:
```
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for endpoint-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for endpoint-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for node-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for node-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for pod-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for pod-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update endpoints: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update endpoints: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update nodePort: http [Slow]
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update nodePort: udp [Slow]
[sig-network] Loadbalancing: L7 [sig-network] GCE [Slow] [Feature:Ingress] should conform to Ingress spec
[sig-network] Loadbalancing: L7 [sig-network] GCE [Slow] [Feature:Ingress] should create ingress with given static-ip
```
Umbrella issue #49161
**Special notes for your reviewer**:
cc @xiangpengzhao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)
Fix GC integration test race
During TestCreateWithNonExistentOwner, when creating a pod with a
non-existent owner, assume it's possible the pod will be deleted before
we start checking for the pod's existence. Assuming that the pod still
exists immediately after Create returns is flaky if the GC reacts very
quickly.
```release-note
NONE
```
Might fix https://github.com/kubernetes/kubernetes/issues/50943; without the additional test context provided by this PR, it's not entirely possible to assess the root cause of the reported failure (as we don't know whether the original assertion failure was due to there being 0 or >1 pods).
/cc @caesarxuchao
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)
Revert "Merge pull request #51008 from kubernetes/revert-50789-fix-scheme"
I'm spinning up a cluster right now to test this fix, but I'm pretty sure this was the problem.
There doesn't seem to be a way to confirm from logs, because AFAICT the logs from the hollow kubelet containers are not collected as part of the kubemark test.
**What this PR does / why we need it**:
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.
This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
**Which issue this PR fixes**: fixes #51007
**Release note**:
```release-note
NONE
```
/cc @shyamjvs @wojtek-t
As reported in https://github.com/kubernetes/kubernetes/issues/50466,
this test doesn't work in GKE because the transport layer doesn't work
with dialing.
As the feature that is broken in GKE is new and didn't work before, it
is safe to just ignore the test and consider the feature as "still not
working" in GKE.
Automatic merge from submit-queue
Should generate files before scheduler perf
**What this PR does / why we need it**:
For a newly cloned project, generated files are not included. Then scheduler_perf will fail:
```
undefined: openapi.GetOpenAPIDefinitions
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes: #51090
**Special notes for your reviewer**:
Automatic merge from submit-queue (batch tested with PRs 50893, 50913, 50963, 50629, 50640)
Increase latency threshold for list api calls
This is only a short-term solution to make our density test green. In the long-term, we should measure as per our new SLIs.
From @wojtek-t's [doc](https://docs.google.com/document/d/1Q5qxdeBPgTTIXZxdsFILg7kgqWhvOwY8uROEf0j5YBw) on the new SLIs/SLOs, we have the following SLO for list calls:
```
SLO1: In default Kubernetes installation, 99th percentile of SLI2 per cluster-day:
<= 1s if total number of objects of the same type as resource in the system <= X
<= 5s if total number of objects of the same type as resource in the system <= Y
<= 30s if total number of objects of the same types as resource in the system <= Z
```
I would guess that 170,000 pods would fall into the 2nd bracket (at least) and hence the new value of 5s. WDYT?
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Revert #50362.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #50884
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 50693, 50831, 47506, 49119, 50871)
Add instance metadata from flag even when using image config.
Also add instance metadata from flag even when we are using image config.
* Sometimes we need to dynamically generate instance metadata, and it's troublesome to put it into the image config.
* Sometimes we want to apply instance metadata to all images, and it's redundant to add it to each image in the image config.
/assign @yguo0905 Could you help me review this?
Automatic merge from submit-queue (batch tested with PRs 47896, 50678, 50620, 50631, 51005)
Made the difference between scale-up timeout and cluster set-up timeout explicit.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
During TestCreateWithNonExistentOwner, when creating a pod with a
non-existent owner, assume it's possible the pod will be deleted before
we start checking for the pod's existence. Assuming that the pod still
exists immediately after Create returns is flaky if the GC reacts very
quickly.
- Wait for the master to be healthy
- Wait longer for the master to start
- Fail gracefully if starting the master panics
Signed-off-by: Monis Khan <mkhan@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 46512, 50146)
Make metav1.(Micro)?Time functions take pointers
Is there any reason for those functions not to be on pointers?
Automatic merge from submit-queue (batch tested with PRs 50904, 50691)
Stackdriver Logging e2e: Explicitly check for docker and kubelet logs presence
Check for kubelet and docker logs explicitly in the Stackdriver Logging e2e tests
Automatic merge from submit-queue
Add enj to OWNERS for test/integration/etcd/etcd_storage_path_test.go
@deads2k is the bot smart enough to not spam me with every test change? Perhaps I should create an `OWNERS` file in `test/integration/etcd`?
**Release note**:
```release-note
NONE
```
@kubernetes/sig-api-machinery-pr-reviews
Automatic merge from submit-queue
CollisionCount should have type int32 across controllers that use it for collision avoidance
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50530
**Special notes for your reviewer**:
/cc @liyinan926
/assign @kow3ns @thockin @janetkuo
**Release note**:
```release-note
Change CollisionCount from int64 to int32 across controllers
```
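To show why a collision counter is useful, here is a sketch that mixes an optional `*int32` counter into a name hash. `computeHash` and its hashing scheme are assumptions for illustration and may differ from what the controllers do:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// computeHash is a hypothetical helper: it hashes a template string and, when
// collisionCount is set, mixes the counter in so that a retry yields a new hash.
func computeHash(template string, collisionCount *int32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(template))
	if collisionCount != nil {
		buf := make([]byte, 4)
		binary.LittleEndian.PutUint32(buf, uint32(*collisionCount))
		h.Write(buf)
	}
	return h.Sum32()
}

func main() {
	one, two := int32(1), int32(2)
	// Bumping the counter after a name collision produces a different hash.
	fmt.Println(computeHash("template", &one) != computeHash("template", &two))
}
```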
Automatic merge from submit-queue (batch tested with PRs 50277, 50823, 50376, 50867)
Move e2e taints test file to sig-scheduling
**What this PR does / why we need it**:
Move taint test file to e2e scheduling and add sig-scheduling prefix.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
**Special notes for your reviewer**:
**Release note**:
none
Automatic merge from submit-queue
Change API version of statefulset scale subresource e2e test to v1beta2
**What this PR does / why we need it**:
This PR changes API version of statefulset scale subresource e2e test from `v1beta1` to `v1beta2`.
`apps/v1beta2` has been enabled.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #50109
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50281, 50747, 50347, 50834, 50852)
Add e2e aggregator test.
What this PR does / why we need it:
This adds an e2e test for aggregation based on the sample-apiserver.
Currently it uses a sample-apiserver built as of 1.7.
This should ensure that the aggregation system works end-to-end.
It will also help detect if we break "old" extension api servers.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes
fixes#43714
**Special notes for your reviewer**:
**Release note**: NONE
Automatic merge from submit-queue (batch tested with PRs 50563, 50698, 50796)
Add ControllerRevision to apps/v1beta2
**What this PR does / why we need it**:
This PR adds `ControllerRevision`, currently in `apps/v1beta1`, to `apps/v1beta2`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50696.
**Special notes for your reviewer**:
@kow3ns @janetkuo
**Release note**:
```release-note
Add ControllerRevision to apps/v1beta2
```
What this PR does / why we need it:
This adds an e2e test for aggregation based on the sample-apiserver.
Currently it uses a sample-apiserver built as of 1.7.
This should ensure that the aggregation system works end-to-end.
It will also help detect if we break "old" extension api servers.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes
fixes#43714
Fixed bazel for the change.
Fixed # of args issue from govet.
Added code to test dynamic.Client.
Automatic merge from submit-queue
Migrate sig-apimachinery and sig-servicecatalog e2e tests
**What this PR does / why we need it**:
Migrate sig-apimachinery and sig-servicecatalog e2e tests
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
1. Move generated_clientset.go to sig-apimachinery
2. Move podpreset.go to sig-servicecatalog by creating new directory.
**Special notes for your reviewer**:
**Release note**:
none
/cc @liggitt
Automatic merge from submit-queue (batch tested with PRs 50550, 50768)
Don't SSH to master for metrics in case of GKE
cc @kubernetes/sig-scalability-misc @crassirostris
Automatic merge from submit-queue
add some e2e for node authz
**What this PR does / why we need it**:
fix#47174
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47174
**Special notes for your reviewer**:
**Release note**:
```
None
```
Automatic merge from submit-queue
Promote CronJobs to batch/v1beta1 - just the API
This PR promotes CronJobs to beta.
@erictune @kubernetes/sig-apps-api-reviews @kubernetes/api-approvers ptal
This builds on top of #41890 and needs #40932 as well
```release-note
Promote CronJobs to batch/v1beta1.
```
Automatic merge from submit-queue (batch tested with PRs 50711, 50742, 50204)
fix panic in e2e
**What this PR does / why we need it**:
fix#50660
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
no
**Release note**:
```release-note
none
```
Automatic merge from submit-queue (batch tested with PRs 50670, 50332)
e2e test for local storage mount point
**What this PR does / why we need it**:
We discovered that kubernetes can treat local directories and actual mountpoints differently. For example, https://github.com/kubernetes/kubernetes/issues/48331. The current local storage e2e tests use directories.
This PR introduces a test that creates a tmpfs and mounts it, and runs one of the local storage e2e tests.
**Which issue this PR fixes**: fixes https://github.com/kubernetes/kubernetes/issues/49126
**Special notes for your reviewer**:
I cherrypicked PR https://github.com/kubernetes/kubernetes/pull/50177, since local storage e2e tests are broken in master on 2017-08-08 due to "no such host" error. This PR replaces NodeExec with SSH commands.
You can run the tests using the following commands:
```
$ NUM_NODES=1 KUBE_FEATURE_GATES="PersistentLocalVolumes=true" go run hack/e2e.go -- -v --up
$ go run hack/e2e.go -- -v --test --test_args="--ginkgo.focus=\[Feature:LocalPersistentVolumes\]"
```
Here is the summary of results from my test run:
```
Ran 9 of 651 Specs in 387.905 seconds
SUCCESS! -- 9 Passed | 0 Failed | 0 Pending | 642 Skipped PASS
Ginkgo ran 1 suite in 6m29.369318483s
Test Suite Passed
2017/08/08 11:54:01 util.go:133: Step './hack/ginkgo-e2e.sh --ginkgo.focus=\[Feature:LocalPersistentVolumes\]' finished in 6m32.077462612s
```
**Release note**:
`NONE`
Automatic merge from submit-queue (batch tested with PRs 50198, 49051, 48432)
move KubeletConfiguration out of componentconfig API group
I'm splitting #44252 into more manageable steps. This step moves the types and updates references.
To reviewers: the most important changes are the removals from pkg/apis/componentconfig and additions to pkg/kubelet/apis/kubeletconfig. Almost everything else is an import or name update.
I have one unanswered question: Should I create a whole new api scheme for Kubelet APIs rather than register e.g. a kubeletconfig group with the default runtime.Scheme instance? This feels like the right thing, as the Kubelet should be exposing its own API, but there's a big fat warning not to do this in `pkg/api/register.go`. Can anyone answer this?
Automatic merge from submit-queue (batch tested with PRs 50198, 49051, 48432)
Add prefix to common networking e2e tests
**What this PR does / why we need it**:
Common networking e2e tests shared by node and cluster suites should also have prefix `[sig-network]`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Umbrella issue #49161
**Special notes for your reviewer**:
/cc @bowei
**Release note**:
```release-note
NONE
```
LocalVolumeType tmpfs added
Added checks to ensure that the volume created during setup contains the expected testFileContent
Refactored tests out to avoid code duplication
Two different tests are performed with tmpfs:
-serial write and read in two different pods
-write and read in two different pods mounted at the same time
Fixed local storage test failures by integrating https://github.com/kubernetes/kubernetes/pull/50177
Switched NodeExec to SSH
Automatic merge from submit-queue (batch tested with PRs 49904, 50484, 50214)
Adding support for internal IP for e2e tests
Currently IssueSSHCommand in util.go only checks for an external IP address
to SSH to; this PR adds a check for the internal IP too.
Closes#50630
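The fallback can be sketched as follows. The `nodeAddress` type and the `sshHost` helper are simplified, hypothetical stand-ins, and the preference order (external first, then internal) is an assumption:

```go
package main

import (
	"errors"
	"fmt"
)

type addressType string

const (
	externalIP addressType = "ExternalIP"
	internalIP addressType = "InternalIP"
)

// nodeAddress is a simplified stand-in for a node's reported address.
type nodeAddress struct {
	Type    addressType
	Address string
}

// sshHost prefers an external IP and falls back to an internal IP.
func sshHost(addrs []nodeAddress) (string, error) {
	for _, want := range []addressType{externalIP, internalIP} {
		for _, a := range addrs {
			if a.Type == want && a.Address != "" {
				return a.Address, nil
			}
		}
	}
	return "", errors.New("node has neither an external nor an internal IP")
}

func main() {
	host, err := sshHost([]nodeAddress{{Type: internalIP, Address: "10.0.0.5"}})
	fmt.Println(host, err)
}
```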
Automatic merge from submit-queue (batch tested with PRs 50094, 48966, 49478, 50593, 49140)
Migrate sig-auth e2e tests.
**What this PR does / why we need it:** This PR adds the [sig-auth] prefix to
workload e2e tests, in accordance with the requirements for adding a SIG dashboard
to testgrid. Refer to PR #48781 for guidelines.
**Release note**:
```release-note
```
Automatic merge from submit-queue
Moved node condition filter into a predicate.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50360
**Release note**:
```release-note
A new predicate, named 'CheckNodeCondition', was added to replace the node condition filter. 'NetworkUnavailable', 'OutOfDisk' and 'NotReady' may be reported as reasons when pods fail to schedule.
```
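A rough sketch of what such a predicate could look like. The `nodeConditions` struct and the `checkNodeCondition` function are simplified assumptions for illustration; only the three reason strings come from the release note:

```go
package main

import "fmt"

// nodeConditions is a simplified, hypothetical view of a node's status.
type nodeConditions struct {
	Ready              bool
	OutOfDisk          bool
	NetworkUnavailable bool
}

// checkNodeCondition reports whether a pod may be scheduled on the node and,
// if not, every reason that blocks it.
func checkNodeCondition(c nodeConditions) (bool, []string) {
	var reasons []string
	if !c.Ready {
		reasons = append(reasons, "NotReady")
	}
	if c.OutOfDisk {
		reasons = append(reasons, "OutOfDisk")
	}
	if c.NetworkUnavailable {
		reasons = append(reasons, "NetworkUnavailable")
	}
	return len(reasons) == 0, reasons
}

func main() {
	fits, reasons := checkNodeCondition(nodeConditions{Ready: false, OutOfDisk: true})
	fmt.Println(fits, reasons)
}
```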
Automatic merge from submit-queue (batch tested with PRs 49847, 49743, 49853, 50225, 50479)
Add node benchmark tests for cos-m60 with docker 1.12.6
Ref: https://github.com/kubernetes/kubernetes/issues/42926
This PR adds benchmark tests against cos-m60 with docker 1.12.6 on http://node-perf-dash.k8s.io. These tests are useful for docker validation: we can compare the performance of different docker versions on the same OS.
cos-m60 comes with docker 1.13.1 by default, so we need to use cloud-init to downgrade the version to 1.12.6.
**Release note**:
```
None
```
/assign @dchen1107
Automatic merge from submit-queue (batch tested with PRs 50485, 49951, 50508, 50511, 50506)
Pass config to external Kubemark cluster in e2e tests
When cluster autoscaler is used in kubemark tests,
pass default kubeconfig as external cluster config.
@shyamjvs @gmarek
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50485, 49951, 50508, 50511, 50506)
fix a typo
**What this PR does / why we need it**:
fix a small typo
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
verions->versions
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50485, 49951, 50508, 50511, 50506)
Multiarch nonewprivs test image
**What this PR does / why we need it**:
This PR converts the nonewprivs image, which was pushed very recently as part of https://github.com/kubernetes/kubernetes/pull/47019, to a multiarch image.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#50498
**Special notes for your reviewer**:
**Release note**:
```NONE```
Automatic merge from submit-queue (batch tested with PRs 50537, 49699, 50160, 49025, 50205)
When not using a CloudProvider, set both InternalIP and ExternalIP on Nodes
#36095 changed all of the cloudproviders to set both InternalIP and ExternalIP on Nodes, but the non-cloudprovider fallback code now only sets InternalIP.
This causes the test "should be able to create a functioning NodePort service" in test/e2e/service.go to fail on cloud-provider-less clusters, because (with LegacyHostIP gone), it now will only try to work with ExternalIPs, and will fail if the node has only an InternalIP.
There isn't much other code that assumes that ExternalIP will always be set (there's something in pkg/master/master.go, but I don't know what it's doing, so maybe it's only useful in the case where InternalIP != ExternalIP anyway). But given that several of the cloudproviders (mesos, ovirt, rackspace) now explicitly set both InternalIP and ExternalIP to the same value always, it seemed right to do that in the fallback case too.
@deads2k FYI
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
code format in master_utils.go
**What this PR does / why we need it**:
code format
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #N/A
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
move logs to kubectl/util
Move `pkg/util/logs` to `pkg/kubectl/util/logs` per https://github.com/kubernetes/kubernetes/issues/48209#issuecomment-311730681
This will make kubeadm, kubefed, gke-certificates-controller and e2e have dependency on kubectl, which should be fine.
partially addresses: kubernetes/community#598
```release-note
NONE
```
/assign @apelisse @monopole
Automatic merge from submit-queue
Remove deprecated ESIPP beta annotations
**What this PR does / why we need it**:
Remove deprecated ESIPP beta annotations.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50187
**Special notes for your reviewer**:
/assign @MrHohn
/sig network
**Release note**:
```release-note
Beta annotations `service.beta.kubernetes.io/external-traffic` and `service.beta.kubernetes.io/healthcheck-nodeport` have been removed. Please use fields `service.spec.externalTrafficPolicy` and `service.spec.healthCheckNodePort` instead.
```
Automatic merge from submit-queue
Migrate to controller references helpers in meta/v1
**What this PR does / why we need it**:
This is a follow up for #48319 that migrates all method usages to new methods in meta/v1.
**Special notes for your reviewer**:
Looking at each commit individually might be easier.
**Release note**:
```release-note
NONE
```
/sig api-machinery
/kind cleanup
Automatic merge from submit-queue
Add Cluster Autoscaler scalability test suite
This suite is intended for manually testing Cluster Autoscaler on large clusters. It isn't supposed to be run automatically (at least for now).
It can be run on Kubemark (with #50440) with the following setup:
- start Kubemark with NUM_NODES=1 (as we require there to be exactly 1 replica per hollow-node replication controller in this setup)
- set kubemark-master machine type manually to appropriate type for the Kubemark cluster size. Maximum Kubemark cluster size reached in test run is defined by maxNodes constant, so for maxNodes=1000, please upgrade to n1-standard-32. Adjust if modifying maxNodes.
- start Cluster Autoscaler pod in the external cluster using image built from version with Kubemark cloud provider (release pending)
- for grabbing metrics from ClusterAutoscaler (with #50382), add "--include-cluster-autoscaler=true" parameter in addition to regular flags for gathering components' metrics/resource usage during e2e tests
cc @bskiba
Automatic merge from submit-queue (batch tested with PRs 45186, 50440)
Add functionality needed by Cluster Autoscaler to Kubemark Provider.
Make adding nodes asynchronous. Add method for getting target
size of node group. Add method for getting node group for node.
Factor out some common code.
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45186, 50440)
Retry fed-svc creation on diff NodePort during e2e tests
**What this PR does / why we need it**:
Currently, in federated end-to-end tests, services are created with a
randomly selected NodePort, which causes e2e test flakes when the creation
of a federated service fails because the port is not available.
Now the util.CreateService(...) function retries creating the
service on a different NodePort in case of error. It retries until it
succeeds or until all possible NodePorts have been tried and have failed.
**Which issue this PR fixes**
fixes#44018
Automatic merge from submit-queue (batch tested with PRs 50386, 50374, 50444, 50382)
Add grabbing Cluster Autoscaler metrics in e2e tests
This adds:
- collecting metrics from Cluster Autoscaler before & after e2e test run
- --include-cluster-autoscaler opt-in flag
- passing external cluster client to MetricsGrabber (required for Kubemark setup, as Cluster Autoscaler doesn't run on master in this case)
Most types now have valid rest mappings because
NewDefaultRESTMapperFromScheme no longer ignores certain import
paths. Thus we can no longer use the lack of a valid REST mapping
as an indicator for when to use kindWhiteList. Thus kindWhiteList
now serves as a whitelist for all kinds and not just those that
formerly had no mapping. This does mean that we could whitelist
kinds due to a name conflict, but that is unlikely as names such as
GetOptions are not appropriate for new objects.
Signed-off-by: Monis Khan <mkhan@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 49725, 50367, 50391, 48857, 50181)
Add e2e test for privileged containers
**What this PR does / why we need it**:
This PR adds node e2e test for privileged containers.
**Which issue this PR fixes**
Part of #44118.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 49642, 50335, 50390, 49283, 46582)
Improve GC discovery sync performance
Improve GC discovery sync performance by only syncing when discovered
resource diffs are detected. Before, the GC worker pool was shut down
and monitors resynced unconditionally every sync period, leading to
significant processing delays causing test flakes where otherwise
reasonable GC timeouts were being exceeded.
Related to https://github.com/kubernetes/kubernetes/issues/49966.
/cc @kubernetes/sig-api-machinery-bugs
```release-note
NONE
```
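The change can be pictured with a small sketch that resyncs only when the discovered resource set differs from the last known one. The `syncer` and `resourceSet` types are hypothetical simplifications, not the garbage collector's actual data structures:

```go
package main

import (
	"fmt"
	"reflect"
)

// resourceSet is a hypothetical set of discovered resource names.
type resourceSet map[string]struct{}

// syncer remembers the last discovered set and counts real resyncs.
type syncer struct {
	known   resourceSet
	resyncs int
}

// sync resyncs only when the discovered set differs from the known one and
// reports whether a resync happened.
func (s *syncer) sync(discovered resourceSet) bool {
	if reflect.DeepEqual(s.known, discovered) {
		return false // nothing changed: leave workers and monitors running
	}
	s.known = discovered
	s.resyncs++
	return true
}

func main() {
	s := &syncer{}
	fmt.Println(s.sync(resourceSet{"pods": {}}))
	fmt.Println(s.sync(resourceSet{"pods": {}}))
	fmt.Println(s.sync(resourceSet{"pods": {}, "jobs": {}}))
}
```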
Automatic merge from submit-queue (batch tested with PRs 49642, 50335, 50390, 49283, 46582)
Add rbac.authorization.k8s.io/v1
xref https://github.com/kubernetes/features/issues/2
Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes
```release-note
The `rbac.authorization.k8s.io/v1beta1` API has been promoted to `rbac.authorization.k8s.io/v1` with no changes.
The `rbac.authorization.k8s.io/v1alpha1` version is deprecated and will be removed in a future release.
```
Automatic merge from submit-queue (batch tested with PRs 50300, 50328, 50368, 50370, 50372)
Reduce hollow-kubelet cpu request
Fixes https://github.com/kubernetes/kubernetes/issues/50366
This should make kubemark-500 fit in 6 nodes again. Checked that it should be enough.
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue (batch tested with PRs 50418, 49830, 49206, 49061, 49912)
add LocalZone into gce.conf and refactor gce cloud provider configura…
The main goal of this PR is to make gce cloud provider able to run locally.
1. added a LocalZone parameter into gce.conf.
2. refactor `newGCECloud` to avoid contacting metadata server if configuration is already available.
```release-note
None
```
Automatic merge from submit-queue
remove apps/v1beta2 defaulting codes for obj.Spec.Selector and obj.Labels
**What this PR does / why we need it**:
This PR removes the defaulting code for `obj.Spec.Selector`. Currently, `obj.Spec.Selector.MatchLabels` is set to `obj.Spec.Template.Labels` if `obj.Spec.Template.Labels != nil && obj.Spec.Selector == nil`. We should not perform this defaulting operation, as controller selectors are immutable.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50339
**Special notes for your reviewer**:
This PR removes the defaulting code for `apps/v1beta2` only. The defaulting code for validation will be removed in another PR.
**Release note**:
```NONE
```
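For reference, the defaulting rule that this PR removes from `apps/v1beta2` can be sketched as below. The `spec` and `labelSelector` types are simplified, hypothetical stand-ins for the API types, and `defaultSelector` only mirrors the condition quoted in the description:

```go
package main

import "fmt"

// labelSelector and spec are simplified stand-ins for the API types.
type labelSelector struct {
	MatchLabels map[string]string
}

type spec struct {
	Selector       *labelSelector
	TemplateLabels map[string]string
}

// defaultSelector applies the old rule: when no selector is given, derive one
// from the template labels.
func defaultSelector(s *spec) {
	if s.TemplateLabels != nil && s.Selector == nil {
		s.Selector = &labelSelector{MatchLabels: s.TemplateLabels}
	}
}

func main() {
	s := &spec{TemplateLabels: map[string]string{"app": "web"}}
	defaultSelector(s)
	fmt.Println(s.Selector.MatchLabels)
}
```

Without this defaulting, a client of `apps/v1beta2` has to set the selector explicitly.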
Automatic merge from submit-queue
VSphere cloud provider code refactoring
The current PR tracks the vSphere Cloud Provider code refactoring which includes the following changes.
- VCLib Package - A framework used by vSphere cloud provider for managing the vSphere entities. VCLib package mainly does the following:
- Volume management on datastore (Create/Delete)
- Volume management on Virtual Machines (Attach/Detach)
- Storage Policy Management
- vSphere Cloud Provider changes to implement the cloud provider interfaces by calling into VCLib package.
- Modifications to e2e tests to accommodate the latest design changes.
@divyenpatel @rohitjogvmw @luomiao
```release-note
vSphere cloud provider: vSphere cloud provider code refactoring
```
Automatic merge from submit-queue (batch tested with PRs 50016, 49583, 49930, 46254, 50337)
Alpha Dynamic Kubelet Configuration
Feature: https://github.com/kubernetes/features/issues/281
This proposal contains the alpha implementation of the Dynamic Kubelet Configuration feature proposed in ~#29459~ [community/contributors/design-proposals/dynamic-kubelet-configuration.md](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/dynamic-kubelet-configuration.md).
Please note:
- ~The proposal doc is not yet up to date with this implementation, there are some subtle differences and some more significant ones. I will update the proposal doc to match by tomorrow afternoon.~
- ~This obviously needs more tests. I plan to write several O(soon). Since it's alpha and feature-gated, I'm decoupling this review from the review of the tests.~ I've beefed up the unit tests, though there is still plenty of testing to be done.
- ~I'm temporarily holding off on updating the generated docs, api specs, etc, for the sake of my reviewers 😄~ these files now live in a separate commit; the first commit is the one to review.
/cc @dchen1107 @vishh @bgrant0607 @thockin @derekwaynecarr
```release-note
Adds (alpha feature) the ability to dynamically configure Kubelets by enabling the DynamicKubeletConfig feature gate, posting a ConfigMap to the API server, and setting the spec.configSource field on Node objects. See the proposal at https://github.com/kubernetes/community/blob/master/contributors/design-proposals/dynamic-kubelet-configuration.md for details.
```
Automatic merge from submit-queue (batch tested with PRs 50016, 49583, 49930, 46254, 50337)
Remove scheduledjobs
This is a prerequisite for promoting CronJobs to beta.
**Release note**:
```release-note
Remove deprecated ScheduledJobs endpoints, use CronJobs instead.
```
Automatic merge from submit-queue (batch tested with PRs 50016, 49583, 49930, 46254, 50337)
[Federation] Make the hpa scale time window configurable
This PR is on top of open pr https://github.com/kubernetes/kubernetes/pull/45993.
Please review only the last commit in this PR.
This adds a config param to controller manager, the value of which gets passed to hpa adapter via sync controller.
This is needed to reduce the HPA scaling window to well below the default of 2 minutes, so that e2e tests run faster. Please see the comment on the newly added parameter.
**Special notes for your reviewer**:
@kubernetes/sig-federation-pr-reviews
@quinton-hoole
@marun to please validate the mechanism used to pass a parameter from cmd line to adapter.
**Release note**:
```
federation-controller-manager gets a new flag --hpa-scale-forbidden-window.
This flag is used to configure the duration used by federation hpa controller to determine if it can move max and/or min replicas
around (or not), of a cluster local hpa object, by comparing current time with the last scaled time of that cluster local hpa.
Lower value will result in faster response to scalability conditions achieved by cluster local hpas on local replicas, but too low
a value can result in thrashing. Higher values will result in slower response to scalability conditions on local replicas.
```
Pods associated with the test JobTemplate should use a zero
TerminationGracePeriodSeconds to ensure they're deleted immediately.
This should improve test timing assumption consistency.
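As a sketch, the field is a pointer, so an explicit zero has to be set through a variable. `podSpec` here is a simplified stand-in for the real type, and `deleteImmediately` is a hypothetical helper:

```go
package main

import "fmt"

// podSpec is a simplified stand-in that keeps only the field of interest.
type podSpec struct {
	TerminationGracePeriodSeconds *int64
}

// deleteImmediately sets an explicit zero grace period. A nil pointer would
// instead leave the server-side default in place.
func deleteImmediately(s *podSpec) {
	zero := int64(0)
	s.TerminationGracePeriodSeconds = &zero
}

func main() {
	s := &podSpec{}
	deleteImmediately(s)
	fmt.Println(*s.TerminationGracePeriodSeconds)
}
```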
Automatic merge from submit-queue
Support exec/attach/portforward in `kubectl proxy`
Use the UpgradeAwareProxy shared code in kubectl proxy. Provide a separate transport for those requests that does not have HTTP/2 enabled. Refactor the code to be a bit cleaner in places and to better separate changes.
Fixes#32026
```release-note
`kubectl proxy` will now correctly handle the `exec`, `attach`, and `portforward` commands. You must pass `--disable-filter` to the command in order to allow these endpoints.
```
Automatic merge from submit-queue (batch tested with PRs 50173, 50324, 50288, 50263, 50333)
Add blank import for node tests
The node tests weren't being run because they weren't imported in the test/e2e/e2e_test.go file.
Thanks to @abgworrall for sounding the alarm (he noticed [sig-node] wasn't in the test results)!
/assign @yujuhong
/cc @abgworrall
Automatic merge from submit-queue
Fix local storage test failures
**What this PR does / why we need it**:
Fixed a few issues:
- CI environment on GCE cannot resolve node names, need to use IPs. Use a different SSH wrapper that will get the IPs from the node object.
- Use hostdir instead of containerdir now that commands are executed directly on the host, instead of through a container.
- Get the PVC object again after it is bound so that it has the PV name.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50128
**Release note**:
NONE
/release-note-none
/sig storage
Automatic merge from submit-queue
Add waitForFailure for e2e test framework
**What this PR does / why we need it**:
Add waitForFailure to the e2e test framework. This could reduce the reliance on logs.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #44118. Refer https://github.com/kubernetes/kubernetes/pull/48858#discussion_r128331726
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Deprecate Deployment .spec.rollbackTo field
~Depends on #48746~ (merged)
xref: #46934, #49135
1. Deprecate Deployment field `.spec.rollbackTo` in `extensions/v1beta1` and `apps/v1beta1`, and remove the same field and `/rollback` endpoint from `apps/v1beta2` Deployment.
1. Add an annotation `deprecated.deployment.rollback.to` in `apps/v1beta2` for conversion to/from other versions.
Note: `apps/v1beta2` is new in 1.8 (and WIP), so it is okay to make breaking changes to it.
```release-note
Deprecate Deployment .spec.rollbackTo field
```
Currently, in federated end-to-end tests, services are created with a
randomly selected NodePort. This causes e2e test flakes when the creation
of a federated service fails because the port is not available.
Now the util.CreateService(...) function retries creating the
service on a different NodePort when an error occurs. It retries until
it succeeds or until 10 creation retries with other random NodePorts have failed.
If the service has not been created properly on one of the
federated clusters, a cleanup of the Service shards is executed before
retrying the federated service creation.
fixes#44018
Automatic merge from submit-queue
Add a simple cloud provider for e2e tests on kubemark
**What this PR does / why we need it**:
Adds a simplified cloud provider for kubemark. This enables us to add and
remove nodes and operate on nodegroups while running tests on kubemark.
This is needed to run scalability tests for cluster autoscaler on kubemark.
See https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/kubemark_integration.md
**Release note**:
```
NONE
```
Automatic merge from submit-queue
Add e2e test for cronjob chained removal
This is a test proving that https://github.com/kubernetes/kubernetes/pull/44058 works with cronjobs. It will fail until the aforementioned PR merges.
@caesarxuchao ptal