Automatic merge from submit-queue (batch tested with PRs 49237, 49656, 49980, 49841, 49899)
certificate manager: close existing client conns once cert rotates
After the kubelet rotates its client cert, it will keep connections to the API server open indefinitely, causing it to use its old credentials instead of the new certs. Because the API server authenticates client certs at the time of the request, and not the handshake, this could cause the kubelet to start hitting auth failures even if it rotated its certificate to a new, valid one.
When the kubelet rotates its cert, close down existing connections to force a new TLS handshake.
Ref https://github.com/kubernetes/features/issues/266
Updates https://github.com/kubernetes-incubator/bootkube/pull/663
```release-note
After a kubelet rotates its client cert, it now closes its connections to the API server to force a handshake using the new cert. Previously, the kubelet could keep its existing connection open, even if the cert used for that connection was expired and rejected by the API server.
```
/cc @kubernetes/sig-auth-bugs
/assign @jcbsmpsn @mikedanese
Automatic merge from submit-queue (batch tested with PRs 49237, 49656, 49980, 49841, 49899)
[Bug Fix] Set NodeOODCondition to false
fixes#49839, which was introduced by #48846
This PR makes the kubelet set NodeOODCondition to false, so that the scheduler and other controllers do not consider the node to be unschedulable.
/assign @vishh
/sig node
/release-note-none
Automatic merge from submit-queue (batch tested with PRs 49237, 49656, 49980, 49841, 49899)
GC shouldn't send empty patch
The scope of the `if` statement was wrong, causing GC to sometimes send empty patch.
Found this bug while investigating https://github.com/kubernetes/kubernetes/issues/49966.
Automatic merge from submit-queue (batch tested with PRs 49237, 49656, 49980, 49841, 49899)
make admission tolerate object without objectmeta for errors
Not all object have ObjectMeta (see SARs for instance). Admission should tolerate this condition without giving meaningless errors.
@derekwaynecarr ptal
@php-coder fyi
Automatic merge from submit-queue (batch tested with PRs 49989, 49806, 49649, 49412, 49512)
This adds an etcd health check endpoint to kube-apiserver
addressing https://github.com/kubernetes/kubernetes/issues/48215.
**What this PR does / why we need it**:
This ensures kube-apiserver `/healthz` endpoint fails whenever connectivity cannot be established to etcd, also ensures the etcd preflight checks works with unix sockets
**Which issue this PR fixes**: fixes#48215
**Special notes for your reviewer**:
This PR does not use the etcd client directly as the client object is wrapped behind the storage interface and not exposed directly for use, so I decided to reuse what's being done in the preflight. So this will only check fail for connectivity and not etcd auth related problems. I did not write tests for the endpoint because I couldn't find examples that I could follow for writing tests for healthz related endpoints, I'll be willing to write those tests if someone can point me at a relevant one.
**Release note**:
```release-note
Add etcd connectivity endpoint to healthz
```
@deads2k please help review, thanks!
Automatic merge from submit-queue (batch tested with PRs 49989, 49806, 49649, 49412, 49512)
Use existing k8s binaries and images on disk when they are preloaded to gce cos image.
**What this PR does / why we need it**:
This change is to accelerate K8S startup time on gce when k8s tarballs and images are already preloaded in VM image, by skipping the downloading, extracting and file transfer steps.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49989, 49806, 49649, 49412, 49512)
Update repo-infra and rules_go Bazel workspace dependencies
**What this PR does / why we need it**: bumping the `repo-infra` dependency gets us https://github.com/kubernetes/repo-infra/pull/25, which hopefully fixes the `gsutil -m rsync` flakiness in the `pull-kubernetes-e2e-gce-bazel` job, and https://github.com/kubernetes/repo-infra/pull/26, which lets us bump the `rules_go` dependency.
Bumping the `rules_go` dependency fixes the build on bazel 0.5.3+, gives us race detector support, and probably a bunch of other features, too. It's also a prerequisite for switching to gazelle (#47558).
**Release note**:
```release-note
NONE
```
/assign @spxtr @mikedanese
Automatic merge from submit-queue
Log abridged set of rules at v2 in kube-proxy on error
**What this PR does / why we need it**:
this is a follow-on to https://github.com/kubernetes/kubernetes/pull/48085
**Special notes for your reviewer**:
we hit this in operations where we typically run in v2, and would like to log abridged set of output rather than full output.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add ubuntu to gluster and nfs tests
**What this PR does / why we need it**:
Enable gluster and nfs tests for ubuntu distro
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50039
**Special notes for your reviewer**:
**Release note**:
/release-note-none
/sig storage
Automatic merge from submit-queue
cleanup dead installer code
cleans up some installer code that was dead and reorders a little of the flow to reduce complexity.
@kubernetes/sig-api-machinery-misc
Automatic merge from submit-queue (batch tested with PRs 50029, 48517, 49739, 49866, 49782)
Update generated deepcopy code
**What this PR does / why we need it**:
In generated deepcopy code, the method names in comments do not match the real method names.
**Which issue this PR fixes**: fixes#49755
**Special notes for your reviewer**:
/assign @sttts @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50029, 48517, 49739, 49866, 49782)
fix spelling
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 50029, 48517, 49739, 49866, 49782)
Pod affinity test clean up as AffinitInAnnotation is removed.
**What this PR does / why we need it**:
These tests are already covered under "empty topologyKey" pod affinity test cases.
These test cases were added only to test the scenario when the AffinitInAnnotation
feature was disabled. Since AffinitInAnnotation is removed now, these test cases are
no longer needed as they are duplicate now.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
@kubernetes/sig-scheduling-misc @bsalamat
Automatic merge from submit-queue (batch tested with PRs 50029, 48517, 49739, 49866, 49782)
iptables_test should not run on OSX or Windows
**What this PR does / why we need it**:
Fix for failing tests. Let's just skip these on darwin and windows
platforms as iptables is not available on these.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#48509
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add cblecker to hack/ owners
This PR adds myself to the approvers list for `hack/`. I've been helping do a bunch of improvements here over the past few months, and want to continue doing so.
I solemnly promise to use approval powers for good, and only approve things that I know are safe and within my area of expertise!
```release-note
NONE
```
/cc @ixdy @sttts @fejta @spxtr
Automatic merge from submit-queue (batch tested with PRs 49990, 49997, 44278, 49936, 49891)
adds an admission plugin to the sample apiserver.
**What this PR does / why we need it**:
adds an admission plugin to the sample apiserver.
the admission plugin checks whether `Flunder.Name` is not on the banned list.
including a unit test with various test scenarios.
**Special notes for your reviewer**:
https://github.com/kubernetes/kubernetes/issues/47868
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49990, 49997, 44278, 49936, 49891)
Allow mode in e2e-framework to gather metrics only from master
This should enable getting metrics for our 5k-node clusters.
cc @kubernetes/sig-scalability-misc @gmarek
Automatic merge from submit-queue (batch tested with PRs 49990, 49997, 44278, 49936, 49891)
Move ResourceQuota plugin at the end of the admission plugin chain.
@liggitt @derekwaynecarr
Automatic merge from submit-queue
Update submit-queue URL in README.md
**What this PR does / why we need it**:
- Updates the submit-queue URL in README.md from `e2e` to `ci` to match upstream changes.
- NOTE: The existing URL works for now because submit-queue is a single page app and we route to the same tab for both URLs client-side, but this is the correct URL now.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
- N/A
**Special notes for your reviewer**:
- [kubernetes-dev update](https://groups.google.com/forum/#!topic/kubernetes-dev/z3S47Zsq53Y)
- [test-infra PR with changes to submit-queue]( https://github.com/kubernetes/test-infra/pull/3765)
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
fix sample-apiserver apiservice.yaml to add groupPriorityMinimum
fix sample-apiserver apiservice.yaml example to add groupPriorityMinimum and versionPriority, which is added in v1.7
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Have a uniform format for filenames across controllers
**What this PR does / why we need it**:
Bring in uniformity in filename format across all the controllers. Now controllers are of the format
`<controllerName>_controller.go`
From
```
./pkg/controller/node/nodecontroller.go
./pkg/controller/route/routecontroller.go
./pkg/controller/service/servicecontroller.go
./pkg/controller/cloud/nodecontroller.go
./pkg/controller/ttl/ttlcontroller.go
./pkg/controller/job/jobcontroller.go
./pkg/controller/daemon/daemoncontroller.go
```
TO
```
./pkg/controller/node/node_controller.go
./pkg/controller/route/route_controller.go
./pkg/controller/service/service_controller.go
./pkg/controller/cloud/node_controller.go
./pkg/controller/ttl/ttl_controller.go
./pkg/controller/job/job_controller.go
./pkg/controller/daemon/daemon_controller.go
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
None
```
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886)
Set default vmodule flag in integration tests
Re-introduce a default glog vmodule flag to the integration test setup.
The default was removed in d08dfb9 because it was hard-coded and
prevented local override. This commit makes the default overridable.
```release-note
NONE
```
/cc @caesarxuchao
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886)
remove unused function
**What this PR does / why we need it**:
remove unused function which is not used months ago.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886)
Emit event and retry when fail to start healthz server on kube-proxy
**What this PR does / why we need it**: Enhance kube-proxy's logic when fail to start healthz server.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: From #49263.
**Special notes for your reviewer**:
/assign @thockin @nicksardo @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886)
Reintegrate aggregation support for OpenAPI
Reintegrating changes of #46734
Changes summary:
- Extracted all OpenAPI specs to new repo `kube-openapi`
- Make OpenAPI spec aggregator to copy and rename any non-requal model (even with documentation change only).
- Load specs when adding APIServices and retry on failure until successful spec retrieval or a 404.
- Assumes all Specs except aggregator's Spec are static
- A re-register of any APIService will result in updating the spec for that service (Suggestion for TPR: they should be registered to aggregator API Server, Open for discussion if any more changes needed for another PR.)
fixes#48548
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886)
Correctly handle empty watch event cache
Fixes https://github.com/kubernetes/kubernetes/issues/49956
Introduced by ada60236f7 which did not adjust the oldest available resourceVersion for an empty watch event cache.
Exposed by 74b9ba3b4d, which allowed controllers to get list results from etcd before the watch cache is ready (normally they list with resourceVersion=0 which serves the list request from the watch cache, blocking until it is ready)
When the watch cache had an empty cache of watch events, it currently allows establishing a watch as if it can deliver a watch event for its currently synced resourceVersion. This results in an off-by-one error which can result in a missed watch event.
Scenario:
bob:
1. creates object at resourceVersion=11
sally:
1. does a list API request, gets a list resourceVersion of 10 (just before bob creates the object)
2. starts watch handled by watch cache at resourceVersion=10
Watch cache:
1. initial list gets resourceVersion=11, including the item created by bob
2. when determining the initial watch events to send to sally's watch, there are no watch events in the cache, so no initial watch events are sent.
3. the cache listerwatcher watches etcd starting at resourceVersion=11, so future events are fed into the event cache and to sally's watch
The watch cache should have dropped sally's watch from resourceVersion=10 with a "gone" error, since it can't deliver the watch event for resourceVersion=11. This would force sally to relist (where she would get a list at resourceVersion=11) and rewatch (from resourceVersion=11)
This particularly affects tests that create CRD/TPRs and establish watches on the new types as the storage layer's watch cache is also populating for that type.
```release-note
Fix a bug in watch cache sometimes causing missing events after watch cache initialization.
```
Automatic merge from submit-queue
Add parallelism to GCE cluster upgrade
Fixes https://github.com/kubernetes/kubernetes/issues/48373
Should allow upgrading 500-node cluster (1.6->1.7) in < 1 hr. It currently takes ~1.5 day.
Though it is the duty of the upgrader to choose the right parallelism in order to avoid disrupting too many pods.
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews @kubernetes/sig-scalability-misc @mikedanese @gmarek
Automatic merge from submit-queue (batch tested with PRs 49871, 49422, 49092, 49858, 48999)
ScaleIO Volume Plugin - Volume attribute fixes and updates
**What this PR does / why we need it**:
This is a housekeeping PR for small enhancements and fixes to the ScaleIO volume plugin to address issues:
- Enforcement of fsGroup
- Enable ScaleIO multiple-instance volume mapping
- Tighter validation of PVC parameters
- Injection of default PVC capacity when omitted
- Better alignment of PVC, PV, and volume names for dynamic provisioning
**Special notes for your reviewer**:
**Release note**:
```release-note
Enforcement of fsGroup; enable ScaleIO multiple-instance volume mapping; default PVC capacity; alignment of PVC, PV, and volume names for dynamic provisioning
```