Automatic merge from submit-queue
Using API_HOST_IP to do apiserver health check.
In `hack/local-up-cluster.sh`, it's better to use `API_HOST_IP` to do apiserver health check.
Automatic merge from submit-queue
[Federation][kubefed] Add option to expose federation apiserver on nodeport service
**What this PR does / why we need it**:
This PR adds an option to kubefed to expose federation api server over nodeport. This can be useful to deploy federation in non-cloud environments. This PR is target to address #39271
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
[Federation] kubefed init learned a new flag, `--api-server-service-type`, that allows service type to be specified for the federation API server.
[Federation] kubefed init also learned a new flag, `--api-server-advertise-address`, that allows specifying advertise address for federation API server in case the service type is NodePort.
```
@kubernetes/sig-federation-misc @madhusudancs
Automatic merge from submit-queue
Add SIG to test owners
**What this PR does / why we need it**:
This PR adds a `sig` column to the test owners file generation script.
A problem experienced with the current owners file is that since members are auto-assigned there are times where tests are assigned to non-active users who don't follow up to notifications to fix flakes. By assigning a SIG to each test we can hold a group we know is active responsible for taking care of flakes it's less likely that flakes will fall through the cracks.
**Special notes for your reviewer**:
* A companion PR will go into *kubernetes/contrib* adding support for mungers parsing this new column.
* Another PR in contrib will add labeling GitHub flake issues with the appropriate SIG
* Currently SIGs are not labeled, this will be added in another PR where SIG determinations can be discussed
@saad-ali @pwittrock
Automatic merge from submit-queue (batch tested with PRs 40289, 40877, 40879, 39972, 40942)
Extract util used by jsonmergepatch and SMPatch
followup https://github.com/kubernetes/kubernetes/pull/40666#discussion_r99198931
Extract some util out of the `strategicMergePatch` to make `jsonMergePatch` doesn't depend on `strategicMergePatch`.
```release-note
None
```
cc: @liggitt
Automatic merge from submit-queue (batch tested with PRs 40289, 40877, 40879, 39972, 40942)
Rename experimental-cgroups-per-pod flag
**What this PR does / why we need it**:
1. Rename `experimental-cgroups-per-qos` to `cgroups-per-qos`
1. Update hack/local-up-cluster to match `CGROUP_DRIVER` with docker runtime if used.
**Special notes for your reviewer**:
We plan to roll this feature out in the upcoming release. Previous node e2e runs were running with this feature on by default. We will default this feature on for all e2es next week.
**Release note**:
```release-note
Rename --experiemental-cgroups-per-qos to --cgroups-per-qos
```
Automatic merge from submit-queue (batch tested with PRs 40289, 40877, 40879, 39972, 40942)
PV E2E: provide each spec with a fresh nfs host
**What this PR does / why we need it**:
PersistentVolume e2e currently reuses an NFS host pod created at the start of the suite and accessed by each test. This is far less favorable than using a fresh volume per test. Additionally, this guards against the volume host pod or it's kubelet being disrupted, which has led to flakes.
```release-note-none
```
Automatic merge from submit-queue (batch tested with PRs 40289, 40877, 40879, 39972, 40942)
Remove the temporary fix for pre-1.0 mirror pods
The fix was introduced to fix#15960 for pre-1.0 pods. It should be safe to remove
this fix now.
Automatic merge from submit-queue (batch tested with PRs 40906, 40924, 40938, 40902, 40911)
federation: Updating deletion helper to add both finalizers in a single update
Fixes https://github.com/kubernetes/kubernetes/issues/40837
cc @mwielgus @csbell
Automatic merge from submit-queue (batch tested with PRs 40906, 40924, 40938, 40902, 40911)
print apiserver log location on apiserver error
**What this PR does / why we need it**:
Improve user experience. Attempt to direct user to logs of failing component.
**Special notes for your reviewer**:
In addition to failure, point to logs so that a user can attempt to self remedy and have more information available to debug immediately. A user may not know that the failing component has logs.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40906, 40924, 40938, 40902, 40911)
Add [Flaky] tag to persistent volumes tests
**What this PR does / why we need it**:
Persistent Volume tests continue to flake in CI.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40906, 40924, 40938, 40902, 40911)
Check whether apiversions is empty
What this PR does / why we need it:
#39719 check whether apisversions get from /api is empty
Special notes for your reviewer:
@caesarxuchao
Automatic merge from submit-queue
Add an upgrade test for secrets.
**What this PR does / why we need it**: This PR adds an upgrade test for secrets. It creates a secret and makes sure that pods can consume it before an after an upgrade.
Automatic merge from submit-queue
CRI: Handle cri in-place upgrade
Fixes https://github.com/kubernetes/kubernetes/issues/40051.
## How does this PR restart/remove legacy containers/sandboxes?
With this PR, dockershim will convert and return legacy containers and infra containers as regular containers/sandboxes. Then we can rely on the SyncPod logic to stop the legacy containers/sandboxes, and the garbage collector to remove the legacy containers/sandboxes.
To forcibly trigger restart:
* For infra containers, we manually set `hostNetwork` to opposite value to trigger a restart (See [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L389))
* For application containers, they will be restarted with the infra container.
## How does this PR avoid extra overhead when there is no legacy container/sandbox?
For the lack of some labels, listing legacy containers needs extra `docker ps`. We should not introduce constant performance regression for legacy container cleanup. So we added the `legacyCleanupFlag`:
* In `ListContainers` and `ListPodSandbox`, only do extra `ListLegacyContainers` and `ListLegacyPodSandbox` when `legacyCleanupFlag` is `NotDone`.
* When dockershim starts, it will check whether there are legacy containers/sandboxes.
* If there are none, it will mark `legacyCleanupFlag` as `Done`.
* If there are any, it will leave `legacyCleanupFlag` as `NotDone`, and start a goroutine periodically check whether legacy cleanup is done.
This makes sure that there is overhead only when there are legacy containers/sandboxes not cleaned up yet.
## Caveats
* In-place upgrade will cause kubelet to restart all running containers.
* RestartNever container will not be restarted.
* Garbage collector sometimes keep the legacy containers for a long time if there aren't too many containers on the node. In that case, dockershim will keep performing extra `docker ps` which introduces overhead.
* Manually remove all legacy containers will fix this.
* Should we garbage collect legacy containers/sandboxes in dockershim by ourselves? /cc @yujuhong
* Host port will not be reclaimed for the lack of checkpoint for legacy sandboxes. https://github.com/kubernetes/kubernetes/pull/39903 /cc @freehan
/cc @yujuhong @feiskyer @dchen1107 @kubernetes/sig-node-api-reviews
**Release note**:
```release-note
We should mention the caveats of in-place upgrade in release note.
```
Automatic merge from submit-queue
Plumb subresource through subjectaccessreview
plumb all fields for subjectaccessreview into the resulting `authorizer.AttributesRecord`
```release-note
The SubjectAccessReview API passes subresource and resource name information to the authorizer to answer authorization queries.
```
Automatic merge from submit-queue
examples: PV docs clarify Azure storage account restriction
**What this PR does / why we need it**: One line doc fix, clarifies a constraint for using `AzureDisk` volumes.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#40276
**Special notes for your reviewer**: None
**Release note**:
```release-note
NONE
```
cc: @rootfs @otaviosoares
Automatic merge from submit-queue
GroupMetaFactoryArgs documentation
**What this PR does / why we need it**:
Documentation for people writing new API-Groups.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: documentation
**Special notes for your reviewer**:
@deads2k @pmorie my thoughts from writing the service-catalog apiserver.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Tidy up the main README.
Removed the coveralls link since it hasn't been updated in a few years. Made some punctuation more consistent.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Optionally avoid evicting critical pods in kubelet
For #40573
```release-note
When feature gate "ExperimentalCriticalPodAnnotation" is set, Kubelet will avoid evicting pods in "kube-system" namespace that contains a special annotation - `scheduler.alpha.kubernetes.io/critical-pod`
This feature should be used in conjunction with the rescheduler to guarantee availability for critical system pods - https://kubernetes.io/docs/admin/rescheduler/
```
Automatic merge from submit-queue (batch tested with PRs 40696, 39914, 40374)
Convert hack/e2e.go to a test-infra/kubetest shim
Replaces `hack/e2e.go` for a shim that passes the args to `k8s.io/test-infra/kubetest`
Adds fejta to `hack/OWNERS`
Adds `e2e_test.go` for unit test coverage of the shim.
`Usage: go run hack/e2e.go [--get=true] [--old=1d] -- KUBETEST_ARGS`
In other words there is are `--get` and `--old` shim flags, which control how we upgrade `kubetest`, and a `--` to separate the shim args from the kubetest args, and the existing kubetest args like `--down` `--up`, etc. If only `KUBETEST_ARGS` are used then you can skip the `--` (although golang will complain about it).
Once this is ready to go I will update the kubekins-e2e image to copy this file from test-infra: https://github.com/kubernetes/test-infra/blob/master/jenkins/e2e-image/Dockerfile#L70
ref https://github.com/kubernetes/test-infra/issues/1475
Automatic merge from submit-queue (batch tested with PRs 40696, 39914, 40374)
Forgiveness library changes
**What this PR does / why we need it**:
Splited from #34825, contains library changes that are needed to implement forgiveness:
1. ~~make taints-tolerations matching respect timestamps, so that one toleration can just tolerate a taint for only a period of time.~~ As TaintManager is caching taints and observing taint changes, time-based checking is now outside the library (in TaintManager). see #40355.
2. make tolerations respect wildcard key.
3. add/refresh some related functions to wrap taints-tolerations operation.
**Which issue this PR fixes**:
Related issue: #1574
Related PR: #34825, #39469
~~Please note that the first 2 commits in this PR come from #39469 .~~
**Special notes for your reviewer**:
~~Since currently we have `pkg/api/helpers.go` and `pkg/api/v1/helpers.go`, there are some duplicated periods of code laying in these two files.~~
~~Ideally we should move taints-tolerations related functions into a separate package (pkg/util/taints), and make it a unified set of implementations. But I'd just suggest to do it in a follow-up PR after Forgiveness ones done, in case of feature Forgiveness getting blocked to long.~~
**Release note**:
```release-note
make tolerations respect wildcard key
```
Automatic merge from submit-queue (batch tested with PRs 40696, 39914, 40374)
Cleanup scheduler server with an external config class
**What this PR does / why we need it**:
Some cleanup in cmd/server so that the parts which setup scheduler configuration are stored and separately tested.
- additionally a simple unit test to check that erroneous configs return a non-nil error is included.
- it also will make sure we avoid nil panics of schedulerConfiguration is misconfigured.
Automatic merge from submit-queue (batch tested with PRs 40862, 40909)
Remove apimachinery from staging client-go/Godeps/Godeps.json
The publishing robot will add the latest version of apimachinery to Godeps.json.
This is part of the effort to allow update staging apimachinery and staging client-go in a same PR.
The robot change is here: https://github.com/kubernetes/test-infra/pull/1784
@deads2k @stts @lavalamp
Automatic merge from submit-queue (batch tested with PRs 40862, 40909)
[Federation][kubefed] Add option to disable persistence storage for etcd
**What this PR does / why we need it**:
This is part of updates to enable deployment of federation on non-cloud environments. This pr enables disabling persistent storage for etcd via kubefed.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#40617
**Special notes for your reviewer**:
**Release note**:
```
[Federation] Add --etcd-persistent-storage flag to kubefed to enable/disable persistent storage for etcd
```
cc: @kubernetes/sig-federation-bugs @madhusudancs
Automatic merge from submit-queue (batch tested with PRs 40795, 40863)
Use caching secret manager in kubelet
I just found that this is in my local branch I'm using for testing, but not in master :)