Automatic merge from submit-queue (batch tested with PRs 45514, 45635)
hyperkube_test should not depend on number of spaces.
From #45524.
Apparently adding a long flag to kube-controller-manager breaks the hyperkube unit tests, because they depend on number of spaces :)
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45514, 45635)
refactor certificate controller to break it into two parts
Break pkg/controller/certificates into:
* pkg/controller/certificates/approver: containing the group approver
* pkg/controller/certificates/signer: containing the local signer
* pkg/controller/certificates: containing shared infrastructure
```release-note
Break the 'certificatesigningrequests' controller into a 'csrapprover' controller and 'csrsigner' controller.
```
Automatic merge from submit-queue (batch tested with PRs 42042, 46139, 46126, 46258, 46312)
Remove unused test properties
Issue: #42676
A separate serial memcg suite was created for the initial stages of re-enabling memcg notifications. Now that all e2e tests have memcg notifications enabled, this suite is no longer needed.
Automatic merge from submit-queue (batch tested with PRs 42042, 46139, 46126, 46258, 46312)
Append X-Forwarded-For in proxy handler
Append the request sender's IP to the `X-Forwarded-For` header chain when proxying requests. This is important for audit logging (https://github.com/kubernetes/features/issues/22) in order to capture the client IP (specifically in the case of federation or kube-aggregator).
/cc @liggitt @deads2k @ericchiang @ihmccreery @soltysh
Automatic merge from submit-queue (batch tested with PRs 42042, 46139, 46126, 46258, 46312)
Remove kubectl's dependence on pkg/api/helper
**What this PR does / why we need it**:
Remove kubectl's dependence on pkg/api/helper, as part of
broader effort to isolate kubectl from the rest of k8s.
In this case, the code becomes private to kubectl; nobody else uses it.
**Which issue this PR fixes**
Part of a series of PRs to address kubernetes/community#598
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42042, 46139, 46126, 46258, 46312)
[Federation] Use service accounts instead of the user's credentials when accessing joined clusters' API servers.
Fixes#41267.
Release notes:
```release-note
Modifies kubefed to create and the federation controller manager to use credentials associated with a service account rather than the user's credentials.
```
Automatic merge from submit-queue
Fix some typo of comment in kubelet.go
**What this PR does / why we need it**:
The PR is to fix some typo in kubelet.go
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
N/A
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
remove init blocks from all admission plugins
**What this PR does / why we need it**:
removes init blocks from all admission plugins
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
check flag format in file known-flags.txt
All flags in file hack/verify-flags/known-flags.txt should contain
character -, this change check it to prevent adding useless flags
to known-flags.txt
ref #45948
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
Use storage instead of REST for the CRD finalizer
**What this PR does / why we need it**:
Switch the custom resource definition finalizer controller to use
storage instead of a REST client, because a client could incorrectly try
to delete ThirdPartyResources whose names happen to collide with the
CustomResource instances.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
Chaosmonkey - Signal stop to tests and wait for done when disruption fails
**What this PR does / why we need it**:
Prevents tests from leaking resources because their Teardown was never called when test disruption fails.
**Which issue this PR fixes**
First problem of #45842
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
GC: update required verbs for deletable resources, allow list of ignored resources to be customized
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
Also allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
cc @caesarxuchao @deads2k @smarterclayton @mfojtik @liggitt @sttts @kubernetes/sig-api-machinery-pr-reviews
Automatic merge from submit-queue
Allow the /logs handler on the apiserver to be toggled.
Adds a flag to kube-apiserver, and plumbs through en environment variable in configure-helper.sh
Automatic merge from submit-queue
Double `StopContainer` request timeout.
Doubled `StopContainer` request timeout to leave some time for `SIGKILL` container.
@yujuhong @feiskyer
Automatic merge from submit-queue
removing generic_scheduler todo after discussion (#46027)
**What this PR does / why we need it**:
**Which issue this PR fixes** #46027
**Special notes for your reviewer**: just a quick clean cc @wojtek-t
**Release note**:
```release-note
```
Switch the custom resource definition finalizer controller to use
storage instead of a REST client, because a client could incorrectly try
to delete ThirdPartyResources whose names happen to collide with the
CustomResource instances.
Automatic merge from submit-queue
Enable "kick the tires" support for Nvidia GPUs in COS
This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User space libraries and debug utilities from the Nvidia driver installation are made available on the host in a special directory on the host -
* `/home/kubernetes/bin/nvidia/lib` for libraries
* `/home/kubernetes/bin/nvidia/bin` for debug utilities
Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.
Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.
This PR updates the COS base image version to m59. This is coupled with this PR for the following reasons:
1. Driver installation requires disabling a kernel feature in COS.
2. The kernel API for disabling this interface changed across COS versions
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post `3` PRs, one each for adding the basic installer, updating COS to m59, and then updating the installer again, this PR combines all the changes to reduce review overhead and latency, and additional noise that will be created when GPU tests break.
**Try out this PR**
1. Get Quota for GPUs in any region
2. `export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.
**Another option is to run a e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up`
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.
TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)
fix typo in kubelet
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)
PDB Max Unavailable Field
Completes https://github.com/kubernetes/features/issues/285
```release-note
Adds a MaxUnavailable field to PodDisruptionBudget
```
Individual commits are self-contained; Last commit can be ignored because it is autogenerated code.
cc @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews
Allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)
Scheduler should use a shared informer, and fix broken watch behavior for cached watches
Can be used either from a true shared informer or a local shared
informer created just for the scheduler.
Fixes a bug in the cache watcher where we were returning the "current" object from a watch event, not the historic event. This means that we broke behavior when introducing the watch cache. This may have API implications for filtering watch consumers - but on the other hand, it prevents clients filtering from seeing objects outside of their watch correctly, which can lead to other subtle bugs.
```release-note
The behavior of some watch calls to the server when filtering on fields was incorrect. If watching objects with a filter, when an update was made that no longer matched the filter a DELETE event was correctly sent. However, the object that was returned by that delete was not the (correct) version before the update, but instead, the newer version. That meant the new object was not matched by the filter. This was a regression from behavior between cached watches on the server side and uncached watches, and thus broke downstream API clients.
```
Automatic merge from submit-queue
tighten and simplify owners in some staging repos
With the move to staging, we can have much cleaner owners across the related packages. This pares down the list of OWNERS to better match for code and activity. It should help get PRs directed to people more active and familiar with the areas for quicker review.
@kubernetes/sig-api-machinery-misc
@lavalamp @smarterclayton ptal.
Automatic merge from submit-queue (batch tested with PRs 46060, 46234)
Speedup generating selflinks for list and watch requests
I've seen profiles, where GenerateSelflink was 8-9% of whole cpu usage of apiserver (profiles over 30s). Most of this where spent in getting RequestInfo from the context and creating the context.
This PR changes the API of the GenerateLink method of the namer which results in computing the context and requestInfo only once per LIST/WATCH request (instead of computing it for every single returned element of LIST/WATCH).
@smarterclayton @deads2k - can one of you please take a look?
Automatic merge from submit-queue
Add `auto_unmount` mount option for glusterfs fuse mount.
libfuse has an auto_unmount option which, if enabled, ensures that
the file system is unmounted at FUSE server termination by running a
separate monitor process that performs the unmount when that occurs.
(This feature would probably better be called "robust auto-unmount",
as FUSE servers usually do try to unmount their file systems upon
termination, it's just this mechanism is not crash resilient.)
This change implements that option and behavior for glusterfs.
This option will be only supported for clients with version >3.11.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Automatic merge from submit-queue
vSphere storage policy support for dynamic volume provisioning
Till now, vSphere cloud provider provides support to configure persistent volume with VSAN storage capabilities - kubernetes#42974. Right now this only works with VSAN.
Also there might be other use cases:
- The user might need a way to configure a policy on other datastores like VMFS, NFS etc.
- Use Storage IO control, VMCrypt policies for a persistent disk.
We can achieve about 2 use cases by using existing storage policies which are already created on vCenter using the Storage Policy Based Management service. The user will specify the SPBM policy ID as part of dynamic provisioning
- resultant persistent volume will have the policy configured with it.
- The persistent volume will be created on the compatible datastore that satisfies the storage policy requirements.
- If there are multiple compatible datastores, the datastore with the max free space would be chosen by default.
- If the user specifies the datastore along with the storage policy ID, the volume will created on this datastore if its compatible. In case if the user specified datastore is incompatible, it would error out the reasons for incompatibility to the user.
- Also, the user will be able to see the associations of persistent volume object with the policy on the vCenter once the volume is attached to the node.
For instance in the below example, the volume will created on a compatible datastore with max free space that satisfies the "Gold" storage policy requirements.
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
diskformat: zeroedthick
storagepolicyName: Gold
```
For instance in the below example, the vSphere CP checks if "VSANDatastore" is compatible with "Gold" storage policy requirements. If yes, volume will be provisioned on "VSANDatastore" else it will error that "VSANDatastore" is not compatible with the exact reason for failure.
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
diskformat: zeroedthick
storagepolicyName: Gold
datastore: VSANDatastore
```
As a part of this change, 4 commits have been added to this PR.
1. Vendor changes for vmware/govmomi
2. Changes to the VsphereVirtualDiskVolumeSource in the Kubernetes API. Added 2 additional fields StoragePolicyName, StoragePolicyID
3. Swagger and Open spec API changes.
4. vSphere Cloud Provider changes to implement the storage policy support.
**Release note**:
```release-note
vSphere cloud provider: vSphere Storage policy Support for dynamic volume provisioning
```
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)
kubectl: fix deprecation warning bug
**What this PR does / why we need it**:
Some kubectl commands were deprecated but would fail to print the
correct warning message when a flag was given before the command name.
# Correctly prints the warning that "resize" is deprecated and
# "scale" is now preferred.
kubectl resize [...]
# Should print the same warning but no warning is printed.
kubectl --v=1 resize [...]
This was due to a fragile check on os.Args[1].
This commit implements a new function deprecatedCmd() that is used to
construct new "passthrough" commands which are marked as deprecated and
hidden.
Note that there is an existing "filters" system that may be preferable
to the system created in this commit. I'm not sure why the "filters"
array was not used for all deprecated commands in the first place.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)
[Federation][kubefed]: Add support for etcd image override
This PR adds support for overriding the default etcd image used by ``kubefed init`` by providing an argument to ``--etcd-image``. This is primarily intended to allow consumers like openshift to provide a different default, but as a nice side-effect supports code-free validation of non-default etcd images.
**Release note**:
```release-note
'kubefed init' now supports overriding the default etcd image name with the --etcd-image parameter.
```
cc: @kubernetes/sig-federation-pr-reviews