Automatic merge from submit-queue (batch tested with PRs 52264, 51870)
Use credentials from providers for docker sandbox image
**What this PR does / why we need it**:
Sandbox image lookup uses creds from docker config only; other credential providers are ignored. This is a regression introduced in dockershim.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51293
**Special notes for your reviewer**:
Should also cherry-pick this to release-1.6 and release-1.7.
**Release note**:
```release-note
Fix credentials providers for docker sandbox image.
```
Automatic merge from submit-queue (batch tested with PRs 52227, 52120)
Use COS for nodes in testing clusters by default, and bump COS.
Addresses part of issue #51487. May assist with #51961 and #50695.
CVM is being deprecated, and falls out of support on 2017/10/01. We shouldn't run test jobs on it. So start using COS for all test jobs.
The default value of `KUBE_NODE_OS_DISTRIBUTION` for clusters created for testing will now be gci. Testjobs that do not specify this value will now run on clusters using COS (aka GCI) as the node OS, instead of CVM, the previous default.
This change only affects testing; non-testing clusters already use COS by default.
In addition, bump the version of COS from `cos-stable-60-9592-84-0` to `cos-stable-60-9592-90-0`.
```release-note
NONE
```
/cc @yujuhong, @mtaufen, @fejta, @krzyzacy
Automatic merge from submit-queue (batch tested with PRs 52227, 52120)
Fix discovery restmapper finding resources in non-preferred versions
Fixes: #52219
Also reverts behavioral changes to tests that version-qualified cronjobs to work around this issue.
The discovery rest mapper was only populating the priority rest mapper's search list with preferred groupversions.
That meant that if a resource existed in multiple non-preferred versions, AND did not exist in the preferred version (like cronjob, which only exists in v1beta2.batch and v2alpha1.batch, but not v1.batch), the priority restmapper would not find it in its group/version priority list, and would return an error.
```release-note
Fixed an issue looking up cronjobs when they existed in more than one API version
```
Automatic merge from submit-queue
Restore OWNERS file for k8s.io/metrics
The owners file for k8s.io/metrics somehow got lost. This restores it
to its contents on the "legacy" branch of k8s.io/metrics.
```release-note
NONE
```
Automatic merge from submit-queue
newline to separate unimplemented TaintEffectNoScheduleNoAdmit
**What this PR does / why we need it**:
Unimplemented `TaintEffectNoScheduleNoAdmit ` should not be treated as comments of `TaintEffectNoExecute `
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
xref #49530
**Special notes for your reviewer**:
/assign @k82cn
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
Extend nvidia-gpus e2e test to include a device plugin based test
**What this PR does / why we need it**:
This is needed to verify device plugin feature.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/features/issues/368
**Special notes for your reviewer**:
Related test_infra PR: https://github.com/kubernetes/test-infra/pull/4265
**Release note**:
Add an e2e test for nvidia gpu device plugin
Automatic merge from submit-queue
Fix splitProviderID for Azure
**What this PR does / why we need it**:
#46940 add 'splitProviderID' for Azure to get node name from provider, but it captures the resource id instead of node name.
Functions such as NodeAddresses are accepting node names:
84d9778f22/pkg/cloudprovider/providers/azure/azure_instances.go (L32)
With current implementation, it takes in a resource ID, and will result in following error
```
E0830 04:15:09.877143 10427 azure_instances.go:63] error: az.NodeAddresses, az.getIPForMachine(/subscriptions/{id}/resourceGroups/{id}/providers/Microsoft.Compute/virtualMachines/k8s-master-0), err=instance not found
```
This fix makes is return node names instead.
**Which issue this PR fixes**
**Special notes for your reviewer**:
**Release note**:
`NONE`
@brendandburns @realfake @wlan0
Automatic merge from submit-queue (batch tested with PRs 52047, 52063, 51528)
implementation of GetZoneByProviderID and GetZoneByNodeName for azure
This is part of the #50926 effort
cc @luxas
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 52047, 52063, 51528)
Improve dynamic kubelet config e2e node test and fix bugs
Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.
Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.
This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.
Related issue: #50217
I had been sitting on this until the cAdvisor fix merged in #51751, as these tests fail without that fix.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Added large topology tests for static policy in CPU Manager.
**What this PR does / why we need it**: This PR adds a very large topology test case for the CPU Manager feature.
Related to #51180.
CC @ConnorDoyle
Automatic merge from submit-queue (batch tested with PRs 50949, 52155, 52175, 52112, 52188)
kubeadm: Perform TLS Bootstrapping in kubeadm join for v1.7 kubelets
**What this PR does / why we need it**:
Partially reverts 9dc3a661d7
Performs the TLS Bootstrap if `kubeadm join` v1.8 is executed on a node with a kubelet v1.7.
Since the kubelet arguments for v1.7 (from the kubeadm dropin) expects a TLS bootstrapped kubeconfig, we still have to provide this functionality in kubeadm CLI v1.8 (as we support one minor version down)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes: https://github.com/kubernetes/kubeadm/issues/429
**Special notes for your reviewer**:
This is a required bug fix for v1.8
**Release note**:
```release-note
NONE
```
@kubernetes/sig-cluster-lifecycle-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50949, 52155, 52175, 52112, 52188)
Allow watch cache to be disabled per type
Currently setting watch cache size for a given resource does not disable
the watch cache. This commit adds a new `default-watch-cache-size` flag
to map to the existing field, and refactors how watch cache sizes are
calculated to bring all of the code into one place. It also adds debug
logging to startup to allow us to verify watch cache enablement in
production.
Part of #51825
Will allow watch cache to be disabled selectively.
Automatic merge from submit-queue
Add pod preemption to the scheduler
**What this PR does / why we need it**:
This is the last of a series of PRs to add priority-based preemption to the scheduler. This PR connects the preemption logic to the scheduler workflow.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48646
**Special notes for your reviewer**:
This PR includes other PRs which are under review (#50805, #50405, #50190). All the new code is located in 43627afdf9.
**Release note**:
```release-note
Add priority-based preemption to the scheduler.
```
ref/ #47604
/assign @davidopp
@kubernetes/sig-scheduling-pr-reviews
Automatic merge from submit-queue
Add cluster up configuration for certificate signing duration.
```release-note
Add CLUSTER_SIGNING_DURATION environment variable to cluster configuration scripts
to allow configuration of signing duration of certificates issued via the Certificate
Signing Request API.
```
Automatic merge from submit-queue
Add German translation for kubectl
**What this PR does / why we need it**:
This PR provides a first attempt to translate kubectl in German (related to #40645, #45573, #45562, #40591, #46559, #50155).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
No issues
**Special notes for your reviewer**:
This PR requires German people to assist in the review. I'm native in German with BSc in Business Information Technology.
**Release note**:
```release-note
Adding German translation for kubectl
```
Automatic merge from submit-queue
ScaleIO - Specify SDC GUID value via node label
**What this PR does / why we need it**:
This is a ScaleIO plugin volume PR to do the following:
- Reads node label `scaleio.sdcGuid` value for the SDC GUID
- Uses value to look up the Scaleio SDC `instance ID`
- If label not found, falls back to current way of doing instance id look up now
This enhancement allows the ScaleIO plugin to work properly even if the drv_cfg binary is not installed on the kubelet node.
**Special Notes**
Associated issue - #51537Closes#51537
```release-note
The ScaleIO volume plugin can now read the SDC GUID value as node label scaleio.sdcGuid; if binary drv_cfg is not installed, the plugin will still work properly; if node label not found, it defaults to drv_cfg if installed.
```
Currently setting watch cache size for a given resource does not disable
the watch cache. This commit adds a new `default-watch-cache-size` flag
to map to the existing field, and refactors how watch cache sizes are
calculated to bring all of the code into one place. It also adds debug
logging to startup to allow us to verify watch cache enablement in
production.
Automatic merge from submit-queue
Fix deployment timeout reporting
If the previous condition has been a successful rollout then we
shouldn't try to estimate any progress. Scenario:
* progressDeadlineSeconds is smaller than the difference between
now and the time the last rollout finished in the past.
* the creation of a new ReplicaSet triggers a resync of the
Deployment prior to the cached copy of the Deployment getting
updated with the status.condition that indicates the creation
of the new ReplicaSet.
The Deployment will be resynced and eventually its Progressing
condition will catch up with the state of the world.
Fixes https://github.com/kubernetes/kubernetes/issues/49637
I will also cherry-pick this back to 1.7.
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 51900, 51782, 52030)
Fill in creationtimestamp in audit events
**What this PR does / why we need it**:
This is fixing null creationtimestamp in audit events.
@sttts @crassirostris like we've talked earlier today
**Release note**:
```release-note
none
```
Automatic merge from submit-queue (batch tested with PRs 51900, 51782, 52030)
A policy with 0 rules should return an error
**Which issue this PR fixes**
[isuue#51565](https://github.com/kubernetes/kubernetes/issues/51565)
**Release note**:
```
An audit policy file with 0 rule returns an error.
```
Automatic merge from submit-queue (batch tested with PRs 51900, 51782, 52030)
apiservers: stratify versioned informer construction
The versioned share informer factory has been part of the GenericApiServer config,
but its construction depended on other fields of that config (e.g. the loopback
client config). Hence, the order of changes to the config mattered.
This PR stratifies this by moving the SharedInformerFactory from the generic Config
to the CompleteConfig struct. Hence, it is only filled during completion when it is
guaranteed that the loopback client config is set.
While doing this, the CompletedConfig construction is made more type-safe again,
i.e. the use of SkipCompletion() is considereably reduced. This is archieved by
splitting the derived apiserver Configs into the GenericConfig and the ExtraConfig
part. Then the completion is structural again because CompleteConfig is again
of the same structure: generic CompletedConfig and local completed ExtraConfig.
Fixes#50661.
If the previous condition has been a successful rollout then we
shouldn't try to estimate any progress. Scenario:
* progressDeadlineSeconds is smaller than the difference between
now and the time the last rollout finished in the past.
* the creation of a new ReplicaSet triggers a resync of the
Deployment prior to the cached copy of the Deployment getting
updated with the status.condition that indicates the creation
of the new ReplicaSet.
The Deployment will be resynced and eventually its Progressing
condition will catch up with the state of the world.
Signed-off-by: Michail Kargakis <mkargaki@redhat.com>
Automatic merge from submit-queue
Bump cluster autoscaler to 0.7.0-alpha3
After adding an extra field to `etc/gce.conf` CA stopped starting properly. After this change CI test suite should become more green.