Automatic merge from submit-queue
Fix bad check in node e2e tests for GPUs.
When no nvidia device was attached, the -ne check had a syntax error:
sh: -ne: argument expected
This resulted in `Success` being echoed and the test passing incorrectly.
This was found while debugging issue #47216
/release-note-none
/sig node
/area node-e2e
/kind bug
Automatic merge from submit-queue (batch tested with PRs 46678, 45545, 47375)
bazel: update debian-iptables-amd64 digest
**What this PR does / why we need it**: upstream debian has fixed several CVEs recently, so we should apply those fixes:
* CVE-2017-2616
* CVE-2017-6512
x-ref #47386
**Special notes for your reviewer**: nothing has been pushed yet, so this will likely fail many of the tests.
Do you think these version numbers make sense? We also need to fix debian-iptables v5, and I don't know what to do there. (v5.1?)
**Release note**:
```release-note
NONE
```
/assign @timstclair
Automatic merge from submit-queue (batch tested with PRs 46678, 45545, 47375)
update gophercloud/gophercloud dependency
**What this PR does / why we need it**:
**Which issue this PR fixes**
fixes#44461
**Special notes for your reviewer**:
**Release note**:
```release-note
update gophercloud/gophercloud dependency for reauthentication fixes
```
Automatic merge from submit-queue
fix#46039: iptables proxier need use '--bind-address' if set
**What this PR does / why we need it**:
iptables proxier need use '--bind-address' if set
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46039
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
fix sync loop health check
This PR will do error logging about the fall behind sync for kubelet instead of sync loop healthz checking.
The reason is kubelet can not do sync loop and therefore can not update sync loop time when there is any runtime error, such as docker hung.
When there is any runtime error, according to current implementation, kubelet will not do sync operation and thus kubelet's sync loop time will not be updated. This will make when there is any runtime error, kubelet will also return non 200 response status code when accessing healthz endpoint. This is contrary with #37865 which prevents kubelet from being killed when docker hangs.
**Release note**:
```release-note
fix sync loop health check with seperating runtime errors
```
/cc @yujuhong @Random-Liu @dchen1107
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
Set up proxy certs for Aggregator.
Working on fixing https://github.com/kubernetes/kubernetes/issues/43716.
This will create the necessary certificates.
On GCE is will upload those certificates to Metadata.
They are then pulled down on to the kube-apiserver.
They are written to the /etc/src/kubernetes/pki directory.
Finally they are loaded vi the appropriate command line flags.
The requestheader-client-ca-file can be seen by running the following:-
kubectl get ConfigMap extension-apiserver-authentication --namespace=kube-system -o yaml
**What this PR does / why we need it**:
This PR creates a request header CA. It also creates a proxy client cert/key pair.
It causes these files to end up on kube-apiserver and set the CLI flags so they are properly loaded.
Without it the customer either has to set them up themselves or re-use the master CA which is a security vulnerability.
Currently this creates everything on GCE.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#43716
**Special notes for your reviewer**:
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
Add Calico typha agent
**What this PR does / why we need it**:
- Adds the Calico typha agent with autoscaling to the GCE scripts.
- Adds logic to adjust Calico resource requests based on cluster size.
Fixes https://github.com/kubernetes/kubernetes/issues/47269
**Special notes for your reviewer**:
CC @dnardo
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
GC should retry on patch error
Fixing https://github.com/kubernetes/kubernetes/issues/46998.
This is fixing a bug, so applying the 1.7 milestone.
Automatic merge from submit-queue
Fix missing __kubectl_parse_config
**What this PR does / why we need it**:
This PR fixes the broken completion of kubectl config use-context. I checked that the completions of kubectl config use-context, --user and --cluster work correctly.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#29386
**Special notes for your reviewer**: @pwittrock @janetkuo
**Release note**:
```release-note
NONE
```
GenericPLEG's 1s relist() loop races against pod network setup. It
may be called after the infra container has started but before
network setup is done, since PLEG and the runtime's SyncPod() run
in different goroutines.
Track network setup status and don't bother trying to read the pod's
IP address if networking is not yet ready.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1434950
Mar 22 12:18:17 ip-172-31-43-89 atomic-openshift-node: E0322
12:18:17.651013 25624 docker_manager.go:378] NetworkPlugin
cni failed on the status hook for pod 'pausepods22' - Unexpected
command output Device "eth0" does not exist.
Runtimes should never return "" and nil errors, since network plugin
drivers need to treat netns differently in different cases. So return
errors when we can't get the netns, and fix up the plugins to do the
right thing.
Namely, we don't need a NetNS on pod network teardown. We do need
a netns for pod Status checks and for network setup.
This reverts commit fee4c9a7d9.
This is not the correct fix for the problem; and it causes other problems
like continuous:
docker_sandbox.go:234] NetworkPlugin cni failed on the status hook for pod
"someotherdc-1-deploy_default": Unexpected command output nsenter: cannot
open : No such file or directory with error: exit status 1
Because GetNetNS() is returning an empty network namespace. That is
not helpful nor should really be allowed; that's what the error return
from GetNetNS() is for.
Automatic merge from submit-queue
Make fluentd-gcp run with host network
Fluentd-gcp should have access to instance's platform-dependent service account in order to work.
/cc @piosz
Automatic merge from submit-queue
servicecontroller: use consistent node criteria
We have two node selection functions: includeNodeFromNodeList and
getNodeConditionPredicate, and the logic is different.
The logic should be the same, so remove includeNodeFromNodeList and just
use getNodeConditionPredicate everywhere.
Fix#45772
```release-note
servicecontroller: Fix node selection logic on initial LB creation
```
When no nvidia device was attached, the -ne check had a syntax error:
sh: -ne: argument expected
This resulted in 'Success' being echoed and the test passing incorrectly.
This was found while debugging issue #47216
Automatic merge from submit-queue
iSCSI plugin: Update devicepath with filepath.Glob result
**What this PR does / why we need it**:
If iscsiTransport is not tcp, iSCSI plugin tries to
find devicepath using filepath.Glob but never updates
devicepath with the filepath.Glob result.
This patch fixes the problem.
**Which issue this PR fixes** : fixes#47253
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue
AWS for cloud-controller-manager
fixes#47214
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID methods used by the cloud-controller-manager for the AWS provider.
NodeAddressesByProvider uses DescribeInstances (for normal addresses) and DescribeAddresses (for Elastic IP addresses).
InstanceTypeByProviderID uses DescribeInstances.
```release-note
NONE
```