Commit Graph

67349 Commits (1da598457c6525612dcfc7ea2b36c2b036076488)

Author SHA1 Message Date
Jordan Liggitt 14d31aff48
fix scheduler client construction from configuration files 2018-06-26 20:37:15 -04:00
Kubernetes Submit Queue 05f073dc28
Merge pull request #65468 from mindprince/remove-cos-requirement
Automatic merge from submit-queue (batch tested with PRs 65404, 65323, 65468). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove COS requirement while running e2e nvidia gpu tests.

```release-note
NONE
```
2018-06-26 17:33:08 -07:00
Kubernetes Submit Queue e55ea1608a
Merge pull request #65323 from jsafrane/fix-csi-json
Automatic merge from submit-queue (batch tested with PRs 65404, 65323, 65468). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix cleanup of volume metadata json file.

Create the json file with metadata as the last item, when everything else is ready, so we don't need to clean up the file in all error cases in this function.

Fixes #65322

**Release note**:
```release-note
Fixed cleanup of CSI metadata files.
```

/assign @saad-ali @vladimirvivien
2018-06-26 17:33:05 -07:00
Kubernetes Submit Queue f9a1cb9b63
Merge pull request #65404 from fisherxu/collapse-rvParse
Automatic merge from submit-queue (batch tested with PRs 65404, 65323, 65468). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Collapse the list and watch resource version parse

**What this PR does / why we need it**:
Collapse the list and watch resource version parse, as discuss in [#64513](https://github.com/kubernetes/kubernetes/pull/64513#issuecomment-399380988)
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-26 17:33:01 -07:00
Ryo Nishikawa 0637d52128 Fix typo in comment 2018-06-26 17:04:25 -07:00
Kubernetes Submit Queue 35d5daa8a0
Merge pull request #65454 from bsalamat/rescheduler_version
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update Rescheduler's manifest

**What this PR does / why we need it**: Updates Rescheduler's manifest to use version 0.4.0

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
Update Rescheduler's manifest to use version 0.4.0.
```
2018-06-26 14:21:20 -07:00
Anago GCB 519e9d1b7b Update CHANGELOG-1.11.md for v1.11.0-rc.3. 2018-06-26 21:13:44 +00:00
Rohit Agarwal af3bc705b5 Remove COS requirement while running e2e nvidia gpu tests. 2018-06-26 12:12:06 -07:00
Kubernetes Submit Queue ba7f798a1a
Merge pull request #65460 from cofyc/issue64853
Automatic merge from submit-queue (batch tested with PRs 65342, 65460). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Prepare local volumes via hostexec pod instead of SSH

**What this PR does / why we need it**:

Prepare local volumes via hostexec pod. SSH access may be removed in future.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64853

**Special notes for your reviewer**:

For each test, launch a pod for each node to setup volumes when needed.
It uses `nsenter` to enter into host mount namespace to run commands.

Why using `nsenter` command:
- migrate to use hostexec pod (baseimage: alpine:3.6) busybox `losetup` is hard
- alpine does not contain mkfs.ext4 command
- easier to setup local volumes (no need to mount /tmp, /mnt, /dev/, /sys directories)
- only require hostexec pod contains `nsenter` command

**Release note**:

```release-note
NONE
```
2018-06-26 11:55:08 -07:00
Kubernetes Submit Queue 2dbb9c8602
Merge pull request #65342 from dashpole/npd_args
Automatic merge from submit-queue (batch tested with PRs 65342, 65460). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update NPD config for GCI

**What this PR does / why we need it**:
Use https://github.com/kubernetes/node-problem-detector/pull/180 on GCI

**Special notes for your reviewer**:
This is currently pending an NPD release.

**Release note**:
```release-note
NONE
```
/assign @Random-Liu 
/sig node
/kind feature
/priority important-soon
2018-06-26 11:55:04 -07:00
Minhan Xia 35fee75754 bug fix for cloud provider generator 2018-06-26 11:33:38 -07:00
Minhan Xia 232ebadc9e add beta healthcheck in gce cloud provider 2018-06-26 11:33:38 -07:00
Minhan Xia 3248fa2a89 swtich to use Beta 2018-06-26 11:33:38 -07:00
Minhan Xia 469ac167c1 regenerate GCE cloud provider 2018-06-26 11:29:55 -07:00
Minhan Xia 9e45bb8264 revendor GCE API Go client 2018-06-26 11:29:55 -07:00
Ashley Gau 72335f6607 update NEGAnnotation 2018-06-26 10:59:48 -07:00
Jordan Liggitt 6b2278fe21
Add debugging for scale e2e test errors 2018-06-26 13:34:48 -04:00
Matthew Wong b376b31ee0 Resolve potential devicePath symlink when MapVolume in containerized kubelet 2018-06-26 13:08:36 -04:00
Kubernetes Submit Queue 0d9c432542
Merge pull request #65437 from losipiuk/lo/gpu-tests-from-env
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Read gpu type from TESTED_GPU_TYPE env variable

```release-note
NONE
```
2018-06-26 08:55:44 -07:00
David Eads aee798bbf3 kubectl convert should not double wrap output in nested lists 2018-06-26 11:50:37 -04:00
Dmitrii Shcherbakov 7e2caf02ac use lowercase hostnames for node names
Usage of names containing uppercase characters returned by calls to
gethostname and getfqdn in requests to apiserver related to nodes
results in 404 errors. Node names are lowercase in K8s itself so charms
should make sure to use lowercase names well as it results in errors.

pkg/util/node/node.go has code to convert hostnames to lowercase in
GetHostname and that function is used to form node names.
2018-06-26 17:36:28 +02:00
Jordan Liggitt 6354665ee8
show type differences in reflect diff 2018-06-26 11:30:30 -04:00
Jordan Liggitt ce51c76b97
allow enabling kubelet serving certificate rotation via flag 2018-06-26 10:11:39 -04:00
Clayton Coleman c819a16284
Improve job describe and get output
For get, condense completions and success into a single column, and
print the job duration. Use a new variant of ShortHumanDuration that
shows more significant digits, since duration matters more for jobs.

```
NAME                                   COMPLETIONS   DURATION   AGE
image-mirror-origin-v3.10-1529985600   1/1           47s        42m
image-mirror-origin-v3.11-1529985600   1/1           74s        42m
image-pruner-1529971200                1/1           60m        4h
```

The completions column can be:

```
COMPLETIONS
0/1        # completions nil or 1, succeeded 0
1/1        # completions nil or 1, succeeded 1
0/3        # completions 3, succeeded 1
1/3        # completions 3, succeeded 1
0/1 of 30  # parallelism of 30, completions is nil
```

Update describe to show the completion time and the duration.
2018-06-26 09:37:29 -04:00
Davide Belloni b24bf0c5e2
Enable “Kubernetes Monitoring” and “PodSecurityPolicies” on the same cluster
Without that the daemonset "metadata-agent" return:

```pods "metadata-agent-" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.containers[0].hostPort: Invalid value: 8799: Host port 8799 is not allowed to be used. Allowed ports: []]```
2018-06-26 14:06:32 +02:00
Kubernetes Submit Queue 76b4699c69
Merge pull request #49410 from jasonbrooks/patch-1
Automatic merge from submit-queue (batch tested with PRs 65449, 65373, 49410). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add kernel config locations for fedora and atomic

**What this PR does / why we need it**:

* Fedora stores its kernel configs in /usr/lib/modules/$(uname -r)/config
* Fedora/CentOS/RHEL atomic hosts use /usr/lib/ostree-boot/$(uname -r), though this location is deprecated
* The lack of these locations in the validator is causing kubeadm to hang on "failed to parse kernel config" in its preflight checking on fedora and atomic host

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2018-06-26 02:52:11 -07:00
Kubernetes Submit Queue 3d694993d0
Merge pull request #65373 from multi-io/openstack_lbaas_node_secgroup_fix
Automatic merge from submit-queue (batch tested with PRs 65449, 65373, 49410). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

OpenStack LBaaS fix: must use ID, not name, of the node security group

This is a bugfix for the OpenStack LBaaS cloud provider security group management.

A bit of context: When creating a load balancer for a given `type: LoadBalancer` service, the provider will try to:

(see `pkg/cloudprovider/providers/openstack/openstack_loadbalancer.go`/`EnsureLoadBalancer`)

1. create a load balancer (LB) in Openstack with listeners corresponding to the service's ports
2. attach a floating IP to the LB's network port

If `manage-security-groups` is enabled in controller-manager's cloud.conf:

3. create a security group with ingress rules corresponding to the LB's listeners, and attach it to the LB's network port
4. for all nodes of the cluster, pick an existing security group for the nodes ("node security group") and add ingress rules to it exposing the service's NodePorts to the security group created in step 3.

In the current upstream master, steps 1 through 3 work fine, step 4 fails, leading to a service that's not accessible via the LB without further manual intervention.

The bug is in the "pick an existing security group" operation (func `getNodeSecurityGroupIDForLB`), which, contrary to its name, will return the security group's name rather than its ID (actually it returns a list of names rather than IDs, apparently to cover some corner cases where you might have more than one node security group, but anyway). This will then be used when trying to add the ingress rules to the group, which the Openstack API will reject with a 404 (at least on our (fairly standard) Openstack Ocata installation) because we're giving it a name where it expects an ID.

The PR adds a "get ID given a name" lookup to the `getNodeSecurityGroupIDForLB` function, so it actually returns IDs. That's it. I'm not sure if the upstream code wasn't really tested, or maybe other people use other Openstacks with more lenient APIs. The bug and the fix is always reproducible on our installation.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

Fixes #58145


**Special notes for your reviewer:**

Should we turn `getNodeSecurityGroupIDForLB` into a method with the lbaas as its receiver because it now requires two of the lbaas's attributes? I'm not sure what the conventions are here, if any. 

**Release note**:
```release-note
NONE
```
2018-06-26 02:52:06 -07:00
Kubernetes Submit Queue 0d31f90b22
Merge pull request #65449 from cblecker/run-in-gopath-symlink-fix
Automatic merge from submit-queue (batch tested with PRs 65449, 65373, 49410). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix run-in-gopath issue with symlink'd gopath

**What this PR does / why we need it**:
Fixes `hack/update-bazel.sh` so that it can be run in a symlink'd GOPATH, (such as using `hack/run-in-gopath.sh`).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65403.

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-06-26 02:52:02 -07:00
Yecheng Fu 1fbc5babb5 Prepare local volumes via hostexec pod. 2018-06-26 13:18:55 +08:00
Kubernetes Submit Queue 1f4f0123ed
Merge pull request #64812 from hzxuzhonghu/audit-useragent
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add user-agent to audit-logging

**What this PR does / why we need it**:

Add User-Agent to audit event.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64791

**Special notes for your reviewer**:

**Release note**:

```release-note
Add user-agent to audit-logging.
```
2018-06-25 22:16:08 -07:00
Kubernetes Submit Queue 93055c7730
Merge pull request #65330 from freehan/neg-rate-limit
Automatic merge from submit-queue (batch tested with PRs 59214, 65330). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add rate limiting for NEG calls

```release-note
None
```
2018-06-25 18:19:04 -07:00
Kubernetes Submit Queue 991a84758f
Merge pull request #59214 from kdembler/cpumanager-checkpointing
Automatic merge from submit-queue (batch tested with PRs 59214, 65330). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Migrate cpumanager to use checkpointing manager

**What this PR does / why we need it**:
This PR migrates `cpumanager` to use new kubelet level node checkpointing feature (#56040) to decrease code redundancy and improve consistency.

**Which issue(s) this PR fixes**:
Fixes #58339

**Notes**:
At point of submitting PR the most straightforward approach was used - `state_checkpoint` implementation of `State` interface was added. However, with checkpointing implementation there might be no point to keep `State` interface and just use single implementation with checkpoint backend and in case of different backend than filestore needed just supply `cpumanager` with custom `CheckpointManager` implementation.

/kind feature
/sig node
cc @flyingcougar @ConnorDoyle
2018-06-25 18:19:00 -07:00
hui luo d04f596829 Add hierarchy support for plugin directory
it traverses and watch plugin directory and its sub directory recursively,
plugin socket file only need be unique within one directory,

- plugin socket directory
-    |
-    ---->sub directory 1
-    |              |
-    |              ----->  socket1,  socket2 ...
-    ----->sub directory 2
-                  |
-                  ------> socket1, socket2 ...

the design itself allow sub directory be anything,
but in practical, each plugin type could just use one sub directory.

four bonus changes added as below

1. extract example handler out from test, it is easier to read the code
with the seperation.

2. there are two variables here: "Watcher" and "watcher".
"Watcher" is the plugin watcher, and "watcher" is the fsnotify watcher.
so rename the "watcher" to "fsWatcher" to make code easier to
understand.

3. change RegisterCallbackFn() return value order, it is
conventional to return error last, after this change,
the pkg/volume/csi is compliance with golint, so remove it
from hack/.golint_failures

4. refactor errors handling at invokeRegistrationCallbackAtHandler()
to make error message more clear.
2018-06-25 17:32:18 -07:00
Guoliang Wang 0d6c51656e enable etcdv3 client prometheus metics 2018-06-26 08:08:19 +08:00
Bobby (Babak) Salamat 2cd36643f6 Update Rescheduler's manifest 2018-06-25 16:38:04 -07:00
Kubernetes Submit Queue 28b7809d2f
Merge pull request #65448 from kwmonroe/bug/lint-fixes
Automatic merge from submit-queue (batch tested with PRs 65156, 65448). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

lint fixes for goal state checks

**What this PR does / why we need it**:
Lint fixes for long lines introduced in #65187 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:
We could also fix this by setting a longer flake8 line length in something like a local `setup.cfg`

**Release note**:

```release-note
NONE
```
2018-06-25 16:07:05 -07:00
Kubernetes Submit Queue 732eca80cc
Merge pull request #65156 from agau4779/remove_neg_gate
Automatic merge from submit-queue (batch tested with PRs 65156, 65448). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[GCE] move NEG out of featuregate

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes # https://github.com/kubernetes/ingress-gce/issues/274
**Release note**:
-->
```release-note
NONE
```
2018-06-25 16:07:03 -07:00
Christoph Blecker 50fd906f74
Update vendored tool go install location to use GOPATH 2018-06-25 15:45:14 -07:00
Kubernetes Submit Queue 7f23a743e8
Merge pull request #65258 from ddebroy/ddebroy-ebs1
Automatic merge from submit-queue (batch tested with PRs 65164, 65258). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Query candidate zones for EBS when zone/zones not passed

**What this PR does / why we need it**:
This PR skips invoking `getCandidateZonesForDynamicVolume` to query EC2 zones of instances when zone/zones is present.

/sig storage

**Release note**:
```
none
```
2018-06-25 14:44:08 -07:00
Kubernetes Submit Queue db80cdf37f
Merge pull request #65164 from xlgao-zju/add-log-for-timeout
Automatic merge from submit-queue (batch tested with PRs 65164, 65258). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add output to give user awareness of how long timeouts are expected to be

**What this PR does / why we need it**:
Add output to give user awareness of how long manifest upgrade timeout is expected to be.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref [kubernetes/kubeadm/#914](https://github.com/kubernetes/kubeadm/issues/914)

**Special notes for your reviewer**:

**Release note**:

```release-note
kubeadm: notify the user of manifest upgrade timeouts
```
2018-06-25 14:44:04 -07:00
Christoph Blecker 2edd10709c
Fix run-in-gopath issue with symlink'd gopath 2018-06-25 14:25:55 -07:00
Kevin W Monroe 0eeb34382b one more lint fix for sshl_chain_completion 2018-06-25 15:14:49 -05:00
David Ashpole c8758a774e update NPD version to v0.5.0 for gci 2018-06-25 13:13:39 -07:00
Kevin W Monroe 428a63e9a1 lint fixes for goal state checks 2018-06-25 15:06:06 -05:00
Olaf Klischat 8ed735d104 BUGFIX: must use ID, not name, of the node security group when adding rules to it 2018-06-25 21:44:59 +02:00
Anago GCB 759f3e21da Update CHANGELOG-1.11.md for v1.11.0-rc.2. 2018-06-25 17:22:58 +00:00
Łukasz Osipiuk 63f5f3106b Read gpu type from TESTED_GPU_TYPE env variable 2018-06-25 18:47:49 +02:00
Ashley Gau 7beefd0c9c move NEG out of featuregate 2018-06-25 09:47:39 -07:00
immutablet 0100891168 Add support for linux abstract socket namespace. 2018-06-25 09:41:14 -07:00
Xianglin Gao b309ace793 kubeadm-upgrade: notify the user of manifest upgrade timeouts
Signed-off-by: Xianglin Gao <xianglin.gxl@alibaba-inc.com>
2018-06-26 00:03:00 +08:00