Automatic merge from submit-queue
GCI/Trusty: support the Docker registry mirror
@roberthbailey @zmerlynn please review it.
cc/ @fabioy @dchen1107 @kubernetes/goog-image FYI.
cc/ @ojarjur it is very straightforward to add support for GCI, which is pretty much like the change to ContainerVM's configure-vm.sh in your original PR #25841.
Automatic merge from submit-queue
GCI: correct the fix in #26363
This PR is mainly for correcting the fix to 'find' command in #26363. I added "-maxdepth 1" in an earlier change, and #26363 tried to fix it by changing the search path. This is potentially incorrect, when yaml files are in more than one layer deep. The real fix should be removing the "-maxdepth 1" flag from 'find' command. This PR also updates two minor places in the file configure-helper.sh introduced by two previous PR #26413 and #26048.
@roberthbailey @wonderfly
cc/ @dchen1107 @fabioy @kubernetes/goog-image
Automatic merge from submit-queue
Increase failure threshold for glbc liveness probe
This pod fails a liveness probe on occasion, probably because the failure thresholds are too strict. Simple enough that either reviewer can review.
Automatic merge from submit-queue
Fix typo and linewrap comments in PV controller
Fix some typos and linewrap long comments that I found while going over this code investigating something.
Automatic merge from submit-queue
Add timeout for image pulling
Fix#26300.
With this PR, if image pulling makes no progress for *1 minute*, the operation will be cancelled. Docker reports progress for every 512kB block (See [here](3d13fddd2b/pkg/progress/progressreader.go (L32))), *512kB/min* means the throughput is *<= 8.5kB/s*, which should be kind of abnormal?
It's a little hard to write unit test for this, so I just manually tested it. If I set the `defaultImagePullingStuckTimeout` to 0s, and `defaultImagePullingProgressReportInterval` to 1s, image pulling will be cancelled.
```
E0601 18:48:29.026003 46185 kube_docker_client.go:274] Cancel pulling image "nginx:latest" because of no progress for 0, latest progress: "89732b811e7f: Pulling fs layer "
E0601 18:48:29.026308 46185 manager.go:2110] container start failed: ErrImagePull: net/http: request canceled
```
/cc @kubernetes/sig-node
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Stabilize persistent volume integration tests
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Fixes#26499 (at least I hope so...)
Automatic merge from submit-queue
pin GCI version to milestone 52
This is mainly for pinning the 1.2 branch to GCI milestone 52
which contains correct docker and kubelet built in.
Doing this allows us to upgrade docker to v1.11 (issue #26455)
in GCI 53 without breaking the 1.2 release branch.
@kubernetes/goog-image @dchen1107 @roberthbailey @andyzheng0831
Automatic merge from submit-queue
Don't allow deps with no discernible license
This updates the few deps we had with no LICENSE file to current versions that do have that file. It also disallows new deps without obvious licenses.
Automatic merge from submit-queue
Attach/Detach Controller Kubelet Changes
This PR contains changes to enable attach/detach controller proposed in #20262.
Specifically it:
* Introduces a new `enable-controller-attach-detach` kubelet flag to enable control by attach/detach controller. Default enabled.
* Removes all references `SafeToDetach` annotation from controller.
* Adds the new `VolumesInUse` field to the Node Status API object.
* Modifies the controller to use `VolumesInUse` instead of `SafeToDetach` annotation to gate detachment.
* Modifies kubelet to set `VolumesInUse` before Mount and after Unmount.
* There is a bug in the `node-problem-detector` binary that causes `VolumesInUse` to get reset to nil every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9#issuecomment-221770924 opened to fix that.
* There is a bug here in the mount/unmount code that prevents resetting `VolumeInUse in some cases, this will be fixed by mount/unmount refactor.
* Have controller process detaches before attaches so that volumes referenced by pods that are rescheduled to a different node are detached first.
* Fix misc bugs in controller.
* Modify GCE attacher to: remove retries, remove mutex, and not fail if volume is already attached or already detached.
Fixes#14642, #19953
```release-note
Kubernetes v1.3 introduces a new Attach/Detach Controller. This controller manages attaching and detaching volumes on-behalf of nodes that have the "volumes.kubernetes.io/controller-managed-attach-detach" annotation.
A kubelet flag, "enable-controller-attach-detach" (default true), controls whether a node sets the "controller-managed-attach-detach" or not.
```
Automatic merge from submit-queue
Fill controller caches on startup
The controller needs to fill its caches before it starts binding/recycling/ deleting or provisioning volumes and claims. This was done using blocking initial 'xxx added' from going through syncClaim/syncVolume. However, when the caches were full, the controller waited for the next sync period to do actual binding/recycling etc.
In this patch, the controller fills its caches directly from etcd and then processes initial 'xxx added' events to reconcile the world and bind/recycle/ delete/provision stuff, resulting in faster binding after startup.
Fixes#25967 (properly)
Automatic merge from submit-queue
Trusty: fix 'find' commands and add k8s license and motd info
This PR fixes several things which should be also cherry picked in release-1.2 branch:
- Fix the 'find' command failure. See issue #26350 for the background;
- Add k8s license file into /home/kubernetes and set /etc/motd info. We fixed this in cluster/gce/gci for 1.3 branch, but have not done it for 1.2 branch. This PR simply copies the code from the GCI support.
Please note that the trusty code in master branch is under best-effort maintenance and we don't guarantee prompt fixes. But for 1.2 branch, we need to guarantee its correctness. I will test this in 1.2 branch.
@roberthbailey and @zmerlynn please review it.
cc/ @dchen1107 @fabioy @kubernetes/goog-image FYI.
Automatic merge from submit-queue
Fixes#26526 - hack/update-api-reference-docs.sh
I opened a Pull request to fix this issue https://github.com/kubernetes/kubernetes/issues/26526
The problem is that the update script ignores white spaces but the verify script doesn't which leads to a strange behaviour -> you use the update script but the verify script tells you that the api docs are not up to date.
The following packes did not have discernible LICENSE files at the hash we have
vendored:
github.com/beorn7/perks
github.com/daviddengcn/go-colortext
github.com/garyburd/redigo
github.com/prometheus/common
github.com/shurcooL/sanitized_anchor_name
github.com/stretchr/objx
This commit updates all of them and updates the central LICENSE file.
This PR contains Kubelet changes to enable attach/detach controller control.
* It introduces a new "enable-controller-attach-detach" kubelet flag to
enable control by controller. Default enabled.
* It removes all references "SafeToDetach" annoation from controller.
* It adds the new VolumesInUse field to the Node Status API object.
* It modifies the controller to use VolumesInUse instead of SafeToDetach
annotation to gate detachment.
* There is a bug in node-problem-detector that causes VolumesInUse to
get reset every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9
opened to fix that.
Add it as a special case package root and import the license file. This was
the only UNKNOWN license, prior to the change to not look at upstream repo
state.
Automatic merge from submit-queue
rkt: Get logs via syslog identifier
This change works around https://github.com/coreos/rkt/issues/2630
Without this change, logs cannot reliably be collected for containers
with short lifetimes.
With this change, logs cannot be collected on rkt versions v1.6.0 and
before.
I'd like to also bump the required rkt version, but I don't want to do that until there's a released version that can be pointed to (so the next rkt release).
I haven't added tests (which were missing) because this code will be removed if/when logs are retrieved via the API. I have run E2E tests with this merged in and verified the tests which previously failed no longer fail.
cc @yifan-gu
This is mainly for pinning the 1.2 branch to GCI milestone 52
which contains correct docker and kubelet built in.
Doing this allows us to upgrade docker to v1.11 (issue #26455)
in GCI 53 without breaking the 1.2 release branch.
Automatic merge from submit-queue
Enable node e2e accounting on systemd
Updated the e2e setup.sh script to enable cpu and memory accounting.
Related to https://github.com/kubernetes/kubernetes/issues/26198
/cc @pwittrock
Automatic merge from submit-queue
read gluster log to surface glusterfs plugin errors properly in describe events
glusterfs.go does not properly expose errors as all mount errors go to a log file, I propose we read the log file to expose the errors without asking the users to 'go look at this log'
This PR does the following:
1. adds a gluster option for log-level=ERROR to remove all noise from log file
2. change log file name and path based on PV + Pod name - so specific per PV and Pod
3. create a utility to read the last two lines of the log file when failure occurs
old behavior:
```
13s 13s 1 {kubelet 127.0.0.1} Warning FailedMount Unable to mount volumes for pod "bb-gluster-pod2_default(34b18c6b-070d-11e6-8e95-52540092b5fb)": glusterfs: mount failed: Mount failed: exit status 1
Mounting arguments: 192.168.234.147:myVol2 /var/lib/kubelet/pods/34b18c6b-070d-11e6-8e95-52540092b5fb/volumes/kubernetes.io~glusterfs/pv-gluster glusterfs [log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pv-gluster/glusterfs.log]
Output: Mount failed. Please check the log file for more details.
```
improved behavior: (updated after suggestions from community)
```
34m 34m 1 {kubelet 127.0.0.1} Warning FailedMount Unable to mount volumes for pod "bb-multi-pod1_default(e7d7f790-0d4b-11e6-a275-52540092b5fb)": glusterfs: mount failed: Mount failed: exit status 1
Mounting arguments: 192.168.123.222:myVol2 /var/lib/kubelet/pods/e7d7f790-0d4b-11e6-a275-52540092b5fb/volumes/kubernetes.io~glusterfs/pv-gluster2 glusterfs [log-level=ERROR log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pv-gluster2/bb-multi-pod1-glusterfs.log]
Output: Mount failed. Please check the log file for more details.
the following error information was pulled from the log to help resolve this issue:
[2016-04-28 14:21:29.109697] E [socket.c:2332:socket_connect_finish] 0-glusterfs: connection to 192.168.123.222:24007 failed (Connection timed out)
[2016-04-28 14:21:29.109767] E [glusterfsd-mgmt.c:1819:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 192.168.123.222 (Transport endpoint is not connected)
```
also this PR is alternate approach to : #24624