
4465 Commits (c1e8c6d878293b4b370b2647081751a63e555337)

Author SHA1 Message Date
Joe Beda 1b1806af56 Add jbeda to OWNERS for build, cluster, hack 2016-09-27 14:53:16 -07:00
Kubernetes Submit Queue 15daecea7f Merge pull request #33551 from wojtek-t/etcd3_in_kubemark
Automatic merge from submit-queue

Make it possible to set etcd version in kubemark
2016-09-27 05:37:59 -07:00
Kubernetes Submit Queue 9e4ba1866b Merge pull request #33146 from MrHohn/kubedns-readiness
Automatic merge from submit-queue

Tune down initialDelaySeconds for readinessProbe.

Fixed #33053.

Tuned down the `initialDelaySeconds` (originally 30s) for the readiness probe to 3 seconds and `periodSeconds` (default 10s) to 5 seconds to shorten the initial time before a DNS server pod is exposed. This configuration passed the DNS e2e tests and did not hit any readiness failures (for kube-dns) on a GCE cluster with 4 nodes during the experiments.

When scaling out kube-dns servers, it took less than 10s for servers to be exposed after they appeared as running, which is much faster than the original 30+s.

`failureThreshold` is left at its default (3). This will not lead to restarts, because the readiness probe status only affects whether endpoints are exposed in the service (from the DNS service's point of view). According to the implementation of [prober](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/prober/worker.go), the number of retries for a readiness probe is unbounded. Hence there is no obvious effect if the readiness probe fails several times at the beginning.

The state machine of the prober is illustrated in the figure below:

![drawing](https://cloud.githubusercontent.com/assets/8681801/18693503/fb4466dc-7f56-11e6-8671-0a14c4835d24.jpeg)

I want to see the e2e result of this PR for further evaluation.

@thockin @bprashanth
2016-09-27 05:02:39 -07:00
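
As a rough illustration of the timing described in #33146 above, the sketch below models when a pod could first be reported ready. It is a simplified assumption, not the kubelet's actual prober logic: it assumes the first probe runs `initialDelaySeconds` after the container starts and later probes run every `periodSeconds`, and it ignores jitter and probe timeouts. The function name `earliest_ready_seconds` is hypothetical.

```python
def earliest_ready_seconds(initial_delay, period, failed_probes=0):
    """Seconds from container start to the first successful readiness probe.

    Simplified model (an assumption, not kubelet code): the first probe
    fires after `initial_delay`, and each failed probe costs one `period`.
    """
    if initial_delay < 0 or period <= 0 or failed_probes < 0:
        raise ValueError("delays must be non-negative and period positive")
    return initial_delay + failed_probes * period


# Original settings from the PR text: 30s initial delay, 10s period.
old = earliest_ready_seconds(30, 10)
# Tuned settings: 3s initial delay, 5s period.
new = earliest_ready_seconds(3, 5)
# Tuned settings when the very first probe fails once.
new_one_failure = earliest_ready_seconds(3, 5, failed_probes=1)

print(old, new, new_one_failure)
```

Under this model a single early failure with the tuned values still stays below the 10s figure reported in the PR, while the original values cannot go below 30s.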
Wojciech Tyczynski 8abf3c1600 Make it possible to set etcd version in kubemark 2016-09-27 13:13:29 +02:00
Kubernetes Submit Queue 869af8f5a1 Merge pull request #33437 from justinsb/typo_incremeting
Automatic merge from submit-queue

Fix typo: incremeting -> incrementing
2016-09-26 22:30:22 -07:00
Kubernetes Submit Queue 5e9bb641e1 Merge pull request #32808 from justinsb/kubelet_restart_forever
Automatic merge from submit-queue

salt: Don't stop trying to start kubelet
2016-09-26 22:30:16 -07:00
gmarek f7d0615e2b Wait until master IP is visible 2016-09-26 15:56:31 +02:00
Kubernetes Submit Queue 5fe2495588 Merge pull request #33122 from ixdy/upgrade-debugging
Automatic merge from submit-queue

Print a more helpful error message when failing to start rolling-updates

Hopefully this will help us track down where the 1.3 -> 1.4 upgrades are breaking down. We'll need to cherry-pick this into release-1.4 to have any effect, though.
2016-09-26 00:35:05 -07:00
MrHohn 55db76241c Tune down initialDelaySeconds for readinessProbe 2016-09-25 12:48:19 -07:00
Kubernetes Submit Queue b79c99da1b Merge pull request #32406 from MrHohn/kubedns-healthz
Automatic merge from submit-queue

Split dns healthcheck into two different urls

Attempt to fix #30633.

<s>This new kube-dns pod template creates two exechealthz processes that listen on two different ports, for kubedns and dnsmasq respectively.</s>

@thockin @girishkalele
2016-09-25 12:21:34 -07:00
Justin Santa Barbara a6dfaffe3f Fix typo: incremeting -> incrementing 2016-09-24 16:10:51 -04:00
Kubernetes Submit Queue 55830471ba Merge pull request #33353 from vishh/gci-default
Automatic merge from submit-queue

Switch k8s on GCE to use GCI by default
2016-09-23 17:25:35 -07:00
Kubernetes Submit Queue 1834039960 Merge pull request #32203 from dgoodwin/kubediscovery
Automatic merge from submit-queue

Alpha JWS Discovery API for locating an apiserver securely

This PR contains an early alpha prototype of the JWS discovery API outlined in proposal #30707.

CA certificate, API endpoints, and the token to be used to authenticate to this discovery API are currently passed in as secrets. If the caller provides a valid token ID, a JWS signed blob of ClusterInfo containing the API endpoints and the CA cert to use will be returned to the caller. This is used by the alpha kubeadm to allow seamless, very quick cluster setup with simple commands well suited for copy paste.

Current TODO list:

- [x] Allow the use of arbitrary strings as token ID/token, we're currently treating them as raw keys.
- [x] Integrate the building of the pod container, move to cluster/images/kube-discovery.
  - [x] Build for: amd64, arm, arm64 and ppc64le. (just replace GOARCH=)
  - [x] Rename to gcr.io/google_containers/kube-discovery-ARCH:1.0
  - [x] Cleanup rogue files in discovery sub-dir.
  - [x] Move pkg/discovery/ to cmd/discovery/app.

There is additional pending work to return a kubeconfig rather than ClusterInfo, however I believe this is slated for post-alpha.
2016-09-23 08:19:19 -07:00
Kubernetes Submit Queue 33b5d9650a Merge pull request #33197 from adityakali/core
Automatic merge from submit-queue

Reset core_patern on GCI

The default core_pattern pipes the core dumps to /sbin/crash_reporter
which is more restrictive in saving crash dumps. So for
now, set a generic core_pattern that users can work with.

@dchen1107 @aulanov can you please review?

cc/ @kubernetes/goog-image
2016-09-23 03:50:15 -07:00
Vishnu kannan 504cf5ca3c mount kubelet root directory as executable in GCI
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-22 22:01:59 -07:00
Vishnu kannan ef49584603 Switch k8s on GCE to use GCI by default
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-22 21:11:13 -07:00
MrHohn d17cd1a514 Split dns healthcheck into two different urls 2016-09-22 18:02:30 -07:00
Kubernetes Submit Queue 6d08910dd8 Merge pull request #33163 from DjangoPeng/Django-patch
Automatic merge from submit-queue

[bug]fix the appending bug

Fix the `DOCKER_OPTS` appending bug. See [issue #33124](https://github.com/kubernetes/kubernetes/issues/33124).
2016-09-22 16:15:54 -07:00
Kubernetes Submit Queue e69c8f142c Merge pull request #33227 from vishh/remove-dns-limits
Automatic merge from submit-queue

Remove cpu limits for dns pod to avoid CPU starvation

The current limits are not based on usage profiles
Fixes #33222
2016-09-21 22:11:43 -07:00
Kubernetes Submit Queue 03c698ce44 Merge pull request #33194 from dchen1107/master
Automatic merge from submit-queue

Update the containervm image to the latest one (container-v1-3-v20160…

Node e2e is running with an old containervm image, which only has Docker 1.9.1. This PR fixes that issue.
2016-09-21 20:40:02 -07:00
Kubernetes Submit Queue 290982d6bc Merge pull request #33224 from fabioy/fix-logrotate
Automatic merge from submit-queue

(GCI) Configure logrotate to rotate all .log files in /var/log.

Fixes logrotate configuration in GCI to rotate all "*.log" files in /var/log. 

Fixes issue #33223.
2016-09-21 20:01:35 -07:00
Vishnu kannan 7631b09baf remove cpu limits for dns pod. The current limits are not based on usage profiles
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-21 19:14:52 -07:00
Kubernetes Submit Queue 8c73e2bcbe Merge pull request #33125 from maisem/pin_gke_tests
Automatic merge from submit-queue

Setting the default image for GKE tests to Container_VM.

@vishh @spxtr @pwittrock

The purpose is to keep the current state of tests as is even if GKE changes the base image.
2016-09-21 18:02:15 -07:00
Fabio Yeon 177fee1358 (GCI) Configure logrotate to rotate all .log files in /var/log. 2016-09-21 15:29:14 -07:00
Dawn Chen f1f16fe03a Update the containervm image to the latest one (container-v1-3-v20160604). 2016-09-21 10:24:22 -07:00
Aditya Kali d54db34172 Reset core_patern on GCI
The default core_pattern pipes the core dumps to /sbin/crash_reporter
which is more restrictive in saving crash dumps. So for
now, set a generic core_pattern that users can work with.
2016-09-21 10:08:23 -07:00
Mik Vyatskov 3fbde5ecfb Fixed elasticsearch cluster logging e2e test on GCI 2016-09-21 13:55:43 +02:00
Jingtian Peng cee76a6f7d fix the appending bug 2016-09-21 16:36:08 +08:00
Kubernetes Submit Queue 01dd125b60 Merge pull request #33039 from colhom/fix-bad-var-name-gce
Automatic merge from submit-queue

gce/util: $replica-pd --> $replica_pd

cc @quinton-hoole @madhusudancs

fixes #32997
2016-09-20 22:22:16 -07:00
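
The rename in #33039 above matters because a shell variable name cannot contain a hyphen, so `$replica-pd` is read as `$replica` followed by the literal text `-pd`. Python's `string.Template` uses a similar identifier rule, so it can serve as a small analogy. This is only an illustration, not the code from `gce/util`, and the value `master-2` is hypothetical.

```python
from string import Template

# Hypothetical value; only "replica" is defined, as in the buggy case.
values = {"replica": "master-2"}

# The name ends at "-", so "replica" is substituted and "-pd" is kept
# as literal text. No error is raised, which makes the bug easy to miss.
hyphenated = Template("$replica-pd").substitute(values)
print(hyphenated)

# With an underscore, the whole name "replica_pd" is looked up instead.
missing = None
try:
    Template("$replica_pd").substitute(values)
except KeyError as err:
    missing = err.args[0]
print("undefined name:", missing)
```

The hyphenated form silently produces a value built from the wrong variable, whereas the underscore form refers to the single intended name.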
Kubernetes Submit Queue 6fd94968e1 Merge pull request #32738 from Amey-D/gci-version-v1.4
Automatic merge from submit-queue

Bump up GCI version.

```release-note
   Upgrading Container-VM base image for k8s on GCE. Brief changelog as follows:
    - Fixed performance regression in veth device driver
    - Docker and related binaries are statically linked
    - Fixed the issue of systemd being oom-killable
```

Fixes #32596

This needs a cherrypick into v1.4 release branch because it is fixing v1.4 release blocking issues. This patch is easy and safe to rollback in case of emergencies.

@vishh can you please review?

Fixes #32596 and many other issues.
cc/ @kubernetes/goog-image  FYI
2016-09-20 16:30:01 -07:00
Jeff Grafton 47e4573943 Print a more helpful error message when rolling-updates fail. 2016-09-20 15:31:57 -07:00
Maisem Ali 714983c9f3 Setting the default image for GKE tests to Container_VM. 2016-09-20 14:29:23 -07:00
Amey Deshpande 5da8486758 Bump up GCI version.
Brief changelog compared to gci-dev-54-8743-3-0:
- Fixed performance regression in veth device driver
- Docker and related binaries are statically linked
- Fixed the issue of systemd being oom-killable
- Updated built-in kubelet version to 1.3.7
- add ethtool and ebtables binaries expected by kubelet

Fixes #32596
2016-09-20 13:59:31 -07:00
Kubernetes Submit Queue 12ecc60833 Merge pull request #32264 from dshulyak/enable_hostpath_provisioner
Automatic merge from submit-queue

Enable hostpath provisioner for vagrant environment

This flag is required to run e2e tests for certain features (petset), and for manual tests and debugging.

related: https://github.com/kubernetes/kubernetes/issues/32119
2016-09-20 00:30:42 -07:00
Colin Hom acd7f5045d gce/util: $replica-pd --> $replica_pd
fixes #32997
2016-09-19 12:00:08 -07:00
Wojciech Tyczynski 8a942e65fd Show errors in tars_from_version 2016-09-19 16:26:07 +02:00
Kubernetes Submit Queue 87c2650038 Merge pull request #32873 from jszczepkowski/ha-delete-nodes2
Automatic merge from submit-queue

Implemented KUBE_DELETE_NODES flag in kube-down.

Implemented KUBE_DELETE_NODES flag in kube-down script.
It prevents removal of nodes when shutting down an HA master replica.
2016-09-19 01:08:18 -07:00
Kubernetes Submit Queue a5e35eb887 Merge pull request #32886 from freehan/bump-master-cidr
Automatic merge from submit-queue

bump master cidr range from /30 to /29

Fixes P1 item in the 1.4 milestone

ref: https://github.com/kubernetes/kubernetes/issues/32844
2016-09-17 11:27:46 -07:00
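
To make the size difference in #32886 above concrete, the sketch below counts the addresses in a /30 and a /29 block using Python's standard `ipaddress` module. The base address `10.0.0.0` is a hypothetical placeholder, not the range used by the cluster scripts, and the count includes every address in the block regardless of how many are reserved.

```python
import ipaddress


def address_count(prefix_len, base="10.0.0.0"):
    """Total number of addresses in base/prefix_len (base is hypothetical)."""
    network = ipaddress.ip_network(f"{base}/{prefix_len}")
    return network.num_addresses


old_size = address_count(30)  # 2 ** (32 - 30)
new_size = address_count(29)  # 2 ** (32 - 29)
print(old_size, new_size)
```

Each bit removed from the prefix doubles the block, so moving from /30 to /29 goes from 4 addresses to 8.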
Michael Taufen 2a536bf6f5 Revert "Merge pull request #31023 from vishh/gci-default"
This reverts PR #31023, which had made GCI the default node image for
open source. This revert makes container-vm the default for open source again.
2016-09-16 15:16:53 -07:00
Minhan Xia 879a2dcdbd bump master cidr range from /30 to /29 2016-09-16 13:41:58 -07:00
Kubernetes Submit Queue 9bc7e36f4b Merge pull request #32517 from luxas/fix_arm_ppc64le
Automatic merge from submit-queue

Use a patched golang version for building linux/arm

Fixes: #29904

Right now, linux/arm is broken because of an internal limitation in Go.
I've filed an issue for it here: https://github.com/golang/go/issues/17028

The binaries affected by this limitation are hyperkube and kube-apiserver, which are the largest binaries.

Now that we have a patched Go 1.7.1 version for building "unsupported" but important architectures (ref: https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/multi-platform.md), we should also include the patch for ppc64le and start building ppc64le again.

As soon as @laboger has the patch I need up on GitHub, I'll add ppc64le to this PR and we'll merge it.

TODO:
 - [ ] ~~Update the PR with patches for ppc64le at the same time @luxas~~
 - [x] Push the new kube-cross image @ixdy 
 - [x] Run a full `make release` before to verify nothing breaks @luxas + @ixdy 
 - [ ] Cherrypick into the 1.4 branch @luxas + (who?)

@lavalamp @smarterclayton @ixdy @rsc @davecheney @wojtek-t @jfrazelle @bradfitz @david-mcmahon @pwittrock
2016-09-16 12:52:17 -07:00
Wojciech Tyczynski 07476fa658 Copy rotated logs in e2e tests 2016-09-16 19:12:18 +02:00
Jerzy Szczepkowski 58c8992590 Implemented KUBE_DELETE_NODES flag in kube-down.
Implemented KUBE_DELETE_NODES flag in kube-down script.
It prevents removal of nodes when shutting down an HA master replica.
2016-09-16 16:51:52 +02:00
Devan Goodwin baebd7cfd9 Expand on kube-discovery API and integrate container build. 2016-09-16 11:37:04 -03:00
Kubernetes Submit Queue 60840140ab Merge pull request #31437 from jszczepkowski/ha-poc-debian2
Automatic merge from submit-queue

Implemented creation of HA master for GCE on debian.
2016-09-16 05:44:18 -07:00
Kubernetes Submit Queue 5a8d0a198c Merge pull request #32855 from wojtek-t/extend_logs_for_upgrade
Automatic merge from submit-queue

Extend logs for debugging upgrade test failures
2016-09-16 03:17:30 -07:00
Marek Grabowski 5fc62c2333 Merge pull request #32814 from bprashanth/kubeup
Retrieve username/password from basicauth section of kubeconfig
2016-09-16 11:41:17 +02:00
Wojciech Tyczynski ed88a03944 Extend logs for debugging upgrade test failures 2016-09-16 10:52:14 +02:00
Vishnu kannan ff5081cce5 support image type override for real in upgrade.sh script
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-15 23:16:44 -07:00
Random-Liu bb233e2249 Change the upgrade script to keep os distro during upgrade. 2016-09-15 21:14:40 -07:00