Commit Graph

233 Commits (319989854190e10364e83f051b6fcd7118cb5750)

Author SHA1 Message Date
Kubernetes Submit Queue 7fbb458f6d Merge pull request #40213 from jszczepkowski/ha-e2e-tests
Automatic merge from submit-queue (batch tested with PRs 39260, 40216, 40213, 40325, 40333)

Fixed propagation of kube master certs during master replication.

Fixed propagation of kube-master-certs during master replication.
2017-01-24 16:26:02 -08:00
Mike Danese 513994a9f8 pass CA key to signer in GCE 2017-01-20 11:10:19 -08:00
Jerzy Szczepkowski d1a73fa5cd Fixed propagation of kube master certs during master replication.
Fixed propagation of kube master certs during master replication.
2017-01-20 13:24:09 +01:00
Kubernetes Submit Queue 6dfe5c49f6 Merge pull request #38865 from vwfs/ext4_no_lazy_init
Automatic merge from submit-queue

Enable lazy initialization of ext3/ext4 filesystems

**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #30752, fixes #30240

**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```

**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.

These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount

Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```

The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.

When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system. 

As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory. 

I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.

As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
2017-01-18 09:09:52 -08:00
Jordan Liggitt d94bb26776
Conditionally write token file entries 2017-01-13 17:59:46 -05:00
Jordan Liggitt 968b0b30cf
Update token users if needed 2017-01-11 17:21:12 -05:00
Jordan Liggitt 21b422fccc
Allow enabling ABAC authz 2017-01-11 17:20:51 -05:00
Jordan Liggitt 1fe517e96a
Include admin in super-user group 2017-01-11 17:20:42 -05:00
Kubernetes Submit Queue ebc8e40694 Merge pull request #39691 from yujuhong/bump_timeout
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)

Bump container-linux and gci timeout for docker health check

The command `docker ps` can take longer time to respond under heavy load or
when encountering some known issues. In these cases, the containers are running
fine, so aggressive health check could cause serious disruption. Bump the
timeout to 60s to be consistent with the debian-based containerVM.

This addresses #38588
2017-01-10 21:25:16 -08:00
Yu-Ju Hong 4e87973a9b Bump container-linux and gci timeout for docker health check
The command `docker ps` can take longer time to respond under heavy load or
when encountering some known issues. In these cases, the containers are running
fine, so aggressive health check could cause serious disruption. Bump the
timeout to 60s to be consistent with the debian-based containerVM.
2017-01-10 13:07:21 -08:00
Mike Danese 3ab0e37cc6 implement upgrades 2017-01-04 11:45:57 -08:00
CJ Cullen d0997a3d1f Generate a kubelet CA and kube-apiserver cert-pair for kubelet auth.
Plumb through to kubelet/kube-apiserver on gci & cvm.
2017-01-03 14:30:45 -08:00
deads2k ecd23a0217 remove abac authorizer from e2e 2017-01-03 07:53:03 -05:00
Kubernetes Submit Queue 274a9f0f70 Merge pull request #38927 from luxas/remove_maintainer
Automatic merge from submit-queue

Remove all MAINTAINER statements in the codebase as they are deprecated

**What this PR does / why we need it**:
ref: https://github.com/docker/docker/pull/25466

**Release note**:

```release-note
Remove all MAINTAINER statements in Dockerfiles in the codebase as they are deprecated by docker
```
@ixdy @thockin (who else should be notified?)
2016-12-29 16:41:24 -08:00
deads2k 19391164b9 add additional e2e rbac bindings to match existing users 2016-12-21 16:24:45 -05:00
deads2k 2e2a2e4b94 update gce for RBAC, controllers, proxy, kubelet (p1) 2016-12-21 13:51:49 -05:00
Alexander Block 13a2bc8afb Enable lazy initialization of ext3/ext4 filesystems 2016-12-18 11:08:51 +01:00
Euan Kemp e2644bb442 cluster/gce: copy gci -> coreos
This is for reviewing ease as the following commits introduce changes
to make the coreos kube-up deployment share significant code with the
gci code.
2016-12-17 21:36:30 -08:00
Lucas Käldström 3c5b5f5963 Remove all MAINTAINER statements in the codebase as they aren't very useful and now deprecated 2016-12-17 20:34:10 +02:00
Piotr Szczesniak a52637f09f Migrated fluentd to daemon set 2016-12-15 13:48:32 +01:00
Amey Deshpande 5ec42e6a25 Ensure the GCI metadata files do not have whitespace at the end
Fixes #36708
2016-12-13 13:41:54 -08:00
Zihong Zheng 4ad06df18f Renames kube-dns configure files from skydns* to kubedns* 2016-12-08 20:01:19 -08:00
Kubernetes Submit Queue f2014abf6f Merge pull request #36778 from cjcullen/basicauth
Automatic merge from submit-queue (batch tested with PRs 38294, 37009, 36778, 38130, 37835)

Only configure basic auth on gci if KUBE_USER and KUBE_PASSWORD are specified.

This should not change the existing flow when KUBE_USER/KUBE_PASSWORD are specified.
It makes not specifying those a valid option that means "don't turn on basic auth".
I only did it for cluster/gce/gci for now, but others should be somewhat similar.
2016-12-07 10:45:18 -08:00
Kubernetes Submit Queue 97ae7ccb56 Merge pull request #31647 from mikedanese/register-tainted
Automatic merge from submit-queue

add a configuration for kubelet to register as a node with taints

and deprecate --register-schedulable

ref #28687 #29178

cc @dchen1107 @davidopp @roberthbailey
2016-12-06 19:07:54 -08:00
Kubernetes Submit Queue 65ed735d4f Merge pull request #38124 from kubernetes/Dec/gluster
Automatic merge from submit-queue

Fix GCI mounter issue
2016-12-06 16:21:06 -08:00
Mike Danese e225625a80 add a configuration for kubelet to register as a node with taints
and deprecate register-schedulable
2016-12-06 10:32:54 -08:00
Kubernetes Submit Queue 9d7644286d Merge pull request #37664 from euank/fix-gci-typo
Automatic merge from submit-queue (batch tested with PRs 37870, 36643, 37664, 37545)

cluster/gci: Fix typo
2016-12-06 00:22:56 -08:00
Jing Xu 3a1cf2d52a Fix GCI mounter script to run garbage collection multiple times
Remove break in the mounter script to make sure gc run multiple times
2016-12-05 10:17:54 -08:00
gmarek aef56cdf21 Increase max mutating inflight requests in large clusters 2016-12-05 09:33:05 +01:00
Kubernetes Submit Queue ce4af7f0b5 Merge pull request #37941 from Crassirostris/fluentd-gcp-config-unification
Automatic merge from submit-queue (batch tested with PRs 37692, 37785, 37647, 37941, 37856)

Use unified gcp fluentd image for gci and cvm

Follow-up of https://github.com/kubernetes/kubernetes/pull/37681

Actually unify the pod specs for CVM and GCI, to simplify the configuration

CC @piosz
2016-12-03 11:45:02 -08:00
Kubernetes Submit Queue 2cdb97d413 Merge pull request #37593 from yujuhong/gci_rm_docker_network
Automatic merge from submit-queue

GCI: Remove /var/lib/docker/network
2016-12-01 13:24:22 -08:00
Daniel Smith 5b1d875f27 Revert "Modify GCI mounter to enable NFSv3" 2016-12-01 11:47:24 -08:00
Mik Vyatskov 74a3b77c73 Use unified gcp fluentd image for gci and cvm 2016-12-01 17:29:27 +01:00
Kubernetes Submit Queue 1570aad238 Merge pull request #37451 from jszczepkowski/ha-read-quorum
Automatic merge from submit-queue

Added setting etcd read quorum flag
2016-12-01 06:31:24 -08:00
Kubernetes Submit Queue 6c2c12fafa Merge pull request #37582 from jingxu97/Nov/retrynfsv3
Automatic merge from submit-queue

Modify GCI mounter to enable NFSv3
2016-11-30 21:59:08 -08:00
Kubernetes Submit Queue 85ff555954 Merge pull request #31617 from jsafrane/default-storage-class
Automatic merge from submit-queue

Deploy a default StorageClass instance on AWS and GCE

This needs a newer kubectl in kube-addons-manager container. It's quite tricky to test as I cannot push new container image to gcr.io and I must copy the newer container manually.

cc @kubernetes/sig-storage

**Release note**:
```release-note
Kubernetes now installs a default StorageClass object when deployed on AWS, GCE and
OpenStack with kube-up.sh scripts. This StorageClass will automatically provision
a PeristentVolume in corresponding cloud for a PersistentVolumeClaim that cannot be
satisfied by any existing matching PersistentVolume in Kubernetes.

To override this default provisioning, administrators must manually delete this default StorageClass.
```
2016-11-29 20:52:01 -08:00
Euan Kemp 5c6e2aaef9 cluster/gci: Fix typo 2016-11-29 16:03:35 -08:00
Jing Xu 80f2e58ccc Modify GCI mounter to enable NFSv3
This PR is a retry for PR #36610
2016-11-29 10:50:33 -08:00
Yu-Ju Hong 47c3b05fa3 GCI: Remove /var/lib/docker/network
This avoids running into corrupt network checkpoint issues.
2016-11-28 17:58:43 -08:00
Jerzy Szczepkowski 02542cae06 Added setting etcd read quorum flag.
Added setting etcd read quorum flag in kube-up scripts. Required for HA master.
2016-11-25 13:53:11 +01:00
Robert Bailey 60dbfc9a71 Fix an else branch in configure-helper.sh. 2016-11-23 00:42:06 -08:00
Kubernetes Submit Queue e801fcfc4a Merge pull request #36610 from jingxu97/Nov/nfsv3
Automatic merge from submit-queue

Modify GCI mounter to enable NFSv3

In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.

After this change, need to make push the image and update the sha number in Changelog.
2016-11-22 23:38:51 -08:00
Jerzy Szczepkowski d01998f5fa Fixed e2e tests for HA master.
Set of fixes that allows HA master e2e tests to pass for removal/addition of master replicas.
2016-11-22 12:03:28 +01:00
Jing Xu 2a8d89e5d1 Modify GCI mounter to enable NFSv3
In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.

After this change, need to make push the image and update the sha number in Changelog.
2016-11-21 16:42:40 -08:00
Jan Safranek b52d971aee stash 2016-11-21 10:16:29 +01:00
CJ Cullen 8af7fc6f00 Only configure basic auth on gci if KUBE_USER & KUBE_PASSWORD are specified.
Knock out the garbage {{kube_user}} abac line when KUBE_USER isn't specified.
2016-11-14 18:58:56 -08:00
Jerzy Szczepkowski ab7266bf19 SSL certificates for etcd cluster.
Added generation of SSL certificates for etcd cluster internal
communication. Turned on on gci & trusty.
2016-11-10 15:26:03 +01:00
Kubernetes Submit Queue 1014bc411a Merge pull request #36346 from jszczepkowski/ha-masterip
Automatic merge from submit-queue

Change master to advertise external IP in kubernetes service.

Change master to advertise external IP in kubernetes service.
In effect, in HA mode in case of multiple masters, IP of external load
balancer will be advertise in kubernetes service.
2016-11-10 05:00:48 -08:00
Kubernetes Submit Queue c98fc70195 Merge pull request #36008 from MrHohn/addon-rc-migrate
Automatic merge from submit-queue

Migrates addons from RCs to Deployments

Fixes #33698.

Below addons are being migrated:
- kube-dns
- GLBC default backend
- Dashboard UI
- Kibana

For the new deployments, the version suffixes are removed from their names. Version related labels are also removed because they are confusing and not needed any more with regard to how Deployment and the new Addon Manager works.

The `replica` field in `kube-dns` Deployment manifest is removed for the incoming DNS horizontal autoscaling feature #33239.

The `replica` field in `Dashboard` Deployment manifest is also removed because the rescheduler e2e test is manually scaling it.

Some resource limit related fields in `heapster-controller.yaml` are removed, as they will be set up by the `addon resizer` containers. Detailed reasons in #34513.

Three e2e tests are modified:
- `rescheduler.go`: Changed to resize Dashboard UI Deployment instead of ReplicationController.
- `addon_update.go`: Some namespace related changes in order to make it compatible with the new Addon Manager.
- `dns_autoscaling.go`: Changed to examine kube-dns Deployment instead of ReplicationController.

Both of above two tests passed on my own cluster. The upgrade process --- from old Addons with RCs to new Addons with Deployments --- was also tested and worked as expected.

The last commit upgrades Addon Manager to v6.0. It is still a work in process and currently waiting for #35220 to be finished. (The Addon Manager image in used comes from a non-official registry but it mostly works except some corner cases.)

@piosz @gmarek could you please review the heapster part and the rescheduler test?

@mikedanese @thockin 

cc @kubernetes/sig-cluster-lifecycle 

---

Notes:
- Kube-dns manifest still uses *-rc.yaml for the new Deployment. The stale file names are preserved here for receiving faster review. May send out PR to re-organize kube-dns's file names after this.
- Heapster Deployment's name remains in the old fashion(with `-v1.2.0` suffix) for avoiding describe this upgrade transition explicitly. In this way we don't need to attach fake apply labels to the old Deployments.
2016-11-10 02:36:38 -08:00
Rajat Ramesh Koujalagi d81e216fc6 Better messaging for missing volume components on host to perform mount 2016-11-09 15:16:11 -08:00