Commit Graph

5505 Commits (5ba21e83b9f3c50dab929ed6eb2ad5147fb0ff75)

Author SHA1 Message Date
Kubernetes Submit Queue 750d5c3bc5 Merge pull request #41561 from jamiehannaford/fix-multiple-swift-urls
Automatic merge from submit-queue

Ensure only 1 Swift URL is used in cluster operations

**What this PR does / why we need it**:

Extracts only 1 Swift URL if multiple are returned from Keystone.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:

https://github.com/kubernetes/kubernetes/issues/34930

**Special notes for your reviewer**:

**Release note**:
```release-note
Heat cluster operations now support environments that have multiple Swift URLs
```
2017-04-12 02:16:28 -07:00
Kubernetes Submit Queue 640c67792f Merge pull request #44363 from bowei/use-auto-net
Automatic merge from submit-queue

Use auto mode networks instead of legacy networks in GCP

Use of the --range flag creates legacy networks in GCP.

Legacy networks will not support new GCP features.

```release-note
NONE
```
2017-04-11 22:57:20 -07:00
Kubernetes Submit Queue ceccd305ce Merge pull request #42147 from bowei/ip-alias-2
Automatic merge from submit-queue

Add support for IP aliases for pod IPs (GCP alpha feature)

```release-note
Adds support for allocation of pod IPs via IP aliases.

# Adds KUBE_GCE_ENABLE_IP_ALIASES flag to the cluster up scripts (`kube-{up,down}.sh`).

KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes. This feature is currently
only available on GCE.

## Usage
$ CLUSTER_IP_RANGE=10.100.0.0/16 KUBE_GCE_ENABLE_IP_ALIASES=true bash -x cluster/kube-up.sh

# Adds CloudAllocator to the node CIDR allocator (kubernetes-controller manager).

If CIDRAllocatorType is set to `CloudCIDRAllocator`, then allocation
of CIDR allocation instead is done by the external cloud provider and
the node controller is only responsible for reflecting the allocation
into the node spec.

- Splits off the rangeAllocator from the cidr_allocator.go file.
- Adds cloudCIDRAllocator, which is used when the cloud provider allocates
  the CIDR ranges externally. (GCE support only)
- Updates RBAC permission for node controller to include PATCH
```
2017-04-11 22:09:24 -07:00
Bowei Du 079505023f Use auto mode networks instead of legacy networks in GCP
Use of the --range flag creates legacy networks in GCP.
2017-04-11 14:36:17 -07:00
Bowei Du 345c65847f Add KUBE_GCE_ENABLE_IP_ALIASES flag to the cluster turn up scripts.
KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes.

NODE_IP_RANGE will control the node instance IP cidr
KUBE_GCE_IP_ALIAS_SIZE controls the size of each podCIDR
IP_ALIAS_SUBNETWORK controls the name of the subnet created for the cluster
2017-04-11 14:07:50 -07:00
Kubernetes Submit Queue b9a5a5c9b3 Merge pull request #42748 from dcbw/cfssl-localup
Automatic merge from submit-queue (batch tested with PRs 43866, 42748)

hack/cluster: download cfssl if not present

hack/local-up-cluster.sh uses cfssl to generate certificates and
will exit it cfssl is not already installed.  But other cluster-up
mechanisms (GCE) that generate certs just download cfssl if not
present.  Make local-up-cluster.sh do that too so users don't have
to bother installing it from somewhere.
2017-04-10 14:27:11 -07:00
Kubernetes Submit Queue 8d173c96ad Merge pull request #44178 from opsnull/master
Automatic merge from submit-queue

fix kubedns-sa.yaml missing "namespace: kube-system" value

The file kubedns-sa.yaml  missing `namespace: kube-system`,  so it will create ServiceAccount kube-dns in default namespace, this will cause kube-dns deployment's pods be blocked forever;

Some logs as following:

>     - lastTransitionTime: 2017-04-06T19:02:12Z
>       lastUpdateTime: 2017-04-06T19:02:12Z
>       message: 'unable to create pods: pods "kube-dns-699984412-" is forbidden: service
>         account kube-system/kube-dns was not found, retry after the service account

**Release note**:

```release-note
NONE
```
2017-04-07 00:18:43 -07:00
Kubernetes Submit Queue 0653751fb4 Merge pull request #44169 from mikedanese/fix
Automatic merge from submit-queue (batch tested with PRs 42025, 44169, 43940)

if we have a dedicated serviceaccount keypair, use it to verify serviceaccounts
2017-04-06 17:00:20 -07:00
Mike Danese e2d7e2c866 make salt return non-zero exit code on failure 2017-04-06 13:57:33 -07:00
opsnull 7978ad17a9 fix kubedns-sa.yaml missing "namespace: kube-system" value 2017-04-07 03:52:51 +08:00
Mike Danese ffcbe213c1 if we have a dedicated serviceaccount keypair, use it to verify serviceaccounts 2017-04-06 11:06:25 -07:00
Kubernetes Submit Queue b41e415ebd Merge pull request #43137 from shashidharatd/federation-domain
Automatic merge from submit-queue

[Federation] Remove FEDERATIONS_DOMAIN_MAP references

Remove all references to FEDERATIONS_DOMAIN_MAP as this method is no longer is used and is replaced by adding federation domain map to kube-dns configmap.

cc @madhusudancs @kubernetes/sig-federation-pr-reviews 

**Release note**:
```
[Federation] Mechanism of adding `federation domain maps` to kube-dns deployment via `--federations` flag is superseded by adding/updating `federations` key in `kube-system/kube-dns` configmap. If user is using kubefed tool to join cluster federation, adding federation domain maps to kube-dns is already taken care by `kubefed join` and does not need further action.
```
2017-04-06 02:05:42 -07:00
Kubernetes Submit Queue 0f10d6ccf2 Merge pull request #43996 from ncdc/proxy-shared-informers
Automatic merge from submit-queue

Use shared informers for proxy endpoints and service configs

Use shared informers instead of creating local controllers/reflectors
for the proxy's endpoints and service configs. This allows downstream
integrators to pass in preexisting shared informers to save on memory &
cpu usage.

This also enables the cache mutation detector for kube-proxy for those
presubmit jobs that already turn it on.

Follow-up to #43295 cc @wojtek-t 

Will race with #43937 for conflicting changes 😄 cc @thockin 

cc @smarterclayton @sttts @liggitt @deads2k @derekwaynecarr @eparis @kubernetes/rh-cluster-infra
2017-04-05 06:52:25 -07:00
Kubernetes Submit Queue 3b8e327924 Merge pull request #44049 from crassirostris/fluentd-es-remove-toleration
Automatic merge from submit-queue

Remove toleration from fluentd-elasticsearch

Fix https://github.com/kubernetes/kubernetes/issues/43795
Address comments from https://github.com/kubernetes/kubernetes/issues/42983

Fluentd-es doesn't work on master anyway, because it has no access to k8s services
2017-04-05 06:03:47 -07:00
Mik Vyatskov 30f22ad683 Remove toleration from fluentd-elasticsearch 2017-04-05 11:27:14 +02:00
Kubernetes Submit Queue 3a3dc827e4 Merge pull request #43467 from tvansteenburgh/gpu-support
Automatic merge from submit-queue (batch tested with PRs 44047, 43514, 44037, 43467)

Juju: Enable GPU mode if GPU hardware detected

**What this PR does / why we need it**:

Automatically configures kubernetes-worker node to utilize GPU hardware when such hardware is detected.

layer-nvidia-cuda does the hardware detection, installs CUDA and Nvidia
drivers, and sets a state that the k8s-worker can react to.

When gpu is available, worker updates config and restarts kubelet to
enable gpu mode. Worker then notifies master that it's in gpu mode via
the kube-control relation.

When master sees that a worker is in gpu mode, it updates to privileged
mode and restarts kube-apiserver.

The kube-control interface has subsumed the kube-dns interface
functionality.

An 'allow-privileged' config option has been added to both worker and
master charms. The gpu enablement respects the value of this option;
i.e., we can't enable gpu mode if the operator has set
allow-privileged="false".

**Special notes for your reviewer**:

Quickest test setup is as follows:
```bash
# Bootstrap. If your aws account doesn't have a default vpc, you'll need to
# specify one at bootstrap time so that juju can provision a p2.xlarge.
# Otherwise you can leave out the --config "vpc-id=vpc-xxxxxxxx" bit.
juju bootstrap --config "vpc-id=vpc-xxxxxxxx" --constraints "cores=4 mem=16G root-disk=64G" aws/us-east-1 k8s

# Deploy the bundle containing master and worker charms built from
# https://github.com/tvansteenburgh/kubernetes/tree/gpu-support/cluster/juju/layers
juju deploy cs:~tvansteenburgh/bundle/kubernetes-gpu-support-3

# Setup kubectl locally
mkdir -p ~/.kube
juju scp kubernetes-master/0:config ~/.kube/config
juju scp kubernetes-master/0:kubectl ./kubectl

# Download a gpu-dependent job spec
wget -O /tmp/nvidia-smi.yaml https://raw.githubusercontent.com/madeden/blogposts/master/k8s-gpu-cloud/src/nvidia-smi.yaml

# Create the job
kubectl create -f /tmp/nvidia-smi.yaml

# You should see a new nvidia-smi-xxxxx pod created
kubectl get pods

# Wait a bit for the job to run, then view logs; you should see the
# nvidia-smi table output
kubectl logs $(kubectl get pods -l name=nvidia-smi -o=name -a)
```

kube-control interface: https://github.com/juju-solutions/interface-kube-control
nvidia-cuda layer: https://github.com/juju-solutions/layer-nvidia-cuda
(Both are registered on http://interfaces.juju.solutions/)

**Release note**:
```release-note
Juju: Enable GPU mode if GPU hardware detected
```
2017-04-04 14:33:26 -07:00
Kubernetes Submit Queue 95289ff239 Merge pull request #42518 from mtanino/issue/42517
Automatic merge from submit-queue

get-kube-local.sh checks pods with option "--namespace=kube-system"

**What this PR does / why we need it**:

Local cluster creation using get-kube-local.sh is never finished.
The get-kube-local.sh monitors running_count of pods such as etcd,
master and kube-proxy, but these pods are created under the namespace
kube-system. Therefore kubectl can't find these pods then cluster
creation isn't completed.

The get-kube-local.sh should monitor created pods with option
"--namespace=kube-system".

**Which issue this PR fixes** : fixes #42517




**Release note**: 

```
`NONE`
```
2017-04-04 13:22:45 -07:00
Kubernetes Submit Queue ae57772988 Merge pull request #44017 from justinsb/permissions_log_dump
Automatic merge from submit-queue

cluster/log-dump - chmod files before dumping

We make the files world-readable, so that installation techniques that
lock down the logfiles can still be dumped.

Issue https://github.com/kubernetes/test-infra/issues/2397

```release-note
NONE
```
2017-04-04 09:52:25 -07:00
Andy Goldstein d2bc4d0b2e Use shared informers for proxy endpoints and service configs
Use shared informers instead of creating local controllers/reflectors
for the proxy's endpoints and service configs. This allows downstream
integrators to pass in preexisting shared informers to save on memory &
cpu usage.

This also enables the cache mutation detector for kube-proxy for those
presubmit jobs that already turn it on.
2017-04-04 12:51:41 -04:00
Kubernetes Submit Queue 12fbc9083e Merge pull request #43625 from mbruzek/cdk-load-balancer-update
Automatic merge from submit-queue

Adding more proxy options and header to nginx load-balancer.

**What this PR does / why we need it**: The kubeapi-load-balancer uses nginx to proxy commands to the kube-apiserver. It currently does not support SPDY and therefore the `kubectl exec` command is broken.

**Which issue this PR fixes** : 
fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/226
fixes https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/201

**Special notes for your reviewer**: This only changes the nginx configuration no code change was required.

**Release note**:
```release-note
Using http2 in kubeapi-load-balancer to fix kubectl exec uses
```
2017-04-04 08:03:44 -07:00
Dan Williams f20437a822 hack/cluster: download cfssl if not present
hack/local-up-cluster.sh uses cfssl to generate certificates and
will exit it cfssl is not already installed.  But other cluster-up
mechanisms (GCE) that generate certs just download cfssl if not
present.  Make local-up-cluster.sh do that too.
2017-04-03 23:31:16 -05:00
Justin Santa Barbara f506dfe1ea cluster/log-dump - chmod files before dumping
We make the files world-readable, so that installation techniques that
lock down the logfiles can still be dumped.

Issue https://github.com/kubernetes/test-infra/issues/2397
2017-04-03 21:41:24 -04:00
Kubernetes Submit Queue d1dd73e9f6 Merge pull request #42668 from ixdy/build-silence-docker-rmi
Automatic merge from submit-queue (batch tested with PRs 42379, 42668, 42876, 41473, 43260)

Silence error messages from the docker rmi call we expect to fail

**What this PR does / why we need it**: when we removed `docker tag -f` in #34361 we added a bunch of `docker rmi` calls to preserve behavior for older docker versions. That step is usually a no-op, however, and results in confusing messages like
```
Tagging docker image gcr.io/google_containers/kube-proxy:c8d0b2e7a06b451117a8ac58fc3bb3d3 as gcr.io/kubernetes-release-test/kube-proxy-amd64:v1.5.4
Error response from daemon: No such image: gcr.io/kubernetes-release-test/kube-proxy-amd64:v1.5.4
```

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42665

**Special notes for your reviewer**: I could probably remove the `docker rmi` calls entirely, though I don't know if folks are still using docker < 1.10. (I think Jenkins still has 1.9.1.)

**Release note**:

```release-note
NONE
```

cc @jessfraz
2017-03-30 23:36:24 -07:00
Dan Williams b3705b6e35 hack/cluster: consolidate cluster/ utils to hack/lib/util.sh
Per Clayton's suggestion, move stuff from cluster/lib/util.sh to
hack/lib/util.sh.  Also consolidate ensure-temp-dir and use the
hack/lib/util.sh implementation rather than cluster/common.sh.
2017-03-30 22:34:46 -05:00
Kubernetes Submit Queue 7ff948ce32 Merge pull request #43643 from rmmh/redis
Automatic merge from submit-queue (batch tested with PRs 43726, 43643)

Make a smaller redis image for testing, based on Alpine.

**What this PR does / why we need it**:
This shrinks gcr.io/google_containers/redis from 400MB to 5MB, which should reduce flakes.

**Which issue this PR fixes**:
fixes #43631

**Release note**:
```release-note
NONE
```
2017-03-29 17:23:19 -07:00
Kubernetes Submit Queue b020fb1fda Merge pull request #43726 from vishh/local-ssd-gce
Automatic merge from submit-queue

Add support for local ssds in GCE

For #43640
2017-03-29 16:56:27 -07:00
Kubernetes Submit Queue 060ea9ca7b Merge pull request #42617 from MrHohn/dns-autoscaler-rbac
Automatic merge from submit-queue

Moves dns-horizontal-autoscaler to a separate service account

Similar to #38816.

As one of the cluster add-ons, dns-horizontal-autoscaler is now using the default service account in kube-system namespace, which is introduced by https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/e2e-rbac-bindings/random-addon-grabbag.yaml for the ease of transition. This default service account will be removed in the future.

This PR subdivides dns-horizontal-autoscaler to a separate service account and setup the necessary permissions.

@bowei 

**Release note**:

```release-note
NONE
```
2017-03-29 15:43:10 -07:00
Vishnu kannan 937bac940a add support for local ssds in GCE
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-03-29 15:06:58 -07:00
Ryan Hitchman 4816ba9898 Make a smaller redis image for testing, based on Alpine.
This shrinks redis from 400MB to 5MB, which should reduce flakes.
2017-03-28 16:18:00 -07:00
Kubernetes Submit Queue 23104b714c Merge pull request #42467 from chentao1596/change-etcd-version
Automatic merge from submit-queue (batch tested with PRs 43518, 42467)

install/kube-up: fix some errors while install k8s through kube-up/down.sh

What this PR does / why we need it:

     etcd2.3.1 will be installed follow this scripts, but k8s use etcd3 as default storage backend, so the next error will always be apprear: 
     API server: rpc error: code = 13 desc = transport is closing
     so i think we should change the version of etcd

    thank you!
2017-03-28 14:09:22 -07:00
Kubernetes Submit Queue be4452cfce Merge pull request #42994 from Shawyeok/features/full-tls-etcd-cluster
Automatic merge from submit-queue

Centos provider: generate SSL certificates for etcd cluster.

**What this PR does / why we need it**:
Support secure etcd cluster for centos provider, generate SSL certificates for etcd in default. Running it w/o SSL is exposing cluster data to everyone and is not recommended. [#39462](https://github.com/kubernetes/kubernetes/pull/39462#issuecomment-271601547)

/cc @jszczepkowski @zmerlynn 

**Release note**:
```release-note
Support secure etcd cluster for centos provider.
```
2017-03-28 09:02:26 -07:00
Marcin Wielgus b08e6f6297 Bump cluster autoscaler to 0.5.1 2017-03-28 13:17:47 +02:00
Kubernetes Submit Queue b30fe32a66 Merge pull request #43381 from aleksandra-malinowska/stackdriver-config
Automatic merge from submit-queue (batch tested with PRs 43681, 40423, 43562, 43008, 43381)

Add stackdriver monitoring option
2017-03-27 12:49:29 -07:00
Kubernetes Submit Queue 8dfc939345 Merge pull request #43681 from ethernetdan/proto-upgrade-prompt
Automatic merge from submit-queue

added prompt warning if etcd3 media type isn't set during upgrade

**What this PR does / why we need it**:
This adds a prompt confirming the upgrade when `STORAGE_MEDIA_TYPE` is not explicitly set. This is to prevent users from accidentally upgrading to protobuf.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 
Alongs with docs, addresses #43669

**Special notes for your reviewer**:
Should be cherrypicked onto `release-1.6`

**Release note**:
```release-note
NONE
```
2017-03-27 12:10:31 -07:00
Dan Gillespie d7a552c188 in storage media upgrade prompt, provide config for using protobuf 2017-03-27 16:46:38 +01:00
Dan Gillespie 031dd569ac etcd upgrade warning: add docs link, fixed etcd2 behavior, print non-interactive 2017-03-27 16:13:11 +01:00
Dan Gillespie d0bbb941fd added prompt warning if etcd3 media type isn't set during upgrade 2017-03-27 13:47:09 +01:00
Jerzy Szczepkowski 27b8e1f518 Added failing upgrade if there are many master replicas.
Added failing upgrade (GCE) if there are many master replicas. Releated
to #43688.
2017-03-27 14:29:06 +02:00
Aleksandra Malinowska a737fec50b Add stackdriver monitoring option 2017-03-27 12:10:04 +02:00
Kubernetes Submit Queue e6453c7880 Merge pull request #42638 from jamiehannaford/minion-fip
Automatic merge from submit-queue (batch tested with PRs 41297, 42638, 42666, 43039, 42567)

Allow minion floating IPs to be optional

**What this PR does / why we need it**:

Makes the generation of floating IPs for worker nodes optional, based on an env var. To quote the original issue:

> Currently, the OpenStack installation method assigns a floating IP to every single worker node. While this is fine for smaller clusters with a good sized IP pool, it can cause issues in environments with high node counts or less IPs available.

**Which issue this PR fixes**:

https://github.com/kubernetes/kubernetes/issues/40737

**Special notes for your reviewer**:

I used the conditions section of the Heat spec: https://docs.openstack.org/developer/heat/template_guide/hot_spec.html#conditions-section

**Release note**:
```release-note
OpenStack clusters can now specify whether worker nodes are assigned a floating IP
```
2017-03-25 18:15:21 -07:00
Kubernetes Submit Queue 1251280236 Merge pull request #43624 from liggitt/legacy-abac-test
Automatic merge from submit-queue (batch tested with PRs 43048, 43624, 43649)

Remove E2E_UPGRADE_TEST check in config-test.sh

Once https://github.com/kubernetes/test-infra/pull/2330 merges, the upgrade tests will drive the exact behavior they want, and we can remove the check for envvars leaked from the job env
2017-03-25 13:29:23 -07:00
Jeff Grafton e39978e6bf Update a few regex patterns to support release candidates 2017-03-24 14:38:04 -07:00
Kubernetes Submit Queue 53d14e9a4c Merge pull request #43609 from Random-Liu/update-npd-rbac
Automatic merge from submit-queue

Update NPD rbac.

I recently enabled NPD in gke.

However, I found that in gke e2e test (https://k8s-testgrid.appspot.com/google-gke#gci-gke), npd on the node could not talk with apiserver, and reported full of following errors:
```
E0324 05:08:26.745545    1328 manager.go:160] failed to update node conditions: the server does not allow access to the requested resource (patch nodes gke-bootstrap-e2e-default-pool-fd91d792-mqh4)
E0324 05:08:37.719423    1328 manager.go:160] failed to update node conditions: the server does not allow access to the requested resource (patch nodes gke-bootstrap-e2e-default-pool-fd91d792-mqh4)
E0324 05:08:47.719694    1328 manager.go:160] failed to update node conditions: the server does not allow access to the requested resource (patch nodes gke-bootstrap-e2e-default-pool-fd91d792-mqh4)
```

I created a GKE cluster (v1.7.0-alpha.0.1483+1e879c69ecf09e) myself, and found that addon manager could not create npd binding with the following error:
```
error: error validating "/etc/kubernetes/addons/node-problem-detector/standalone/npd-binding.yaml": error validating data: couldn't find type: v1alpha1.ClusterRoleBinding; if you choose to ignore these errors, turn validation off with --validate=false
```

I found that rbac was updated to beta, but npd was missed because it was merged after 9e6a3496b4 (diff-b05c70853d9a772b310db71a61297841).

I updated rbac to beta in the master manifest and npd on the node could talk with apiserver immediately.
We must get this in 1.6 to make NPD working. @dchen1107 

@dchen1107 @fabioy @liggitt
2017-03-24 11:27:42 -07:00
Kubernetes Submit Queue ba63cb4538 Merge pull request #42903 from krousey/owners
Automatic merge from submit-queue

Remove krousey from some OWNERS files
2017-03-24 10:26:40 -07:00
Kubernetes Submit Queue f5d3126fca Merge pull request #42035 from timchenxiaoyu/enableerror
Automatic merge from submit-queue

enable error

enable word error
2017-03-24 10:25:13 -07:00
Kubernetes Submit Queue ff353231ec Merge pull request #42102 from timchenxiaoyu/kubltworderror
Automatic merge from submit-queue

kubelet word mistake
2017-03-24 10:25:06 -07:00
Jordan Liggitt eb45dc9eb9
Remove E2E_UPGRADE_TEST check in config-test.sh 2017-03-24 10:14:20 -04:00
Random-Liu 1e51b907bb Update NPD rbac. 2017-03-23 23:07:55 -07:00
shawyeok c692b55b57 Centos provider: generate SSL certificates for etcd cluster.
Making download-cfssl reusable.

Extract generate-etcd-cert method up to common.sh.
2017-03-24 09:15:57 +08:00
Matt Bruzek 71f583ebe4 Adding more proxy options and header to nginx load-balancer. 2017-03-23 16:14:02 -05:00