Automatic merge from submit-queue
Exit image puller subshell
Exit the subshell with 0 so that, even if the last docker pull fails, the pod doesn't end up in an error state.
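A minimal sketch of the pattern (image names are examples, not the actual puller list):
```
(
  docker pull gcr.io/google_containers/pause:2.0
  docker pull gcr.io/google_containers/fluentd-elasticsearch:1.12
  exit 0  # exit the subshell with 0 so a failed last pull doesn't fail the pod
)
```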
Automatic merge from submit-queue
Enable support for memory eviction configuration via salt
Added memory-based evictions by default whenever the available memory drops below 100Mi.
Updated GCE and GCI.
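A hedged sketch of what this amounts to on the kubelet command line (the flag shown is the standard eviction threshold flag; the salt plumbing is not shown):
```
# Evict pods when available node memory drops below 100Mi.
kubelet --eviction-hard="memory.available<100Mi" ...
```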
Automatic merge from submit-queue
Bump cluster autoscaler version and enable scale down by default
Follow up of https://github.com/kubernetes/contrib/pull/1148.
cc: @piosz @fgrzadkowski @jszczepkowski
Automatic merge from submit-queue
Add collection of the new glbc and cluster-autoscaler logs
I've incremented the version numbers by 2 to avoid conflicting with #26652. I'll make sure the potential conflict between the images gets resolved reasonably.
cc @piosz @bprashanth @aledbf
Automatic merge from submit-queue
Switch DNS addons from skydns to kubedns
Change GCI and trusty cluster-helper scripts to use kubedns instead of skydns.
Unified the skydns templates using a simple underscore-based template, and
added sed transform scripts that turn it into the salt and sed YAML
templates.
Moved all content out of cluster/addons/dns into build/kube-dns and
saltbase/salt/kube-dns
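For illustration, the transform works roughly like this (the placeholder and file names below are examples, not necessarily the exact ones in the tree):
```
# Render a concrete manifest from the underscore-based template.
sed -e "s/__PILLAR__DNS__DOMAIN__/cluster.local/g" \
    -e "s/__PILLAR__DNS__SERVER__/10.0.0.10/g" \
    skydns-rc.yaml.in > skydns-rc.yaml
```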
Automatic merge from submit-queue
Add node problem detector as an addon pod.
```release-note
Introduce a new add-on pod NodeProblemDetector.
NodeProblemDetector is a DaemonSet running on each node, monitoring node health and reporting
node problems as NodeCondition and Event. Currently it supports kernel log monitoring, and
more problem detection will be added in the future. It is now enabled by default on GCE.
```
This PR enables NodeProblemDetector as an add-on pod.
/cc @mikedanese @kubernetes/sig-node
Automatic merge from submit-queue
Configuration for GCP webhook authentication and authorization
This PR adds configuration for GCP webhook authentication and authorization in ContainerVM and GCI. The changes to configure-vm.sh and kube-apiserver.manifest are copied directly from @cjcullen's PRs #25380 and #25296. The change in the GCI script configure-helper.sh adds support for webhook authentication and authorization, along with some refactoring to improve readability.
@cjcullen @roberthbailey @zmerlynn please review it. The original PRs are P1, please mark this as P1.
cc/ @fabioy @kubernetes/goog-image FYI.
I verified it by running e2e tests on a GCI cluster. Without the GCI-side change, cluster creation fails, as captured by the GKE Jenkins tests. I didn't test the case where the two env vars GCP_AUTHN_URL and GCP_AUTHZ_URL are set, because they are only set in GKE. After this PR is merged, @cjcullen will test it in GKE.
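For reference, a hedged sketch of the apiserver side of this configuration (the file paths are illustrative; the flags are the standard webhook authn/authz flags):
```
kube-apiserver \
  --authentication-token-webhook-config-file=/etc/gcp_authn.config \
  --authorization-mode=Webhook \
  --authorization-webhook-config-file=/etc/gcp_authz.config \
  ...
```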
Automatic merge from submit-queue
Salt configuration for the new Cluster Autoscaler for GCE
Adds support for the cloud autoscaler from contrib/cloud-autoscaler to the kube-up.sh GCE script.
cc: @fgrzadkowski @piosz
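For illustration, enabling it through kube-up would look roughly like this (the variable names and bounds are shown as examples and may differ from the final script):
```
KUBE_ENABLE_CLUSTER_AUTOSCALER=true \
KUBE_AUTOSCALER_MIN_NODES=1 \
KUBE_AUTOSCALER_MAX_NODES=5 \
./cluster/kube-up.sh
```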
Automatic merge from submit-queue
Openstack provider
Our pull request delivers a solution for creating a Kubernetes cluster on top of OpenStack. The OpenStack Heat orchestration engine describes the infrastructure for the Kubernetes cluster. CentOS images are used for the Kubernetes host machines.
We tested our solution with DevStack and the Citycloud provider.
We believe our solution fills a gap in the market.
Automatic merge from submit-queue
Add an entry to the salt config to allow Debian jessie on GCE.
```release-note
Add an entry to the salt config to allow Debian jessie on GCE.
As with the existing Wheezy image on GCE, docker is expected
to already be installed in the image.
```
CentOS 7 Core nodes running on OpenStack with an SSL-enabled API
endpoint fail with the following error without this patch:
F0425 19:00:58.124520 5 server.go:100] Cloud provider could not be initialized: could not init cloud provider "openstack": Post https://my.openstack.cloud:5000/v2.0/tokens: x509: failed to load system roots and no roots provided
The root cause is that the ca-bundle.crt file is actually a symlink
pointing into a directory that wasn't previously exposed.
[root@kubernetesstack-master ~]# ls -l /etc/ssl/certs/ca-bundle.crt
lrwxrwxrwx. 1 root root 49 18 nov 11:02 /etc/ssl/certs/ca-bundle.crt -> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
[root@kubernetesstack-master ~]#
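The underlying idea, sketched with a plain docker invocation (the actual change is in the salt-managed manifests; the image name is illustrative): both the bundle path and the symlink's target directory have to be visible inside the container.
```
docker run \
  -v /etc/ssl/certs:/etc/ssl/certs:ro \
  -v /etc/pki:/etc/pki:ro \
  kube-controller-manager-image ...
```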
Assuming that the person running kube-up has their OpenStack
environment set up, those same variables are passed into Heat, and then
into openstack.conf.
The salt codebase was modified to add openstack as well.
Automatic merge from submit-queue
Switch to ABAC authorization from AllowAll
Switch from AllowAll to ABAC. All existing identities (that are created by deployment scripts) are given full permissions through ABAC. Manually created identities will need policies added to the `policy.jsonl` file on the master.
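Illustrative only: appending a full-access policy line for a manually created identity might look roughly like this (the user name and file path are examples; the line uses the v1beta1 ABAC policy format):
```
cat >> /srv/kubernetes/policy.jsonl <<'EOF'
{"apiVersion": "abac.authorization.kubernetes.io/v1beta1", "kind": "Policy", "spec": {"user": "alice", "namespace": "*", "resource": "*", "apiGroup": "*"}}
EOF
```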
Automatic merge from submit-queue
don't source the kube-env in addon-manager
This was added in 2feb658ed7, became unused after #23603, but was never removed.
Automatic merge from submit-queue
Initial kube-up support for VMware's Photon Controller
This is for: https://github.com/kubernetes/kubernetes/issues/24121
Photon Controller is an open-source cloud management platform. More
information is available at:
http://vmware.github.io/photon-controller/
This commit provides initial support for Photon Controller. The
following features are tested and working:
- kube-up and kube-down
- Basic pod and service management
- Networking within the Kubernetes cluster
- UI and DNS addons
It has been tested with a Kubernetes cluster of up to 10
nodes. Further work on scaling is planned for the near future.
Internally we have implemented continuous integration testing and will
run it multiple times per day against the Kubernetes master branch
once this is integrated so we can quickly react to problems.
A few things have not yet been implemented, but are planned:
- Support for kube-push
- Support for test-build-release, test-setup, test-teardown
Assuming this is accepted for inclusion, we will write documentation
for the kubernetes.io site.
We have included a script to help users configure Photon Controller
for use with Kubernetes. While not required, it will help some
users get started more quickly. It will be documented.
We are aware of the kube-deploy efforts and will track them and
support them as appropriate.
Automatic merge from submit-queue
add HOME env variable for kube-addons service
Fix https://github.com/kubernetes/kubernetes/issues/23973.
Briefly, the systemd service does not know the `HOME` environment variable, which causes kubectl to write the schema file into `/.kube` when it is expected to be in `/root/.kube`.
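One way to express the fix, sketched as a systemd drop-in (unit and file names are illustrative; the actual change may set the variable directly in the unit file instead):
```
mkdir -p /etc/systemd/system/kube-addons.service.d
cat > /etc/systemd/system/kube-addons.service.d/10-home.conf <<'EOF'
[Service]
Environment="HOME=/root"
EOF
systemctl daemon-reload
```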
Automatic merge from submit-queue
add labels to kube component static pods
```
$ k --namespace=kube-system get po -l 'tier in (control-plane)'
NAME READY STATUS RESTARTS AGE
kube-apiserver-k-7-master 1/1 Running 2 1m
kube-controller-manager-k-7-master 1/1 Running 1 1m
kube-scheduler-k-7-master 1/1 Running 0 54s
$ k --namespace=kube-system get po -l 'tier in (node)'
NAME READY STATUS RESTARTS AGE
kube-proxy-k-7-minion-eheu 1/1 Running 0 1m
kube-proxy-k-7-minion-mwo9 1/1 Running 0 1m
kube-proxy-k-7-minion-xw6m 1/1 Running 0 1m
```
cc @bgrant0607 @thockin @gmarek
Fixes #21267
Automatic merge from submit-queue
don't ship kube-registry-proxy and pause images in tars.
pause is built into containervm; if it's not on the machine we should just pull
it. Nobody that I'm aware of uses kube-registry-proxy, and it makes build/deployment
more complicated and slower.
Files are taken from cluster/network-plugins/{bin,conf} to be consumed within a vagrant kube-up.sh environment.
The paths used for configuration files and the 'cni' name of the network provider are all from the Kubernetes documentation: use of NETWORK_PROVIDER=cni is documented as usable (as are its effects on the runtime args of kubelet), but the actual implementation in the salt automation doesn't seem to exist.
This change attempts to fix that for the vagrant use case.
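For illustration, the vagrant flow should then work roughly as follows (the kubelet flags shown are the documented effect of NETWORK_PROVIDER=cni):
```
KUBERNETES_PROVIDER=vagrant NETWORK_PROVIDER=cni ./cluster/kube-up.sh
# kubelet on the nodes then runs with roughly:
#   --network-plugin=cni --network-plugin-dir=/etc/cni/net.d
```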
Automatic merge from submit-queue
use apply instead of create to setup namespaces and tokens in addon manager
When the addon manager restarts, it takes ~15 minutes (1000 seconds) to start the sync loop because it retries creation of the namespace and tokens 100 times; create fails if the tokens already exist. Just use apply.
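A sketch of why apply avoids the retries (the file name is illustrative):
```
kubectl create -f kube-system-ns.yaml   # fails with AlreadyExists on a restart
kubectl apply -f kube-system-ns.yaml    # idempotent: creates or updates as needed
```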
Automatic merge from submit-queue
Create a new Deployment in kube-system for every version.
It appears that version numbers have already been properly added to these files. Small change to delete an old deployment entirely, so we can make a new one per version (like replication controllers).
We'll want to change this back once the kube-addons support deployments in a later version.
This should allow the non_masquerade_cidr option to be configured
in /etc/salt/minion.d/grains.conf, allowing the flag to be used by kubelet
in /etc/sysconfig/kubelet. The default configuration is set in the pillar.
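A hedged sketch of where the value ends up (the exact variable name in /etc/sysconfig/kubelet may differ; 10.0.0.0/8 is the usual default):
```
# /etc/sysconfig/kubelet (sketch): the grain value is rendered into the kubelet args.
KUBELET_OPTS="... --non-masquerade-cidr=10.0.0.0/8 ..."
```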
It includes some performance improvements for parsing JSON (which is
very important for us, since all Docker logs are JSON), as well as a
couple of new settings, like forcing a flush of multiline logs after a
time period rather than having to wait until a new log is seen before
confidently flushing the previous one.
- Remove CPU limits to enable CPU bursting once 1.2 begins enforcing CPU limits.
- Add a memory limit for fluentd-es to match fluentd-gcp.
- Explicitly set requests to match limits.
This change revises the way kube-system manifests are provided for clusters on Trusty. Originally, we maintained copies of some manifests under cluster/gce/trusty/kube-manifests, which is not scalable and is hard to maintain. With this change, clusters on Trusty will use the same source of manifests as ContainerVM. This change also fixes some minor problems, such as shell variables and comments, to better meet the style guidance.
Starting docker through Salt has always been problematic. Kubelet or
the babysitter process should start it. We've kept it around primarily
so we have a `service: docker` node for the Salt DAG.
Instead, we enable (but do not start) the Docker service in Salt. This
lets us keep the DAG node, but won't start it.
There's another bug in Salt, where watches will start the service even
on `service.enabled`. So we remove the watches, and move them to our
existing Salt bug-fix script.
The Docker 1.9.1 package on Debian is broken: the service fails to
start when the package is installed unattended. This is treated as an
installation failure and causes everything to fail.
However, the service can be started by Salt once we're not installing
the package, and indeed we restart docker anyway.
So, on Debian, use a helper script to install the docker package. The
script sets up a policy-rc.d file to prevent the service starting, and
then cleanly removes it afterwards (this would be difficult to do in
Salt, I believe).
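A minimal sketch of the helper-script approach, assuming the docker-engine package name (the real script may differ in its details):
```
# Tell invoke-rc.d not to start any service during package installation.
cat > /usr/sbin/policy-rc.d <<'EOF'
#!/bin/sh
exit 101
EOF
chmod +x /usr/sbin/policy-rc.d

apt-get install -y docker-engine   # package name is illustrative

# Clean up so normal service management works again afterwards.
rm -f /usr/sbin/policy-rc.d
```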
For the vSphere provider, only the Docker 1.9.1 release is currently supported.
Older versions of Docker fail on jessie due to https://github.com/docker/docker/issues/18793,
and the newer 1.10.x versions are not properly tested.
In commit 07d7cfd3, a check on ${DEBUG} == "true" was added to
cluster/saltbase/salt/generate-cert/make-ca-cert.sh, but no default value for
DEBUG was set. That commit sets DEBUG in cluster/ubuntu/util.sh, where it calls
this script. When the script is used from saltstack to bring up a cluster on
other cloud platforms, it fails to generate the cert, since make-ca-cert.sh
uses set -o nounset and DEBUG is not set. Setting a default value for DEBUG
here fixes the problem.
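In essence, the fix is to give DEBUG a default before it is referenced, so `set -o nounset` no longer aborts when the caller never exports it; a sketch:
```
set -o nounset

# Default DEBUG to false when the caller (e.g. a salt state) doesn't set it.
DEBUG=${DEBUG:-false}

if [[ "${DEBUG}" == "true" ]]; then
  set -x
fi
```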
This is good because it removes an obstacle to using the
cluster/ubuntu scripting to install Kubernetes into a restricted
environment where the machines cannot open connections to arbitrary
external locations.
Also adds debuggability to make-ca-cert.sh.
Resolves #21037, resolves #21092
This change support running kubernetes master on Ubuntu Trusty.
It uses pure cloud-config and shell scripts, and completely gets
rid of saltstack or the release salt tarball.
To fix it, I just add an openssl dependency to the "generate-cert" state. It
should work on Debian-like and RedHat-like systems (and Arch Linux,
openSUSE, etc.).
The error before the fix:
$ sudo salt 'kubernetes-master' state.apply
----------
ID: kubernetes-cert
Function: cmd.script
Result: False
Comment: Command 'kubernetes-cert' run
Started: 06:57:06.634203
Duration: 208.719 ms
Changes:
----------
pid:
793
retcode:
1
stderr:
/tmpm24T3R.sh: line 22: openssl: command not found
chgrp: cannot access '/srv/kubernetes/server.key': No such file or directory
chgrp: cannot access '/srv/kubernetes/server.cert': No such file or directory
chmod: cannot access '/srv/kubernetes/server.key': No such file or directory
chmod: cannot access '/srv/kubernetes/server.cert': No such file or directory
stdout:
After applying my patch (success):
----------
ID: kubernetes-cert
Function: cmd.script
Result: True
Comment: Command 'kubernetes-cert' run
Started: 07:17:04.172384
Duration: 1041.092 ms
Changes:
----------
pid:
1045
retcode:
0
stderr:
Generating a 4096 bit RSA private key
......................................................................++
...............................................................................++
writing new private key to '/srv/kubernetes/server.key'
-----
stdout:
----------
The version of Salt we're running doesn't do a good job of detecting
systemd. Inspired by https://github.com/saltstack/salt/issues/13926,
I added a provider-force to the services.
With this change, salt-call -l debug state.highstate succeeds, even for
repeated invocations.
The issue was (probably) benign, but definitely caused noise (e.g. #11297).
I got the package name wrong before, which meant that salt was failing
on invocations after the first (the name apparently doesn't matter on
the first invocation).
We've had a lot of salt problems with systemd on AWS; we have a
workaround in place that we use everywhere else, and we should use it for
kube-node-unpacker too.
Fixes #19386
Issue #19388
To build the python image, BUILD_PYTHON_IMAGE should be set during make.
When the addon script runs, it checks whether python is installed on the
machine; if not, it uses the python image built previously.
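A hedged sketch of that check (the image name and helper script are hypothetical; the real addon script may differ):
```
# Prefer the host python; fall back to the prebuilt python image if it's missing.
if python -V >/dev/null 2>&1; then
  PYTHON="python"
else
  PYTHON="docker run --rm python-image"   # image name is illustrative
fi
${PYTHON} addon-helper.py                 # hypothetical helper invocation
```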
When KUBE_E2E_STORAGE_TEST_ENVIRONMENT is set to 'true', the kube-up.sh script
will:
- Install the right packages for all storage volumes.
- Use devicemapper as the docker storage backend. 'aufs', the default one on
Debian, does not support the extended attributes required by the Ceph RBD and
Gluster server containers.
Tested on GCE and Vagrant; e2e tests for storage volumes pass without any
additional configuration.
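For illustration, a hedged sketch of how the toggle is used (the docker flag shown is the standard storage-driver flag; the exact plumbing through kube-up is not shown):
```
# Bring up a cluster prepared for the storage e2e tests.
KUBE_E2E_STORAGE_TEST_ENVIRONMENT=true ./cluster/kube-up.sh
# On the nodes, docker then runs with devicemapper instead of aufs, e.g.:
#   docker daemon --storage-driver=devicemapper
```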
This ensures nfs-common is installed on GCE, and provides a more
functional explanation/example. I launched two replication controllers
so that there were busybox pods to poke around at the NFS volume, and
so that the later wget actually works (the original example would have
to work on the node, or need some other access to the container
network). After switching to two controllers, it actually makes more
sense to use PV claims, and it's probably a configuration that makes
more sense as an indirection layer for NFS anyway.
We want to match the version of netcat that is installed on GCE. We
were having problems with netcat-openbsd having slightly different
timeout behaviour (on UDP packets; when there was no listener).
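A hedged sketch of pinning the matching netcat on a Debian-based image (these are the standard package and alternatives names):
```
apt-get install -y netcat-traditional
update-alternatives --set nc /bin/nc.traditional
```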
This scopes down the initially ambitious PR:
https://github.com/kubernetes/kubernetes/pull/14960 to replace just
`pause` and `fluentd-elasticsearch` to come through `beta.gcr.io`.
The v2 versions have been pushed under new tags, `pause:2.0` and
`fluentd-elasticsearch:1.12`.
NOTE: `beta.gcr.io` will still serve images using v1 until they are repushed with v2. Pulls through `gcr.io` will still work after pushing through `beta.gcr.io`, but will be served over v1 (via compat logic).
OpenContrail is open-source networking software which provides virtualization support for the cloud.
This change-set adds the ability to install and provision the OpenContrail software for networking in a Kubernetes-based cloud environment.
There are basically 3 components:
o kube-network-manager -- plugin between contrail components and kubernetes components
o provision_master.sh -- OpenContrail software installer and provisioner for the master node
o provision_minion.sh -- OpenContrail software installer and provisioner for the minion node(s)
These are driven via salt configuration files.
One can provision opencontrail by just setting "export NETWORK_PROVIDER=opencontrail".
Optionally, OPENCONTRAIL_TAG and OPENCONTRAIL_KUBERNETES_TAG can be used to
specify the opencontrail and contrail-kubernetes software versions to install
and provision. The public-IP subnet provided by contrail can be configured via
the OPENCONTRAIL_PUBLIC_SUBNET environment variable.
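For illustration, the environment variables named above would be set along these lines before kube-up (the tag and subnet values are placeholders):
```
export NETWORK_PROVIDER=opencontrail
export OPENCONTRAIL_TAG=R2.20                  # placeholder version
export OPENCONTRAIL_KUBERNETES_TAG=master      # placeholder version
export OPENCONTRAIL_PUBLIC_SUBNET=10.1.0.0/16  # placeholder subnet
./cluster/kube-up.sh
```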
At this moment, the plan is to add support for aws, gce and vagrant based platforms.
For more information on contrail-kubernetes, please visit https://github.com/juniper/contrail-kubernetes. For more information on opencontrail, please visit http://www.opencontrail.org.
changes to fluent-plugin-google-cloud to attach Kubernetes metadata to
logs.
Along with this, separate logs from containers in the cluster from
logs from the daemons running on the node by instantiating two instances
of the output plugin: one that uses the new metadata (for containers)
and one that doesn't (for things like docker and the kubelet).
* Using Fedora 21 as the base box
* Discover the active network interfaces in the box to avoid hardcoding
them in configuration.
* Use the master IP for the certificate.
gunk when installing the google-fluentd agent.
Also let it log things by not redirecting to a file within the container
and only using -q (warning logs only) rather than -qq (error logs only).
This registry can be accessed through proxies that run on each node
listening on port 5000. We send the proxy images to the nodes directly
to avoid requests that hit the network during cluster launch. For now,
we continue to pull the registry itself over the network, especially
given its large size (we should be able to dramatically shrink the
image). On GCE we create a PD and use that for storage, otherwise we
use an emptyDir. The registry is not enabled outside of GCE. All
communication is currently plain HTTP. In order to use SSL, we will
need to be able to request a certificate/key from the apiserver signed
by the apiserver's CA cert.
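As a usage sketch (the image name is an example), a node or pod can push and pull through the node-local proxy on port 5000 rather than going out over the network:
```
docker tag busybox localhost:5000/my-busybox
docker push localhost:5000/my-busybox
docker pull localhost:5000/my-busybox
```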