Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)
Bump GCI to gci-beta-56-9000-80-0
cc/ @Random-Liu @adityakali
Changelogs since gci-dev-56-8977-0-0 (currently used in Kubernetes):
- "net.ipv4.conf.eth0.forwarding" and "net.ipv4.ip_forward" may get reset to 0
- Track CVE-2016-9962 in Docker in GCI
- Linux kernel CVE-2016-7097
- Linux kernel CVE-2015-8964
- Linux kernel CVE-2016-6828
- Linux kernel CVE-2016-7917
- Linux kernel CVE-2016-7042
- Linux kernel CVE-2016-9793
- Linux kernel CVE-2016-7039 and CVE-2016-8666
- Linux kernel CVE-2016-8655
- Toolbox: allow docker image to be loaded from local tarball
- Update compute-image-package in GCI
- Change the product name on /etc/os-release (to COS)
- Remove 'dogfood' from HWID_OVERRIDE in /etc/lsb-release
- Include Google NVME extensions to optimize LocalSSD performance.
- /proc/<pid>/io missing on GCI (enables process stats accounting)
- Enable BLK_DEV_THROTTLING
cc/ @roberthbailey @fabioy for GKE cluster update
This is to facilitate GCI tip vs. K8s tip testing; we need to
dynamically set the version of GCI to stay current with their
latest canary (latest of the "gci-base" prefixed images).
There is a concern that some GCE users may be running automation that
(a) turns up ephemeral clusters and (b) always uses the latest K8s
release. If any of these workloads fall outside the set supported on
GCI, cutting the release will break the automation. We are therefore
delaying this change until we have provided sufficient warning.
Automatic merge from submit-queue
Add gcl cluster logging test
This PR changes the default logging destination for tests to GCP and adds a test for cluster logging using Google Cloud Logging.
Fixes #20760
Automatic merge from submit-queue
Update GCI_VERSION to gci-dev-55-8866-0-0
Update GCI base image:
Change log:
* Built-in kubernetes updated to v1.4.0
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools
* OpenSSL CVE fixes
```release-note
Update GCI base image:
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools (ebtables)
* OpenSSL CVE fixes
```
cc/ @kubernetes/goog-image cc/ @dchen1107
Automatic merge from submit-queue
Nodefs becomes imagefs on GCI
Kubelet cannot identify rootfs correctly
For #33444
```release-note
Enforce Disk based pod eviction with GCI base image in Kubelet
```
Signed-off-by: Vishnu kannan <vishnuk@google.com>
Automatic merge from submit-queue
Update the containervm image to the latest one (container-v1-3-v20160…
Node e2e is running with an old containervm image that only has Docker 1.9.1. This PR fixes that issue.
Automatic merge from submit-queue
Bump up GCI version.
```release-note
Upgrading Container-VM base image for k8s on GCE. Brief changelog as follows:
- Fixed performance regression in veth device driver
- Docker and related binaries are statically linked
- Fixed the issue of systemd being oom-killable
```
Fixes #32596
This needs a cherry-pick into the v1.4 release branch because it fixes v1.4 release-blocking issues. This patch is easy and safe to roll back in case of emergencies.
@vishh can you please review?
Fixes #32596 and many other issues.
cc/ @kubernetes/goog-image FYI
Brief changelog compared to gci-dev-54-8743-3-0:
- Fixed performance regression in veth device driver
- Docker and related binaries are statically linked
- Fixed the issue of systemd being oom-killable
- Updated built-in kubelet version to 1.3.7
- Added ethtool and ebtables binaries expected by kubelet
Fixes #32596
Automatic merge from submit-queue
Implemented KUBE_DELETE_NODES flag in kube-down.
Implemented KUBE_DELETE_NODES flag in kube-down script.
It prevents removal of nodes when shutting down an HA master replica.
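A minimal usage sketch, assuming KUBE_DELETE_NODES is consumed as an environment variable by the kube-down script:
```sh
# Tear down this HA master replica but keep the cluster's nodes running.
# KUBE_DELETE_NODES=false is assumed to skip the node deletion step.
KUBE_DELETE_NODES=false ./cluster/kube-down.sh
```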
Automatic merge from submit-queue
Enable kubelet eviction whenever inodes free is < 5% on GCE
This is a prerequisite for enabling inode-based evictions in GKE.
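For reference, the kubelet expresses this kind of threshold through its eviction flags; a hedged sketch of the equivalent setting (the exact GCE wiring of the default may differ):
```sh
# Hard-evict pods once the node filesystem has less than 5% of its inodes free.
kubelet --eviction-hard="nodefs.inodesFree<5%"
```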
Automatic merge from submit-queue
rkt: Update kube-up rkt version to v1.14.0
cc @kubernetes/sig-rktnetes
This should have been included in #31286 (whoops).
This is a bugfix that I propose for v1.4 inclusion.
Automatic merge from submit-queue
Enable Rescheduler by default
Rescheduler is stable: the e2e test has been passing consistently for more than a week.
ref #29023
```release-note
Rescheduler, which ensures that critical pods are always scheduled, is now enabled by default on GCE.
```
Automatic merge from submit-queue
Pick a specific GCI version by default on GCE.
Prior to this change, a K8s branch (master as well as release) was
pinned to a GCI milestone. It would pick up the latest GCI release on
that milestone at the time of cluster creation. The rationale was that
K8s users would automatically get the bug fixes in newer versions of
GCI. However, in practice it makes the runtime environment
non-deterministic, and the lack of continuous e2e tests means we would run
into breakages sooner or later.
With this change, each K8s release will pick a specific version
of GCI by default (similar to how the Debian-based container-vm gets used).
Users can override the default version through KUBE_GCE_MASTER_IMAGE and
KUBE_GCE_NODE_IMAGE environment variables.
We expect the default GCI version to be updated relatively frequently to stay
current with newer GCI releases. We can also automate the process of bumping
the hard-coded GCI version in the future.
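For example, the default can be overridden explicitly at cluster creation (the image name below is just the dev image already mentioned in this changelog):
```sh
# Override the pinned default GCI image for the master and nodes.
export KUBE_GCE_MASTER_IMAGE=gci-dev-56-8977-0-0   # example image, not a recommendation
export KUBE_GCE_NODE_IMAGE=gci-dev-56-8977-0-0
./cluster/kube-up.sh
```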
@vishh @adityakali can you please review?
cc @kubernetes/goog-image FYI
Automatic merge from submit-queue
Add admission controller for default storage class.
The admission controller adds a default class to PVCs that do not require any
specific class. This way, users (i.e., PVC authors) do not need to care about
storage classes; the administrator can configure a default one, and all PVCs
that do not specify a class will get the default one.
The default class is marked with the annotation "volume.beta.kubernetes.io/storage-class", which must be set to "true" to take effect. Any other value (or a missing annotation) makes the class non-default.
Based on @thockin's code; I added tests and made it not reject a PVC when no class is marked as default.
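A hedged sketch of how an administrator might mark a class as the default using the annotation described above; the provisioner and parameters are illustrative only:
```sh
# Create a StorageClass and mark it as the cluster default via the
# annotation checked by the admission controller.
cat <<'EOF' | kubectl create -f -
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: standard
  annotations:
    volume.beta.kubernetes.io/storage-class: "true"
provisioner: kubernetes.io/gce-pd   # illustrative provisioner
parameters:
  type: pd-standard
EOF
```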
@kubernetes/sig-storage
* Add a pillar for hostname (because even if there's a good Salt
function for it, I don't trust it to return the short hostname)
* Move INITIAL_ETCD_CLUSTER to just the GCE turn-up
* Remove the master_name, which isn't needed as a pillar
Automatic merge from submit-queue
Add Calico as policy provider in GCE
Adds Calico as a policy provider on GCE, enforcing the extensions/v1beta1 NetworkPolicy API.
Still to do:
- [x] Enable NetworkPolicy API when POLICY_PROVIDER is provided (see the sketch after this list).
- [x] Fix CNI plugin and policy controller versions.
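A hedged sketch of turning this on at cluster creation; treat the exact variable spelling below as an assumption taken from the checklist above, not a documented interface:
```sh
# Bring up a GCE cluster with Calico enforcing NetworkPolicy.
# POLICY_PROVIDER spelling is assumed from the description above.
POLICY_PROVIDER=calico ./cluster/kube-up.sh
```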
CC @thockin - does this general approach look good?
Automatic merge from submit-queue
Prep for continuous Docker validation test
```release-note
Add a test config variable to specify desired Docker version to run on GCI.
```
We want to continuously validate Docker releases (#25215) on GCI. This change
adds a new test config variable, `KUBE_GCI_DOCKER_VERSION`, through which we can
specify which version of Docker we want to run on the master and nodes. This
change also patches the Jenkins e2e-runner with the ability to fetch the latest
Docker (pre)release, and sets the aforementioned variable accordingly.
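For example (the version string is a placeholder):
```sh
# Run the GCI master and nodes with a specific Docker build during e2e.
export KUBE_OS_DISTRIBUTION=gci          # select the GCI image family
export KUBE_GCI_DOCKER_VERSION=1.11.2    # placeholder version
./cluster/kube-up.sh
```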
Tested on my local Jenkins instance, which was able to start a cluster with the latest Docker version (different from the installed version) running on both master and nodes.
@dchen1107 Can you review?
cc/ @andyzheng0831 for changes in `cluster/gce/gci/helper.sh`, and @ixdy @spxtr for changes to the Jenkins e2e-runner
cc/ @kubernetes/goog-image
Automatic merge from submit-queue
Re-enable node problem detector by default
Re-enable node problem detector started in gce cluster by default.
For now, on the master node, the node problem detector will be started and do nothing (see https://github.com/kubernetes/node-problem-detector/pull/13).
In practice, though, the master in my test cluster has no spare CPU to run the node problem detector, so it ends up running on all nodes except the master, which is what we want but not what was expected...
@dchen1107
/cc @kubernetes/sig-node
/cc @andyzheng0831 for the gci script change.
This change recovers some of the side effects of
https://github.com/kubernetes/kubernetes/pull/26197, i.e., it keeps
`NODE_IMAGE` and `NODE_IMAGE_PROJECT` defaulting to `MASTER_IMAGE` and
`MASTER_IMAGE_PROJECT` for backward compatibility. Although it keeps
`OS_DISTRIBUTION` defaulting to `gci`, the default settings of these vars are
moved to `cluster/gce/util.sh` and conditioned on `OS_DISTRIBUTION==gci`.
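A minimal sketch of that defaulting logic, with hypothetical variable wiring rather than the actual `cluster/gce/util.sh` code:
```sh
# Hypothetical wiring: default the node image/project to the master's,
# but only when the OS distribution is GCI.
OS_DISTRIBUTION="${KUBE_OS_DISTRIBUTION:-gci}"
if [[ "${OS_DISTRIBUTION}" == "gci" ]]; then
  NODE_IMAGE="${KUBE_GCE_NODE_IMAGE:-${MASTER_IMAGE}}"
  NODE_IMAGE_PROJECT="${KUBE_GCE_NODE_PROJECT:-${MASTER_IMAGE_PROJECT}}"
fi
```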
Automatic merge from submit-queue
Prepull images in e2e
Quick and dirty image puller because the SQ stalled multiple times just *today* on image pull flake (https://github.com/kubernetes/kubernetes/issues/25277).
@kubernetes/sig-node @kubernetes/sig-testing wdyt?
Automatic merge from submit-queue
Add node problem detector as an addon pod.
```release-note
Introduce a new add-on pod NodeProblemDetector.
NodeProblemDetector is a DaemonSet running on each node, monitoring node health and reporting
node problems as NodeCondition and Event. Currently it already supports kernel log monitoring, and
will support more problem detection in the future. It is enabled by default on GCE now.
```
This PR enables NodeProblemDetector as an add-on pod.
/cc @mikedanese @kubernetes/sig-node
Allow the gcr.io/google_containers registry to be overridden
regionally by just blasting a new KUBE_ADDON_REGISTRY out. Instead of
adding every addon to Salt and asking all of the other consumers
(Trusty, Juju, Mesos, etc) to change, just script the sed ourselves.
This is probably the 9th grossest thing I've ever done, but it works
well, and it works quickly. I kind of wish it didn't.
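A hedged sketch of what "script the sed ourselves" can look like; the manifest directory is an assumption:
```sh
# Rewrite the registry prefix in every addon manifest in place.
KUBE_ADDON_REGISTRY="${KUBE_ADDON_REGISTRY:-gcr.io/google_containers}"
find /etc/kubernetes/addons -name '*.yaml' -print0 \
  | xargs -0 sed -i "s@gcr.io/google_containers@${KUBE_ADDON_REGISTRY}@g"
```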
* In kube-up.sh, create a staging bucket with a location nearest the
zone being created. If new variable RELEASE_REGION_FALLBACK is set
(default false), create multiple buckets and stage to fallback
URLs. (In open source, this path is primarily for testing.)
* In configure-vm.sh, split the URL env variables by comma (if any
extra are present) and retry on the fallback URLs. Also factor the
hash checking into this path rather than outside, since a corrupt
release in a particular geo can be retried in a different geo (a sketch follows this list).
* Remove the local already-staged .tar.gz checks. They've caused
several issues along the way, and with this code path become virtually
unmaintainable. (I could add a sentinel for each bucket it's possibly
staged to, but ew.)
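A hedged sketch of the retry-with-fallback download described in the second bullet; the function and variable names are illustrative, not the actual configure-vm.sh code:
```sh
# Illustrative only: try each comma-separated URL in order and accept the
# first download whose SHA-1 matches the expected hash.
download_with_fallback() {
  local urls="$1" expected_sha1="$2" dest="$3"
  local -a candidates
  local url
  IFS=',' read -ra candidates <<< "${urls}"
  for url in "${candidates[@]}"; do
    if curl -fsSL "${url}" -o "${dest}" && \
       [[ "$(sha1sum "${dest}" | awk '{print $1}')" == "${expected_sha1}" ]]; then
      return 0
    fi
  done
  return 1
}
```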
Implement a flag that defines the frequency at which a node's out-of-disk
condition can change its status. Use this flag to suspend out-of-disk
status changes for the time period specified by the flag after
the status has changed once.
Set the flag to 0 in e2e tests so that we can predictably test the out-of-disk
node condition.
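A hedged sketch of how the flag might be used; the flag name below is an assumption based on the description, not quoted from the text:
```sh
# Hold a reported out-of-disk condition for 5 minutes before the kubelet
# may flip it again (flag name assumed for illustration).
kubelet --outofdisk-transition-frequency=5m

# In e2e, disable the dampening so the condition can change immediately.
kubelet --outofdisk-transition-frequency=0s
```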
Also, use the util.Clock interface for all time-related functionality in
the kubelet. Calling time functions from the unversioned or time
packages, such as unversioned.Now() or time.Now(), makes such code really hard
to test. It also makes the tests flaky and sometimes
unnecessarily slow due to time.Sleep() calls used to simulate
elapsed time. So use the util.Clock interface instead, which can be faked
in the tests.
This allows the resource usage monitoring test to launch 100 test pods per node, in
addition to the add-on pods.
Also reduce the test duration, since the results over the shorter period are
representative enough.
Addresses #15968
This patch removes KUBE_ENABLE_EXPERIMENTAL_API and similar calls in
favor of specifying desired features in KUBE_RUNTIME_CONFIG. Changes
have also been made to e2e scripts to re-enable using
KUBE_RUNTIME_CONFIG rather than EXPERIMENTAL_API env vars.
This also introduces KUBE_ENABLE_DAEMONSETS and KUBE_ENABLE_DEPLOYMENTS.
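For example, the desired features are now requested directly (the group/version strings below are illustrative for that era of the API):
```sh
# Enable DaemonSets and Deployments (extensions/v1beta1 resources at the time)
# through the generic runtime-config mechanism.
export KUBE_RUNTIME_CONFIG="extensions/v1beta1/daemonsets=true,extensions/v1beta1/deployments=true"
./cluster/kube-up.sh
```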
Signed-off-by: Christian Stewart <christian@paral.in>
When KUBE_E2E_STORAGE_TEST_ENVIRONMENT is set to 'true', the kube-up.sh script
will:
- Install the right packages for all storage volumes.
- Use devicemapper as the docker storage backend. 'aufs', the default one on
Debian, does not support the extended attributes required by Ceph RBD and Gluster
server containers.
Tested on GCE and Vagrant; the e2e tests for storage volumes pass without any
additional configuration.
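Usage is a single variable on top of the normal kube-up flow:
```sh
# Bring up a cluster prepared for the storage e2e tests: volume client packages
# installed and docker switched to the devicemapper storage backend.
KUBE_E2E_STORAGE_TEST_ENVIRONMENT=true ./cluster/kube-up.sh
```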
OpenContrail is open-source networking software that provides virtualization support for the cloud.
This change-set adds the ability to install and provision OpenContrail software for networking in a Kubernetes-based cloud environment.
There are basically three components:
- kube-network-manager -- plugin between Contrail components and Kubernetes components
- provision_master.sh -- OpenContrail software installer and provisioner on the master node
- provision_minion.sh -- OpenContrail software installer and provisioner on the minion node(s)
These are driven via Salt configuration files.
One can provision OpenContrail by just setting "export NETWORK_PROVIDER=opencontrail".
Optionally, OPENCONTRAIL_TAG and OPENCONTRAIL_KUBERNETES_TAG can be used to
specify the opencontrail and contrail-kubernetes software versions to install and provision.
The public-IP subnet provided by Contrail can be configured via the OPENCONTRAIL_PUBLIC_SUBNET
environment variable.
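A hedged example pulling those variables together (the tag values and subnet are placeholders, not recommendations):
```sh
# Provision OpenContrail as the network provider; tags and subnet are placeholders.
export NETWORK_PROVIDER=opencontrail
export OPENCONTRAIL_TAG=R2.20                  # placeholder release tag
export OPENCONTRAIL_KUBERNETES_TAG=master      # placeholder release tag
export OPENCONTRAIL_PUBLIC_SUBNET=10.1.0.0/16  # placeholder public-IP subnet
./cluster/kube-up.sh
```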
At this moment, the plan is to add support for AWS, GCE, and Vagrant based platforms.
For more information on contrail-kubernetes, please visit https://github.com/juniper/contrail-kubernetes. For more information on OpenContrail, please visit http://www.opencontrail.org