
# End-to-End Testing in Kubernetes

Updated: 5/3/2016

**Table of Contents**
<!-- BEGIN MUNGE: GENERATED_TOC -->

- [End-to-End Testing in Kubernetes](#end-to-end-testing-in-kubernetes)
  - [Overview](#overview)
  - [Building and Running the Tests](#building-and-running-the-tests)
    - [Cleaning up](#cleaning-up)
  - [Advanced testing](#advanced-testing)
    - [Bringing up a cluster for testing](#bringing-up-a-cluster-for-testing)
    - [Debugging clusters](#debugging-clusters)
    - [Local clusters](#local-clusters)
      - [Testing against local clusters](#testing-against-local-clusters)
  - [Kinds of tests](#kinds-of-tests)
    - [Conformance tests](#conformance-tests)
    - [Defining Conformance Subset](#defining-conformance-subset)
  - [Continuous Integration](#continuous-integration)
    - [What is CI?](#what-is-ci)
    - [What runs in CI?](#what-runs-in-ci)
      - [Non-default tests](#non-default-tests)
    - [The PR-builder](#the-pr-builder)
    - [Adding a test to CI](#adding-a-test-to-ci)
    - [Moving a test out of CI](#moving-a-test-out-of-ci)
  - [Performance Evaluation](#performance-evaluation)
  - [One More Thing](#one-more-thing)

<!-- END MUNGE: GENERATED_TOC -->

## Overview

End-to-end (e2e) tests for Kubernetes provide a mechanism to test end-to-end
behavior of the system, and are the last signal to ensure end-user operations
match developer specifications. Although unit and integration tests provide a
good signal, in a distributed system like Kubernetes it is not uncommon for a
minor change to pass all unit and integration tests but cause unforeseen
changes at the system level.

The primary objectives of the e2e tests are to ensure consistent and reliable
behavior of the Kubernetes code base, and to catch hard-to-test bugs before
users do, when unit and integration tests are insufficient.

The e2e tests in Kubernetes are built atop
[Ginkgo](http://onsi.github.io/ginkgo/) and
[Gomega](http://onsi.github.io/gomega/). There are a host of features that this
Behavior-Driven Development (BDD) testing framework provides, and it is
recommended that the developer read the documentation prior to diving into the
tests.

The purpose of *this* document is to serve as a primer for developers who are
looking to execute or add tests using a local development environment.

Before writing new tests or making substantive changes to existing tests, you
should also read [Writing Good e2e Tests](writing-good-e2e-tests.md).

## Building and Running the Tests

There are a variety of ways to run e2e tests, but we aim to decrease the number
of ways to run e2e tests to a canonical way: `hack/e2e.go`.

You can run an end-to-end test which will bring up a master and nodes, perform
some tests, and then tear everything down. Make sure you have followed the
getting started steps for your chosen cloud platform (which might involve
changing the `KUBERNETES_PROVIDER` environment variable to something other than
"gce").

To build Kubernetes, bring up a cluster, run tests, and tear everything down, use:

```sh
go run hack/e2e.go -v --build --up --test --down
```

If you'd like to just perform one of these steps, here are some examples:

```sh
# Build binaries for testing
go run hack/e2e.go -v --build

# Create a fresh cluster. Deletes a cluster first, if it exists
go run hack/e2e.go -v --up

# Test if a cluster is up.
go run hack/e2e.go -v --isup

# Push code to an existing cluster
go run hack/e2e.go -v --push

# Push to an existing cluster, or bring up a cluster if it's down.
go run hack/e2e.go -v --pushup

# Run all tests
go run hack/e2e.go -v --test

# Run tests matching the regex "\[Feature:Performance\]"
go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Feature:Performance\]"

# Conversely, exclude tests that match the regex "Pods.*env"
go run hack/e2e.go -v --test --test_args="--ginkgo.skip=Pods.*env"

# Run tests in parallel, skip any that must be run serially
GINKGO_PARALLEL=y go run hack/e2e.go --v --test --test_args="--ginkgo.skip=\[Serial\]"

# Flags can be combined, and their actions will take place in this order:
# --build, --push|--up|--pushup, --test|--tests=..., --down
#
# You can also specify an alternative provider, such as 'aws'
#
# e.g.:
KUBERNETES_PROVIDER=aws go run hack/e2e.go -v --build --pushup --test --down

# -ctl can be used to quickly call kubectl against your e2e cluster. Useful for
# cleaning up after a failed test or viewing logs. Use -v to avoid suppressing
# kubectl output.
go run hack/e2e.go -v -ctl='get events'
go run hack/e2e.go -v -ctl='delete pod foobar'
```
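One detail worth calling out from the examples above: `--ginkgo.focus` and `--ginkgo.skip` take regular expressions, which is why the literal brackets in labels such as `[Feature:Performance]` are backslash-escaped. A quick illustration of the matching behavior, simulated with `grep` over hypothetical spec names:

```sh
# Ginkgo's --ginkgo.focus/--ginkgo.skip values are regexes, so label
# brackets must be escaped to match literally. Spec names below are
# hypothetical stand-ins for real e2e spec descriptions.
spec_a='Density [Feature:Performance] should allow starting 30 pods per node'
spec_b='Pods should get a host IP [Conformance]'

# "focus"-style selection: only the labeled spec matches.
printf '%s\n%s\n' "$spec_a" "$spec_b" | grep -c '\[Feature:Performance\]'

# "skip"-style selection: everything else.
printf '%s\n%s\n' "$spec_a" "$spec_b" | grep -vc '\[Feature:Performance\]'
```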

The tests are built into a single binary which can be used to deploy a
Kubernetes system or run tests against an already-deployed Kubernetes system.
See `go run hack/e2e.go --help` (or the flag definitions in `hack/e2e.go`) for
more options, such as reusing an existing cluster.

### Cleaning up

During a run, pressing `control-C` should result in an orderly shutdown, but if
something goes wrong and you still have some VMs running you can force a cleanup
with this command:

```sh
go run hack/e2e.go -v --down
```

## Advanced testing

### Bringing up a cluster for testing

If you want, you may bring up a cluster in some other manner and run tests
against it. To do so, or to do other non-standard test things, you can pass
arguments into Ginkgo using `--test_args` (e.g. see above). For brevity, we
will look at a subset of the options, which are listed below:

```
-ginkgo.dryRun=false: If set, ginkgo will walk the test hierarchy without
actually running anything. Best paired with -v.

-ginkgo.failFast=false: If set, ginkgo will stop running a test suite after a
failure occurs.

-ginkgo.failOnPending=false: If set, ginkgo will mark the test suite as failed
if any specs are pending.

-ginkgo.focus="": If set, ginkgo will only run specs that match this regular
expression.

-ginkgo.skip="": If set, ginkgo will only run specs that do not match this
regular expression.

-ginkgo.trace=false: If set, default reporter prints out the full stack trace
when a failure occurs.

-ginkgo.v=false: If set, default reporter prints out all specs as they begin.

-host="": The host, or api-server, to connect to.

-kubeconfig="": Path to kubeconfig containing embedded authinfo.

-prom-push-gateway="": The URL to prometheus gateway, so that metrics can be
pushed during e2es and scraped by prometheus. Typically something like
127.0.0.1:9091.

-provider="": The name of the Kubernetes provider (gce, gke, local, vagrant,
etc.)

-repo-root="../../": Root directory of kubernetes repository, for finding test
files.
```
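As a sketch of how several of these options combine in practice, the framework and Ginkgo flags above are all passed through a single `--test_args` string. The host and provider values below are illustrative placeholders, not real endpoints, and the command is echoed rather than executed so it can be previewed:

```sh
# Compose framework and ginkgo flags into one --test_args value.
# Host and provider values are hypothetical placeholders; the command is
# echoed here as a dry-run illustration.
TEST_ARGS="--host=https://203.0.113.10 --provider=gce --ginkgo.failFast=true"
echo "go run hack/e2e.go -v --test --test_args=\"${TEST_ARGS}\""
```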

Prior to running the tests, you may want to first create a simple auth file in
your home directory, e.g. `$HOME/.kube/config`, with the following:

```json
{
  "User": "root",
  "Password": ""
}
```

As mentioned earlier, there are a host of other options available, which are
left for the developer to explore.

**NOTE:** If you are running tests on a local cluster repeatedly, you may need
to periodically perform some manual cleanup:

- `rm -rf /var/run/kubernetes`, clear kube generated credentials; sometimes
stale permissions can cause problems.

- `sudo iptables -F`, clear iptables rules left by the kube-proxy.
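The two steps above can be bundled into a small helper script. This is a sketch: the `run` wrapper only echoes the commands so they can be previewed; drop the wrapper to actually perform the cleanup (paths assume a default local cluster and may differ in your environment):

```sh
# Preview the local-cluster cleanup steps; remove the 'run' wrapper to
# execute them for real.
run() { echo "+ $*"; }

run sudo rm -rf /var/run/kubernetes   # stale generated credentials
run sudo iptables -F                  # iptables rules left by kube-proxy
```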

### Debugging clusters

If a cluster fails to initialize, or you'd like to better understand cluster
state to debug a failed e2e test, you can use the `cluster/log-dump.sh` script
to gather logs.

This script requires that the cluster provider supports ssh. Assuming it does,
running:

```sh
cluster/log-dump.sh <directory>
```

will ssh to the master and all nodes and download a variety of useful logs to
the provided directory (which should already exist).

The Google-run Jenkins builds automatically collect these logs for every
build, saving them in the `artifacts` directory uploaded to GCS.

### Local clusters

It can be much faster to iterate on a local cluster instead of a cloud-based
one. To start a local cluster, you can run:

```sh
# The PATH construction is needed because PATH is one of the special-cased
# environment variables not passed by sudo -E
sudo PATH=$PATH hack/local-up-cluster.sh
```

This will start a single-node Kubernetes cluster that runs pods using the local
Docker daemon. Press Control-C to stop the cluster.

#### Testing against local clusters

In order to run an E2E test against a locally running cluster, point the tests
at a custom host directly:

```sh
export KUBECONFIG=/path/to/kubeconfig
go run hack/e2e.go -v --test --test_args="--host=http://127.0.0.1:8080"
```

To control the tests that are run:

```sh
go run hack/e2e.go -v --test --test_args="--host=http://127.0.0.1:8080 --ginkgo.focus=Secrets"
```

## Kinds of tests

We are working on implementing clearer partitioning of our e2e tests to make
running a known set of tests easier (#10548). Tests can be labeled with any of
the following labels, in order of increasing precedence (that is, each label
listed below supersedes the previous ones):

- If a test has no labels, it is expected to run fast (under five minutes), be
able to be run in parallel, and be consistent.

- `[Slow]`: If a test takes more than five minutes to run (by itself or in
parallel with many other tests), it is labeled `[Slow]`. This partition allows
us to run almost all of our tests quickly in parallel, without waiting for the
stragglers to finish.

- `[Serial]`: If a test cannot be run in parallel with other tests (e.g. it
takes too many resources or restarts nodes), it is labeled `[Serial]`, and
should be run in serial as part of a separate suite.

- `[Disruptive]`: If a test restarts components that might cause other tests
to fail or break the cluster completely, it is labeled `[Disruptive]`. Any
`[Disruptive]` test is also assumed to qualify for the `[Serial]` label, but
need not be labeled as both. These tests are not run against soak clusters to
avoid restarting components.

- `[Flaky]`: If a test is found to be flaky and we have decided that it's too
hard to fix in the short term (e.g. it's going to take a full engineer-week), it
receives the `[Flaky]` label until it is fixed. The `[Flaky]` label should be
used very sparingly, and should be accompanied with a reference to the issue for
de-flaking the test, because while a test remains labeled `[Flaky]`, it is not
monitored closely in CI. `[Flaky]` tests are by default not run, unless a
`focus` or `skip` argument is explicitly given.

- `[Feature:.+]`: If a test has non-default requirements to run or targets
some non-core functionality, and thus should not be run as part of the standard
suite, it receives a `[Feature:.+]` label, e.g. `[Feature:Performance]` or
`[Feature:Ingress]`. `[Feature:.+]` tests are not run in our core suites,
instead running in custom suites. If a feature is experimental or alpha and is
not enabled by default due to being incomplete or potentially subject to
breaking changes, it does *not* block the merge-queue, and thus should run in
some separate test suites owned by the feature owner(s)
(see [Continuous Integration](#continuous-integration) below).
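Because these labels are just substrings of the spec names, partitioning is done entirely with `--ginkgo.focus`/`--ginkgo.skip` regexes. A small simulation with `grep` over hypothetical spec names shows how a default (fast, parallel-safe) run is selected:

```sh
# Simulate ginkgo's label-based selection with grep. Spec names are
# hypothetical; real names come from the e2e suite.
specs='Pods should get a host IP [Conformance]
Density should allow starting 30 pods per node [Serial] [Slow]
Networking should provide intra-pod communication [Flaky]'

# Default suite: skip anything serial, slow, or flaky.
echo "$specs" | grep -vE '\[Serial\]|\[Slow\]|\[Flaky\]'

# Conformance-only selection.
echo "$specs" | grep '\[Conformance\]'
```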

### Conformance tests

Finally, `[Conformance]` tests represent a subset of the e2e-tests we expect to
pass on **any** Kubernetes cluster. The `[Conformance]` label does not supersede
any other labels.

As each new release of Kubernetes provides new functionality, the subset of
tests necessary to demonstrate conformance grows with each release. Conformance
is thus considered versioned, with the same backwards compatibility guarantees
as laid out in [our versioning policy](../design/versioning.md#supported-releases).
Conformance tests for a given version should be run off of the release branch
that corresponds to that version. Thus `v1.2` conformance tests would be run
from the head of the `release-1.2` branch. e.g.:

- A v1.3 development cluster should pass v1.1, v1.2 conformance tests

- A v1.2 cluster should pass v1.1, v1.2 conformance tests

- A v1.1 cluster should pass v1.0, v1.1 conformance tests, and fail v1.2
conformance tests

Conformance tests are designed to be run with no cloud provider configured.
Conformance tests can be run against clusters that have not been created with
`hack/e2e.go`, just provide a kubeconfig with the appropriate endpoint and
credentials.

```sh
# setup for conformance tests
export KUBECONFIG=/path/to/kubeconfig
export KUBERNETES_CONFORMANCE_TEST=y

# run all conformance tests
go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Conformance\]"

# run all parallel-safe conformance tests in parallel
GINKGO_PARALLEL=y go run hack/e2e.go --v --test --test_args="--ginkgo.focus=\[Conformance\] --ginkgo.skip=\[Serial\]"

# ... and finish up with remaining tests in serial
go run hack/e2e.go --v --test --test_args="--ginkgo.focus=\[Serial\].*\[Conformance\]"
```

### Defining Conformance Subset

It is impossible to define the entire space of Conformance tests without knowing
the future, so instead, we define the complement of conformance tests, below
(please update this with companion PRs as necessary):

- A conformance test cannot test cloud-provider-specific features (e.g. GCE
monitoring, S3 buckets, ...)

- A conformance test cannot rely on any particular non-standard file system
permissions granted to containers or users (e.g. sharing writable host /tmp with
a container)

- A conformance test cannot rely on any binaries that are not required for the
linux kernel or for a kubelet to run (e.g. git)

- A conformance test cannot test a feature which obviously cannot be supported
on a broad range of platforms (e.g. testing of multiple disk mounts, GPUs, high
density)

## Continuous Integration

A quick overview of how we run e2e CI on Kubernetes.

### What is CI?

We run a battery of `e2e` tests against `HEAD` of the master branch on a
continuous basis, and block merges via the
[submit queue](http://submit-queue.k8s.io/) on a subset of those tests if they
fail (the subset is defined in the
[munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go)
via the `jenkins-jobs` flag; note we also block on `kubernetes-build` and
`kubernetes-test-go` jobs for build and unit and integration tests).

CI results can be found at [ci-test.k8s.io](http://ci-test.k8s.io), e.g.
[ci-test.k8s.io/kubernetes-e2e-gce/10594](http://ci-test.k8s.io/kubernetes-e2e-gce/10594).

### What runs in CI?

We run all default tests (those that aren't marked `[Flaky]` or `[Feature:.+]`)
against GCE and GKE. To minimize the time from regression-to-green-run, we
partition tests across different jobs:

- `kubernetes-e2e-<provider>` runs all non-`[Slow]`, non-`[Serial]`,
non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel.

- `kubernetes-e2e-<provider>-slow` runs all `[Slow]`, non-`[Serial]`,
non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel.

- `kubernetes-e2e-<provider>-serial` runs all `[Serial]` and `[Disruptive]`,
non-`[Flaky]`, non-`[Feature:.+]` tests in serial.

We also run non-default tests if the tests exercise general-availability ("GA")
features that require a special environment to run in, e.g.
`kubernetes-e2e-gce-scalability` and `kubernetes-kubemark-gce`, which test for
Kubernetes performance.
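As a rough sketch, the partitioning above maps onto `--ginkgo.focus`/`--ginkgo.skip` arguments along these lines. This is illustrative only; the authoritative selections live in the Jenkins job configurations:

```sh
# Approximate ginkgo selections for the CI jobs described above
# (illustrative; not the actual job configs).
SKIP_NON_DEFAULT='\[Flaky\]|\[Feature:.+\]'

printf '%s\n' "kubernetes-e2e-<provider>:        --ginkgo.skip=\[Slow\]|\[Serial\]|\[Disruptive\]|${SKIP_NON_DEFAULT}"
printf '%s\n' "kubernetes-e2e-<provider>-slow:   --ginkgo.focus=\[Slow\] --ginkgo.skip=\[Serial\]|\[Disruptive\]|${SKIP_NON_DEFAULT}"
printf '%s\n' "kubernetes-e2e-<provider>-serial: --ginkgo.focus=\[Serial\]|\[Disruptive\] --ginkgo.skip=${SKIP_NON_DEFAULT}"
```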

#### Non-default tests

We don't run many `[Feature:.+]` tests in CI. These tests are for features that
are experimental (often in the `experimental` API), and aren't enabled by
default.

### The PR-builder

We also run a battery of tests against every PR before we merge it. These tests
are equivalent to `kubernetes-gce`: they run all non-`[Slow]`, non-`[Serial]`,
non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. These
tests are considered "smoke tests" to give a decent signal that the PR doesn't
break most functionality. Results for your PR can be found at
[pr-test.k8s.io](http://pr-test.k8s.io), e.g.
[pr-test.k8s.io/20354](http://pr-test.k8s.io/20354) for #20354.

### Adding a test to CI

As mentioned above, prior to adding a new test, it is a good idea to perform a
`-ginkgo.dryRun=true` on the system, in order to see if a behavior is already
being tested, or to determine if it may be possible to augment an existing set
of tests for a specific use case.

If a behavior does not currently have coverage and a developer wishes to add a
new e2e test, navigate to the ./test/e2e directory and create a new test using
the existing suite as a guide.

TODO(#20357): Create a self-documented example which has been disabled, but can
be copied to create new tests and outlines the capabilities and libraries used.

When writing a test, consult [Kinds of tests](#kinds-of-tests) above to
determine how your test should be marked (e.g. `[Slow]`, `[Serial]`; remember,
by default we assume a test can run in parallel with other tests!).

When first adding a test it should *not* go straight into CI, because failures
block ordinary development. A test should only be added to CI after it has been
running in some non-CI suite long enough to establish a track record showing
that the test does not fail when run against *working* software. Note also that
tests running in CI are generally running on a well-loaded cluster, so must
contend for resources; see above about [kinds of tests](#kinds-of-tests).

Generally, a feature starts as `experimental`, and will be run in some suite
owned by the team developing the feature. If a feature is in beta or GA, it
*should* block the merge-queue. In moving from experimental to beta or GA, tests
that are expected to pass by default should simply remove the `[Feature:.+]`
label, and will be incorporated into our core suites. If tests are not expected
to pass by default (e.g. they require a special environment such as added
quota), they should remain with the `[Feature:.+]` label, and the suites that
run them should be incorporated into the
[munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go)
via the `jenkins-jobs` flag.

Occasionally, we'll want to add tests to better exercise features that are
already GA. These tests also shouldn't go straight to CI. They should begin by
being marked as `[Flaky]` to be run outside of CI, and once a track-record for
them is established, they may be promoted out of `[Flaky]`.

### Moving a test out of CI

If we have determined that a test is known-flaky and cannot be fixed in the
short-term, we may move it out of CI indefinitely. This move should be used
sparingly, as it effectively means that we have no coverage of that test. When a
test is demoted, it should be marked `[Flaky]` with a comment accompanying the
label with a reference to an issue opened to fix the test.

## Performance Evaluation

Another benefit of the e2e tests is the ability to create reproducible loads on
the system, which can then be used to determine the responsiveness, or analyze
other characteristics of the system. For example, the density tests load the
system to 30, 50, or 100 pods per node and measure the different characteristics
of the system, such as throughput and api-latency.

For a good overview of how we analyze performance data, please read the
following [post](http://blog.kubernetes.io/2015/09/kubernetes-performance-measurements-and.html).

For developers who are interested in doing their own performance analysis, we
recommend setting up [prometheus](http://prometheus.io/) for data collection,
and using [promdash](http://prometheus.io/docs/visualization/promdash/) to
visualize the data. There also exists the option of pushing your own metrics in
from the tests using a
[prom-push-gateway](http://prometheus.io/docs/instrumenting/pushing/).
Containers for all of these components can be found
[here](https://hub.docker.com/u/prom/).

For more accurate measurements, you may wish to set up prometheus external to
kubernetes in an environment where it can access the major system components
(api-server, controller-manager, scheduler). This is especially useful when
attempting to gather metrics in a load-balanced api-server environment, because
all api-servers can be analyzed independently as well as collectively. On
startup, a configuration file is passed to prometheus that specifies the
endpoints that prometheus will scrape, as well as the sampling interval.

```
# prometheus.conf
job: {
  name: "kubernetes"
  scrape_interval: "1s"
  target_group: {
    # apiserver(s)
    target: "http://localhost:8080/metrics"
    # scheduler
    target: "http://localhost:10251/metrics"
    # controller-manager
    target: "http://localhost:10252/metrics"
  }
}
```

Once prometheus is scraping the kubernetes endpoints, that data can then be
plotted using promdash, and alerts can be created against the assortment of
metrics that kubernetes provides.

## One More Thing

You should also know the [testing conventions](coding-conventions.md#testing-conventions).

**HAPPY TESTING!**