# End-to-End Testing in Kubernetes
Updated: 5/3/2016

**Table of Contents**

<!-- BEGIN MUNGE: GENERATED_TOC -->
- [End-to-End Testing in Kubernetes](#end-to-end-testing-in-kubernetes)
  - [Overview](#overview)
  - [Building and Running the Tests](#building-and-running-the-tests)
    - [Cleaning up](#cleaning-up)
  - [Advanced testing](#advanced-testing)
    - [Bringing up a cluster for testing](#bringing-up-a-cluster-for-testing)
    - [Federation e2e tests](#federation-e2e-tests)
      - [Configuring federation e2e tests](#configuring-federation-e2e-tests)
      - [Image Push Repository](#image-push-repository)
      - [Build](#build)
      - [Deploy federation control plane](#deploy-federation-control-plane)
      - [Run the Tests](#run-the-tests)
      - [Teardown](#teardown)
      - [Shortcuts for test developers](#shortcuts-for-test-developers)
    - [Debugging clusters](#debugging-clusters)
    - [Local clusters](#local-clusters)
      - [Testing against local clusters](#testing-against-local-clusters)
    - [Version-skewed and upgrade testing](#version-skewed-and-upgrade-testing)
  - [Kinds of tests](#kinds-of-tests)
    - [Viper configuration and hierarchical test parameters.](#viper-configuration-and-hierarchical-test-parameters)
    - [Conformance tests](#conformance-tests)
      - [Defining Conformance Subset](#defining-conformance-subset)
  - [Continuous Integration](#continuous-integration)
    - [What is CI?](#what-is-ci)
    - [What runs in CI?](#what-runs-in-ci)
      - [Non-default tests](#non-default-tests)
    - [The PR-builder](#the-pr-builder)
    - [Adding a test to CI](#adding-a-test-to-ci)
    - [Moving a test out of CI](#moving-a-test-out-of-ci)
  - [Performance Evaluation](#performance-evaluation)
  - [One More Thing](#one-more-thing)

<!-- END MUNGE: GENERATED_TOC -->
## Overview

End-to-end (e2e) tests for Kubernetes provide a mechanism to test end-to-end
behavior of the system, and are the last signal to ensure end user operations
match developer specifications. Although unit and integration tests provide a
good signal, in a distributed system like Kubernetes it is not uncommon that a
minor change may pass all unit and integration tests, but cause unforeseen
changes at the system level.

The primary objectives of the e2e tests are to ensure a consistent and reliable
behavior of the Kubernetes code base, and to catch hard-to-test bugs before
users do, when unit and integration tests are insufficient.

The e2e tests in Kubernetes are built atop
[Ginkgo](http://onsi.github.io/ginkgo/) and
[Gomega](http://onsi.github.io/gomega/). There are a host of features that this
Behavior-Driven Development (BDD) testing framework provides, and it is
recommended that the developer read the documentation prior to diving into the
tests.

The purpose of *this* document is to serve as a primer for developers who are
looking to execute or add tests using a local development environment.

Before writing new tests or making substantive changes to existing tests, you
should also read [Writing Good e2e Tests](writing-good-e2e-tests.md).
## Building and Running the Tests

There are a variety of ways to run e2e tests, but we aim to decrease the number
of ways to run e2e tests to a canonical way: `hack/e2e.go`.

You can run an end-to-end test which will bring up a master and nodes, perform
some tests, and then tear everything down. Make sure you have followed the
getting started steps for your chosen cloud platform (which might involve
changing the `KUBERNETES_PROVIDER` environment variable to something other than
"gce").

To build Kubernetes, bring up a cluster, run tests, and tear everything down, use:

```sh
go run hack/e2e.go -v --build --up --test --down
```

If you'd like to just perform one of these steps, here are some examples:

```sh
# Build binaries for testing
go run hack/e2e.go -v --build

# Create a fresh cluster. Deletes a cluster first, if it exists
go run hack/e2e.go -v --up

# Run all tests
go run hack/e2e.go -v --test

# Run tests matching the regex "\[Feature:Performance\]"
go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Feature:Performance\]"

# Conversely, exclude tests that match the regex "Pods.*env"
go run hack/e2e.go -v --test --test_args="--ginkgo.skip=Pods.*env"

# Run tests in parallel, skip any that must be run serially
GINKGO_PARALLEL=y go run hack/e2e.go --v --test --test_args="--ginkgo.skip=\[Serial\]"

# Run tests in parallel, skip any that must be run serially and keep the test namespace if test failed
GINKGO_PARALLEL=y go run hack/e2e.go --v --test --test_args="--ginkgo.skip=\[Serial\] --delete-namespace-on-failure=false"

# Flags can be combined, and their actions will take place in this order:
# --build, --up, --test, --down
#
# You can also specify an alternative provider, such as 'aws'
#
# e.g.:
KUBERNETES_PROVIDER=aws go run hack/e2e.go -v --build --up --test --down

# -ctl can be used to quickly call kubectl against your e2e cluster. Useful for
# cleaning up after a failed test or viewing logs. Use -v to avoid suppressing
# kubectl output.
go run hack/e2e.go -v -ctl='get events'
go run hack/e2e.go -v -ctl='delete pod foobar'
```

The tests are built into a single binary which can be used to deploy a
Kubernetes system or run tests against an already-deployed Kubernetes system.
See `go run hack/e2e.go --help` (or the flag definitions in `hack/e2e.go`) for
more options, such as reusing an existing cluster.
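Because `--ginkgo.focus` and `--ginkgo.skip` are regular expressions, the backslash-escaped brackets in patterns like `\[Serial\]` matter. A quick way to sanity-check a pattern before launching a long run is to grep it against spec names; the names below are made up for illustration:

```sh
# Hypothetical spec names in the "[Label] description" style Ginkgo reports.
specs='Pods should contain environment variables
[Feature:Performance] Density should start 30 pods per node
[Serial] Nodes should survive a reboot'

# The escaped form matches only the labeled spec; unescaped, the brackets
# would be treated as a regex character class.
echo "$specs" | grep -c -E '\[Serial\]'
echo "$specs" | grep -c -E 'Pods.*env'
```

Each `grep -c` here prints `1`, confirming that the pattern selects exactly one of the sample specs.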
### Cleaning up

During a run, pressing `control-C` should result in an orderly shutdown, but if
something goes wrong and you still have some VMs running you can force a cleanup
with this command:

```sh
go run hack/e2e.go -v --down
```
## Advanced testing

### Bringing up a cluster for testing

If you want, you may bring up a cluster in some other manner and run tests
against it. To do so, or to do other non-standard test things, you can pass
arguments into Ginkgo using `--test_args` (e.g. see above). For brevity, we
will look at a subset of the options, which are listed below:

```
--ginkgo.dryRun=false: If set, ginkgo will walk the test hierarchy without
actually running anything. Best paired with -v.

--ginkgo.failFast=false: If set, ginkgo will stop running a test suite after a
failure occurs.

--ginkgo.failOnPending=false: If set, ginkgo will mark the test suite as failed
if any specs are pending.

--ginkgo.focus="": If set, ginkgo will only run specs that match this regular
expression.

--ginkgo.skip="": If set, ginkgo will only run specs that do not match this
regular expression.

--ginkgo.trace=false: If set, default reporter prints out the full stack trace
when a failure occurs.

--ginkgo.v=false: If set, default reporter prints out all specs as they begin.

--host="": The host, or api-server, to connect to.

--kubeconfig="": Path to kubeconfig containing embedded authinfo.

--prom-push-gateway="": The URL to prometheus gateway, so that metrics can be
pushed during e2es and scraped by prometheus. Typically something like
127.0.0.1:9091.

--provider="": The name of the Kubernetes provider (gce, gke, local, vagrant,
etc.)

--repo-root="../../": Root directory of kubernetes repository, for finding test
files.
```

Prior to running the tests, you may want to first create a simple auth file in
your home directory, e.g. `$HOME/.kube/config`, with the following:

```
{
  "User": "root",
  "Password": ""
}
```

As mentioned earlier there are a host of other options that are available, but
they are left to the developer.

**NOTE:** If you are running tests on a local cluster repeatedly, you may need
to periodically perform some manual cleanup:

- `rm -rf /var/run/kubernetes`, clear kube generated credentials, sometimes
  stale permissions can cause problems.

- `sudo iptables -F`, clear ip tables rules left by the kube-proxy.
### Federation e2e tests

By default, `e2e.go` provisions a single Kubernetes cluster, and any `Feature:Federation` ginkgo tests will be skipped.

Federation e2e testing involves bringing up multiple "underlying" Kubernetes clusters,
and deploying the federation control plane as a Kubernetes application on the underlying clusters.

The federation e2e tests are still managed via `e2e.go`, but require some extra configuration items.
#### Configuring federation e2e tests

The following environment variables will enable federation e2e building, provisioning and testing.

```sh
$ export FEDERATION=true
$ export E2E_ZONES="us-central1-a us-central1-b us-central1-f"
```

A Kubernetes cluster will be provisioned in each zone listed in `E2E_ZONES`. A zone can only appear once in the `E2E_ZONES` list.
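Since each zone may appear only once, it can save a failed provisioning run to check the list up front. A minimal sketch (this check is not part of `e2e.go` itself):

```sh
# Report any zone that appears more than once in E2E_ZONES.
E2E_ZONES="us-central1-a us-central1-b us-central1-f"

dupes=$(echo "$E2E_ZONES" | tr ' ' '\n' | sort | uniq -d)
if [ -n "$dupes" ]; then
  echo "duplicate zones in E2E_ZONES: $dupes" >&2
  exit 1
fi
echo "E2E_ZONES ok"
```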
#### Image Push Repository

Next, specify the docker repository where your ci images will be pushed.

* **If `KUBERNETES_PROVIDER=gce` or `KUBERNETES_PROVIDER=gke`**:

  If you use the same GCP project to run the e2e tests as the container image
  repository, the `FEDERATION_PUSH_REPO_BASE` environment variable will default to
  `gcr.io/${DEFAULT_GCP_PROJECT_NAME}` and you can skip ahead to the **Build** section.

  Otherwise, you can simply set your push repo base based on your project name, and the
  necessary repositories will be auto-created when you first push your container images:

  ```sh
  $ export FEDERATION_PUSH_REPO_BASE="gcr.io/${GCE_PROJECT_NAME}"
  ```

  Skip ahead to the **Build** section.

* **For all other providers**:

  You'll be responsible for creating and managing access to the repositories manually.

  ```sh
  $ export FEDERATION_PUSH_REPO_BASE="quay.io/colin_hom"
  ```

  Given this example, the `federation-apiserver` container image will be pushed to the repository
  `quay.io/colin_hom/federation-apiserver`.

The docker client on the machine running `e2e.go` must have push access for the following pre-existing repositories:

* `${FEDERATION_PUSH_REPO_BASE}/federation-apiserver`
* `${FEDERATION_PUSH_REPO_BASE}/federation-controller-manager`

These repositories must allow public read access, as the e2e node docker daemons will not have any credentials. If you're using
GCE/GKE as your provider, the repositories will have read-access by default.
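To see exactly which repository names your docker client needs push access to, you can expand them from the base; a trivial sketch using the example base above:

```sh
# Expand the full per-image repository names from the push repo base.
FEDERATION_PUSH_REPO_BASE="quay.io/colin_hom"

for image in federation-apiserver federation-controller-manager; do
  echo "${FEDERATION_PUSH_REPO_BASE}/${image}"
done
```

This prints `quay.io/colin_hom/federation-apiserver` and `quay.io/colin_hom/federation-controller-manager`, the two repositories listed above.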
#### Build

* Compile the binaries and build container images:

  ```sh
  $ KUBE_RELEASE_RUN_TESTS=n KUBE_FASTBUILD=true go run hack/e2e.go -v -build
  ```

* Push the federation container images:

  ```sh
  $ build-tools/push-federation-images.sh
  ```
#### Deploy federation control plane

The following command will create the underlying Kubernetes clusters in each of `E2E_ZONES`, and then provision the
federation control plane in the cluster occupying the last zone in the `E2E_ZONES` list.

```sh
$ go run hack/e2e.go -v --up
```
#### Run the Tests

This will run only the `Feature:Federation` e2e tests. You can omit the `ginkgo.focus` argument to run the entire e2e suite.

```sh
$ go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Feature:Federation\]"
```
#### Teardown

```sh
$ go run hack/e2e.go -v --down
```
#### Shortcuts for test developers

* To speed up `e2e.go -up`, provision a single-node Kubernetes cluster in a single e2e zone:

  `NUM_NODES=1 E2E_ZONES="us-central1-f"`

  Keep in mind that some tests may require multiple underlying clusters and/or minimum compute resource availability.

* You can quickly recompile the e2e testing framework via `go install ./test/e2e`. This will not do anything besides
  allow you to verify that the go code compiles.

* If you want to run your e2e testing framework without re-provisioning the e2e setup, you can do so via
  `make WHAT=test/e2e/e2e.test` and then re-running the ginkgo tests.

* If you're hacking around with the federation control plane deployment itself,
  you can quickly re-deploy the federation control plane Kubernetes manifests without tearing any resources down.
  To re-deploy the federation control plane after running `-up` for the first time:

  ```sh
  $ federation/cluster/federation-up.sh
  ```
### Debugging clusters

If a cluster fails to initialize, or you'd like to better understand cluster
state to debug a failed e2e test, you can use the `cluster/log-dump.sh` script
to gather logs.

This script requires that the cluster provider supports ssh. Assuming it does,
running:

```
cluster/log-dump.sh <directory>
```

will ssh to the master and all nodes and download a variety of useful logs to
the provided directory (which should already exist).

The Google-run Jenkins builds automatically collect these logs for every
build, saving them in the `artifacts` directory uploaded to GCS.
### Local clusters

It can be much faster to iterate on a local cluster instead of a cloud-based
one. To start a local cluster, you can run:

```sh
# The PATH construction is needed because PATH is one of the special-cased
# environment variables not passed by sudo -E
sudo PATH=$PATH hack/local-up-cluster.sh
```

This will start a single-node Kubernetes cluster that runs pods using the local
docker daemon. Press Control-C to stop the cluster.

You can generate a valid kubeconfig file by following the instructions printed at the
end of the aforementioned script.
#### Testing against local clusters

In order to run an E2E test against a locally running cluster, point the tests
at a custom host directly:

```sh
export KUBECONFIG=/path/to/kubeconfig
export KUBE_MASTER_IP="http://127.0.0.1:<PORT>"
export KUBE_MASTER=local
go run hack/e2e.go -v --test
```

To control the tests that are run:

```sh
go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\"Secrets\""
```
### Version-skewed and upgrade testing

We run version-skewed tests to check that newer versions of Kubernetes work
similarly enough to older versions. The general strategy is to cover the following cases:

1. One version of `kubectl` with another version of the cluster and tests (e.g.
   that v1.2 and v1.4 `kubectl` don't break v1.3 tests running against a v1.3
   cluster).
1. A newer version of the Kubernetes master with older nodes and tests (e.g.
   that upgrading a master to v1.3 with nodes at v1.2 still passes v1.2 tests).
1. A newer version of the whole cluster with older tests (e.g. that a cluster
   upgraded---master and nodes---to v1.3 still passes v1.2 tests).
1. That an upgraded cluster functions the same as a brand-new cluster of the
   same version (e.g. a cluster upgraded to v1.3 passes the same v1.3 tests as
   a newly-created v1.3 cluster).

[hack/jenkins/e2e-runner.sh](http://releases.k8s.io/HEAD/hack/jenkins/e2e-runner.sh) is
the authoritative source on how to run version-skewed tests, but below is a
quick-and-dirty tutorial.

```sh
# Assume you have two copies of the Kubernetes repository checked out, at
# ./kubernetes and ./kubernetes_old

# If using GKE:
export KUBERNETES_PROVIDER=gke
export CLUSTER_API_VERSION=${OLD_VERSION}

# Deploy a cluster at the old version; see above for more details
cd ./kubernetes_old
go run ./hack/e2e.go -v --up

# Upgrade the cluster to the new version
#
# If using GKE, add --upgrade-target=${NEW_VERSION}
#
# You can target Feature:MasterUpgrade or Feature:ClusterUpgrade
cd ../kubernetes
go run ./hack/e2e.go -v --test --check_version_skew=false --test_args="--ginkgo.focus=\[Feature:MasterUpgrade\]"

# Run old tests with new kubectl
cd ../kubernetes_old
go run ./hack/e2e.go -v --test --test_args="--kubectl-path=$(pwd)/../kubernetes/cluster/kubectl.sh"
```

If you are just testing version-skew, you may want to just deploy at one
version and then test at another version, instead of going through the whole
upgrade process:

```sh
# With the same setup as above

# Deploy a cluster at the new version
cd ./kubernetes
go run ./hack/e2e.go -v --up

# Run new tests with old kubectl
go run ./hack/e2e.go -v --test --test_args="--kubectl-path=$(pwd)/../kubernetes_old/cluster/kubectl.sh"

# Run old tests with new kubectl
cd ../kubernetes_old
go run ./hack/e2e.go -v --test --test_args="--kubectl-path=$(pwd)/../kubernetes/cluster/kubectl.sh"
```
## Kinds of tests

We are working on implementing clearer partitioning of our e2e tests to make
running a known set of tests easier (#10548). Tests can be labeled with any of
the following labels, in order of increasing precedence (that is, each label
listed below supersedes the previous ones):

- If a test has no labels, it is expected to run fast (under five minutes), be
  able to be run in parallel, and be consistent.

- `[Slow]`: If a test takes more than five minutes to run (by itself or in
  parallel with many other tests), it is labeled `[Slow]`. This partition allows
  us to run almost all of our tests quickly in parallel, without waiting for the
  stragglers to finish.

- `[Serial]`: If a test cannot be run in parallel with other tests (e.g. it
  takes too many resources or restarts nodes), it is labeled `[Serial]`, and
  should be run in serial as part of a separate suite.

- `[Disruptive]`: If a test restarts components that might cause other tests
  to fail or break the cluster completely, it is labeled `[Disruptive]`. Any
  `[Disruptive]` test is also assumed to qualify for the `[Serial]` label, but
  need not be labeled as both. These tests are not run against soak clusters to
  avoid restarting components.

- `[Flaky]`: If a test is found to be flaky and we have decided that it's too
  hard to fix in the short term (e.g. it's going to take a full engineer-week), it
  receives the `[Flaky]` label until it is fixed. The `[Flaky]` label should be
  used very sparingly, and should be accompanied with a reference to the issue for
  de-flaking the test, because while a test remains labeled `[Flaky]`, it is not
  monitored closely in CI. `[Flaky]` tests are by default not run, unless a
  `focus` or `skip` argument is explicitly given.

- `[Feature:.+]`: If a test has non-default requirements to run or targets
  some non-core functionality, and thus should not be run as part of the standard
  suite, it receives a `[Feature:.+]` label, e.g. `[Feature:Performance]` or
  `[Feature:Ingress]`. `[Feature:.+]` tests are not run in our core suites,
  instead running in custom suites. If a feature is experimental or alpha and is
  not enabled by default due to being incomplete or potentially subject to
  breaking changes, it does *not* block the merge-queue, and thus should run in
  some separate test suites owned by the feature owner(s)
  (see [Continuous Integration](#continuous-integration) below).
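Since these labels are plain text inside spec names, suite membership is driven entirely by focus/skip regexes. A sketch of how a default (parallel) suite's skip pattern partitions a handful of made-up spec names:

```sh
# Hypothetical spec names carrying the labels described above.
specs='Pods should schedule onto a healthy node
[Slow] Density should start 100 pods per node
[Serial] [Disruptive] Services should survive an apiserver restart
[Flaky] DNS should resolve external names
[Feature:Ingress] Ingress should route traffic'

# A default suite skips every spec carrying a special label; grep -v -c
# counts the specs that remain eligible to run in parallel.
echo "$specs" | grep -c -v -E '\[(Slow|Serial|Disruptive|Flaky|Feature:[^]]+)\]'
```

The `grep -c` prints `1`: only the unlabeled, fast, parallel-safe spec survives the skip.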
### Viper configuration and hierarchical test parameters.

The future of e2e test configuration idioms will be increasingly defined using viper, and decreasingly via flags.

Flags in general fall apart once tests become sufficiently complicated. So, even if we could use another flag library, it wouldn't be ideal.

To use viper, rather than flags, to configure your tests:

- Just add "e2e.json" to the current directory you are in, and define parameters in it, i.e. `"kubeconfig":"/tmp/x"`.

Note that advanced testing parameters, and hierarchically defined parameters, are only defined in viper. To see what they are, you can dive into [TestContextType](../../test/e2e/framework/test_context.go).

In time, it is our intent to add or autogenerate a sample viper configuration that includes all e2e parameters, to ship with kubernetes.
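As a concrete, minimal sketch, the config file could be dropped into the working directory like this; `kubeconfig` is the key from the example above, and any further keys would depend on what `TestContextType` defines:

```sh
# Write a minimal e2e.json into the current directory; viper finds it by name.
cat > e2e.json <<'EOF'
{
  "kubeconfig": "/tmp/x"
}
EOF

cat e2e.json
```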
### Conformance tests

Finally, `[Conformance]` tests represent a subset of the e2e-tests we expect to
pass on **any** Kubernetes cluster. The `[Conformance]` label does not supersede
any other labels.

As each new release of Kubernetes provides new functionality, the subset of
tests necessary to demonstrate conformance grows with each release. Conformance
is thus considered versioned, with the same backwards compatibility guarantees
as laid out in [our versioning policy](../design/versioning.md#supported-releases).
Conformance tests for a given version should be run off of the release branch
that corresponds to that version. Thus `v1.2` conformance tests would be run
from the head of the `release-1.2` branch. For example:

- A v1.3 development cluster should pass v1.1, v1.2 conformance tests

- A v1.2 cluster should pass v1.1, v1.2 conformance tests

- A v1.1 cluster should pass v1.0, v1.1 conformance tests, and fail v1.2
  conformance tests

Conformance tests are designed to be run with no cloud provider configured.
Conformance tests can be run against clusters that have not been created with
`hack/e2e.go`, just provide a kubeconfig with the appropriate endpoint and
credentials.

```sh
# setup for conformance tests
export KUBECONFIG=/path/to/kubeconfig
export KUBERNETES_CONFORMANCE_TEST=y
export KUBERNETES_PROVIDER=skeleton

# run all conformance tests
go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Conformance\]"

# run all parallel-safe conformance tests in parallel
GINKGO_PARALLEL=y go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Conformance\] --ginkgo.skip=\[Serial\]"

# ... and finish up with remaining tests in serial
go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Serial\].*\[Conformance\]"
```
### Defining Conformance Subset

It is impossible to define the entire space of Conformance tests without knowing
the future, so instead, we define the complement of conformance tests, below
(`Please update this with companion PRs as necessary`):

- A conformance test cannot test cloud provider specific features (e.g. GCE
  monitoring, S3 Bucketing, ...)

- A conformance test cannot rely on any particular non-standard file system
  permissions granted to containers or users (e.g. sharing writable host /tmp with
  a container)

- A conformance test cannot rely on any binaries that are not required for the
  linux kernel or for a kubelet to run (e.g. git)

- A conformance test cannot test a feature which obviously cannot be supported
  on a broad range of platforms (e.g. testing of multiple disk mounts, GPUs, high
  density)
## Continuous Integration

A quick overview of how we run e2e CI on Kubernetes.
### What is CI?

We run a battery of `e2e` tests against `HEAD` of the master branch on a
continuous basis, and block merges via the [submit
queue](http://submit-queue.k8s.io/) on a subset of those tests if they fail (the
subset is defined in the
[munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go)
via the `jenkins-jobs` flag; note we also block on `kubernetes-build` and
`kubernetes-test-go` jobs for build and unit and integration tests).

CI results can be found at [ci-test.k8s.io](http://ci-test.k8s.io), e.g.
[ci-test.k8s.io/kubernetes-e2e-gce/10594](http://ci-test.k8s.io/kubernetes-e2e-gce/10594).
### What runs in CI?

We run all default tests (those that aren't marked `[Flaky]` or `[Feature:.+]`)
against GCE and GKE. To minimize the time from regression-to-green-run, we
partition tests across different jobs:

- `kubernetes-e2e-<provider>` runs all non-`[Slow]`, non-`[Serial]`,
  non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel.

- `kubernetes-e2e-<provider>-slow` runs all `[Slow]`, non-`[Serial]`,
  non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel.

- `kubernetes-e2e-<provider>-serial` runs all `[Serial]` and `[Disruptive]`,
  non-`[Flaky]`, non-`[Feature:.+]` tests in serial.

We also run non-default tests if the tests exercise general-availability ("GA")
features that require a special environment to run in, e.g.
`kubernetes-e2e-gce-scalability` and `kubernetes-kubemark-gce`, which test for
Kubernetes performance.
#### Non-default tests

We don't run many `[Feature:.+]` tests in CI. These tests are for features that
are experimental (often in the `experimental` API), and aren't enabled by
default.
### The PR-builder

We also run a battery of tests against every PR before we merge it. These tests
are equivalent to `kubernetes-gce`: it runs all non-`[Slow]`, non-`[Serial]`,
non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. These
tests are considered "smoke tests" to give a decent signal that the PR doesn't
break most functionality. Results for your PR can be found at
[pr-test.k8s.io](http://pr-test.k8s.io), e.g.
[pr-test.k8s.io/20354](http://pr-test.k8s.io/20354) for #20354.
### Adding a test to CI

As mentioned above, prior to adding a new test, it is a good idea to perform a
`-ginkgo.dryRun=true` on the system, in order to see if a behavior is already
being tested, or to determine if it may be possible to augment an existing set
of tests for a specific use case.

If a behavior does not currently have coverage and a developer wishes to add a
new e2e test, navigate to the ./test/e2e directory and create a new test using
the existing suite as a guide.

TODO(#20357): Create a self-documented example which has been disabled, but can
be copied to create new tests and outlines the capabilities and libraries used.

When writing a test, consult [Kinds of tests](#kinds-of-tests) above to determine how your test
should be marked, (e.g. `[Slow]`, `[Serial]`; remember, by default we assume a
test can run in parallel with other tests!).

When first adding a test it should *not* go straight into CI, because failures
block ordinary development. A test should only be added to CI after it has been
running in some non-CI suite long enough to establish a track record showing
that the test does not fail when run against *working* software. Note also that
tests running in CI are generally running on a well-loaded cluster, so must
contend for resources; see above about [kinds of tests](#kinds-of-tests).

Generally, a feature starts as `experimental`, and will be run in some suite
owned by the team developing the feature. If a feature is in beta or GA, it
*should* block the merge-queue. In moving from experimental to beta or GA, tests
that are expected to pass by default should simply remove the `[Feature:.+]`
label, and will be incorporated into our core suites. If tests are not expected
to pass by default, (e.g. they require a special environment such as added
quota,) they should remain with the `[Feature:.+]` label, and the suites that
run them should be incorporated into the
[munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go)
via the `jenkins-jobs` flag.

Occasionally, we'll want to add tests to better exercise features that are
already GA. These tests also shouldn't go straight to CI. They should begin by
being marked as `[Flaky]` to be run outside of CI, and once a track-record for
them is established, they may be promoted out of `[Flaky]`.
### Moving a test out of CI

If we have determined that a test is known-flaky and cannot be fixed in the
short-term, we may move it out of CI indefinitely. This move should be used
sparingly, as it effectively means that we have no coverage of that test. When a
test is demoted, it should be marked `[Flaky]` with a comment accompanying the
label with a reference to an issue opened to fix the test.
## Performance Evaluation

Another benefit of the e2e tests is the ability to create reproducible loads on
the system, which can then be used to determine the responsiveness, or analyze
other characteristics of the system. For example, the density tests load the
system to 30, 50, and 100 pods per node and measure the different characteristics of
the system, such as throughput, api-latency, etc.

For a good overview of how we analyze performance data, please read the
following [post](http://blog.kubernetes.io/2015/09/kubernetes-performance-measurements-and.html).

For developers who are interested in doing their own performance analysis, we
recommend setting up [prometheus](http://prometheus.io/) for data collection,
and using [promdash](http://prometheus.io/docs/visualization/promdash/) to
visualize the data. There also exists the option of pushing your own metrics in
from the tests using a
[prom-push-gateway](http://prometheus.io/docs/instrumenting/pushing/).
Containers for all of these components can be found
[here](https://hub.docker.com/u/prom/).

For more accurate measurements, you may wish to set up prometheus external to
kubernetes in an environment where it can access the major system components
(api-server, controller-manager, scheduler). This is especially useful when
attempting to gather metrics in a load-balanced api-server environment, because
all api-servers can be analyzed independently as well as collectively. On
startup, a configuration file is passed to prometheus that specifies the endpoints
that prometheus will scrape, as well as the sampling interval.

```
#prometheus.conf
job: {
  name: "kubernetes"
  scrape_interval: "1s"
  target_group: {
    # apiserver(s)
    target: "http://localhost:8080/metrics"
    # scheduler
    target: "http://localhost:10251/metrics"
    # controller-manager
    target: "http://localhost:10252/metrics"
  }
}
```

Once prometheus is scraping the kubernetes endpoints, that data can then be
plotted using promdash, and alerts can be created against the assortment of
metrics that kubernetes provides.
## One More Thing

You should also know the [testing conventions](coding-conventions.md#testing-conventions).

**HAPPY TESTING!**

<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/docs/devel/e2e-tests.md?pixel)]()
<!-- END MUNGE: GENERATED_ANALYTICS -->