# Writing good e2e tests for Kubernetes #

## Patterns and Anti-Patterns ##

### Goals of e2e tests ###

Beyond the obvious goal of providing end-to-end system test coverage,
there are a few less obvious goals that you should bear in mind when
designing, writing and debugging your end-to-end tests. In
particular, "flaky" tests, which pass most of the time but fail
intermittently for difficult-to-diagnose reasons, are extremely costly
in terms of blurring our regression signals and slowing down our
automated merge queue. Up-front time and effort designing your test
to be reliable is very well spent. Bear in mind that we have hundreds
of tests, each running in dozens of different environments, and if any
test in any test environment fails, we have to assume that we
potentially have some sort of regression. So if a significant number
of tests fail even only 1% of the time, basic statistics dictates that
we will almost never have a "green" regression indicator (for example,
with 200 independent tests that are each 99% reliable, the chance of a
fully green run is roughly 0.99^200, or about 13%). Stated
another way, writing a test that is only 99% reliable is just about
useless in the harsh reality of a CI environment. In fact it's worse
than useless, because not only does it fail to provide a reliable
regression indicator, it also costs a lot of subsequent debugging
time and delays merges.

#### Debuggability ####

If your test fails, its output should report the reasons for the
failure in as much detail as possible. "Timeout" is not a useful error
message. "Timed out after 60 seconds waiting for pod xxx to enter the
running state, still in the pending state" is much more useful to someone
trying to figure out why your test failed and what to do about it.
Specifically,
[assertion](https://onsi.github.io/gomega/#making-assertions) code
like the following generates rather useless errors:

```
Expect(err).NotTo(HaveOccurred())
```

Rather,
[annotate](https://onsi.github.io/gomega/#annotating-assertions) your assertion with something like this:

```
Expect(err).NotTo(HaveOccurred(), "Failed to create %d foobars, only created %d", foobarsReqd, foobarsCreated)
```

On the other hand, overly verbose logging, particularly of non-error
conditions, can make it unnecessarily difficult to figure out whether a
test failed and, if so, why. So don't log lots of irrelevant stuff either.
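
Ginkgo's `By()` steps are another cheap way to make failures
self-explanatory, because the most recently announced step appears in the
failure report. A minimal sketch follows; `createTestPod` and
`waitForPodRunning` are hypothetical helpers standing in for whatever your
test actually does:

```
package e2e

import (
	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
)

// createAndAwaitPod announces each step with By(), so that a failure
// report says exactly which step was in progress when things went wrong.
func createAndAwaitPod(podName, ns string,
	createTestPod func(podName, ns string) error,
	waitForPodRunning func(podName, ns string) error) {

	By("creating pod " + podName)
	err := createTestPod(podName, ns)
	Expect(err).NotTo(HaveOccurred(), "Failed to create pod %s in namespace %s", podName, ns)

	By("waiting for pod " + podName + " to enter the running state")
	err = waitForPodRunning(podName, ns)
	Expect(err).NotTo(HaveOccurred(), "Pod %s in namespace %s never entered the running state", podName, ns)
}
```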

#### Ability to run in non-dedicated test clusters ####

To reduce end-to-end delay and improve resource utilization when
running e2e tests, we try, where possible, to run large numbers of
tests in parallel against the same test cluster. This means that:

1. You should avoid making any assumption (implicit or explicit) that
your test is the only thing running against the cluster. For example,
assuming that your test can run a pod on every node in a
cluster is not safe, as some other tests, running at the
same time as yours, might have saturated one or more nodes in the
cluster. Similarly, running a pod in the system namespace and
assuming that doing so will increase the count of pods in the system
namespace by one is not safe, as some other test might be creating or
deleting pods in the system namespace at the same time as your test.
If you do legitimately need to write a test like that, make sure to
label it ["\[Serial\]"](e2e-tests.md#kinds_of_tests) so that it's easy
to identify, and is not run in parallel with any other tests (see the
sketch after this list).
1. You should avoid doing things to the cluster that make it difficult
for other tests to reliably do what they're trying to do at the same
time. For example, rebooting nodes, disconnecting network interfaces,
or upgrading cluster software as part of your test is likely to
violate the assumptions that other tests might have made about a
reasonably stable cluster environment. If you need to write such
tests, please label them as
["\[Disruptive\]"](e2e-tests.md#kinds_of_tests) so that it's easy to
identify them, and not run them in parallel with other tests.
1. You should avoid making assumptions about the Kubernetes API that
are not part of the API specification, as your tests will break as
soon as these assumptions become invalid. For example, relying on
specific Events, Event reasons or Event messages will make your tests
very brittle.
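
To make the labelling concrete: these labels are just substrings embedded
in the Ginkgo `Describe`/`It` text, which the test runner can then filter
on. A minimal sketch (the suite and spec names here are hypothetical):

```
package e2e

import (
	. "github.com/onsi/ginkgo"
)

// The "[Serial]" tag in the Describe text is what lets the e2e runner
// identify this test and keep it out of parallel runs; the names and the
// spec body are only placeholders.
var _ = Describe("Pod scheduling saturation [Serial]", func() {
	It("should be able to schedule a pod on every schedulable node", func() {
		// ... create one pod per node and wait for all of them to run,
		// which is only safe when no other tests share the cluster ...
	})
})
```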

#### Speed of execution ####

We have hundreds of e2e tests, some of which we run serially, one
after the other. If each test takes just a few minutes
to run, that very quickly adds up to many, many hours of total
execution time. We try to keep such total execution time down to a
few tens of minutes at most. Therefore, try (very hard) to keep the
execution time of your individual tests below 2 minutes, ideally
shorter than that. Concretely, adding inappropriately long 'sleep'
statements or other gratuitous waits to tests is a killer. If, under
normal circumstances, your pod enters the running state within 10
seconds, and 99.9% of the time within 30 seconds, it would be
gratuitous to wait 5 minutes for this to happen. Rather, just fail
after 30 seconds, with a clear error message as to why your test
failed (e.g. "Pod x failed to become ready after 30 seconds; it
usually takes 10 seconds"). If you do have a truly legitimate reason
for waiting longer than that, or for writing a test which takes longer
than 2 minutes to run, comment very clearly in the code why this is
necessary, and label the test as
["\[Slow\]"](e2e-tests.md#kinds_of_tests), so that it's easy to
identify and avoid in test runs that are required to complete
in a timely manner (for example those that are run against every code
submission before it is allowed to be merged).

Note that completing within, say, 2 minutes only when the test
passes is not generally good enough. Your test should also fail in a
reasonable time. We have seen tests that, for example, wait up to 10
minutes for each of several pods to become ready. Under good
conditions these tests might pass within a few seconds, but if the
pods never become ready (e.g. due to a system regression) they take a
very long time to fail and typically cause the entire test run to time
out, so that no results are produced. Again, this is a lot less
useful than a test that fails reliably within a minute or two when the
system is not working correctly.
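
To make the "poll briefly, then fail loudly" pattern concrete, here is a
minimal sketch. It assumes the `wait` utility package from the Kubernetes
source tree (`pkg/util/wait`); `podIsRunning` is a hypothetical helper
standing in for a real status check, and the timeout values are
illustrative only:

```
package e2e

import (
	"time"

	. "github.com/onsi/gomega"

	"k8s.io/kubernetes/pkg/util/wait"
)

// expectPodRunningSoon polls at a short interval with a hard 30-second cap,
// and fails with a message that says what was being waited for and for how
// long, rather than sleeping for minutes "just in case".
func expectPodRunningSoon(podName string, podIsRunning func() (bool, error)) {
	const (
		pollInterval    = 2 * time.Second
		podStartTimeout = 30 * time.Second
	)
	err := wait.Poll(pollInterval, podStartTimeout, func() (bool, error) {
		return podIsRunning()
	})
	Expect(err).NotTo(HaveOccurred(),
		"Pod %s did not become running within %v (it usually takes about 10s)",
		podName, podStartTimeout)
}
```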

#### Resilience to relatively rare, temporary infrastructure glitches or delays ####

Remember that your test will be run many thousands of
times, at different times of day and night, probably on different
cloud providers, under different load conditions. And often the
underlying state of these systems is stored in eventually consistent
data stores. So, for example, if a resource creation request is
theoretically asynchronous, even if you observe it to be practically
synchronous most of the time, write your test to assume that it's
asynchronous (e.g. make the "create" call, and poll or watch the
resource until it's in the correct state before proceeding).
Similarly, don't assume that API endpoints are 100% available.
They're not. Under high load conditions, API calls might temporarily
fail or time out. In such cases it's appropriate to back off and retry
a few times before failing your test completely (in which case make
the error message very clear about what happened, e.g. "Retried
http://... 3 times - all failed with xxx"). Use the standard
retry mechanisms provided in the libraries detailed below.
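
Purely to illustrate the shape of such a retry (prefer an existing helper
from the utils library described below wherever one exists), a hand-rolled
back-off might look something like this; `apiCall`, the attempt count and
the back-off values are all hypothetical:

```
package e2e

import (
	"fmt"
	"time"
)

// retryAPICall retries a flaky API call a few times with increasing
// back-off, and returns an error that records how many attempts were made
// and the last failure, so the eventual test failure explains what happened.
func retryAPICall(description string, apiCall func() error) error {
	const maxAttempts = 3
	backoff := 1 * time.Second

	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = apiCall(); lastErr == nil {
			return nil
		}
		time.Sleep(backoff)
		backoff *= 2
	}
	return fmt.Errorf("retried %s %d times - all failed, last error: %v",
		description, maxAttempts, lastErr)
}
```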

### Some concrete tools at your disposal ###

Obviously most of the above goals apply to many tests, not just yours.
So we've developed a set of reusable test infrastructure, libraries
and best practices to help you do the right thing, or at least do
the same thing as other tests, so that if that turns out to be the
wrong thing, it can be fixed in one place, not hundreds, to be the
right thing.

Here are a few pointers:

+ [E2e Framework](../../test/e2e/framework.go):
  Familiarise yourself with this test framework and how to use it.
  Among other things, it automatically creates uniquely named namespaces
  within which your tests can run to avoid name clashes, and reliably
  automates cleaning up the mess after your test has completed (it
  just deletes everything in the namespace). This helps to ensure
  that tests do not leak resources. Note that deleting a namespace
  (and by implication everything in it) is currently an expensive
  operation. So the fewer resources you create, the less cleaning up
  the framework needs to do, and the faster your test (and other
  tests running concurrently with yours) will complete. Your tests
  should always use this framework (see the sketch after this list);
  trying other home-grown approaches to avoiding name clashes and
  resource leaks has proven to be a very bad idea.
+ [E2e utils library](../../test/e2e/util.go):
  This handy library provides tons of reusable code for a host of
  commonly needed test functionality, including waiting for resources
  to enter specified states, safely and consistently retrying failed
  operations, usefully reporting errors, and much more. Make sure
  that you're familiar with what's available there, and use it.
  Likewise, if you come across a generally useful mechanism that's
  not yet implemented there, add it so that others can benefit from
  your brilliance. In particular, pay attention to the variety of
  timeout- and retry-related constants at the top of that file. Always
  try to reuse these constants rather than dreaming up your own
  values. Even if the values there are not precisely what you would
  like to use (timeout periods, retry counts etc.), the benefit of
  having them be consistent and centrally configurable across our
  entire test suite typically outweighs your personal preferences.
+ **Follow the examples of stable, well-written tests:** Some of our
  existing end-to-end tests are better written and more reliable than
  others. A few examples of well-written tests include:
  [Replication Controllers](../../test/e2e/rc.go),
  [Services](../../test/e2e/service.go),
  [Reboot](../../test/e2e/reboot.go).
+ [Ginkgo Test Framework](https://github.com/onsi/ginkgo): This is the
  test library and runner upon which our e2e tests are built. Before
  you write or refactor a test, read the docs and make sure that you
  understand how it works. In particular, be aware that every test is
  uniquely identified and described (e.g. in test reports) by the
  concatenation of its `Describe` clause and nested `It` clauses.
  So for example `Describe("Pods",...).... It("should be scheduled
  with cpu and memory limits")` produces a sane test identifier and
  descriptor `Pods should be scheduled with cpu and memory limits`,
  which makes it clear what's being tested, and hence what's not
  working if it fails. Other good examples include:

  ```
  CAdvisor should be healthy on every node
  ```

  and

  ```
  Daemon set should run and stop complex daemon
  ```

  By contrast (and these are real examples), the following are less
  good test descriptors:

  ```
  KubeProxy should test kube-proxy
  ```

  and

  ```
  Nodes [Disruptive] Network when a node becomes unreachable
  [replication controller] recreates pods scheduled on the
  unreachable node AND allows scheduling of pods on a node after
  it rejoins the cluster
  ```

  An improvement might be

  ```
  Unreachable nodes are evacuated and then repopulated upon rejoining [Disruptive]
  ```
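
Tying the first of these pointers together, a skeletal test using the
framework might look something like the sketch below. It assumes the test
lives in the `e2e` package alongside `framework.go`; the "foobar" feature
name and the `createFoobarsAndWait` helper are placeholders, not real APIs:

```
package e2e

import (
	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
)

// createFoobarsAndWait is a hypothetical helper standing in for whatever
// resources your test actually creates and polls on.
func createFoobarsAndWait(ns string) error {
	// ... create objects in ns and wait, reusing the timeouts in util.go ...
	return nil
}

var _ = Describe("Foobar", func() {
	// The framework creates a uniquely named namespace for each spec and
	// deletes it (and everything in it) when the spec finishes.
	f := NewDefaultFramework("foobar")

	It("should create foobars in its own namespace", func() {
		ns := f.Namespace.Name
		err := createFoobarsAndWait(ns)
		Expect(err).NotTo(HaveOccurred(), "Failed to set up foobars in namespace %s", ns)
	})
})
```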

Note that opening issues for specific better tooling is welcome, and
code implementing that tooling is even more welcome :-).