We already have a performance testing system -- Kubemark. However, Kubemark requires setting up and bootstrapping a whole cluster, which takes a lot of time.
We want a standard way to reproduce scheduling latency metrics and to benchmark the scheduler as simply and quickly as possible. We have the following goals:
- Save time on testing
  - The tests and benchmarks can be run on a single machine.
    We only set up the components necessary for scheduling, without booting up a cluster.
- Profile runtime metrics to find bottlenecks
  - Write scheduler integration tests that focus on performance measurement.
    Take advantage of Go profiling tools to collect fine-grained metrics,
    such as CPU, memory, and block profiles.
- Reproduce test results easily
  - We want a known place for scheduler performance tests.
    Developers should only need to run one script to collect all the information they need.
Currently the test suite has the following:

- density test (implemented as a new Go test)
  - schedules 30k pods on 1000 (fake) nodes and 3k pods on 100 (fake) nodes
  - prints the scheduling rate every second
  - lets you see how the rate changes with the number of scheduled pods
- benchmark
  - uses `go test -bench` and reports nanoseconds per operation (ns/op)
  - schedules b.N pods when the cluster has N nodes and P scheduled pods. Since one round takes a relatively long time to finish, b.N is small: 10 to 100.
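
The shape of such a benchmark can be sketched as follows. This is only an illustration under stated assumptions: `fakeCluster`, `newFakeCluster`, `schedulePod`, and `benchmarkScheduling` are hypothetical names, and the "scheduler" here is a toy least-loaded placement rather than the real one. It shows a cluster with N nodes and P already-scheduled pods being set up outside the timed region, followed by a loop that schedules b.N pods.

```go
package main

import (
	"fmt"
	"testing"
)

// fakeCluster is a hypothetical stand-in for the scheduler under test.
// It only tracks how many pods each fake node holds.
type fakeCluster struct {
	podsPerNode []int
}

// newFakeCluster builds a cluster with the given number of nodes and
// pre-schedules scheduledPods pods onto it.
func newFakeCluster(nodes, scheduledPods int) *fakeCluster {
	c := &fakeCluster{podsPerNode: make([]int, nodes)}
	for i := 0; i < scheduledPods; i++ {
		c.schedulePod()
	}
	return c
}

// schedulePod places one pod on the least-loaded node and returns its index.
func (c *fakeCluster) schedulePod() int {
	best := 0
	for i, n := range c.podsPerNode {
		if n < c.podsPerNode[best] {
			best = i
		}
	}
	c.podsPerNode[best]++
	return best
}

// benchmarkScheduling returns a benchmark that schedules b.N pods on a
// cluster with the given number of nodes and already-scheduled pods.
func benchmarkScheduling(nodes, scheduledPods int) func(b *testing.B) {
	return func(b *testing.B) {
		c := newFakeCluster(nodes, scheduledPods)
		b.ResetTimer() // exclude cluster setup from the measurement
		for i := 0; i < b.N; i++ {
			c.schedulePod()
		}
	}
}

func main() {
	// testing.Init registers the testing flags, which is needed when
	// calling testing.Benchmark outside of "go test".
	testing.Init()
	// The testing package chooses b.N; this toy workload is cheap, so b.N
	// will be far larger here than in the real scheduler benchmark.
	res := testing.Benchmark(benchmarkScheduling(100, 1000))
	fmt.Printf("scheduled %d pods, %d ns/op\n", res.N, res.NsPerOp())
}
```

In a `_test.go` file the same body would normally live in a `func BenchmarkXxx(b *testing.B)` function and be run with `go test -bench`, which prints the ns/op figure directly.
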