k3s/docs/roadmap.md


Kubernetes Roadmap

Updated December 6, 2014

This document is intended to capture the set of supported use cases, features, docs, and patterns that we feel are required to call Kubernetes “feature complete” for a 1.0 release candidate.  This list does not emphasize the bug fixes and stabilization that will be required to take it all the way to production ready. This is a living document, and is certainly open for discussion.

Target workloads

Features for 1.0 will be driven by the initial set of workloads we intend to support.

Most realistic examples of production services include a load-balanced web frontend exposed to the public Internet, with a stateful backend, such as a clustered database or key-value store, so we will target such a workload for 1.0.

Exactly which stateful applications we will target is TBD. Candidates include:

  • redis
  • memcache
  • mysql (using master/slave replication)
  • mongo
  • cassandra
  • etcd
  • zookeeper

APIs

  1. Consistent v1 API. v1beta3 is being developed as the release candidate for the v1 API.
  2. Deprecation policy: Declare the project's intentions with regard to expiring and removing features and interfaces, including the minimum amount of time non-beta APIs will be supported.
  3. Input validation: Validate schemas of API requests in the apiserver and, optionally, in the client. A sketch of apiserver-side validation follows this list.
  4. Error propagation: Report problems reliably and consistently, with documented behavior.
  5. Easy to add new controllers, such as per-node controller
  6. Replication controller: Make replication controller a standalone entity in the master stack.
  7. Pod templates: Proposal to make pod templates a first-class API object, rather than an artifact of the replication controller. #170
  8. Kubelet API should be well defined and versioned.
  9. Cloud provider API for managing nodes, storage, and network resources. #2770
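
To make items 3 and 4 concrete, here is a minimal sketch of apiserver-side input validation that collects every problem with an object and reports them together rather than failing on the first one. The PodSpec type and its fields are simplified stand-ins for illustration, not the actual API objects.

```go
package validation

import "fmt"

// PodSpec is a simplified, hypothetical stand-in for a real API object.
type PodSpec struct {
	Name     string
	Image    string
	MilliCPU int64 // requested CPU, in millicores
	MemoryMB int64 // requested memory, in MB
}

// Validate checks every field and returns all problems at once, so a client
// gets one complete, consistent error report instead of piecemeal failures.
func Validate(spec PodSpec) []error {
	var errs []error
	if spec.Name == "" {
		errs = append(errs, fmt.Errorf("name: required field is empty"))
	}
	if spec.Image == "" {
		errs = append(errs, fmt.Errorf("image: required field is empty"))
	}
	if spec.MilliCPU < 0 || spec.MemoryMB < 0 {
		errs = append(errs, fmt.Errorf("resources: requests must be non-negative"))
	}
	return errs // empty means the spec is valid
}
```

Returning the full set of errors in one structured reply is one way to keep error propagation consistent and documentable, regardless of which field was wrong.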

Scheduling and resource isolation

  1. Resource requirements and scheduling: Use knowledge of resources available and resources required to make good enough scheduling decisions such that applications can start and run. #168
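
As a rough illustration of #168, the sketch below decides whether a pod's declared resource requests fit on a node given what is already placed there. The types and the two resource dimensions are simplified placeholders; a real scheduler would consider more than CPU and memory.

```go
package scheduler

// Resources is a simplified view of what a pod requests or a node offers.
type Resources struct {
	MilliCPU int64
	MemoryMB int64
}

// Node tracks capacity and the requests of pods already placed on it.
type Node struct {
	Capacity Resources
	Pods     []Resources
}

// Fits reports whether a pod's requests can be satisfied by the node's
// remaining capacity. A real scheduler would also weigh ports, disk,
// labels, and policy constraints.
func Fits(pod Resources, node Node) bool {
	used := Resources{}
	for _, p := range node.Pods {
		used.MilliCPU += p.MilliCPU
		used.MemoryMB += p.MemoryMB
	}
	return used.MilliCPU+pod.MilliCPU <= node.Capacity.MilliCPU &&
		used.MemoryMB+pod.MemoryMB <= node.Capacity.MemoryMB
}
```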

Images and registry

  1. Simple out-of-the box registry setup. #1319
  2. Easy to configure .dockercfg.
  3. Easy to deploy new code to Kubernetes (build and push).
  4. Predictable deployment via configuration-time image resolution. #1697
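
One reading of item 4: when a configuration is created, mutable image tags are resolved to immutable digests so that every node later pulls exactly the same bytes. The helper below is only a sketch; in practice the tag-to-digest mapping would be fetched from the registry at configuration time, and the function name and map are hypothetical.

```go
package imageresolve

import (
	"fmt"
	"strings"
)

// resolveImage pins a mutable tag reference (e.g. "nginx:1.7") to an
// immutable digest reference, using a tag->digest map that a real system
// would populate by querying the registry when the config is created.
func resolveImage(ref string, digests map[string]string) (string, error) {
	digest, ok := digests[ref]
	if !ok {
		return "", fmt.Errorf("no digest known for %q", ref)
	}
	repo := ref
	if i := strings.LastIndex(ref, ":"); i >= 0 {
		repo = ref[:i]
	}
	// "repo@sha256:..." resolves to the same bytes on every node, every time.
	return repo + "@" + digest, nil
}
```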

Storage

  1. Durable volumes: Provide a model for data with identity and lifetime independent of pods. #1515, #598, #2609
  2. Pluggable volume sources and devices: Allow new kinds of data sources and/or devices as volumes. #945, #2598

Networking and naming

  1. DNS: Provide DNS for services, internal and external (see the sketch after this list). #2224, #1261
  2. External IPs: Make Kubernetes services externally reachable. #1161
  3. Re-think the network parts of the API: Clean factoring of a la carte networking functionality. #2585
  4. Out-of-the-box, kick-the-tires networking implementation. #1307
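
To make the DNS item concrete: with a cluster DNS addon, a pod could find a service by name instead of relying on injected environment variables. The name used below assumes a service.namespace.suffix scheme, which is only a placeholder; the actual naming convention is part of what #2224 needs to settle.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

func main() {
	// Hypothetical service name; the real naming convention is TBD in #2224.
	addrs, err := net.LookupHost("redis-master.default.kubernetes.local")
	if err != nil {
		log.Fatalf("service lookup failed: %v", err)
	}
	// Expect a single virtual (portal) IP for the service.
	fmt.Println("service IPs:", addrs)
}
```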

Authentication and authorization

  1. Auth[nz] and ACLs: Have a plan for how the API and system will express:
    1. Identity & authentication
    2. Authorization & access control
    3. Cluster subdivision, accounting, & isolation
  2. Support for pluggable authentication implementations and authorization policies (a sketch of possible plugin interfaces follows this list)
  3. Implemented auth[nz] for:
    1. admin to master and/or kubelet
    2. user to master
    3. master component to component (e.g., controller manager to apiserver): localhost in 1.0
    4. kubelet to master
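
As referenced above, here is one possible shape for pluggable auth[nz] hooks. The interfaces and the UserInfo type are hypothetical sketches of the plugin points, not a settled API.

```go
package auth

import "net/http"

// UserInfo is a minimal, illustrative identity; the real design is TBD.
type UserInfo struct {
	Name   string
	Groups []string
}

// Authenticator is one possible shape for a pluggable authentication hook:
// each plugin inspects the request and either identifies the caller or
// passes (ok == false) so the next plugin in the chain can try.
type Authenticator interface {
	AuthenticateRequest(req *http.Request) (user UserInfo, ok bool, err error)
}

// Authorizer is one possible shape for a pluggable policy hook: given an
// identified user and the action being attempted, allow or deny it.
type Authorizer interface {
	Authorize(user UserInfo, verb, resource, namespace string) (allowed bool, reason string, err error)
}
```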

Usability

Documentation

  1. Documented reference cluster architecture
  2. Accurate and complete API documentation

Cluster turnup, scaling, management, and upgrades

  1. Easy cluster startup
  2. Automatic node registration
  3. Configuring k8s
    1. Move away from flags in master
    2. Node configuration distribution
      1. Kubelet configuration
      2. dockercfg
  4. Easy cluster scaling (adding/removing nodes)
  5. Kubernetes can be upgraded
    1. Master components
    2. Kubelets
    3. OS + kernel + Docker

Workload deployment and management

See the CLI/configuration roadmap for details.

Productionization

  1. Scalability
    1. 100 nodes for 1.0
    2. 1000 nodes by summer 2015
  2. HA master -- not gating 1.0
    1. Master election
    2. Eliminate global in-memory state
      1. IP allocator
      2. Operations
    3. Sharding
      1. Pod getter
    4. Kubelets need to coast when the master is down
    5. Don't blow away pods when the master is down
  3. Testing
    1. More/better/easier E2E
    2. E2E integration testing w/ OpenShift
    3. More non-E2E integration tests
    4. Long-term soaking / stress test
    5. Backward compatibility
      1. API
      2. etcd state
  4. Release cadence and artifacts
    1. Regular stable releases on a frequent timeline (2 weeks).
    2. Automatic generation of necessary deployment artifacts. It is still TBD whether this includes debs and RPMs, and it is not yet clear whether it includes Docker containers.
  5. Export monitoring metrics (instrumentation); a sketch follows this list.
  6. Bounded disk space on master and kubelets
    1. GC of unused images
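
For the instrumentation item, one low-cost starting point is the standard library's expvar package, which publishes registered counters (plus memstats) as JSON on /debug/vars. The counters below are illustrative; real components would export operation latencies, QPS, and per-operation error counts.

```go
package main

import (
	"expvar"
	"log"
	"net/http"
)

var (
	// Illustrative counters a component might maintain.
	requestsServed = expvar.NewInt("requests_served")
	requestErrors  = expvar.NewInt("request_errors")
)

func handle(w http.ResponseWriter, r *http.Request) {
	requestsServed.Add(1)
	if _, err := w.Write([]byte("ok\n")); err != nil {
		requestErrors.Add(1)
	}
}

func main() {
	http.HandleFunc("/", handle)
	// expvar registers itself on /debug/vars of the default mux, so the
	// counters are scrapeable without any extra wiring.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```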

Reliability

Current pain points:

  • Writing end-to-end tests should be made easier: tests should rely less (or not at all) on shell scripting and should, as much as possible, be written in Go using appropriate frameworks, so that it is easy to get started with a new end-to-end test.
  • A developer should be able to take an integration test and turn it into an end-to-end test (and vice versa) without needing to significantly rewrite the test.
  • Some e2e tests currently have false positives (they pass when they should not).
  • It is unclear whether our e2e tests are representative of real workloads.
  • We need to make sure other providers stay healthy as we submit code. Breakages for most providers are found too late.
  • Previously discussed: a public dashboard that receives updates from platform maintainers and shows green/red e2e results for each provider per-PR or per-hour or something.
  • It is very challenging to bring up large clusters. For example, for GCE, operations that create routes, firewall rules and instances can fail and need to be robustly retried.
  • We have no current means to measure the reliability of long-running clusters, and our current test infrastructure isn't well suited to this use case.
  • We have little or no instrumentation of the various components - memory and CPU usage, time per operation, QPS, etc.

Reliability Goals:

  • An automated flow that uses exactly the same source from GitHub for end-to-end and other tests, and that can be run regularly (hourly, at commit time, etc.) to ensure none of the providers are broken. Comment from Zach: “I think this is "none of the providers we directly support are broken" (GCE, maybe some local, maybe others). The traditional OSS model is that vendors (OpenShift for instance) handle their own downstream testing, unless they're willing to work fully upstream.”
  • Dashboard or some other form of storing and querying historical build information.

Work Items

  • Issue #3118 Build/find a dashboard to record performance metrics. The dashboard should have graphs of metrics over time and be queryable.
  • Issue #3119 Decide how to archive information.
  • Issue #3120 Configure an automated hourly build and record the time it takes to build Kubernetes from source. Export the information into the dashboard.
  • Issue #3121 Configure the automated e2e test runner (jenkins) to export the time for each test to complete into the dashboard.
  • Issue #3122 When e2e tests run on multiple cloud providers, break out the test performance by provider so that regressions can be tracked on each cloud provider individually.
  • Issue #3123 Measure the time for each Go test in the e2e test suite individually rather than in aggregate.
  • Issue #3124 Develop synthetic workload tests to measure basic Kubernetes API performance. Since tests will run on shared resources (e.g. GCE) individual tests should be run N times to weed out any statistical outliers from the performance results.
  • Issue #3125 Curate a small number of “applications” that can be used to measure performance at a higher level (QPS for high level requests).
  • Work out how to specify performance regression tests (e.g. decide on thresholds).
  • Issue #3126 Measure the time taken to create and tear down clusters of various sizes
  • Issue #3127 Create a network performance test.
  • Issue #3128 Measure memory consumption of Kubernetes master components.
    • What happens if I have LIST queries that return 1000s of results per request from the apiserver, given the Go heap?
    • How efficient is serialization / deserialization of large lists of items?
  • Issue #3130 Rewrite the remaining e2e bash tests in Go. Whilst doing so, reduce/remove the cases where the tests were incorrectly passing.
  • Issue #3131 Refactor the Go e2e tests to use a test framework (ideally just http://golang.org/pkg/testing/ with some extra bits to make sure the cluster is in the right state at the start of the tests). Try to consolidate on a test framework that works the same for integration and e2e tests. A minimal sketch follows this list.
  • Issue #3132 Refactor the e2e tests to allow multiple concurrent runs (assuming it is supported by the cloud provider). Allow the client to be authenticated to multiple clusters (https://github.com/GoogleCloudPlatform/kubernetes/issues/1755)
  • [PR #3046 - done!] Create a GKE cloud provider.
  • Issue #2234 Create an integration test dashboard
  • For each supported cloud provider, ensure that we run the e2e tests regularly and fix any breakages.
    • [done] Set up Jenkins to run on a VM/cluster of VMs in GCE.
    • Should have separate projects/flows for testing against different vendors.
    • Shared configuration with other GCE projects for vendor-specific tests (GKE will need this).
  • Issue #3134 Jenkins should produce build artifacts and push them to GCS ~hourly. Ideally we can use this to build and push a continuous or latest-dev bucket to the official kubernetes-release GCS bucket.
  • Issue #2953 [zml] Capability bits: I proposed this last week, I still need to write up an issue on it. The idea is that along with the API version (and server version?), the server communicates a bucket of tags that says "I support these capabilities". Then tests like pd.sh can stop being conditionalized on provider and can instead be conditionalized on server capability. Want to get this filed/done before v1beta3, and has testing impact. (Zach edit: The Is here are me.)
  • Stress testing as a Jenkins job using a large-ish number of VMs.
  • Issue #3135 [zml] Upgrade testing: Related to the previous, but you could write an entire doc on upgrade testing alone. I think we're going to need a story here, and it's actually a long one. We need to get a pretty good handle on upgrade/release policy, versions we're going to keep around (OSS-wise, GKE-wise, etc), versions we're going to allow upgrade between, etc. (I volunteer to help pin people down here - I think the release process is getting driven elsewhere but this is a crossbar item between that group and us that's pretty important). (Zach edit: The Is here are me.)
  • Issue #3136 Create a compatibility test matrix. Verify that an old client works with a new server, different api versions, etc.
  • Issue #3137 Create a soak test.
  • [satnam] Sometimes builds fail after an update and require a build/make-clean.sh. We should ensure that tests, builds etc. get cleaned up properly.
  • Issue #3138 [davidopp] A way to record a real workload and replay it deterministically
  • Issue #3139 [davidopp] A way to generate a synthetic workload and play it
  • Issue #2852 and Issue #3067 [vishnuk] Protect system services against kernel OOM kills and resource starvation.
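
A minimal sketch of what an e2e test written against the plain testing package might look like (per #3131). The apiserver address and the /healthz readiness check are placeholders for whatever the shared framework ends up providing.

```go
package e2e

import (
	"net/http"
	"testing"
	"time"
)

// apiServer would normally come from test flags or the shared framework;
// this address is only a placeholder.
const apiServer = "http://127.0.0.1:8080"

// TestAPIServerHealthy waits for the apiserver to report healthy, the kind
// of readiness check most e2e tests need before doing real work.
func TestAPIServerHealthy(t *testing.T) {
	deadline := time.Now().Add(2 * time.Minute)
	for time.Now().Before(deadline) {
		resp, err := http.Get(apiServer + "/healthz")
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return // cluster is up; the real test body would go here
			}
		}
		time.Sleep(5 * time.Second)
	}
	t.Fatalf("apiserver at %s never became healthy", apiServer)
}
```

Because this is ordinary Go testing code, the same file structure can serve as an integration test with a different setup path, which is the consolidation the refactor is after.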

Performance

Currently we conflate the performance of a Kubernetes cluster of any size with its scalability. Later we may wish to tease apart these two concerns. As part of overall performance, we also consider the performance of the build and test processes. The main current pain point is that we have no systematic performance measurement mechanism or process.

The goals of the performance related activities are:

  • A collection of performance regression tests that measure the effect of code changes.
  • A dashboard or some other form of storing and querying historical performance information.

Things we could measure:

  1. Time to build from source.
  2. Time to run each test in the end-to-end suite (or a mean plus standard deviation).
  3. Time taken to perform API operations, e.g.:
    1. List pods, for a varying number of pods (no labels).
    2. List pods, using various label constraints.
    3. Delete pods.
    4. Create service.
    5. Create/delete pod N times.
    6. Schedule pod (in a system with various numbers of existing pods).
    7. Time taken to perform the above during concurrent access scenarios (i.e. 10, 50, 100 concurrent users).
    8. Time taken to perform the above when using an etcd cluster of size 3 with various snapshot intervals.
  4. Overhead of running some kind of application on a Kubernetes cluster vs. a hand-spun version directly on a cloud platform or other cluster.
  5. Network performance.
    1. Create a series of layered services (e.g. the onion router network) and measure the round-trip time for requests to succeed through N layers of services.
  6. Memory consumption of Kubernetes components on the master at varying cluster sizes.
    1. What happens if I have LIST queries that return 1000s of results per request from the apiserver, given the Go heap?
    2. How efficient is serialization / deserialization of large lists of items? (A benchmark sketch follows this list.)
  7. Open questions:
    1. Synthetic workloads vs. real workloads.
    2. How much of the performance testing will/should be cloud-provider specific?
    3. Are there any open source tools / frameworks we can use?
    4. Storage performance?
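
For the serialization/deserialization questions above, a plain Go benchmark gives a first-order answer. The item type is a simplified stand-in for an API object in a large LIST response.

```go
package perf

import (
	"encoding/json"
	"fmt"
	"testing"
)

// item is a simplified stand-in for an API object in a large LIST response.
type item struct {
	Name   string            `json:"name"`
	Labels map[string]string `json:"labels"`
}

// BenchmarkEncodeLargeList measures how long it takes to serialize a list
// of a few thousand objects, roughly the shape of a big "list pods" reply.
func BenchmarkEncodeLargeList(b *testing.B) {
	list := make([]item, 5000)
	for i := range list {
		list[i] = item{
			Name:   fmt.Sprintf("pod-%d", i),
			Labels: map[string]string{"app": "demo", "index": fmt.Sprint(i)},
		}
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := json.Marshal(list); err != nil {
			b.Fatal(err)
		}
	}
}
```

Running it with `go test -bench=. -benchmem` also reports allocations, which is usually where large LIST responses hurt most.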