# Kubernetes Roadmap
Updated December 6, 2014
This document is intended to capture the set of supported use cases, features, docs, and patterns that we feel are required to call Kubernetes “feature complete” for a 1.0 release candidate.  This list does not emphasize the bug fixes and stabilization that will be required to take it all the way to production ready. This is a living document, and is certainly open for discussion.
## Target workloads
Features for 1.0 will be driven by the initial set of workloads we intend to support.
Most realistic examples of production services include a load-balanced web frontend exposed to the public Internet, with a stateful backend, such as a clustered database or key-value store, so we will target such a workload for 1.0.
Exactly which stateful applications we will support is TBD. Candidates include:
* redis
* memcache
* mysql (using master/slave replication)
* mongo
* cassandra
* etcd
* zookeeper
## APIs
1. Consistent v1 API. [v1beta3](https://github.com/GoogleCloudPlatform/kubernetes/issues/1519) is being developed as the release candidate for the v1 API.
2. Deprecation policy: Declare the project's intentions with regard to expiring and removing features and interfaces, including the minimum amount of time non-beta APIs will be supported.
3. Input validation: Validate schemas of API requests in the apiserver and, optionally, in the client (see the sketch after this list).
4. Error propagation: Report problems reliably and consistently, with documented behavior.
5. Easy to add new controllers, such as [per-node controller](https://github.com/GoogleCloudPlatform/kubernetes/pull/2491)
   1. Replication controller: Make replication controller a standalone entity in the master stack.
   2. Pod templates: Proposal to make pod templates a first-class API object, rather than an artifact of replica controller [#170](https://github.com/GoogleCloudPlatform/kubernetes/issues/170)
6. Kubelet API should be well defined and versioned.
7. Cloud provider API for managing nodes, storage, and network resources. [#2770](https://github.com/GoogleCloudPlatform/kubernetes/issues/2770)
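To illustrate item 3 (input validation), here is a minimal sketch of the kind of check the apiserver (or, optionally, a client) could run before accepting an object. The `PodSpec` and `Container` types and their fields below are simplified stand-ins for illustration, not the real API schema.

```go
package validation

import "fmt"

// PodSpec and Container are stand-ins for API objects; the fields are
// illustrative, not the actual Kubernetes schema.
type PodSpec struct {
	Name       string
	Containers []Container
}

type Container struct {
	Name  string
	Image string
}

// ValidatePodSpec returns every schema violation found, so that problems can
// be reported reliably and all at once rather than one per request.
func ValidatePodSpec(spec PodSpec) []error {
	var errs []error
	if spec.Name == "" {
		errs = append(errs, fmt.Errorf("name: required field is empty"))
	}
	if len(spec.Containers) == 0 {
		errs = append(errs, fmt.Errorf("containers: at least one container is required"))
	}
	for i, c := range spec.Containers {
		if c.Image == "" {
			errs = append(errs, fmt.Errorf("containers[%d].image: required field is empty", i))
		}
	}
	return errs
}
```

Returning the full list of violations (rather than failing on the first) also serves item 4: error propagation with documented, consistent behavior.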
## Scheduling and resource isolation
1. Resource requirements and scheduling: Use knowledge of resources available and resources required to make good enough scheduling decisions such that applications can start and run. [#168](https://github.com/GoogleCloudPlatform/kubernetes/issues/168)
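A minimal sketch of the resource-fit test this implies: the scheduler only places a pod on a node where the capacity, minus what is already requested there, covers the pod's request. The `Resources` type and its two dimensions are assumptions for illustration, not the real scheduler data structures.

```go
package scheduling

// Resources captures the two quantities used in this sketch; the real system
// will likely track more dimensions.
type Resources struct {
	MilliCPU int64
	MemoryMB int64
}

// Fits reports whether a pod requesting `request` can be placed on a node of
// the given capacity, given the requests of pods already assigned there.
func Fits(request, capacity Resources, assigned []Resources) bool {
	var used Resources
	for _, r := range assigned {
		used.MilliCPU += r.MilliCPU
		used.MemoryMB += r.MemoryMB
	}
	return used.MilliCPU+request.MilliCPU <= capacity.MilliCPU &&
		used.MemoryMB+request.MemoryMB <= capacity.MemoryMB
}
```

Checking requests against capacity in this way is enough for the stated goal: applications can start and run without the scheduler needing live utilization data.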
## Images and registry
1. Simple out-of-the box registry setup. [#1319](https://github.com/GoogleCloudPlatform/kubernetes/issues/1319)
2. Easy to configure .dockercfg.
3. Easy to deploy new code to Kubernetes (build and push).
4. Predictable deployment via configuration-time image resolution. [#1697](https://github.com/GoogleCloudPlatform/kubernetes/issues/1697)
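Item 4 means pinning an image at the time the configuration is created, rather than when each container starts, so that every replica and every later restart runs identical bytes. A sketch of where that resolution step would sit; `lookupDigest` is a hypothetical registry call used only for illustration, not an existing client API.

```go
package deploy

import "fmt"

// lookupDigest stands in for a registry query that maps a mutable tag to the
// content digest it currently points at. Hypothetical: shown only to mark
// where configuration-time resolution would happen.
func lookupDigest(image, tag string) (string, error) {
	return "", fmt.Errorf("not implemented: registry lookup for %s:%s", image, tag)
}

// ResolveImage turns a mutable reference such as "redis:2.8" into an
// immutable one such as "redis@sha256:<digest>" at configuration time, so the
// deployment is predictable no matter how the tag later moves.
func ResolveImage(image, tag string) (string, error) {
	digest, err := lookupDigest(image, tag)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s@%s", image, digest), nil
}
```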
## Storage
1. Durable volumes: Provide a model for data with identity and lifetime independent of pods. [#1515](https://github.com/GoogleCloudPlatform/kubernetes/pull/1515), [#598](https://github.com/GoogleCloudPlatform/kubernetes/issues/598), [#2609](https://github.com/GoogleCloudPlatform/kubernetes/pull/2609)
2. Pluggable volume sources and devices: Allow new kinds of data sources and/or devices as volumes. [#945](https://github.com/GoogleCloudPlatform/kubernetes/issues/945), [#2598](https://github.com/GoogleCloudPlatform/kubernetes/pull/2598)
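Item 2 implies a plugin boundary: the kubelet talks to every volume through a small interface, and new kinds of data sources or devices register an implementation behind it. One possible shape for that boundary is sketched below; the interface and registration scheme are assumptions for illustration, not the design being worked out in the linked issues.

```go
package volume

// Builder is one possible plugin shape: something that can set a volume up on
// the node and tear it down again. Illustrative only.
type Builder interface {
	// SetUp makes the volume available and returns the host path to mount
	// into containers.
	SetUp() (path string, err error)
	// TearDown releases the volume when the pod goes away.
	TearDown() error
}

// Factory builds a volume of one kind from opaque per-kind configuration.
type Factory func(config map[string]string) (Builder, error)

// plugins maps a volume kind (e.g. "empty-dir", "gce-pd") to its factory.
var plugins = map[string]Factory{}

// Register lets a new kind of data source or device plug in without changes
// to the kubelet core.
func Register(kind string, f Factory) {
	plugins[kind] = f
}
```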
## Networking and naming
1. DNS: Provide DNS for services, internal and external (see the sketch after this list). [#2224](https://github.com/GoogleCloudPlatform/kubernetes/pull/2224), [#1261](https://github.com/GoogleCloudPlatform/kubernetes/issues/1261)
2. External IPs: Make Kubernetes services externally reachable. [#1161](https://github.com/GoogleCloudPlatform/kubernetes/issues/1161)
3. Re-think the network parts of the API: Clean factoring of a la carte networking functionality. [#2585](https://github.com/GoogleCloudPlatform/kubernetes/issues/2585)
4. Out-of-the-box, kick-the-tires networking implementation. [#1307](https://github.com/GoogleCloudPlatform/kubernetes/issues/1307)
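For the DNS item above, the intent is that a service is reachable by a well-known name from inside the cluster. A sketch of what a client would then do; the `<service>.<namespace>.cluster.local` naming scheme is an assumption for illustration, since the actual scheme is part of what the linked issues are settling.

```go
package main

import (
	"fmt"
	"net"
)

// serviceAddr resolves a service purely by DNS name. The naming scheme used
// here is assumed for illustration; the real scheme is whatever the DNS work
// decides on.
func serviceAddr(service, namespace string) (string, error) {
	host := fmt.Sprintf("%s.%s.cluster.local", service, namespace)
	ips, err := net.LookupIP(host)
	if err != nil {
		return "", err
	}
	if len(ips) == 0 {
		return "", fmt.Errorf("no addresses for %s", host)
	}
	return ips[0].String(), nil
}

func main() {
	addr, err := serviceAddr("redis-master", "default")
	if err != nil {
		fmt.Println("lookup failed (expected outside a cluster):", err)
		return
	}
	fmt.Println("redis-master is at", addr)
}
```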
## Authentication and authorization
1. Auth[nz] and ACLs: Have a plan for how the API and system will express:
   1. Identity & authentication
   2. Authorization & access control
   3. Cluster subdivision, accounting, & isolation
2. Support for pluggable authentication implementations and authorization policies
3. Implemented auth[nz] for:
   1. admin to master and/or kubelet
   2. user to master
   3. master component to component (e.g., controller manager to apiserver): localhost in 1.0
   4. kubelet to master
## Usability
### Documentation
1. Documented reference cluster architecture
2. Accurate and complete API documentation
### Cluster turnup, scaling, management, and upgrades
1. Easy cluster startup
   1. Automatic node registration
   2. Configuring k8s
      1. Move away from flags in master
      2. Node configuration distribution
         1. Kubelet configuration
         2. dockercfg
2. Easy cluster scaling (adding/removing nodes)
3. Kubernetes can be upgraded
   1. master components
   2. Kubelets
   3. OS + kernel + Docker
### Workload deployment and management
See the [CLI/configuration roadmap](cli-roadmap.md) for details.
## Productionization
1. Scalability
   1. 100 nodes for 1.0
   2. 1000 nodes by summer 2015
2. HA master -- not gating 1.0
   1. Master election
   2. Eliminate global in-memory state
      1. IP allocator
      2. Operations
   3. Sharding
      1. Pod getter
3. Kubelets need to coast when master down
   1. Don't blow away pods when the master is down
4. Testing
   1. More/better/easier E2E
   2. E2E integration testing w/ OpenShift
   3. More non-E2E integration tests
   4. Long-term soaking / stress test
   5. Backward compatibility
      1. API
      2. etcd state
5. Release cadence and artifacts
   1. Regular stable releases on a frequent timeline (2 weeks).
   2. Automatic generation of necessary deployment artifacts. It is still TBD whether this includes debs and RPMs, and it is also not clear whether it includes Docker containers.
6. Export monitoring metrics (instrumentation)
7. Bounded disk space on master and kubelets
   1. GC of unused images
# Reliability
## Current pain points:
* Writing end-to-end tests should be made easier: they should not rely so much (or at all) on scripting and should as much as possible be written in Go, using appropriate frameworks to make it easy to get started with an end-to-end test (see the sketch after this list).
* A developer should be able to take an integration test and turn it into an end-to-end test (and vice versa) without needing to significantly rewrite the test.
* Some e2e tests currently have false positives (they pass when they should not).
* It is unclear whether our e2e tests are representative of real workloads.
* We need to make sure other providers stay healthy as we submit code. Breakages for most providers are found too late.
* Previously discussed: a public dashboard that receives updates from platform maintainers and shows green/red e2e results for each provider per-PR or per-hour or something.
* It is very challenging to bring up large clusters. For example, for GCE, operations that create routes, firewall rules and instances can fail and need to be robustly retried.
* We have no current means to measure the reliability of long running clusters, and our current test infrastructure isn't well suited to this use case.
* We have little or no instrumentation of the various components - memory and CPU usage, time per operation, QPS, etc.
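The first two pain points suggest e2e tests written as ordinary Go tests. A minimal sketch of what that could look like; the `clusterAlive` helper and the `E2E_MASTER_URL` environment variable are assumptions for illustration, not an existing harness.

```go
package e2e

import (
	"net/http"
	"os"
	"testing"
)

// clusterAlive is a hypothetical helper: it hits the master's health endpoint
// and reports whether the cluster under test is reachable.
func clusterAlive(masterURL string) bool {
	resp, err := http.Get(masterURL + "/healthz")
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

// TestClusterIsHealthy uses only the standard testing package, so the same
// test can serve as an integration test or an end-to-end test depending on
// what E2E_MASTER_URL points at.
func TestClusterIsHealthy(t *testing.T) {
	masterURL := os.Getenv("E2E_MASTER_URL")
	if masterURL == "" {
		t.Skip("E2E_MASTER_URL not set; skipping end-to-end check")
	}
	if !clusterAlive(masterURL) {
		t.Fatalf("master at %s did not report healthy", masterURL)
	}
}
```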
## Reliability Goals:
* Automated flow that uses exactly the same source for end-to-end etc. tests from GitHub which can be regularly run (hourly, at commit time etc.) to ensure none of the providers are broken. Comment from Zach: “I think this is 'none of the providers we directly support are broken' (GCE, maybe some local, maybe others). The traditional OSS model is that vendors (OpenShift for instance) handle their own downstream testing, unless they're willing to work fully upstream.”
* Dashboard or some other form of storing and querying historical build information.
## Work Items
* Issue [#3130](https://github.com/GoogleCloudPlatform/kubernetes/issues/3130) Rewrite the remaining e2e bash tests in Go. Whilst doing so, reduce/remove the cases where the tests were incorrectly passing.
* Issue [#3131](https://github.com/GoogleCloudPlatform/kubernetes/issues/3131) Refactor the Go e2e tests to use a test framework (ideally just http://golang.org/pkg/testing/ with some extra bits to make sure the cluster is in the right state at the start of the tests). Try to consolidate on a test framework that works the same for integration and e2e tests.
* Issue [#3132](https://github.com/GoogleCloudPlatform/kubernetes/issues/3132) Refactor the e2e tests to allow multiple concurrent runs (assuming it is supported by the cloud provider).
* Issue [#1755](https://github.com/GoogleCloudPlatform/kubernetes/issues/1755) Allow the client to be authenticated to multiple clusters.
* [PR #3046 - done!] Create a GKE cloud provider.
* Issue [#2234](https://github.com/GoogleCloudPlatform/kubernetes/issues/2234) Create an integration test dashboard
* For each supported cloud provider, ensure that we run the e2e tests regularly and fix any breaks
* [done] Setup Jenkins to run on VM/cluster of VMs in GCE.
* Should have separate projects/flows for testing against different vendors.
* Shared configuration with other GCE projects for vendor specific tests (GKE will need this).
* Issue [#3134](https://github.com/GoogleCloudPlatform/kubernetes/issues/3134) Jenkins should produce build artifacts and push them to GCS ~hourly. Ideally we can use this to build and push a continuous or latest-dev bucket to the official kubernetes-release GCS bucket.
* Issue [#2953](https://github.com/GoogleCloudPlatform/kubernetes/issues/2953) [zml] Capability bits: I proposed this last week, I still need to write up an issue on it. The idea is that along with the API version (and server version?), the server communicates a bucket of tags that says "I support these capabilities". Then tests like pd.sh can stop being conditionalized on provider and can instead be conditionalized on server capability (see the sketch after this list). Want to get this filed/done before v1beta3, and has testing impact. (Zach edit: the "I"s here are me.)
* Stress testing as a Jenkins job using a large-ish number of VMs.
* Issue [#3135](https://github.com/GoogleCloudPlatform/kubernetes/issues/3135) [zml] Upgrade testing: Related to the previous, but you could write an entire doc on upgrade testing alone. I think we're going to need a story here, and it's actually a long one. We need to get a pretty good handle on upgrade/release policy, versions we're going to keep around (OSS-wise, GKE-wise, etc), versions we're going to allow upgrade between, etc. (I volunteer to help pin people down here - I think the release process is getting driven elsewhere but this is a crossbar item between that group and us that's pretty important). (Zach edit: the "I"s here are me.)
* Issue [#3136](https://github.com/GoogleCloudPlatform/kubernetes/issues/3136) Create a compatibility test matrix. Verify that an old client works with a new server, different api versions, etc.
* Issue [#3137](https://github.com/GoogleCloudPlatform/kubernetes/issues/3137) Create a soak test.
* [satnam] Sometimes builds fail after an update and require a build/make-clean.sh. We should ensure that tests, builds etc. get cleaned up properly.
* Issue [#3138](https://github.com/GoogleCloudPlatform/kubernetes/issues/3138) [davidopp] A way to record a real workload and replay it deterministically
* Issue [#3139](https://github.com/GoogleCloudPlatform/kubernetes/issues/3139) [davidopp] A way to generate a synthetic workload and play it
* Issue [#2852](https://github.com/GoogleCloudPlatform/kubernetes/issues/2852) and Issue [#3067](https://github.com/GoogleCloudPlatform/kubernetes/issues/3067) [vishnuk] Protect system services against kernel OOM kills and resource starvation.
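The capability-bits work item (#2953) above would let tests gate on what the server says it supports rather than on which provider is in use. A sketch of how a test might consume such capabilities; the `/capabilities` endpoint, the tag names, and the response shape are all assumptions, since the proposal had not yet been written up.

```go
package e2e

import (
	"encoding/json"
	"net/http"
	"testing"
)

// serverCapabilities fetches the capability tags the server advertises. The
// /capabilities endpoint and its JSON shape are illustrative; the real
// mechanism is whatever #2953 ends up specifying.
func serverCapabilities(masterURL string) (map[string]bool, error) {
	resp, err := http.Get(masterURL + "/capabilities")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var tags []string
	if err := json.NewDecoder(resp.Body).Decode(&tags); err != nil {
		return nil, err
	}
	caps := make(map[string]bool, len(tags))
	for _, tag := range tags {
		caps[tag] = true
	}
	return caps, nil
}

// TestPersistentDisk shows the pattern: skip when the capability is absent,
// instead of being conditionalized on the cloud provider's name.
func TestPersistentDisk(t *testing.T) {
	caps, err := serverCapabilities("http://localhost:8080")
	if err != nil {
		t.Skipf("could not read capabilities: %v", err)
	}
	if !caps["persistent-disk"] {
		t.Skip("server does not advertise persistent-disk support")
	}
	// ...the actual persistent disk checks would go here...
}
```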
# Performance
Currently we conflate the performance of a Kubernetes cluster of any size with scalability. Later we may wish to tease apart these two concerns. As part of overall performance, we also consider the performance of the build and test processes. The main current pain point is that we have no systematic performance measurement mechanism or process.
The goals of the performance related activities are:
* A collection of performance regression tests that measure the effect of code changes.
* A dashboard or some other form of storing and querying historical performance information.
Things we could measure:
1. Time to build from source.
2. Time to run each test in the end to end suite (or a mean plus standard deviation).
3. Time taken to perform API operations (a sketch of one such measurement follows this list), e.g.
   1. List pods, for a varying number of pods (no labels).
   2. List pods, using various label constraints.
   3. Delete pods.
   4. Create service.
   5. Create/Delete pod N times.
   6. Schedule pod (in a system with various numbers of existing pods).
4. Time taken to perform the above during concurrent access scenarios (e.g. 10, 50, 100 concurrent users).
5. Time taken to perform the above when using an etcd cluster size of 3 with various snapshot intervals
6. Overhead of running some kind of application on a Kubernetes cluster vs. a hand-spun version directly on a cloud platform or other cluster.
7. Network performance.
   1. Create a series of layered services (e.g. the onion router network) and measure RTT for requests to succeed through N layers of services.
8. Memory consumption of kube components on master at varying cluster sizes.
   1. What happens to the Go heap when LIST queries return thousands of results per request from the apiserver?
   2. How efficient is serialization / deserialization of large lists of items?
9. Open questions:
   1. Synthetic workloads vs. real workloads
   2. How much of the performance testing will/should be cloud provider specific?
   3. Are there any open source tools / frameworks we can use?
   4. Storage performance?
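As referenced under item 3 above, a minimal sketch of how one of the API-operation timings could be measured: repeat the operation N times and report the mean and standard deviation, so that outliers on shared infrastructure don't dominate. The apiserver path and the plain-HTTP approach are simplifying assumptions for illustration.

```go
package perf

import (
	"fmt"
	"math"
	"net/http"
	"time"
)

// timeOnce measures a single list-pods round trip against the given master.
func timeOnce(masterURL string) (time.Duration, error) {
	start := time.Now()
	resp, err := http.Get(masterURL + "/api/v1beta1/pods")
	if err != nil {
		return 0, err
	}
	resp.Body.Close()
	return time.Since(start), nil
}

// MeasureListPods repeats the operation n times and returns the mean and
// standard deviation in milliseconds, which is what would be pushed to the
// performance dashboard.
func MeasureListPods(masterURL string, n int) (mean, stddev float64, err error) {
	if n <= 0 {
		return 0, 0, fmt.Errorf("need at least one sample")
	}
	samples := make([]float64, 0, n)
	for i := 0; i < n; i++ {
		d, err := timeOnce(masterURL)
		if err != nil {
			return 0, 0, fmt.Errorf("run %d: %v", i, err)
		}
		samples = append(samples, float64(d)/float64(time.Millisecond))
	}
	for _, s := range samples {
		mean += s
	}
	mean /= float64(len(samples))
	for _, s := range samples {
		stddev += (s - mean) * (s - mean)
	}
	stddev = math.Sqrt(stddev / float64(len(samples)))
	return mean, stddev, nil
}
```

The same mean/stddev pair is what the dashboard and regression thresholds in the work items below would consume.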
## Work Items
* Issue [#3118](https://github.com/GoogleCloudPlatform/kubernetes/issues/3118) Build/find a dashboard to record performance metrics. The dashboard should have graphs of metrics over time and be queryable.
* Issue [#3119](https://github.com/GoogleCloudPlatform/kubernetes/issues/3119) Decide how to archive information.
* Issue [#3120](https://github.com/GoogleCloudPlatform/kubernetes/issues/3120) Configure an automated hourly build and record the time it takes to build kubernetes from source. Export the information into the dashboard.
* Issue [#3121](https://github.com/GoogleCloudPlatform/kubernetes/issues/3121) Configure the automated e2e test runner (jenkins) to export the time for each test to complete into the dashboard.
* Issue [#3122](https://github.com/GoogleCloudPlatform/kubernetes/issues/3122) When e2e tests run on multiple cloud providers, break out the test performance by provider so that regressions can be tracked on each cloud provider individually.
* Issue [#3123](https://github.com/GoogleCloudPlatform/kubernetes/issues/3123) Measure the time for each Go test in the e2e test suite individually rather than in aggregate.
* Issue [#3124](https://github.com/GoogleCloudPlatform/kubernetes/issues/3124) Develop synthetic workload tests to measure basic Kubernetes API performance. Since tests will run on shared resources (e.g. GCE) individual tests should be run N times to weed out any statistical outliers from the performance results.
* Issue [#3125](https://github.com/GoogleCloudPlatform/kubernetes/issues/3125) Curate a small number of “applications” that can be used to measure performance at a higher level (QPS for high level requests).
* Work out how to specify performance regression tests (e.g. decide on thresholds).
* Issue [#3126](https://github.com/GoogleCloudPlatform/kubernetes/issues/3126) Measure the time taken to create and tear down clusters of various sizes.
* Issue [#3127](https://github.com/GoogleCloudPlatform/kubernetes/issues/3127) Create a network performance test.
* Issue [#3128](https://github.com/GoogleCloudPlatform/kubernetes/issues/3128) Measure memory consumption of kubernetes master components.
  * What happens to the Go heap when LIST queries return thousands of results per request from the apiserver?
  * How efficient is serialization / deserialization of large lists of items?