k3s/docs/roadmap.md

# Kubernetes Roadmap

Updated August 28, 2014

This document is intended to capture the set of features, docs, and patterns that we feel are required to call Kubernetes “feature complete” for a 1.0 release candidate.  This list does not emphasize the bug fixes and stabilization that will be required to take it all the way to production ready.  This is a living document, and is certainly open for discussion.

## APIs
1. ~~Versioned APIs:  Manage APIs for master components and kubelets with explicit versions, version-specific conversion routines, and component-to-component version checking.~~ **Done**
2. Component-centric APIs:  Clarify which types belong in each component’s API and which ones are truly common.
  1. Clarify the role of etcd in the cluster.
3. Idempotency: Whenever possible APIs must be idempotent.
4. Container restart policy: Policy for each pod or container stating whether and when it should be restarted upon termination.
5. Life cycle events/hooks and notifications: Notify containers about what is happening to them.
6. Re-think the network parts of the API: Find resolution on the multiple issues around networking.
  1. ~~Utility of HostPorts in ip-per-pod~~ **Done**
  2. Services/Links/Portals/Ambassadors
7. Durable volumes: Provide a model for data that survives some kinds of outages.
8. Auth[nz] and ACLs: Have a plan for how the API and system will express:
  1. Identity & authentication
  2. Authorization & access control
  3. Cluster subdivision, accounting, & isolation
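
As an illustration of the idempotency item above, here is a minimal sketch, not taken from the Kubernetes codebase. The `InMemoryStore` class and its `put` method are hypothetical names invented for this example; the point is only that repeating the same request leaves the stored state unchanged.

```python
class InMemoryStore:
    """Hypothetical stand-in for an API server's object store."""

    def __init__(self):
        self._objects = {}

    def put(self, name, spec):
        """Create or replace an object by name.

        Returns True if the stored state changed, False if the call was a
        no-op. Retrying the same request is therefore always safe.
        """
        changed = self._objects.get(name) != spec
        self._objects[name] = dict(spec)
        return changed


store = InMemoryStore()
first = store.put("frontend", {"replicas": 3})
second = store.put("frontend", {"replicas": 3})  # retry of the same request
```

A client that times out and retries `put` cannot create a duplicate object, which is the property the roadmap item asks for wherever it is achievable.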

## Factoring and pluggability
1. ~~Pluggable scheduling: Cleanly separate the scheduler from the apiserver.~~ **Done**
2. Pluggable naming and discovery: Call-outs or hooks to enable external naming systems.
3. Pluggable volumes: Allow new kinds of data sources as volumes.
4. Replication controller: Make replication controller a standalone entity in the master stack.
5. Pod templates: Proposal to make pod templates a first-class API object, rather than an artifact of the replication controller.

## Cluster features
1. ~~Minion death: Cleanly handle the loss of a minion.~~ **Done**
2. Configure DNS: Provide DNS service for k8s running pods, containers and services. Auto-populate it with the things we know.
3. Resource requirements and scheduling: Use knowledge of resources available and resources required to do better scheduling.
4. ~~True IP-per-pod: Get rid of last remnants of shared port spaces for pods.~~ **Done**
5. IP-per-service: Proposal to make services cleaner.
6. Basic deployment tools: This includes tools for higher-level deployment configs.
7. Standard mechanisms for deploying k8s on k8s with a clear strategy for reusing the infrastructure for self-host.

## Node features
1. Container termination reasons: Capture and report exit codes and other termination reasons.
2. Garbage collect old container images: Clean up old docker images that consume local disk. Maybe a TTL on images.
3. Container logs: Expose stdout/stderr from containers without users having to SSH into minions.  Needs a rotation policy to avoid disks getting filled.
4. Container performance information: Capture and report performance data for each container.
5. Host log management: Make sure we don't kill nodes with full disks.
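
The "maybe a TTL on images" idea in the garbage collection item could look roughly like the sketch below. This is an assumption about one possible policy, not a description of how the kubelet works; the function name `images_to_collect` and the last-used bookkeeping are hypothetical.

```python
import time


def images_to_collect(images, in_use, ttl_seconds, now=None):
    """Return the IDs of images that have been unused for longer than the TTL.

    images: dict mapping image ID -> last-used Unix timestamp (hypothetical
        bookkeeping; the roadmap does not specify how this is tracked).
    in_use: set of image IDs referenced by running containers; these are
        never collected, regardless of age.
    """
    if now is None:
        now = time.time()
    return sorted(
        image_id
        for image_id, last_used in images.items()
        if image_id not in in_use and now - last_used > ttl_seconds
    )


images = {"a": 100, "b": 900, "c": 50}
expired = images_to_collect(images, in_use={"c"}, ttl_seconds=500, now=1000)
```

Here image `c` is the oldest but is skipped because a running container still references it, while `b` is kept because it was used recently.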

## Global features
1. Input validation: Stop bad input as early as possible.
2. Error propagation: Report problems reliably and consistently.
3. Consistent patterns of usage of IDs and names throughout the system.
4. Binary release: Repeatable process to produce binaries for release.

## Patterns, policies, and specifications
1. Deprecation policy: Declare the project’s intentions with regard to expiring and removing features and interfaces.
2. Compatibility policy: Declare the project’s intentions with regard to saved state and live upgrades of components.
3. Naming/discovery: Demonstrate techniques for common patterns:
  1. Master-elected services
  2. DB replicas
  3. Sharded services
  4. Worker pools
4. Health-checking: Specification for how it works and best practices.
5. Logging: Demonstrate setting up log collection.
6. ~~Monitoring: Demonstrate setting up cluster monitoring.~~ **Done**
7. Rolling updates: Demo and best practices for live application upgrades.
  1. Have a plan for how higher level deployment / update concepts should / should not fit into Kubernetes
8. Minion requirements: Document the requirements and integrations between kubelet and minion machine environments.

											
										
										
## Update roadmap

We took a hard look at 1.0 and what things are really REQUIRED to get to a
stable release that is "useful".  This required moving some things we thought
were really important but not CRITICAL down the list.

For now they are stricken from this doc, but I expect this doc to start
growing a "post 1.0" list soon.

Things stricken and why:

Using the host network: This is primarily a performance optimization, but it
causes potential problems with other uses of HostPorts.  We'd rather focus on
fixing perf problems than dodging them.  We can revisit later if there is a
strong case for it.

Representation of Ports in the Manifest structure: We discussed and decided
that, since HostPort semantics have changed, this matters less than before.

Scenarios where IP-per-pod is hard or impossible: We're still game to help
people figure out how to make it work, but we don't see a case for making k8s
1.0 work in a fundamentally different mode.  Too much churn and risk.  We can
revisit later, if needed.

Auto-scaling controller: We really want this, but it's not critical to making
k8s "useful".

Pluggable authentication: Overlaps with the other identity topic.  Having one
topic seems clearer.

Pod spreading: We still want this, but it's not critical for 1.0.

Container status snippets: We still want this, but it's not critical for 1.0.

Docker-daemon-kills-all-children-on-exit problem: This is still a big problem,
but we're not going to gate our 1.0 on something we don't control.  This has
to be documented as a shortcoming in general.

Interconnection of services: expand / decompose the service pattern: overlaps
with the other services topic.

Recipes for settings where networking is not like GCE: This is happening in
the form of cloudprovider modules, but is not going to gate 1.0.