23 Commits (9a812bd0c5b3c050164e7712d6539a1dbb21af7a)

Author SHA1 Message Date
Sergiusz Urbaniak 56fc0f5900 scheduler: reenable TestPlugin_LifeCycle, increase timeouts 2015-09-30 16:14:13 +02:00
k8s-merge-robot c807bea089 Merge pull request #13857 from mesosphere/node-labels
Auto commit by PR queue bot
2015-09-28 22:10:51 -07:00
Dr. Stefan Schimanski 67746908e5 Deleting gracefully terminating, not-scheduled pre-scheduled pods
In upstream the kubelet is responsible for all pods which have the spec.NodeName
set. In Mesos we have a two-stage scheduling process:

1. pods with a pre-set spec.NodeName are still scheduled by the scheduler.
2. The kubelet will only see them when a Mesos task was started and the executor
   passes the pod to the kubelet.

With this PR a pod with spec.NodeName which is gracefully terminated, but not
yet scheduled, e.g.

- because the termination happened just after creation and the scheduler was
  not fast enough
- because the NodeSelector does not match

is deleted by the Mesos scheduler.
2015-09-26 23:42:08 +02:00
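
The condition described in this commit message can be sketched as a small predicate. This is a minimal illustration only: the `pod` struct, its field names, and `schedulerMustDelete` are hypothetical and do not come from the actual scheduler code.

```go
package main

import "fmt"

// pod is a hypothetical, simplified stand-in for the real API object.
type pod struct {
	name         string
	nodeName     string // mirrors spec.NodeName; may be pre-set by the user
	terminating  bool   // graceful termination has been requested
	taskLaunched bool   // a Mesos task was started, so the kubelet has seen the pod
}

// schedulerMustDelete reports whether the scheduler, rather than the kubelet,
// has to delete the pod: it has a pre-set node name, it is terminating, and
// no Mesos task was ever launched for it.
func schedulerMustDelete(p pod) bool {
	return p.nodeName != "" && p.terminating && !p.taskLaunched
}

func main() {
	pods := []pod{
		{name: "deleted-early", nodeName: "node-1", terminating: true},
		{name: "running", nodeName: "node-1", terminating: true, taskLaunched: true},
		{name: "pending", nodeName: "node-1"},
	}
	for _, p := range pods {
		fmt.Printf("%s: scheduler deletes = %v\n", p.name, schedulerMustDelete(p))
	}
}
```

Once a task has been launched the kubelet knows the pod, so in this sketch the predicate returns false and deletion is left to the kubelet.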
Dr. Stefan Schimanski 4d4ebe9f18 Add Mesos slave attributes as node labels
- pre-create node api objects from the scheduler when offers arrive
- decline offers until nodes are registered
- turn slave attributes into k8s.mesosphere.io/attribute-* labels
- update labels from executor Register/Reregister
- watch nodes in scheduler to make non-Mesos labels available for NodeSelector matching
- add unit tests for label predicate
- add e2e test to check that slave attributes really end up as node labels
2015-09-26 09:46:56 +02:00
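
The attribute-to-label mapping named in this commit message might look roughly like the following sketch. Only the `k8s.mesosphere.io/attribute-` prefix is taken from the message. The helper `attributesToLabels` and the plain string map used for attributes are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// labelPrefix is the label key prefix quoted in the commit message.
const labelPrefix = "k8s.mesosphere.io/attribute-"

// attributesToLabels is a hypothetical helper that prefixes every slave
// attribute name to form a node label key and keeps the value unchanged.
func attributesToLabels(attrs map[string]string) map[string]string {
	labels := make(map[string]string, len(attrs))
	for name, value := range attrs {
		labels[labelPrefix+name] = value
	}
	return labels
}

func main() {
	labels := attributesToLabels(map[string]string{"rack": "a", "gen": "2015"})

	// Sort the keys so the printed order is stable.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, labels[k])
	}
}
```

A real implementation would likely also have to check attribute names and values against the Kubernetes label syntax, which this sketch does not do.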
Dr. Stefan Schimanski e4dcd97ac3 Dequeue pods in scheduler which are terminating 2015-09-22 16:41:43 +02:00
Dr. Stefan Schimanski eb5a5ffc28 Extract slave hostname registry code in its own module
- remove bleeding of registry-internal objects, without any locking
- rename SlaveStorage to Registry, which better fits what it actually
  does
2015-09-16 14:50:31 +02:00
Wojciech Tyczynski 53ae56f205 Replace "minion" with "node" in a bunch of places. 2015-09-14 11:07:11 +02:00
Daniel Smith b225c1d47a Run gofmt (separate commit for easy rebases) 2015-09-10 17:17:59 -07:00
Daniel Smith 15b30b8b09 Move version agnostic parts of client
pkg/client/unversioned/cache -> pkg/client/cache
pkg/client/unversioned/record -> pkg/client/record
2015-09-10 17:17:59 -07:00
James DeFelice a1cea8dd87 Flexible resource accounting and pod resource containment:
- new: introduce AllocationStrategy, Predicate, and Procurement to scheduler pkg
- new: --contain-pod-resources flag (workaround for docker+systemd+mesos problems)
- new: --account-for-pod-resources flag (for testing overcommitment)
- bugfix: forward -v flag from minion controller to executor
2015-09-04 00:49:13 +00:00
Wojciech Tyczynski e202f9c797 Add resource version to Store Replace params. 2015-08-31 09:49:12 +02:00
Kris Rousey ae6c64d9bb Moving everyone to unversioned client 2015-08-18 10:23:03 -07:00
Karl Isenberg 61c9dd876e Improve readability of scheduling failure logs 2015-08-17 12:17:48 +02:00
jiangyaoguo 5d3522dc7a Keep event reason consistent in scheduler and controller 2015-08-13 11:33:32 +08:00
Dr. Stefan Schimanski f1a560718c Make slave assignment before binding persistent
- move assigned slave to T.Spec.AssignedSlave
- only create the BindingHost annotation in prepareTaskForLaunch
- recover the assigned slave from annotation and write it back to the T.Spec field

Before this patch the annotation was used to store the assigned slave. But due
to the cloning of tasks in the registry, this value was never persisted in the
registry.

This patch adds it to the Spec of a task and only creates the annotation
last-minute before launching.

Without this patch, pods which fail before binding stay in the registry but are
never rescheduled. The reason: the BindingHost annotation exists neither in the
registry nor on the apiserver (compare the reconcilePod function).
2015-08-12 08:03:36 +02:00
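
The flow described in this commit message (keep the assignment in the task spec, write the annotation only right before launch, and restore the spec field from the annotation on recovery) can be sketched as follows. The `task` and `taskSpec` types, the annotation key value, and `recoverAssignedSlave` are hypothetical. Only the names `AssignedSlave` and `prepareTaskForLaunch` appear in the message, and their real signatures are not shown there.

```go
package main

import "fmt"

// bindingHostKey is a placeholder; the real annotation key is not given in
// the commit message.
const bindingHostKey = "example.invalid/binding-host"

type taskSpec struct {
	AssignedSlave string // persisted with the task, so it survives cloning
}

type task struct {
	Spec        taskSpec
	Annotations map[string]string // stands in for the pod annotations
}

// prepareTaskForLaunch writes the annotation at the last moment before launch.
func prepareTaskForLaunch(t *task) {
	if t.Annotations == nil {
		t.Annotations = map[string]string{}
	}
	t.Annotations[bindingHostKey] = t.Spec.AssignedSlave
}

// recoverAssignedSlave copies the annotation back into the spec field and
// reports whether an annotation was present.
func recoverAssignedSlave(t *task) bool {
	host, ok := t.Annotations[bindingHostKey]
	if !ok {
		return false
	}
	t.Spec.AssignedSlave = host
	return true
}

func main() {
	launched := &task{Spec: taskSpec{AssignedSlave: "slave-7"}}
	prepareTaskForLaunch(launched)

	// Simulate recovery: only the annotations are available.
	recovered := &task{Annotations: launched.Annotations}
	fmt.Println(recoverAssignedSlave(recovered), recovered.Spec.AssignedSlave)
}
```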
Veres Lajos 9f77e49109 typofix - https://github.com/vlajos/misspell_fixer 2015-08-08 22:31:48 +01:00
Mike Danese 17defc7383 run gofmt on everything we touched 2015-08-05 17:52:56 -07:00
Mike Danese 8e33cbfa28 rewrite go imports 2015-08-05 17:30:03 -07:00
Dr. Stefan Schimanski f59b5f503b Use BindingHostKey annotation to detect scheduled pods in k8sm-scheduler
Previously, NodeName in the pod spec was used. Hence, pods with a fixed, pre-set
NodeName were never scheduled by the k8sm-scheduler, leading e.g. to a failing
e2e intra-pod test.

Fixes mesosphere/kubernetes-mesos#388
2015-07-31 10:22:20 +02:00
Dr. Stefan Schimanski a2fa41b73f Implement resource accounting for pods with the Mesos scheduler
This patch

- sets limits (0.25 cpu, 64 MB) on containers which are not limited in pod spec
  (these are also passed to the kubelet such that it uses them for the docker
  run limits)
- sums up the container resource limits for cpu and memory inside a pod
- compares the sums to the offered resources
- puts the sums into the Mesos TaskInfo such that Mesos does the accounting
  for the pod
- parses the static pod spec and adds up the resources
- sets the executor resources to 0.25 cpu, 64 MB plus the static pod resources
- sets the cgroups in the kubelet for system containers, resource containers
  and docker to the one of the executor that Mesos assigned
- adds scheduler parameters --default-container-cpu-limit and
  --default-container-mem-limit.

The containers themselves are resource limited by the Docker resource limit
which the kubelet applies when launching them.

Fixes mesosphere/kubernetes-mesos#68 and mesosphere/kubernetes-mesos#304
2015-07-30 21:18:04 +02:00
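
The accounting steps listed in this commit message (default missing limits, sum per pod, compare against the offer) can be sketched as below. The defaults of 0.25 cpu and 64 MB come from the message. The `container` type and the helpers `podResources` and `fitsOffer` are hypothetical, and a limit of zero is assumed here to mean "not set in the pod spec".

```go
package main

import "fmt"

// Defaults quoted in the commit message.
const (
	defaultCPU   = 0.25
	defaultMemMB = 64.0
)

// container is a hypothetical, simplified view of a container's limits.
// A zero value is assumed to mean "no limit set in the pod spec".
type container struct {
	name  string
	cpu   float64
	memMB float64
}

// podResources applies the defaults to unlimited containers and sums the
// cpu and memory limits over all containers of a pod.
func podResources(containers []container) (cpu, memMB float64) {
	for _, c := range containers {
		if c.cpu == 0 {
			c.cpu = defaultCPU
		}
		if c.memMB == 0 {
			c.memMB = defaultMemMB
		}
		cpu += c.cpu
		memMB += c.memMB
	}
	return cpu, memMB
}

// fitsOffer compares the pod's summed limits to the offered resources.
func fitsOffer(containers []container, offeredCPU, offeredMemMB float64) bool {
	cpu, memMB := podResources(containers)
	return cpu <= offeredCPU && memMB <= offeredMemMB
}

func main() {
	pod := []container{
		{name: "web", cpu: 0.5, memMB: 128},
		{name: "sidecar"}, // no limits set, so the defaults apply
	}
	cpu, memMB := podResources(pod)
	fmt.Printf("cpu=%.2f memMB=%.0f fits=%v\n", cpu, memMB, fitsOffer(pod, 1, 256))
}
```

In the commit, the sums are additionally placed into the Mesos TaskInfo so that Mesos does the accounting. That step is omitted here because it depends on the Mesos bindings.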
James DeFelice 6436c4a3bc additional comments as per code review 2015-06-11 13:47:14 +00:00
James DeFelice ee309f3cff add TODOs 2015-06-11 12:41:50 +00:00
James DeFelice 932c58a497 Kubernetes Mesos integration
This commit includes the fundamental components of the Kubernetes Mesos
integration:

* Kubernetes-Mesos scheduler
* Kubernetes-Mesos executor
* Supporting libs

Dependencies and upstream changes are included in a separate commit for easy
review.

After this initial upstreaming, two PRs will follow:

* km (hypercube) and k8sm-controller-manager #9265
* Static pods support #9077

Fixes applied:

- Precise metrics subsystems definitions
  - mesosphere/kubernetes-mesos#331
  - https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion_r31875232
  - https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion_r31875240
- Improve comments and add clarifications
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875208
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875226
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875227
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875228
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875239
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875243
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875234
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875256
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875255
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875251
- Clarify which Schedule function is actually called
  - Fixes https://github.com/GoogleCloudPlatform/kubernetes/pull/8882#discussion-diff-31875246
2015-06-10 20:58:39 +00:00