Automatic merge from submit-queue
Kubelet: Better-defined Container Waiting state
For issue #20478 and #21125.
This PR corrects the logic of `ShouldContainerBeRestarted()` and adds a unit test for it, cleans up `Waiting` state related code, and adds a unit test for `generateAPIPodStatus()`.
Fixes #20478. Fixes #17971.
@yujuhong
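For context, the restart decision under test reduces to something like the sketch below. This is a minimal, self-contained illustration with hypothetical signatures; the real `ShouldContainerBeRestarted()` operates on kubelet's internal pod and container-status types.
```
package main

import "fmt"

type RestartPolicy string

const (
	RestartAlways    RestartPolicy = "Always"
	RestartOnFailure RestartPolicy = "OnFailure"
	RestartNever     RestartPolicy = "Never"
)

// shouldContainerBeRestarted decides whether a dead container should be
// restarted, given the pod's restart policy and the container's exit code.
func shouldContainerBeRestarted(policy RestartPolicy, running bool, exitCode int) bool {
	if running {
		return false // nothing to restart while the container is running
	}
	switch policy {
	case RestartNever:
		return false
	case RestartOnFailure:
		return exitCode != 0 // restart only containers that failed
	default: // RestartAlways
		return true
	}
}

func main() {
	fmt.Println(shouldContainerBeRestarted(RestartOnFailure, false, 1)) // true
	fmt.Println(shouldContainerBeRestarted(RestartOnFailure, false, 0)) // false
}
```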
Automatic merge from submit-queue
rkt: Add pre-stop lifecycle hooks for rkt.
When a pod is being terminated, the pre-stop hooks of all the containers
will be run before the containers are stopped.
cc @yujuhong @Random-Liu @sjpotter
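Conceptually, the termination path now runs the hooks first and only then stops the containers, roughly as in the sketch below. All types and the hook-runner callback are illustrative, not the actual rkt runtime API.
```
package main

import "log"

// Illustrative types only; the real implementation uses kubelet's API
// types and hook runner.
type Handler struct{ Command []string }
type Lifecycle struct{ PreStop *Handler }
type Container struct {
	Name      string
	Lifecycle *Lifecycle
}
type Pod struct{ Containers []Container }

// runPreStopHooks runs every container's pre-stop hook before any
// container is stopped; hook failures are logged but do not block the
// subsequent stop.
func runPreStopHooks(pod *Pod, run func(Container, *Handler) error) {
	for _, c := range pod.Containers {
		if c.Lifecycle != nil && c.Lifecycle.PreStop != nil {
			if err := run(c, c.Lifecycle.PreStop); err != nil {
				log.Printf("pre-stop hook for container %q failed: %v", c.Name, err)
			}
		}
	}
}

func main() {
	pod := &Pod{Containers: []Container{{
		Name:      "web",
		Lifecycle: &Lifecycle{PreStop: &Handler{Command: []string{"sh", "-c", "sleep 1"}}},
	}}}
	runPreStopHooks(pod, func(c Container, h *Handler) error {
		log.Printf("running pre-stop %v for %q", h.Command, c.Name)
		return nil
	})
	// ... only after all hooks have run would the containers be stopped ...
}
```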
Automatic merge from submit-queue
Move predicates into library
This PR tries to implement #12744
Any suggestions/ideas are welcome. @davidopp
Current state: the integration test fails if the podCount check is included in the Kubelet.
DONE:
1. refactor all predicates: predicates return fitOrNot (bool) and error (Error), where the latter is of type
PredicateFailureError or InsufficientResourceError. (Violations of either MaxEBSVolumeCount or
MaxGCEPDVolumeCount return the same error type, ErrMaxVolumeCountExceeded.)
2. GeneralPredicates() is a predicate function, which includes several other predicate functions (PodFitsResource,
PodFitsHost, PodFitsHostPort). It is registered as one of the predicates in DefaultAlgorithmProvider, is
also called in canAdmitPod() in Kubelet, and should be called by other components (like the rescheduler, etc.)
if necessary. See discussion in issue #12744. (A minimal sketch of this predicate shape follows the TODO list below.)
3. remove podNumber check from GeneralPredicates
4. HostName is now verified in Kubelet's canAdmitPod(); added TestHostNameConflicts in kubelet_test.go
5. add a getNodeAnyWay() method in Kubelet to get node information in standalone mode
TODO:
1. determine which predicates should be included in GeneralPredicates()
2. separate GeneralPredicates() into:
a. GeneralPredicatesEvictPod() and
b. GeneralPredicatesNotEvictPod()
3. DaemonSet should use GeneralPredicates()
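A minimal sketch of the predicate shape described in DONE item 2, with simplified signatures (the real predicates take pod and node-info types, not strings):
```
package main

import "fmt"

// Typed failure reasons, mirroring the description above; the real types
// carry more detail (resource names, requested/available amounts).
type PredicateFailureError struct{ PredicateName string }

func (e *PredicateFailureError) Error() string {
	return e.PredicateName + " predicate failed"
}

type InsufficientResourceError struct{ Resource string }

func (e *InsufficientResourceError) Error() string {
	return "insufficient " + e.Resource
}

// Predicate is the simplified fit-function shape: fitOrNot plus a typed error.
type Predicate func(pod, node string) (bool, error)

// generalPredicates composes several predicates into one; the first
// failure is returned as-is so callers can inspect the typed error.
func generalPredicates(preds ...Predicate) Predicate {
	return func(pod, node string) (bool, error) {
		for _, p := range preds {
			if fit, err := p(pod, node); !fit {
				return false, err
			}
		}
		return true, nil
	}
}

func main() {
	podFitsHost := func(pod, node string) (bool, error) {
		if node == "" {
			return false, &PredicateFailureError{PredicateName: "PodFitsHost"}
		}
		return true, nil
	}
	fit, err := generalPredicates(podFitsHost)("mypod", "node-1")
	fmt.Println(fit, err) // true <nil>
}
```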
Automatic merge from submit-queue
Kubelet: Start using the official docker engine-api
For #23563.
This is the **first step** in the roadmap of switching to docker [engine-api](https://github.com/docker/engine-api).
In this PR, I keep the old `DockerInterface` and implement it with the new engine-api.
With this approach, we can switch to engine-api with minimal change, so that we can:
* Test the engine-api without huge refactoring.
* Send follow-up PRs that refactor the functions in `DockerInterface` separately, avoiding one huge change in a single PR.
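The adapter approach looks roughly like the sketch below. The interface subset and method signatures are illustrative; the real `DockerInterface` is much larger, and the engine-api method names and signatures differ.
```
package dockeradapter

// Illustrative subset of the legacy DockerInterface; the real interface
// has many more methods (inspect, create, start, stop, logs, ...).
type Container struct{ ID string }

type DockerInterface interface {
	ListContainers(all bool) ([]Container, error)
}

// engineClient stands in for the engine-api client; the real method
// names and signatures differ (option structs, and in later versions a
// context.Context argument).
type engineClient interface {
	ContainerList(all bool) ([]Container, error)
}

// kubeDockerClient adapts the new engine-api client to the legacy
// DockerInterface, so existing callers keep working while the backend
// is swapped out underneath them.
type kubeDockerClient struct {
	client engineClient
}

func (k *kubeDockerClient) ListContainers(all bool) ([]Container, error) {
	return k.client.ContainerList(all)
}

// Compile-time check that the adapter satisfies the legacy interface.
var _ DockerInterface = (*kubeDockerClient)(nil)
```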
I've tested this PR locally; it passed all the node conformance tests:
```
make test_e2e_node
Ran 19 of 19 Specs in 823.395 seconds
SUCCESS! -- 19 Passed | 0 Failed | 0 Pending | 0 Skipped PASS
Ginkgo ran 1 suite in 13m49.429979585s
Test Suite Passed
```
And it also passed the Jenkins GCE e2e tests:
```
go run hack/e2e.go -test -v --test_args="--ginkgo.skip=\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]"
Ran 161 of 268 Specs in 4570.214 seconds
SUCCESS! -- 161 Passed | 0 Failed | 0 Pending | 107 Skipped PASS
Ginkgo ran 1 suite in 1h16m16.325934558s
Test Suite Passed
2016/03/25 15:12:42 e2e.go:196: Step 'Ginkgo tests' finished in 1h16m18.918754301s
```
I'm writing the design document, and will post the switching roadmap in an umbrella issue soon.
@kubernetes/sig-node
Allow network plugins to declare that they handle shaping and that
Kubernetes should not. This will first be used by openshift-sdn, which
handles shaping through OVS; today that setup triggers a warning when
kubelet notices the bandwidth annotations.
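One hypothetical way to express this, with names invented for illustration (the actual kubelet plugin API may differ):
```
package network

// Hypothetical sketch: a plugin advertises the responsibilities it
// takes over; if it claims shaping, kubelet skips its own traffic
// shaper (and the warning about bandwidth annotations).
type Capability int

const CapabilityShaping Capability = iota

type NetworkPlugin interface {
	Capabilities() map[Capability]bool
}

// kubeletShouldShape reports whether kubelet itself must apply the
// bandwidth annotations for pods handled by this plugin.
func kubeletShouldShape(plugin NetworkPlugin) bool {
	return !plugin.Capabilities()[CapabilityShaping]
}
```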
Add GeneratePodHostNameAndDomain() to RuntimeHelper to
get the hostname of the pod from kubelet.
Also update the logging flag to change the journal match from
_HOSTNAME to _MACHINE_ID.
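A simplified sketch of the new helper hook, with illustrative types (the real method lives on kubelet's RuntimeHelper and its exact signature may differ):
```
package rkt

// The runtime asks kubelet for the pod's hostname (and optional
// cluster-domain suffix) instead of deriving it itself.
type Pod struct {
	Name      string
	Namespace string
}

type RuntimeHelper interface {
	// GeneratePodHostNameAndDomain returns the hostname and domain
	// that should be configured for the pod's containers.
	GeneratePodHostNameAndDomain(pod *Pod) (hostname string, domain string)
}
```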
The kubelet sync loop relies on getting one update as the signal that the
specific source is ready. This change ensures that we don't send multiple
updates (ADD, UPDATE) for the first batch of pods. This is required to prevent
the cleanup routine from killing pods prematurely.
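A hypothetical sketch of the coalescing logic (names and plumbing invented for illustration): everything a source reports before its first emission is buffered and delivered as a single ADD, so the sync loop sees exactly one "source is ready" signal.
```
package main

import "fmt"

type update struct {
	Op   string // "ADD" or "UPDATE"
	Pods []string
}

type merger struct {
	seen    map[string]bool     // sources that have already emitted
	pending map[string][]string // pods buffered per source
}

func newMerger() *merger {
	return &merger{seen: map[string]bool{}, pending: map[string][]string{}}
}

func (m *merger) buffer(source string, pods ...string) {
	m.pending[source] = append(m.pending[source], pods...)
}

// flush emits one update for the source: a single ADD the first time,
// UPDATEs afterwards.
func (m *merger) flush(source string) update {
	pods := m.pending[source]
	delete(m.pending, source)
	if !m.seen[source] {
		m.seen[source] = true
		return update{Op: "ADD", Pods: pods}
	}
	return update{Op: "UPDATE", Pods: pods}
}

func main() {
	m := newMerger()
	m.buffer("file", "pod-a")
	m.buffer("file", "pod-b")
	fmt.Println(m.flush("file")) // {ADD [pod-a pod-b]}: one signal, whole batch
	m.buffer("file", "pod-c")
	fmt.Println(m.flush("file")) // {UPDATE [pod-c]}
}
```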
PLEG is responsible for listing the pods running on the node. If it hangs
due to a non-responsive container runtime or internal bugs, we should restart
kubelet.
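A hedged sketch of such a liveness probe (names and the 2-minute threshold are illustrative): record the completion time of each relist and report unhealthy when it is too stale, so a supervisor (e.g. systemd) can restart kubelet.
```
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

type pleg struct {
	lastRelist atomic.Value // holds a time.Time
}

func (p *pleg) relist() {
	// ... list pods via the container runtime, emit lifecycle events ...
	p.lastRelist.Store(time.Now())
}

// Healthy returns an error if relist has not completed recently.
func (p *pleg) Healthy() error {
	last, ok := p.lastRelist.Load().(time.Time)
	if !ok {
		return fmt.Errorf("pleg has never completed a relist")
	}
	if stale := time.Since(last); stale > 2*time.Minute {
		return fmt.Errorf("pleg was last active %v ago", stale)
	}
	return nil
}

func main() {
	p := &pleg{}
	p.relist()
	fmt.Println(p.Healthy()) // <nil>
}
```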
cleanupTerminatedPods is responsible for checking whether a pod has been
terminated and forcing a status update to trigger the pod deletion. However, this
function is called in the periodic cleanup routine, which runs every 2 seconds.
In other words, it forces a status update for each non-running (and not yet
deleted in the apiserver) pod. When batch deleting tens of pods, the rate of
new updates surpasses what the status manager can handle, causing numerous
redundant requests (and the status channel to be full).
This change forces a status update only when detecting that the DeletionTimestamp is
set for a terminated pod. Note that for other non-terminated pods, the pod
workers should be responsible for setting the correct status after killing all
the containers.
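In effect, the cleanup loop now gates the forced update roughly like this (simplified, illustrative types):
```
package main

import (
	"fmt"
	"time"
)

// Only terminated pods that the apiserver has marked for deletion get a
// forced status update; all other pods are left to their pod workers.
type pod struct {
	terminated        bool
	deletionTimestamp *time.Time
}

func shouldForceStatusUpdate(p *pod) bool {
	return p.terminated && p.deletionTimestamp != nil
}

func main() {
	now := time.Now()
	fmt.Println(shouldForceStatusUpdate(&pod{terminated: true, deletionTimestamp: &now})) // true
	fmt.Println(shouldForceStatusUpdate(&pod{terminated: true}))                          // false
}
```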