Since it takes a while (1-2 minutes) for the kubelet to pull a big image
(>500MB), just showing "Pending" for the pod status is not very helpful.
This commit introduces a "pulling" event and emits it before the
kubelet starts to pull an image.
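A minimal sketch of the ordering, using stand-in types rather than the kubelet's actual event recorder and image puller:

    package example

    // EventRecorder and pullImage are stand-ins for the kubelet's event
    // recorder and image puller; only the ordering matters here.
    type EventRecorder interface {
        Eventf(reason, format string, args ...interface{})
    }

    func pullWithEvent(rec EventRecorder, pullImage func(image string) error, image string) error {
        // Emit the "pulling" event first so users see progress while the
        // (potentially multi-minute) pull is in flight.
        rec.Eventf("pulling", "Pulling image %q", image)
        return pullImage(image)
    }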
If a pod was deleted and the associated volumes/directory were removed, there
could be a window where the pod worker is still active. If the pod worker tries
to inspect the logs during that window, the resulting error would be logged.
Since the pod has been deleted, such error messages are meaningless.
This change stops logging this error, but stores the error string in the pod
status. The pod status will be updated for pods that are still alive, and will
be discarded eventually for deleted pods.
If a user starts an exec session with a shell and leaves it idle long
enough, they will eventually hit the Kubelet's HTTP server's read/write
timeout of 5 minutes. At that point, the StartExec call to Docker returns,
but if the user requested a TTY, the exec'd process does not exit.
After StartExec finishes, we try to determine the exit code of the
exec'd process, but in this case, we'll never get it. This change exits
the loop after 5 tries if the process is still running.
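A minimal sketch of the bounded retry, assuming a hypothetical inspect helper and an illustrative poll interval:

    package example

    import (
        "fmt"
        "time"
    )

    // execStatus is a stand-in for the information returned by inspecting an
    // exec instance: whether it is still running and, if not, its exit code.
    type execStatus struct {
        Running  bool
        ExitCode int
    }

    func waitForExitCode(inspect func() (execStatus, error)) (int, error) {
        const maxTries = 5
        for i := 0; i < maxTries; i++ {
            st, err := inspect()
            if err != nil {
                return 0, err
            }
            if !st.Running {
                return st.ExitCode, nil
            }
            time.Sleep(2 * time.Second) // illustrative interval
        }
        // A TTY shell left behind after the HTTP timeout never exits; give up
        // instead of polling forever.
        return 0, fmt.Errorf("exec process still running after %d checks", maxTries)
    }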
This commit wires together the graceful delete option for pods
on the Kubelet. When a pod is deleted on the API server, a
grace period is calculated that is based on the
Pod.Spec.TerminationGracePeriodSeconds, the user's provided grace
period, or a default. The grace period can only shrink once set.
The value provided by the user (or the default) is set onto the pod's
metadata as DeletionGracePeriodSeconds.
When the Kubelet sees a pod with DeletionTimestamp set, it uses the
value of ObjectMeta.DeletionGracePeriodSeconds as the grace period
sent to Docker. When updating status, if the pod has DeletionTimestamp
set and all containers are terminated, the Kubelet will update the
status one last time and then invoke Delete(pod, grace: 0) to
clean up the pod immediately.
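A rough sketch of the grace-period selection described above, with simplified field names and an assumed default of 30 seconds:

    package example

    const defaultGracePeriodSeconds int64 = 30 // assumed default, not necessarily the API server's

    // effectiveGracePeriod prefers the caller-supplied period, then the pod
    // spec's TerminationGracePeriodSeconds, then the default. If a grace
    // period was already set on the object, the result may only shrink it.
    func effectiveGracePeriod(userRequested, specDefault, alreadySet *int64) int64 {
        grace := defaultGracePeriodSeconds
        if specDefault != nil {
            grace = *specDefault
        }
        if userRequested != nil {
            grace = *userRequested
        }
        if alreadySet != nil && *alreadySet < grace {
            grace = *alreadySet
        }
        return grace
    }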
Add support for pluggable Docker exec handlers. The default handler is
now Docker's native exec API call. The previous default, nsenter, can be
selected by passing --docker-exec-handler=nsenter when starting the
kubelet.
* Start using FakeRuntime to replace FakeDockerClient in unit tests.
* Move and adapt docker-specific tests (e.g. creating/deleting infra
containers) to manager_test.go in dockertools.
The full pod name is exposed under the label key 'kubernetes.io/pod'.
This helps with introspection: all containers in a pod can be listed via
docker ps -a -f label=kubernetes.io/pod=podXXX
We also plan to visualize this in cAdvisor.
This change instructs kubelet to switch to using the Runtime interface. In order
to do it, the change moves the Prober instantiation to DockerManager.
Note that most of the tests in kubelet_test.go need to be migrated to
dockertools. For now, we use type assertion to convert the Runtime interface to
DockerManager in most tests.
This change is part of the efforts to make DockerManager implement the Runtime
interface.
The change also modifies the interface slightly to work with existing
code, and aggregates the type converting functions to convert.go.
This change refactors the GetPods function and adds some basic unit tests.
We should start migrating docker specific tests from kubelet_test to
manager_test.go.
This change removes docker-specific code in killUnwantedPods. It
also instructs the cleanup code to move away from interacting with
containers directly; cleanup should always deal with the pod-level
abstraction if at all possible.
This moves Docker-specific logic there and allows it to align with the
runtime API. There is still a pod infra container reference in the
function due to network plugins. We can handle this in the Kubelet since
we'll need to be explicit in stating that the network plugin will not
work in a non-Docker runtime.
`kubectl get pod` already prints one container per line. This change fills in
the status for each container listed. This aims to help users quickly identify
unhealthy pods (e.g. in a crash loop) at a glance.
- The first row of every pod displays the pod information and status
- Each subsequent row corresponds to a container in that pod:
* STATUS refers to the container status (Running, Waiting, Terminated).
* CREATED refers to the elapsed time since the last start time of the
container.
* MESSAGE is a string which explains the last termination reason, and/or
the reason behind the waiting status.
Remove GetDockerServerVersion() from DockerContainerCommandRunner interface,
replaced with runtime.Version(). Also added Version type in runtime for version
comparison.
- When 'getent hosts localhost' returns '::1' the creation of the
listener fails because of the port parsing which uses ":" as a
separator
- Use of net.SplitHostPort() to do the job (see the sketch after this list)
- Adding unit tests to ensure that the creation succeeds
- On docker.go: adds a test for the presence of the socat command, which was
failing silently if not installed
- Code Review 1
- Fixed typo on Expected
- The UT now fails if the PortForwarder could not be created
- Code Review 2
- Simplify socat error message
- Changing t.Fatal to t.Error in unit tests
- Code Review 3
- Removing useless use cases in unit tests
- Code Review 4
- Removing useless initialization of PortForwarder
- Changing error message
- Code Review 5
- Simplifying TestCast struct
- Adding an additional test in one test case
- Closing the listener
- Code Review 6
- Improving unit test
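The sketch referenced above: net.SplitHostPort handles IPv6 literals that a naive split on ":" cannot (the address is only an example):

    package example

    import (
        "fmt"
        "net"
    )

    func splitDemo() {
        // strings.Split(addr, ":") breaks on IPv6 literals such as "::1",
        // which is what 'getent hosts localhost' can return.
        host, port, err := net.SplitHostPort("[::1]:5000")
        fmt.Println(host, port, err) // "::1" "5000" <nil>
    }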
Kubelet kills unwanted pods in SyncPods, which directly impacts the latency of a
sync iteration. This change parallelizes the cleanup to lessen the effect.
Eventually, we should leverage per-pod workers for cleanup, with the exception
of truly orphaned pods.
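A minimal sketch of the parallel cleanup, assuming a kill helper; error handling is omitted:

    package example

    import "sync"

    type podID string

    func killUnwantedPods(unwanted []podID, kill func(podID)) {
        var wg sync.WaitGroup
        for _, p := range unwanted {
            wg.Add(1)
            go func(p podID) {
                defer wg.Done()
                kill(p) // each unwanted pod is torn down concurrently
            }(p)
        }
        wg.Wait()
    }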
Use go-dockerclient's APIVersion to check the minimum required Docker
version, as it contains methods for parsing the ApiVersion response from
the Docker daemon and for comparing 2 APIVersion objects.
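A sketch of how the check might look with go-dockerclient's APIVersion type; the minimum version shown is illustrative:

    package example

    import (
        "fmt"

        docker "github.com/fsouza/go-dockerclient"
    )

    func checkMinimumDockerVersion(reported string) error {
        minimum, err := docker.NewAPIVersion("1.15") // illustrative minimum
        if err != nil {
            return err
        }
        actual, err := docker.NewAPIVersion(reported) // e.g. the daemon's ApiVersion field
        if err != nil {
            return err
        }
        if actual.LessThan(minimum) {
            return fmt.Errorf("docker API version %s is older than the required %s", actual, minimum)
        }
        return nil
    }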
Currently, restart counts are generated by examining dead docker containers, which
are subject to background garbage collection. Therefore, the restart count is
capped at 5 and can decrement if GC happens.
This change leverages the container statuses recorded in the pod status as a
reference point. If a container finished after the last observation, the restart
count is incremented on top of the last observed count. If a container is created
after the last observation, but GC'd before the current observation time, the kubelet
would not be aware of the existence of such a container, and would not increase
the restart count accordingly. However, the chance of this should be low, given
that pod statuses are reported frequently. Also, the restart count would still
increase monotonically (with the exception of container inspect errors).
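A simplified sketch of the bookkeeping, using stand-in types rather than the kubelet's actual status structs:

    package example

    import "time"

    // containerObservation is a stand-in for one recorded container status.
    type containerObservation struct {
        RestartCount int
        FinishedAt   time.Time
    }

    // updatedRestartCount starts from the count in the last observed pod
    // status and adds one for every dead container instance that finished
    // after that observation. Instances created and GC'd entirely between
    // observations are invisible here (the small gap noted above), but the
    // count only ever grows.
    func updatedRestartCount(lastObserved containerObservation, lastObservationTime time.Time, deadInstances []containerObservation) int {
        count := lastObserved.RestartCount
        for _, c := range deadInstances {
            if c.FinishedAt.After(lastObservationTime) {
                count++
            }
        }
        return count
    }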
Remove kubelet.getPodInfraContainer().
Remove dockertools.RemoveContainerWithID().
Remove dockertools.FindContainersByPod().
Also replace the useless test with a test for GetPods().
Container creation/start failures cannot be recovered by inspecting the
containers. This change caches such errors so that the kubelet can retrieve
them later.
This change also extends FakeDockerClient to support setting error response
for a specific function.
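A hypothetical sketch of per-call error injection in a fake client; the field and method names here are illustrative, not necessarily the FakeDockerClient API:

    package example

    // fakeClient returns a canned error for a named operation, if one was set.
    type fakeClient struct {
        Errors map[string]error // e.g. {"start": someInjectedError}
    }

    func (f *fakeClient) popError(op string) error {
        return f.Errors[op] // nil if no error was injected for this op
    }

    func (f *fakeClient) StartContainer(id string) error {
        if err := f.popError("start"); err != nil {
            return err
        }
        // ... normal fake bookkeeping ...
        return nil
    }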
We want to stop leaking more docker details into kubelet, and we also want to
consolidate some of the existing docker interfaces/structs. This change creates
DockerManager as the new home of some functions in dockertools/docker.go. It
also absorbs containerRunner. In addition, GetDockerPodStatus is renamed to
GetPodStatus with the entire pod passed to it so that it is similar to what
is defined in the container Runtime interface.
Eventually, DockerManager should implement the container Runtime interface, and
integrate DockerCache with a flag to turn on/off caching. Code in kubelet.go
should not be using docker client directly.
This fixes TestSyncPodsDeletesWithNoPodInfraContainer.
Since we need to sync two pods in parallel, we should not verify
the docker calls in strict order.
Functions Build/ParseDockerName now work with a struct instead of a long
list of arguments. This new struct is also reused in kubelet.go in place
of the auxiliary podContainer struct.
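A sketch of the struct-based signatures; the field names are illustrative:

    package example

    // containerName bundles the values that used to be passed positionally
    // to Build/ParseDockerName; the field names here are illustrative.
    type containerName struct {
        PodFullName   string
        PodUID        string
        ContainerName string
    }

    // Illustrative signatures only: the real functions also deal with the
    // container spec and hash.
    func buildDockerName(n containerName) string             { return "" }
    func parseDockerName(name string) (containerName, bool)  { return containerName{}, false }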
There are two main goals for this change.
1. Fix the naming scheme in kubelet so that it accepts DNS subdomain
name/namespaces correctly (#4920). The design is discussed in #3453.
2. Prepare for syncing the static pods back to the apiserver(#4090). This
includes
- Eliminate the source component in the internal full pod name (#4922). Pods
no longer need sources as they will all be sync'd via apiserver.
- Changing the naming scheme for the static (file-, http-, and etcd-based)
pods such that they are distinguishable when syncing back to the apiserver.
The changes include (a sketch of the container name format follows below):
* name = <pod.Name>-<hostname>
* namespace = <cluster_namespace> (i.e. "default" for now).
* container_name = k8s_<container_name>.<hash_of_container>_<pod_name>_<namespace>_<uid>_<random>
Note that this is not backward-compatible, meaning the kubelet won't recognize
existing running containers using the old naming scheme.
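A sketch of the new docker container name format, assuming the container hash and random suffix are computed elsewhere:

    package example

    import "fmt"

    // dockerContainerName renders
    //   k8s_<container_name>.<hash_of_container>_<pod_name>_<namespace>_<uid>_<random>
    func dockerContainerName(containerName, hash, podName, namespace, uid, random string) string {
        return fmt.Sprintf("k8s_%s.%s_%s_%s_%s_%s", containerName, hash, podName, namespace, uid, random)
    }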
The FakeDockerClient is used to create and start a fake container. When
StartContainer is called, it passes a name as the docker ID for testing
purposes, but leaves Name uninitialized. This PR fixes that issue.
Fixes #4472.
Since the parsing function doesn't return an error, all the components
came back as empty strings. This caused us to enforce the MaxContainerLimit
as a global limit instead of a per-container limit.
Fixes #4413.
Sometimes for external applications it is important to identify
exactly what images are running. Since tags can be moved to point
to newer builds, this information can be used to identify old images
that are still running.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
Docker's logic for resolving credentials from .dockercfg accepts two kinds of matches:
1. an exact match between the dockercfg entry and the image prefix
2. a hostname match between the dockercfg entry and the image prefix
This change implements the latter, which permits the docker client to take .dockercfg entries of the form:
https://quay.io/v1/
and use them for images of the form:
quay.io/foo/bar
even though they are not a prefix-match.
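A minimal sketch of the hostname match, assuming the .dockercfg entry may or may not carry a scheme and path:

    package example

    import (
        "net/url"
        "strings"
    )

    // matchesHostname reports whether a .dockercfg entry such as
    // "https://quay.io/v1/" should apply to an image like "quay.io/foo/bar".
    func matchesHostname(dockercfgEntry, image string) bool {
        entry := dockercfgEntry
        if !strings.Contains(entry, "://") {
            entry = "https://" + entry // ensure url.Parse sees a host component
        }
        parsed, err := url.Parse(entry)
        if err != nil {
            return false
        }
        registry := strings.SplitN(image, "/", 2)[0]
        return parsed.Host == registry
    }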
This should only have been triggered by tests, and those should now be fixed.
I tested by calling panic() if UID was blank in BuildDockerName() or if number
of fields was < 5 in ParseDockerName(). All errors were fixed.
Make all kubelet config sources ensure that UID and Namespace are defaulted, if
need be.
We can *almost* disable the "if blank" logic for UID, except for tests that
call APIs that do not run through SyncPods. We really ought to be enforcing
invariants better.
Sometimes for external applications it is useful to correlate the pod
containers to the real docker instances.
This patch adds a new entry in the container status (containerID) which
is used to identify the instance.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
This change refactors the way Kubelet's DockerPuller handles the docker config credentials to utilize a new credentialprovider library.
The credentialprovider library is based on several of the files from the Kubelet's dockertools directory, but supports a new pluggable model for retrieving a .dockercfg-compatible JSON blob with credentials.
With this change, the Kubelet will lazily ask for the docker config from a set of DockerConfigProvider extensions each time it needs a credential.
This change provides common implementations of DockerConfigProvider for:
- "Default": load .dockercfg from disk
- "Caching": wraps another provider in a cache that expires after a pre-specified lifetime.
GCP-only:
- "google-dockercfg": reads a .dockercfg from a GCE instance's metadata
- "google-dockercfg-url": reads a .dockercfg from a URL specified in a GCE instance's metadata.
- "google-container-registry": reads an access token from GCE metadata into a password field.
There are three values that uniquely identify a pod on a host -
the configuration source (etcd, file, http), the pod name, and the
pod namespace. This change ensures that configuration properly
makes those names unique by changing podFullName to contain both
name (currently ID in v1beta1, Name in v1beta3) and namespace.
The Kubelet does not properly handle information requests for
pods not in the default namespace at this time.
Public access to the DockerHub is not guaranteed in all environments. Add a
flag to the kubelet that allows it to use a different image (like one on a
private registry), and to only pull it the first time the image is needed.
Fixes #1545
Move a lot of common error logging into better buckets (a brief glog example follows this list):
glog.Errorf() - Always an error
glog.Warningf() - Something unexpected, but probably not an error
glog.V(0) - Generally useful for this to ALWAYS be visible
to an operator
* Programmer errors
* Logging extra info about a panic
* CLI argument handling
glog.V(1) - A reasonable default log level if you don't want
verbosity
* Information about config (listening on X, watching Y)
* Errors that repeat frequently that relate to conditions
that can be corrected (pod detected as unhealthy)
glog.V(2) - Useful steady state information about the service
* Logging HTTP requests and their exit code
* System state changing (killing pod)
* Controller state change events (starting pods)
* Scheduler log messages
glog.V(3) - Extended information about changes
* More info about system state changes
glog.V(4) - Debug level verbosity (for now)
* Logging in particularly thorny parts of code where
you may want to come back later and check it
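A brief illustration of the buckets above using glog; the messages are made up:

    package example

    import "github.com/golang/glog"

    func logBucketExamples(err error, podName string) {
        glog.Errorf("failed to sync pod %q: %v", podName, err)      // always an error
        glog.Warningf("pod %q unhealthy, restarting", podName)      // unexpected, probably not an error
        glog.V(1).Infof("kubelet listening on %s", "0.0.0.0:10250") // config information
        glog.V(2).Infof("killing pod %q", podName)                  // steady-state system change
        glog.V(4).Infof("entering syncPod for %q", podName)         // debug verbosity
    }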