Automatic merge from submit-queue (batch tested with PRs 41812, 41665, 40007, 41281, 41771)
Kubelet-rkt: Add useful informations for Ops on the Kubelet Host
Create a Systemd SyslogIdentifier inside the [Service]
Create a Systemd Description inside the [Unit]
**What this PR does / why we need it**:
#### Overview
Logged against the host, it's difficult to identify who's who.
This PR add useful information to quickly get straight to the point with the **DESCRIPTION** field:
```
systemctl list-units "k8s*"
UNIT LOAD ACTIVE SUB DESCRIPTION
k8s_b5a9bdf7-e396-4989-8df0-30a5fda7f94c.service loaded active running kube-controller-manager-172.20.0.206
k8s_bec0d8a1-dc15-4b47-a850-e09cf098646a.service loaded active running nginx-daemonset-gxm4s
k8s_d2981e9c-2845-4aa2-a0de-46e828f0c91b.service loaded active running kube-apiserver-172.20.0.206
k8s_fde4b0ab-87f8-4fd1-b5d2-3154918f6c89.service loaded active running kube-scheduler-172.20.0.206
```
#### Overview and Journal
Always on the host, to easily retrieve the pods logs, this PR add a SyslogIdentifier named as the PodBaseName.
```
# A DaemonSet prometheus-node-exporter is running on the Kubernetes Cluster
systemctl list-units "k8s*" | grep prometheus-node-exporter
k8s_c60a4b1a-387d-4fce-afa1-642d6f5716c1.service loaded active running prometheus-node-exporter-85cpp
# Get the logs from the prometheus-node-exporter DaemonSet
journalctl -t prometheus-node-exporter | wc -l
278
```
Sadly the `journalctl` flag `-t` / `--identifier` doesn't allow a pattern to catch the logs.
Also this field improve any queries made by any tools who exports the Journal (E.g: ES, Kibana):
```
{
"__CURSOR" : "s=86fd390d123b47af89bb15f41feb9863;i=164b2c27;b=7709deb3400841009e0acc2fec1ebe0e;m=1fe822ca4;t=54635e6a62285;x=b2d321019d70f36f",
"__REALTIME_TIMESTAMP" : "1484572200411781",
"__MONOTONIC_TIMESTAMP" : "8564911268",
"_BOOT_ID" : "7709deb3400841009e0acc2fec1ebe0e",
"PRIORITY" : "6",
"_UID" : "0",
"_GID" : "0",
"_SYSTEMD_SLICE" : "system.slice",
"_SELINUX_CONTEXT" : "system_u:system_r:kernel_t:s0",
"_MACHINE_ID" : "7bbb4401667243da81671e23fd8a2246",
"_HOSTNAME" : "Kubelet-Host",
"_TRANSPORT" : "stdout",
"SYSLOG_FACILITY" : "3",
"_COMM" : "ld-linux-x86-64",
"_CAP_EFFECTIVE" : "3fffffffff",
"SYSLOG_IDENTIFIER" : "prometheus-node-exporter",
"_PID" : "88827",
"_EXE" : "/var/lib/rkt/pods/run/c60a4b1a-387d-4fce-afa1-642d6f5716c1/stage1/rootfs/usr/lib64/ld-2.21.so",
"_CMDLINE" : "stage1/rootfs/usr/lib/ld-linux-x86-64.so.2 stage1/rootfs/usr/bin/systemd-nspawn [....]",
"_SYSTEMD_CGROUP" : "/system.slice/k8s_c60a4b1a-387d-4fce-afa1-642d6f5716c1.service",
"_SYSTEMD_UNIT" : "k8s_c60a4b1a-387d-4fce-afa1-642d6f5716c1.service",
"MESSAGE" : "[ 8564.909237] prometheus-node-exporter[115]: time=\"2017-01-16T13:10:00Z\" level=info msg=\" - time\" source=\"node_exporter.go:157\""
}
```
Automatic merge from submit-queue
rkt: Improve support for privileged pod (pod whose all containers are privileged)
Fix https://github.com/kubernetes/kubernetes/issues/31100
This takes advantage of https://github.com/coreos/rkt/pull/2983 . By appending the new `--all-run` insecure-options to `rkt run-prepared` command when all the containers are privileged. The pod now gets more privileged power.
So that at most one volume object will be created for every unique
host path. Also the volume's name is random generated UUID to avoid
collision since the mount point's name passed by kubelet is not
guaranteed to be unique when 'subpath' is specified.
This enables rkt to use cached stage1 image instead of unpacking the
stage1 image every time for every pod.
After this change, users need to preload the stage1 images in order to
enable rkt to find the stage1 image with the name specified by this flag.
Since appc requires gid to be non-empty today (https://github.com/appc/spec/issues/623),
we have to error out when gid is empty instead of using the root gid.
This fixed a panic where the returned pod network status is nil.
Also this makes lkvm stage1 able to run inside a user defined
network, where the network name needs to be 'rkt.kubernetes.io'.
Also fixed minor issues such as passing the wrong pod UID, ignoring
logging errors.
Automatic merge from submit-queue
rkt: Get logs via syslog identifier
This change works around https://github.com/coreos/rkt/issues/2630
Without this change, logs cannot reliably be collected for containers
with short lifetimes.
With this change, logs cannot be collected on rkt versions v1.6.0 and
before.
I'd like to also bump the required rkt version, but I don't want to do that until there's a released version that can be pointed to (so the next rkt release).
I haven't added tests (which were missing) because this code will be removed if/when logs are retrieved via the API. I have run E2E tests with this merged in and verified the tests which previously failed no longer fail.
cc @yifan-gu
Automatic merge from submit-queue
rkt: Add pod selinux support.
Currently only pod level selinux context is supported, besides when
running selinux, we will not be able to use the overlay fs, see:
https://github.com/coreos/rkt/issues/1727#issuecomment-173203129.
cc @kubernetes/sig-node @alban @mjg59 @pmorie
With quotes, the service doesn't start for systemd 219 with the error
saying the path of the netns cannot be found.
This PR fixes the bug by removing the quotes surround the netns path.
Automatic merge from submit-queue
rkt: Pass through podIP
This is needed for the /etc/hosts mount and the downward API to work.
Furthermore, this is required for the reported `PodStatus` to be
correct.
The `Status` bit mostly worked prior to #25062, and this restores that
functionality in addition to the new functionality.
In retrospect, the regression in status is large enough the prior PR should have included at least some of this; my bad for not realizing the full implications there.
#25902 is needed for downwards api stuff, but either merge order is fine as neither will break badly by itself.
cc @yifan-gu @dcbw
This replaces the previous creation of mounts from the `volumeGetter`
with mounts provided via RunContainerOptions.
This is motivated by the fact that the latter has a more complete set of
mounts (e.g. the `/etc/hosts` one created in kubelet.go).