This allows kubelets to stop the necessary work when the context has
been canceled (e.g., connection closed), and not leaking a goroutine
and inotify watcher waiting indefinitely.
These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.
Lubomir I. Ivanov <neolit123@gmail.com>
applied ammend for:
pkg/cloudprovider/provivers/vsphere/nodemanager.go
Automatic merge from submit-queue (batch tested with PRs 63348, 63839, 63143, 64447, 64567). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Containerized subpath
**What this PR does / why we need it**:
Containerized kubelet needs a different implementation of `PrepareSafeSubpath` than kubelet running directly on the host.
On the host we safely open the subpath and then bind-mount `/proc/<pidof kubelet>/fd/<descriptor of opened subpath>`.
With kubelet running in a container, `/proc/xxx/fd/yy` on the host contains path that works only inside the container, i.e. `/rootfs/path/to/subpath` and thus any bind-mount on the host fails.
Solution:
- safely open the subpath and gets its device ID and inode number
- blindly bind-mount the subpath to `/var/lib/kubelet/pods/<uid>/volume-subpaths/<name of container>/<id of mount>`. This is potentially unsafe, because user can change the subpath source to a link to a bad place (say `/run/docker.sock`) just before the bind-mount.
- get device ID and inode number of the destination. Typical users can't modify this file, as it lies on /var/lib/kubelet on the host.
- compare these device IDs and inode numbers.
**Which issue(s) this PR fixes**
Fixes#61456
**Special notes for your reviewer**:
The PR contains some refactoring of `doBindSubPath` to extract the common code. New `doNsEnterBindSubPath` is added for the nsenter related parts.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add proxy for container streaming in kubelet for streaming auth.
For https://github.com/kubernetes/kubernetes/issues/36666, option 2 of https://github.com/kubernetes/kubernetes/issues/36666#issuecomment-378440458.
This PR:
1. Removed the `DirectStreamingRuntime`, and changed `IndirectStreamingRuntime` to `StreamingRuntime`. All `DirectStreamingRuntime`s, `dockertools` and `rkt`, were removed.
2. Proxy container streaming in kubelet instead of returning redirect to apiserver. This solves the container runtime authentication issue, which is what we agreed on in https://github.com/kubernetes/kubernetes/issues/36666.
Please note that, this PR replaced the redirect with proxy directly instead of adding a knob to switch between the 2 behaviors. For existing CRI runtimes like containerd and cri-o, they should change to serve container streaming on localhost, so as to make the whole container streaming connection secure.
If a general authentication mechanism proposed in https://github.com/kubernetes/kubernetes/issues/62747 is ready, we can switch back to redirect, and all code can be found in github history.
Please also note that this added some overhead in kubelet when there are container streaming connections. However, the actual bottleneck is in the apiserver anyway, because it does proxy for all container streaming happens in the cluster. So it seems fine to get security and simplicity with this overhead. @derekwaynecarr @mrunalp Are you ok with this? Or do you prefer a knob?
@yujuhong @timstclair @dchen1107 @mikebrow @feiskyer
/cc @kubernetes/sig-node-pr-reviews
**Release note**:
```release-note
Kubelet now proxies container streaming between apiserver and container runtime. The connection between kubelet and apiserver is authenticated. Container runtime should change streaming server to serve on localhost, to make the connection between kubelet and container runtime local.
In this way, the whole container streaming connection is secure. To switch back to the old behavior, set `--redirect-container-streaming=true` flag.
```
Kubelet should not resolve symlinks outside of mounter interface.
Only mounter interface knows, how to resolve them properly on the host.
As consequence, declaration of SafeMakeDir changes to simplify the
implementation:
from SafeMakeDir(fullPath string, base string, perm os.FileMode)
to SafeMakeDir(subdirectoryInBase string, base string, perm os.FileMode)
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
passthrough readOnly to subpath
**What this PR does / why we need it**:
If a volume is mounted as readonly, or subpath volumeMount is configured as readonly, then the subpath bind mount should be readonly.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#62752
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes issue where subpath readOnly mounts failed
```
Automatic merge from submit-queue (batch tested with PRs 61147, 62236, 62018). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix local volume absolute path issue on Windows
**What this PR does / why we need it**:
remove IsAbs validation on local volume since it does not work on windows cluster, Windows absolute path `D:` is not allowed in local volume, the [validation](https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/validation/validation.go#L1386) happens on both master and agent node, while for windows cluster, the master is Linux and agent is Windows, so `path.IsAbs()` func will not work all in both nodes.
**Instead**, this PR use `MakeAbsolutePath` func to convert `local.path` value in kubelet, it supports both linux and windows styple.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#62016
**Special notes for your reviewer**:
**Release note**:
```
fix local volume absolute path issue on Windows
```
/sig storage
/sig windows
use MakeAbsolutePath to convert path in Windows
fix test error: allow relative path for local volume
fix comments
fix comments and add windows unit tests
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure /etc/hosts has a header always - Fix conformance test
**What this PR does / why we need it**:
We need to be able to tell if an /etc/hosts in a container has been touched by kubernetes or not (whether we use the host network or not).
We have 2 scenarios where we copy /etc/hosts
- with host network (we just copy the /etc/hosts from node)
- without host network (create a fresh /etc/hosts from pod info)
We are having trouble figuring out whether a /etc/hosts in a
pod/container has been "fixed-up" or not. And whether we used
host network or a fresh /etc/hosts in the various ways we start
up the tests which are:
- VM/box against a remote cluster
- As a container inside the k8s cluster
- DIND scenario in CI where test runs inside a managed container
Please see previous mis-guided attempt to fix this problem at
ba20e63446 In this commit we revert
the code from there as well.
So we should make sure:
- we always add a header if we touched the file
- we add slightly different headers so we can figure out if we used the
host network or not.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60938
**Special notes for your reviewer**:
Also see
- https://github.com/kubernetes/kubernetes/pull/61405
- https://github.com/kubernetes/kubernetes/pull/60939
- https://github.com/kubernetes/kubernetes/issues/60938
**Release note**:
```release-note
NONE
```
We have 2 scenarios where we copy /etc/hosts
- with host network (we just copy the /etc/hosts from node)
- without host network (create a fresh /etc/hosts from pod info)
We are having trouble figuring out whether a /etc/hosts in a
pod/container has been "fixed-up" or not. And whether we used
host network or a fresh /etc/hosts in the various ways we start
up the tests which are:
- VM/box against a remote cluster
- As a container inside the k8s cluster
- DIND scenario in CI where test runs inside a managed container
Please see previous mis-guided attempt to fix this problem at
ba20e63446 In this commit we revert
the code from there as well.
So we should make sure:
- we always add a header if we touched the file
- we add slightly different headers so we can figure out if we used the
host network or not.
Update the test case to inject /etc/hosts from node to another path
(/etc/hosts-original) as well and use that to compare.
Automatic merge from submit-queue (batch tested with PRs 61894, 61369). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use range in loops; misc fixes
**What this PR does / why we need it**:
It is cleaner to use `range` in for loops to iterate over channel until it is closed.
**Release note**:
```release-note
NONE
```
/kind cleanup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix `PodScheduled` bug for static pod.
Fixes https://github.com/kubernetes/kubernetes/issues/60589.
This is an implementation of option 2 in https://github.com/kubernetes/kubernetes/issues/60589#issuecomment-375103979.
I've validated this in my own cluster, and there won't be continuously status update for static pod any more.
Signed-off-by: Lantao Liu <lantaol@google.com>
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
none
```
Users must not be allowed to step outside the volume with subPath.
Therefore the final subPath directory must be "locked" somehow
and checked if it's inside volume.
On Windows, we lock the directories. On Linux, we bind-mount the final
subPath into /var/lib/kubelet/pods/<uid>/volume-subpaths/<container name>/<subPathName>,
it can't be changed to symlink user once it's bind-mounted.
Automatic merge from submit-queue (batch tested with PRs 59010, 59212, 59281, 59014, 59297). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve error returned when fetching container logs during pod termination
**What this PR does / why we need it**:
This change better handles fetching of logs when a container is in a crash loop backoff state. In cases where it is unable to fetch the logs, it gives a helpful error message back to a user who has requested logs of a container from a terminated pod. Rather than attempting to get logs for a container using an empty container ID, it returns a useful error message.
In cases where the container runtime gets an error, log the error but don't leak it back through the API to the user.
**Which issue(s) this PR fixes**:
Fixes#59296
**Release note**:
```release-note
NONE
```
This also incorporates the version string into the package name so
that incompatibile versions will fail to connect.
Arbitrary choices:
- The proto3 package name is runtime.v1alpha2. The proto compiler
normally translates this to a go package of "runtime_v1alpha2", but
I renamed it to "v1alpha2" for consistency with existing packages.
- kubelet/apis/cri is used as "internalapi". I left it alone and put the
public "runtimeapi" in kubelet/apis/cri/runtime.
Add a feature gate ReadOnlyAPIDataVolumes to a provide a way to
disable the new behavior in 1.10, but for 1.11, the new
behavior will become non-optional.
Also, update E2E tests for downwardAPI and projected volumes
to mount the volumes somewhere other than /etc.