This PR fixes an issue in converting AWS volume IDs from mount
paths. Currently, three AWS volume ID formats are supported. The
following lists examples of those three formats and their corresponding
global mount paths:
1. aws:///vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/vol-123456)
2. aws://us-east-1/vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1/vol-123456)
3. vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1/vol-123456)
For the first two cases, we need to check the mount path and convert
them back to the original format.
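A minimal Go sketch of that conversion, assuming the `.../mounts/aws[/<zone>]/<vol-id>` layout shown above (the function name and path parsing here are illustrative, not the actual plugin code):

```go
package main

import (
	"fmt"
	"strings"
)

// volumeIDFromMountPath reconstructs the original AWS volume ID from a
// global mount path, based on whether the path carries a zone component.
func volumeIDFromMountPath(mountPath string) (string, error) {
	const marker = "/mounts/aws/"
	i := strings.Index(mountPath, marker)
	if i < 0 {
		return "", fmt.Errorf("unexpected mount path: %s", mountPath)
	}
	parts := strings.Split(mountPath[i+len(marker):], "/")
	switch len(parts) {
	case 1:
		// Format 1: no zone component in the path.
		return "aws:///" + parts[0], nil
	case 2:
		// Format 2: zone followed by the volume name.
		return fmt.Sprintf("aws://%s/%s", parts[0], parts[1]), nil
	default:
		return "", fmt.Errorf("unexpected mount path: %s", mountPath)
	}
}

func main() {
	id, _ := volumeIDFromMountPath(
		"/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1/vol-123456")
	fmt.Println(id) // aws://us-east-1/vol-123456
}
```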
Automatic merge from submit-queue
CRI: add docs for sysctls
#34830 adds the `sysctls` feature to CRI based on sandbox annotations; this PR adds docs for it.
@yujuhong @timstclair @jonboulle
Automatic merge from submit-queue
CRI: Clarify User in CRI.
Addressed https://github.com/kubernetes/kubernetes/pull/36423#issuecomment-259343135.
This PR clarifies the user-related fields in CRI.
One open question:
What is the meaning of the `run_as_user` field in `LinuxSandboxSecurityContext`?
* **Is it a user on the host?** Then it doesn't make sense; users shouldn't care about which users exist on the host.
* **Is it a user inside the infra container image?** This is how the field is currently used. However, the infra container is Docker specific, so I'm not sure whether we should expose this in CRI.
* **Is it the default user inside the pod?** It would tell the runtime that if a container (the infra container, or some other helper container such as a streaming container) does not specify a `user`, the default "sandbox user" should be used (sketched below). But then how can we guarantee that the infra or helper container images have that `user`?
* **Does it not make sense at all?** If we remove it, we rely on the shim to set the right user (maybe always root) for infra or helper containers (if there are any in the future); I'm not sure whether that is what we expect.
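For reference, the third interpretation would amount to a fallback roughly like the following hypothetical sketch (types and names are illustrative stand-ins, not the actual CRI definitions):

```go
// Hypothetical sketch of the "default user inside the pod" interpretation
// (option three above). Types and names are illustrative, not CRI code.
package main

import "fmt"

type containerConfig struct {
	name string
	user string // empty means "not specified"
}

// effectiveUser returns the user a runtime shim would assign under this
// interpretation: the container's own user if set, otherwise the
// sandbox-level run_as_user default.
func effectiveUser(c containerConfig, sandboxRunAsUser string) string {
	if c.user != "" {
		return c.user
	}
	return sandboxRunAsUser
}

func main() {
	infra := containerConfig{name: "infra"} // no user specified
	fmt.Println(effectiveUser(infra, "1000"))
	// Open question from above: nothing guarantees that the infra or
	// helper container image actually contains this user.
}
```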
@yujuhong @feiskyer @jonboulle @yifan-gu
/cc @kubernetes/sig-node
Automatic merge from submit-queue
V2resource fixes
When using `kubectl set resources`, it resets all resource fields that are not being set.
For example:
$ kubectl set resources deployments nginx --limits=cpu=100m
followed by
$ kubectl set resources deployments nginx --limits=memory=256Mi
would result in the nginx deployment limiting only memory at 256Mi, with the previous
limit placed on the CPU wiped out. This behavior is corrected so that each invocation
modifies only the fields set in that command, and the tests are changed to check the
desired behavior.
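Conceptually, the corrected behavior merges the newly specified fields into the existing ones rather than replacing the whole map, along these lines (an illustrative sketch, not the actual kubectl code):

```go
// Illustrative sketch of the corrected behavior: only the resources named
// in the current invocation are overwritten; existing entries are kept.
package main

import "fmt"

type resourceList map[string]string // e.g. "cpu" -> "100m"

// applyLimits merges newly specified limits into the existing ones
// instead of replacing the whole map.
func applyLimits(existing, specified resourceList) resourceList {
	merged := resourceList{}
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range specified {
		merged[k] = v
	}
	return merged
}

func main() {
	limits := resourceList{"cpu": "100m"}                         // first invocation
	limits = applyLimits(limits, resourceList{"memory": "256Mi"}) // second invocation
	fmt.Println(limits) // map[cpu:100m memory:256Mi] -- cpu limit preserved
}
```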
Also fixed a typo:
you must specify an update to requests or limits or (in the form of --requests/--limits)
corrected to
you must specify an update to requests or limits (in the form of --requests/--limits)
Implemented both the dry run and local flags.
Added test cases to show that both flags are operating as intended.
Removed the print statement "running in local mode" as in PR #35112.
The original PR associated with these fixes was reverted because it caused a flake in hack/make-rules/test-cmd.sh. I gave the `kubectl set resources` tests their own deployment, set terminationGracePeriodSeconds to 0, and have run test-cmd.sh for hours without hitting the flake.
Automatic merge from submit-queue
Close tunnels after failed healthchecks.
When we fail an ssh-tunnel healthcheck, we currently leak a file descriptor, keeping the SSH connection open.
This closes the underlying tunnel before removing our pointer to it. It is possible that the tunnel was functional, but the healthcheck failed for some other reason (e.g. kubelet healthz down), which could close an in-use tunnel, but I think that is acceptable.
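In outline, the fix closes the connection before dropping the reference, roughly like this (a simplified sketch; the real tunnel registry is more involved):

```go
// Simplified sketch: close the underlying connection before removing the
// pointer to it, so a failed healthcheck no longer leaks a file descriptor.
package main

import (
	"io"
	"sync"
)

type tunnelRegistry struct {
	mu      sync.Mutex
	tunnels map[string]io.Closer // keyed by node address (illustrative)
}

func (r *tunnelRegistry) removeAfterFailedHealthcheck(addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if t, ok := r.tunnels[addr]; ok {
		t.Close() // previously skipped, leaking the SSH connection
		delete(r.tunnels, addr)
	}
}

type nopCloser struct{}

func (nopCloser) Close() error { return nil }

func main() {
	r := &tunnelRegistry{tunnels: map[string]io.Closer{"node-a": nopCloser{}}}
	r.removeAfterFailedHealthcheck("node-a")
}
```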
Automatic merge from submit-queue
[kubelet] update some --cgroups-per-qos to --experimental-cgroups-per-qos
Following https://github.com/kubernetes/kubernetes/pull/36767, some fields in the docs and hack/local-up-cluster.sh still need to be updated.
Automatic merge from submit-queue
Add a flag allowing contention profiling of the API server
Useful for performance debugging.
cc @smarterclayton @timothysc @lavalamp
```release-note
Add a flag allowing contention profiling of the API server
```
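For context, Go exposes contention (block) profiling through the runtime, and wiring it behind a flag looks roughly like the sketch below (a generic example using the standard library, not the actual apiserver wiring):

```go
// Generic sketch of enabling Go contention (block) profiling behind a
// flag, exposed via the standard net/http/pprof endpoints.
package main

import (
	"flag"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/, including /debug/pprof/block
	"runtime"
)

func main() {
	contentionProfiling := flag.Bool("contention-profiling", false,
		"Enable contention profiling.")
	flag.Parse()

	if *contentionProfiling {
		// Record every blocking event (rate 1). Nonzero rates add
		// measurable overhead, hence the opt-in flag.
		runtime.SetBlockProfileRate(1)
	}
	http.ListenAndServe("localhost:6060", nil)
}
```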
Automatic merge from submit-queue
kubectl: add less verbose version
The kubectl version output is very complex and makes it hard for users
and vendors to get actionable information. For example, during the
recent Kubernetes 1.4.3 TLS security scramble I had to write a one-liner
for users to extract the version number and figure out whether they
are vulnerable:
```
$ kubectl version | grep -i Server | sed -n 's%.*GitVersion:"\([^"]*\).*%\1%p'
```
Instead, this patch outputs a simple format by default:
```
./kubectl version
Client Version: v1.4.3
Server Version: v1.4.3
```
Adding the `--verbose` flag will output the old format.
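A minimal sketch of the short-by-default output with a `--verbose` escape hatch (illustrative only; not the actual kubectl implementation):

```go
// Illustrative sketch of short-by-default version output with a
// --verbose escape hatch. Not the actual kubectl code.
package main

import (
	"flag"
	"fmt"
)

type versionInfo struct {
	GitVersion string
	GitCommit  string // ...plus other build metadata in the real struct
}

func main() {
	verbose := flag.Bool("verbose", false, "print the full version struct")
	flag.Parse()

	client := versionInfo{GitVersion: "v1.4.3", GitCommit: "abc123"}
	server := versionInfo{GitVersion: "v1.4.3", GitCommit: "def456"}

	if *verbose {
		// Old behavior: dump the whole struct.
		fmt.Printf("Client Version: %#v\nServer Version: %#v\n", client, server)
		return
	}
	// New default: just the version numbers.
	fmt.Printf("Client Version: %s\nServer Version: %s\n",
		client.GitVersion, server.GitVersion)
}
```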
Automatic merge from submit-queue
[kubelet] rename --cgroups-per-qos to --experimental-cgroups-per-qos
This reflects the true nature of the "cgroups per qos" feature.
```release-note
* Rename `--cgroups-per-qos` to `--experimental-cgroups-per-qos` in Kubelet
```
Automatic merge from submit-queue
Implement CanMount() for gfsMounter for linux
**What this PR does / why we need it**:
Implements the CanMount() check for glusterfs. If the mount binaries are not present on the underlying node, the mount will not proceed and will return an error message stating so.
Related to issue : https://github.com/kubernetes/kubernetes/issues/36098
Related to similar change for NFS :
https://github.com/kubernetes/kubernetes/pull/36280
**Release note**:
`Check binaries for GlusterFS on the underlying node before doing mount`
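In sketch form, the check amounts to verifying that the mount helper binary exists before attempting the mount (a simplified illustration; the binary path below is taken from the error message in the sample output):

```go
// Simplified sketch of a CanMount-style check: verify the mount helper
// binary exists on the node before attempting the mount.
package main

import (
	"fmt"
	"os"
)

const glusterfsMountBinary = "/sbin/mount.glusterfs"

// canMount returns an error if the required mount binary is missing,
// so the caller can fail fast with an actionable message.
func canMount() error {
	if _, err := os.Stat(glusterfsMountBinary); err != nil {
		return fmt.Errorf("required binary %s is missing", glusterfsMountBinary)
	}
	return nil
}

func main() {
	if err := canMount(); err != nil {
		fmt.Println("mount aborted:", err)
		return
	}
	fmt.Println("mount binaries present; proceeding with mount")
}
```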
Sample output from testing in GCE/GCI:
```
rkouj@rkouj0:~/go/src/k8s.io/kubernetes$ kubectl describe pods
Name:           glusterfs
Namespace:      default
Node:           e2e-test-rkouj-minion-group-kjq3/10.240.0.3
Start Time:     Fri, 11 Nov 2016 17:22:04 -0800
Labels:         <none>
Status:         Pending
IP:
Controllers:    <none>
Containers:
  glusterfs:
    Container ID:
    Image:              gcr.io/google_containers/busybox
    Image ID:
    Port:
    QoS Tier:
      cpu:              Burstable
      memory:           BestEffort
    Requests:
      cpu:              100m
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Environment Variables:
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  glusterfs:
    Type:           Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:  glusterfs-cluster
    Path:           kube_vol
    ReadOnly:       true
  default-token-2zcao:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-2zcao
Events:
  FirstSeen  LastSeen  Count  From                                        SubobjectPath  Type     Reason       Message
  ---------  --------  -----  ----                                        -------------  -------- ------       -------
  8s         8s        1      {default-scheduler }                                       Normal   Scheduled    Successfully assigned glusterfs to e2e-test-rkouj-minion-group-kjq3
  7s         4s        4      {kubelet e2e-test-rkouj-minion-group-kjq3}                 Warning  FailedMount  Unable to mount volume kubernetes.io/glusterfs/6bb04587-a876-11e6-a712-42010af00002-glusterfs (spec.Name: glusterfs) on pod glusterfs (UID: 6bb04587-a876-11e6-a712-42010af00002). Verify that your node machine has the required components before attempting to mount this volume type. Required binary /sbin/mount.glusterfs is missing
```
Automatic merge from submit-queue
Fix handling lists in kubectl convert
Fixes https://github.com/kubernetes/kubernetes/issues/36722
When handling multiple objects in `kubectl convert` (for example in `kubectl convert -f .` with multiple files in the current directory), the objects must be managed as a list instead of individually; otherwise `-o yaml|json` generates invalid output (just multiple JSON/YAML objects concatenated) which can't be fed to `kubectl create`, as in `kubectl convert -f . | kubectl create -f -`.
```release-note
NONE
```
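Conceptually, the fix wraps the objects into a single `v1` `List` before encoding, roughly like this schematic sketch (the real code uses the Kubernetes API machinery types, not plain maps):

```go
// Schematic sketch: encode multiple objects as one List with an "items"
// field rather than concatenating standalone documents, so the output
// stays valid JSON/YAML for `kubectl create -f -`.
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	objects := []map[string]interface{}{
		{"apiVersion": "v1", "kind": "Pod", "metadata": map[string]string{"name": "a"}},
		{"apiVersion": "v1", "kind": "Pod", "metadata": map[string]string{"name": "b"}},
	}

	list := map[string]interface{}{
		"apiVersion": "v1",
		"kind":       "List",
		"items":      objects,
	}

	out, _ := json.MarshalIndent(list, "", "  ")
	fmt.Println(string(out)) // one valid document instead of two concatenated ones
}
```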
Automatic merge from submit-queue
Default host user namespace via experimental flag
@vishh @ncdc @pmorie @smarterclayton @thockin
Initial thought on the implementation (https://github.com/kubernetes/kubernetes/pull/30684#issuecomment-241523425) wasn't quite right. Since we need to dereference a PVC in some cases, the defaulting code didn't fit nicely in the Docker manager code (it would've been coupled with a kube client and would've been messy). I think passing this in via the runtime config turned out cleaner. PTAL
Automatic merge from submit-queue
fix bug when comparing versions
Fix a small bug in version comparison in `patch`, introduced by my PR #35647 today.
This blocks #36672.
cc: @janetkuo
Automatic merge from submit-queue
Fix watching from resourceVersion=0 in etcd3 watcher
Fixes https://github.com/kubernetes/kubernetes/issues/36545
* Makes etcd3 consistent with watch cache behavior (all synthetic events sent for the initial list of items result in ADDED events)
* Fixes errors if previous values of initial items had been compacted away
* Removes fan-out Get() for previous values of initial items
Should be fixed before making etcd3 the default (https://github.com/kubernetes/kubernetes/pull/36229)
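In sketch form, the consistent behavior turns each item of the initial listing into a synthetic ADDED event, with no per-item Get() for previous values (a simplified illustration, not the actual etcd3 watcher code):

```go
// Simplified sketch of the consistent behavior: when a watch starts at
// resourceVersion=0, every item in the initial listing is delivered as a
// synthetic ADDED event, with no fan-out Get() for previous values.
package main

import "fmt"

type eventType string

const added eventType = "ADDED"

type event struct {
	kind eventType
	obj  string // stands in for a runtime object
}

func initialEvents(initialList []string) []event {
	events := make([]event, 0, len(initialList))
	for _, obj := range initialList {
		events = append(events, event{kind: added, obj: obj})
	}
	return events
}

func main() {
	for _, e := range initialEvents([]string{"pod-a", "pod-b"}) {
		fmt.Println(e.kind, e.obj)
	}
}
```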
Automatic merge from submit-queue
Restore event messages for replica sets in the deployment controller
Needed to unblock release upgrade tests (see https://github.com/kubernetes/kubernetes/issues/36453)
@kubernetes/deployment ptal
Automatic merge from submit-queue
Fix strategic patch for list of primitive type with merge semantics
Fix strategic patch for lists of primitive type when the patch strategy is `merge`.
Before: we could not replace or delete an item in a list of primitives (e.g. strings) when the patch strategy is `merge`; new items were always appended to the list.
This patch generates a map to update the list of primitive type.
A server with this patch will accept either a new patch or an old patch.
The client will find out the API server version before generating the patch.
Fixes #35163, #32398
cc: @pwittrock @fabianofranz
``` release-note
Fix strategic patch for list of primitive type when patch strategy is `merge` to remove deleted objects.
```
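As a rough illustration of the idea only (the actual strategic-merge-patch wire format is defined elsewhere and differs from this sketch), the change describes list updates as per-item operations instead of a bare list that can only be appended:

```go
package main

import "fmt"

// diffPrimitiveList represents changes to a primitive list as per-item
// operations (add/delete) instead of a bare list that can only be
// appended to. Rough illustration only; the real strategic-merge-patch
// wire format differs.
func diffPrimitiveList(current, desired []string) map[string]string {
	inDesired := map[string]bool{}
	for _, v := range desired {
		inDesired[v] = true
	}
	inCurrent := map[string]bool{}
	for _, v := range current {
		inCurrent[v] = true
	}
	ops := map[string]string{}
	for _, v := range current {
		if !inDesired[v] {
			ops[v] = "delete" // impossible with the old append-only merge
		}
	}
	for _, v := range desired {
		if !inCurrent[v] {
			ops[v] = "add"
		}
	}
	return ops
}

func main() {
	fmt.Println(diffPrimitiveList([]string{"a", "b"}, []string{"a", "c"}))
	// map[b:delete c:add]
}
```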
Automatic merge from submit-queue
Switch pod eviction client from v1alpha1 to v1beta1
The generated client in 1.5 has a function to evict a pod. The function uses the v1alpha1.Eviction object instead of v1beta1. This PR changes the API version that is being used.
cc: @davidopp @caesarxuchao
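For reference, constructing the v1beta1 Eviction object looks roughly like this (import paths reflect the current library layout rather than the 1.5-era tree, and the client call itself varies across client versions):

```go
// Sketch of building the policy/v1beta1 Eviction object that replaces
// the previously used v1alpha1.Eviction.
package main

import (
	"fmt"

	policyv1beta1 "k8s.io/api/policy/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func newEviction(namespace, pod string) *policyv1beta1.Eviction {
	return &policyv1beta1.Eviction{
		TypeMeta:   metav1.TypeMeta{APIVersion: "policy/v1beta1", Kind: "Eviction"},
		ObjectMeta: metav1.ObjectMeta{Name: pod, Namespace: namespace},
	}
}

func main() {
	e := newEviction("default", "nginx-1234")
	fmt.Println(e.APIVersion, e.Kind, e.Namespace, e.Name)
}
```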
Automatic merge from submit-queue
fix issue in reconstructing volume data when kubelet restarts
During state reconstruction when the kubelet restarts, outerVolumeSpecName
cannot be recovered by scanning the disk directories, but this
information is used by the volume manager to check whether a pod's volume is
mounted or not. There are two possible cases:
1. The pod is not deleted while the kubelet restarts, so the desired state
should have the information; reconciler.updateState() will use this
information to update the actual state.
2. The pod is deleted during this period; the reconciler has to use
InnerVolumeSpecName, but that should be OK since this information will not
be used for volume cleanup (unmount).
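In outline, the recovery logic amounts to something like the following schematic sketch (not the actual reconciler code):

```go
// Schematic sketch of the fallback described above: prefer the
// outerVolumeSpecName recorded in the desired state; when the pod was
// deleted while the kubelet was down, fall back to the
// innerVolumeSpecName reconstructed from disk.
package main

import "fmt"

type desiredState map[string]string // volume -> outerVolumeSpecName (illustrative)

func outerSpecName(desired desiredState, volume, innerVolumeSpecName string) string {
	if outer, ok := desired[volume]; ok {
		// Case 1: the pod still exists in the desired state, so
		// reconciler.updateState() can use the real name.
		return outer
	}
	// Case 2: the pod was deleted while the kubelet was down. Use the
	// inner name; it is not used for cleanup (unmount), so this is safe.
	return innerVolumeSpecName
}

func main() {
	desired := desiredState{"vol-1": "my-claim"}
	fmt.Println(outerSpecName(desired, "vol-1", "pvc-abc")) // my-claim
	fmt.Println(outerSpecName(desired, "vol-2", "pvc-def")) // pvc-def
}
```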
Automatic merge from submit-queue
Improve quota performance for pvc by using shared informer
This avoids a list call for each namespace in the resource quota controller when syncing quota.
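Schematically, the change replaces a per-namespace List API call with reads from an informer-backed local cache, as in this simplified sketch (names are illustrative; the real controller uses the generated PVC informer and lister):

```go
// Simplified sketch: instead of issuing one List API call per namespace,
// read PVCs from a shared informer's local cache, which is filled once
// by a single list/watch and then served from memory.
package main

import "fmt"

type pvc struct{ namespace, name string }

// sharedCache stands in for an informer-backed store.
type sharedCache struct {
	byNamespace map[string][]pvc
}

func (c *sharedCache) list(namespace string) []pvc {
	return c.byNamespace[namespace] // no API round trip per namespace
}

func main() {
	cache := &sharedCache{byNamespace: map[string][]pvc{
		"default": {{namespace: "default", name: "data"}},
	}}
	for _, claim := range cache.list("default") {
		fmt.Println(claim.namespace, claim.name)
	}
}
```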