Automatic merge from submit-queue (batch tested with PRs 44406, 41543, 44071, 44374, 44299)
Decouple remotecommand
Refactored unversioned/remotecommand to decouple it from undesirable dependencies:
- term package now is not required, and functionality required to resize terminal size can be plugged in directly in kubectl
- in order to remove dependency on kubelet package - constants from kubelet/server/remotecommand were moved to separate util package (pkg/util/remotecommand)
- remotecommand_test.go moved to pkg/client/tests module
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)
Do not reformat devices with partitions
`lsblk` reports FSTYPE of devices with partition tables as empty string `""`,
which is indistinguishable from empty devices. We must look for dependent
devices (i.e. partitions) to see that the device is really empty and report
error otherwise.
The main point of this patch is to run `lsblk` without `"-n"`, i.e. print all
dependent devices and check it output.
Sample output:
```
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
10s 10s 1 default-scheduler Normal Scheduled Successfully assigned testpod to ip-172-18-11-149.ec2.internal
2s 2s 1 kubelet, ip-172-18-11-149.ec2.internal Warning FailedMount MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/vol-0fa9da8b91913b187" (spec.Name: "vol") pod "b74f68c5-0d6a-11e7-9233-0e11251010c0" (UID: "b74f68c5-0d6a-11e7-9233-0e11251010c0") with: failed to mount the volume as "ext4", it already contains unknown data, probably partitions. Mount error: mount failed: exit status 32
Mounting command: mount
Mounting arguments: /dev/xvdbb /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0fa9da8b91913b187 ext4 [defaults]
Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdbb,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
```
Without this patch, the device would be reformatted and all data in the device partitions would be lost.
Fixes#13212
Release note:
```release-note
NONE
```
@kubernetes/sig-storage-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41139, 41186, 38882, 37698, 42034)
Make kubelet never delete files on mounted filesystems
With bug #27653, kubelet could remove mounted volumes and delete user data.
The bug itself is fixed, however our trust in kubelet is significantly lower.
Let's add an extra version of RemoveAll that does not cross mount boundary
(rm -rf --one-file-system).
It calls lstat(path) three times for each removed directory - once in
RemoveAllOneFilesystem and twice in IsLikelyNotMountPoint, however this way
it's platform independent and the directory that is being removed by kubelet
should be almost empty.
lsblk reports FSTYPE of devices with partition tables as empty string "",
which is indistinguishable from empty devices. We must look for dependent
devices (i.e. partitions) to see that the device is really empty and report
error otherwise.
I checked that LVM, LUKS and MD RAID have their own FSTYPE in lsblk output,
so it should be only a partition table that has empty FSTYPE.
The main point of this patch is to run lsblk without "-n", i.e. print all
dependent devices and check if they're there.
With this PR, the second call to `Acquire()` will block unless the lock is released (process exits).
Also removed the memory mutex in the previous code since we don't need `Release()` here so no need to save and protect the local fd.
Fix#42929.
Automatic merge from submit-queue (batch tested with PRs 42316, 41618, 42201, 42113, 42191)
Support unqualified and partially qualified domain name in DNS query in Windows kube-proxy
**What this PR does / why we need it**:
In Windows container networking, --dns-search is not currently supported on Windows Docker. Besides, even with --dns-suffix, inside Windows container DNS suffix is not appended to DNS query names. That makes unqualified domain name or partially qualified domain name in DNS query not able to resolve.
This PR provides a solution to resolve unqualified domain name or partially qualified domain name in DNS query for Windows container in Windows kube-proxy. It uses well-known Kubernetes DNS suffix as well host DNS suffix search list to append to the name in DNS query. DNS packet in kube-proxy UDP stream is modified as appropriate.
This PR affects the Windows kube-proxy only.
**Special notes for your reviewer**:
This PR is based on top of Anthony Howe's commit 48647fb, 0e37f0a and 7e2c71f which is already included in the PR 41487. Please only review commit b9dfb69.
**Release note**:
```release-note
Add DNS suffix search list support in Windows kube-proxy.
```
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)
Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
With bug #27653, kubelet could remove mounted volumes and delete user data.
The bug itself is fixed, however our trust in kubelet is significantly lower.
Let's add an extra version of RemoveAll that does not cross mount boundary
(rm -rf --one-file-system).
It calls lstat(path) three times for each removed directory - once in
RemoveAllOneFilesystem and twice in IsLikelyNotMountPoint, however this way
it's platform independent and the directory that is being removed by kubelet
should be almost empty.
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
make kubectl taint command respect effect NoExecute
**What this PR does / why we need it**:
Part of feature forgiveness implementation, make kubectl taint command respect effect NoExecute.
**Which issue this PR fixes**:
Related Issue: #1574
Related PR: #39469
**Special notes for your reviewer**:
**Release note**:
```release-note
make kubectl taint command respect effect NoExecute
```
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Add support for attacher/detacher interface in Flex volume
Add support for attacher/detacher interface in Flex volume
This change breaks backward compatibility and requires to be release noted.
```release-note
Flex volume plugin is updated to support attach/detach interfaces. It broke backward compatibility. Please update your drivers and implement the new callouts.
```
Automatic merge from submit-queue (batch tested with PRs 40665, 41094, 41351, 41721, 41843)
Update i18n tools and process.
@fabianofranz @zen @kubernetes/sig-cli-pr-reviews
This is an update to the translation process based on feedback from folks.
The main changes are:
* `msgctx` is being removed from the files.
* String wrapping and string extraction have been separated.
* More tools from the `gettext` family of tools are being used
* Extracted strings are being sorted for canonical ordering
* A `.pot` template has been added.
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)
Show specific error when a volume is formatted by unexpected filesystem.
kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.
Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.
Example kubectl describe pod output:
```
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
41s 3s 7 {kubelet ip-172-18-3-82.ec2.internal} Warning FailedMount MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/aws://us-east-1d/vol-ba79c81d" (spec.Name: "pvc-ce175cbb-6b82-11e6-9fe4-0e885cca73d3") pod "3d19cb64-6b83-11e6-9fe4-0e885cca73d3" (UID: "3d19cb64-6b83-11e6-9fe4-0e885cca73d3") with: failed to mount the volume as "ext4", it's already formatted with "xfs". Mount error: mount failed: exit status 32
Mounting arguments: /dev/xvdba /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-ba79c81d ext4 [defaults]
Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdba,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
```
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Automatic merge from submit-queue (batch tested with PRs 41706, 39063, 41330, 41739, 41576)
Fix regex match doc of procfs.PidOf
Fixes#41247.
cc @bboreham
OSX 10.11.x has `/var` symlinked to `/private/var`, which was tripping
up logic in `mount.GetMountRefs`
This fixes unit tests for pkg/volume/fc and pkg/volume/iscsi
kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.
Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.
Automatic merge from submit-queue
Add initial french translations for kubectl
Add initial French translations, mostly as an example of how to add a new language.
@fabianofranz @kubernetes/sig-cli-pr-reviews
Move over only the conversions that are needed, create a new scheme that
is private to meta and only accessible via ParameterCodec. Move half of
pkg/util/labels/.readonly to pkg/apis/meta/v1/labels.go
Automatic merge from submit-queue
Improve TerminationMessagePath to be more flexible
* Support `terminationMessagePolicy: fallbackToLogsOnError` which allows pod authors to get useful information from containers as per kubernetes/community#154
* Set an upper bound on the size of the termination message path or log output to prevent callers from DoSing the master
* Add tests for running as root, non-root, and for the new terminationMessagePolicy cases.
I set the limit to 4096 bytes, but this may be too high for large pod containers. Probably need to set an absolute bound, i.e. max message size allowed is 20k total, and we truncate if we're above that limit.
Fixes#31839, #23569
```release-note
A new field `terminationMessagePolicy` has been added to containers that allows a user to request `FallbackToLogsOnError`, which will read from the container's logs to populate the termination message if the user does not write to the termination message log file. The termination message file is now properly readable for end users and has a maximum size (4k bytes) to prevent abuse. Each pod may have up to 12k bytes of termination messages before the contents of each will be truncated.
```
These files have been created lately, so we don't have much information
about them anyway, so let's just:
- Remove assignees and make them approvers
- Copy approves as reviewers
Enforce the following limits:
12kb for total message length in container status
4kb for the termination message path file
2kb or 80 lines (whichever is shorter) from the log on error
Fallback to log output if the user requests it.
Automatic merge from submit-queue
Enable lazy initialization of ext3/ext4 filesystems
**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#30752, fixes#30240
**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```
**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.
These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount
Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```
The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.
When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system.
As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory.
I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.
As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
Automatic merge from submit-queue
Enable streaming proxy redirects by default (beta)
Prerequisite to moving CRI to Beta.
I'd like to enable this early in our 1.6 cycle to get plenty of test coverage before release.
@yujuhong @liggitt
```release-note
Follow redirects for streaming requests (exec/attach/port-forward) in the apiserver by default (alpha -> beta).
```
Automatic merge from submit-queue
Remove packages which are now apimachinery
Removes all the content from the packages that were moved to `apimachinery`. This will force all vendoring projects to figure out what's wrong. I had to leave many empty marker packages behind to have verify-godep succeed on vendoring heapster.
@sttts straight deletes and simple adds
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Update deployment equality helper
@mfojtik @janetkuo this is split out of https://github.com/kubernetes/kubernetes/pull/38714 to reduce the size of that PR, ptal
Automatic merge from submit-queue
run staging client-go update
Chasing to see what real problems we have in staging-client-go.
@sttts you get similar results?
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39834, 38665)
Use parallel list for deleting items from a primitive list with merge strategy
Implemented parallel list for deleting items from a primitive list with merge strategy. Ref: [design doc](https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#list-of-primitives)
fixes#35163 and #32398
When using parallel list, we don't need to worry about version skew.
When an old APIServer gets a new patch like:
```yaml
metadata:
$deleteFromPrimitiveList/finalizers:
- b
finalizers:
- c
```
It won't fail and work as before, because the parallel list will be dropped during json decoding.
Remaining issue: There is no check when creating a set (primitive list with merge strategy). Duplicates may get in.
It happens in two cases:
1) Creation using POST
2) Creating a list that doesn't exist before using PATCH
Fixing the first case is the beyond the scope of this PR.
The second case can be fixed in this PR if we need that.
cc: @pwittrock @kubernetes/kubectl @kubernetes/sig-api-machinery
```release-note
Fix issue around merging lists of primitives when using PATCH or kubectl apply.
```