Automatic merge from submit-queue (batch tested with PRs 42316, 41618, 42201, 42113, 42191)
Support unqualified and partially qualified domain name in DNS query in Windows kube-proxy
**What this PR does / why we need it**:
In Windows container networking, --dns-search is not currently supported on Windows Docker. Besides, even with --dns-suffix, inside Windows container DNS suffix is not appended to DNS query names. That makes unqualified domain name or partially qualified domain name in DNS query not able to resolve.
This PR provides a solution to resolve unqualified domain name or partially qualified domain name in DNS query for Windows container in Windows kube-proxy. It uses well-known Kubernetes DNS suffix as well host DNS suffix search list to append to the name in DNS query. DNS packet in kube-proxy UDP stream is modified as appropriate.
This PR affects the Windows kube-proxy only.
**Special notes for your reviewer**:
This PR is based on top of Anthony Howe's commit 48647fb, 0e37f0a and 7e2c71f which is already included in the PR 41487. Please only review commit b9dfb69.
**Release note**:
```release-note
Add DNS suffix search list support in Windows kube-proxy.
```
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)
Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
With bug #27653, kubelet could remove mounted volumes and delete user data.
The bug itself is fixed, however our trust in kubelet is significantly lower.
Let's add an extra version of RemoveAll that does not cross mount boundary
(rm -rf --one-file-system).
It calls lstat(path) three times for each removed directory - once in
RemoveAllOneFilesystem and twice in IsLikelyNotMountPoint, however this way
it's platform independent and the directory that is being removed by kubelet
should be almost empty.
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
make kubectl taint command respect effect NoExecute
**What this PR does / why we need it**:
Part of feature forgiveness implementation, make kubectl taint command respect effect NoExecute.
**Which issue this PR fixes**:
Related Issue: #1574
Related PR: #39469
**Special notes for your reviewer**:
**Release note**:
```release-note
make kubectl taint command respect effect NoExecute
```
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Add support for attacher/detacher interface in Flex volume
Add support for attacher/detacher interface in Flex volume
This change breaks backward compatibility and requires to be release noted.
```release-note
Flex volume plugin is updated to support attach/detach interfaces. It broke backward compatibility. Please update your drivers and implement the new callouts.
```
Automatic merge from submit-queue (batch tested with PRs 40665, 41094, 41351, 41721, 41843)
Update i18n tools and process.
@fabianofranz @zen @kubernetes/sig-cli-pr-reviews
This is an update to the translation process based on feedback from folks.
The main changes are:
* `msgctx` is being removed from the files.
* String wrapping and string extraction have been separated.
* More tools from the `gettext` family of tools are being used
* Extracted strings are being sorted for canonical ordering
* A `.pot` template has been added.
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)
Show specific error when a volume is formatted by unexpected filesystem.
kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.
Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.
Example kubectl describe pod output:
```
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
41s 3s 7 {kubelet ip-172-18-3-82.ec2.internal} Warning FailedMount MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/aws://us-east-1d/vol-ba79c81d" (spec.Name: "pvc-ce175cbb-6b82-11e6-9fe4-0e885cca73d3") pod "3d19cb64-6b83-11e6-9fe4-0e885cca73d3" (UID: "3d19cb64-6b83-11e6-9fe4-0e885cca73d3") with: failed to mount the volume as "ext4", it's already formatted with "xfs". Mount error: mount failed: exit status 32
Mounting arguments: /dev/xvdba /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-ba79c81d ext4 [defaults]
Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdba,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
```
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Automatic merge from submit-queue (batch tested with PRs 41706, 39063, 41330, 41739, 41576)
Fix regex match doc of procfs.PidOf
Fixes#41247.
cc @bboreham
OSX 10.11.x has `/var` symlinked to `/private/var`, which was tripping
up logic in `mount.GetMountRefs`
This fixes unit tests for pkg/volume/fc and pkg/volume/iscsi
kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.
Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.
Automatic merge from submit-queue
Add initial french translations for kubectl
Add initial French translations, mostly as an example of how to add a new language.
@fabianofranz @kubernetes/sig-cli-pr-reviews
Move over only the conversions that are needed, create a new scheme that
is private to meta and only accessible via ParameterCodec. Move half of
pkg/util/labels/.readonly to pkg/apis/meta/v1/labels.go
Automatic merge from submit-queue
Improve TerminationMessagePath to be more flexible
* Support `terminationMessagePolicy: fallbackToLogsOnError` which allows pod authors to get useful information from containers as per kubernetes/community#154
* Set an upper bound on the size of the termination message path or log output to prevent callers from DoSing the master
* Add tests for running as root, non-root, and for the new terminationMessagePolicy cases.
I set the limit to 4096 bytes, but this may be too high for large pod containers. Probably need to set an absolute bound, i.e. max message size allowed is 20k total, and we truncate if we're above that limit.
Fixes#31839, #23569
```release-note
A new field `terminationMessagePolicy` has been added to containers that allows a user to request `FallbackToLogsOnError`, which will read from the container's logs to populate the termination message if the user does not write to the termination message log file. The termination message file is now properly readable for end users and has a maximum size (4k bytes) to prevent abuse. Each pod may have up to 12k bytes of termination messages before the contents of each will be truncated.
```
These files have been created lately, so we don't have much information
about them anyway, so let's just:
- Remove assignees and make them approvers
- Copy approves as reviewers
Enforce the following limits:
12kb for total message length in container status
4kb for the termination message path file
2kb or 80 lines (whichever is shorter) from the log on error
Fallback to log output if the user requests it.
Automatic merge from submit-queue
Enable lazy initialization of ext3/ext4 filesystems
**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#30752, fixes#30240
**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```
**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.
These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount
Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```
The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.
When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system.
As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory.
I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.
As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
Automatic merge from submit-queue
Enable streaming proxy redirects by default (beta)
Prerequisite to moving CRI to Beta.
I'd like to enable this early in our 1.6 cycle to get plenty of test coverage before release.
@yujuhong @liggitt
```release-note
Follow redirects for streaming requests (exec/attach/port-forward) in the apiserver by default (alpha -> beta).
```
Automatic merge from submit-queue
Remove packages which are now apimachinery
Removes all the content from the packages that were moved to `apimachinery`. This will force all vendoring projects to figure out what's wrong. I had to leave many empty marker packages behind to have verify-godep succeed on vendoring heapster.
@sttts straight deletes and simple adds
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Update deployment equality helper
@mfojtik @janetkuo this is split out of https://github.com/kubernetes/kubernetes/pull/38714 to reduce the size of that PR, ptal
Automatic merge from submit-queue
run staging client-go update
Chasing to see what real problems we have in staging-client-go.
@sttts you get similar results?
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39834, 38665)
Use parallel list for deleting items from a primitive list with merge strategy
Implemented parallel list for deleting items from a primitive list with merge strategy. Ref: [design doc](https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#list-of-primitives)
fixes#35163 and #32398
When using parallel list, we don't need to worry about version skew.
When an old APIServer gets a new patch like:
```yaml
metadata:
$deleteFromPrimitiveList/finalizers:
- b
finalizers:
- c
```
It won't fail and work as before, because the parallel list will be dropped during json decoding.
Remaining issue: There is no check when creating a set (primitive list with merge strategy). Duplicates may get in.
It happens in two cases:
1) Creation using POST
2) Creating a list that doesn't exist before using PATCH
Fixing the first case is the beyond the scope of this PR.
The second case can be fixed in this PR if we need that.
cc: @pwittrock @kubernetes/kubectl @kubernetes/sig-api-machinery
```release-note
Fix issue around merging lists of primitives when using PATCH or kubectl apply.
```
Automatic merge from submit-queue (batch tested with PRs 34488, 39511, 39619, 38342, 39491)
Make StrategicPatch delete all matching maps in a merging list
fixes#38332
```release-note
NONE
```
cc: @lavalamp @pwittrock
Automatic merge from submit-queue (batch tested with PRs 34488, 39511, 39619, 38342, 39491)
use fake clock in lruexpiration cache test
when the system clock is extremely slow(usually see in VMs), this [check](https://github.com/kubernetes/kubernetes/blob/master/pkg/util/cache/lruexpirecache.go#L74) might still return the value.
```go
if c.clock.Now().After(e.(*cacheEntry).expireTime) {
go c.remove(key)
return nil, false
}
```
that means even we set the ttl to be 0 second, the after check might still be false(because the clock is too slow, and thus equals).
the change here helps to reduce flakes.
Automatic merge from submit-queue
snip pkg/util/strings dependency
The `pkg/util/strings` package looks to be largely used by volumes, which are independent of the bits used by genericapiserver which aren't used by anyone else. This moves the single function (used no where else) to its point of use.
@sttts
Automatic merge from submit-queue
Avoid unnecessary memory allocations
Low-hanging fruits in saving memory allocations. During our 5000-node kubemark runs I've see this:
ControllerManager:
- 40.17% k8s.io/kubernetes/pkg/util/system.IsMasterNode
- 19.04% k8s.io/kubernetes/pkg/controller.(*PodControllerRefManager).Classify
Scheduler:
- 42.74% k8s.io/kubernetes/plugin/pkg/scheduler/algrorithm/predicates.(*MaxPDVolumeCountChecker).filterVolumes
This PR is eliminating all of those.
Automatic merge from submit-queue
Begin paths for internationalization in kubectl
This is just the first step, purposely simple so we can get the interface correct.
@kubernetes/sig-cli @deads2k
Automatic merge from submit-queue (batch tested with PRs 38920, 38090)
Improve error message for name/label validation.
Instead of just providing regex in name/label validation error output, we need to add the naming rules of the name/label, which is more end-user readable.
Fixed#37654
Automatic merge from submit-queue (batch tested with PRs 36888, 38180, 38855, 38590)
Fix variable shadowing in exponential backoff when deleting volumes
While https://github.com/kubernetes/kubernetes/pull/38339 implemented exponential backoff on
volume deletion, that PR suffers from a minor bug when error thrown on volume deletion is anything other than `VolumeInUse` errors - in which case exponential backoff will not work.
This PR fixes that. This PR also makes unit tests more deterministic because exponential backoff changed the way operations are permitted.
CC @jsafrane @childsb @wongma7
Automatic merge from submit-queue (batch tested with PRs 38525, 38977)
Prevent json decoder panic on invalid input
Related downstream issue: https://github.com/openshift/origin/issues/12132
```
# Can be replicated on kubectl with:
$ cat panic.json
{
"kind": "Pod",
"apiVersion": "v1",
"metadata": {
"name": "",
"labels": {
"name": ""
},
"generateName": "",
"namespace": "",
"annotations": []
},
"spec": {}
},
$ kubectl create -f panic.json --validate=false
```
**Release note**:
```release-note
release-note-none
```
This patch handles cases where `ioutil.ReadAll` will return a single
character output on an invalid json input, causing the `Decode` method
to panic when it tries to calculate the line number for the syntax
error. The example below would cause a panic due to the trailing comma
at the end:
```
{
"kind": "Pod",
"apiVersion": "v1",
"metadata": {
"name": "",
"labels": {
"name": ""
},
"generateName": "",
"namespace": "",
"annotations": []
},
"spec": {}
},
```
@kubernetes/cli-review @fabianofranz
Automatic merge from submit-queue
Add a package for handling version numbers (including non-"Semantic" versions)
As noted in #32401, we are using Semantic Version-parsing libraries to parse version numbers that aren't necessarily "Semantic". Although, contrary to what I'd said there, it turns out that this wasn't actually currently a problem for the iptables code, because the regexp used to extract the version number out of the "iptables --version" output only pulled out three components, so given "iptables v1.4.19.1", it would have extracted just "1.4.19". Still, it could be a problem if they later release "1.5" rather than "1.5.0", or if we eventually need to _compare_ against a 4-digit version number.
Also, as noted in #23854, we were also using two different semver libraries in different parts of the code (plus a wrapper around one of them in pkg/version).
This PR adds pkg/util/version, with code to parse and compare both semver and non-semver version strings, and then updates kubernetes to use it everywhere (including getting rid of a bunch of code duplication in kubelet by making utilversion.Version implement the kubecontainer.Version interface directly).
Ironically, this does not actually allow us to get rid of either of the vendored semver libraries, because we still have other dependencies that depend on each of them. (cadvisor uses blang/semver and etcd uses coreos/go-semver)
fixes#32401, #23854
Automatic merge from submit-queue
Fixed a typo of wildcard DNS regex variable name.
Happened to see the typo while reading code, fixed the typo and refined the code.
Automatic merge from submit-queue
fix duplicate validation/field/errors
**Release note**:
``` release-note
release-note-none
```
Related PR: https://github.com/kubernetes/kubernetes/pull/30313
PR #30313 fixed duplicate errors for invalid aggregate errors in
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/util/helpers.go
However, duplicate aggregate errors that went through
https://github.com/kubernetes/kubernetes/blob/master/pkg/util/validation/field/errors.go
were not affected by that patch.
This patch adds duplicate aggregate error checking to
`pkg/util/validation/field/errors.go`
##### Before
`$ kubectl set env rc/idling-echo-1 test-abc=1234`
```
error: ReplicationController "idling-echo-1" is invalid:
[spec.template.spec.containers[0].env[0].name: Invalid value:
"test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName", spec.template.spec.containers[0].env[0].name:
Invalid value: "test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"]
```
`$ kubectl set env rc/node-1 test-abc=1234`
```
error: ReplicationController "idling-echo-1" is invalid:
[spec.template.spec.containers[0].env[0].name: Invalid value:
"test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"]
```
##### After
`$ kubectl set env rc/idling-echo-1 test-abc=1234`
```
error: ReplicationController "idling-echo-1" is invalid:
[spec.template.spec.containers[0].env[0].name: Invalid value:
"test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"]
```
`$ kubectl set env rc/node-1 test-abc=1234`
```
error: ReplicationController "node-1" is invalid:
spec.template.spec.containers[0].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"
```
This patch handles cases where `ioutil.ReadAll` will return a single
character output on an invalid json input, causing the `Decode` method
to panic when it tries to calculate the line number for the syntax
error. The example below would cause a panic due to the trailing comma
at the end:
```
{
"kind": "Pod",
"apiVersion": "v1",
"metadata": {
"name": "",
"labels": {
"name": ""
},
"generateName": "",
"namespace": "",
"annotations": []
},
"spec": {}
},
```
Automatic merge from submit-queue (batch tested with PRs 36736, 35956, 35655, 37713, 38316)
Optimize port_split_test test case.
The `normalized` field doesn't take affect in current test case.
This PR:
1. initializes valid and normalized cases with normalized=true.
2. adds some invalid cases.
@resouer Thanks!
This is a fix on top #38124. In this fix, we move the logic to filter
out shared mount references into operation_executor's UnmountDevice
function to avoid this part is being used by other types volumes such as
rdb, azure etc. This filter function should be only needed during
unmount device for GCI image.
Automatic merge from submit-queue (batch tested with PRs 35939, 38381, 37825, 38306, 38110)
Add test for multi-threaded use of ratelimiter
Adds a test to help prevent #38273 from occurring again
Related PR: https://github.com/kubernetes/kubernetes/pull/30313
PR #30313 fixed duplicate errors for invalid aggregate errors in
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/util/helpers.go
However, duplicate aggregate errors that went through
https://github.com/kubernetes/kubernetes/blob/master/pkg/util/validation/field/errors.go
were not affected by that patch.
This patch adds duplicate aggregate error checking to
`pkg/util/validation/field/errors.go`
\##### Before
`$ kubectl set env rc/idling-echo-1 test-abc=1234`
```
error: ReplicationController "idling-echo-1" is invalid:
[spec.template.spec.containers[0].env[0].name: Invalid value:
"test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName", spec.template.spec.containers[0].env[0].name:
Invalid value: "test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"]
```
`$ kubectl set env rc/node-1 test-abc=1234`
```
error: ReplicationController "idling-echo-1" is invalid:
[spec.template.spec.containers[0].env[0].name: Invalid value:
"test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"]
```
\##### After
`$ kubectl set env rc/idling-echo-1 test-abc=1234`
```
error: ReplicationController "idling-echo-1" is invalid:
[spec.template.spec.containers[0].env[0].name: Invalid value:
"test-abc": must be a C identifier (matching regex
[A-Za-z_][A-Za-z0-9_]*): e.g. "my_name" or "MyName",
spec.template.spec.containers[1].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"]
```
`$ kubectl set env rc/node-1 test-abc=1234`
```
error: ReplicationController "node-1" is invalid:
spec.template.spec.containers[0].env[0].name: Invalid value: "test-abc":
must be a C identifier (matching regex [A-Za-z_][A-Za-z0-9_]*): e.g.
"my_name" or "MyName"
```
Automatic merge from submit-queue
Added support for HOME environment variable on Windows
**What this PR does / why we need it**:
On Windows the HOME environment variable should be taken in account when trying to find the home directory.
Several tools already support the HOME environment variable, notably git-bash. It would be very convenient to have the kubernete tools (including minikube) to also support the environment variable.
The current situation
**Special notes for your reviewer**:
**Release note**:
```
```
Automatic merge from submit-queue
add a configuration for kubelet to register as a node with taints
and deprecate --register-schedulable
ref #28687#29178
cc @dchen1107 @davidopp @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 38194, 37594, 38123, 37831, 37084)
Better compat with very old iptables (e.g. CentOS 6)
Fixes reported issue with CentOS6 iptables 1.4.7 (ancient)
Older iptables expanded things like 0x4000 into 0x00004000, which defeats the
fallback "check" logic.
Fixes#37416
this is a workaround for the unmount device issue caused by gci mounter. In GCI cluster, if gci mounter is used for mounting, the container started by mounter script will cause additional mounts created in the container. Since these mounts are irrelavant to the original mounts, they should be not considered when checking the mount references. By comparing the mount path prefix, those additional mounts can be filtered out.
Plan to work on better approach to solve this issue.
This change is to only enable containerized mounter for nfs and
glusterfs types. For other types such as tmpfs, ext2/3/4 or empty type,
we should still use mount from $PATH
Automatic merge from submit-queue
Mention overflows when mistakenly call function FromInt
**What this PR does / why we need it**:
When mistakenly call this method with a value that overflows int32 will causes strange behavior in some environment (maybe in amd64 system, i'm not sure but my test shows that).
For example, call FromInt(93333333333) would result in -1155947179 and not mention overflows.
Automatic merge from submit-queue
Fix kubectl Stratigic Merge Patch compatibility
As @smarterclayton pointed out in [comment1](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290820) and [comment2](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290847) in PR #35647,
we cannot assume the API servers publish version and they shares the same version.
This PR removes all the calls of GetServerSupportedSMPatchVersion().
Change the behavior of `apply` and `edit` to:
Retrying with the old patch version, if the new version fails.
Default other usage of SMPatch to the new version, since they don't update list of primitives.
fixes#36916
cc: @pwittrock @smarterclayton
This PR is to fix the issue in converting aws volume id from mount
paths. Currently there are three aws volume id formats supported. The
following lists example of those three formats and their corresponding
global mount paths:
1. aws:///vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/vol-123456)
2. aws://us-east-1/vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
3. vol-123456
(/var/lib/kubelet/plugins/kubernetes.io/mounts/aws/us-est-1/vol-123455)
For the first two cases, we need to check the mount path and convert
them back to the original format.
Automatic merge from submit-queue
Fix strategic patch for list of primitive type with merge sementic
Fix strategic patch for list of primitive type when the patch strategy is `merge`.
Before: we cannot replace or delete an item in a list of primitive, e.g. string, when the patch strategy is `merge`. It will always append new items to the list.
This patch will generate a map to update the list of primitive type.
The server with this patch will accept either a new patch or an old patch.
The client will found out the APIserver version before generate the patch.
Fixes#35163, #32398
cc: @pwittrock @fabianofranz
``` release-note
Fix strategic patch for list of primitive type when patch strategy is `merge` to remove deleted objects.
```
Automatic merge from submit-queue
Expand documentation and TODOs in a few packages
I was reading through unfamiliar code and mostly added TODOs and expanded and clarified documentations.
There are a couple of things that are real code changes:
- Removed some unused constants
- Changed `workqueue.Parallize` to clamp the number of worker goroutines to the number of items to be processed.
- Added another unit test to `workqueue.queue`. I thought I found a bug (I was wrong) and wrote a unit test to isolate. I figure the extra test is worth keeping.
Automatic merge from submit-queue
Add Windows support to kube-proxy
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
This is the first stab at supporting kube-proxy (userspace mode) on Windows
**Which issue this PR fixes** :
fixes#30278
**Special notes for your reviewer**:
The MVP uses `netsh portproxy` to redirect traffic from `ServiceIP:ServicePort` to a `LocalIP:LocalPort`.
For the next version we are expecting to have guidance from Microsoft Container Networking team.
**Limitations**:
Current implementation does not support DNS queries over UDP as `netsh portproxy` currently only supports TCP. We are working with Microsoft to remediate this.
cc: @brendandburns @dcbw
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
```
Automatic merge from submit-queue
Restore old apiserver cert CN
This patch got lost during rebase of https://github.com/kubernetes/kubernetes/pull/35109:
- set `host@<unix-timestamp>` as CN in self-signed apiserver certs
- skip non-domain CN in getNamedCertificateMap
Automatic merge from submit-queue
Handle redirects in apiserver proxy handler
Overview:
1. Peek at the HTTP response from the proxied backend
2. If it is a redirect response (302/3), redo the request to the redirect location
3. If it's not a redirect, forward the response to the client and then set up the proxy as before
This change is required for implementing streaming requests in the Container Runtime Interface (CRI). See [design](https://docs.google.com/document/d/1OE_QoInPlVCK9rMAx9aybRmgFiVjHpJCHI9LrfdNM_s/edit).
For https://github.com/kubernetes/kubernetes/issues/29579
/cc @yujuhong
Automatic merge from submit-queue
update port validation message
Related Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1355703#c2
Port validation that results in a multi-line message:
```
* spec.template.spec.containers[0].livenessProbe.httpGet.port: Invalid value: "": must contain only alpha-numeric characters (a-z, 0-9), and hyphens (-)
* spec.template.spec.containers[0].livenessProbe.httpGet.port: Invalid value: "": must contain at least one letter (a-z)
```
suggests that ports can only be at minimum one letter.
Per [this bugzilla comment](https://bugzilla.redhat.com/show_bug.cgi?id=1355703#c2), this patch updates the second bullet point on the error message to be clearer:
```
* spec.template.spec.containers[0].livenessProbe.httpGet.port: Invalid value: "": must contain only alpha-numeric characters (a-z, 0-9), and hyphens (-)
* spec.template.spec.containers[0].livenessProbe.httpGet.port: Invalid value: "": must contain at least one letter or number (a-z, 0-9)
```
**Release note**:
```release-note
release-note-none
```
Automatic merge from submit-queue
[Federation][init-08] Refactor the tests by pulling the common utilities into a testing package.
Please review only the last commit here. This is based on PR #35864 which will be reviewed independently.
Design Doc: PR #34484
cc @kubernetes/sig-cluster-federation @nikhiljindal
Pods which are evicted by the nodecontroller due to network
malfunction, or unresponsive kubelet should be differentiated
from termination initiated by other sources. The reason/message
are consumed by kubectl to provide a better summary using get/describe.
Automatic merge from submit-queue
SELinux Overhaul
Overhauls handling of SELinux in Kubernetes. TLDR: Kubelet dir no longer has to be labeled `svirt_sandbox_file_t`.
Fixes#33351 and #33510. Implements #33951.
Automatic merge from submit-queue
Correct the article in generated documents
**What this PR does / why we need it**:
Fix the article in generated docs for "create/delete [article] [kind]"
**Which issue this PR fixes**
fixes#32305
**Special notes for your reviewer**:
None
**Release note**:
``` release-note
Correct the article in generated documents
```
For example:
"a Ingress" > "an Ingress"
When kubelet restarts, all the information about the volumes will be
gone from actual/desired states. When update node status with mounted
volumes, the volume list might be empty although there are still volumes
are mounted and in turn causing master to detach those volumes since
they are not in the mounted volumes list. This fix is to make sure only
update mounted volumes list after reconciler starts sync states process.
This sync state process will scan the existing volume directories and
reconstruct actual states if they are missing.
This PR also fixes the problem during orphaned pods' directories. In
case of the pod directory is unmounted but has not yet deleted (e.g.,
interrupted with kubelet restarts), clean up routine will delete the
directory so that the pod directoriy could be cleaned up (it is safe to
delete directory since it is no longer mounted)
The third issue this PR fixes is that during reconstruct volume in
actual state, mounter could not be nil since it is required for creating
container.VolumeMap. If it is nil, it might cause nil pointer exception
in kubelet.
Details are in proposal PR #33203
In order to be able to use new mounter library, this PR adds the
mounterPath flag to kubelet which passes the flag to the mount
interface. If flag is empty, mount uses default mount path.
Automatic merge from submit-queue
kubelet: storage: don't hang kubelet on unresponsive nfs
Fixes#31272
Currently, due to the nature of nfs, an unresponsive nfs volume in a pod can wedge the kubelet such that additional pods can not be run.
The discussion thus far surrounding this issue was to wrap the `lstat`, the syscall that ends up hanging in uninterruptible sleep, in a goroutine and limiting the number of goroutines that hang to one per-pod per-volume.
However, in my investigation, I found that the callsites that request a listing of the volumes from a particular volume plugin directory don't care anything about the properties provided by the `lstat` call. They only care about whether or not a directory exists.
Given that constraint, this PR just avoids the `lstat` call by using `Readdirnames()` instead of `ReadDir()` or `ReadDirNoExit()`
### More detail for reviewers
Consider the pod mounted nfs volume at `/var/lib/kubelet/pods/881341b5-9551-11e6-af4c-fa163e815edd/volumes/kubernetes.io~nfs/myvol`. The kubelet wedges because when we do a `ReadDir()` or `ReadDirNoExit()` it calls `syscall.Lstat` on `myvol` which requires communication with the nfs server. If the nfs server is unreachable, this call hangs forever.
However, for our code, we only care what about the names of files/directory contained in `kubernetes.io~nfs` directory, not any of the more detailed information the `Lstat` call provides. Getting the names can be done with `Readdirnames()`, which doesn't need to involve the nfs server.
@pmorie @eparis @ncdc @derekwaynecarr @saad-ali @thockin @vishh @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Improvements to CLI usability and maintainability
Improves `kubectl` from an usability perspective by
1. Fixing how we handle terminal width in help. Some sections like the flags use the entire available width, while others like long descriptions breaks lines but don't follow a well established max width (screenshot below). This PR adds a new responsive writer that will adjust to terminal width and set 80, 100, or 120 columns as the max width, but not more than that given POSIX best practices and recommendations for better readability.
![terminal_width](https://cloud.githubusercontent.com/assets/158611/19253184/b23a983e-8f1f-11e6-9bae-667dd5981485.png)
2. Adds our own normalizers for long descriptions and cmd examples which allows us better control about how things like lists, paragraphs, line breaks, etc are printed. Features markdown support. Looks like `templates.LongDesc` and `templates.Examples` instead of `dedent.Dedend`.
3. Allows simple reordering and reuse of help and usage sections.
3. Adds `verify-cli-conventions.sh` which intends to run tests to make sure cmd developers are using what we propose as [kubectl conventions](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/kubectl-conventions.md). Just a couple simple tests for now but the framework is there and it's easy to extend.
4. Update [kubectl conventions](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/kubectl-conventions.md) to use our own normalizers instead of `dedent.Dedent`.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
Improves how 'kubectl' uses the terminal size when printing help and usage.
```
@kubernetes/kubectl
Automatic merge from submit-queue
Escape special characters in jsonpath field names.
There may be a better way to do this, but this seemed like the simplest possible version.
Example: `{.items[*].metadata.labels.kubernetes\.io/hostname}`
[Resolves#31984]