Automatic merge from submit-queue
Remove static kubelet client, refactor ConnectionInfoGetter
Follow up to https://github.com/kubernetes/kubernetes/pull/33718
* Collapses the multi-valued return to a `ConnectionInfo` struct
* Removes the "raw" connection info method and interface, since it was only used in a single non-test location (by the "real" connection info method)
* Disentangles the node REST object from being a ConnectionInfoProvider itself by extracting an implementation of ConnectionInfoProvider that takes a node (using a provided NodeGetter) and determines ConnectionInfo
* Plumbs the KubeletClientConfig to the point where we construct the helper object that combines the config and the node lookup. I anticipate adding a preference order for choosing an address type in https://github.com/kubernetes/kubernetes/pull/34259
Automatic merge from submit-queue
kubelet: storage: don't hang kubelet on unresponsive nfs
Fixes#31272
Currently, due to the nature of nfs, an unresponsive nfs volume in a pod can wedge the kubelet such that additional pods can not be run.
The discussion thus far surrounding this issue was to wrap the `lstat`, the syscall that ends up hanging in uninterruptible sleep, in a goroutine and limiting the number of goroutines that hang to one per-pod per-volume.
However, in my investigation, I found that the callsites that request a listing of the volumes from a particular volume plugin directory don't care anything about the properties provided by the `lstat` call. They only care about whether or not a directory exists.
Given that constraint, this PR just avoids the `lstat` call by using `Readdirnames()` instead of `ReadDir()` or `ReadDirNoExit()`
### More detail for reviewers
Consider the pod mounted nfs volume at `/var/lib/kubelet/pods/881341b5-9551-11e6-af4c-fa163e815edd/volumes/kubernetes.io~nfs/myvol`. The kubelet wedges because when we do a `ReadDir()` or `ReadDirNoExit()` it calls `syscall.Lstat` on `myvol` which requires communication with the nfs server. If the nfs server is unreachable, this call hangs forever.
However, for our code, we only care what about the names of files/directory contained in `kubernetes.io~nfs` directory, not any of the more detailed information the `Lstat` call provides. Getting the names can be done with `Readdirnames()`, which doesn't need to involve the nfs server.
@pmorie @eparis @ncdc @derekwaynecarr @saad-ali @thockin @vishh @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Add PSP support for seccomp profiles
Seccomp support for PSP. There are still a couple of TODOs that need to be fixed but this is passing tests.
One thing of note, since seccomp is all being stored in annotations right now it breaks some of the assumptions we've stated for the provider in terms of mutating the passed in pod. I've put big warning comments around the pieces that do that to make sure it's clear and covered the rollback in admission if the policy fails to validate.
@sttts @pmorie @erictune @smarterclayton @liggitt
Automatic merge from submit-queue
convert TPR controller to posthook instead of disable flag
Converts the third party resource controller into a posthook using a loopback client instead going direct to etcd. This let's us eliminate more flags and special-casing during initialization. Also, using a client brings us closer to building this without side-effects for downstream composers.
Automatic merge from submit-queue
glog non-fatal, usually unimportant error instead of fmt
Fixes https://github.com/kubernetes/kubernetes/issues/34977
This particular message isn't usually important, so demote it to glog.
Automatic merge from submit-queue
make version an explicit choice so zero config and customized work
Makes `/version` key off of setting the version. This allows composers to add a version that is correct.
Automatic merge from submit-queue
Fix lint error in remotecommand.go
Before:
```
pkg/client/unversioned/remotecommand/remotecommand.go:101:9: should omit type http.RoundTripper from declaration of var rt; it will be inferred from the right-hand side
Please fix the above errors. You can test via "golint" and commit the result.
```
Don't store Gluster SotrageClass parameters in annotations, it's insecure.
Instead, expect that there is the StorageClass available at the time
when it's needed by Gluster deleter.
Automatic merge from submit-queue
Improvements to CLI usability and maintainability
Improves `kubectl` from an usability perspective by
1. Fixing how we handle terminal width in help. Some sections like the flags use the entire available width, while others like long descriptions breaks lines but don't follow a well established max width (screenshot below). This PR adds a new responsive writer that will adjust to terminal width and set 80, 100, or 120 columns as the max width, but not more than that given POSIX best practices and recommendations for better readability.
![terminal_width](https://cloud.githubusercontent.com/assets/158611/19253184/b23a983e-8f1f-11e6-9bae-667dd5981485.png)
2. Adds our own normalizers for long descriptions and cmd examples which allows us better control about how things like lists, paragraphs, line breaks, etc are printed. Features markdown support. Looks like `templates.LongDesc` and `templates.Examples` instead of `dedent.Dedend`.
3. Allows simple reordering and reuse of help and usage sections.
3. Adds `verify-cli-conventions.sh` which intends to run tests to make sure cmd developers are using what we propose as [kubectl conventions](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/kubectl-conventions.md). Just a couple simple tests for now but the framework is there and it's easy to extend.
4. Update [kubectl conventions](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/kubectl-conventions.md) to use our own normalizers instead of `dedent.Dedent`.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
Improves how 'kubectl' uses the terminal size when printing help and usage.
```
@kubernetes/kubectl
Automatic merge from submit-queue
Avoid unnecessary allocation
This is supposed to avoid unnecessary memory allocations.
PodToSelectableFields seems to be the biggest contributor to memory allocations:
```
Showing top 10 nodes out of 247 (cum >= 83166442)
flat flat% sum% cum cum%
1796823715 31.09% 31.09% 1796823715 31.09% k8s.io/kubernetes/pkg/registry/core/pod.PodToSelectableFields
530856268 9.19% 40.28% 530856268 9.19% k8s.io/kubernetes/pkg/storage.NamespaceKeyFunc
241505351 4.18% 44.46% 241505351 4.18% reflect.unsafe_New
...
```
Automatic merge from submit-queue
+optional tag for OpenAPI spec
OpenAPI rely on "omitempty" json tag to determine if a field is optional or not. This change will add "+optional" tag to all fields with "omitempty" json tag and support the tag in OpenAPI spec generator.
Automatic merge from submit-queue
Escape special characters in jsonpath field names.
There may be a better way to do this, but this seemed like the simplest possible version.
Example: `{.items[*].metadata.labels.kubernetes\.io/hostname}`
[Resolves#31984]
Before:
pkg/client/unversioned/remotecommand/remotecommand.go:101:9: should omit type http.RoundTripper from declaration of var rt; it will be inferred from the right-hand side
Please fix the above errors. You can test via "golint" and commit the result.
Automatic merge from submit-queue
controller: set minReadySeconds in deployment's replica sets
* Estimate available pods for a deployment by using minReadySeconds on
the replica set.
* Stop requeueing deployments on pod events, superseded by following the
replica set status.
* Cleanup redundant deployment utilities
Fixes https://github.com/kubernetes/kubernetes/issues/26079
@kubernetes/deployment ptal
Automatic merge from submit-queue
make function ReadDockerConfigFile more flexible
In our code, the public function `ReadDockerConfigFile` looks like not enough flexible:
when I want to use this function to get docker config info from a specific path, I have to call `SetPreferredDockercfgPath`, and then the setting preferredPath will be valid in function `ReadDockerConfigFile`. I know in our code, we call `SetPreferredDockercfgPath` in one place ,then call `ReadDockerConfigFile` in another place, it was not in same context. But it looks like not thread safety.
I think if user who use our code want to get docker config from a specific path, it is reasonable to call directly `ReadDockerConfigFile ` with a dockerconfigPath argument, and it can avoid some scenarios that thread is not safety .
I add a test case for this function.
Automatic merge from submit-queue
kubectl: apply prune should fallback to basic delete when a resource has no reaper
Fixes#34790
cc @kubernetes/kubectl @MrHohn
Automatic merge from submit-queue
Allow callers to bypass cmdutil.CheckError() logging
**Release note**:
```release-note
release-note-none
```
This patch is originally from:
https://github.com/kubernetes/kubernetes/pull/25451 (eedb67a30d)
Simplifies code where clients are writing their own errors, and want to
terminate with an exit code.
cc @smarterclayton
Automatic merge from submit-queue
attempt to use discovery RESTMapper and fallback if we can't
Updates `kubectl` to always attempt discovery regardless of server version. This is needed to extension servers.
Automatic merge from submit-queue
you can be authorized and have a failure
Fix the authorization filter to allow you through and to avoid showing internal errors to users when authorization failed.
Automatic merge from submit-queue
Update run flags to point to generators docs
@janetkuo you've requested that in https://github.com/kubernetes/kubernetes/pull/32484#issuecomment-246840562 I'm opening this PR but like you I don't like the length of the descriptions already. The other problem with this is that there's not clean docs for a user to figure out what the generators are. I've stumbled upon this several times and I always found myself looking into the code :/ How about adding new flag/subcommand that will give you more information about generators and we'd move all those `--restart` and `--generator` information into specific generator info and present at the top level only general information?
Automatic merge from submit-queue
Fix edge case in qos evaluation
If a pod has a container C1 and C2, where sum(C1.requests, C2.requests) equals (C1.Limits), the code was reporting that the pod had "Guaranteed" qos, when it should have been Burstable.
/cc @vishh @dchen1107
Automatic merge from submit-queue
Add unit test for bad ReclaimPolicy and valid ReclaimPolicy in /pkg/api/validation
unit tests for validation.go regarding PersistentVolumeReclaimPolicy (bad value and good value)
see PR: #30304
Automatic merge from submit-queue
default serializer
Everyone uses the same serializer. Set it as the default, but still allow someone to take control if they want.
Found while trying to use genericapiserver for composition.
Automatic merge from submit-queue
Use same SSH tunnel as kubelet
Provides a secure workaround for #11816 by having kube-apiserver use the same SSH tunnel as the kubelet it is trying to connect to. Use in conjunction with iptables or kubelet `--address=127.0.0.1`. The latter will break heapster.
Will fallback to random behavior if the tunnel cannot be found.
Automatic merge from submit-queue
Add global timeout flag
**Release note**:
```release-note
Add a new global option "--request-timeout" to the `kubectl` client
```
UPSTREAM: https://github.com/kubernetes/client-go/pull/10
This patch adds a global timeout flag (viewable with `kubectl -h`) with
a default value of `0s` (meaning no timeout).
The timeout value is added to the default http client, so that zero
values and default behavior are enforced by the client.
Adding a global timeout ensures that user-made scripts won't hang for an
indefinite amount of time while performing remote calls (right now, remote
calls are re-tried up to 10 times when each attempt fails, however, there is
no option to set a timeout in order to prevent any of these 10 attempts from
hanging indefinitely).
**Example**
```
$ kubectl get pods # no timeout flag set - default to 0s (which means no
timeout)
NAME READY STATUS RESTARTS AGE
docker-registry-1-h7etw 1/1 Running 1 2h
router-1-uv0f9 1/1 Running 1 2h
$ kubectl get pods --request-timeout=0 # zero means no timeout no timeout flag set
NAME READY STATUS RESTARTS AGE
docker-registry-1-h7etw 1/1 Running 1 2h
router-1-uv0f9 1/1 Running 1 2h
$kubectl get pods --request-timeout=1ms
Unable to connect to the server: net/http: request canceled while
waiting for connection (Client.Timeout exceeded while awaiting headers)
```
Automatic merge from submit-queue
Pass whole PVC to provisioner plugins
Gluster provisioner is interested in namespace of PVCs that are being provisioned and I don't want to add at as a new field in `volume.VolumeOptions` - it would contain almost whole PVC.
Let's rework `VolumeOptions` and pass direct reference to PVC there instead of some "interesting" fields and let the provisioner to pick information it is interested in.
There was lot of refactoring in volume plugins to apply this change (too many plugins), however the logic is simple and it's all the same in all plugins.
@rootfs @humblec
Automatic merge from submit-queue
honor SAR verb
Verbs on non-resource requests were dropped. This results in always being denied for all the authorizers I know of, so no unintended exposure, but its still ugly. We should probably pick.
@liggitt I would have expected the kubelet work to get stuck on this.
Automatic merge from submit-queue
Increase buffer sizes in cacher for watchers interested in all/many o…
Should increase throughput of cacher in large clusters.
Automatic merge from submit-queue
Add support for admission controller based on namespace node selectors.
This work is to upstream openshift's project node selectors based admission controller.
Fixes https://github.com/kubernetes/kubernetes/issues/17151
Automatic merge from submit-queue
Add 'kubectl set resources'
Add "kubectl set resources" for easier updating container memory/cpu limits/requests (for pods or resources with pod templates).
**Usage**
`kubectl set resources (-f FILENAME | TYPE NAME) ([--limits=LIMITS & --requests=REQUESTS])`
**Examples**
Set a deployments nginx container cpu limits to "200m and memory to "512Mi"
`kubectl set resources deployment nginx -c=nginx --limits=cpu=200m,memory=512Mi`
Set the limit and requests for all containers in nginx
`kubectl set resources deployment nginx --limits=cpu=200m,memory=512Mi --requests=cpu=100m,memory=256Mi`
Print the result (in yaml format) of updating nginx container limits from a local, without hitting the server
`kubectl set resources -f path/to/file.yaml --limits=cpu=200m,memory=512Mi --local -o yaml`
Remove limits on containers in nginx
`kubectl set resources deployment nginx --limits=cpu=0,memory=0`
Ref: https://github.com/kubernetes/kubernetes/issues/21648
EDIT: removed the '--remove' flag example
Automatic merge from submit-queue
Support trust id as a scope in the OpenStack authentication logic
This patch allows the use of Kubernetes with Keystone trust delegation to avoid passing the user credentials in clear inside the config file : a specific user with delegated rights can be created and used instead.
Automatic merge from submit-queue
NodeController waits for informer sync before doing anything
cc @lavalamp @davidopp
```release-note
NodeController waits for full sync of all it's informers before taking any action.
```
rename the variable
make parameter more flexible
handle docker config file path
use a single set of paths
delete debug print
gofmt
delete the empty line
comment is not correct
move the comment to the correct place
keep original signature
godoc
Automatic merge from submit-queue
Run rbac authorizer from cache
RBAC authorization can be run very effectively out of a cache. The cache is a normal reflector backed cache (shared informer).
I've split this into three parts:
1. slim down the authorizer interfaces
1. boilerplate for adding rbac shared informers and associated listers which conform to the new interfaces
1. wiring
@liggitt @ericchiang @kubernetes/sig-auth
This patch adds a global timeout flag (viewable with `kubectl -h`) with
a default value of `0s` (meaning no timeout).
The timeout value is added to the default http client, so that zero
values and default behavior are enforced by the client.
**Example**
```
$ kubectl get pods # no timeout flag set - default to 0s (which means no
timeout)
NAME READY STATUS RESTARTS AGE
docker-registry-1-h7etw 1/1 Running 1 2h
router-1-uv0f9 1/1 Running 1 2h
$ kubectl get pods --timeout=0 # zero means no timeout no timeout flag set
NAME READY STATUS RESTARTS AGE
docker-registry-1-h7etw 1/1 Running 1 2h
router-1-uv0f9 1/1 Running 1 2h
$kubectl get pods --timeout=1ms
Unable to connect to the server: net/http: request canceled while
waiting for connection (Client.Timeout exceeded while awaiting headers)
```
Automatic merge from submit-queue
Improve edit experience
Improve edit experience a bit according [#26050(comment)](https://github.com/kubernetes/kubernetes/issues/26050#issuecomment-246089751)
> a) always go back to the editor
b) always retain what I hand-edited, even if that has to be in comments
@janetkuo
Add a way to set resource limits/requests on running pods
Ref: https://github.com/kubernetes/kubernetes/issues/21648
I squashed the commits to make rebasing easier
Change log:
- fixed a typo that caused the command to be run with kubectl set set instead of the correct kubectl set limit
- added a ResourcesWithPodTemplates to pkg/kubectl/cmd/util/factory.go
instead of hardcoding these resources move there description all in one place
- Fixing some of the flow control in kubectl set limit
- update the help info
- changed the name of ResourcesWithPodTemplates to ResourcesWithPodSpecs to more accuratly describe what it is doing
and changed the variable names to lower case to conform to go's variable naming convention
- changing the name of the command from 'set limit' to 'set resources'
- Adding the new file pkg/kubectl/cmd/set/set_resources.go
- changes to the test cases to reflect the change from 'kubectl set limit' to 'kubectl set resources'
- comment removed
- adding the man page to the git repository attempting to fix Jenkins tests
- adding the user guide
- fixed a few typos
- typo in hack/cmd-test.sh
- implamenting suggestions for command help text
- adding the dry-run flag
- removing the "remove" option in favor of zeroing out request/limits in order to remove them
- changed limits/requests to requests/limit
- changing ResourcesWithPodSpec
- updated generated docs and removed whitespace
- change priint on success message from "resource limits/requests updated" to "resource requirements updated"
- minor rebasing issues - 'hack/test-cmd.sh' now passes
- cmdutil.PrintSuccess added another argument
- fixing mungedocs failure
- removed whitespace from hack/make-rules/test-cmd.sh and an erroneous entry from pkg/cloudprovider/providers/openstack/MAINTAINERS.md
- fixed typo in Short: field of the cobra command
- rebased
- Creating a new factory in the ResourcesWithPodSpecs() so that the testing will pass
- changing ResourcesWithPodSpecs, it doesn't need to be a method of factory
Automatic merge from submit-queue
Log more information on pod status updates
Also bump the logging level to V2 so that we can see them in a non-test
cluster.
Automatic merge from submit-queue
azure: lower log priority for skipped nic update message
**What this PR does / why we need it**: Very minor, just wanted to remove some log noise I introduced in #34526.
I chose `V(3)` since it aligns with the other nicupdate message printed out here, and will be hidden for the usual default of `--v=2`.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
NONE
```
Automatic merge from submit-queue
Merge string flag into util flag
Continuing my work on https://github.com/kubernetes/kubernetes/issues/15634
This refactoring is expected to be completely finished and then I will add a verify scripts in `hack`
Automatic merge from submit-queue
split genericapiserver configuration apart so that you can run withou…
…t flag options
@dims Take a look at this re-slicing of the `genericapiserver.Config` creation. I think this helps composers overall and resolves the chicken/egg problem you were having.
Automatic merge from submit-queue
Assign options.Err in "set image"
**What this PR does / why we need it**:
There is a usage of options.Err in a Printf, but this option is never set.
This patch passes the stderr into the command and assigns the option correctly.
**Which issue this PR fixes**
fixes#34433
**Special notes for your reviewer**:
None
**Release note**:
```release-note
```
Automatic merge from submit-queue
add UpdateRuntimeConfig interface
Expose UpdateRuntimeConfig interface in RuntimeService for kubelet to pass a set of configurations to runtime. Currently it only takes PodCIDR.
The use case is for kubelet to pass configs to runtime. Kubelet holds some config/information which runtime does not have, such as PodCIDR. I expect some of kubelet configurations will gradually move to runtime, but I believe cases like PodCIDR, which dynamically assigned by k8s master, need to stay for a while.
Automatic merge from submit-queue
Add e2e tests for storageclass
- test pd-ssd and pd-standard on GCE,
- test all four volume types and encryption on AWS
- test just the default volume type on OpenStack (right now, there is no API
to get list of them)
These tests are quite slow, e.g. there are two tests on AWS that has to run mkfs.ext4 on 500 GB magnetic drive with low IOPS, which takes ~3-4 minutes each.
There is a usage of options.Err in a Printf, but this option is never set.
This patch passes the stderr into the command and assigns the option correctly.
Automatic merge from submit-queue
remove call to compinit in zsh completion output
**Release note**:
```release-note
release-note-none
```
Fixes: https://github.com/kubernetes/kubernetes/issues/32029
Fixes:
https://github.com/kubernetes/kubernetes/issues/27538#issuecomment-238574035
The zsh completion output makes a call to "compinit" which causes the
zsh completion system to re-initialize every time `<root_cmd> completion zsh`
is sourced, overwriting any settings already applied to other commands.
This in-turn caused other commands' completions to break (such as git,
gcloud, vim) causing an error "function definition file not found" to
be returned any time a tab-completion was attempted.
This patch removes the call to `compinit` in the zsh completion output,
causing no behavioral changes to the existing `completion` command, but
fixing any issues that were caused after sourcing its output.
This allows security groups to be created and attached to the neutron
port that the loadbalancer is using on the subnet.
The security group ID that is assigned to the nodes needs to be
provided, to allow for traffic from the loadbalancer to the nodePort
to be refelected in the rules.
This adds two config items to the LoadBalancer options -
ManageSecurityGroups (bool)
NodeSecurityGroupID (string)
Automatic merge from submit-queue
kubeadm implement preflight checks
Checks that user running kubeamd init and join is root and will only execute
command if user is root. Moved away from using kubectl error handling to
having kubeadm handle its own errors. This should allow kubeadm to have
more meaningful errors, exit codes, and logging for specific kubeadm use
cases.
fixes#33908
Fixes: https://github.com/kubernetes/kubernetes/issues/32029
Fixes:
https://github.com/kubernetes/kubernetes/issues/27538#issuecomment-238574035
The zsh completion output makes a call to "compinit" which causes the
zsh completion system to re-initialize every time `<root_cmd> completion zsh`
is sourced, overwriting any settings already applied to other commands.
This in-turn caused other commands' completions to break (such as git,
gcloud, vim) causing an error "function definition file not found" to
be returned any time a tab-completion was attempted.
This patch removes the call to `compinit` in the zsh completion output,
causing no behavioral changes to the existing `completion` command, but
fixing any issues that were caused after sourcing its output.
Automatic merge from submit-queue
Change legacy API resource registration
Updates the legacy API resource registration to emphasize its different-ness and to simplify supporting objects. The option has to remain in the genericapiserverconfig for multiple prefixes to enable cases where composers/extenders had composed additional groupless APIs. See OpenShift as an example.
However this is now transparent to "normal" composers.
@ncdc since sttts is out.
Add skip-preflight-checks to known flags.
Fix bug with preflight checks not returning system is-active as errors.
Fix error handling to use correct function.
- test pd-ssd and pd-standard on GCE,
- test all four volume types on AWS
- test just the default volume type on OpenStack (right now, there is no API
to get list of them)
Includes checks for verifying services exist and are enabled, ports are
open, directories do not exist or are empty, and required binaries are
in the path.
Checks that user running kubeamd init and join is root and will only execute
command if user is root. Moved away from using kubectl error handling to
having kubeadm handle its own errors. This should allow kubeadm to have
more meaningful errors, exit codes, and logging for specific kubeadm use
cases.
Automatic merge from submit-queue
Handle DeletedFinalStateUnknown in NodeController
Fix#34692
```release-note
Fix panic in NodeController caused by receiving DeletedFinalStateUnknown object from the cache.
```
cc @davidopp
Automatic merge from submit-queue
WantsAuthorizer admission plugin support
The next step of PSP admission is to be able to limit the PSPs used based on user information. To do this the admission plugin would need to make authz checks for the `user.Info` in the request. This code allows a plugin to request the injection of an authorizer to allow it to make the authz checks.
Note: this could be done with a SAR, however since admission is running in the api server using the SAR would incur an extra hop vs using the authorizer directly.
@deads2k @derekwaynecarr
* Estimate available pods for a deployment by using minReadySeconds on
the replica set.
* Stop requeueing deployments on pod events, superseded by following the
replica set status.
* Cleanup redundant deployment utilities
Automatic merge from submit-queue
Revert "Error out when any RS has more available pods then its spec r…
Reverts https://github.com/kubernetes/kubernetes/pull/29808
The PR is wrong because we can have more available pods than desired every time we scale down.
@kubernetes/deployment ptal
Automatic merge from submit-queue
azure: filter load balancer backend nodes to PrimaryAvailabilitySet (if set)
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
- Adds a new field (`PrimaryAvailabilitySetName`) to the Azure CloudProvider config struct
- If the field is set, only machines who are in that availabilitySet are added to the load balancer backend pool.
This is required to:
- Support more than 100 nodes in Azure (only can have 100 nodes per availability set)
- Support multiple availability sets per cluster (An Azure L4 LoadBalancer can only be pointed at nodes in a single availability set)
Without this PR, or if the field is **not** set in a cluster that contains two availabilitysets, then the following is observed:
- Azure resources are created (LB, LB rules, NSG rules, public IP)
- Azure throws errors when trying to add nodes from the "other" availability set
- The service winds up exposed to the outside world (if you manually retrieve the public ip from Azure API)
- Kubernetes controller-manager's service loop keeps retrying forever because it never finishes fully successfully
- The "external ip" property field is never updated.
**Which issue this PR fixes**: Fixes#34293
**Unknowns**:
- Naming convention: `LoadBalancedAvailabilitySet` might be more descriptive than `PrimaryAvailabilitySet`, but is also a misnomer since `kube-proxy` will still end up routing requests to all relevant nodes.
- Is it worth trying to be "smart" about it in the case the user hasn't set this field in the config? Save the first availability set name and try not to add any nodes that aren't also in that one? It may simply be better to just let this fail so the user has to choose the right setting for their use-case.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
azure: add PrimaryAvailabilitySet to config, only use nodes in that set in the loadbalancer pool
```
CC: @brendandburns, @anhowe
Automatic merge from submit-queue
return warning on empty list result in kubectl get
**Release note**:
```release-note
NONE
```
The current default behavior of `kubectl get` is to return an empty
output when there are no resources to display. This patch improves
usability by returning a warning through stderr in the case of an empty
list.
##### Before
`$ kubectl get pods`
- *empty output*
##### After
`$ kubectl get pods`
```
There are no resources to display.
```
Automatic merge from submit-queue
Allow kuberuntime to get network namespace for not ready sandboxes
Kubelet calls TearDownPod to clean up the network resources for a pod sandbox.
TearDownPod relies on GetNetNS to retrieve network namespace, and the current
implementation makes this impossible for not-ready sandboxes. This change
removes the unnecessary filter to fix this issue.
Automatic merge from submit-queue
Generate unique Operation IDs for root OpenAPI spec
This PR adds a customization method GetOperationID to OpenAPI spec generation and then use it to make sure root spec has unique operation IDs by mostly adding GroupVersion to the start of operation ID.
Automatic merge from submit-queue
Add PVC storage to LimitRange
This PR adds the ability to add a LimitRange to a namespace that enforces min/max on `pvc.Spec.Resources.Requests["storage"]`.
@derekwaynecarr @abhgupta @kubernetes/sig-storage
Examples forthcoming.
```release-note
pvc.Spec.Resources.Requests min and max can be enforced with a LimitRange of type "PersistentVolumeClaim" in the namespace
```
Automatic merge from submit-queue
kubeadm: fix conversion macros and add kubeadm to round trip testing
Tests are probably broken but I'll fix. @jbeda this probably fixes your change unless we decide we need generated deep copies or conversions.
@kubernetes/sig-cluster-lifecycle
Kubelet calls TearDownPod to clean up the network resources for a pod sandbox.
TearDownPod relies on GetNetNS to retrieve network namespace, and the current
implementation makes this impossible for not-ready sandboxes. This change
removes the unnecessary filter to fix this issue.
Automatic merge from submit-queue
CRI: Image pullable support in dockershim
For #33189.
The new test `ImageID should be set to the manifest digest (from RepoDigests) when available` introduced in #33014 is failing, because:
1) `docker-pullable://` conversion is not supported in dockershim;
2) `kuberuntime` and `dockershim` is using `ListImages with image name filter` to check whether image presents. However, `ListImages` doesn't support filter with `digest`.
This PR:
1) Change `kuberuntime.IsImagePresent` to use `runtime.ImageStatus` and `dockershim.InspectImage` instead. ***Notice an API change: `ImageStatus` should return `(nil, nil)` for non-existing image.***
2) Add `docker-pullable://` support.
3) Fix `RemoveImage` in dockershim https://github.com/kubernetes/kubernetes/pull/29316.
I've tried myself, the test can pass now.
@yujuhong @feiskyer @yifan-gu
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Allow 'pod/' prefix in pod name for 'kubectl exec'
This PR adds ability to provide pod name with 'pod/' prefix for 'kubectl exec' command. Pod names without 'pod/' prefix are still allowed.
Fixes#24225
Automatic merge from submit-queue
Fix missleading comment
**What this PR does / why we need it**: It just fixes misleading comment. It took me some time to figure out real behavior.
Automatic merge from submit-queue
Refactor kubectl edit cmd
Refactor `kubectl edit` command.
#33250 will be based on this PR for easier review.
Will need to rebase after #33973 merges.
Gluster provisioner is interested in pvc.Namespace and I don't want to add
at as a new field in VolumeOptions - it would contain almost whole PVC.
Let's pass direct reference to PVC instead and let the provisioner to pick
information it is interested in.
Automatic merge from submit-queue
Avoid unnecessary decoding in etcd3 client
Ref https://github.com/kubernetes/kubernetes/issues/33653
With the "Cacher" layer in Kubernetes, most of the watches processed by "pkg/storage/etcd3/watcher.go" have "filter = Everything()". That said, we generally don't need to decode previous value of the object (which is used only to get the value of filter of it), because we already know it will be true.
This PR is basically fixing this problem.
Should be merged after https://github.com/kubernetes/kubernetes/pull/34246
Automatic merge from submit-queue
Copy finalizers from template spec to pod.
**What this PR does / why we need it**: The PodTemplateSpec has a finalizers field whose contents are not copied over to a pod during creation.
Automatic merge from submit-queue
CRI: Change dockershim to use UnixNano instead of Unix.
Fixes https://github.com/kubernetes/kubernetes/issues/34492.
This PR changes the dockershim to use `UnixNano` instead of `Unix` to return timestamp in nanoseconds.
@yujuhong
Automatic merge from submit-queue
clean api server cruft
Some cruft has developed over refactors. Remove that cruft.
@liggitt probably last in the chain so far
Automatic merge from submit-queue
convert deployment controller to shared informers
Converts the deployment controller to shared informers.
@kargakis I think you've been in here. Pretty straight forward swap.
Fixes#27687
Automatic merge from submit-queue
Add version cache for cri APIVersion
ref https://github.com/kubernetes/kubernetes/issues/29478
1. Added a version cache for `APIVersion()` by using object cache., with ttl=1 min
2. Leaving `Version()` as it is today
Automatic merge from submit-queue
Update godeps for libcontainer+cadvisor
Needed to unblock more progress on pod cgroup.
/cc @vishh @dchen1107 @timstclair
Automatic merge from submit-queue
etcd3: use PrevKV to remove additional get
ref: #https://github.com/kubernetes/kubernetes/issues/33653
We are trying to test using PrevKV feature and see if it improves performance.
In order to test this, we will need etcd v3.1 (alpha) image.
Blockers:
- update gcr.io image (version v3.0.12)
Automatic merge from submit-queue
Kubelet: Use RepoDigest for ImageID when available
```release-note
Use manifest digest (as `docker-pullable://`) as ImageID when available (exposes a canonical, pullable image ID for containers).
```
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead. Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.
This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
Related to #32159
Automatic merge from submit-queue
Add default cluster role bindings
Add default cluster roles bindings to rbac bootstrapping. Also adds a case for allowing escalation when you have no authenticator.
@liggitt I expect you may need to make peace with this.
Automatic merge from submit-queue
Enable service account signing key rotation
fixes#21007
```release-note
The kube-apiserver --service-account-key-file option can be specified multiple times, or can point to a file containing multiple keys, to enable rotation of signing keys.
```
This PR enables the apiserver authenticator to verify service account tokens signed by different private keys. This can be done two different ways:
* including multiple keys in the specified keyfile (e.g. `--service-account-key-file=keys.pem`)
* specifying multiple key files (e.g. `--service-account-key-file current-key.pem --service-account-key-file=old-key.pem`)
This is part of enabling signing key rotation:
1. update apiserver(s) to verify tokens signed with a new public key while still allowing tokens signed with the current public key (which is what this PR enables)
2. give controllermanager the new private key to sign new tokens with
3. remove old service account tokens (determined by verifying signature or by checking creationTimestamp) once they are no longer in use (determined using garbage collection or magic) or some other algorithm (24 hours after rotation, etc). For the deletion to immediately revoke the token, `--service-account-lookup` must be enabled on the apiserver.
4. once all old tokens are gone, update apiservers again, removing the old public key.
Automatic merge from submit-queue
openstack: Support config-drive and improve CurrentNodeName, GetZone
This PR adds support for fetching local instance metadata via config-drive (as well as querying metadata service), and surfaces some additional metadata information (from either source):
- `CurrentNodeName` now returns the OpenStack instance name, rather than the current hostname (they might not be the same)
- `GetZone` includes availability zone label in `FailureDomain`
Thanks to @kiall for a WIP implementation of the latter.
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead. Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.
This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
Previously, the `InspectImage` method of the Docker interface expected a
"pullable" image ref (name, tag, or manifest digest). If you tried to
inspect an image by its ID (config digest), the inspect would fail to
validate the image against the input identifier. This commit changes
the original method to be named `InspectImageByRef`, and introduces a
new method called `InspectImageByID` which validates that the input
identifier was an image ID.
Automatic merge from submit-queue
Use nodeutil.GetHostIP consistently when talking to nodes
Most of our communications from apiserver -> nodes used
nodutil.GetNodeHostIP, but a few places didn't - and this meant that the
node name needed to be resolvable _and_ we needed to populate valid IP
addresses.
```release-note
The apiserver now uses addresses reported by the kubelet in the Node object's status for apiserver->kubelet communications, rather than the name of the Node object. The address type used defaults to `InternalIP`, `ExternalIP`, and `LegacyHostIP` address types, in that order.
```
Automatic merge from submit-queue
refactor controller as posthook when configured and enabled
builds on https://github.com/kubernetes/kubernetes/pull/33785 .
Models the bootstrap controller as a PostHook which is added when its API group is available and its configured.
@liggitt Stefan is out for a while.
Automatic merge from submit-queue
Match GroupVersionKind against specific version
Currently when multiple GVK match a specific kind in `KindForGroupVersionKinds` only the first will be matched, which not necessarily will be the correct one. I'm proposing to extend this to pick the best match, instead.
Here's my problematic use-case, of course it involves ScheduledJobs 😉:
I have a `GroupVersions` with `batch/v1` and `batch/v2alpha1` in that order. I'm calling `KindForGroupVersionKinds` with kind `batch/v2alpha1 ScheduledJob` and that currently results this matching first `GroupVersion`, instead of picking more concrete one. There's a [clear description](ee77d4e6ca/pkg/api/unversioned/group_version.go (L183)) why it is on single `GroupVersion`, but `GroupVersions` should pick this more carefully.
@deads2k this is your baby, wdyt?
Automatic merge from submit-queue
Add sandbox gc minage
Fixes https://github.com/kubernetes/kubernetes/issues/34272.
Fixes https://github.com/kubernetes/kubernetes/issues/33984.
This PR:
1) Change the `GetPodStatus` to get statuses of all containers in a pod instead of only containers belonging to existing sandboxes. This is because sandbox may be removed by GC or by users, kubelet should be able to deal with this case.
2) Change the CRI comment to clarify the timestamp unit (nanosecond).
2) Add MinAge for sandbox GC Policy.
@yujuhong @feiskyer @yifan-gu
/cc @kubernetes/sig-node
Automatic merge from submit-queue
controller: save older revisions for Deployment's replica sets
@jwforres the only usable way I could find for multiple old revisions for a single replica set is to stuff them as comma-separated values.
@kubernetes/deployment this retains old revisions served by a replica set inside an annotation.
Fixes https://github.com/kubernetes/kubernetes/issues/33844
Automatic merge from submit-queue
Make kubectl label and annotate more consistent
**What this PR does / why we need it**:
This makes the label and annotate cmd files more consistent which should help with code maintenance.
Some of the main changes:
- add dryrun to annotate (can push this in a different PR if requested)
- use Complete(), Validate() and RunX()
- don't place dynamic variables in the options (only user options and args)
- call the NewBuilder() in the Run function.
**Which issue this PR fixes**
fixes#34151
**Special notes for your reviewer**:
Note: you *can* now diff the two files and the changes make sense.
**Release note**:
```release-note
kubectl annotate now supports --dry-run
```
Automatic merge from submit-queue
kubectl: Add external ip information to node when '-o wide' is used
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**Which issue this PR fixes**: fixes#33457
**Special notes for your reviewer**:
1. Is it possible to expose multiple external ips on the node?
2. Should this be supported or first one be taken like now?
3. Should more node address types be shown?
I'll add tests if solution is approved.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
kubectl: Add external ip information to node when '-o wide' is used
```
Automatic merge from submit-queue
remove testapi.Default.GroupVersion
I'm going to try to take this as a series of mechanicals. This removes `testapi.Default.GroupVersion()` and replaces it with `registered.GroupOrDie(api.GroupName).GroupVersion`.
@caesarxuchao I'm trying to see how much of `pkg/api/testapi` I can remove.
Automatic merge from submit-queue
OpenStack LBaaSV2: EnsureLoadBalancer now updates instead of recreates existing LBs
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**: Current LBaaSV2 integration recreates existing LBs and causes service downtime and floating ip rotation. New implementation updates LBs without service downtime or any ip rotation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#32794
**Special notes for your reviewer**: I really need this before we can move to production with kubernetes. Getting this to v1.4 would be really great. I have performed plenty of testing; lb and listener creation, port changing and listener update, multiple listeners for multi-port LBs, and deletion. Seems to work flawlessly.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
```
Automatic merge from submit-queue
Add cgroup-driver and cgroups-per-qos flags to kubelet
Add the flags needed to support pod-level cgroups to kubelet.
/cc @vishh @dchen1107 @dubstack
Automatic merge from submit-queue
Revert "Add kubelet awareness to taint tolerant match caculator."
Reverts kubernetes/kubernetes#26501
Original PR was not fully reviewed by @kubernetes/sig-node
cc/ @timothysc @resouer
Automatic merge from submit-queue
Kubelet: Use RepoDigest for ImageID when available
**Release note**:
```release-note
Use manifest digest (as `docker-pullable://`) as ImageID when available (exposes a canonical, pullable image ID for containers).
```
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead. Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.
This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
Related to #32159
Automatic merge from submit-queue
Add kubelet awareness to taint tolerant match caculator.
Add kubelet awareness to taint tolerant match caculator.
Ref: #25320
This is required by `TaintEffectNoScheduleNoAdmit` & `TaintEffectNoScheduleNoAdmitNoExecute `, so that node will know if it should expect the taint&tolerant
Automatic merge from submit-queue
fix pod eviction storage
Refactor pod eviction storage to remove the tight order coupling of the storage. This also gets us ready to deal with cases where API groups are not co-located on the same server, though the particular client being used would assume a proxy.
Automatic merge from submit-queue
Federated ingress unit test fix
Adds RaceFreeFakeWatcher to watch.go. If this struct succeeds with helping IngressController i will try to move all FakeWatchers to this implementation.
cc: @quinton-hoole @madhusudancs @wojtek-t @kubernetes/sig-cluster-federation
Should help with #32685.
Automatic merge from submit-queue
Discovery client retry when failed to discovery resrouces
Fix#32308
`ServerPreferredNamespacedResources()` fails if in the middle of its execution, the TPR e2e tests change the supported resources on the API server. This PR let the e2e test framework retry `ServerPreferredNamespacedResources()`.
cc @lavalamp
Automatic merge from submit-queue
Ignore troublesome paths that cause coverage to fail
**What this PR does / why we need it**:
`KUBE_COVER=y make check` currently fails, this patch fixes it.
**Which issue this PR fixes**
fixes#31691
**Special notes for your reviewer**:
None
**Release note**:
```release-note
NONE
```
This avoids the whole command failing because of errors like the following:
```
# cover k8s.io/kubernetes/pkg/client/restclient
cover: internal error: block 268 overlaps block 270
```
Automatic merge from submit-queue
Refactor: separate KubeletClient & ConnectionInfoGetter concepts
KubeletClient implements ConnectionInfoGetter, but it is not a complete
implementation: it does not set the kubelet port from the node record,
for example.
By renaming the method so that it does not implement the interface, we
are able to cleanly see where the "raw" GetConnectionInfo is used (it is
correct) and also have go type-checking enforce this for us.
This is related to #25532; I wanted to satisfy myself that what we were doing there was correct, and I wanted also to ensure that the compiler could enforce this going forwards.
Automatic merge from submit-queue
Update client config invalid option errors to be more specific
This patch adds better error handling for cases where a global option (such as --context or --cluster) causes an invalid config to be returned.
```release-note
release-note-none
```
Automatic merge from submit-queue
Add node event for container/image GC failure
Follow up to #31988. Add an event for a node when container/image GC fails.
Automatic merge from submit-queue
Fix nil pointer issue when getting metrics from volume mounter
Currently it is possible that the mounter object stored in Mounted
Volume data structure in the actual state of kubelet volume manager is
nil if this information is recovered from state sync process. This will
cause nil pointer issue when calculating stats in volume_stat_calculator.
A quick fix is to not return the volume if its mounter is nil. A more
complete fix is to also recover mounter object when reconstructing the
volume data structure which will be addressed in PR #33616
Automatic merge from submit-queue
Remove headers that are unnecessary for proxy target
Some headers like authorization is unnecessary to pass to the proxy target. We should start removing these headers in proxy requests.
Automatic merge from submit-queue
Fix wait.JitterUntil
https://github.com/kubernetes/kubernetes/pull/29743 changed a util method to cause process exits if a handler function panics.
Utility methods should not make process exit decisions. If a process (like the controller manager) wants to exit on panic, appending a panic handler or setting `ReallyCrash = true` is the right way to do that (discussed [here](https://github.com/kubernetes/kubernetes/pull/29743#r75509074)).
This restores the documented behavior of wait.JitterUntil
This patch provides a more relevant error message when a client
configuration option is passed with an invalid or non-existent value.
`$ kubectl get pods --cluster="non-existent"`
```
error: No configuration file found, please login or point to an existing
file
```
`$ kubectl get pods --cluster="non-existent"`
```
error: cluster "non-existent" does not exist
```
Currently it is possible that the mounter object stored in Mounted
Volume data structure in the actual state of kubelet volume manager is
nil if this information is recovered from state sync process. This will
cause nil pointer issue when calculating stats in volume_stat_calculator.
A quick fix is to not return the volume if its mounter is nil. A more
complete fix is to also recover mounter object when reconstructing the
volume data structure which will be addressed in PR #33616
Automatic merge from submit-queue
kubelet: eviction: avoid duplicate action on stale stats
Currently, the eviction code can be overly aggressive when synchronize() is called two (or more) times before a particular stat has been recollected by cadvisor. The eviction manager will take additional action based on information for which it has already taken actions.
This PR provides a method for the eviction manager to track the timestamp of the last obversation and not take action if the stat has not been updated since the last time synchronize() was run.
@derekwaynecarr @vishh @kubernetes/rh-cluster-infra