Automatic merge from submit-queue
Add generic NoExecute Toleration to NPD
Ref. #44445
cc @davidopp
```release-note
Add generic Toleration for NoExecute Taints to NodeProblemDetector
```
Automatic merge from submit-queue
Optimize provisioner plugin result check logic
If err is not returned by findProvisionablePlugin(...), storageClass is certainly not nil
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Let kubemark exit if it fails to start
Fix the bug: If there is sth wrong to run hollow kubelet, kubemark will just hang instead of exiting.
I came across the problem when I tried to start kubemark with no-root user.
```
I0523 15:27:39.721447 16855 docker_service.go:223] Setting cgroupDriver to cgroupfs
I0523 15:27:39.721634 16855 docker_legacy.go:151] No legacy containers found, stop performing legacy cleanup.
I0523 15:27:39.722208 16855 kubelet.go:559] Starting the GRPC server for the docker CRI shim.
I0523 15:27:39.722228 16855 docker_server.go:60] Start dockershim grpc server
I0523 15:27:39.722265 16855 server.go:819] failed to unlink socket file "/var/run/dockershim.sock": permission denied
E0523 15:27:39.722327 16855 container_manager_linux.go:98] Unable to ensure the docker processes run in the desired containers
```
Automatic merge from submit-queue
Added k82cn as one of scheduler approver.
According to the requirement of Approver at [community-membership.md](https://github.com/kubernetes/community/blob/master/community-membership.md), I meet the requirements as follow; so I'd like to add myself as an approver of scheduler.
* Reviewer of the codebase for at least 3 months
[k82cn]: [~3 months](6cc40678b6 )
* Primary reviewer for at least 10 substantial PRs to the codebase
[k82cn] Reviewed [40 PRs](https://github.com/issues?q=assignee%3Ak82cn+is%3Aclosed)
* Reviewed or merged at least 30 PRs to the codebase
[k82cn]: 71 merged PRs in kubernetes/kubernetes, and ~100 PRs in kuberentes at https://goo.gl/j2D1fR
As an approver,
* I agree to only approve familiar PRs
* I agree to be responsive to review/approve requests as per community expectations
* I agree to continue my reviewer work as per community expectations
* I agree to continue my contribution, e.g. PRs, mentor contributors
Automatic merge from submit-queue (batch tested with PRs 46407, 46457)
GCE - Refactor API for firewall and backend service creation
**What this PR does / why we need it**:
- Currently, firewall creation function actually instantiates the firewall object; this is inconsistent with the rest of GCE api calls. The API normally gets passed in an existing object.
- Necessary information for firewall creation, (`computeHostTags`,`nodeTags`,`networkURL`,`subnetworkURL`,`region`) were private to within the package. These now have public getters.
- Consumers might need to know whether the cluster is running on a cross-project network. A new `OnXPN` func will make that information available.
- Backend services for regions have been added. Global ones have been renamed to specify global.
- NamedPort management of instance groups has been changed from an `AddPortsToInstanceGroup` func (and missing complementary `Remove...`) to a single, simple `SetNamedPortsOfInstanceGroup`
- Addressed nitpick review comments of #45524
ILB needs the regional backend services and firewall refactor. The ingress controller needs the new `OnXPN` func to decide whether to create a firewall.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46407, 46457)
Remove deletecollection support from namespace object
Namespace storage accidentally picked up deletecollection support from embedding the generic store. If invoked, it skips the custom namespace `Delete()` storage method that enforces finalization, and skips the namespace lifecycle admission plugin that protects immortal namespaces from deletion.
Given the data integrity implications of skipping namespace finalization, I'd backport this as far as we're releasing patch releases.
```release-note
The namespace API object no longer supports the deletecollection operation.
```
Automatic merge from submit-queue
Move NetworkPolicy to v1
Move NetworkPolicy to v1
@kubernetes/sig-network-misc
**Release note**:
```release-note
NetworkPolicy has been moved from `extensions/v1beta1` to the new
`networking.k8s.io/v1` API group. The structure remains unchanged from
the beta1 API.
The `net.beta.kubernetes.io/network-policy` annotation on Namespaces
to opt in to isolation has been removed. Instead, isolation is now
determined at a per-pod level, with pods being isolated if there is
any NetworkPolicy whose spec.podSelector targets them. Pods that are
targeted by NetworkPolicies accept traffic that is accepted by any of
the NetworkPolicies (and nothing else), and pods that are not targeted
by any NetworkPolicy accept all traffic by default.
Action Required:
When upgrading to Kubernetes 1.7 (and a network plugin that supports
the new NetworkPolicy v1 semantics), to ensure full behavioral
compatibility with v1beta1:
1. In Namespaces that previously had the "DefaultDeny" annotation,
you can create equivalent v1 semantics by creating a
NetworkPolicy that matches all pods but does not allow any
traffic:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: default-deny
spec:
podSelector:
This will ensure that pods that aren't match by any other
NetworkPolicy will continue to be fully-isolated, as they were
before.
2. In Namespaces that previously did not have the "DefaultDeny"
annotation, you should delete any existing NetworkPolicy
objects. These would have had no effect before, but with v1
semantics they might cause some traffic to be blocked that you
didn't intend to be blocked.
```
Automatic merge from submit-queue
Update troubleshooting link in ISSUE_TEMPLATE.md
**What this PR does / why we need it**:
Since our troubleshooting page has been moved, we should update the issue template.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
ref: #44434
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
fix some typo in docs[example/volumes]
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Protobuf generation for k8s.io/metrics
This PR introduces protobuf generation for k8s.io/metrics. Doing so required:
- fixing a bug in `go-to-protobuf` causing the `cast{key,value,type}` values to not be quoted when coming from struct tags (and not auto-injection by `go-to-protobuf` itself).
- Making sure the proto IDL in k8s.io/client-go had a package name of `k8s.io.client_go.xyz` and not `k8s.io.kubernetes.xyz`.
Additionally, I updated `go-to-protobuf` to skip functions and non-public types when composing the import list, which cuts down on the more bizarre imports in the IDL (like importing the sample API package in every IDL file because it contained `addToScheme`, like every other API package).
We use `castvalue` to force gogo-proto to realize that it should consider the value of the map which underlies `ResourceList` when calculating which imports need to be named. Otherwise, it ignores the value's type, leading to compilation errors when it later can't find an import it assumed existed. We accidentally didn't hit this in `k8s.io/kubernetes/pkg/api/v1` since another field coincidentally happens to directly use `resource.Quantity` (the value type of `ResourceList`).
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Apply KubeProxyEndpointLagTimeout to ESIPP tests
Fixes#46533.
The previous construction of ESIPP tests is weird, so I redo it a bit.
A 30 seconds `KubeProxyEndpointLagTimeout` is introduced, as these tests ain't verifying performance, may be better to not make it too tight.
/assign @thockin
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
fix typo in build.sh
**What this PR does / why we need it**:
fix typo in build.sh
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
This commit regenerates the protobuf as per the recent generation
changes (removing erroneous imports, adding k8s.io/metrics), and
syncs the changes to client-go (which also ensures that client-go
protobuf IDL has the correct package names).
Automatic merge from submit-queue (batch tested with PRs 46302, 44597, 44742, 46554)
Do not install do-nothing iptables rules
Deprecate kubelet non-masquerade-cidr.
Do not install iptables rules if it is set to 0.0.0.0/0.
Fixes#46553
Automatic merge from submit-queue (batch tested with PRs 46302, 44597, 44742, 46554)
Change to aggregator so it calls a user apiservice via its pod IP.
proxy_handler now does a sideways call to lookup the pod IPs for aservice.
It will then pick a random pod IP to forward the use apiserver request to.
**What this PR does / why we need it**: It allows the aggregator to work without setting up the full network stack on the kube master (i.e. with kube-dns or kube-proxy)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#44619
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
Add /healthz back to kube-proxy metrics server
Fixes#46447.
/healthz is removed from kube-proxy metrics server by #44968 and that breaks our upgrade test, which run 1.6 tests on 1.7 cluster. It seems harmless to continue holding /healthz on metrics server as well, so that we won't break other potential users.
/assign @bowei
cc @dchen1107
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
add test in create authinfo
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
Support sandbox images from private registries
**What this PR does / why we need it**:
The --pod-infra-container-image parameter allows the user to specify
an arbitrary image to be used as the pod infra container (AKA
sandbox), an internal piece of the dockershim implementation of the
Container Runtime Interface.
The dockershim does not have access to any of the pod-level image pull
credentials configuration, so if the user specifies an image from a
private registry, the image pull will fail.
This change allows the dockershim to read local docker configuration
(e.g. /root/.docker/config.json) and use it when pulling the pod infra
container image.
**Which issue this PR fixes**: fixes#45738
**Special notes for your reviewer**:
The changes to fake_client for writing local config files deserve some
attention.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
Make GCE load-balancers create health checks for nodes
From #14661. Proposal on kubernetes/community#552. Fixes#46313.
Bullet points:
- Create nodes health check and firewall (for health checking) for non-OnlyLocal service.
- Create local traffic health check and firewall (for health checking) for OnlyLocal service.
- Version skew:
- Don't create nodes health check if any nodes has version < 1.7.0.
- Don't backfill nodes health check on existing LBs unless users explicitly trigger it.
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service now have health checks for nodes by default.
An existing LoadBalancer will have health check attached to it when:
- Change Service.Spec.Type from LoadBalancer to others and flip it back.
- Any effective change on Service.Spec.ExternalTrafficPolicy.
```
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
[Federation] Refactor the cluster selection logic in the sync controller
This is intended to make it easier to define the interaction between cluster selection and scheduling preferences in the sync controller when used for workload types.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
Move MountVolume.SetUp succeeded to debug level
This message is verbose and repeated over and over again in log files
creating a lot of noise. Leave the message in, but require a -v in
order to actually log it.
**What this PR does / why we need it**: Moves a verbose log message to actually be verbose.
**Which issue this PR fixes** fixes#46364Fixes#29059
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
check err
Signed-off-by: yupengzte <yu.peng36@zte.com.cn>
**What this PR does / why we need it**:
When the err in not nil, the podStatus is nil, it is dangerous "podStatus[cluster.Name].RunningAndReady".
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
Add ClusterSelector to Ingress Controller
This pull request adds ClusterSelector to the Federated Ingress Controller ref: design #29887
This back ports the same functionality from the sync controller (merged pull #40234) in order to make this feature available across all Controllers for the 1.7 release.
cc: @kubernetes/sig-federation-pr-reviews @shashidharatd
**Release note**:
```
The annotation `federation.alpha.kubernetes.io/cluster-selector` can be used with Ingress objects to target federated clusters by label.
```
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
add test for set selector
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix spelling in example/spark
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
This commit adds proto tags to
`k8s.io/metrics/pkg/apis/metrics/v1alpha1`. The tags are more-or-less
what's suggested by `go-to-protobuf`, with the exception of the use of
`castvalue`.
`castvalue` is used to force gogo-proto to realize that the value of
`ResourceList` (which is `map[ResourceName]Quantity`) is actually a type
that it should consider when recording which packages are used.
Otherwise, it ignores the type, using an unnamed import for the
`k8s.io/apimachinery/pkg/api/resource`, which causes compilation errors.
When using a `cast{key,value,type}` that was injected via struct tag, we
need to make sure to quote the value when transfering it over to proto
tags. Otherwise, it'll come through as unquoted, resulting in invalid
proto.
This was previously not a problem, since all values of `castkey` and
`casttype` were actually coming from the auto-injecting code which deals
with maps and aliases, which does correctly quote values.
This commit adds the `k8s.io/metrics` APIs to the list of packages for
which to generate protobuf. Additionally, it adds
`k8s.io/client-go/pkg/apis/v1` as a non-generated (referenced) package.
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)
CRI: add methods for container stats
**What this PR does / why we need it**:
Define methods in CRI to get container stats.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of https://github.com/kubernetes/features/issues/290; addresses #27097
**Special notes for your reviewer**:
This PR defines the *minimum required* container metrics for the existing components to function, loosely based on the previous discussion on [core metrics](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/core-metrics-pipeline.md) as well as the existing cadvisor/summary APIs.
Two new RPC calls are added to the RuntimeService: `ContainerStats` and `ListContainerStats`. The former retrieves stats for a given container, while the latter gets stats for all containers in one call.
The stats gathering time of each subsystem can vary substantially (e.g., cpu vs. disk), so even though the on-demand model preferred due to its simplicity, we’d rather give the container runtime more flexibility to determine the collection frequency for each subsystem*. As a trade-off, each piece of stats for the subsystem must contain a timestamp to let kubelet know how fresh/recent the stats are. In the future, we should also recommend a guideline for how recent the stats should be in order to ensure the reliability (e.g., eviction) and the responsiveness (e.g., autoscaling) of the kubernetes cluster.
The next step is to plumb this through kubelet so that kubelet can choose consume container stats from CRI or cadvisor.
**Alternatively, we can add calls to get stats of individual subsystems. However, kubelet does not have the complete knowledge of the runtime environment, so this would only lead to unnecessary complexity in kubelet.*
**Release note**:
```release-note
Augment CRI to support retrieving container stats from the runtime.
```
This commit converts the package names in the proto IDL in client-go.
This allows third parties (and repositories in staging) who make use of
types in client-go to generate proto IDL themselves properly.
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)
kubelet was sending negative allocatable values
**What this PR does / why we need it**:
if you set reservations > node capacity, the node sent negative values for allocatable values on create. setting negative values on update is rejected.
**Which issue this PR fixes**
xref https://bugzilla.redhat.com/show_bug.cgi?id=1455420
**Special notes for your reviewer**:
at this time, the node is allowed to set status on create. without this change, a node was being registered with negative allocatable values. i think we need to revisit letting node set status on create, and i will send a separate pr to debate the merits of that point.
```release-note
Prevent kubelet from setting allocatable < 0 for a resource upon initial creation.
```
Since go-to-protobuf doesn't care about functions or private types (only
public types), we can skip them. This helps to clean up the generated
IDL: previously, the IDL contained erroneous imports due to matching
functions and private types which were not actually converted to protobuf,
but which were the same as functions and private types in other packages.