Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)
Enable kernel memcg notification for node and cluster GCI/COS testing.
Sets --experimental-kernel-memcg-notification=true when running on the GCI/COS image. It sets this for master and nodes for cluster e2e tests, and for the node in node e2e tests.
Issue #42676
cc @dchen1107 @Random-Liu
IP aliases are an alpha feature, and node accelerators are a beta
feature. $gcloud determines which is appropriate.
Before, this would try to run "gcloud alpha beta", which is incoherent.
Automatic merge from submit-queue (batch tested with PRs 44830, 45130)
Adding support for Accelerators to GCE clusters.
```release-note
Create clusters with GPUs in GKE by specifying "type=<gpu-type>,count=<gpu-count>" to NODE_ACCELERATORS env var.
List of available GPUs - https://cloud.google.com/compute/docs/gpus/#introduction
```
Automatic merge from submit-queue (batch tested with PRs 44590, 44969, 45325, 45208, 44714)
Enable basic auth username rotation for GCI
When changing basic auth creds, just delete the whole file, in order to be able to rotate username in addition to password.
Automatic merge from submit-queue (batch tested with PRs 45285, 45162)
mounter.go: format return err.
**What this PR does / why we need it**:
when an error returned is nil, it's preferred to explicitly return nil.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44591, 44549)
Update repo-infra bazel dependency and use new gcs_upload rule
This PR provides similar functionality to push-build.sh entirely within Bazel rules (though it relies on gsutil).
It's an alternative to #44306.
Depends on https://github.com/kubernetes/repo-infra/pull/13.
**Release note**:
```release-note
NONE
```
Using Ubuntu on GCE to run cluster e2e tests requires slightly different
node.yaml and master.yaml files than GCI, because Ubuntu uses systemd as
PID 1, wheras GCI uses upstart with a systemd delegate. Therefore the
e2e tests fail using those files since the kubernetes services are not
brought back up after a node/master reboot.
Automatic merge from submit-queue
Use auto mode networks instead of legacy networks in GCP
Use of the --range flag creates legacy networks in GCP.
Legacy networks will not support new GCP features.
```release-note
NONE
```
Automatic merge from submit-queue
Add support for IP aliases for pod IPs (GCP alpha feature)
```release-note
Adds support for allocation of pod IPs via IP aliases.
# Adds KUBE_GCE_ENABLE_IP_ALIASES flag to the cluster up scripts (`kube-{up,down}.sh`).
KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes. This feature is currently
only available on GCE.
## Usage
$ CLUSTER_IP_RANGE=10.100.0.0/16 KUBE_GCE_ENABLE_IP_ALIASES=true bash -x cluster/kube-up.sh
# Adds CloudAllocator to the node CIDR allocator (kubernetes-controller manager).
If CIDRAllocatorType is set to `CloudCIDRAllocator`, then allocation
of CIDR allocation instead is done by the external cloud provider and
the node controller is only responsible for reflecting the allocation
into the node spec.
- Splits off the rangeAllocator from the cidr_allocator.go file.
- Adds cloudCIDRAllocator, which is used when the cloud provider allocates
the CIDR ranges externally. (GCE support only)
- Updates RBAC permission for node controller to include PATCH
```
KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes.
NODE_IP_RANGE will control the node instance IP cidr
KUBE_GCE_IP_ALIAS_SIZE controls the size of each podCIDR
IP_ALIAS_SUBNETWORK controls the name of the subnet created for the cluster
Automatic merge from submit-queue (batch tested with PRs 43866, 42748)
hack/cluster: download cfssl if not present
hack/local-up-cluster.sh uses cfssl to generate certificates and
will exit it cfssl is not already installed. But other cluster-up
mechanisms (GCE) that generate certs just download cfssl if not
present. Make local-up-cluster.sh do that too so users don't have
to bother installing it from somewhere.
Automatic merge from submit-queue (batch tested with PRs 42025, 44169, 43940)
if we have a dedicated serviceaccount keypair, use it to verify serviceaccounts
Automatic merge from submit-queue
[Federation] Remove FEDERATIONS_DOMAIN_MAP references
Remove all references to FEDERATIONS_DOMAIN_MAP as this method is no longer is used and is replaced by adding federation domain map to kube-dns configmap.
cc @madhusudancs @kubernetes/sig-federation-pr-reviews
**Release note**:
```
[Federation] Mechanism of adding `federation domain maps` to kube-dns deployment via `--federations` flag is superseded by adding/updating `federations` key in `kube-system/kube-dns` configmap. If user is using kubefed tool to join cluster federation, adding federation domain maps to kube-dns is already taken care by `kubefed join` and does not need further action.
```
Automatic merge from submit-queue
Use shared informers for proxy endpoints and service configs
Use shared informers instead of creating local controllers/reflectors
for the proxy's endpoints and service configs. This allows downstream
integrators to pass in preexisting shared informers to save on memory &
cpu usage.
This also enables the cache mutation detector for kube-proxy for those
presubmit jobs that already turn it on.
Follow-up to #43295 cc @wojtek-t
Will race with #43937 for conflicting changes 😄 cc @thockin
cc @smarterclayton @sttts @liggitt @deads2k @derekwaynecarr @eparis @kubernetes/rh-cluster-infra
Use shared informers instead of creating local controllers/reflectors
for the proxy's endpoints and service configs. This allows downstream
integrators to pass in preexisting shared informers to save on memory &
cpu usage.
This also enables the cache mutation detector for kube-proxy for those
presubmit jobs that already turn it on.
hack/local-up-cluster.sh uses cfssl to generate certificates and
will exit it cfssl is not already installed. But other cluster-up
mechanisms (GCE) that generate certs just download cfssl if not
present. Make local-up-cluster.sh do that too.
Per Clayton's suggestion, move stuff from cluster/lib/util.sh to
hack/lib/util.sh. Also consolidate ensure-temp-dir and use the
hack/lib/util.sh implementation rather than cluster/common.sh.
Automatic merge from submit-queue
Centos provider: generate SSL certificates for etcd cluster.
**What this PR does / why we need it**:
Support secure etcd cluster for centos provider, generate SSL certificates for etcd in default. Running it w/o SSL is exposing cluster data to everyone and is not recommended. [#39462](https://github.com/kubernetes/kubernetes/pull/39462#issuecomment-271601547)
/cc @jszczepkowski @zmerlynn
**Release note**:
```release-note
Support secure etcd cluster for centos provider.
```
Automatic merge from submit-queue
added prompt warning if etcd3 media type isn't set during upgrade
**What this PR does / why we need it**:
This adds a prompt confirming the upgrade when `STORAGE_MEDIA_TYPE` is not explicitly set. This is to prevent users from accidentally upgrading to protobuf.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Alongs with docs, addresses #43669
**Special notes for your reviewer**:
Should be cherrypicked onto `release-1.6`
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43546, 43544)
Default to enabling legacy ABAC policy in non-test kube-up.sh environments
Fixes https://github.com/kubernetes/kubernetes/issues/43541
In 1.5, we unconditionally stomped the abac policy file if KUBE_USER was set, and unconditionally used ABAC mode pointing to that file.
In 1.6, unless the user opts out (via `ENABLE_LEGACY_ABAC=false`), we want the same legacy policy included as a fallback to RBAC.
This PR:
* defaults legacy ABAC **on** in normal deployments
* defaults legacy ABAC **on** in upgrade E2Es (ensures combination of ABAC and RBAC works properly for upgraded clusters)
* defaults legacy ABAC **off** in non-upgrade E2Es (ensures e2e tests 1.6+ run with tightened permissions, and that default RBAC roles cover the required core components)
GKE changes to drive the `ENABLE_LEGACY_ABAC` envvar were made by @cjcullen out of band
```release-note
`kube-up.sh` using the `gce` provider enables both RBAC authorization and the permissive legacy ABAC policy that makes all service accounts superusers. To opt out of the permissive ABAC policy, export the environment variable `ENABLE_LEGACY_ABAC=false` before running `cluster/kube-up.sh`.
```
Automatic merge from submit-queue
Bump CNI consumers to v0.5.1
**What this PR does / why we need it**:
- vendored CNI plugins properly handle `DEL` on missing resources
- update CNI version refs
**Which issue this PR fixes**
fixes#43488
**Release note**:
`bumps CNI to version v0.5.1 where plugins properly handle DEL on non existent resources`
Automatic merge from submit-queue
Add an env KUBE_ENABLE_MASTER_NOSCHEDULE_TAINT and disable it by default
This PR changed master `NoSchedule` taint to opt-in.
As is discussed with @bgrant0607 @janetkuo, `NoSchedule` master taint breaks existing user workload, we should not enable it by default.
Previously, NPD required the taint because it can only support one OS distro with a specific configuration. If master and node are using different OS distros, NPD will not work either on master or node. However, we've already fixed this in https://github.com/kubernetes/kubernetes/pull/40206, so for NPD it's fine to disable the taint.
This should work, but I'll still try it in my cluster to confirm.
@kubernetes/sig-scheduling-misc @dchen1107 @mikedanese
Automatic merge from submit-queue (batch tested with PRs 43254, 43255, 43184, 42509)
Symlink cluster/gce/cos to cluster/gce/gci
Fixes: #43139
As I just unfortunately found out after spending an hour getting to the point where I could test this, upgrade.sh does not support upgrading nodes to local binaries. So someone will have to cut a release to test whether this change actually works.
Automatic merge from submit-queue
Allow ABAC to be disabled easily on upgrades
**What this PR does / why we need it**:
Adds a local variable to the configure-helper script so that ABAC_AUTHZ_FILE can be set to a nonexistent file in kube-env to disable ABAC on a cluster that previously was using ABAC.
@liggitt @Q-Lee
Automatic merge from submit-queue
Update npd to the official v0.3.0 release.
Update npd to the official release v0.3.0.
This also fixes a npd bug https://github.com/kubernetes/node-problem-detector/pull/98.
@dchen1107 @kubernetes/node-problem-detector-reviewers
Automatic merge from submit-queue
Fixing unbound bash variable.
**What this PR does / why we need it**: this fixes a bug introduced in 1.6 for ABAC.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: without this, we hit an unbound variable and fail to bring up the kube-apiserver with ABAC enabled.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 41306, 42187, 41666, 42275, 42266)
remove support for debian masters in GCE
Asked about this on the mailing list and no one objects.
@zmerlynn @roberthbailey
```release-note
Remove support for debian masters in GCE kube-up.
```
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Adding legacy ABAC for 1.6
This is a fork of a previous [pull request](https://github.com/kubernetes/kubernetes/pull/42014) to include feedback as the original author is unavailable.
Adds a mechanism to optionally enable legacy abac for 1.6 to provide a migration path for existing users.
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Admission Controller: Add Pod Preset
Based off the proposal in https://github.com/kubernetes/community/pull/254
cc @pmorie @pwittrock
TODO:
- [ ] tests
**What this PR does / why we need it**: Implements the Pod Injection Policy admission controller
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Added new Api `PodPreset` to enable defining cross-cutting injection of Volumes and Environment into Pods.
```
export functions from pkg/api/validation
add settings API
add settings to pkg/registry
add settings api to pkg/master/master.go
add admission control plugin for pod preset
add new admission control plugin to kube-apiserver
add settings to import_known_versions.go
add settings to codegen
add validation tests
add settings to client generation
add protobufs generation for settings api
update linted packages
add settings to testapi
add settings install to clientset
add start of e2e
add pod preset plugin to config-test.sh
Signed-off-by: Jess Frazelle <acidburn@google.com>
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)
Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Automatic merge from submit-queue (batch tested with PRs 40746, 41699, 42108, 42174, 42093)
Avoid fake node names in user info
Node usernames should follow the format `system:node:<node-name>`,
but if we don't know the node name, it's worse to put a fake one in.
In the future, we plan to have a dedicated node authorizer, which would
start rejecting requests from a user with a bogus node name like this.
The right approach is to either mint correct credentials per node, or use node bootstrapping so it requests a correct client certificate itself.
Automatic merge from submit-queue (batch tested with PRs 41937, 41151, 42092, 40269, 42135)
GCE will properly regenerate basic_auth.csv on kube-apiserver start.
**What this PR does / why we need it**:
If basic_auth.csv does not exist we will generate it as normal.
If basic_auth.csv exists we will remove the old admin password before adding the "new" one. (Turns in to a no-op if the password exists).
This did not work properly before because we were replacing by key, where the key was the password. New password would not match and so not replace the old password.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#41935
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Use docker log rotation mechanism instead of logrotate
This is a solution for https://github.com/kubernetes/kubernetes/issues/38495.
Instead of rotating logs using logrotate tool, which is configured quite rigidly, this PR makes docker responsible for the rotation and makes it possible to configure docker logging parameters. It solves the following problems:
* Logging agent will stop loosing lines upon rotation
* Container's logs size will be more strictly constrained. Instead of checking the size hourly, size will be checked upon write, preventing https://github.com/kubernetes/kubernetes/issues/27754
It's still far from ideal, for example setting logging options per pod, as suggested in https://github.com/kubernetes/kubernetes/issues/15478 would be much more flexible, but latter approach requires deep changes, including changes in API, which may be in vain because of CRI and long-term vision for logging.
Changes include:
* Change in salt. It's possible to configure docker log parameters, using variables in pillar. They're exported from env variables on `gce`, but for different cloud provider they have to be exported first.
* Change in `configure-helper.sh` scripts for those os on `gce` that don't use salt + default values exposed via env variables
This change may be problematic for kubelet logs functionality with CRI enabled, that will be tackled in the follow-up PR, if confirmed.
CC @piosz @Random-Liu @yujuhong @dashpole @dchen1107 @vishh @kubernetes/sig-node-pr-reviews
```release-note
On GCI by default logrotate is disabled for application containers in favor of rotation mechanism provided by docker logging driver.
```
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)
enable DefaultTolerationSeconds admission controller by default
**What this PR does / why we need it**:
Continuation of PR #41414, enable DefaultTolerationSeconds admission controller by default.
**Which issue this PR fixes**:
fixes: #41860
related Issue: #1574, #25320
related PRs: #34825, #41133, #41414
**Special notes for your reviewer**:
**Release note**:
```release-note
enable DefaultTolerationSeconds admission controller by default
```
If the file does not exist we will generate it as normal.
If the file exists we will remove the old admin password before adding
the "new" one. (Turns in to a no-op if the password exists).
This did not work properly before because we were replacing by key,
where the key was the password. New password would not match and so
not replace the old password.
Added a METADATA_CLOBBERS_CONFIG flag
METADATA_CLOBBERS_CONFIG controls if we consider the values on disk or in
metadata to be the canonical source of truth. Currently defaulting to
false for GCE and forcing to true for GKE.
Added handling for older forms of the basic_auth.csv file.
Fixed comment to reflect new METADATA_CLOBBERS_CONFIG var.
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.