Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782)
Add ability to enable cache mutation detector in GCE
Add the ability to enable the cache mutation detector in GCE. The current default behavior (disabled) is retained.
When paired with https://github.com/kubernetes/test-infra/pull/1901, we'll be able to detect shared informer cache mutations in gce e2e PR jobs.
Automatic merge from submit-queue
Add standalone npd on GCI.
This PR added standalone NPD in GCE GCI cluster. I already verified the PR, and it should work.
/cc @dchen1107 @fabioy @andyxning @kubernetes/sig-node-misc
Automatic merge from submit-queue
Fix the output of health-mointor.sh
The script show prints the errors/response of the health check, but not
show the progress of `curl`.
Allow cache mutation detector enablement by PRs in an attempt to find
mutations before they're merged in to the code base. It's just for the
apiserver and controller-manager for now. If/when the other components
start using a SharedInformerFactory, we should set them up just like
this as well.
Automatic merge from submit-queue
Added configurable etcd initial-cluster-state to kube-up script.
Added configurable etcd initial-cluster-state to kube-up script. This
allows creation of multi-master cluster from scratch. This is a
cherry-pick of #41320 from 1.5 branch.
```release-note
Added configurable etcd initial-cluster-state to kube-up script.
```
Automatic merge from submit-queue (batch tested with PRs 41196, 41252, 41300, 39179, 41449)
Bump GCE ContainerVM to container-vm-v20170214
`container-vm-v20170214` is a re-build of the `docker-runc` in `container-vm-v20170201`, and should clear the GCE slow tests.
c.f. #40828
```release-note
Bump GCE ContainerVM to container-vm-v20170214 to address CVE-2016-9962.
```
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)
cluster/gce: Add env var to enable apiserver basic audit log.
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled the same as for `kube-apiserver.log`.
**What this PR does / why we need it**:
Add a knob to enable [basic audit logging](https://kubernetes.io/docs/admin/audit/) in GCE.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
We would like to cherrypick/port this to release-1.5 also.
**Release note**:
```release-note
The kube-apiserver [basic audit log](https://kubernetes.io/docs/admin/audit/) can be enabled in GCE by exporting the environment variable `ENABLE_APISERVER_BASIC_AUDIT=true` before running `cluster/kube-up.sh`. This will log to `/var/log/kube-apiserver-audit.log` and use the same `logrotate` settings as `/var/log/kube-apiserver.log`.
```
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled externally by the wildcard /var/log/*.log
already configured in configure-helper.sh.
Added configurable etcd initial-cluster-state to kube-up script. This
allows creation of multi-master cluster from scratch. This is a
cherry-pick of #41320 from 1.5 branch.
Automatic merge from submit-queue (batch tested with PRs 41121, 40048, 40502, 41136, 40759)
Remove deprecated kubelet flags that look safe to remove
Removes:
```
--config
--auth-path
--resource-container
--system-container
```
which have all been marked deprecated since at least 1.4 and look safe to remove.
```release-note
The deprecated flags --config, --auth-path, --resource-container, and --system-container were removed.
```
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)
Bump GCI to gci-beta-56-9000-80-0
cc/ @Random-Liu @adityakali
Changelogs since gci-dev-56-8977-0-0 (currently used in Kubernetes):
- "net.ipv4.conf.eth0.forwarding" and "net.ipv4.ip_forward" may get reset to 0
- Track CVE-2016-9962 in Docker in GCI
- Linux kernel CVE-2016-7097
- Linux kernel CVE-2015-8964
- Linux kernel CVE-2016-6828
- Linux kernel CVE-2016-7917
- Linux kernel CVE-2016-7042
- Linux kernel CVE-2016-9793
- Linux kernel CVE-2016-7039 and CVE-2016-8666
- Linux kernel CVE-2016-8655
- Toolbox: allow docker image to be loaded from local tarball
- Update compute-image-package in GCI
- Change the product name on /etc/os-release (to COS)
- Remove 'dogfood' from HWID_OVERRIDE in /etc/lsb-release
- Include Google NVME extensions to optimize LocalSSD performance.
- /proc/<pid>/io missing on GCI (enables process stats accounting)
- Enable BLK_DEV_THROTTLING
cc/ @roberthbailey @fabioy for GKE cluster update
Automatic merge from submit-queue (batch tested with PRs 38739, 40480, 40495, 40172, 40393)
Use existing ABAC policy file when upgrading GCE cluster
When upgrading, continue loading an existing ABAC policy file so that existing system components continue working as-is
```
When upgrading an existing 1.5 GCE cluster using `cluster/gce/upgrade.sh`, an existing ABAC policy file located at /etc/srv/kubernetes/abac-authz-policy.jsonl (the default location in 1.5) will enable the ABAC authorizer in addition to the RBAC authorizer. To switch an upgraded 1.5 cluster completely to RBAC, ensure the control plane components and your superuser have been granted sufficient RBAC permissions, move the legacy ABAC policy file to a backup location, and restart the apiserver.
```
Automatic merge from submit-queue (batch tested with PRs 39260, 40216, 40213, 40325, 40333)
Fixed propagation of kube master certs during master replication.
Fixed propagation of kube-master-certs during master replication.
Automatic merge from submit-queue (batch tested with PRs 40299, 40311)
cluster: update default rkt version to 1.23.0
This updates cluster configurations to current stable rkt version.
These files have been created lately, so we don't have much information
about them anyway, so let's just:
- Remove assignees and make them approvers
- Copy approves as reviewers
Automatic merge from submit-queue
Build release tars using bazel
**What this PR does / why we need it**: builds equivalents of the various kubernetes release tarballs, solely using bazel.
For example, you can now do
```console
$ make bazel-release
$ hack/e2e.go -v -up -test -down
```
**Special notes for your reviewer**: this is currently dependent on 3b29803eb5, which I have yet to turn into a pull request, since I'm still trying to figure out if this is the best approach.
Basically, the issue comes up with the way we generate the various server docker image tarfiles and load them on nodes:
* we `md5sum` the binary being encapsulated (e.g. kube-proxy) and save that to `$binary.docker_tag` in the server tarball
* we then build the docker image and tag using that md5sum (e.g. `gcr.io/google_containers/kube-proxy:$MD5SUM`)
* we `docker save` this image, which embeds the full tag in the `$binary.tar` file.
* on cluster startup, we `docker load` these tarballs, which are loaded with the tag that we'd created at build time. the nodes then use the `$binary.docker_tag` file to find the right image.
With the current bazel `docker_build` rule, the tag isn't saved in the docker image tar, so the node is unable to find the image after `docker load`ing it.
My changes to the rule save the tag in the docker image tar, though I don't know if there are subtle issues with it. (Maybe we want to only tag when `--stamp` is given?)
Also, the docker images produced by bazel have the timestamp set to the unix epoch, which is not great for debugging. Might be another thing to change with a `--stamp`.
Long story short, we probably need to follow up with bazel folks on the best way to solve this problem.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 36467, 36528, 39568, 40094, 39042)
Bump GCE to container-vm-v20170117
Base image update only, no kubelet or Docker updates.
```release-note
Update GCE ContainerVM deployment to container-vm-v20170117 to pick up CVE fixes in base image.
```
Automatic merge from submit-queue
Enable lazy initialization of ext3/ext4 filesystems
**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#30752, fixes#30240
**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```
**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.
These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount
Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```
The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.
When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system.
As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory.
I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.
As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
Automatic merge from submit-queue (batch tested with PRs 39803, 39698, 39537, 39478)
include bootstrap admin in super-user group, ensure tokens file is correct on upgrades
Fixes https://github.com/kubernetes/kubernetes/issues/39532
Possible issues with cluster bring-up scripts:
- [x] known_tokens.csv and basic_auth.csv is not rewritten if the file already exists
* new users (like the controller manager) are not available on upgrade
* changed users (like the kubelet username change) are not reflected
* group additions (like the addition of admin to the superuser group) don't take effect on upgrade
* this PR updates the token and basicauth files line-by-line to preserve user additions, but also ensure new data is persisted
- [x] existing 1.5 clusters may depend on more permissive ABAC permissions (or customized ABAC policies). This PR adds an option to enable existing ABAC policy files for clusters that are upgrading
Follow-ups:
- [ ] both scripts are loading e2e role-bindings, which only be loaded in e2e tests, not in normal kube-up scenarios
- [ ] when upgrading, set the option to use existing ABAC policy files
- [ ] update bootstrap superuser client certs to add superuser group? ("We also have a certificate that "used to be" a super-user. On GCE, it has CN "kubecfg", on GKE it's "client"")
- [ ] define (but do not load by default) a relaxed set of RBAC roles/rolebindings matching legacy ABAC, and document how to load that for new clusters that do not want to isolate user permissions