Automatic merge from submit-queue
GCI: add support for network plugin
I had run e2e against a cluster with both master and nodes on GCI a couple of times. The PR auto tests will cover the hybrid cluster with just master on GCI.
cc/ @roberthbailey @fabioy @kubernetes/goog-image
Automatic merge from submit-queue
Exit image puller subshell
Exit the subshell with 0 so even if the last docker pull fails the pod doesn't end up in the error state.
This fixes an inadvertant search-replace error in #26617.
The error was missed then because the search-replace issue wasn't
present in the standalone controllers, but was in all the others.
Automatic merge from submit-queue
GCI: fix the issue #26379
This PR deletes docker0 explicitly to fix the issue. In some cases, coexistence of docker0 and cbr0 make troubles in GCI-based cluster instances.
I verified it in GKE. With the fix, fluentd-gcp pod shows no error. "curl google.com" can work inside a pod. Mark it as P0 to match the issue priority.
@a-robinson @roberthbailey @freehan @kubernetes/goog-image
Automatic merge from submit-queue
Enable support for memory eviction configuration via salt
Added evictions based on memory by default whenever the available memory is < 100Mi.
Updated GCE and GCI.
Automatic merge from submit-queue
Bump cluster autoscaler version and enable scale down by default
Follow up of https://github.com/kubernetes/contrib/pull/1148.
cc: @piosz @fgrzadkowski @jszczepkowski
Automatic merge from submit-queue
Re-enable node problem detector by default
Re-enable node problem detector started in gce cluster by default.
For now, in the master node, the node problem detector will be started and do nothing (see https://github.com/kubernetes/node-problem-detector/pull/13).
But in fact, in my test cluster, the master has no extra cpu to run the node problem detector, so node problem detector is started on all nodes except master, which is what we want but not expected...
@dchen1107
/cc @kubernetes/sig-node
/cc @andyzheng0831 for the gci script change.
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Don't run fluentd-es on GCI masters
It isn't run on containervm masters. It can't do anything on the master because the master doesn't have kube-proxy running to enable fluentd to talk to the elasticsearch service.
@andyzheng0831
Automatic merge from submit-queue
Add collection of the new glbc and cluster-autoscaler logs
I've incremented the version numbers by 2 to avoid conflicting with #26652. I'll make sure the potential conflict between the images gets resolved reasonably.
cc @piosz @bprashanth @aledbf
We want to continuously validate Docker releases (#25215), on GCI. This change
adds a new test config variable, `KUBE_GCI_DOCKER_VERSION`, through which we can
specify which version of Docker we want to run on the master and nodes. This
change also patches the Jenkins e2e-runner with the ability to fetch the latest
Docker (pre)release, and sets the aforementioned variable accordingly.
Automatic merge from submit-queue
GCI/Trusty: support the Docker registry mirror
@roberthbailey @zmerlynn please review it.
cc/ @fabioy @dchen1107 @kubernetes/goog-image FYI.
cc/ @ojarjur it is very straightforward to add support for GCI, which is pretty much like the change to ContainerVM's configure-vm.sh in your original PR #25841.
Automatic merge from submit-queue
GCI: correct the fix in #26363
This PR is mainly for correcting the fix to 'find' command in #26363. I added "-maxdepth 1" in an earlier change, and #26363 tried to fix it by changing the search path. This is potentially incorrect, when yaml files are in more than one layer deep. The real fix should be removing the "-maxdepth 1" flag from 'find' command. This PR also updates two minor places in the file configure-helper.sh introduced by two previous PR #26413 and #26048.
@roberthbailey @wonderfly
cc/ @dchen1107 @fabioy @kubernetes/goog-image
Automatic merge from submit-queue
Increase failure threshold for glbc liveness probe
This pod fails a liveness probe on occasion, probably because the failure thresholds are too strict. Simple enough that either reviewer can review.
Automatic merge from submit-queue
pin GCI version to milestone 52
This is mainly for pinning the 1.2 branch to GCI milestone 52
which contains correct docker and kubelet built in.
Doing this allows us to upgrade docker to v1.11 (issue #26455)
in GCI 53 without breaking the 1.2 release branch.
@kubernetes/goog-image @dchen1107 @roberthbailey @andyzheng0831
This is mainly for pinning the 1.2 branch to GCI milestone 52
which contains correct docker and kubelet built in.
Doing this allows us to upgrade docker to v1.11 (issue #26455)
in GCI 53 without breaking the 1.2 release branch.
Automatic merge from submit-queue
Rebuild elasticsearch image to include changes since 1.2
Fixes#25360. I've pushed the image to GCR.
@jimmidyson @keontang @vishh
Automatic merge from submit-queue
Move the defaults setting of GCI to util.sh
fixes#26291
This change recovers some of the side effects of
https://github.com/kubernetes/kubernetes/pull/26197, i.e., keeps the defaults of
`NODE_IMAGE` and `NODE_IMAGE_PROJECT` to `MASTER_IMAGE` and
`MASTER_IMAGE_PROJECT`, for backward compatibility. Although it keeps
`OS_DISTRIBUTION` defaulting to `gci`, the default settings of these vars are
moved to `cluster/gce/util.sh` and conditioned on `OS_DISTRIBUTION==gci`.
@euank @roberthbailey Can you review?
Automatic merge from submit-queue
cluster/coreos: Update heapster addon to beta2
fixes#26616
As noted there, heapster was updated but not for gce/coreos which breaks anything that depends on heapster's new metrics API (i.e. autoscaling)
This change recovers some of the side effects of
https://github.com/kubernetes/kubernetes/pull/26197, i.e., keeps the defaults of
`NODE_IMAGE` and `NODE_IMAGE_PROJECT` to `MASTER_IMAGE` and
`MASTER_IMAGE_PROJECT`, for backward compatibility. Although it keeps
`OS_DISTRIBUTION` defaulting to `gci`, the default settings of these vars are
moved to `cluster/gce/util.sh` and conditioned on `OS_DISTRIBUTION==gci`.
Automatic merge from submit-queue
GCI: cherry-pick the fix in PR #25670
This PR simply copies the fix in #25670 into the GCI support.
cc/ @kubernetes/goog-image @dchen1107 @roberthbailey
Automatic merge from submit-queue
Switch DNS addons from skydns to kubedns
Change GCI and trusty cluster-helper scripts to use kubedns instead of skydns.
Unified skydns templates using a simple underscore based template and
added transform sed scripts to transform into salt and sed yaml
templates
Moved all content out of cluster/addons/dns into build/kube-dns and
saltbase/salt/kube-dns
Automatic merge from submit-queue
Support for cluster autoscaler in GCE Trusty and GCI images
Fixes: #26346
Ref: #26197
cc: @fgrzadkowski @vulpecula @piosz @jszczepkowski
Automatic merge from submit-queue
Prepull images in e2e
Quick and dirty image puller because the SQ stalled multiple times just *today* on image pull flake (https://github.com/kubernetes/kubernetes/issues/25277).
@kubernetes/sig-node @kubernetes/sig-testing wdyt?
Automatic merge from submit-queue
Make node-instance-group base names unique to prevent collisions
We create multiple IGMs for >1000 Node clusters. When we have a conflict on base name IGMs will fight over ownership of the VM that happen to have the name belonging to multiple IGMs.
This change will increase reliability of starting big clusters.
cc @wojtek-t @alex-mohr @roberthbailey @mikedanese
Automatic merge from submit-queue
Add node problem detector as an addon pod.
```release-note
Introduce a new add-on pod NodeProblemDetector.
NodeProblemDetector is a DaemonSet running on each node, monitoring node health and reporting
node problems as NodeCondition and Event. Currently it already supports kernel log monitoring, and
will support more problem detection in the future. It is enabled by default on gce now.
```
This PR enables NodeProblemDetector as an add-on pod.
/cc @mikedanese @kubernetes/sig-node
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Configuration for GCP webhook authentication and authorization
This PR adds configuration for GCP webhook authentication and authorization in ContainerVM and GCI. The change of configure-vm.sh and kube-apiserver.manifest is directly copied from @cjcullen's PR #25380 and #25296. The change in GCI script configure-helper.sh includes the support for webhook authentication and authorization, and also some code refactor to improve readability.
@cjcullen @roberthbailey @zmerlynn please review it. The original PRs are P1, please mark this as P1.
cc/ @fabioy @kubernetes/goog-image FYI.
I verified it by running e2e tests on GCI cluster. Without the GCI side change, cluster creation fails as being capture by GKE Jenkins tests. I don't test when the two env GCP_AUTHN_URL and GCP_AUTHZ_URL are set, because they are only set in GKE. After this PR is merged, @cjcullen will test in GKE.
Automatic merge from submit-queue
cluster/gce/coreos: Set service-cluster-ip-range
Broken by #19242
See also #26002
This is necessary to kube-up for me, but depending on how #26002 plays out, this PR might not be necessary. Happy to close this or merge or whatever depending on what's best.
cc @yifan-gu @sjpotter @mikedanese
Automatic merge from submit-queue
Updating CentOS image, adding heat back to the required cli tools.
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Updated the CentOS cloudimage to the latest available, and also added heat to the required list of cli tools. This is an interim step to replacing all the commands with openstackclient.
Automatic merge from submit-queue
add CIDR allocator for NodeController
This PR:
* use pkg/controller/framework to watch nodes and reduce lists when allocate CIDR for node
* decouple the cidr allocation logic from monitoring status logic
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/19242)
<!-- Reviewable:end -->
Automatic merge from submit-queue
GCI: Fix the condition for using the default image
This PR revises the condition for using the default GCI image. The old logic is not convenient for manually run e2e tests in some cases (mainly for GCI team to test custom images). The new logic by this PR is very similar to the logic in using ContainerVM. When setting distro to "gci", if master or node image is unset, we use gci-dev for it. If either is set, we respect it.
@roberthbailey @zmerlynn @dchen1107 please review it, and we should cherry pick it in release-1.2 branch. Thanks!
cc/ @kubernetes/goog-image @adityakali FYI
Automatic merge from submit-queue
Fix hyperkube's layer caching, and remove --make-symlinks at build time
@david-mcmahon This is required before you release. Explanation in the code.
Automatic merge from submit-queue
AWS: More support for ap-northeast-2 region
Issue #24446
The new AWS region for Seoul, Korea (ap-northeast-2)
was launched in January 2016
https://aws.amazon.com/blogs/aws/now-open-aws-asia-pacific-seoul-region/
But it requires a few changes.
To test:
```
export KUBERNETES_PROVIDER=aws
export KUBE_AWS_ZONE=ap-northeast-2a
export MASTER_SIZE=t2.medium
export NODE_SIZE=t2.medium
export NUM_NODES=4
cluster/kube-up.sh
```
I assigned the AMIs by checking the specific version used from `ap-northeast-1`,
and finding the same image with the same datestamp.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24464)
<!-- Reviewable:end -->
Automatic merge from submit-queue
GCI/Trusty: Fix an issue in using 'find' commands
This PR makes the logic of 'find' command consistent with the 'cp' command afterwards, i.e., only check one layer of a given dir. Without this fix, we have seen a recent breakage after PR #25309 added the file cluster/addons/fluentd-elasticsearch/es-image/template-k8s-logstash.json. The 'find' command discovers this json file, but the 'cp' command fails.
@roberthbailey @dchen1107 @zmerlynn please review this fix, and mark it as a cherry pick candidate. I already verified this fix can resolve the breakage.
cc/ @wonderfly @fabioy @kubernetes/goog-image FYI
Automatic merge from submit-queue
cluster/juju: Updated the url for the getting started doc
Minor change to update the URL pointing at the "Running Kubernetes locally via Docker" document
Automatic merge from submit-queue
GCI: Enable the log of upstart jobs
This PR enables the log of upstart jobs in master.yaml and node.yaml. By default, log of upstart jobs are enabled in Trusty and placed in /var/log/upstart, but not enabled in GCI. This change explicitly directs the log to the system logger. For trusty, they are in /var/log/syslog file. In GCI, we can check it using "journalctl". This change will be useful for debugging if cluster initialization fails.
@roberthbailey @maisem @dchen1107 please review it. This will be useful for issues like #23634. We should also cherry pick it in release-1.2
cc/ @fabioy @zmerlynn @wonderfly FYI.
Automatic merge from submit-queue
Automatically download the latest stable release version of Kubernetes.
The ubuntu version of download-release.sh included in the binary release downloads the released .tar.gz file again. Right now the version of the downloaded file is manually encoded within the script. This change fetches the released version automatically, similar to the shell script available on the main Kubernetes site below:
https://get.k8s.io/
Ideally the installation on bare metal ubuntu should work with the files available in the already downloaded package.
@mikedanese
Automatic merge from submit-queue
add index template for es aggregations
This index template helps us to do es aggregations of namespace_name, pod_name and container_name. Then after doing eggs, we will get the whole name not all the spilt pieces.
fix#25127
Automatic merge from submit-queue
Salt configuration for the new Cluster Autoscaler for GCE
Adds support for cloud autoscaler from contrib/cloud-autoscaler in kube-up.sh GCE script.
cc: @fgrzadkowski @piosz
Automatic merge from submit-queue
Use --format='value(name)' with gcloud instead of grep/awk/cut
Fixing our fragile parsing of `gcloud` is getting old (#24746, #25159, maybe others?).
Instead, let's just get the proper output out of `gcloud` in the first place.
Automatic merge from submit-queue
PSP admission
```release-note
Update PodSecurityPolicy types and add admission controller that could enforce them
```
Still working on removing the non-relevant parts of the tests but I wanted to get this open to start soliciting feedback.
- [x] bring PSP up to date with any new features we've added to SCC for discussion
- [x] create admission controller that is a pared down version of SCC (no ns based strategies, no user/groups/service account permissioning)
- [x] fix tests
@liggitt @pmorie - this is the simple implementation requested that assumes all PSPs should be checked for each requests. It is a slimmed down version of our SCC admission controller
@erictune @smarterclayton
Automatic merge from submit-queue
Remove myself from a bunch of OWNERS files
For the time being I am too overloaded to do non scheduler/admission related reviews that aren't explicitly assigned to me.
cc/ @brendandburns
Automatic merge from submit-queue
Change default clusterCIDRs from /16 to /14 in GCE configs allowing 1000 Node clusters by default.
cc @thockin @roberthbailey @wojtek-t @zmerlynn @davidopp
There are a lot of references to https://get.k8s.io/ over the internet.
Most of such references do not describe KUBERNETES_SKIP_DOWNLOAD env variable
and newbies can get into a situation described below:
- execute `wget -q -O - https://get.k8s.io | bash`
- receive a failure due too missed packages or some configs
- fix the issue
- try again `wget -q -O - https://get.k8s.io | bash`
In this case, get-kube.sh will not check that kubernetes directory already
exist and repeat download again.
Lets make get-kube.sh more user-friednly and check existence if kubernetes dir
Automatic merge from submit-queue
Openstack provider
Our pull request delivers solution to create Kubernetes cluster on the top of OpenStack. Heat OpenStack Orchestration engine describes the infrastructure for Kubernetes cluster. CentoOS images are used for Kubernetes host machines.
We tested our solution with DevStack and Citycloud provider.
We believe that our solution will fill the gap that which is on the market.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/21737)
<!-- Reviewable:end -->
Apparently our cluster start time increased, to the point where users
are reporting spurious timeouts (#23623) and users are reporting that
increasing the timeout fixes the issue (thanks @paralin for the
suggestion and @jlfields for confirming).
Fix#23623
Automatic merge from submit-queue
GCI/Trusty: Fix the running of kube-addon-manager
This PR fixes the issue that kube-addon-master (added in #23600) is not started. Without this fix, no kube-system pods can be running correctly. As a result, the GCI-based Jenkins testing k8s head has been down for a couple of days. The root cause is that we stopped to use namespace.yaml, but configure-helper.sh still tries to copy it. This PR also gets rid of /var/cache/kubernetes-install/kube_env.yaml, as it is not needed anymore after #24108.
@mikedanese @roberthbailey @dchen1107 please review it. If possible please mark it as P1, as it blocks GCI-based Jenkins tests.
cc/ @kubernetes/goog-image @fabioy FYI
Automatic merge from submit-queue
Added vsphere support for vagrant
Since the native vsphere support (using govc library) requires admin permissions on ESX/vCenter, not everyone can have such permissions. So I'm adding a vsphere support using vagrant using vagrant-vsphere plugin
Automatic merge from submit-queue
Move godeps to vendor/
This is a first-step towards glide support, maybe we don't want or need to take this, but it was easy to try.
This fails to compile, not sure why:
```
# k8s.io/kubernetes/pkg/apis/extensions/v1beta1
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2703: undefined: extensions.ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2703: undefined: ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2719: undefined: extensions.ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2719: undefined: ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: undefined: extensions.ClusterAutoscalerList
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: undefined: ClusterAutoscalerList
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:3468: Convert_extensions_JobSpec_To_v1beta1_JobSpec redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:328
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:3845: Convert_extensions_ScaleStatus_To_v1beta1_ScaleStatus redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:98
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:4737: Convert_v1beta1_JobSpec_To_extensions_JobSpec redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:380
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:5186: Convert_v1beta1_ScaleStatus_To_extensions_ScaleStatus redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:120
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: too many errors
!!! Error in /home/thockin/tmp/godep-vendor/src/k8s.io/kubernetes/hack/lib/golang.sh:417
```
Automatic merge from submit-queue
cluster/images/hyperkube: create symlink for each server
Add a kubelet symlink so that the hyperkube image can appear as a kubelet image. https://github.com/kubernetes/kubernetes/issues/24510
Automatic merge from submit-queue
GCI: Add two GCI specific metadata pairs
This PR adds two GCI specific metadata pairs when using GCI image.
(1) "gci-update-strategy": by default the GCI in-place updater is enabled. It means that when a new image is released, the instance on the old image will be upgraded to the new image. In this change, we turn it off;
(2) "gci-ensure-gke-docker": GCI is built with two versions of docker. When this metadata is set to "true", the version satisfying kubernetes qualification will be used. Setting this metadata prevents from using incorrect docker version.
Automatic merge from submit-queue
Use v1.6.2-1 tag for build.
Is there any reason these don't use the VERSION file like everything else? cc @luxas @ixdy
Automatic merge from submit-queue
Add an entry to the salt config to allow Debian jessie on GCE.
```release-note
Add an entry to the salt config to allow Debian jessie on GCE.
As with the existing Wheezy image on GCE, docker is expected
to already be installed in the image.
```
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Convert internal types to use exact precision integers
This makes conversion more suitable for future optimizations, and we need to stop pretending for some of our internal types that the width of the int doesn't matter.
@wojtek-t
Automatic merge from submit-queue
Fix detect-node-names to not error out if there are no nodes
Fixes#21564.
Teardown was not working correctly in rare cases because `detect-node-names` was failing before any of the actual cleanup was run. I'm pretty sure the issue was that there was an instance group, but no instances in the instance group, so we bailed out when we tried to expand the bash array.
This PR adds a guard so we don't bail if the array is empty.
cc @jlowdermilk @spxtr
Automatic merge from submit-queue
Promote Pod Hostname & Subdomain to fields (were annotations)
Deprecating the podHostName, subdomain and PodHostnames annotations and created corresponding new fields for them on PodSpec and Endpoints types.
Annotation doc: #22564
Annotation code: #20688
CentOS 7 Core nodes running on OpenStack with an SSL-enabled API
endpoint results in the following error without this patch:
F0425 19:00:58.124520 5 server.go:100] Cloud provider could not be initialized: could not init cloud provider "openstack": Post https://my.openstack.cloud:5000/v2.0/tokens: x509: failed to load system roots and no roots provided
The root cause is that the ca-bundle.crt file is actually a symlink
which points to a directory which wasn't previously exposed.
[root@kubernetesstack-master ~]# ls -l /etc/ssl/certs/ca-bundle.crt
lrwxrwxrwx. 1 root root 49 18 nov 11:02 /etc/ssl/certs/ca-bundle.crt -> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
[root@kubernetesstack-master ~]#
Fixed the order of fields for basic_auth.
This provider still needs to leverage common.sh for generating proper credentials though.
Also documented a pattern for how to get the SWIFT_SERVER_URL automatically