Automatic merge from submit-queue
Incremental improvements to kubelet e2e tests
- Add keep-alive to ssh connection
- Don't try to stop services on image-based runs
- Increase jenkins ci timeout to 90 minutes to accommodate unpredictable go build times
- Remove spammy log statement
Automatic merge from submit-queue
Add some more info to the Jenkins README.
This is a bit of a work-in-progress, and I'd appreciate feedback on what to add or remove. I'm not sure that I need to say so much about the GCS format, and I should probably say some more about JJB.
@kubernetes/sig-testing
Automatic merge from submit-queue
Removing call to update-swagger-spec.sh from update-generated-swagger-docs.sh
Fixes https://github.com/kubernetes/kubernetes/issues/24233
Right now `update-generated-swagger-docs.sh` calls `update-swagger-spec.sh`, but `verify-generated-swagger-docs.sh` does not verify swagger spec (that is done by `verify-swagger-spec.sh`).
Hence, `verify-swagger-spec` breaks if it is called after `verify-generated-swagger-docs`.
Fixing it by removing the call to `update-swagger-spec.sh` from `update-generated-swagger-docs.sh`.
This will require users to run both `update-swagger-spec` and `update-generated-swagger-docs` when they update api types, but they already need to run many more scripts (`update-api-reference-docs`, `update-codegen`).
People should mostly be running hack/update-all.sh directly :)
Automatic merge from submit-queue
Shorten cluster names in GKE Jenkins on Trusty
The PD tests in GKE Jenkins on Trusty fail because the PD name exceeds the limit of 63 characters. The PD name embeds the "E2E_NAME" env variable exported in the Jenkins job configuration. This PR shortens that string for all GKE Jenkins on Trusty jobs, so the PD name now meets the limit.
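A minimal sketch of the length check involved, assuming only the 63-character limit quoted above. The helper `name_fits`, the sample `E2E_NAME` values, and the suffix are hypothetical and are not taken from the Jenkins job configuration:

```shell
#!/usr/bin/env bash
# Assumption: 63 is the PD name limit quoted in the description above.
MAX_NAME_LEN=63

# Hypothetical helper: succeeds if the given name is within the limit.
name_fits() {
  local name=$1
  [ "${#name}" -le "$MAX_NAME_LEN" ]
}

# Hypothetical values, for illustration only.
long_e2e_name="jenkins-gke-e2e-trusty-some-very-long-suite-name"
short_e2e_name="jkns-gke-trusty"
suffix="-0123456789abcdef-pd-test"

for e2e_name in "$long_e2e_name" "$short_e2e_name"; do
  pd_name="${e2e_name}${suffix}"
  if name_fits "$pd_name"; then
    echo "ok (${#pd_name} chars): $pd_name"
  else
    echo "too long (${#pd_name} chars): $pd_name"
  fi
done
```

A check like this could run before a job starts, so an over-long name fails early rather than inside the PD tests.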
Automatic merge from submit-queue
Bump kubernetes-test-go timeout.
It looks like the run times got more inconsistent because of load on the VM. Adding another Jenkins slave improved things so we're not constantly timing out, but it still gets a little close to timing out at times.
Average runtime is ~45 mins so I went with a 100 min timeout.
Fixes #24285
Automatic merge from submit-queue
Remove soak and disruptive 1.1 Jenkins jobs.
They're both in the kubernetes-jenkins project, not their own. The disruptive one isn't a critical build, and I don't think the soak should be critical at all, since it's never green for a week anyway and I don't think we ever plan for it to be.
Automatic merge from submit-queue
Bump upgrade test timeout to 10 hours
@spxtr is it reasonable to expect that running the v1.2 tests in serial would take longer than ~ 5 hours (assuming the upgrade beforehand takes < 1 hour)?
Automatic merge from submit-queue
Run test-go less often on release branches.
I made 1.2 run every 3 hours and 1.1 run every 6 hours. They'll still run right away once a build completes.
I'm going to have to lower the number of executors on the Jenkins slaves that run test-go jobs, since running 3 at a time makes them use up all the CPU and flake.
Automatic merge from submit-queue
Replace tab with eight spaces
This file only uses spaces for indentation, and my text editor highlighted the one tab.
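A stray tab like this can also be found and replaced from the command line. The following is a sketch on a throwaway sample file, since the file touched by this change is not named here:

```shell
#!/usr/bin/env bash
# Hypothetical sample: one line indented with spaces, one with a tab.
tmpfile=$(mktemp)
printf '        indented with spaces\n\tindented with a tab\n' > "$tmpfile"

tab=$(printf '\t')

# Count the lines that contain a tab character.
tab_lines_before=$(grep -c "$tab" "$tmpfile")

# Replace every tab with eight spaces, writing to a second file.
sed "s/${tab}/        /g" "$tmpfile" > "$tmpfile.fixed"

# grep -c exits non-zero when nothing matches, hence the '|| true'.
tab_lines_after=$(grep -c "$tab" "$tmpfile.fixed" || true)

echo "lines with tabs: before=$tab_lines_before after=$tab_lines_after"
rm -f "$tmpfile" "$tmpfile.fixed"
```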
Automatic merge from submit-queue
Make etcd cache size configurable
Instead of the prior 50K limit, allow users to specify a more sensible size for their cluster.
I'm not sure what a sensible default is here. I'm still experimenting on my own clusters. A size of 50 gives me a 270MB max footprint. 50K caused my apiserver to run out of memory once it exceeded 2GB. I believe that number is far too large for most people's use cases.
There are some other fundamental issues that I'm not addressing here:
- Old etcd items are cached and potentially never removed (it stores using modifiedIndex, and doesn't remove the old object when it gets updated)
- Cache isn't LRU, so there's no guarantee the cache remains hot. This makes its performance difficult to predict. More of an issue with a smaller cache size.
- 1.2 etcd entries seem to have a larger memory footprint (I never had an issue in 1.1, even though this cache existed there). I suspect that's due to image lists on the node status.
This is provided as a fix for #23323
Automatic merge from submit-queue
hack: specify --advertise-address in hack/local-up-cluster.sh
This fixes the bug where the script fails to launch an apiserver on a
machine without active networking (issue #24272).
Automatic merge from submit-queue
Fix spacing in usage_from_stdin and info_from_stdin (issue #24186).
If "a" is a bash array, then the syntax to append the contents of $line as a
new element to the array is a+=("$line"), not a+=$line.
The latter syntax just appends to the first element, creating a
long string and thus losing newline information.
Fixing this allows us to drop some empty lines from invocations of
usage_from_stdin.
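The difference between the two append forms can be seen in a small sketch. The functions below are hypothetical stand-ins, not the real `usage_from_stdin` or `info_from_stdin`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads stdin into an array, one element per line,
# and prints the number of elements.
collect_lines() {
  local -a messages=()
  local line
  while IFS= read -r line; do
    messages+=("$line")   # parentheses: append a new array element
  done
  echo "${#messages[@]}"
}

# Same loop with the buggy form.
collect_lines_buggy() {
  local -a messages=()
  local line
  while IFS= read -r line; do
    messages+=$line       # no parentheses: concatenates onto messages[0]
  done
  echo "${#messages[@]}"
}

good_count=$(printf 'one\ntwo\nthree\n' | collect_lines)
bad_count=$(printf 'one\ntwo\nthree\n' | collect_lines_buggy)
echo "with parentheses: $good_count elements; without: $bad_count element"
```

With three input lines, the parenthesized form yields three elements, while the bare form leaves a single element holding the concatenated string.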
Automatic merge from submit-queue
Rename "gcloud-update" jobs to "daily-maintenance" and add Docker cleanup
I'm guessing Jenkins Job Builder won't delete the old job, and we'll need to do that manually?
@spxtr @fejta
Automatic merge from submit-queue
phase 2 of cassandra example overhaul
Here's the next iteration in overhauling this example, towards https://github.com/kubernetes/kubernetes/issues/20961. This removes the pod adoption part, but doesn't (yet) otherwise change any of the resources used.
It also includes some README cleanup, and removes some explicit specification of labels in the rc yaml.
This PR doesn't yet add any commentary on how we're using the seed provider (re: https://github.com/kubernetes/kubernetes/issues/20961#issuecomment-190405959 etc.). Maybe we should add that.
Also: LMK if this PR should include any changes to the links out to the docs.
cc @bgrant0607 @johndmulhausen
Automatic merge from submit-queue
Set metadata.google.internal IP in dockerized e2e based on /etc/hosts
Support the metadata cacher from #24131 inside dockerized e2e runs.
cc @fejta
Automatic merge from submit-queue
Restart job 5m after the previous failure.
If a job flakes at the beginning of its scripts, it will likely sit around doing nothing for 30m, blocking the merge queue. Decreasing this to 5m.