Automatic merge from submit-queue
Install pip in kubekins test image and install AWS cli when needed
Follow-on to https://github.com/kubernetes/test-infra/pull/66. This fixes AWS hopefully.
I haven't pushed the image yet.
cc @fejta
Automatic merge from submit-queue
Include static docker binary in kubekins-test image
Fixes kubernetes/test-infra/issues/47.
I haven't pushed this image yet, so I expect CI to fail for now.
The staging images are now created with image families, so we can get rid of the
image indices stored in GCS. Also, get images based on milestone number instead
of "image type".
Automatic merge from submit-queue
Prep for continuous Docker validation test
```release-note
Add a test config variable to specify desired Docker version to run on GCI.
```
We want to continuously validate Docker releases (#25215), on GCI. This change
adds a new test config variable, `KUBE_GCI_DOCKER_VERSION`, through which we can
specify which version of Docker we want to run on the master and nodes. This
change also patches the Jenkins e2e-runner with the ability to fetch the latest
Docker (pre)release, and sets the aforementioned variable accordingly.
Tested on my local Jenkins instance that was able to start a cluster with the latest Docker version (different from installed version) running on both master and nodes.
@dchen1107 Can you review?
cc/ @andyzheng0831 for changes in `cluster/gce/gci/helper.sh`, and @ixdy @spxtr for changes to the Jenkins e2e-runner
cc/ @kubernetes/goog-image
Automatic merge from submit-queue
Avoid duplicate building in Jenkins unit/integration job
Partially adopts #26392: don't run `hack/build-go.sh` in the unit/integration job, since we do that already for e2e.
We do need to still build kubectl, however, so do that in `hack/test-cmd.sh`.
x-ref #25940
Pull the latest build of the release the server is running, rather
than matching exact version. This allows GKE to pick up test fixes
from branch head, instead of waiting for a patch.
Automatic merge from submit-queue
Always skew the client when running version-skew tests
As mentioned in the comments:
- for upgrade jobs, we want kubectl to be at the same version as master.
- for client skew tests, we want to use the skewed kubectl (that's what we're testing).
So, remove gate on `JENKINS_USE_SKEW_KUBECTL`, and always use the skewed `kubectl`.
This should go in before https://github.com/kubernetes/test-infra/pull/109.
We want to continuously validate Docker releases (#25215), on GCI. This change
adds a new test config variable, `KUBE_GCI_DOCKER_VERSION`, through which we can
specify which version of Docker we want to run on the master and nodes. This
change also patches the Jenkins e2e-runner with the ability to fetch the latest
Docker (pre)release, and sets the aforementioned variable accordingly.
Jenkins relies on junit.xml to identify test failures
and non-0 exit codes to indentify infrastructure failures.
Test failures in kubemark tests should not cause the test
script to exit non-0. Infrastructure failures should.
- Add function to dump cluster logs without exiting (refactor)
- Change `test/kubemark/stop-kubemark.sh` to be run regardless of whether tests fail or not
- Exit code for failed tests overwritten to be the exit code of dumping the cluster logs
Automatic merge from submit-queue
e2e-runner: Drop Trusty support in favor of GCI
Now that GCI (Google Container-VM Image) is out, we will start running e2e tests
with it instead of Ubuntu Trusty. This change updates `e2e-runner.sh` to replace
the Trusty related logic with GCI.
_Note that this change has to go in the same time as https://github.com/kubernetes/test-infra/pull/54_
@spxtr Can you review?
cc/ @andyzheng0831 @kubernetes/goog-image
Automatic merge from submit-queue
Fix JENKINS_USE_SKEW_KUBECTL
I got this logic wrong; the first is a NOT comparison, so the second should only be available if that NOT comparison returns true.
Now that GCI (Google Container-VM Image) is out, we will start running e2e tests
with it instead of Ubuntu Trusty. This change updates `e2e-runner.sh` to replace
the Trusty related logic with GCI.
Automatic merge from submit-queue
Use --format='value(name)' with gcloud instead of grep/awk/cut
Fixing our fragile parsing of `gcloud` is getting old (#24746, #25159, maybe others?).
Instead, let's just get the proper output out of `gcloud` in the first place.
Automatic merge from submit-queue
Update e2e-runner.sh so it can fetch multiple types of Trusty images
Trusty beta images now work with k8s 1.2.
@spxtr, @andyzheng0831 Can you review this?
Automatic merge from submit-queue
Jenkins: Clean up even if we can't bring cluster up
We're gathering all the cluster logs, go ahead and clean up the
resources.
Automatic merge from submit-queue
Map secret files into dockerized e2e
Rather than copying the GCE and AWS private keys/credentials to each Jenkins VM, we can put them in credentials and map them through.
This is one half of the change; if the relevant environment variables are set, we'll mount the files in.
cc @fejta @rmmh @apelisse
Automatic merge from submit-queue
Run Kubemark builds inside Docker
Since Docker-in-Docker is tricky to get right (esp. wrt volume mounts), I'm only enabling it when necessary for a build (e.g. for kubemark).
cc @spxtr @fejta @wojtek-t
Automatic merge from submit-queue
Up to go 1.6.2 for build and test.
~~1.6.1 contains some security fixes. 1.6.2 should be out soon.~~ 1.6.2 is out :D
Images aren't pushed yet.
Automatic merge from submit-queue
Script to cache metadata requests on the jenkins master
Fixes https://github.com/kubernetes/kubernetes/issues/23545
Create an http server which caches most requests to the metadata server. Use special logic to cache access tokens such that the expires_on json field is correct. Add a script to simplify enabling/disabling the cache by editing `/etc/hosts`
Automatic merge from submit-queue
jenkins: Allow configuration of release bucket
This allows others to leverage the existing E2E code to test some
patched kube binary by simply overriding the bucket and reusing many of
the existing scripts
Automatic merge from submit-queue
disable linkcheck jenkins job
I don't have time to fix linkcheck any soon, so temporarily disable the job.
ref #23162
Runs a 1000 node GKE parallel e2e test. On demand only. We'll add more
tests as I see what actually works - this is going to have some
flakiness on its own.
Automatic merge from submit-queue
Disable heapster job which has been broken for a month
https://github.com/kubernetes/kubernetes/issues/23538
This job is no longer producing a useful signal. http://kubekins.dls.corp.google.com/ shows that the last pass was nearly two months ago. I would like to disable the job until someone has the chance to fix it so we are not wasting jenkins resources, contributing to system instability.
Automatic merge from submit-queue
Incremental improvements to kubelet e2e tests
- Add keep-alive to ssh connection
- Don't try to stop services on image-based runs
- Increase jenkins ci timeout to 90 minutes to accomadate unpredictable go build times
- Remove spammy log statement
Automatic merge from submit-queue
Add some more info to the Jenkins README.
This is a bit of a work-in-progress, and I'd appreciate feedback on what to add or remove. I'm not sure that I need to say so much about the GCS format, and I should probably say some more about JJB.
@kubernetes/sig-testing
Automatic merge from submit-queue
Shorten cluster names in GKE Jenkins on Trusty
We identified an issue that the PD tests in GKE Jenkins on Trusty fail because the PD name is longer than the limit of 63 characters. The PD name embeds the "E2E_NAME" env variable exported in the Jenkins job configuration. This PR shortens that string for all GKE Jenkins on Trusty. As a result, the PD name will meet the limit requirement.
Automatic merge from submit-queue
Bump kubernetes-test-go timeout.
It looks like the run times got more inconsistent because of load on the VM. Adding another Jenkins slave improved things so we're not constantly timing out, but it still gets a little close to timing out at times.
Average runtime is ~45 mins so I went with a 100 min timeout.
Fixes#24285
Automatic merge from submit-queue
Remove soak and disruptive 1.1 Jenkins jobs.
They're both in the kubernetes-jenkins project, not their own. The disruptive one isn't a critical build, and I don't think the soak should be critical at all, since it's never green for a week anyway and I don't think we ever plan for it to be.
Automatic merge from submit-queue
Bump upgrade test timout to 10 hours
@spxtr is it reasonable to expect that running the v1.2 tests in serial would take longer than ~ 5 hours (assuming the upgrade beforehand takes < 1 hour)?
Automatic merge from submit-queue
Run test-go less often on release branches.
I made 1.2 run every 3 hours and 1.1 run every 6 hours. They'll still run right away once a build completes.
I'm going to have to lower the number of executors on the Jenkins slaves that run test-go jobs, since running 3 at a time makes them use up all the CPU and flake.
- Add keep-alive to ssh connection
- Don't try to stop services on image-based runs
- Increase jenkins ci timeout to 90 minutes to accomadate unpredictable go build times
- Remove spammy log statement
Automatic merge from submit-queue
Rename "gcloud-update" jobs to "daily-maintenace" and add Docker cleanup
I'm guessing Jenkins Job Builder won't delete the old job, and we'll need to do that manually?
@spxtr @fejta
Automatic merge from submit-queue
Set metadata.google.internal IP in dockerized e2e based on /etc/hosts
Support the metadata cacher from #24131 inside dockerized e2e runs.
cc @fejta
Automatic merge from submit-queue
Restart job 5m after the previous failure.
If a job flakes at the beginning of it scripts, it will likely sit around doing nothing for 30m blocking the merge queue. Decreasing this to 5m.
This allows others to leverage the existing E2E code to test some
patched kube binary by simply overriding the bucket and reusing many of
the existing scripts
This makes it easier to determine which tests cause particular suites to
fail.
All static HTML pages are now generated by one invocation of gen_html.py.
- make index include good/flake/fail numbers for each link
- consistently use % for string interpolation