Fix bad config in flaky test documentation and add script to help check
for flakes.

parent 15c57efde2
commit d3d71df943
@@ -11,7 +11,7 @@ There is a testing image ```brendanburns/flake``` up on the docker hub. We will
 
 Create a replication controller with the following config:
 ```yaml
-id: flakeController
+id: flakecontroller
 kind: ReplicationController
 apiVersion: v1beta1
 desiredState:
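The hunk above shows only the head of the config, and the next hunk's context line shows it ends with ```labels:```. A complete ```controller.yaml``` in this v1beta1 shape might look roughly like the sketch below; the replica count and the ```brendanburns/flake``` image come from the doc itself, while the ```name: flake``` selector/label values are assumptions for illustration:

```yaml
id: flakecontroller
kind: ReplicationController
apiVersion: v1beta1
desiredState:
  replicas: 24                  # "spin up 24 instances", per the text below
  replicaSelector:
    name: flake                 # assumed selector key/value
  podTemplate:
    desiredState:
      manifest:
        version: v1beta1
        id: flakecontroller
        containers:
          - name: flake
            image: brendanburns/flake   # the test image named in the doc
    labels:
      name: flake               # must match replicaSelector
labels:
  name: flake
```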
@@ -41,14 +41,26 @@ labels:
 ```./cluster/kubectl.sh create -f controller.yaml```
 
-This will spin up 100 instances of the test. They will run to completion, then exit, the kubelet will restart them, eventually you will have sufficient
-runs for your purposes, and you can stop the replication controller by setting the ```replicas``` field to 0 and then running:
+This will spin up 24 instances of the test. They will run to completion, then exit, and the kubelet will restart them, accumulating more and more runs of the test.
+You can examine the recent runs of the test by calling ```docker ps -a``` and looking for tasks that exited with non-zero exit codes. Unfortunately, ```docker ps -a``` only keeps around the exit status of the last 15-20 containers with the same image, so you have to check them frequently.
+You can use this script to automate checking for failures, assuming your cluster is running on GCE and has four nodes:
 
 ```sh
-./cluster/kubectl.sh update -f controller.yaml
-./cluster/kubectl.sh delete -f controller.yaml
+echo "" > output.txt
+for i in {1..4}; do
+  echo "Checking kubernetes-minion-${i}"
+  echo "kubernetes-minion-${i}:" >> output.txt
+  gcloud compute ssh "kubernetes-minion-${i}" --command="sudo docker ps -a" >> output.txt
+done
+grep "Exited ([^0])" output.txt
 ```
 
-Now examine the machines with ```docker ps -a``` and look for tasks that exited with non-zero exit codes (ignore those that exited -1, since that's what happens when you stop the replica controller)
+Eventually you will have sufficient runs for your purposes. At that point you can stop and delete the replication controller by running:
+
+```sh
+./cluster/kubectl.sh stop replicationcontroller flakecontroller
+```
+
+If you do a final check for flakes with ```docker ps -a```, ignore tasks that exited -1, since that's what happens when you stop the replication controller.
 
 Happy flake hunting!
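
For reference, a failed run shows up in ```docker ps -a``` output with a status like ```Exited (1) About an hour ago```; that is what the ```grep "Exited ([^0])"``` pattern keys on. Note the pattern also matches ```Exited (-1)```, the status the doc says to ignore once the controller is stopped. A variant of the checker that inspects only the test image and tallies failures per node might look like this sketch (the node names and image come from the doc; the rest is illustrative):

```sh
#!/bin/bash
# Sketch: count non-zero exits of the flake test image per GCE node.
# Assumes four nodes named kubernetes-minion-1..4, as in the doc above.
for i in {1..4}; do
  node="kubernetes-minion-${i}"
  # List all containers on the node, keep only the test image,
  # then count rows whose status is Exited with a non-zero code.
  failures=$(gcloud compute ssh "${node}" --command="sudo docker ps -a" \
    | grep "brendanburns/flake" \
    | grep -c "Exited ([^0])")
  echo "${node}: ${failures} failed run(s)"
done
```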