mirror of https://github.com/k3s-io/k3s
5 years ago
18 changed files with 395 additions and 413 deletions
@@ -0,0 +1,116 @@
## K3S Performance Tests
---
These scripts use Terraform to automate building and testing k3s clusters on AWS. They support both normal and HA clusters with N master nodes, N worker nodes, and multiple storage backends, including:
- Postgres RDS
- Etcd
- SQLite

The scripts are divided into three sections:
- server
- agents
- tests
### Server
The server section deploys the storage backend and then deploys N master nodes. The scripts can be customized to use HA mode, a single-node cluster with the sqlite backend, or one master node with an external DB. The instance type and k3s version can also be set; all available options are described in the Variables section below.

The server section also creates one or more agent nodes dedicated to the Prometheus deployment; clusterloader2 deploys Prometheus and Grafana.
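
The three server layouts described above can be sketched as a small shell helper. This is only an illustration: `describe_topology` is a hypothetical function that is not part of the scripts, and it assumes that `SERVER_HA=1` means "enabled" and that a server count of 1 with HA enabled corresponds to the "one master node with an external DB" case.

```bash
#!/bin/bash
# Hypothetical helper (not part of the repository): summarizes which
# topology a given set of config values describes.
# Arguments: SERVER_HA (assumed 1 = enabled), SERVER_COUNT, DB_ENGINE.
describe_topology() {
  local server_ha="$1" server_count="$2" db_engine="$3"
  if [ "$server_ha" != "1" ]; then
    # Without HA mode the README says sqlite is the storage backend.
    echo "single server, sqlite backend"
  elif [ "$server_count" -gt 1 ]; then
    echo "HA: $server_count servers, $db_engine backend"
  else
    echo "single server, external $db_engine backend"
  fi
}
```

For example, `describe_topology 1 3 postgres` describes an HA cluster of three servers backed by postgres.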
### Agents
The agents section deploys the k3s agents. It can be customized with options that control the agent node count and the instance types.
### Tests
The tests section uses a fork of the [clusterloader2](https://github.com/kubernetes/perf-tests/tree/master/clusterloader2) tool; the fork only modifies the logging and removes the etcd metrics probes.

This section uses a dockerized version of the tool, which runs the tests and saves the report in `tests/<test_name>-<random-number>`.

The currently available tests are:
- load test
- density test
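
Because each run writes its report to a directory with a random suffix, it can help to list the report directories for one test. The following is a minimal sketch: `list_reports` is a hypothetical helper, and the base directory argument is an assumption about where the `tests/` folder lives relative to your working directory.

```bash
#!/bin/bash
# Hypothetical helper (not part of the repository): prints every report
# directory matching tests/<test_name>-* under the given base directory.
list_reports() {
  local base_dir="$1" test_name="$2" dir
  for dir in "$base_dir"/tests/"$test_name"-*; do
    # An unmatched glob stays literal, so keep only real directories.
    [ -d "$dir" ] && printf '%s\n' "$dir"
  done
  return 0
}
```

Calling `list_reports . load` from the directory that contains `tests/` would print the load test reports, one per line.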
## Variables
The scripts can be modified by customizing the variables in `scripts/config`. The variables include:

**Main Vars**

| Name | Description |
|:----------------:|:------------------------------------------------------------------------------:|
| CLUSTER_NAME | The cluster name on AWS; this prefixes each component in the cluster |
| DOMAIN_NAME | DNS name of the load balancer for the k3s master(s) |
| ZONE_ID | AWS Route 53 zone ID used for modifying the DNS name |
| K3S_VERSION | K3S version that will be used with the cluster |
| EXTRA_SSH_KEYS | Public ssh keys that will be added to the servers |
| PRIVATE_KEY_PATH | Private ssh key that will be used by clusterloader2 to ssh and collect metrics |
| DEBUG | Debug mode for k3s servers |
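
Before running a build it can be useful to confirm that the variables you rely on are actually set. A minimal sketch follows; `check_required_vars` is a hypothetical helper, and which variables count as required is an assumption you should adjust to your setup.

```bash
#!/bin/bash
# Hypothetical helper (not part of the repository): reports every named
# variable that is unset or empty and returns non-zero if any is missing.
check_required_vars() {
  local name value missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One possible use is to source `scripts/config` and then call `check_required_vars CLUSTER_NAME K3S_VERSION PRIVATE_KEY_PATH` before `make apply`.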
**Database Variables**

| Name | Description |
|:----------------:|:---------------------------------------------------------------------------------------------------:|
| DB_ENGINE | The database type: "mysql", "postgres", or "etcd" |
| DB_INSTANCE_TYPE | The RDS instance type for mysql and postgres; etcd also uses a db.* class, which is parsed internally |
| DB_NAME | Database name, created only for postgres and mysql |
| DB_USERNAME | Database username, created only for postgres and mysql |
| DB_PASSWORD | Database password for the user, created only for postgres and mysql |
| DB_VERSION | Database version |
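
Since `DB_ENGINE` accepts only three values, a typo there is easy to catch early. The sketch below assumes nothing beyond the values listed in the table; `validate_db_engine` is a hypothetical helper, not something the scripts provide.

```bash
#!/bin/bash
# Hypothetical helper (not part of the repository): accepts only the
# DB_ENGINE values documented in the table above.
validate_db_engine() {
  case "$1" in
    mysql|postgres|etcd)
      return 0
      ;;
    *)
      echo "unsupported DB_ENGINE: '$1' (expected mysql, postgres, or etcd)" >&2
      return 1
      ;;
  esac
}
```

For example, `validate_db_engine "$DB_ENGINE" || exit 1` could run right after sourcing the config file.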
**K3S Server Variables**

| Name | Description |
|:--------------------:|:---------------------------------------------------------------------------------:|
| SERVER_HA | Whether or not to use HA mode; if not, sqlite is used as the storage backend |
| SERVER_COUNT | k3s master node count |
| SERVER_INSTANCE_TYPE | EC2 instance type created for k3s server(s) |
**K3S Agent Variables**

| Name | Description |
|:-------------------:|:-----------------------------------------:|
| AGENT_NODE_COUNT | Number of k3s agents that will be created |
| AGENT_INSTANCE_TYPE | EC2 instance type created for k3s agents |
**Prometheus Server Variables**

| Name | Description |
|:-------------------------:|:-------------------------------------------------------------------:|
| PROM_WORKER_NODE_COUNT | Number of k3s agents that will be created for the Prometheus deployment |
| PROM_WORKER_INSTANCE_TYPE | EC2 instance type created for k3s Prometheus agents |
## Usage
### build
The scripts include a Makefile that runs the different sections. To build the masters and workers, adjust the config file in `tests/perf/scripts/config` and then run the following:
```
cd tests/perf
make apply
```
This builds the db, server, and agent layers, and also deploys a kubeconfig file in `tests/kubeconfig.yaml`.
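
Once the build finishes, the generated kubeconfig can be used to check that the nodes registered. A minimal sketch, assuming `kubectl` is installed locally; `use_kubeconfig` is a hypothetical helper, and the default path simply repeats the one mentioned above, so pass a different path if yours differs.

```bash
#!/bin/bash
# Hypothetical helper (not part of the repository): points kubectl at the
# generated kubeconfig, failing early if the file does not exist.
use_kubeconfig() {
  local path="${1:-tests/kubeconfig.yaml}"
  if [ ! -f "$path" ]; then
    echo "kubeconfig not found: $path" >&2
    return 1
  fi
  export KUBECONFIG="$path"
}
```

After `use_kubeconfig`, running `kubectl get nodes` should list the servers and agents that joined the cluster.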
### test
To start the clusterloader2 load test, modify `tests/perf/tests/load/config.yaml` as needed and then run the following:
```
cd tests/perf
make test
```
### destroy
To destroy the cluster, run the following:
```
make destroy
make clean
```
@@ -1,28 +1,34 @@
#################### |
CLUSTER_NAME="hgalal-k3s" |
K3S_VERSION="v0.10.0" |
EXTRA_SSH_KEYS="ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDZBAE6I9J733HJfCBVu7iWSUuJ7th0U4P4IFfpFDca52n/Hk4yFFr8SPR8JJc1n42c3vEVCbExp/MD4ihqEBy9+pLewxA+fkb7UAT4cT2eLfvZdTTVe8KSiw6lVN6tWSoNXmNqY+wH7zWQ04lfjXPa/c01L1n2XwV/O+5xii9vEuSxN9YhfQ/s61SdLFqQ5yS8gPsM0qQW+bFt5KGGbapqztDO+h9lxGbZRcRAKbCzZ5kF1mhjI/+VubTWKtoVLCumjzjYqILYyx9g/mLSo26qjDEZvtwBQB9KLugDAtnalLVp0HgivC5YfLHr8PxViVSHfIIKS2DhUpn07jr8eKi9" |
PRIVATE_KEY_PATH="/home/hussein/.ssh/id_rsa" #this has to be a full path |
CLUSTER_NAME="loadtest-k3s" |
ZONE_ID="" |
K3S_VERSION="v0.11.0-alpha2" |
EXTRA_SSH_KEYS="" # comma separated public keys |
PRIVATE_KEY_PATH="~/.ssh/id_rsa" |
########################## |
DB_ENGINE="postgres" |
DB_INSTANCE_TYPE="db.m4.4xlarge" |
DB_NAME="k3s" |
DB_USERNAME="k3suser" |
DB_PASSWORD="024d9442b3add64b7ef90655bc302cd8" |
########################## |
K3S_HA=1 |
DB_INSTANCE_TYPE="db.m4.4xlarge" |
################################# |
PROM_HOST="prometheus-load.eng.rancher.space" |
GRAF_HOST="prometheus-load.eng.rancher.space" |
########################## |
@@ -0,0 +1,31 @@
#cloud-config |
%{ if length(extra_ssh_keys) > 0 } |
ssh_authorized_keys: |
%{ for ssh_key in extra_ssh_keys } |
- ${ssh_key} |
%{ endfor } |
%{ endif } |
runcmd: |
- echo "net.ipv4.neigh.default.gc_interval = 3600" >> /etc/sysctl.conf |
- echo "net.ipv4.neigh.default.gc_stale_time = 3600" >> /etc/sysctl.conf |
- echo "net.ipv4.neigh.default.gc_thresh3 = 16384" >> /etc/sysctl.conf |
- echo "net.ipv4.neigh.default.gc_thresh2 = 8192" >> /etc/sysctl.conf |
- echo "net.ipv4.neigh.default.gc_thresh1 = 4096" >> /etc/sysctl.conf |
- echo "fs.file-max = 12000500" >> /etc/sysctl.conf |
- echo "fs.nr_open = 20000500" >> /etc/sysctl.conf |
- echo "net.ipv4.tcp_mem = '10000000 10000000 10000000'" >> /etc/sysctl.conf |
- echo "net.ipv4.tcp_rmem = '1024 4096 16384'" >> /etc/sysctl.conf |
- echo "net.ipv4.tcp_wmem = '1024 4096 16384'" >> /etc/sysctl.conf |
- echo "net.core.rmem_max = 16384" >> /etc/sysctl.conf |
- echo "net.core.wmem_max = 16384" >> /etc/sysctl.conf |
- ulimit -n 20000000 |
- echo "# <domain> <type> <item> <value>" >> /etc/security/limits.d/limits.conf |
- echo " * soft nofile 20000" >> /etc/security/limits.d/limits.conf |
- echo " * hard nofile 20000" >> /etc/security/limits.d/limits.conf |
- sysctl -p |
- apt-get update |
- apt-get install -y git vim software-properties-common resolvconf linux-headers-$(uname -r) |
- echo "nameserver" > /etc/resolvconf/resolv.conf.d/tail |
- echo "RateLimitIntervalSec=0" >> /etc/systemd/journald.conf |
- echo "RateLimitBurst=0" >> /etc/systemd/journald.conf |
- curl -sSL https://releases.rancher.com/install-docker/19.03.sh | sh |
@@ -0,0 +1,22 @@
#!/bin/bash
set -x

# PUBLIC_IPS and PRIVATE_IPS are comma-separated lists in matching order.
IFS=',' read -r -a public_ips <<< "$PUBLIC_IPS"
IFS=',' read -r -a private_ips <<< "$PRIVATE_IPS"

# Build the --initial-cluster value: etcd-0=http://<ip>:2380,etcd-1=...
conn_string=""
for i in "${!private_ips[@]}"; do
  conn_string=$conn_string"etcd-$i=http://${private_ips[i]}:2380,"
done
conn_string=${conn_string%?}  # drop the trailing comma

# Start one etcd container per node, retrying every 10 seconds until the ssh command succeeds.
for i in "${!public_ips[@]}"; do
  while true; do
    ssh -i "$SSH_KEY_PATH" -l ubuntu "${public_ips[i]}" "sudo docker run -v /etcd-data:/etcd-data -d -p ${private_ips[i]}:2379:2379 -p ${private_ips[i]}:2380:2380 quay.io/coreos/etcd:$DB_VERSION etcd --initial-advertise-peer-urls http://${private_ips[i]}:2380 --name=etcd-$i --data-dir=/etcd-data --advertise-client-urls= --listen-peer-urls= --listen-client-urls= --initial-cluster-token=etcd-cluster-1 --initial-cluster-state new --initial-cluster $conn_string"
    if [ $? == 0 ]; then
      break
    fi
    sleep 10
  done
done
#
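
The connection-string loop in the script above can be exercised on its own, without any hosts. The following is a minimal sketch that rebuilds the same `etcd-<index>=http://<ip>:2380` list from a comma-separated argument; `build_conn_string` is a hypothetical name, and the IP addresses in the usage note are made-up sample values.

```bash
#!/bin/bash
# Hypothetical standalone version of the conn_string loop: takes a
# comma-separated list of private IPs and prints the value that would be
# passed to etcd's --initial-cluster flag.
build_conn_string() {
  local IFS=',' conn_string="" i=0 ip
  # $1 is intentionally unquoted so it is split on commas.
  for ip in $1; do
    conn_string="${conn_string}etcd-$i=http://$ip:2380,"
    i=$((i + 1))
  done
  # Drop the trailing comma left by the loop.
  printf '%s\n' "${conn_string%,}"
}
```

With the sample input `10.0.0.1,10.0.0.2`, the function prints `etcd-0=http://10.0.0.1:2380,etcd-1=http://10.0.0.2:2380`.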
@@ -1,227 +0,0 @@
%{ if prom_worker_node_count != 0 } |
--- |
apiVersion: rbac.authorization.k8s.io/v1 |
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1 |
kind: ClusterRoleBinding |
metadata: |
name: kube-state-metrics |
roleRef: |
apiGroup: rbac.authorization.k8s.io |
kind: ClusterRole |
name: kube-state-metrics |
subjects: |
- kind: ServiceAccount |
name: kube-state-metrics |
namespace: kube-system |
--- |
apiVersion: rbac.authorization.k8s.io/v1 |
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1 |
kind: ClusterRole |
metadata: |
name: kube-state-metrics |
rules: |
- apiGroups: [""] |
resources: |
- configmaps |
- secrets |
- nodes |
- pods |
- services |
- resourcequotas |
- replicationcontrollers |
- limitranges |
- persistentvolumeclaims |
- persistentvolumes |
- namespaces |
- endpoints |
verbs: ["list", "watch"] |
- apiGroups: ["extensions"] |
resources: |
- daemonsets |
- deployments |
- replicasets |
- ingresses |
verbs: ["list", "watch"] |
- apiGroups: ["apps"] |
resources: |
- daemonsets |
- deployments |
- replicasets |
- statefulsets |
verbs: ["list", "watch"] |
- apiGroups: ["batch"] |
resources: |
- cronjobs |
- jobs |
verbs: ["list", "watch"] |
- apiGroups: ["autoscaling"] |
resources: |
- horizontalpodautoscalers |
verbs: ["list", "watch"] |
- apiGroups: ["policy"] |
resources: |
- poddisruptionbudgets |
verbs: ["list", "watch"] |
- apiGroups: ["certificates.k8s.io"] |
resources: |
- certificatesigningrequests |
verbs: ["list", "watch"] |
- apiGroups: ["storage.k8s.io"] |
resources: |
- storageclasses |
verbs: ["list", "watch"] |
- apiGroups: ["autoscaling.k8s.io"] |
resources: |
- verticalpodautoscalers |
verbs: ["list", "watch"] |
--- |
apiVersion: apps/v1 |
kind: Deployment |
metadata: |
labels: |
k8s-app: kube-state-metrics |
name: kube-state-metrics |
namespace: kube-system |
spec: |
selector: |
matchLabels: |
k8s-app: kube-state-metrics |
replicas: 1 |
template: |
metadata: |
labels: |
k8s-app: kube-state-metrics |
spec: |
serviceAccountName: kube-state-metrics |
containers: |
- name: kube-state-metrics |
image: quay.io/coreos/kube-state-metrics:v1.7.2 |
ports: |
- name: http-metrics |
containerPort: 8080 |
- name: telemetry |
containerPort: 8081 |
livenessProbe: |
httpGet: |
path: /healthz |
port: 8080 |
initialDelaySeconds: 5 |
timeoutSeconds: 5 |
readinessProbe: |
httpGet: |
path: / |
port: 8080 |
initialDelaySeconds: 5 |
timeoutSeconds: 5 |
--- |
apiVersion: v1 |
kind: ServiceAccount |
metadata: |
name: kube-state-metrics |
namespace: kube-system |
--- |
apiVersion: v1 |
kind: Service |
metadata: |
name: kube-state-metrics |
namespace: kube-system |
labels: |
k8s-app: kube-state-metrics |
annotations: |
prometheus.io/scrape: 'true' |
spec: |
ports: |
- name: http-metrics |
port: 8080 |
targetPort: http-metrics |
protocol: TCP |
- name: telemetry |
port: 8081 |
targetPort: telemetry |
protocol: TCP |
selector: |
k8s-app: kube-state-metrics |
--- |
kind: ClusterRoleBinding |
apiVersion: rbac.authorization.k8s.io/v1 |
metadata: |
name: slo-monitor |
subjects: |
- kind: ServiceAccount |
name: slo-monitor |
namespace: kube-system |
roleRef: |
kind: ClusterRole |
name: slo-monitor |
apiGroup: rbac.authorization.k8s.io |
--- |
kind: ClusterRole |
apiVersion: rbac.authorization.k8s.io/v1 |
metadata: |
name: slo-monitor |
namespace: kube-system |
rules: |
- apiGroups: [""] |
resources: ["pods", "events"] |
verbs: ["get", "watch", "list"] |
--- |
apiVersion: v1 |
kind: ServiceAccount |
metadata: |
name: slo-monitor |
namespace: kube-system |
--- |
apiVersion: apps/v1 |
kind: Deployment |
metadata: |
name: slo-monitor |
namespace: kube-system |
labels: |
app: slo-monitor |
spec: |
selector: |
matchLabels: |
app: slo-monitor |
template: |
metadata: |
labels: |
app: slo-monitor |
annotations: |
prometheus.io/scrape: "true" |
spec: |
containers: |
- name: slo-monitor |
image: gcr.io/google-containers/slo-monitor:0.12.0 |
command: |
- /slo-monitor |
- --alsologtostderr=true |
imagePullPolicy: Always |
ports: |
- name: metrics |
containerPort: 8080 |
resources: |
requests: |
cpu: 300m |
memory: 100Mi |
limits: |
cpu: 300m |
memory: 100Mi |
restartPolicy: Always |
serviceAccountName: slo-monitor |
--- |
apiVersion: v1 |
kind: Service |
metadata: |
name: slo-monitor |
namespace: kube-system |
labels: |
app: slo-monitor |
spec: |
selector: |
app: slo-monitor |
ports: |
- name: metrics |
port: 80 |
targetPort: metrics |
type: ClusterIP |
%{ endif } |
@@ -1,86 +0,0 @@
%{ if prom_worker_node_count != 0 } |
--- |
apiVersion: v1 |
kind: Namespace |
metadata: |
name: monitoring |
--- |
apiVersion: helm.cattle.io/v1 |
kind: HelmChart |
metadata: |
name: prometheus |
namespace: kube-system |
spec: |
chart: https://raw.githubusercontent.com/galal-hussein/charts/master/prometheus-9.2.0.tgz |
targetNamespace: monitoring |
valuesContent: |- |
alertmanager: |
nodeSelector: |
prom: "true" |
persistentVolume: |
enabled: false |
kubeStateMetrics: |
nodeSelector: |
prom: "true" |
nodeExporter: |
nodeSelector: |
prom: "true" |
server: |
nodeSelector: |
prom: "true" |
ingress: |
enabled: true |
hosts: |
- ${prom_host} |
persistentVolume: |
enabled: false |
pushgateway: |
nodeSelector: |
prom: "true" |
persistentVolume: |
enabled: false |
serverFiles: |
prometheus.yml: |
scrape_configs: |
- job_name: prometheus |
static_configs: |
- targets: |
- localhost:9090 |
- job_name: kubernetes-apiservers |
scrape_interval: 10s |
scrape_timeout: 10s |
metrics_path: /metrics |
scheme: https |
kubernetes_sd_configs: |
- api_server: null |
role: endpoints |
namespaces: |
names: [] |
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token |
tls_config: |
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt |
insecure_skip_verify: true |
relabel_configs: |
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name] |
separator: ; |
regex: default;kubernetes;https |
replacement: $1 |
action: keep |
--- |
apiVersion: helm.cattle.io/v1 |
kind: HelmChart |
metadata: |
name: grafana |
namespace: kube-system |
spec: |
chart: stable/grafana |
targetNamespace: monitoring |
valuesContent: |- |
ingress: |
enabled: true |
hosts: |
- ${graf_host} |
nodeSelector: |
prom: "true" |
%{ endif } |