mirror of https://github.com/k3s-io/k3s
galal-hussein
5 years ago
18 changed files with 395 additions and 413 deletions
## K3S Performance Tests

---

These scripts use Terraform to automate building and testing k3s clusters on AWS. They support both normal and HA clusters with N server nodes, N agent nodes, and multiple storage backends, including:

- MySQL RDS
- Postgres RDS
- Etcd
- SQLite

The scripts are divided into three sections:

- server
- agents
- tests

### Server

The server section deploys the storage backend and then N server (master) nodes. The scripts can be customized to use HA mode or a single-node cluster with the SQLite backend, and they also support a single server node with an external DB. The instance type and k3s version can be customized as well; all available options are described in the variables section below.

The server section also creates one or more agent nodes dedicated to the Prometheus deployment; clusterloader2 will deploy Prometheus and Grafana on them.

### Agents

The agents section deploys the k3s agents. It can be customized with options that control the agent node count and the instance types.

### Tests

The tests section uses a fork of the [clusterloader2](https://github.com/kubernetes/perf-tests/tree/master/clusterloader2) tool; the fork only modifies the logging and removes the etcd metrics probes.

This section uses a dockerized version of the tool, which runs the tests and saves the report in `tests/<test_name>-<random-number>`.
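As a concrete illustration of the report path, a directory name of that shape can be produced like this (the real scripts generate their own random suffix; `$RANDOM` here is just a stand-in):

```shell
# sketch of the tests/<test_name>-<random-number> naming scheme
test_name="load"
suffix=$RANDOM                            # bash pseudo-random stand-in for the real suffix
report_dir="tests/${test_name}-${suffix}"
echo "$report_dir"                        # e.g. tests/load-21937
```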

The currently available tests are:

- load test
- density test

## Variables

The scripts can be customized via the variables in `scripts/config`. The variables include:
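Since `scripts/config` is a plain file of shell `KEY="value"` assignments (its contents appear later on this page), a wrapper script can consume it simply by sourcing it. A minimal sketch of that pattern, using an illustrative temp file rather than the real config:

```shell
# write a tiny config in the same KEY="value" format as scripts/config
# (path and values here are illustrative only)
cat > /tmp/demo-config <<'EOF'
CLUSTER_NAME="loadtest-k3s"
SERVER_COUNT=3
EOF

# source it the way a wrapper script would, then use the variables
. /tmp/demo-config
echo "cluster=$CLUSTER_NAME servers=$SERVER_COUNT"
# prints: cluster=loadtest-k3s servers=3
```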

**Main Vars**

| Name | Description |
|:----------------:|:------------------------------------------------------------------------------:|
| CLUSTER_NAME | The cluster name on AWS; this will prefix each component in the cluster |
| DOMAIN_NAME | DNS name of the load balancer for the k3s server(s) |
| ZONE_ID | AWS Route 53 zone ID used to modify the DNS name |
| K3S_VERSION | k3s version that will be used with the cluster |
| EXTRA_SSH_KEYS | Public SSH keys that will be added to the servers |
| PRIVATE_KEY_PATH | Private SSH key that clusterloader2 will use to SSH in and collect metrics |
| DEBUG | Debug mode for the k3s servers |

**Database Variables**

| Name | Description |
|:----------------:|:---------------------------------------------------------------------------------------------------:|
| DB_ENGINE | The database type; one of "mysql", "postgres", or "etcd" |
| DB_INSTANCE_TYPE | The RDS instance type for mysql and postgres; etcd uses the db.* class as well, as it is parsed internally |
| DB_NAME | Database name; created only for postgres and mysql |
| DB_USERNAME | Database username; created only for postgres and mysql |
| DB_PASSWORD | Database password for the created user; postgres and mysql only |
| DB_VERSION | Database version |
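For context on how the mysql/postgres variables are consumed: k3s accepts an external storage backend as a single connection URI via its `--datastore-endpoint` flag, so these variables ultimately combine into a string of roughly this shape (the host below is hypothetical; the real one comes from the provisioned RDS instance):

```shell
# sketch: combining the DB_* variables into a k3s datastore endpoint URI
DB_ENGINE="postgres"
DB_USERNAME="k3suser"
DB_PASSWORD="example-password"            # placeholder, not a real credential
DB_HOST="mydb.example.rds.amazonaws.com"  # hypothetical RDS hostname
DB_NAME="k3s"

echo "${DB_ENGINE}://${DB_USERNAME}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}"
# prints: postgres://k3suser:example-password@mydb.example.rds.amazonaws.com:5432/k3s
```

Note this is only the postgres shape on its default port 5432; k3s's mysql DSN format differs (a Go-style `tcp(host:port)` address), so treat this strictly as a sketch.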

**K3S Server Variables**

| Name | Description |
|:--------------------:|:---------------------------------------------------------------------------------:|
| SERVER_HA | Whether or not to use HA mode; if not, SQLite will be used as the storage backend |
| SERVER_COUNT | k3s server (master) node count |
| SERVER_INSTANCE_TYPE | EC2 instance type for the k3s server(s) |

**K3S Agent Variables**

| Name | Description |
|:-------------------:|:-----------------------------------------:|
| AGENT_NODE_COUNT | Number of k3s agents that will be created |
| AGENT_INSTANCE_TYPE | EC2 instance type for the k3s agents |

**Prometheus Server Variables**

| Name | Description |
|:-------------------------:|:-------------------------------------------------------------------:|
| PROM_WORKER_NODE_COUNT | Number of k3s agents created for the Prometheus deployment |
| PROM_WORKER_INSTANCE_TYPE | EC2 instance type for the Prometheus agent nodes |

## Usage

### build

The repo includes a Makefile that runs the different sections. To build the server and agent layers, adjust the config file in `tests/perf/scripts/config` and then run:

```
cd tests/perf
make apply
```

This builds the db, server, and agent layers, and also writes a kubeconfig file to `tests/kubeconfig.yaml`.
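Once `make apply` has finished, standard tooling can be pointed at that kubeconfig; for example (assuming you are still in `tests/perf` and have `kubectl` installed):

```shell
# point kubectl (or other tools) at the kubeconfig written by make apply
export KUBECONFIG="$PWD/tests/kubeconfig.yaml"
echo "$KUBECONFIG"
# kubectl get nodes   # would list the k3s servers and agents once the cluster is up
```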

### test

To start the clusterloader2 load test, modify `tests/perf/tests/load/config.yaml` as needed and then run:

```
cd tests/perf
make test
```

### destroy

To destroy the cluster, run:

```
make destroy
make clean
```

## MAIN VARIABLES ##
####################
CLUSTER_NAME="loadtest-k3s"
DOMAIN_NAME=""
ZONE_ID=""
K3S_VERSION="v0.11.0-alpha2"
EXTRA_SSH_KEYS="" # comma separated public keys
PRIVATE_KEY_PATH="~/.ssh/id_rsa"
DEBUG=1

## K3S DB VARIABLES ##
##########################
DB_ENGINE="postgres"
DB_INSTANCE_TYPE="db.m4.4xlarge"
DB_NAME="k3s"
DB_USERNAME="k3suser"
DB_PASSWORD="024d9442b3add64b7ef90655bc302cd8"
DB_VERSION=11.5

## K3S SERVER VARIABLES ##
##########################
SERVER_HA=1
SERVER_COUNT=3
SERVER_INSTANCE_TYPE="m5.2xlarge"

## PROMETHEUS SERVER VARIABLES ##
#################################
PROM_WORKER_NODE_COUNT=1
PROM_WORKER_INSTANCE_TYPE="m5.large"
PROM_HOST="prometheus-load.eng.rancher.space"
GRAF_HOST="prometheus-load.eng.rancher.space"

## K3S AGENTS VARIABLES ##
##########################
AGENT_NODE_COUNT=100
AGENT_INSTANCE_TYPE="m5.large"

#cloud-config
%{ if length(extra_ssh_keys) > 0 }
ssh_authorized_keys:
%{ for ssh_key in extra_ssh_keys }
- ${ssh_key}
%{ endfor }
%{ endif }
runcmd:
- echo "net.ipv4.neigh.default.gc_interval = 3600" >> /etc/sysctl.conf
- echo "net.ipv4.neigh.default.gc_stale_time = 3600" >> /etc/sysctl.conf
- echo "net.ipv4.neigh.default.gc_thresh3 = 16384" >> /etc/sysctl.conf
- echo "net.ipv4.neigh.default.gc_thresh2 = 8192" >> /etc/sysctl.conf
- echo "net.ipv4.neigh.default.gc_thresh1 = 4096" >> /etc/sysctl.conf
- echo "fs.file-max = 12000500" >> /etc/sysctl.conf
- echo "fs.nr_open = 20000500" >> /etc/sysctl.conf
- echo "net.ipv4.tcp_mem = '10000000 10000000 10000000'" >> /etc/sysctl.conf
- echo "net.ipv4.tcp_rmem = '1024 4096 16384'" >> /etc/sysctl.conf
- echo "net.ipv4.tcp_wmem = '1024 4096 16384'" >> /etc/sysctl.conf
- echo "net.core.rmem_max = 16384" >> /etc/sysctl.conf
- echo "net.core.wmem_max = 16384" >> /etc/sysctl.conf
- ulimit -n 20000000
- echo "# <domain> <type> <item> <value>" >> /etc/security/limits.d/limits.conf
- echo " * soft nofile 20000" >> /etc/security/limits.d/limits.conf
- echo " * hard nofile 20000" >> /etc/security/limits.d/limits.conf
- sysctl -p
- apt-get update
- apt-get install -y git vim software-properties-common resolvconf linux-headers-$(uname -r)
- echo "nameserver 1.1.1.1" > /etc/resolvconf/resolv.conf.d/tail
- echo "RateLimitIntervalSec=0" >> /etc/systemd/journald.conf
- echo "RateLimitBurst=0" >> /etc/systemd/journald.conf
- curl -sSL https://releases.rancher.com/install-docker/19.03.sh | sh

#!/bin/bash
set -x

IFS=',' read -r -a public_ips <<< "$PUBLIC_IPS"
IFS=',' read -r -a private_ips <<< "$PRIVATE_IPS"

conn_string=""
for i in "${!private_ips[@]}"; do
  conn_string=$conn_string"etcd-$i=http://${private_ips[i]}:2380,"
done
conn_string=${conn_string%?}
for i in "${!public_ips[@]}"; do
  while true; do
    ssh -i $SSH_KEY_PATH -l ubuntu ${public_ips[i]} "sudo docker run -v /etcd-data:/etcd-data -d -p ${private_ips[i]}:2379:2379 -p ${private_ips[i]}:2380:2380 quay.io/coreos/etcd:$DB_VERSION etcd --initial-advertise-peer-urls http://${private_ips[i]}:2380 --name=etcd-$i --data-dir=/etcd-data --advertise-client-urls=http://0.0.0.0:2379 --listen-peer-urls=http://0.0.0.0:2380 --listen-client-urls=http://0.0.0.0:2379 --initial-cluster-token=etcd-cluster-1 --initial-cluster-state new --initial-cluster $conn_string"
    if [ $? == 0 ]; then
      break
    fi
    sleep 10
  done
done
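The loop above assembles etcd's `--initial-cluster` value by concatenating `name=peer-url` pairs and then trimming the trailing comma with `${conn_string%?}`. Isolated with made-up IPs, the construction behaves like this:

```shell
# same conn_string construction as the script, with example private IPs
PRIVATE_IPS="10.0.1.10,10.0.1.11,10.0.1.12"
IFS=',' read -r -a private_ips <<< "$PRIVATE_IPS"

conn_string=""
for i in "${!private_ips[@]}"; do
  conn_string=$conn_string"etcd-$i=http://${private_ips[i]}:2380,"
done
conn_string=${conn_string%?}   # strip the trailing comma

echo "$conn_string"
# prints: etcd-0=http://10.0.1.10:2380,etcd-1=http://10.0.1.11:2380,etcd-2=http://10.0.1.12:2380
```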

%{ if prom_worker_node_count != 0 }
---
apiVersion: rbac.authorization.k8s.io/v1
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs: ["list", "watch"]
- apiGroups: ["extensions"]
  resources:
  - daemonsets
  - deployments
  - replicasets
  - ingresses
  verbs: ["list", "watch"]
- apiGroups: ["apps"]
  resources:
  - daemonsets
  - deployments
  - replicasets
  - statefulsets
  verbs: ["list", "watch"]
- apiGroups: ["batch"]
  resources:
  - cronjobs
  - jobs
  verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
  resources:
  - horizontalpodautoscalers
  verbs: ["list", "watch"]
- apiGroups: ["policy"]
  resources:
  - poddisruptionbudgets
  verbs: ["list", "watch"]
- apiGroups: ["certificates.k8s.io"]
  resources:
  - certificatesigningrequests
  verbs: ["list", "watch"]
- apiGroups: ["storage.k8s.io"]
  resources:
  - storageclasses
  verbs: ["list", "watch"]
- apiGroups: ["autoscaling.k8s.io"]
  resources:
  - verticalpodautoscalers
  verbs: ["list", "watch"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: kube-state-metrics
  name: kube-state-metrics
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-state-metrics
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        image: quay.io/coreos/kube-state-metrics:v1.7.2
        ports:
        - name: http-metrics
          containerPort: 8080
        - name: telemetry
          containerPort: 8081
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
        readinessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: kube-system
  labels:
    k8s-app: kube-state-metrics
  annotations:
    prometheus.io/scrape: 'true'
spec:
  ports:
  - name: http-metrics
    port: 8080
    targetPort: http-metrics
    protocol: TCP
  - name: telemetry
    port: 8081
    targetPort: telemetry
    protocol: TCP
  selector:
    k8s-app: kube-state-metrics
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: slo-monitor
subjects:
- kind: ServiceAccount
  name: slo-monitor
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: slo-monitor
  apiGroup: rbac.authorization.k8s.io
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: slo-monitor
  namespace: kube-system
rules:
- apiGroups: [""]
  resources: ["pods", "events"]
  verbs: ["get", "watch", "list"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: slo-monitor
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: slo-monitor
  namespace: kube-system
  labels:
    app: slo-monitor
spec:
  selector:
    matchLabels:
      app: slo-monitor
  template:
    metadata:
      labels:
        app: slo-monitor
      annotations:
        prometheus.io/scrape: "true"
    spec:
      containers:
      - name: slo-monitor
        image: gcr.io/google-containers/slo-monitor:0.12.0
        command:
        - /slo-monitor
        - --alsologtostderr=true
        imagePullPolicy: Always
        ports:
        - name: metrics
          containerPort: 8080
        resources:
          requests:
            cpu: 300m
            memory: 100Mi
          limits:
            cpu: 300m
            memory: 100Mi
      restartPolicy: Always
      serviceAccountName: slo-monitor
---
apiVersion: v1
kind: Service
metadata:
  name: slo-monitor
  namespace: kube-system
  labels:
    app: slo-monitor
spec:
  selector:
    app: slo-monitor
  ports:
  - name: metrics
    port: 80
    targetPort: metrics
  type: ClusterIP
%{ endif }

%{ if prom_worker_node_count != 0 }
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring

---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: prometheus
  namespace: kube-system
spec:
  chart: https://raw.githubusercontent.com/galal-hussein/charts/master/prometheus-9.2.0.tgz
  targetNamespace: monitoring
  valuesContent: |-
    alertmanager:
      nodeSelector:
        prom: "true"
      persistentVolume:
        enabled: false
    kubeStateMetrics:
      nodeSelector:
        prom: "true"
    nodeExporter:
      nodeSelector:
        prom: "true"
    server:
      nodeSelector:
        prom: "true"
      ingress:
        enabled: true
        hosts:
        - ${prom_host}
      persistentVolume:
        enabled: false
    pushgateway:
      nodeSelector:
        prom: "true"
      persistentVolume:
        enabled: false
    serverFiles:
      prometheus.yml:
        scrape_configs:
        - job_name: prometheus
          static_configs:
          - targets:
            - localhost:9090
        - job_name: kubernetes-apiservers
          scrape_interval: 10s
          scrape_timeout: 10s
          metrics_path: /metrics
          scheme: https
          kubernetes_sd_configs:
          - api_server: null
            role: endpoints
            namespaces:
              names: []
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: true
          relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            separator: ;
            regex: default;kubernetes;https
            replacement: $1
            action: keep
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: grafana
  namespace: kube-system
spec:
  chart: stable/grafana
  targetNamespace: monitoring
  valuesContent: |-
    ingress:
      enabled: true
      hosts:
      - ${graf_host}
    nodeSelector:
      prom: "true"
%{ endif }