Merge pull request #44500 from Cynerva/gkk/cdk-1.6-support

Automatic merge from submit-queue (batch tested with PRs 43000, 44500, 44457, 44553, 44267)

Add Kubernetes 1.6 support to Juju charms

**What this PR does / why we need it**:

This adds Kubernetes 1.6 support to Juju charms.

This includes some large architectural changes in order to support multiple versions of Kubernetes with a single release of the charms. It also includes a few bug fixes for issues we discovered during testing.
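One visible effect of the snap migration is that operators can pick which snap channel the Kubernetes services are installed from via the new `channel` config option (added to the charm config below, defaulting to "stable"). A minimal sketch of how that surfaces, assuming a Juju 2.x client; the channel value shown is illustrative:

```shell
# Switch the master services to a different snap channel; the charm reacts
# to the config change and re-installs the snaps from that channel.
juju config kubernetes-master channel=beta
juju status kubernetes-master
```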

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

Thanks to @marcoceppi, @ktsakalozos, @jacekn, @mbruzek, @tvansteenburgh for their work in this feature branch as well!

**Release note**:

```release-note
Add Kubernetes 1.6 support to Juju charms
Add metric collection to charms for autoscaling
Update kubernetes-e2e charm to fail when test suite fails
Update Juju charms to use snaps
Add registry action to the kubernetes-worker charm
Add support for kube-proxy cluster-cidr option to kubernetes-worker charm
Fix kubernetes-master charm starting services before TLS certs are saved
Fix kubernetes-worker charm failures in LXD
Fix stop hook failure on kubernetes-worker charm
Fix handling of juju kubernetes-worker.restart-needed state
Fix nagios checks in charms
```
Kubernetes Submit Queue 2017-04-18 13:19:06 -07:00 committed by GitHub
commit 09e3fdbafe
49 changed files with 946 additions and 993 deletions

cluster/juju/.gitignore

@@ -0,0 +1,2 @@
+builds
+deps


@@ -1,5 +1,6 @@
 repo: https://github.com/kubernetes/kubernetes.git
 includes:
+- 'layer:metrics'
 - 'layer:nagios'
 - 'layer:nginx'
 - 'layer:tls-client'


@@ -0,0 +1,2 @@
+metrics:
+  juju-units: {}


@@ -73,7 +73,7 @@ a deployed cluster. The following example will skip the `Flaky`, `Slow`, and
 `Feature` labeled tests:
 ```shell
-juju run-action kubernetes-e2e/0 skip='\[(Flaky|Slow|Feature:.*)\]'
+juju run-action kubernetes-e2e/0 test skip='\[(Flaky|Slow|Feature:.*)\]'
 ```
 > Note: the escaping of the regex due to how bash handles brackets.


@@ -45,3 +45,7 @@ tar -czf $ACTION_LOG_TGZ ${JUJU_ACTION_UUID}.log
 action-set log="$ACTION_LOG_TGZ"
 action-set junit="$ACTION_JUNIT_TGZ"
+
+if tail ${JUJU_ACTION_UUID}.log | grep -q "Test Suite Failed"; then
+  action-fail "Failure detected in the logs"
+fi


@@ -28,6 +28,9 @@ import os
 import sys
 
+os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
+
+
 def main():
     ''' Control logic to enlist Ceph RBD volumes as PersistentVolumes in
     Kubernetes. This will invoke the validation steps, and only execute if


@@ -21,3 +21,8 @@ options:
       privileged mode. If "auto", kube-apiserver will not run in privileged
       mode by default, but will switch to privileged mode if gpu hardware is
       detected on a worker node.
+  channel:
+    type: string
+    default: "stable"
+    description: |
+      Snap channel to install Kubernetes master services from


@@ -1,6 +1,8 @@
 #!/bin/sh
 set -ux
 
+export PATH=$PATH:/snap/bin
+
 alias kubectl="kubectl --kubeconfig=/home/ubuntu/config"
 kubectl cluster-info > $DEBUG_SCRIPT_DIR/cluster-info


@@ -2,12 +2,8 @@
 set -ux
 
 for service in kube-apiserver kube-controller-manager kube-scheduler; do
-  systemctl status $service > $DEBUG_SCRIPT_DIR/$service-systemctl-status
-  journalctl -u $service > $DEBUG_SCRIPT_DIR/$service-journal
+  systemctl status snap.$service.daemon > $DEBUG_SCRIPT_DIR/$service-systemctl-status
+  journalctl -u snap.$service.daemon > $DEBUG_SCRIPT_DIR/$service-journal
 done
 
-mkdir -p $DEBUG_SCRIPT_DIR/etc-default
-cp -v /etc/default/kube* $DEBUG_SCRIPT_DIR/etc-default
-mkdir -p $DEBUG_SCRIPT_DIR/lib-systemd-system
-cp -v /lib/systemd/system/kube* $DEBUG_SCRIPT_DIR/lib-systemd-system
+# FIXME: grab snap config or something


@@ -1,9 +1,11 @@
 repo: https://github.com/kubernetes/kubernetes.git
 includes:
 - 'layer:basic'
+- 'layer:snap'
 - 'layer:tls-client'
 - 'layer:leadership'
 - 'layer:debug'
+- 'layer:metrics'
 - 'layer:nagios'
 - 'interface:ceph-admin'
 - 'interface:etcd'
@@ -17,10 +19,8 @@ options:
   packages:
   - socat
   tls-client:
-    ca_certificate_path: '/srv/kubernetes/ca.crt'
-    server_certificate_path: '/srv/kubernetes/server.crt'
-    server_key_path: '/srv/kubernetes/server.key'
-    client_certificate_path: '/srv/kubernetes/client.crt'
-    client_key_path: '/srv/kubernetes/client.key'
-tactics:
-- 'tactics.update_addons.UpdateAddonsTactic'
+    ca_certificate_path: '/root/cdk/ca.crt'
+    server_certificate_path: '/root/cdk/server.crt'
+    server_key_path: '/root/cdk/server.key'
+    client_certificate_path: '/root/cdk/client.crt'
+    client_key_path: '/root/cdk/client.key'


@@ -17,10 +17,6 @@
 import re
 import subprocess
 
-from charmhelpers.core import unitdata
-
-BIN_VERSIONS = 'bin_versions'
-
 
 def get_version(bin_name):
     """Get the version of an installed Kubernetes binary.
@@ -33,31 +29,6 @@ def get_version(bin_name):
     >>> `get_version('kubelet')
     (1, 6, 0)
 
-    """
-    db = unitdata.kv()
-    bin_versions = db.get(BIN_VERSIONS, {})
-
-    cached_version = bin_versions.get(bin_name)
-    if cached_version:
-        return tuple(cached_version)
-
-    version = _get_bin_version(bin_name)
-    bin_versions[bin_name] = list(version)
-    db.set(BIN_VERSIONS, bin_versions)
-    return version
-
-
-def reset_versions():
-    """Reset the cache of bin versions.
-    """
-    db = unitdata.kv()
-    db.unset(BIN_VERSIONS)
-
-
-def _get_bin_version(bin_name):
-    """Get a binary version by calling it with --version and parsing output.
     """
     cmd = '{} --version'.format(bin_name).split()
     version_string = subprocess.check_output(cmd).decode('utf-8')


@@ -118,6 +118,13 @@ class FlagManager:
         """
         return self.data.get(key, default)
 
+    def destroy_all(self):
+        '''
+        Destructively removes all data from the FlagManager.
+        '''
+        self.data.clear()
+        self.__save()
+
     def to_s(self):
         '''
         Render the flags to a single string, prepared for the Docker


@@ -37,7 +37,23 @@ requires:
   ceph-storage:
     interface: ceph-admin
 resources:
-  kubernetes:
+  kubectl:
     type: file
-    filename: kubernetes.tar.gz
-    description: "A tarball packaged release of the kubernetes bins."
+    filename: kubectl.snap
+    description: kubectl snap
+  kube-apiserver:
+    type: file
+    filename: kube-apiserver.snap
+    description: kube-apiserver snap
+  kube-controller-manager:
+    type: file
+    filename: kube-controller-manager.snap
+    description: kube-controller-manager snap
+  kube-scheduler:
+    type: file
+    filename: kube-scheduler.snap
+    description: kube-scheduler snap
+  cdk-addons:
+    type: file
+    filename: cdk-addons.snap
+    description: CDK addons snap
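The per-snap resources declared above also suggest a path for offline installs: the snaps can be attached as local files, which the snap layer will generally prefer over fetching from the store. A hedged sketch, assuming a Juju 2.x client (`juju attach`); the local file names are illustrative:

```shell
# Attach locally built snaps as charm resources; names match metadata.yaml.
juju attach kubernetes-master kubectl=./kubectl.snap
juju attach kubernetes-master kube-apiserver=./kube-apiserver.snap
juju attach kubernetes-master kube-controller-manager=./kube-controller-manager.snap
juju attach kubernetes-master kube-scheduler=./kube-scheduler.snap
juju attach kubernetes-master cdk-addons=./cdk-addons.snap
```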


@@ -0,0 +1,34 @@
+metrics:
+  juju-units: {}
+  pods:
+    type: gauge
+    description: number of pods
+    command: /snap/bin/kubectl get po --all-namespaces | tail -n+2 | wc -l
+  services:
+    type: gauge
+    description: number of services
+    command: /snap/bin/kubectl get svc --all-namespaces | tail -n+2 | wc -l
+  replicasets:
+    type: gauge
+    description: number of replicasets
+    command: /snap/bin/kubectl get rs --all-namespaces | tail -n+2 | wc -l
+  replicationcontrollers:
+    type: gauge
+    description: number of replicationcontrollers
+    command: /snap/bin/kubectl get rc --all-namespaces | tail -n+2 | wc -l
+  nodes:
+    type: gauge
+    description: number of kubernetes nodes
+    command: /snap/bin/kubectl get nodes | tail -n+2 | wc -l
+  persistentvolume:
+    type: gauge
+    description: number of pv
+    command: /snap/bin/kubectl get pv --all-namespaces | tail -n+2 | wc -l
+  persistentvolumeclaims:
+    type: gauge
+    description: number of claims
+    command: /snap/bin/kubectl get pvc --all-namespaces | tail -n+2 | wc -l
+  serviceaccounts:
+    type: gauge
+    description: number of sa
+    command: /snap/bin/kubectl get sa --all-namespaces | tail -n+2 | wc -l
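With layer:metrics pulled in and these gauges defined, Juju records the collected samples on the controller. A hedged sketch of inspecting them, assuming a Juju 2.x client where the metrics commands are available:

```shell
# Trigger an immediate collection on one unit, then list the recorded values.
juju collect-metrics kubernetes-master/0
juju metrics kubernetes-master/0
```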


@@ -16,7 +16,9 @@
 import base64
 import os
+import re
 import random
+import shutil
 import socket
 import string
 import json
@@ -24,18 +26,19 @@ import json
 
 import charms.leadership
 
 from shlex import split
-from subprocess import call
 from subprocess import check_call
 from subprocess import check_output
 from subprocess import CalledProcessError
 
 from charms import layer
+from charms.layer import snap
 from charms.reactive import hook
 from charms.reactive import remove_state
 from charms.reactive import set_state
+from charms.reactive import is_state
 from charms.reactive import when, when_any, when_not
 from charms.reactive.helpers import data_changed
-from charms.kubernetes.common import get_version, reset_versions
+from charms.kubernetes.common import get_version
 from charms.kubernetes.flagmanager import FlagManager
 
 from charmhelpers.core import hookenv
@@ -46,15 +49,12 @@ from charmhelpers.fetch import apt_install
 from charmhelpers.contrib.charmsupport import nrpe
 
 
-dashboard_templates = [
-    'dashboard-controller.yaml',
-    'dashboard-service.yaml',
-    'influxdb-grafana-controller.yaml',
-    'influxdb-service.yaml',
-    'grafana-service.yaml',
-    'heapster-controller.yaml',
-    'heapster-service.yaml'
-]
+# Override the default nagios shortname regex to allow periods, which we
+# need because our bin names contain them (e.g. 'snap.foo.daemon'). The
+# default regex in charmhelpers doesn't allow periods, but nagios itself does.
+nrpe.Check.shortname_re = '[\.A-Za-z0-9-_]+$'
+
+os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
 
 
 def service_cidr():
@ -74,66 +74,91 @@ def freeze_service_cidr():
@hook('upgrade-charm') @hook('upgrade-charm')
def reset_states_for_delivery(): def reset_states_for_delivery():
'''An upgrade charm event was triggered by Juju, react to that here.''' '''An upgrade charm event was triggered by Juju, react to that here.'''
migrate_from_pre_snaps()
install_snaps()
remove_state('authentication.setup')
remove_state('kubernetes-master.components.started')
def rename_file_idempotent(source, destination):
if os.path.isfile(source):
os.rename(source, destination)
def migrate_from_pre_snaps():
# remove old states
remove_state('kubernetes.components.installed')
remove_state('kubernetes.dashboard.available')
remove_state('kube-dns.available')
remove_state('kubernetes-master.app_version.set')
# disable old services
services = ['kube-apiserver', services = ['kube-apiserver',
'kube-controller-manager', 'kube-controller-manager',
'kube-scheduler'] 'kube-scheduler']
for service in services: for service in services:
hookenv.log('Stopping {0} service.'.format(service)) hookenv.log('Stopping {0} service.'.format(service))
host.service_stop(service) host.service_stop(service)
remove_state('kubernetes-master.components.started')
remove_state('kubernetes-master.components.installed')
remove_state('kube-dns.available')
remove_state('kubernetes.dashboard.available')
# rename auth files
os.makedirs('/root/cdk', exist_ok=True)
rename_file_idempotent('/etc/kubernetes/serviceaccount.key',
'/root/cdk/serviceaccount.key')
rename_file_idempotent('/srv/kubernetes/basic_auth.csv',
'/root/cdk/basic_auth.csv')
rename_file_idempotent('/srv/kubernetes/known_tokens.csv',
'/root/cdk/known_tokens.csv')
@when_not('kubernetes-master.components.installed') # cleanup old files
def install(): files = [
'''Unpack and put the Kubernetes master files on the path.''' "/lib/systemd/system/kube-apiserver.service",
# Get the resource via resource_get "/lib/systemd/system/kube-controller-manager.service",
try: "/lib/systemd/system/kube-scheduler.service",
archive = hookenv.resource_get('kubernetes') "/etc/default/kube-defaults",
except Exception: "/etc/default/kube-apiserver.defaults",
message = 'Error fetching the kubernetes resource.' "/etc/default/kube-controller-manager.defaults",
hookenv.log(message) "/etc/default/kube-scheduler.defaults",
hookenv.status_set('blocked', message) "/srv/kubernetes",
return "/home/ubuntu/kubectl",
"/usr/local/bin/kubectl",
if not archive: "/usr/local/bin/kube-apiserver",
hookenv.log('Missing kubernetes resource.') "/usr/local/bin/kube-controller-manager",
hookenv.status_set('blocked', 'Missing kubernetes resource.') "/usr/local/bin/kube-scheduler",
return "/etc/kubernetes"
# Handle null resource publication, we check if filesize < 1mb
filesize = os.stat(archive).st_size
if filesize < 1000000:
hookenv.status_set('blocked', 'Incomplete kubernetes resource.')
return
hookenv.status_set('maintenance', 'Unpacking kubernetes resource.')
files_dir = os.path.join(hookenv.charm_dir(), 'files')
os.makedirs(files_dir, exist_ok=True)
command = 'tar -xvzf {0} -C {1}'.format(archive, files_dir)
hookenv.log(command)
check_call(split(command))
apps = [
{'name': 'kube-apiserver', 'path': '/usr/local/bin'},
{'name': 'kube-controller-manager', 'path': '/usr/local/bin'},
{'name': 'kube-scheduler', 'path': '/usr/local/bin'},
{'name': 'kubectl', 'path': '/usr/local/bin'},
] ]
for file in files:
if os.path.isdir(file):
hookenv.log("Removing directory: " + file)
shutil.rmtree(file)
elif os.path.isfile(file):
hookenv.log("Removing file: " + file)
os.remove(file)
for app in apps: # clear the flag managers
unpacked = '{}/{}'.format(files_dir, app['name']) FlagManager('kube-apiserver').destroy_all()
app_path = os.path.join(app['path'], app['name']) FlagManager('kube-controller-manager').destroy_all()
install = ['install', '-v', '-D', unpacked, app_path] FlagManager('kube-scheduler').destroy_all()
hookenv.log(install)
check_call(install)
reset_versions()
set_state('kubernetes-master.components.installed') def install_snaps():
channel = hookenv.config('channel')
hookenv.status_set('maintenance', 'Installing kubectl snap')
snap.install('kubectl', channel=channel, classic=True)
hookenv.status_set('maintenance', 'Installing kube-apiserver snap')
snap.install('kube-apiserver', channel=channel)
hookenv.status_set('maintenance',
'Installing kube-controller-manager snap')
snap.install('kube-controller-manager', channel=channel)
hookenv.status_set('maintenance', 'Installing kube-scheduler snap')
snap.install('kube-scheduler', channel=channel)
hookenv.status_set('maintenance', 'Installing cdk-addons snap')
snap.install('cdk-addons', channel=channel)
set_state('kubernetes-master.snaps.installed')
@when('config.changed.channel')
def channel_changed():
install_snaps()
@when('cni.connected') @when('cni.connected')
@ -145,20 +170,18 @@ def configure_cni(cni):
@when('leadership.is_leader') @when('leadership.is_leader')
@when('kubernetes-master.components.installed')
@when_not('authentication.setup') @when_not('authentication.setup')
def setup_leader_authentication(): def setup_leader_authentication():
'''Setup basic authentication and token access for the cluster.''' '''Setup basic authentication and token access for the cluster.'''
api_opts = FlagManager('kube-apiserver') api_opts = FlagManager('kube-apiserver')
controller_opts = FlagManager('kube-controller-manager') controller_opts = FlagManager('kube-controller-manager')
service_key = '/etc/kubernetes/serviceaccount.key' service_key = '/root/cdk/serviceaccount.key'
basic_auth = '/srv/kubernetes/basic_auth.csv' basic_auth = '/root/cdk/basic_auth.csv'
known_tokens = '/srv/kubernetes/known_tokens.csv' known_tokens = '/root/cdk/known_tokens.csv'
api_opts.add('--basic-auth-file', basic_auth) api_opts.add('basic-auth-file', basic_auth)
api_opts.add('--token-auth-file', known_tokens) api_opts.add('token-auth-file', known_tokens)
api_opts.add('--service-cluster-ip-range', service_cidr())
hookenv.status_set('maintenance', 'Rendering authentication templates.') hookenv.status_set('maintenance', 'Rendering authentication templates.')
if not os.path.isfile(basic_auth): if not os.path.isfile(basic_auth):
setup_basic_auth('admin', 'admin', 'admin') setup_basic_auth('admin', 'admin', 'admin')
@ -167,13 +190,13 @@ def setup_leader_authentication():
setup_tokens(None, 'kubelet', 'kubelet') setup_tokens(None, 'kubelet', 'kubelet')
setup_tokens(None, 'kube_proxy', 'kube_proxy') setup_tokens(None, 'kube_proxy', 'kube_proxy')
# Generate the default service account token key # Generate the default service account token key
os.makedirs('/etc/kubernetes', exist_ok=True) os.makedirs('/root/cdk', exist_ok=True)
if not os.path.isfile(service_key):
cmd = ['openssl', 'genrsa', '-out', service_key, cmd = ['openssl', 'genrsa', '-out', service_key,
'2048'] '2048']
check_call(cmd) check_call(cmd)
api_opts.add('--service-account-key-file', service_key) api_opts.add('service-account-key-file', service_key)
controller_opts.add('--service-account-private-key-file', service_key) controller_opts.add('service-account-private-key-file', service_key)
# read service account key for syndication # read service account key for syndication
leader_data = {} leader_data = {}
@ -184,27 +207,25 @@ def setup_leader_authentication():
# this is slightly opaque, but we are sending file contents under its file # this is slightly opaque, but we are sending file contents under its file
# path as a key. # path as a key.
# eg: # eg:
# {'/etc/kubernetes/serviceaccount.key': 'RSA:2471731...'} # {'/root/cdk/serviceaccount.key': 'RSA:2471731...'}
charms.leadership.leader_set(leader_data) charms.leadership.leader_set(leader_data)
set_state('authentication.setup') set_state('authentication.setup')
@when_not('leadership.is_leader') @when_not('leadership.is_leader')
@when('kubernetes-master.components.installed')
@when_not('authentication.setup') @when_not('authentication.setup')
def setup_non_leader_authentication(): def setup_non_leader_authentication():
api_opts = FlagManager('kube-apiserver') api_opts = FlagManager('kube-apiserver')
controller_opts = FlagManager('kube-controller-manager') controller_opts = FlagManager('kube-controller-manager')
service_key = '/etc/kubernetes/serviceaccount.key' service_key = '/root/cdk/serviceaccount.key'
basic_auth = '/srv/kubernetes/basic_auth.csv' basic_auth = '/root/cdk/basic_auth.csv'
known_tokens = '/srv/kubernetes/known_tokens.csv' known_tokens = '/root/cdk/known_tokens.csv'
# This races with other codepaths, and seems to require being created first # This races with other codepaths, and seems to require being created first
# This block may be extracted later, but for now seems to work as intended # This block may be extracted later, but for now seems to work as intended
os.makedirs('/etc/kubernetes', exist_ok=True) os.makedirs('/root/cdk', exist_ok=True)
os.makedirs('/srv/kubernetes', exist_ok=True)
hookenv.status_set('maintenance', 'Rendering authentication templates.') hookenv.status_set('maintenance', 'Rendering authentication templates.')
@ -225,23 +246,22 @@ def setup_non_leader_authentication():
with open(k, 'w+') as fp: with open(k, 'w+') as fp:
fp.write(contents) fp.write(contents)
api_opts.add('--basic-auth-file', basic_auth) api_opts.add('basic-auth-file', basic_auth)
api_opts.add('--token-auth-file', known_tokens) api_opts.add('token-auth-file', known_tokens)
api_opts.add('--service-cluster-ip-range', service_cidr()) api_opts.add('service-account-key-file', service_key)
api_opts.add('--service-account-key-file', service_key) controller_opts.add('service-account-private-key-file', service_key)
controller_opts.add('--service-account-private-key-file', service_key)
set_state('authentication.setup') set_state('authentication.setup')
@when('kubernetes-master.components.installed') @when('kubernetes-master.snaps.installed')
def set_app_version(): def set_app_version():
''' Declare the application version to juju ''' ''' Declare the application version to juju '''
version = check_output(['kube-apiserver', '--version']) version = check_output(['kube-apiserver', '--version'])
hookenv.application_version_set(version.split(b' v')[-1].rstrip()) hookenv.application_version_set(version.split(b' v')[-1].rstrip())
@when('kube-dns.available', 'kubernetes-master.components.installed') @when('cdk-addons.configured')
def idle_status(): def idle_status():
''' Signal at the end of the run that we are running. ''' ''' Signal at the end of the run that we are running. '''
if not all_kube_system_pods_running(): if not all_kube_system_pods_running():
@ -253,25 +273,25 @@ def idle_status():
hookenv.status_set('active', 'Kubernetes master running.') hookenv.status_set('active', 'Kubernetes master running.')
@when('etcd.available', 'kubernetes-master.components.installed', @when('etcd.available', 'tls_client.server.certificate.saved',
'certificates.server.cert.available', 'authentication.setup') 'authentication.setup')
@when_not('kubernetes-master.components.started') @when_not('kubernetes-master.components.started')
def start_master(etcd, tls): def start_master(etcd):
'''Run the Kubernetes master components.''' '''Run the Kubernetes master components.'''
hookenv.status_set('maintenance', hookenv.status_set('maintenance',
'Rendering the Kubernetes master systemd files.') 'Configuring the Kubernetes master services.')
freeze_service_cidr() freeze_service_cidr()
handle_etcd_relation(etcd) handle_etcd_relation(etcd)
# Use the etcd relation object to render files with etcd information. configure_master_services()
render_files()
hookenv.status_set('maintenance', hookenv.status_set('maintenance',
'Starting the Kubernetes master services.') 'Starting the Kubernetes master services.')
services = ['kube-apiserver', services = ['kube-apiserver',
'kube-controller-manager', 'kube-controller-manager',
'kube-scheduler'] 'kube-scheduler']
for service in services: for service in services:
hookenv.log('Starting {0} service.'.format(service)) host.service_restart('snap.%s.daemon' % service)
host.service_start(service)
hookenv.open_port(6443) hookenv.open_port(6443)
set_state('kubernetes-master.components.started') set_state('kubernetes-master.components.started')
@ -345,63 +365,28 @@ def push_api_data(kube_api):
kube_api.set_api_port('6443') kube_api.set_api_port('6443')
@when('kubernetes-master.components.started', 'kube-dns.available')
@when_not('kubernetes.dashboard.available')
def install_dashboard_addons():
''' Launch dashboard addons if they are enabled in config '''
if hookenv.config('enable-dashboard-addons'):
hookenv.log('Launching kubernetes dashboard.')
context = {}
context['arch'] = arch()
try:
context['pillar'] = {'num_nodes': get_node_count()}
for template in dashboard_templates:
create_addon(template, context)
set_state('kubernetes.dashboard.available')
except CalledProcessError:
hookenv.log('Kubernetes dashboard waiting on kubeapi')
@when('kubernetes-master.components.started', 'kubernetes.dashboard.available')
def remove_dashboard_addons():
''' Removes dashboard addons if they are disabled in config '''
if not hookenv.config('enable-dashboard-addons'):
hookenv.log('Removing kubernetes dashboard.')
for template in dashboard_templates:
delete_addon(template)
remove_state('kubernetes.dashboard.available')
@when('kubernetes-master.components.started') @when('kubernetes-master.components.started')
@when_not('kube-dns.available') def configure_cdk_addons():
def start_kube_dns(): ''' Configure CDK addons '''
''' State guard to starting DNS ''' dbEnabled = str(hookenv.config('enable-dashboard-addons')).lower()
hookenv.status_set('maintenance', 'Deploying KubeDNS') args = [
'arch=' + arch(),
context = { 'dns-ip=' + get_dns_ip(),
'arch': arch(), 'dns-domain=' + hookenv.config('dns_domain'),
# The dictionary named 'pillar' is a construct of the k8s template file 'enable-dashboard=' + dbEnabled
'pillar': { ]
'dns_server': get_dns_ip(), check_call(['snap', 'set', 'cdk-addons'] + args)
'dns_replicas': 1,
'dns_domain': hookenv.config('dns_domain')
}
}
try: try:
create_addon('kubedns-sa.yaml', context) check_call(['cdk-addons.apply'])
create_addon('kubedns-cm.yaml', context)
create_addon('kubedns-controller.yaml', context)
create_addon('kubedns-svc.yaml', context)
except CalledProcessError: except CalledProcessError:
hookenv.status_set('waiting', 'Waiting to retry KubeDNS deployment') hookenv.status_set('waiting', 'Waiting to retry addon deployment')
remove_state('cdk-addons.configured')
return return
set_state('cdk-addons.configured')
set_state('kube-dns.available')
@when('kubernetes-master.components.installed', 'loadbalancer.available', @when('loadbalancer.available', 'certificates.ca.available',
'certificates.ca.available', 'certificates.client.cert.available') 'certificates.client.cert.available')
def loadbalancer_kubeconfig(loadbalancer, ca, client): def loadbalancer_kubeconfig(loadbalancer, ca, client):
# Get the potential list of loadbalancers from the relation object. # Get the potential list of loadbalancers from the relation object.
hosts = loadbalancer.get_addresses_ports() hosts = loadbalancer.get_addresses_ports()
@ -413,8 +398,7 @@ def loadbalancer_kubeconfig(loadbalancer, ca, client):
build_kubeconfig(server) build_kubeconfig(server)
@when('kubernetes-master.components.installed', @when('certificates.ca.available', 'certificates.client.cert.available')
'certificates.ca.available', 'certificates.client.cert.available')
@when_not('loadbalancer.available') @when_not('loadbalancer.available')
def create_self_config(ca, client): def create_self_config(ca, client):
'''Create a kubernetes configuration for the master unit.''' '''Create a kubernetes configuration for the master unit.'''
@ -520,8 +504,11 @@ def initial_nrpe_config(nagios=None):
@when_any('config.changed.nagios_context', @when_any('config.changed.nagios_context',
'config.changed.nagios_servicegroups') 'config.changed.nagios_servicegroups')
def update_nrpe_config(unused=None): def update_nrpe_config(unused=None):
services = ('kube-apiserver', 'kube-controller-manager', 'kube-scheduler') services = (
'snap.kube-apiserver.daemon',
'snap.kube-controller-manager.daemon',
'snap.kube-scheduler.daemon'
)
hostname = nrpe.get_nagios_hostname() hostname = nrpe.get_nagios_hostname()
current_unit = nrpe.get_nagios_unit_name() current_unit = nrpe.get_nagios_unit_name()
nrpe_setup = nrpe.NRPE(hostname=hostname) nrpe_setup = nrpe.NRPE(hostname=hostname)
@ -535,7 +522,11 @@ def remove_nrpe_config(nagios=None):
remove_state('nrpe-external-master.initial-config') remove_state('nrpe-external-master.initial-config')
# List of systemd services for which the checks will be removed # List of systemd services for which the checks will be removed
services = ('kube-apiserver', 'kube-controller-manager', 'kube-scheduler') services = (
'snap.kube-apiserver.daemon',
'snap.kube-controller-manager.daemon',
'snap.kube-scheduler.daemon'
)
# The current nrpe-external-master interface doesn't handle a lot of logic, # The current nrpe-external-master interface doesn't handle a lot of logic,
# use the charm-helpers code for now. # use the charm-helpers code for now.
@ -546,45 +537,15 @@ def remove_nrpe_config(nagios=None):
nrpe_setup.remove_check(shortname=service) nrpe_setup.remove_check(shortname=service)
def set_privileged(privileged, render_config=True): def is_privileged():
"""Update the KUBE_ALLOW_PRIV flag for kube-apiserver and re-render config. """Return boolean indicating whether or not to set allow-privileged=true.
If the flag already matches the requested value, this is a no-op.
:param str privileged: "true" or "false"
:param bool render_config: whether to render new config file
:return: True if the flag was changed, else false
""" """
if privileged == "true": privileged = hookenv.config('allow-privileged')
set_state('kubernetes-master.privileged') if privileged == 'auto':
return is_state('kubernetes-master.gpu.enabled')
else: else:
remove_state('kubernetes-master.privileged') return privileged == 'true'
flag = '--allow-privileged'
kube_allow_priv_opts = FlagManager('KUBE_ALLOW_PRIV')
if kube_allow_priv_opts.get(flag) == privileged:
# Flag isn't changing, nothing to do
return False
hookenv.log('Setting {}={}'.format(flag, privileged))
# Update --allow-privileged flag value
kube_allow_priv_opts.add(flag, privileged, strict=True)
# re-render config with new options
if render_config:
context = {
'kube_allow_priv': kube_allow_priv_opts.to_s(),
}
# render the kube-defaults file
render('kube-defaults.defaults', '/etc/default/kube-defaults', context)
# signal that we need a kube-apiserver restart
set_state('kubernetes-master.kube-apiserver.restart')
return True
@when('config.changed.allow-privileged') @when('config.changed.allow-privileged')
@ -593,24 +554,10 @@ def on_config_allow_privileged_change():
"""React to changed 'allow-privileged' config value. """React to changed 'allow-privileged' config value.
""" """
config = hookenv.config() remove_state('kubernetes-master.components.started')
privileged = config['allow-privileged']
if privileged == "auto":
return
set_privileged(privileged)
remove_state('config.changed.allow-privileged') remove_state('config.changed.allow-privileged')
@when('kubernetes-master.kube-apiserver.restart')
def restart_kube_apiserver():
"""Restart kube-apiserver.
"""
host.service_restart('kube-apiserver')
remove_state('kubernetes-master.kube-apiserver.restart')
@when('kube-control.gpu.available') @when('kube-control.gpu.available')
@when('kubernetes-master.components.started') @when('kubernetes-master.components.started')
@when_not('kubernetes-master.gpu.enabled') @when_not('kubernetes-master.gpu.enabled')
@ -628,7 +575,7 @@ def on_gpu_available(kube_control):
) )
return return
set_privileged("true") remove_state('kubernetes-master.components.started')
set_state('kubernetes-master.gpu.enabled') set_state('kubernetes-master.gpu.enabled')
@ -642,32 +589,6 @@ def disable_gpu_mode():
remove_state('kubernetes-master.gpu.enabled') remove_state('kubernetes-master.gpu.enabled')
def create_addon(template, context):
'''Create an addon from a template'''
source = 'addons/' + template
target = '/etc/kubernetes/addons/' + template
render(source, target, context)
# Need --force when upgrading between k8s versions where the templates have
# changed.
cmd = ['kubectl', 'apply', '--force', '-f', target]
check_call(cmd)
def delete_addon(template):
'''Delete an addon from a template'''
target = '/etc/kubernetes/addons/' + template
cmd = ['kubectl', 'delete', '-f', target]
call(cmd)
def get_node_count():
'''Return the number of Kubernetes nodes in the cluster'''
cmd = ['kubectl', 'get', 'nodes', '-o', 'name']
output = check_output(cmd)
node_count = len(output.splitlines())
return node_count
def arch(): def arch():
'''Return the package architecture as a string. Raise an exception if the '''Return the package architecture as a string. Raise an exception if the
architecture is not supported by kubernetes.''' architecture is not supported by kubernetes.'''
@ -695,16 +616,10 @@ def build_kubeconfig(server):
# Cache last server string to know if we need to regenerate the config. # Cache last server string to know if we need to regenerate the config.
if not data_changed('kubeconfig.server', server): if not data_changed('kubeconfig.server', server):
return return
# The final destination of the kubeconfig and kubectl.
destination_directory = '/home/ubuntu'
# Create an absolute path for the kubeconfig file. # Create an absolute path for the kubeconfig file.
kubeconfig_path = os.path.join(destination_directory, 'config') kubeconfig_path = os.path.join(os.sep, 'home', 'ubuntu', 'config')
# Create the kubeconfig on this system so users can access the cluster. # Create the kubeconfig on this system so users can access the cluster.
create_kubeconfig(kubeconfig_path, server, ca, key, cert) create_kubeconfig(kubeconfig_path, server, ca, key, cert)
# Copy the kubectl binary to the destination directory.
cmd = ['install', '-v', '-o', 'ubuntu', '-g', 'ubuntu',
'/usr/local/bin/kubectl', destination_directory]
check_call(cmd)
# Make the config file readable by the ubuntu users so juju scp works. # Make the config file readable by the ubuntu users so juju scp works.
cmd = ['chown', 'ubuntu:ubuntu', kubeconfig_path] cmd = ['chown', 'ubuntu:ubuntu', kubeconfig_path]
check_call(cmd) check_call(cmd)
@ -753,7 +668,7 @@ def handle_etcd_relation(reldata):
etcd declares itself as available''' etcd declares itself as available'''
connection_string = reldata.get_connection_string() connection_string = reldata.get_connection_string()
# Define where the etcd tls files will be kept. # Define where the etcd tls files will be kept.
etcd_dir = '/etc/ssl/etcd' etcd_dir = '/root/cdk/etcd'
# Create paths to the etcd client ca, key, and cert file locations. # Create paths to the etcd client ca, key, and cert file locations.
ca = os.path.join(etcd_dir, 'client-ca.pem') ca = os.path.join(etcd_dir, 'client-ca.pem')
key = os.path.join(etcd_dir, 'client-key.pem') key = os.path.join(etcd_dir, 'client-key.pem')
@ -767,38 +682,28 @@ def handle_etcd_relation(reldata):
# Never use stale data, always prefer whats coming in during context # Never use stale data, always prefer whats coming in during context
# building. if its stale, its because whats in unitdata is stale # building. if its stale, its because whats in unitdata is stale
data = api_opts.data data = api_opts.data
if data.get('--etcd-servers-strict') or data.get('--etcd-servers'): if data.get('etcd-servers-strict') or data.get('etcd-servers'):
api_opts.destroy('--etcd-cafile') api_opts.destroy('etcd-cafile')
api_opts.destroy('--etcd-keyfile') api_opts.destroy('etcd-keyfile')
api_opts.destroy('--etcd-certfile') api_opts.destroy('etcd-certfile')
api_opts.destroy('--etcd-servers', strict=True) api_opts.destroy('etcd-servers', strict=True)
api_opts.destroy('--etcd-servers') api_opts.destroy('etcd-servers')
# Set the apiserver flags in the options manager # Set the apiserver flags in the options manager
api_opts.add('--etcd-cafile', ca) api_opts.add('etcd-cafile', ca)
api_opts.add('--etcd-keyfile', key) api_opts.add('etcd-keyfile', key)
api_opts.add('--etcd-certfile', cert) api_opts.add('etcd-certfile', cert)
api_opts.add('--etcd-servers', connection_string, strict=True) api_opts.add('etcd-servers', connection_string, strict=True)
def render_files(): def configure_master_services():
'''Use jinja templating to render the docker-compose.yml and master.json ''' Add remaining flags for the master services and configure snaps to use
file to contain the dynamic data for the configuration files.''' them '''
context = {}
config = hookenv.config()
# Add the charm configuration data to the context.
context.update(config)
# Update the context with extra values: arch, and networking information
context.update({'arch': arch(),
'master_address': hookenv.unit_get('private-address'),
'public_address': hookenv.unit_get('public-address'),
'private_address': hookenv.unit_get('private-address')})
api_opts = FlagManager('kube-apiserver') api_opts = FlagManager('kube-apiserver')
controller_opts = FlagManager('kube-controller-manager') controller_opts = FlagManager('kube-controller-manager')
scheduler_opts = FlagManager('kube-scheduler') scheduler_opts = FlagManager('kube-scheduler')
scheduler_opts.add('--v', '2') scheduler_opts.add('v', '2')
# Get the tls paths from the layer data. # Get the tls paths from the layer data.
layer_options = layer.options('tls-client') layer_options = layer.options('tls-client')
@ -808,23 +713,27 @@ def render_files():
server_cert_path = layer_options.get('server_certificate_path') server_cert_path = layer_options.get('server_certificate_path')
server_key_path = layer_options.get('server_key_path') server_key_path = layer_options.get('server_key_path')
# set --allow-privileged flag for kube-apiserver if is_privileged():
set_privileged( api_opts.add('allow-privileged', 'true', strict=True)
"true" if config['allow-privileged'] == "true" else "false", set_state('kubernetes-master.privileged')
render_config=False) else:
api_opts.add('allow-privileged', 'false', strict=True)
remove_state('kubernetes-master.privileged')
# Handle static options for now # Handle static options for now
api_opts.add('--min-request-timeout', '300') api_opts.add('service-cluster-ip-range', service_cidr())
api_opts.add('--v', '4') api_opts.add('min-request-timeout', '300')
api_opts.add('--client-ca-file', ca_cert_path) api_opts.add('v', '4')
api_opts.add('--tls-cert-file', server_cert_path) api_opts.add('client-ca-file', ca_cert_path)
api_opts.add('--tls-private-key-file', server_key_path) api_opts.add('tls-cert-file', server_cert_path)
api_opts.add('--kubelet-certificate-authority', ca_cert_path) api_opts.add('tls-private-key-file', server_key_path)
api_opts.add('--kubelet-client-certificate', client_cert_path) api_opts.add('kubelet-certificate-authority', ca_cert_path)
api_opts.add('--kubelet-client-key', client_key_path) api_opts.add('kubelet-client-certificate', client_cert_path)
# Needed for upgrade from 1.5.x to 1.6.0 api_opts.add('kubelet-client-key', client_key_path)
# XXX: support etcd3 api_opts.add('logtostderr', 'true')
api_opts.add('--storage-backend', 'etcd2') api_opts.add('insecure-bind-address', '127.0.0.1')
api_opts.add('insecure-port', '8080')
api_opts.add('storage-backend', 'etcd2') # FIXME: add etcd3 support
admission_control = [ admission_control = [
'NamespaceLifecycle', 'NamespaceLifecycle',
'LimitRanger', 'LimitRanger',
@ -832,68 +741,50 @@ def render_files():
'ResourceQuota', 'ResourceQuota',
'DefaultTolerationSeconds' 'DefaultTolerationSeconds'
] ]
if get_version('kube-apiserver') < (1, 6): if get_version('kube-apiserver') < (1, 6):
hookenv.log('Removing DefaultTolerationSeconds from admission-control') hookenv.log('Removing DefaultTolerationSeconds from admission-control')
admission_control.remove('DefaultTolerationSeconds') admission_control.remove('DefaultTolerationSeconds')
api_opts.add( api_opts.add('admission-control', ','.join(admission_control), strict=True)
'--admission-control', ','.join(admission_control), strict=True)
# Default to 3 minute resync. TODO: Make this configureable? # Default to 3 minute resync. TODO: Make this configureable?
controller_opts.add('--min-resync-period', '3m') controller_opts.add('min-resync-period', '3m')
controller_opts.add('--v', '2') controller_opts.add('v', '2')
controller_opts.add('--root-ca-file', ca_cert_path) controller_opts.add('root-ca-file', ca_cert_path)
controller_opts.add('logtostderr', 'true')
controller_opts.add('master', 'http://127.0.0.1:8080')
context.update({ scheduler_opts.add('v', '2')
'kube_allow_priv': FlagManager('KUBE_ALLOW_PRIV').to_s(), scheduler_opts.add('logtostderr', 'true')
'kube_apiserver_flags': api_opts.to_s(), scheduler_opts.add('master', 'http://127.0.0.1:8080')
'kube_scheduler_flags': scheduler_opts.to_s(),
'kube_controller_manager_flags': controller_opts.to_s(),
})
# Render the configuration files that contains parameters for cmd = ['snap', 'set', 'kube-apiserver'] + api_opts.to_s().split(' ')
# the apiserver, scheduler, and controller-manager check_call(cmd)
render_service('kube-apiserver', context) cmd = (
render_service('kube-controller-manager', context) ['snap', 'set', 'kube-controller-manager'] +
render_service('kube-scheduler', context) controller_opts.to_s().split(' ')
)
# explicitly render the generic defaults file check_call(cmd)
render('kube-defaults.defaults', '/etc/default/kube-defaults', context) cmd = ['snap', 'set', 'kube-scheduler'] + scheduler_opts.to_s().split(' ')
check_call(cmd)
# when files change on disk, we need to inform systemd of the changes
call(['systemctl', 'daemon-reload'])
call(['systemctl', 'enable', 'kube-apiserver'])
call(['systemctl', 'enable', 'kube-controller-manager'])
call(['systemctl', 'enable', 'kube-scheduler'])
def render_service(service_name, context):
'''Render the systemd service by name.'''
unit_directory = '/lib/systemd/system'
source = '{0}.service'.format(service_name)
target = os.path.join(unit_directory, '{0}.service'.format(service_name))
render(source, target, context)
conf_directory = '/etc/default'
source = '{0}.defaults'.format(service_name)
target = os.path.join(conf_directory, service_name)
render(source, target, context)
def setup_basic_auth(username='admin', password='admin', user='admin'): def setup_basic_auth(username='admin', password='admin', user='admin'):
'''Create the htacces file and the tokens.''' '''Create the htacces file and the tokens.'''
srv_kubernetes = '/srv/kubernetes' root_cdk = '/root/cdk'
if not os.path.isdir(srv_kubernetes): if not os.path.isdir(root_cdk):
os.makedirs(srv_kubernetes) os.makedirs(root_cdk)
htaccess = os.path.join(srv_kubernetes, 'basic_auth.csv') htaccess = os.path.join(root_cdk, 'basic_auth.csv')
with open(htaccess, 'w') as stream: with open(htaccess, 'w') as stream:
stream.write('{0},{1},{2}'.format(username, password, user)) stream.write('{0},{1},{2}'.format(username, password, user))
def setup_tokens(token, username, user): def setup_tokens(token, username, user):
'''Create a token file for kubernetes authentication.''' '''Create a token file for kubernetes authentication.'''
srv_kubernetes = '/srv/kubernetes' root_cdk = '/root/cdk'
if not os.path.isdir(srv_kubernetes): if not os.path.isdir(root_cdk):
os.makedirs(srv_kubernetes) os.makedirs(root_cdk)
known_tokens = os.path.join(srv_kubernetes, 'known_tokens.csv') known_tokens = os.path.join(root_cdk, 'known_tokens.csv')
if not token: if not token:
alpha = string.ascii_letters + string.digits alpha = string.ascii_letters + string.digits
token = ''.join(random.SystemRandom().choice(alpha) for _ in range(32)) token = ''.join(random.SystemRandom().choice(alpha) for _ in range(32))
@ -920,3 +811,9 @@ def all_kube_system_pods_running():
return False return False
return True return True
def apiserverVersion():
cmd = 'kube-apiserver --version'.split()
version_string = check_output(cmd).decode('utf-8')
return tuple(int(q) for q in re.findall("[0-9]+", version_string)[:3])


@@ -1,16 +0,0 @@
#!/usr/bin/env python
# Copyright 2015 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


@@ -1,185 +0,0 @@
#!/usr/bin/env python
# Copyright 2015 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
import os
import shutil
import subprocess
import tempfile
import logging
from contextlib import contextmanager
import charmtools.utils
from charmtools.build.tactics import Tactic
description = """
Update addon manifests for the charm.
This will clone the kubernetes repo and place the addons in
<charm>/templates/addons.
Can be run with no arguments and from any folder.
"""
log = logging.getLogger(__name__)
def clean_addon_dir(addon_dir):
""" Remove and recreate the addons folder """
log.debug("Cleaning " + addon_dir)
shutil.rmtree(addon_dir, ignore_errors=True)
os.makedirs(addon_dir)
def run_with_logging(command):
""" Run a command with controlled logging """
log.debug("Running: %s" % command)
process = subprocess.Popen(command, stderr=subprocess.PIPE)
stderr = process.communicate()[1].rstrip()
process.wait()
if process.returncode != 0:
log.error(stderr)
raise Exception("%s: exit code %d" % (command, process.returncode))
log.debug(stderr)
@contextmanager
def kubernetes_repo():
""" Yield a kubernetes repo to copy addons from.
If KUBE_VERSION is set, this will clone the local repo and checkout the
corresponding branch. Otherwise, the local branch will be used. """
repo = os.path.abspath("../../../..")
if "KUBE_VERSION" in os.environ:
branch = os.environ["KUBE_VERSION"]
log.info("Cloning %s with branch %s" % (repo, branch))
path = tempfile.mkdtemp(prefix="kubernetes")
try:
cmd = ["git", "clone", repo, path, "-b", branch]
run_with_logging(cmd)
yield path
finally:
shutil.rmtree(path)
else:
log.info("Using local repo " + repo)
yield repo
def add_addon(repo, source, dest):
""" Add an addon manifest from the given repo and source.
Any occurrences of 'amd64' are replaced with '{{ arch }}' so the charm can
fill it in during deployment. """
source = os.path.join(repo, "cluster/addons", source)
if os.path.isdir(dest):
dest = os.path.join(dest, os.path.basename(source))
log.debug("Copying: %s -> %s" % (source, dest))
with open(source, "r") as f:
content = f.read()
content = content.replace("amd64", "{{ arch }}")
with open(dest, "w") as f:
f.write(content)
def update_addons(dest):
""" Update addons. This will clean the addons folder and add new manifests
from upstream. """
with kubernetes_repo() as repo:
log.info("Copying addons to charm")
clean_addon_dir(dest)
add_addon(repo, "dashboard/dashboard-controller.yaml", dest)
add_addon(repo, "dashboard/dashboard-service.yaml", dest)
try:
add_addon(repo, "dns/kubedns-sa.yaml",
dest + "/kubedns-sa.yaml")
add_addon(repo, "dns/kubedns-cm.yaml",
dest + "/kubedns-cm.yaml")
add_addon(repo, "dns/kubedns-controller.yaml.in",
dest + "/kubedns-controller.yaml")
add_addon(repo, "dns/kubedns-svc.yaml.in",
dest + "/kubedns-svc.yaml")
except IOError as e:
# fall back to the older filenames
log.debug(e)
add_addon(repo, "dns/skydns-rc.yaml.in",
dest + "/kubedns-controller.yaml")
add_addon(repo, "dns/skydns-svc.yaml.in",
dest + "/kubedns-svc.yaml")
influxdb = "cluster-monitoring/influxdb"
add_addon(repo, influxdb + "/grafana-service.yaml", dest)
add_addon(repo, influxdb + "/heapster-controller.yaml", dest)
add_addon(repo, influxdb + "/heapster-service.yaml", dest)
add_addon(repo, influxdb + "/influxdb-grafana-controller.yaml", dest)
add_addon(repo, influxdb + "/influxdb-service.yaml", dest)
# Entry points
class UpdateAddonsTactic(Tactic):
""" This tactic is used by charm-tools to dynamically populate the
template/addons folder at `charm build` time. """
@classmethod
def trigger(cls, entity, target=None, layer=None, next_config=None):
""" Determines which files the tactic should apply to. We only want
this tactic to trigger once, so let's use the templates/ folder
"""
relpath = entity.relpath(layer.directory) if layer else entity
return relpath == "templates"
@property
def dest(self):
""" The destination we are writing to. This isn't a Tactic thing,
it's just a helper for UpdateAddonsTactic """
return self.target / "templates" / "addons"
def __call__(self):
""" When the tactic is called, update addons and put them directly in
our build destination """
update_addons(self.dest)
def sign(self):
""" Return signatures for the charm build manifest. We need to do this
because the addon template files were added dynamically """
sigs = {}
for file in os.listdir(self.dest):
path = self.dest / file
relpath = path.relpath(self.target.directory)
sigs[relpath] = (
self.current.url,
"dynamic",
charmtools.utils.sign(path)
)
return sigs
def parse_args():
""" Parse args. This is solely done for the usage output with -h """
parser = argparse.ArgumentParser(description=description)
parser.parse_args()
def main():
""" Update addons into the layer's templates/addons folder """
parse_args()
os.chdir(os.path.join(os.path.dirname(__file__), ".."))
dest = "templates/addons"
update_addons(dest)
if __name__ == "__main__":
main()


@@ -1,17 +0,0 @@
###
# kubernetes system config
#
# The following values are used to configure the kube-apiserver
#
# The address on the local server to listen to.
KUBE_API_ADDRESS="--insecure-bind-address=127.0.0.1"
# The port on the local server to listen on.
KUBE_API_PORT="--insecure-port=8080"
# default admission control policies
KUBE_ADMISSION_CONTROL=""
# Add your own!
KUBE_API_ARGS="{{ kube_apiserver_flags }}"


@@ -1,22 +0,0 @@
[Unit]
Description=Kubernetes API Server
Documentation=http://kubernetes.io/docs/admin/kube-apiserver/
After=network.target
[Service]
EnvironmentFile=-/etc/default/kube-defaults
EnvironmentFile=-/etc/default/kube-apiserver
ExecStart=/usr/local/bin/kube-apiserver \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_API_ADDRESS \
$KUBE_API_PORT \
$KUBE_ALLOW_PRIV \
$KUBE_ADMISSION_CONTROL \
$KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target


@@ -1,8 +0,0 @@
###
# The following values are used to configure the kubernetes controller-manager
# defaults from config and apiserver should be adequate
# Add your own!
KUBE_CONTROLLER_MANAGER_ARGS="{{ kube_controller_manager_flags }}"


@@ -1,18 +0,0 @@
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
EnvironmentFile=-/etc/default/kube-defaults
EnvironmentFile=-/etc/default/kube-controller-manager
ExecStart=/usr/local/bin/kube-controller-manager \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_MASTER \
$KUBE_CONTROLLER_MANAGER_ARGS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target


@@ -1,22 +0,0 @@
###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
# kube-apiserver.service
# kube-controller-manager.service
# kube-scheduler.service
# kubelet.service
# kube-proxy.service
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"
# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"
# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="{{ kube_allow_priv }}"
# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="--master=http://127.0.0.1:8080"


@@ -1,7 +0,0 @@
###
# kubernetes scheduler config
# default config should be adequate
# Add your own!
KUBE_SCHEDULER_ARGS="{{ kube_scheduler_flags }}"


@@ -1,17 +0,0 @@
[Unit]
Description=Kubernetes Scheduler Plugin
Documentation=http://kubernetes.io/docs/admin/multiple-schedulers/
[Service]
EnvironmentFile=-/etc/default/kube-defaults
EnvironmentFile=-/etc/default/kube-scheduler
ExecStart=/usr/local/bin/kube-scheduler \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_MASTER \
$KUBE_SCHEDULER_ARGS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target


@@ -41,6 +41,27 @@ a unit for maintenance.
 Resuming the workload will [uncordon](http://kubernetes.io/docs/user-guide/kubectl/kubectl_uncordon/) a paused unit. Workloads will automatically migrate unless otherwise directed via their application declaration.
 
+## Private registry
+
+With the "registry" action that is part of the kubernetes-worker charm, you can very easily create a private docker registry, with authentication, and available over TLS. Please note that the registry deployed with the action is not HA, and uses storage tied to the kubernetes node where the pod is running. So if the registry pod is migrated from one node to another for whatever reason, you will need to re-publish the images.
+
+### Example usage
+
+Create the relevant authentication files. Let's say you want user `userA` to authenticate with the password `passwordA`. Then you'll do:
+
+    echo "userA:passwordA" > htpasswd-plain
+    htpasswd -c -b -B htpasswd userA passwordA
+
+(the `htpasswd` program comes with the `apache2-utils` package)
+
+Supposing your registry will be reachable at `myregistry.company.com`, and that you already have your TLS key in the `registry.key` file, and your TLS certificate (with `myregistry.company.com` as Common Name) in the `registry.crt` file, you would then run:
+
+    juju run-action kubernetes-worker/0 registry domain=myregistry.company.com htpasswd="$(base64 -w0 htpasswd)" htpasswd-plain="$(base64 -w0 htpasswd-plain)" tlscert="$(base64 -w0 registry.crt)" tlskey="$(base64 -w0 registry.key)" ingress=true
+
+If you then decide that you want to delete the registry, just run:
+
+    juju run-action kubernetes-worker/0 registry delete=true ingress=true
+
 ## Known Limitations
 
 Kubernetes workers currently only support 'phaux' HA scenarios. Even when configured with an HA cluster string, they will only ever contact the first unit in the cluster map. To enable a proper HA story, kubernetes-worker units are encouraged to proxy through a [kubeapi-load-balancer](https://jujucharms.com/kubeapi-load-balancer)
@@ -48,5 +69,4 @@ application. This enables a HA deployment without the need to
 re-render configuration and disrupt the worker services.
 
 External access to pods must be performed through a [Kubernetes
-Ingress Resource](http://kubernetes.io/docs/user-guide/ingress/). More
-information
+Ingress Resource](http://kubernetes.io/docs/user-guide/ingress/).


@@ -14,4 +14,32 @@ microbot:
     delete:
       type: boolean
       default: False
-      description: Removes a microbots deployment, service, and ingress if True.
+      description: Remove a microbots deployment, service, and ingress if True.
+upgrade:
+  description: Upgrade the kubernetes snaps
+registry:
+  description: Create a private Docker registry
+  params:
+    htpasswd:
+      type: string
+      description: base64 encoded htpasswd file used for authentication.
+    htpasswd-plain:
+      type: string
+      description: base64 encoded plaintext version of the htpasswd file, needed by docker daemons to authenticate to the registry.
+    tlscert:
+      type: string
+      description: base64 encoded TLS certificate for the registry. Common Name must match the domain name of the registry.
+    tlskey:
+      type: string
+      description: base64 encoded TLS key for the registry.
+    domain:
+      type: string
+      description: The domain name for the registry. Must match the Common Name of the certificate.
+    ingress:
+      type: boolean
+      default: false
+      description: Create an Ingress resource for the registry (or delete resource object if "delete" is True)
+    delete:
+      type: boolean
+      default: false
+      description: Remove a registry replication controller, service, and ingress if True.


@ -14,6 +14,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
+import os
import sys
from charmhelpers.core.hookenv import action_get
@ -22,6 +23,7 @@ from charmhelpers.core.hookenv import unit_public_ip
from charms.templating.jinja2 import render
from subprocess import call
+os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
context = {}
context['replicas'] = action_get('replicas')
@ -32,7 +34,7 @@ if not context['replicas']:
context['replicas'] = 3
# Declare a kubectl template when invoking kubectl
-kubectl = ['kubectl', '--kubeconfig=/srv/kubernetes/config']
+kubectl = ['kubectl', '--kubeconfig=/root/cdk/kubeconfig']
# Remove deployment if requested
if context['delete']:
@ -56,11 +58,11 @@ if context['delete']:
# Creation request
-render('microbot-example.yaml', '/etc/kubernetes/addons/microbot.yaml',
+render('microbot-example.yaml', '/root/cdk/addons/microbot.yaml',
context)
create_command = kubectl + ['create', '-f',
-'/etc/kubernetes/addons/microbot.yaml']
+'/root/cdk/addons/microbot.yaml']
create_response = call(create_command)
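For reference, the action backed by this script is invoked with the parameters defined in actions.yaml above; a minimal sketch:

```shell
# Deploy five microbot replicas, then tear the example down again.
juju run-action kubernetes-worker/0 microbot replicas=5
juju run-action kubernetes-worker/0 microbot delete=true
```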
View File
@ -2,6 +2,8 @@
set -ex
-kubectl --kubeconfig=/srv/kubernetes/config cordon $(hostname)
-kubectl --kubeconfig=/srv/kubernetes/config drain $(hostname) --force
+export PATH=$PATH:/snap/bin
+kubectl --kubeconfig=/root/cdk/kubeconfig cordon $(hostname)
+kubectl --kubeconfig=/root/cdk/kubeconfig drain $(hostname) --force
status-set 'waiting' 'Kubernetes unit paused'
View File
@ -0,0 +1,136 @@
#!/usr/bin/python3
#
# For a usage examples, see README.md
#
# TODO
#
# - make the action idempotent (i.e. if you run it multiple times, the first
# run will create/delete the registry, and the reset will be a no-op and won't
# error out)
#
# - take only a plain authentication file, and create the encrypted version in
# the action
#
# - validate the parameters (make sure tlscert is a certificate, that tlskey is a
# proper key, etc)
#
# - when https://bugs.launchpad.net/juju/+bug/1661015 is fixed, handle the
# base64 encoding the parameters in the action itself
import os
import sys
from base64 import b64encode
from charmhelpers.core.hookenv import action_get
from charmhelpers.core.hookenv import action_set
from charms.templating.jinja2 import render
from subprocess import call
os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
deletion = action_get('delete')
context = {}
# These config options must be defined in the case of a creation
param_error = False
for param in ('tlscert', 'tlskey', 'domain', 'htpasswd', 'htpasswd-plain'):
value = action_get(param)
if not value and not deletion:
key = "registry-create-parameter-{}".format(param)
error = "failure, parameter {} is required".format(param)
action_set({key: error})
param_error = True
context[param] = value
# Create the dockercfg template variable
dockercfg = '{"%s:443": {"auth": "%s", "email": "root@localhost"}}' % \
(context['domain'], context['htpasswd-plain'])
context['dockercfg'] = b64encode(dockercfg.encode()).decode('ASCII')
if param_error:
sys.exit(0)
# This one is either true or false, no need to check if it has a "good" value.
context['ingress'] = action_get('ingress')
# Declare a kubectl template when invoking kubectl
kubectl = ['kubectl', '--kubeconfig=/root/cdk/kubeconfig']
# Remove deployment if requested
if deletion:
resources = ['svc/kube-registry', 'rc/kube-registry-v0', 'secrets/registry-tls-data',
'secrets/registry-auth-data', 'secrets/registry-access']
if action_get('ingress'):
resources.append('ing/registry-ing')
delete_command = kubectl + ['delete', '--ignore-not-found=true'] + resources
delete_response = call(delete_command)
if delete_response == 0:
action_set({'registry-delete': 'success'})
else:
action_set({'registry-delete': 'failure'})
sys.exit(0)
# Creation request
render('registry.yaml', '/root/cdk/addons/registry.yaml',
context)
create_command = kubectl + ['create', '-f',
'/root/cdk/addons/registry.yaml']
create_response = call(create_command)
if create_response == 0:
action_set({'registry-create': 'success'})
# Create a ConfigMap if it doesn't exist yet, else patch it.
# A ConfigMap is needed to change the default value for nginx' client_max_body_size.
# The default is 1MB, and this is the maximum size of images that can be
# pushed on the registry. 1MB images aren't useful, so we bump this value to 1024MB.
cm_name = 'nginx-load-balancer-conf'
check_cm_command = kubectl + ['get', 'cm', cm_name]
check_cm_response = call(check_cm_command)
if check_cm_response == 0:
# There is an existing ConfigMap, patch it
patch = '{"data":{"max-body-size":"1024m"}}'
patch_cm_command = kubectl + ['patch', 'cm', cm_name, '-p', patch]
patch_cm_response = call(patch_cm_command)
if patch_cm_response == 0:
action_set({'configmap-patch': 'success'})
else:
action_set({'configmap-patch': 'failure'})
else:
# No existing ConfigMap, create it
render('registry-configmap.yaml', '/root/cdk/addons/registry-configmap.yaml',
context)
create_cm_command = kubectl + ['create', '-f', '/root/cdk/addons/registry-configmap.yaml']
create_cm_response = call(create_cm_command)
if create_cm_response == 0:
action_set({'configmap-create': 'success'})
else:
action_set({'configmap-create': 'failure'})
# Patch the "default" serviceaccount with an imagePullSecret.
# This will allow the docker daemons to authenticate to our private
# registry automatically
patch = '{"imagePullSecrets":[{"name":"registry-access"}]}'
patch_sa_command = kubectl + ['patch', 'sa', 'default', '-p', patch]
patch_sa_response = call(patch_sa_command)
if patch_sa_response == 0:
action_set({'serviceaccount-patch': 'success'})
else:
action_set({'serviceaccount-patch': 'failure'})
else:
action_set({'registry-create': 'failure'})
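The action_set calls above surface their results through Juju, so the outcome of a run can be checked afterwards; a sketch, with the action id as a placeholder:

```shell
# Queue the action, then fetch the recorded results
# (registry-create, configmap-patch, serviceaccount-patch, ...).
juju run-action kubernetes-worker/0 registry delete=true ingress=true
juju show-action-output <action-id>
```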
View File
@ -2,5 +2,7 @@
set -ex
-kubectl --kubeconfig=/srv/kubernetes/config uncordon $(hostname)
+export PATH=$PATH:/snap/bin
+kubectl --kubeconfig=/root/cdk/kubeconfig uncordon $(hostname)
status-set 'active' 'Kubernetes unit resumed'
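These scripts back the worker's pause and resume actions; a typical maintenance flow looks roughly like this (a sketch, assuming kubectl access to the cluster from the operator's machine):

```shell
# Cordon and drain the node behind kubernetes-worker/0 ...
juju run-action kubernetes-worker/0 pause
kubectl get nodes    # the node should report SchedulingDisabled

# ... and put it back into service afterwards.
juju run-action kubernetes-worker/0 resume
```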
View File
@ -0,0 +1,5 @@
#!/bin/sh
set -eux
charms.reactive set_state kubernetes-worker.snaps.upgrade-specified
exec hooks/config-changed
View File
@ -20,3 +20,14 @@ options:
mode by default. If "false", kubelet will never run in privileged mode.
If "auto", kubelet will not run in privileged mode by default, but will
switch to privileged mode if gpu hardware is detected.
channel:
type: string
default: "stable"
description: |
Snap channel to install Kubernetes worker services from
require-manual-upgrade:
type: boolean
default: true
description: |
When true, worker services will not be upgraded until the user triggers
it manually by running the upgrade action.
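Together with the new upgrade action these options gate when the worker snaps move to a new version; a sketch of the flow (the channel value is only an example):

```shell
# Point the charm at a different snap channel; with require-manual-upgrade
# left at true, the unit blocks until the upgrade action is run.
juju config kubernetes-worker channel=1.6/stable
juju run-action kubernetes-worker/0 upgrade
```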
View File
@ -1,7 +1,9 @@
#!/bin/sh
set -ux
-alias kubectl="kubectl --kubeconfig=/srv/kubernetes/config"
+export PATH=$PATH:/snap/bin
+alias kubectl="kubectl --kubeconfig=/root/cdk/kubeconfig"
kubectl cluster-info > $DEBUG_SCRIPT_DIR/cluster-info
kubectl cluster-info dump > $DEBUG_SCRIPT_DIR/cluster-info-dump
View File
@ -2,12 +2,8 @@
set -ux
for service in kubelet kube-proxy; do
-systemctl status $service > $DEBUG_SCRIPT_DIR/$service-systemctl-status
-journalctl -u $service > $DEBUG_SCRIPT_DIR/$service-journal
+systemctl status snap.$service.daemon > $DEBUG_SCRIPT_DIR/$service-systemctl-status
+journalctl -u snap.$service.daemon > $DEBUG_SCRIPT_DIR/$service-journal
done
-mkdir -p $DEBUG_SCRIPT_DIR/etc-default
-cp -v /etc/default/kube* $DEBUG_SCRIPT_DIR/etc-default
-mkdir -p $DEBUG_SCRIPT_DIR/lib-systemd-system
-cp -v /lib/systemd/system/kube* $DEBUG_SCRIPT_DIR/lib-systemd-system
+# FIXME: get the snap config or something
View File
@ -2,7 +2,9 @@ repo: https://github.com/kubernetes/kubernetes.git
includes:
- 'layer:basic'
- 'layer:debug'
+- 'layer:snap'
- 'layer:docker'
+- 'layer:metrics'
- 'layer:nagios'
- 'layer:tls-client'
- 'layer:nvidia-cuda'
@ -17,8 +19,8 @@ options:
- 'ceph-common'
- 'socat'
tls-client:
-ca_certificate_path: '/srv/kubernetes/ca.crt'
-server_certificate_path: '/srv/kubernetes/server.crt'
-server_key_path: '/srv/kubernetes/server.key'
-client_certificate_path: '/srv/kubernetes/client.crt'
-client_key_path: '/srv/kubernetes/client.key'
+ca_certificate_path: '/root/cdk/ca.crt'
+server_certificate_path: '/root/cdk/server.crt'
+server_key_path: '/root/cdk/server.key'
+client_certificate_path: '/root/cdk/client.crt'
+client_key_path: '/root/cdk/client.key'
View File
@ -17,10 +17,6 @@
import re
import subprocess
-from charmhelpers.core import unitdata
-BIN_VERSIONS = 'bin_versions'
def get_version(bin_name):
"""Get the version of an installed Kubernetes binary.
@ -33,31 +29,6 @@ def get_version(bin_name):
>>> `get_version('kubelet')
(1, 6, 0)
-"""
-db = unitdata.kv()
-bin_versions = db.get(BIN_VERSIONS, {})
-cached_version = bin_versions.get(bin_name)
-if cached_version:
-return tuple(cached_version)
-version = _get_bin_version(bin_name)
-bin_versions[bin_name] = list(version)
-db.set(BIN_VERSIONS, bin_versions)
-return version
-def reset_versions():
-"""Reset the cache of bin versions.
-"""
-db = unitdata.kv()
-db.unset(BIN_VERSIONS)
-def _get_bin_version(bin_name):
-"""Get a binary version by calling it with --version and parsing output.
"""
cmd = '{} --version'.format(bin_name).split()
version_string = subprocess.check_output(cmd).decode('utf-8')
View File
@ -118,6 +118,13 @@ class FlagManager:
""" """
return self.data.get(key, default)
def destroy_all(self):
'''
Destructively removes all data from the FlagManager.
'''
self.data.clear()
self.__save()
def to_s(self):
'''
Render the flags to a single string, prepared for the Docker
View File
@ -29,7 +29,19 @@ provides:
interface: kubernetes-cni
scope: container
resources:
-kubernetes:
+cni:
type: file
-filename: kubernetes.tar.gz
+filename: cni.tgz
-description: "An archive of kubernetes binaries for the worker."
+description: CNI plugins
kubectl:
type: file
filename: kubectl.snap
description: kubectl snap
kubelet:
type: file
filename: kubelet.snap
description: kubelet snap
kube-proxy:
type: file
filename: kube-proxy.snap
description: kube-proxy snap
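With the single binaries tarball replaced by per-daemon snaps, locally built artifacts are attached per resource; a sketch (the local file paths are placeholders):

```shell
# Attach locally built resources instead of using the published ones.
juju attach kubernetes-worker kubelet=./kubelet.snap
juju attach kubernetes-worker kube-proxy=./kube-proxy.snap
juju attach kubernetes-worker kubectl=./kubectl.snap
juju attach kubernetes-worker cni=./cni.tgz
```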
View File
@ -0,0 +1,2 @@
metrics:
juju-units: {}
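This declares Juju's built-in juju-units metric for the worker charm. Collected values can be pulled on demand, roughly like this (exact CLI behaviour depends on the Juju 2.x release in use):

```shell
# Trigger an on-demand collection, then list what has been recorded.
juju collect-metrics kubernetes-worker
juju metrics kubernetes-worker/0
```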
View File
@ -15,40 +15,138 @@
# limitations under the License.
import os
+import shutil
from shlex import split
-from subprocess import call, check_call, check_output
+from subprocess import check_call, check_output
from subprocess import CalledProcessError
from socket import gethostname
from charms import layer
+from charms.layer import snap
from charms.reactive import hook
-from charms.reactive import set_state, remove_state
+from charms.reactive import set_state, remove_state, is_state
from charms.reactive import when, when_any, when_not
-from charms.reactive.helpers import data_changed
-from charms.kubernetes.common import get_version, reset_versions
+from charms.kubernetes.common import get_version
from charms.kubernetes.flagmanager import FlagManager
+from charms.reactive.helpers import data_changed, any_file_changed
from charms.templating.jinja2 import render
-from charmhelpers.core import hookenv
+from charmhelpers.core import hookenv, unitdata
-from charmhelpers.core.host import service_stop
-from charmhelpers.core.host import service_restart
+from charmhelpers.core.host import service_stop, service_restart
from charmhelpers.contrib.charmsupport import nrpe
-kubeconfig_path = '/srv/kubernetes/config'
+# Override the default nagios shortname regex to allow periods, which we
+# need because our bin names contain them (e.g. 'snap.foo.daemon'). The
+# default regex in charmhelpers doesn't allow periods, but nagios itself does.
+nrpe.Check.shortname_re = '[\.A-Za-z0-9-_]+$'
+kubeconfig_path = '/root/cdk/kubeconfig'
+os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
+db = unitdata.kv()
@hook('upgrade-charm')
-def remove_installed_state():
-remove_state('kubernetes-worker.components.installed')
+def upgrade_charm():
+cleanup_pre_snap_services()
+check_resources_for_upgrade_needed()
# Remove gpu.enabled state so we can reconfigure gpu-related kubelet flags,
# since they can differ between k8s versions
remove_state('kubernetes-worker.gpu.enabled')
kubelet_opts = FlagManager('kubelet')
-kubelet_opts.destroy('--feature-gates')
+kubelet_opts.destroy('feature-gates')
-kubelet_opts.destroy('--experimental-nvidia-gpus')
+kubelet_opts.destroy('experimental-nvidia-gpus')
remove_state('kubernetes-worker.cni-plugins.installed')
remove_state('kubernetes-worker.config.created')
remove_state('kubernetes-worker.ingress.available')
set_state('kubernetes-worker.restart-needed')
def check_resources_for_upgrade_needed():
hookenv.status_set('maintenance', 'Checking resources')
resources = ['kubectl', 'kubelet', 'kube-proxy']
paths = [hookenv.resource_get(resource) for resource in resources]
if any_file_changed(paths):
set_upgrade_needed()
def set_upgrade_needed():
set_state('kubernetes-worker.snaps.upgrade-needed')
config = hookenv.config()
previous_channel = config.previous('channel')
require_manual = config.get('require-manual-upgrade')
if previous_channel is None or not require_manual:
set_state('kubernetes-worker.snaps.upgrade-specified')
def cleanup_pre_snap_services():
# remove old states
remove_state('kubernetes-worker.components.installed')
# disable old services
services = ['kubelet', 'kube-proxy']
for service in services:
hookenv.log('Stopping {0} service.'.format(service))
service_stop(service)
# cleanup old files
files = [
"/lib/systemd/system/kubelet.service",
"/lib/systemd/system/kube-proxy.service"
"/etc/default/kube-default",
"/etc/default/kubelet",
"/etc/default/kube-proxy",
"/srv/kubernetes",
"/usr/local/bin/kubectl",
"/usr/local/bin/kubelet",
"/usr/local/bin/kube-proxy",
"/etc/kubernetes"
]
for file in files:
if os.path.isdir(file):
hookenv.log("Removing directory: " + file)
shutil.rmtree(file)
elif os.path.isfile(file):
hookenv.log("Removing file: " + file)
os.remove(file)
# cleanup old flagmanagers
FlagManager('kubelet').destroy_all()
FlagManager('kube-proxy').destroy_all()
@when('config.changed.channel')
def channel_changed():
set_upgrade_needed()
@when('kubernetes-worker.snaps.upgrade-needed')
@when_not('kubernetes-worker.snaps.upgrade-specified')
def upgrade_needed_status():
msg = 'Needs manual upgrade, run the upgrade action'
hookenv.status_set('blocked', msg)
@when('kubernetes-worker.snaps.upgrade-specified')
def install_snaps():
check_resources_for_upgrade_needed()
channel = hookenv.config('channel')
hookenv.status_set('maintenance', 'Installing kubectl snap')
snap.install('kubectl', channel=channel, classic=True)
hookenv.status_set('maintenance', 'Installing kubelet snap')
snap.install('kubelet', channel=channel, classic=True)
hookenv.status_set('maintenance', 'Installing kube-proxy snap')
snap.install('kube-proxy', channel=channel, classic=True)
set_state('kubernetes-worker.snaps.installed')
remove_state('kubernetes-worker.snaps.upgrade-needed')
remove_state('kubernetes-worker.snaps.upgrade-specified')
@hook('stop') @hook('stop')
@ -57,52 +155,50 @@ def shutdown():
- delete the current node - delete the current node
- stop the kubelet service - stop the kubelet service
- stop the kube-proxy service - stop the kube-proxy service
- remove the 'kubernetes-worker.components.installed' state - remove the 'kubernetes-worker.cni-plugins.installed' state
''' '''
if os.path.isfile(kubeconfig_path):
kubectl('delete', 'node', gethostname()) kubectl('delete', 'node', gethostname())
service_stop('kubelet') service_stop('kubelet')
service_stop('kube-proxy') service_stop('kube-proxy')
remove_state('kubernetes-worker.components.installed') remove_state('kubernetes-worker.cni-plugins.installed')
@when('docker.available') @when('docker.available')
@when_not('kubernetes-worker.components.installed') @when_not('kubernetes-worker.cni-plugins.installed')
def install_kubernetes_components(): def install_cni_plugins():
''' Unpack the kubernetes worker binaries ''' ''' Unpack the cni-plugins resource '''
charm_dir = os.getenv('CHARM_DIR') charm_dir = os.getenv('CHARM_DIR')
# Get the resource via resource_get # Get the resource via resource_get
try: try:
archive = hookenv.resource_get('kubernetes') archive = hookenv.resource_get('cni')
except Exception: except Exception:
message = 'Error fetching the kubernetes resource.' message = 'Error fetching the cni resource.'
hookenv.log(message) hookenv.log(message)
hookenv.status_set('blocked', message) hookenv.status_set('blocked', message)
return return
if not archive: if not archive:
hookenv.log('Missing kubernetes resource.') hookenv.log('Missing cni resource.')
hookenv.status_set('blocked', 'Missing kubernetes resource.') hookenv.status_set('blocked', 'Missing cni resource.')
return return
# Handle null resource publication, we check if filesize < 1mb # Handle null resource publication, we check if filesize < 1mb
filesize = os.stat(archive).st_size filesize = os.stat(archive).st_size
if filesize < 1000000: if filesize < 1000000:
hookenv.status_set('blocked', 'Incomplete kubernetes resource.') hookenv.status_set('blocked', 'Incomplete cni resource.')
return return
hookenv.status_set('maintenance', 'Unpacking kubernetes resource.') hookenv.status_set('maintenance', 'Unpacking cni resource.')
unpack_path = '{}/files/kubernetes'.format(charm_dir) unpack_path = '{}/files/cni'.format(charm_dir)
os.makedirs(unpack_path, exist_ok=True) os.makedirs(unpack_path, exist_ok=True)
cmd = ['tar', 'xfvz', archive, '-C', unpack_path] cmd = ['tar', 'xfvz', archive, '-C', unpack_path]
hookenv.log(cmd) hookenv.log(cmd)
check_call(cmd) check_call(cmd)
apps = [ apps = [
{'name': 'kubelet', 'path': '/usr/local/bin'},
{'name': 'kube-proxy', 'path': '/usr/local/bin'},
{'name': 'kubectl', 'path': '/usr/local/bin'},
{'name': 'loopback', 'path': '/opt/cni/bin'} {'name': 'loopback', 'path': '/opt/cni/bin'}
] ]
@ -113,11 +209,15 @@ def install_kubernetes_components():
hookenv.log(install) hookenv.log(install)
check_call(install) check_call(install)
reset_versions() # Used by the "registry" action. The action is run on a single worker, but
set_state('kubernetes-worker.components.installed') # the registry pod can end up on any worker, so we need this directory on
# all the workers.
os.makedirs('/srv/registry', exist_ok=True)
set_state('kubernetes-worker.cni-plugins.installed')
@when('kubernetes-worker.components.installed') @when('kubernetes-worker.snaps.installed')
def set_app_version(): def set_app_version():
''' Declare the application version to juju ''' ''' Declare the application version to juju '''
cmd = ['kubelet', '--version'] cmd = ['kubelet', '--version']
@ -125,7 +225,7 @@ def set_app_version():
hookenv.application_version_set(version.split(b' v')[-1].rstrip()) hookenv.application_version_set(version.split(b' v')[-1].rstrip())
@when('kubernetes-worker.components.installed') @when('kubernetes-worker.snaps.installed')
@when_not('kube-control.dns.available') @when_not('kube-control.dns.available')
def notify_user_transient_status(): def notify_user_transient_status():
''' Notify to the user we are in a transient state and the application ''' Notify to the user we are in a transient state and the application
@ -140,7 +240,9 @@ def notify_user_transient_status():
hookenv.status_set('waiting', 'Waiting for cluster DNS.') hookenv.status_set('waiting', 'Waiting for cluster DNS.')
@when('kubernetes-worker.components.installed', 'kube-control.dns.available') @when('kubernetes-worker.snaps.installed',
'kube-control.dns.available')
@when_not('kubernetes-worker.snaps.upgrade-needed')
def charm_status(kube_control): def charm_status(kube_control):
'''Update the status message with the current status of kubelet.''' '''Update the status message with the current status of kubelet.'''
update_kubelet_status() update_kubelet_status()
@ -150,10 +252,10 @@ def update_kubelet_status():
''' There are different states that the kubelet can be in, where we are ''' There are different states that the kubelet can be in, where we are
waiting for dns, waiting for cluster turnup, or ready to serve waiting for dns, waiting for cluster turnup, or ready to serve
applications.''' applications.'''
if (_systemctl_is_active('kubelet')): if (_systemctl_is_active('snap.kubelet.daemon')):
hookenv.status_set('active', 'Kubernetes worker running.') hookenv.status_set('active', 'Kubernetes worker running.')
# if kubelet is not running, we're waiting on something else to converge # if kubelet is not running, we're waiting on something else to converge
elif (not _systemctl_is_active('kubelet')): elif (not _systemctl_is_active('snap.kubelet.daemon')):
hookenv.status_set('waiting', 'Waiting for kubelet to start.') hookenv.status_set('waiting', 'Waiting for kubelet to start.')
@ -178,14 +280,29 @@ def send_data(tls):
tls.request_server_cert(common_name, sans, certificate_name) tls.request_server_cert(common_name, sans, certificate_name)
@when('kubernetes-worker.components.installed', 'kube-api-endpoint.available', @when('kube-api-endpoint.available', 'kube-control.dns.available',
'cni.available')
def watch_for_changes(kube_api, kube_control, cni):
''' Watch for configuration changes and signal if we need to restart the
worker services '''
servers = get_kube_api_servers(kube_api)
dns = kube_control.get_dns()
cluster_cidr = cni.get_config()['cidr']
if (data_changed('kube-api-servers', servers) or
data_changed('kube-dns', dns) or
data_changed('cluster-cidr', cluster_cidr)):
set_state('kubernetes-worker.restart-needed')
@when('kubernetes-worker.snaps.installed', 'kube-api-endpoint.available',
'tls_client.ca.saved', 'tls_client.client.certificate.saved', 'tls_client.ca.saved', 'tls_client.client.certificate.saved',
'tls_client.client.key.saved', 'tls_client.server.certificate.saved', 'tls_client.client.key.saved', 'tls_client.server.certificate.saved',
'tls_client.server.key.saved', 'kube-control.dns.available', 'tls_client.server.key.saved', 'kube-control.dns.available',
'cni.available') 'cni.available', 'kubernetes-worker.restart-needed')
def start_worker(kube_api, kube_control, cni): def start_worker(kube_api, kube_control, cni):
''' Start kubelet using the provided API and DNS info.''' ''' Start kubelet using the provided API and DNS info.'''
config = hookenv.config()
servers = get_kube_api_servers(kube_api) servers = get_kube_api_servers(kube_api)
# Note that the DNS server doesn't necessarily exist at this point. We know # Note that the DNS server doesn't necessarily exist at this point. We know
# what its IP will eventually be, though, so we can go ahead and configure # what its IP will eventually be, though, so we can go ahead and configure
@ -193,29 +310,21 @@ def start_worker(kube_api, kube_control, cni):
# the correct DNS even though the server isn't ready yet. # the correct DNS even though the server isn't ready yet.
dns = kube_control.get_dns() dns = kube_control.get_dns()
cluster_cidr = cni.get_config()['cidr']
if (data_changed('kube-api-servers', servers) or if cluster_cidr is None:
data_changed('kube-dns', dns)): hookenv.log('Waiting for cluster cidr.')
return
# Create FlagManager for kubelet and add dns flags
opts = FlagManager('kubelet')
opts.add('--cluster-dns', dns['sdn-ip']) # FIXME sdn-ip needs a rename
opts.add('--cluster-domain', dns['domain'])
# Create FlagManager for KUBE_MASTER and add api server addresses
kube_master_opts = FlagManager('KUBE_MASTER')
kube_master_opts.add('--master', ','.join(servers))
# set --allow-privileged flag for kubelet # set --allow-privileged flag for kubelet
set_privileged( set_privileged()
"true" if config['allow-privileged'] == "true" else "false",
render_config=False)
create_config(servers[0]) create_config(servers[0])
render_init_scripts() configure_worker_services(servers, dns, cluster_cidr)
set_state('kubernetes-worker.config.created') set_state('kubernetes-worker.config.created')
restart_unit_services() restart_unit_services()
update_kubelet_status() update_kubelet_status()
remove_state('kubernetes-worker.restart-needed')
@when('cni.connected') @when('cni.connected')
@ -254,9 +363,9 @@ def render_and_launch_ingress():
else: else:
hookenv.log('Deleting the http backend and ingress.') hookenv.log('Deleting the http backend and ingress.')
kubectl_manifest('delete', kubectl_manifest('delete',
'/etc/kubernetes/addons/default-http-backend.yaml') '/root/cdk/addons/default-http-backend.yaml')
kubectl_manifest('delete', kubectl_manifest('delete',
'/etc/kubernetes/addons/ingress-replication-controller.yaml') # noqa '/root/cdk/addons/ingress-replication-controller.yaml') # noqa
hookenv.close_port(80) hookenv.close_port(80)
hookenv.close_port(443) hookenv.close_port(443)
@ -338,46 +447,40 @@ def create_config(server):
user='kubelet') user='kubelet')
def render_init_scripts(): def configure_worker_services(api_servers, dns, cluster_cidr):
''' We have related to either an api server or a load balancer connected ''' Add remaining flags for the worker services and configure snaps to use
to the apiserver. Render the config files and prepare for launch ''' them '''
context = {}
context.update(hookenv.config())
layer_options = layer.options('tls-client') layer_options = layer.options('tls-client')
ca_cert_path = layer_options.get('ca_certificate_path') ca_cert_path = layer_options.get('ca_certificate_path')
server_cert_path = layer_options.get('server_certificate_path') server_cert_path = layer_options.get('server_certificate_path')
server_key_path = layer_options.get('server_key_path') server_key_path = layer_options.get('server_key_path')
unit_name = os.getenv('JUJU_UNIT_NAME').replace('/', '-')
context.update({
'kube_allow_priv': FlagManager('KUBE_ALLOW_PRIV').to_s(),
'kube_api_endpoint': FlagManager('KUBE_MASTER').to_s(),
'JUJU_UNIT_NAME': unit_name,
})
kubelet_opts = FlagManager('kubelet') kubelet_opts = FlagManager('kubelet')
kubelet_opts.add('--require-kubeconfig', None) kubelet_opts.add('require-kubeconfig', 'true')
kubelet_opts.add('--kubeconfig', kubeconfig_path) kubelet_opts.add('kubeconfig', kubeconfig_path)
kubelet_opts.add('--network-plugin', 'cni') kubelet_opts.add('network-plugin', 'cni')
kubelet_opts.add('--anonymous-auth', 'false') kubelet_opts.add('logtostderr', 'true')
kubelet_opts.add('--client-ca-file', ca_cert_path) kubelet_opts.add('v', '0')
kubelet_opts.add('--tls-cert-file', server_cert_path) kubelet_opts.add('address', '0.0.0.0')
kubelet_opts.add('--tls-private-key-file', server_key_path) kubelet_opts.add('port', '10250')
context['kubelet_opts'] = kubelet_opts.to_s() kubelet_opts.add('cluster-dns', dns['sdn-ip'])
kubelet_opts.add('cluster-domain', dns['domain'])
kubelet_opts.add('anonymous-auth', 'false')
kubelet_opts.add('client-ca-file', ca_cert_path)
kubelet_opts.add('tls-cert-file', server_cert_path)
kubelet_opts.add('tls-private-key-file', server_key_path)
kube_proxy_opts = FlagManager('kube-proxy') kube_proxy_opts = FlagManager('kube-proxy')
kube_proxy_opts.add('--kubeconfig', kubeconfig_path) kube_proxy_opts.add('cluster-cidr', cluster_cidr)
context['kube_proxy_opts'] = kube_proxy_opts.to_s() kube_proxy_opts.add('kubeconfig', kubeconfig_path)
kube_proxy_opts.add('logtostderr', 'true')
kube_proxy_opts.add('v', '0')
kube_proxy_opts.add('master', ','.join(api_servers), strict=True)
os.makedirs('/var/lib/kubelet', exist_ok=True) cmd = ['snap', 'set', 'kubelet'] + kubelet_opts.to_s().split(' ')
check_call(cmd)
render('kube-default', '/etc/default/kube-default', context) cmd = ['snap', 'set', 'kube-proxy'] + kube_proxy_opts.to_s().split(' ')
render('kubelet.defaults', '/etc/default/kubelet', context) check_call(cmd)
render('kubelet.service', '/lib/systemd/system/kubelet.service', context)
render('kube-proxy.defaults', '/etc/default/kube-proxy', context)
render('kube-proxy.service', '/lib/systemd/system/kube-proxy.service',
context)
def create_kubeconfig(kubeconfig, server, ca, key, certificate, user='ubuntu', def create_kubeconfig(kubeconfig, server, ca, key, certificate, user='ubuntu',
@ -406,38 +509,45 @@ def launch_default_ingress_controller():
''' Launch the Kubernetes ingress controller & default backend (404) ''' ''' Launch the Kubernetes ingress controller & default backend (404) '''
context = {} context = {}
context['arch'] = arch() context['arch'] = arch()
addon_path = '/etc/kubernetes/addons/{}' addon_path = '/root/cdk/addons/{}'
manifest = addon_path.format('default-http-backend.yaml')
# Render the default http backend (404) replicationcontroller manifest # Render the default http backend (404) replicationcontroller manifest
manifest = addon_path.format('default-http-backend.yaml')
render('default-http-backend.yaml', manifest, context) render('default-http-backend.yaml', manifest, context)
hookenv.log('Creating the default http backend.') hookenv.log('Creating the default http backend.')
kubectl_manifest('create', manifest) try:
kubectl('apply', '-f', manifest)
except CalledProcessError as e:
hookenv.log(e)
hookenv.log('Failed to create default-http-backend. Will attempt again next update.') # noqa
hookenv.close_port(80)
hookenv.close_port(443)
return
# Render the ingress replication controller manifest # Render the ingress replication controller manifest
manifest = addon_path.format('ingress-replication-controller.yaml') manifest = addon_path.format('ingress-replication-controller.yaml')
render('ingress-replication-controller.yaml', manifest, context) render('ingress-replication-controller.yaml', manifest, context)
if kubectl_manifest('create', manifest):
hookenv.log('Creating the ingress replication controller.') hookenv.log('Creating the ingress replication controller.')
set_state('kubernetes-worker.ingress.available') try:
hookenv.open_port(80) kubectl('apply', '-f', manifest)
hookenv.open_port(443) except CalledProcessError as e:
else: hookenv.log(e)
hookenv.log('Failed to create ingress controller. Will attempt again next update.') # noqa hookenv.log('Failed to create ingress controller. Will attempt again next update.') # noqa
hookenv.close_port(80) hookenv.close_port(80)
hookenv.close_port(443) hookenv.close_port(443)
return
set_state('kubernetes-worker.ingress.available')
hookenv.open_port(80)
hookenv.open_port(443)
def restart_unit_services(): def restart_unit_services():
'''Reload the systemd configuration and restart the services.''' '''Restart worker services.'''
# Tell systemd to reload configuration from disk for all daemons. hookenv.log('Restarting kubelet and kube-proxy.')
call(['systemctl', 'daemon-reload']) services = ['kube-proxy', 'kubelet']
# Ensure the services available after rebooting. for service in services:
call(['systemctl', 'enable', 'kubelet.service']) service_restart('snap.%s.daemon' % service)
call(['systemctl', 'enable', 'kube-proxy.service'])
# Restart the services.
hookenv.log('Restarting kubelet, and kube-proxy.')
call(['systemctl', 'restart', 'kubelet'])
remove_state('kubernetes-worker.kubelet.restart')
call(['systemctl', 'restart', 'kube-proxy'])
def get_kube_api_servers(kube_api): def get_kube_api_servers(kube_api):
@ -504,8 +614,7 @@ def initial_nrpe_config(nagios=None):
@when_any('config.changed.nagios_context', @when_any('config.changed.nagios_context',
'config.changed.nagios_servicegroups') 'config.changed.nagios_servicegroups')
def update_nrpe_config(unused=None): def update_nrpe_config(unused=None):
services = ('kubelet', 'kube-proxy') services = ('snap.kubelet.daemon', 'snap.kube-proxy.daemon')
hostname = nrpe.get_nagios_hostname() hostname = nrpe.get_nagios_hostname()
current_unit = nrpe.get_nagios_unit_name() current_unit = nrpe.get_nagios_unit_name()
nrpe_setup = nrpe.NRPE(hostname=hostname) nrpe_setup = nrpe.NRPE(hostname=hostname)
@ -519,7 +628,7 @@ def remove_nrpe_config(nagios=None):
remove_state('nrpe-external-master.initial-config') remove_state('nrpe-external-master.initial-config')
# List of systemd services for which the checks will be removed # List of systemd services for which the checks will be removed
services = ('kubelet', 'kube-proxy') services = ('snap.kubelet.daemon', 'snap.kube-proxy.daemon')
# The current nrpe-external-master interface doesn't handle a lot of logic, # The current nrpe-external-master interface doesn't handle a lot of logic,
# use the charm-helpers code for now. # use the charm-helpers code for now.
@ -530,41 +639,26 @@ def remove_nrpe_config(nagios=None):
nrpe_setup.remove_check(shortname=service) nrpe_setup.remove_check(shortname=service)
def set_privileged(privileged, render_config=True): def set_privileged():
"""Update the KUBE_ALLOW_PRIV flag for kubelet and re-render config files. """Update the allow-privileged flag for kubelet.
If the flag already matches the requested value, this is a no-op.
:param str privileged: "true" or "false"
:param bool render_config: whether to render new config files
:return: True if the flag was changed, else false
""" """
if privileged == "true": privileged = hookenv.config('allow-privileged')
if privileged == 'auto':
gpu_enabled = is_state('kubernetes-worker.gpu.enabled')
privileged = 'true' if gpu_enabled else 'false'
flag = 'allow-privileged'
hookenv.log('Setting {}={}'.format(flag, privileged))
kubelet_opts = FlagManager('kubelet')
kubelet_opts.add(flag, privileged)
if privileged == 'true':
set_state('kubernetes-worker.privileged') set_state('kubernetes-worker.privileged')
else: else:
remove_state('kubernetes-worker.privileged') remove_state('kubernetes-worker.privileged')
flag = '--allow-privileged'
kube_allow_priv_opts = FlagManager('KUBE_ALLOW_PRIV')
if kube_allow_priv_opts.get(flag) == privileged:
# Flag isn't changing, nothing to do
return False
hookenv.log('Setting {}={}'.format(flag, privileged))
# Update --allow-privileged flag value
kube_allow_priv_opts.add(flag, privileged, strict=True)
# re-render config with new options
if render_config:
render_init_scripts()
# signal that we need a kubelet restart
set_state('kubernetes-worker.kubelet.restart')
return True
@when('config.changed.allow-privileged') @when('config.changed.allow-privileged')
@when('kubernetes-worker.config.created') @when('kubernetes-worker.config.created')
@ -572,29 +666,11 @@ def on_config_allow_privileged_change():
"""React to changed 'allow-privileged' config value. """React to changed 'allow-privileged' config value.
""" """
config = hookenv.config() set_state('kubernetes-worker.restart-needed')
privileged = config['allow-privileged']
if privileged == "auto":
return
set_privileged(privileged)
remove_state('config.changed.allow-privileged') remove_state('config.changed.allow-privileged')
@when('kubernetes-worker.kubelet.restart')
def restart_kubelet():
"""Restart kubelet.
"""
# Make sure systemd loads latest service config
call(['systemctl', 'daemon-reload'])
# Restart kubelet
service_restart('kubelet')
remove_state('kubernetes-worker.kubelet.restart')
@when('cuda.installed') @when('cuda.installed')
@when('kubernetes-worker.components.installed')
@when('kubernetes-worker.config.created') @when('kubernetes-worker.config.created')
@when_not('kubernetes-worker.gpu.enabled') @when_not('kubernetes-worker.gpu.enabled')
def enable_gpu(): def enable_gpu():
@ -610,34 +686,35 @@ def enable_gpu():
return return
hookenv.log('Enabling gpu mode') hookenv.log('Enabling gpu mode')
try:
# Not sure why this is necessary, but if you don't run this, k8s will
# think that the node has 0 gpus (as shown by the output of
# `kubectl get nodes -o yaml`
check_call(['nvidia-smi'])
except CalledProcessError as cpe:
hookenv.log('Unable to communicate with the NVIDIA driver.')
hookenv.log(cpe)
return
kubelet_opts = FlagManager('kubelet') kubelet_opts = FlagManager('kubelet')
if get_version('kubelet') < (1, 6): if get_version('kubelet') < (1, 6):
hookenv.log('Adding --experimental-nvidia-gpus=1 to kubelet') hookenv.log('Adding --experimental-nvidia-gpus=1 to kubelet')
kubelet_opts.add('--experimental-nvidia-gpus', '1') kubelet_opts.add('experimental-nvidia-gpus', '1')
else: else:
hookenv.log('Adding --feature-gates=Accelerators=true to kubelet') hookenv.log('Adding --feature-gates=Accelerators=true to kubelet')
kubelet_opts.add('--feature-gates', 'Accelerators=true') kubelet_opts.add('feature-gates', 'Accelerators=true')
# enable privileged mode and re-render config files
set_privileged("true", render_config=False)
render_init_scripts()
# Apply node labels # Apply node labels
_apply_node_label('gpu=true', overwrite=True) _apply_node_label('gpu=true', overwrite=True)
_apply_node_label('cuda=true', overwrite=True) _apply_node_label('cuda=true', overwrite=True)
# Not sure why this is necessary, but if you don't run this, k8s will
# think that the node has 0 gpus (as shown by the output of
# `kubectl get nodes -o yaml`
check_call(['nvidia-smi'])
set_state('kubernetes-worker.gpu.enabled') set_state('kubernetes-worker.gpu.enabled')
set_state('kubernetes-worker.kubelet.restart') set_state('kubernetes-worker.restart-needed')
@when('kubernetes-worker.gpu.enabled') @when('kubernetes-worker.gpu.enabled')
@when_not('kubernetes-worker.privileged') @when_not('kubernetes-worker.privileged')
@when_not('kubernetes-worker.restart-needed')
def disable_gpu(): def disable_gpu():
"""Disable GPU usage on this node. """Disable GPU usage on this node.
@ -650,18 +727,16 @@ def disable_gpu():
kubelet_opts = FlagManager('kubelet') kubelet_opts = FlagManager('kubelet')
if get_version('kubelet') < (1, 6): if get_version('kubelet') < (1, 6):
kubelet_opts.destroy('--experimental-nvidia-gpus') kubelet_opts.destroy('experimental-nvidia-gpus')
else: else:
kubelet_opts.remove('--feature-gates', 'Accelerators=true') kubelet_opts.remove('feature-gates', 'Accelerators=true')
render_init_scripts()
# Remove node labels # Remove node labels
_apply_node_label('gpu', delete=True) _apply_node_label('gpu', delete=True)
_apply_node_label('cuda', delete=True) _apply_node_label('cuda', delete=True)
remove_state('kubernetes-worker.gpu.enabled') remove_state('kubernetes-worker.gpu.enabled')
set_state('kubernetes-worker.kubelet.restart') set_state('kubernetes-worker.restart-needed')
@when('kubernetes-worker.gpu.enabled') @when('kubernetes-worker.gpu.enabled')
View File
@ -0,0 +1,6 @@
apiVersion: v1
data:
body-size: 1024m
kind: ConfigMap
metadata:
name: nginx-load-balancer-conf
View File
@ -1,4 +1,9 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: nginx-load-balancer-conf
---
apiVersion: v1
kind: ReplicationController
metadata:
name: nginx-ingress-controller
@ -45,3 +50,4 @@ spec:
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-http-backend
- --nginx-configmap=$(POD_NAMESPACE)/nginx-load-balancer-conf
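The new `--nginx-configmap` argument points the ingress controller at the ConfigMap rendered above (the registry action later patches the same ConfigMap to raise the allowed request body size). Its current contents can be checked from a worker, for example:

```shell
kubectl --kubeconfig=/root/cdk/kubeconfig get configmap nginx-load-balancer-conf -o yaml
```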
View File
@ -1,22 +0,0 @@
###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
# kube-apiserver.service
# kube-controller-manager.service
# kube-scheduler.service
# kubelet.service
# kube-proxy.service
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"
# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"
# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="{{ kube_allow_priv }}"
# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="{{ kube_api_endpoint }}"
View File
@ -1 +0,0 @@
KUBE_PROXY_ARGS="{{ kube_proxy_opts }}"
View File
@ -1,19 +0,0 @@
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=http://kubernetes.io/docs/admin/kube-proxy/
After=network.target
[Service]
EnvironmentFile=-/etc/default/kube-default
EnvironmentFile=-/etc/default/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_MASTER \
$KUBE_PROXY_ARGS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
View File
@ -1,14 +0,0 @@
# kubernetes kubelet (node) config
# The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=0.0.0.0"
# The port for the info server to serve on
KUBELET_PORT="--port=10250"
# You may leave this blank to use the actual hostname. If you override this
# reachability problems become your own issue.
# KUBELET_HOSTNAME="--hostname-override={{ JUJU_UNIT_NAME }}"
# Add your own!
KUBELET_ARGS="{{ kubelet_opts }}"

View File
[Unit]
Description=Kubernetes Kubelet Server
Documentation=http://kubernetes.io/docs/admin/kubelet/
After=docker.service
Requires=docker.service
[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/default/kube-default
EnvironmentFile=-/etc/default/kubelet
ExecStart=/usr/local/bin/kubelet \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBELET_ADDRESS \
$KUBELET_PORT \
$KUBELET_HOSTNAME \
$KUBE_ALLOW_PRIV \
$KUBELET_ARGS
Restart=on-failure
[Install]
WantedBy=multi-user.target

View File
apiVersion: v1
kind: Secret
metadata:
name: registry-tls-data
type: Opaque
data:
tls.crt: {{ tlscert }}
tls.key: {{ tlskey }}
---
apiVersion: v1
kind: Secret
metadata:
name: registry-auth-data
type: Opaque
data:
htpasswd: {{ htpasswd }}
---
apiVersion: v1
kind: ReplicationController
metadata:
name: kube-registry-v0
labels:
k8s-app: kube-registry
version: v0
kubernetes.io/cluster-service: "true"
spec:
replicas: 1
selector:
k8s-app: kube-registry
version: v0
template:
metadata:
labels:
k8s-app: kube-registry
version: v0
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: registry
image: registry:2
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 100m
memory: 100Mi
requests:
cpu: 100m
memory: 100Mi
env:
- name: REGISTRY_HTTP_ADDR
value: :5000
- name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
value: /var/lib/registry
- name: REGISTRY_AUTH_HTPASSWD_REALM
value: basic_realm
- name: REGISTRY_AUTH_HTPASSWD_PATH
value: /auth/htpasswd
volumeMounts:
- name: image-store
mountPath: /var/lib/registry
- name: auth-dir
mountPath: /auth
ports:
- containerPort: 5000
name: registry
protocol: TCP
volumes:
- name: image-store
hostPath:
path: /srv/registry
- name: auth-dir
secret:
secretName: registry-auth-data
---
apiVersion: v1
kind: Service
metadata:
name: kube-registry
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "KubeRegistry"
spec:
selector:
k8s-app: kube-registry
type: LoadBalancer
ports:
- name: registry
port: 5000
protocol: TCP
---
apiVersion: v1
kind: Secret
metadata:
name: registry-access
data:
.dockercfg: {{ dockercfg }}
type: kubernetes.io/dockercfg
{%- if ingress %}
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: registry-ing
spec:
tls:
- hosts:
- {{ domain }}
secretName: registry-tls-data
rules:
- host: {{ domain }}
http:
paths:
- backend:
serviceName: kube-registry
servicePort: 5000
path: /
{% endif %}
View File
@ -47,7 +47,7 @@ cluster/juju/layers/kubernetes-master/reactive/kubernetes_master.py:def send_clu
cluster/juju/layers/kubernetes-master/reactive/kubernetes_master.py:def service_cidr():
cluster/juju/layers/kubernetes-worker/reactive/kubernetes_worker.py: context.update({'kube_api_endpoint': ','.join(api_servers),
cluster/juju/layers/kubernetes-worker/reactive/kubernetes_worker.py: ca_cert_path = layer_options.get('ca_certificate_path')
-cluster/juju/layers/kubernetes-worker/reactive/kubernetes_worker.py:def render_init_scripts(api_servers):
+cluster/juju/layers/kubernetes-worker/reactive/kubernetes_worker.py:def configure_worker_services(api_servers, dns, cluster_cidr):
cluster/lib/logging.sh: local source_file=${BASH_SOURCE[$frame_no]}
cluster/lib/logging.sh: local source_file=${BASH_SOURCE[$stack_skip]}
cluster/log-dump.sh: local -r node_name="${1}"