Fixes#8196. Maybe. If my theory is correct on how we got there. Also
changes the inference of master to be based on the master name, not
the node instance prefix. That way if we somehow have a bogus
hostname, the master will configure itself as a node, the whole
cluster fails, and it's a ton more obvious.
And adds a .sha1 cache file to indicate what file was already pushed
to GCS, and how to force it if not, removing a few seconds off a
kube-up/push if you're just cycling.
With this and #7602, all TAR_URLS will have a .sha1 as well.
Next ContainerVM image will have SaltStack in it. Also be a little
less persnickety if it's found running. This isn't the case, but we
don't have to be aggressive.
Tested on GCE.
Includes untested modifications for AWS and Vagrant.
No changes for any other distros.
Probably will work on other up-to-date providers
but beware. Symptom would be that service proxying
stops working.
1. Generates a token kube-proxy in AWS, GCE, and Vagrant setup scripts.
1. Distributes the token via salt-overlay, and salt to /var/lib/kube-proxy/kubeconfig
1. Changes kube-proxy args:
- use the --kubeconfig argument
- changes --master argument from http://MASTER:7080 to https://MASTER
- http -> https
- explicit port 7080 -> implied 443
Possible ways this might break other distros:
Mitigation: there is an default empty kubeconfig file.
If the distro does not populate the salt-overlay, then
it should get the empty, which parses to an empty
object, which, combined with the --master argument,
should still work.
Mitigation:
- azure: Special case to use 7080 in
- rackspace: way out of date, so don't care.
- vsphere: way out of date, so don't care.
- other distros: not using salt.
ensure-kube-token is not needed anymore because
the token passed in kube-env.
In the up case it is set, in the push case it is an empty string
but not used.
Allow unset KUBELET_TOKEN (for push case).
Fix comment.
- Configure the apiserver to listen securely on 443 instead of 6443.
- Configure the kubelet to connect to 443 instead of 6443.
- Update documentation to refer to bearer tokens instead of basic auth.
Generates the new token on AWS, GCE, Vagrant.
Renames instance metadata from "kube-token" to "kubelet-token".
(Is this okay for GKE?)
Having separate tokens for kubelet and kube-proxy permits
using principle of least privilege, makes it easy to
rate limit the clients separately, allows annotation
of apiserver logs with the client identity at a finer grain
than just source-ip.
* Fixes an issue where salt-minion would actually come up after reboot
(upstart is horrible obnoxious)
* Caches .deb downloads
* Handles PD remount on reboot correctly
* Notes a future optimization
Fixes#5666
This is needed when we upgrade (and useful when you're trying to
change the startup script for reboots).
Along the way: allow add-instance-metadata[-from-file] to take a
variable number of KVs.
Address #6075: Shoot the master VM while saving the master-pd. This
takes a couple of minor changes to configure-vm.sh, some of which also
would be necessary for reboot. In particular, I changed it so that the
kube-token instance metadata is no longer required after inception;
instead, we mount the master-pd and see if we've already created the
known tokens file before blocking on the instance metadata.
Also partially addresses #6099 in bash by refactoring the kube-push
path.
These secrets will be used in subsequent PRs by:
scheduler, controller-manager, monitoring services,
logging services, and skydns.
Each of these services will then be able to stop using kubernetes-ro
or host networking.
It looks like api_servers finally won this battle. Kill off the
last remaining places passing it, but allow the kubelet Salt to
accept apiservers for a period of time.
(This was bothering my OCD.)
We were actually failing to call brctl in configure-vm.sh. I finally
tracked it down to the attempt to delete the docker0 bridge. This
particular package was getting installed later by Salt anyways, so
all this PR is doing is moving the package install up from Salt to
bash.
Also adds some minor logging.
Nodes are probably broken if update or install fails. Don't proceed
if we can't get past these. Also, instead of ignoring the error off
dpkg, use --force depends, which changes the errors to be kinder
warnings for anyone looking through the logs.
Deletion is wonderful. The only weird thing was where to put the
message about the proxy URLs. Satnam suggested kubectl clusterinfo,
which seemed like a good option to put at the end of cluster turn-up.
the /usr/sbin/policy-rc.d script and returning 101, per
https://people.debian.org/~hmh/invokerc.d-policyrc.d-specification.txt,
but only for the window while we're installing Salt.
This is a much more fool-proof method than what I was attempting
before. I hunted for how to do this before and clearly failed at my
Google-fu.
Fixes#5621
This was a dumb mistake during a re-factor of configure-vm. I tested
this early, re-factored the tail of this file, spot checked kube-push
and failed to test kube-push properly. My bad.
Fixes#5361. Fixes#5408.
IP address. The configure-vm script can resolve this relatively easily
on the node. This is less painful for GKE, which creates all the
resources in parallel.
Change provisioning to pass all variables to both master and node. Run
Salt in a masterless setup on all nodes ala
http://docs.saltstack.com/en/latest/topics/tutorials/quickstart.html,
which involves ensuring Salt daemon is NOT running after install. Kill
Salt master install. And fix push to actually work in this new flow.
As part of this, the GCE Salt config no longer has access to the Salt
mine, which is primarily obnoxious for two reasons: - The minions
can't use Salt to see the master: this is easily fixed by static
config. - The master can't see the list of all the minions: this is
fixed temporarily by static config in util.sh, but later, by other
means (see
https://github.com/GoogleCloudPlatform/kubernetes/issues/156, which
should eventually remove this direction).
As part of it, flatten all of cluster/gce/templates/* into
configure-vm.sh, using a single, separate piece of YAML to drive the
environment variables, rather than constantly rewriting the startup
script.
the master's IP upon creation to make it easier to replace the master later.
This pulls out the parts of PR #3174 that don't break anything and will
make upgrading existing clusters in the future less painful.
Add /etc/salt to the master-pd
Change the .kubeconfig context that gce kube-up creates to project
+ instance prefix, so you can spin up clusters with the same name
in different compute projects without overwriting .kubeconfig.
In config-{default,test.sh}. This will make it possible for e.g.
Jenkins to override LOGGING_DESTINATION. Also reorder the parameters
so they're in the same order across files for easier scanning.
Adds labels to the services, waits for them to be created (which
should be instant, but just in case), query the forwarding rules like
as we did before.
Fixes#3893
This implements phase 1 of the proposal in #3579, moving the creation
of the pods, RCs, and services to the master after the apiserver is
available.
This is such a wide commit because our existing initial config story
is special:
* Add kube-addons service and associated salt configuration:
** We configure /etc/kubernetes/addons to be a directory of objects
that are appropriately configured for the current cluster.
** "/etc/init.d/kube-addons start" slurps up everything in that dir.
(Most of the difficult is the business logic in salt around getting
that directory built at all.)
** We cheat and overlay cluster/addons into saltbase/salt/kube-addons
as config files for the kube-addons meta-service.
* Change .yaml.in files to salt templates
* Rename {setup,teardown}-{monitoring,logging} to
{setup,teardown}-{monitoring,logging}-firewall to properly reflect
their real purpose now (the purpose of these functions is now ONLY to
bring up the firewall rules, and possibly to relay the IP to the user).
* Rework GCE {setup,teardown}-{monitoring,logging}-firewall: Both
functions were improperly configuring global rules, yet used
lifecycles tied to the cluster. Use $NODE_INSTANCE_PREFIX with the
rule. The logging rule needed a $NETWORK specifier. The monitoring
rule tried gcloud describe first, but given the instancing, this feels
like a waste of time now.
* Plumb ENABLE_CLUSTER_MONITORING, ENABLE_CLUSTER_LOGGING,
ELASTICSEARCH_LOGGING_REPLICAS and DNS_REPLICAS down to the master,
since these are needed there now.
(Desperately want just a yaml or json file we can share between
providers that has all this crap. Maybe #3525 is an answer?)
Huge caveats: I've gone pretty firm testing on GCE, including
twiddling the env variables and making sure the objects I expect to
come up, come up. I've tested that it doesn't break GKE bringup
somehow. But I haven't had a chance to test the other providers.
Removed auth for Grafana to facilitate usage via service proxy on the api-server.
Added a grafana service
Removed elasticsearch dependency for monitoring - faster startup times.
Removed auth for Grafana to facilitate usage via service proxy on the api-server.
Added a grafana service
Removed elasticsearch dependency for monitoring - faster startup times.
persistent disk when running on GCE.
I'll follow up soon with a second PR that enables kube-push to
completely bring down the master VM and replace it with a new one.
Quoting is hard. When writing strings into YAML files, wrap them in single quotes. Also escape any embedded single quotes in those strings via a double signle quote ('').
Configure apiserver to serve Securely on port 6443.
Generate token for kubelets during master VM startup.
Put token into file apiserver can get and another file the kubelets can get.
Added e2e test.
This change refactors the way Kubelet's DockerPuller handles the docker config credentials to utilize a new credentialprovider library.
The credentialprovider library is based on several of the files from the Kubelet's dockertools directory, but supports a new pluggable model for retrieving a .dockercfg-compatible JSON blob with credentials.
With this change, the Kubelet will lazily ask for the docker config from a set of DockerConfigProvider extensions each time it needs a credential.
This change provides common implementations of DockerConfigProvider for:
- "Default": load .dockercfg from disk
- "Caching": wraps another provider in a cache that expires after a pre-specified lifetime.
GCP-only:
- "google-dockercfg": reads a .dockercfg from a GCE instance's metadata
- "google-dockercfg-url": reads a .dockercfg from a URL specified in a GCE instance's metadata.
- "google-container-registry": reads an access token from GCE metadata into a password field.