678 Commits (26bcdb1bd677b3319496cb5283287ddf83962df3)

Author SHA1 Message Date
Justin Santa Barbara f8af47b645 AWS: Pass globals into Backoff struct
Thanks for the suggestion bprashanth!
2016-02-21 20:17:22 -05:00
k8s-merge-robot 2e3053a204 Merge pull request #21431 from freehan/sourcerange
Auto commit by PR queue bot
2016-02-21 16:14:42 -08:00
Justin Santa Barbara b269e8f43c AWS: Delay all AWS calls when we observe RequestLimitExceeded errors
This applies a cross-request time delay when we observe
RequestLimitExceeded errors, unlike the default library behaviour which
only applies a *per-request* backoff.

Issue #12121
2016-02-21 15:09:42 -05:00
Justin Santa Barbara 22d719018a AWS: Recover if tags missing on security group
In the AWS API (generally) we tag things we create, and then we filter
to find them.  However, creation & tagging are typically two separate
calls.  So there is a chance that we will create an object, but fail to
tag it.

We fix this (done here in the case of security groups, but we can do
this more generally) by retrieving the resource without a tag filter.
If the retrieved resource has the correct tags, great.  If it has the
tags for another cluster, that's a problem, and we raise an error.  If
it has no tags at all, we add the tags.

This only works where the resource is uniquely named (or we can
otherwise retrieve it uniquely).  For security groups, the SG name comes
from the service UUID, so that's unique.

Fixes #11324
2016-02-20 14:58:19 -05:00
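The three outcomes described in the commit above (correct tags, tags for another cluster, no tags) can be sketched as a small decision function. The tag key `KubernetesCluster`, the function name `needsTagging`, and the error value are assumptions made for this illustration; the sketch also simplifies "no tags at all" to "the cluster tag is absent".

```go
package main

import (
	"errors"
	"fmt"
)

// clusterTagKey is an assumed tag key, used only for this illustration.
const clusterTagKey = "KubernetesCluster"

var errOtherCluster = errors.New("resource is tagged for another cluster")

// needsTagging inspects the tags of a resource that was retrieved by its
// unique name, without a tag filter. It returns true when the tags are
// missing and must be added, false when they are already correct, and an
// error when the resource belongs to a different cluster.
func needsTagging(tags map[string]string, cluster string) (bool, error) {
	owner, ok := tags[clusterTagKey]
	switch {
	case !ok:
		return true, nil
	case owner == cluster:
		return false, nil
	default:
		return false, errOtherCluster
	}
}

func main() {
	cases := []map[string]string{
		{},                         // created but never tagged
		{clusterTagKey: "prod"},    // correctly tagged
		{clusterTagKey: "staging"}, // belongs to another cluster
	}
	for _, tags := range cases {
		add, err := needsTagging(tags, "prod")
		fmt.Println(add, err)
	}
}
```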
k8s-merge-robot f08a8f23c1 Merge pull request #20959 from justinsb/fix_20911
Auto commit by PR queue bot
2016-02-20 00:56:53 -08:00
k8s-merge-robot 6be4417aff Merge pull request #17649 from jtblin/jtblin/17647-aws-elb-creation-no-tags
Auto commit by PR queue bot
2016-02-18 19:41:15 -08:00
Minhan Xia 7ffb123abe add source range support for loadbalancer on gce 2016-02-18 17:05:02 -08:00
Jerome Touffe-Blin 62a8b3d44c Fix #17912 - pick public subnets only on ELB creation 2016-02-16 23:16:38 +11:00
Rudi Chiarito da4ba61232 Fix AWS IPPermission check for case with preexisting groups and ranges 2016-02-15 11:41:18 -05:00
k8s-merge-robot d6b4ff3884 Merge pull request #20909 from Clarifai/instance-type-label
Auto commit by PR queue bot
2016-02-13 18:51:42 -08:00
Rudi Chiarito b3863eae82 Add instance-type label to cloud providers
Fully implemented for AWS and GCE
2016-02-12 15:02:03 -05:00
Jan Safranek 1d0b1c227b Add PV.Name into names of generated GCE/AWS/OSP volumes.
Volume names now have the format <cluster-name>-dynamic-<pv-name>.

pv-name is guaranteed to be unique in a Kubernetes cluster; adding
<cluster-name> ensures we don't conflict with any other running cluster
in the cloud project (kube-controller-manager --cluster-name=XXX).

'kubernetes' is the default cluster name.
2016-02-12 09:46:59 +01:00
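As a rough sketch of the naming scheme from the commit above, the function below builds a name of the form <cluster-name>-dynamic-<pv-name>. The function name and the fallback to the default when the cluster name is empty are assumptions for illustration; the commit only states the format and that 'kubernetes' is the default cluster name.

```go
package main

import "fmt"

// defaultClusterName mirrors the default stated in the commit message.
const defaultClusterName = "kubernetes"

// dynamicVolumeName is a hypothetical helper that builds
// <cluster-name>-dynamic-<pv-name>. Falling back to the default when
// clusterName is empty is an assumption of this sketch.
func dynamicVolumeName(clusterName, pvName string) string {
	if clusterName == "" {
		clusterName = defaultClusterName
	}
	return fmt.Sprintf("%s-dynamic-%s", clusterName, pvName)
}

func main() {
	fmt.Println(dynamicVolumeName("", "pv-gce-oxwts"))
	fmt.Println(dynamicVolumeName("staging", "pv-gce-oxwts"))
}
```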
Justin Santa Barbara a0093eb503 e2e: Don't try to create a UDP LoadBalancer on AWS
AWS doesn't support type=LoadBalancer with UDP services.  For now, we
simply skip over the test with type=LoadBalancer on AWS for the UDP
service.

Fix #20911
2016-02-10 12:31:18 -05:00
k8s-merge-robot 9520bb5ddf Merge pull request #20731 from Clarifai/ensure-lb-servicename
Auto commit by PR queue bot
2016-02-10 01:25:07 -08:00
k8s-merge-robot 2ec49efd54 Merge pull request #19945 from Clarifai/fix-formatting
Auto commit by PR queue bot
2016-02-09 16:05:00 -08:00
David Pratt 57782459e6 Log missing public IP from AWS metadata. 2016-02-09 12:06:02 -06:00
David Pratt adde7f548f Fix AWS kubelet registration.
This commit allows the AWS cloud provider plugin to work on EC2 instances
that do not have a public IP. The EC2 metadata service returns a 404 for the
'public-ipv4' endpoint for private instances, and the plugin was bubbling this
up as a fatal error.
2016-02-09 11:46:53 -06:00
Rudi Chiarito 5874b0cb9d Pass namespaced service name to cloudprovider's EnsureLoadBalancer
Also has an AWS implementation that plugs the service name into the ELB and SG.
Log the service name under GCE and OpenStack.
Fixes #20668
2016-02-09 06:50:53 -05:00
k8s-merge-robot 318705feb9 Merge pull request #20378 from jtblin/jtblin/11543-aws-cloud-provider-private-hosted-dns-zone-via-api
Auto commit by PR queue bot
2016-02-08 23:43:59 -08:00
k8s-merge-robot fec0d127b3 Merge pull request #15938 from justinsb/aws_ebs_cleanup
Auto commit by PR queue bot
2016-02-08 21:42:52 -08:00
Tim Hockin fecb71420c Demote static IPs ASAP for easier cleanup
This exposed bugs in the IP promotion/demotion logic.
2016-02-06 21:15:06 -08:00
k8s-merge-robot 257c3ad776 Merge pull request #20153 from sky-uk/fix-sg-comparison
Auto commit by PR queue bot
2016-02-06 12:25:26 -08:00
k8s-merge-robot 1b52e0ec3a Merge pull request #20210 from jsafrane/devel/gce-tags
Auto commit by PR queue bot
2016-02-05 21:36:25 -08:00
Tim Hockin a6cb6d76e3 Use defer for IP release 2016-02-04 16:40:28 -08:00
Jerome Touffe-Blin 0a36db19bf Fix #11543 - use DescribeInstances API to retrieve the correct private DNS name 2016-02-04 08:48:25 +11:00
Justin Santa Barbara f61a5d0400 AWS: Switch arguments to AttachDisk/DetachDisk to match GCE 2016-02-03 20:43:23 +00:00
Justin Santa Barbara 6c87a4be7c AWS: Handle deleting volume that no longer exists
The tests in particular double-delete volumes, so we need to handle this
gracefully.
2016-02-03 20:43:14 +00:00
Justin Santa Barbara 1ae1db6027 AWS: Update copy-paste of GCE PD code to latest version
We are (sadly) using a copy-and-paste of the GCE PD code for AWS EBS.
This code hasn't been updated in a while, and it seems that the GCE code
has some code to make volume mounting more robust that we should copy.
2016-02-03 20:43:14 +00:00
Rudi Chiarito a0831a2378 Mass fix of Infof and co. missing the trailing "f", even when formatting placeholders are used 2016-02-03 11:34:59 -05:00
k8s-merge-robot df4d50a7ac Merge pull request #20098 from brendandburns/flake2
Auto commit by PR queue bot
2016-02-02 04:22:45 -08:00
k8s-merge-robot bf9523a03a Merge pull request #20195 from bprashanth/fix_zone_url
Auto commit by PR queue bot
2016-02-02 03:50:05 -08:00
k8s-merge-robot 5a6cf15c09 Merge pull request #19874 from justinsb/aws_fix_findinstancesbynodenames
Auto commit by PR queue bot
2016-02-01 21:38:46 -08:00
Brendan Burns 2aa5dc317b Try harder to delete cloud resources in service tests 2016-02-01 15:34:55 -08:00
k8s-merge-robot 3db1a6c3ce Merge pull request #19976 from aarondav/elb-timeout
Auto commit by PR queue bot
2016-01-30 18:57:56 -08:00
k8s-merge-robot d63398a543 Merge pull request #18829 from lebauce/openstack-use-os-ext-ips
Auto commit by PR queue bot
2016-01-29 17:49:43 -08:00
Fabio Yeon eb2c2d1af4 Merge pull request #20111 from fabioy/fix-tmp-tests
Add temp directory creation method for tests.
2016-01-29 09:51:12 -08:00
Fabio Yeon 7205a160ac Remove all instances of "/tmp" from unit tests and replace with a common
tmp directory creator. Exception is documented.
2016-01-27 16:11:22 -08:00
Prashanth Balasubramanian bcd39900b3 Modify Create/SetGlobalForwardingRule to just take a link. 2016-01-27 16:00:45 -08:00
Prashanth Balasubramanian cba768322a Add cloudprovider methods for ssl. 2016-01-27 16:00:45 -08:00
Jan Safranek 23cd0913f7 Tag dynamically created GCE PD disks.
GCE disks don't have tags, so we must encode the tags into the Description field.
It's encoded as JSON, which is both human and machine readable:
description: '{"kubernetes.io/created-for/pv/name":"pv-gce-oxwts","kubernetes.io/created-for/pvc/name":"myclaim","kubernetes.io/created-for/pvc/namespace":"default"}'
2016-01-27 15:16:05 +01:00
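The encoding described in the commit above can be sketched with Go's standard encoding/json package. The helper names `encodeTags` and `decodeTags` are hypothetical; only the tag keys and example values come from the commit message. Because json.Marshal sorts map keys, the encoded string is deterministic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeTags serializes the tags as a JSON object suitable for storing
// in a free-text description field.
func encodeTags(tags map[string]string) (string, error) {
	b, err := json.Marshal(tags)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeTags parses a description produced by encodeTags back into a map.
func decodeTags(description string) (map[string]string, error) {
	tags := map[string]string{}
	err := json.Unmarshal([]byte(description), &tags)
	return tags, err
}

func main() {
	description, err := encodeTags(map[string]string{
		"kubernetes.io/created-for/pv/name":       "pv-gce-oxwts",
		"kubernetes.io/created-for/pvc/name":      "myclaim",
		"kubernetes.io/created-for/pvc/namespace": "default",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(description)

	tags, err := decodeTags(description)
	if err != nil {
		panic(err)
	}
	fmt.Println(tags["kubernetes.io/created-for/pvc/name"])
}
```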
Prashanth Balasubramanian 5dec84c0dc Fix zone in cloudprovider method. 2016-01-26 19:12:31 -08:00
James Ravn c3383b3422 Fix formatting of aws_test.go 2016-01-26 15:13:25 +00:00
James Ravn b4bbc3b3ef Fix removal of aws security group ingress
The ip permission method now checks for containment, not equality, so
the order of parameters matters. This change fixes
`removeSecurityGroupIngress` to pass in the removal permission first to
compare against the existing permission.
2016-01-26 13:57:52 +00:00
Dogan Narinc and James Ravn 190c829ac5 Fix AWS ip permission comparison on security group
Change isEqualIPPermission to consider the entire list of security group
ids when checking if a security group id has already been added.

This is used for example when adding and removing ingress rules to the
cluster nodes from an elastic load balancer. Without this, once there
are multiple load balancers, the method as it stands incorrectly returns
false even if the security group id is in the list of group ids. This
causes a few problems: dangling security groups which fill up an
account's limit since they don't get removed, and inability to recreate
load balancers in certain situations (receiving an
InvalidPermission.Duplicate from AWS when adding the same security
group).
2016-01-26 11:30:27 +00:00
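A containment check of the kind described in the commit above might look like the sketch below. The `permission` struct is a simplified stand-in for the EC2 IP permission type, and `containsPermission` is a hypothetical name; neither is taken from the actual code. Because the check is containment rather than equality, argument order matters: the permission being added or removed comes first, the existing one second.

```go
package main

import "fmt"

// permission is a simplified, hypothetical stand-in for an EC2 IP
// permission: one protocol and port range plus the security group ids
// that are allowed to use it.
type permission struct {
	protocol string
	fromPort int
	toPort   int
	groupIDs []string
}

// containsPermission reports whether every group id in wanted is already
// present in existing, for the same protocol and port range. The whole
// list of existing group ids is considered, not just one entry.
func containsPermission(wanted, existing permission) bool {
	if wanted.protocol != existing.protocol ||
		wanted.fromPort != existing.fromPort ||
		wanted.toPort != existing.toPort {
		return false
	}
	have := make(map[string]bool, len(existing.groupIDs))
	for _, id := range existing.groupIDs {
		have[id] = true
	}
	for _, id := range wanted.groupIDs {
		if !have[id] {
			return false
		}
	}
	return true
}

func main() {
	existing := permission{protocol: "tcp", fromPort: 80, toPort: 80, groupIDs: []string{"sg-a", "sg-b"}}
	wanted := permission{protocol: "tcp", fromPort: 80, toPort: 80, groupIDs: []string{"sg-b"}}
	fmt.Println(containsPermission(wanted, existing)) // wanted is covered by existing
	fmt.Println(containsPermission(existing, wanted)) // reversed arguments give a different answer
}
```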
Rudi Chiarito 76e29ed455 Register ECR credential plugin only when an AWS cloud instance is created 2016-01-25 22:18:45 -05:00
Quinton Hoole 10f7985dfb Merge pull request #19995 from justinsb/gce_label_pd
Ubernetes-Lite GCE: Label volumes with zone information
2016-01-25 10:34:10 -08:00
Aaron Davidson 97689c326d Reduce healthy threshold and check interval for Amazon ELBs
According to AWS, the ELB healthy threshold is "Number of consecutive health check successes before declaring an EC2 instance healthy." It has an unusual interaction with Kubernetes, since all nodes will enter either an unhealthy state or a healthy state together depending on the service's healthiness as a whole.

We have observed that if our service goes down for the unhealthy threshold (which is 2 checks at 30 second intervals = 60 seconds), then the ELB will stop serving traffic to all nodes in the cluster, and will wait for the healthy threshold (currently 10 * 30 = 300 seconds) AFTER the service is restored to add back the cluster nodes, meaning it remains unreachable for an extra 300 seconds.

With the new settings, the ELB will continue to time out dead nodes after 60 seconds, but will restore healthy nodes after 20 seconds. The minimum value for healthyThreshold is 2, and the minimum value for interval is 5 seconds. I went for 10 seconds instead of the minimum somewhat arbitrarily, because I was not sure how much this value may affect the scalability of clusters in EC2, as it does put some extra load on the kube-proxy.
2016-01-23 11:10:37 -08:00
Justin Santa Barbara 1276675512 Ubernetes-Lite: Error if a PD name is ambiguous
We don't cope well if a PD is in multiple zones, but this is actually
fairly easy to detect.  This is probably justified purely on the basis
that we never want to delete the wrong volume (DeleteDisk), but also
because this means that we now warn on creation if a disk is in multiple
zones (with the labeling admission controller).

This also means that, with the scheduling predicate in place, many of
our volume problems "go away" in practice: you still can't create or
delete a volume when it is ambiguous, but thereafter the volume will be
labeled with the zone, which will match it only to nodes in the same
zone, and then we query for the volume in that zone when we
attach/detach it.
2016-01-22 17:16:38 -05:00
Justin Santa Barbara 900567288b Ubernetes Lite: Label volumes with zone information
When volumes are labeled, they will only be scheduled onto nodes in the
same zone.
2016-01-22 17:16:31 -05:00
Zach Loafman 7189db3701 Merge pull request #19396 from justinsb/aws_mountdevice
AWS: Use a strongly typed mountDevice
2016-01-22 11:04:23 -08:00
Justin Santa Barbara 2201c631b3 Fix for panic when instance not found
This removes a panic I mistakenly introduced when an instance is not
found, and also restores the exact prior behaviour for
getInstanceByName, where it returns cloudprovider.InstanceNotFound when
the instance is not found.
2016-01-21 19:15:09 -05:00
Alex Mohr f64a40f315 Merge pull request #19863 from justinsb/aws_fix_loadbalancer_tcp_check
AWS: Eliminate assumptions about all load-balancer ports matching
2016-01-21 10:52:20 -08:00
Alex Mohr b8f8a62775 Merge pull request #19862 from justinsb/aws_fix_tcploadbalancer_comment
AWS: Fix comment to reflect new method name
2016-01-21 10:51:37 -08:00
Alex Mohr 8e9bebb424 Merge pull request #19861 from justinsb/aws_remove_dead_code
AWS: Remove dead code
2016-01-21 10:51:06 -08:00
Alex Mohr 819cdc85d3 Merge pull request #19390 from danielschonfeld/optimize-kubelet-heartbeat-aws
optimize kubelet heartbeats instance data to be cached for AWS
2016-01-21 10:33:00 -08:00
Alex Mohr d2d349bc84 Merge pull request #19334 from resouer/network
Networking should be used to hold network related pkgs
2016-01-21 10:26:13 -08:00
Alex Mohr eaa61a72b0 Merge pull request #17919 from justinsb/multizone_gce
Ubernetes Lite support for GCE
2016-01-21 10:22:34 -08:00
k8s-merge-robot 0f6f521beb Merge pull request #18959 from jsafrane/devel/cinder-tags
Auto commit by PR queue bot
2016-01-21 03:33:58 -08:00
Justin Santa Barbara 43cbfb74fe Ubernetes Lite GCE: Support multiple zones in GCE cloud provider
We adapt the existing code to work across all zones in a region.

We require a feature-flag to enable Ubernetes-Lite

Reasons:

* There are some behavioural changes if users create volumes with
the same name in two zones.
* We don't want to make one API call per zone if we're not running
Ubernetes-Lite.
* Ubernetes-Lite is still experimental.

There isn't a parallel flag implemented for AWS, because at the moment
there would be no behaviour changes from this.
2016-01-20 23:04:53 -05:00
Justin Santa Barbara 9f44c72ba9 AWS: findInstancesByNodeNames should not make O(N) API calls
findInstancesByNodeNames was a simple loop around
findInstanceByNodeName, which made an EC2 API call for each call.

We've had trouble with this sort of behaviour hitting EC2 rate limits on
bigger clusters (e.g. #11979).

Instead, change this method to fetch _all_ the tagged EC2 instances, and
then loop through the local results.  This is one API call (modulo
paging).

We are currently only using findInstancesByNodeNames for the load
balancer, where we attach every node, so we were fetching all but one of
the instances anyway.

Issue #11979
2016-01-20 12:37:22 -05:00
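The change from one API call per node to one call overall can be sketched like this. The `fakeEC2` type and its `describeTaggedInstances` method are hypothetical stand-ins for the real EC2 client (paging is ignored), used only to make the call count visible.

```go
package main

import "fmt"

// instance is a simplified, hypothetical view of an EC2 instance.
type instance struct {
	id       string
	nodeName string
}

// fakeEC2 stands in for the EC2 client and counts how many list calls
// are made against it.
type fakeEC2 struct {
	instances []instance
	calls     int
}

// describeTaggedInstances returns all instances tagged for the cluster
// in a single call.
func (c *fakeEC2) describeTaggedInstances() []instance {
	c.calls++
	return c.instances
}

// findInstancesByNodeNames fetches every tagged instance once and then
// filters locally, instead of issuing one API call per node name.
func findInstancesByNodeNames(c *fakeEC2, names []string) []instance {
	wanted := make(map[string]bool, len(names))
	for _, n := range names {
		wanted[n] = true
	}
	var found []instance
	for _, inst := range c.describeTaggedInstances() {
		if wanted[inst.nodeName] {
			found = append(found, inst)
		}
	}
	return found
}

func main() {
	client := &fakeEC2{instances: []instance{
		{id: "i-1", nodeName: "node-a"},
		{id: "i-2", nodeName: "node-b"},
		{id: "i-3", nodeName: "node-c"},
	}}
	found := findInstancesByNodeNames(client, []string{"node-a", "node-c"})
	fmt.Println(len(found), "instances found with", client.calls, "API call(s)")
}
```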
Justin Santa Barbara 0586d866de AWS: Eliminate assumptions about all load-balancer ports matching
It costs us basically nothing to just check all the ports, and
protects us against future changes to the controller.
2016-01-20 09:30:46 -05:00
Justin Santa Barbara 12dd568662 AWS: Fix comment to reflect new method name
TCPLoadBalancer.EnsureTCPLoadBalancer => LoadBalancer.EnsureLoadBalancer
2016-01-20 09:28:28 -05:00
Justin Santa Barbara 30882265b6 AWS: Remove dead code
I think I added these functions by mistake; they aren't used and
apparently never were.
2016-01-20 09:23:53 -05:00
k8s-merge-robot 4969f11089 Merge pull request #19439 from bprashanth/compute_dep
Auto commit by PR queue bot
2016-01-16 10:38:11 -08:00
Harry Zhang 936a11e775 Use networking to hold network related pkgs
Change names of unclear methods

Use net as pkg name for short
2016-01-15 13:46:16 +08:00
Mike Danese 1f0b10bd22 Merge pull request #19538 from mesosphere/jdef_mesos_026_compat
MESOS: compatibility w/ mesos v0.26
2016-01-14 13:44:42 -08:00
Daniel Schonfeld 12cbc9ff89 optimize ExternalId() and InstanceId() to return cached results directly from the metadata service 2016-01-13 23:36:57 -05:00
James DeFelice ad1803a4ce construct master URIs from MasterInfo.Address if present; prefer /state over /state.json 2016-01-12 17:49:19 +00:00
Sylvain Baubeau b9dfe1b737 Use Openstack os-ext-ips extension to qualify IP address types
fixes #18409
2016-01-12 15:03:47 +01:00
David Oppenheimer 8ac484793d Comment out calls to httptest.Server.Close() to work around
https://github.com/golang/go/issues/12262 . See #19254 for
more details. This change should be reverted when we upgrade
to Go 1.6.
2016-01-11 23:02:11 -08:00
Prashanth Balasubramanian cc09a603dd Code changes 2016-01-11 16:27:12 -08:00
k8s-merge-robot 37b5726716 Merge pull request #14431 from Defensative/UDP-LB
Auto commit by PR queue bot
2016-01-08 12:39:02 -08:00
k8s-merge-robot e0e305c6be Merge pull request #19337 from danielschonfeld/optimize-list-routes
Auto commit by PR queue bot
2016-01-08 10:19:47 -08:00
Justin Santa Barbara 03900b1dc9 AWS: Use a strongly typed mountDevice
We've had problems in the past where using a plain string let us pass
the wrong value when detaching; stronger typing would have caught this
for us.
2016-01-08 00:25:11 -05:00
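The benefit of a strongly typed mountDevice can be illustrated with a small sketch. Only the type name comes from the commit title; the representation (a short device suffix), the `/dev/xvd` prefix, and the helper functions are assumptions for illustration.

```go
package main

import "fmt"

// mountDevice is a distinct string type. In this sketch it holds only the
// short device suffix (for example "f"), which is an assumed representation.
type mountDevice string

// devicePath expands the suffix into a full path; the prefix is an
// assumption of this sketch.
func devicePath(d mountDevice) string {
	return "/dev/xvd" + string(d)
}

// detach accepts only a mountDevice, so a string variable holding some
// other value (such as a full path) cannot be passed by accident.
func detach(volumeID string, d mountDevice) string {
	return fmt.Sprintf("detach %s from %s", volumeID, devicePath(d))
}

func main() {
	d := mountDevice("f")
	fmt.Println(detach("vol-123", d))

	// The following would not compile, because a string variable is not
	// a mountDevice:
	//   path := "/dev/xvdf"
	//   detach("vol-123", path)
}
```

Note that an untyped string constant still converts implicitly in Go, so the protection applies to string variables and function results rather than to literals.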
Daniel Schonfeld 24c44e7a8e optimize ListRoutes to fetch instances only once per call
Issue #12121 - fixes courtesy of @justinsb - thank you
2016-01-07 14:32:37 -05:00
Justin Santa Barbara e7c3a08947 AWS: Provide newly required initialization arguments
It seems that some formerly optional arguments are now required in the
latest aws-sdk-go, see e.g.
https://github.com/aws/aws-sdk-go/issues/452.
2016-01-06 13:37:02 -05:00
Kenneth Shelton 9e6c45c395 Updated comments
Updated documentation
Fixed e2e test
2016-01-05 20:51:21 +00:00
Kenneth Shelton d399a8f8cc * Added UDP LB support (for GCE) 2016-01-05 20:51:21 +00:00
Trevor Pounds bbc181d1f8 Remove unused EC2 metadata functions. 2016-01-04 16:10:23 -08:00
Trevor Pounds 89d7eb050a Update AWS cloud provider to aws-sdk-go v1.0.2. 2016-01-04 16:10:23 -08:00
Justin Santa Barbara 7444216d4f AWS: Delete routes during create if they are black-holed
If a route already exists but is invalid (e.g. from a crash), we
automatically delete it before trying to create a route that would
otherwise conflict.
2016-01-03 18:19:12 -05:00
Justin Santa Barbara f9a6ac077e Ubernetes Lite: Volumes can dictate zone scheduling
For AWS EBS, a volume can only be attached to a node in the same AZ.
The scheduler must therefore detect if a volume is being attached to a
pod, and ensure that the pod is scheduled on a node in the same AZ as
the volume.

So that the scheduler need not query the cloud provider every time, and
to support decoupled operation (e.g. bare metal) we tag the volume with
our placement labels.  This is done automatically by means of an
admission controller on AWS when a PersistentVolume is created backed by
an EBS volume.

Support for tagging GCE PVs will follow.

Pods that specify a volume directly (i.e. without using a
PersistentVolumeClaim) will not currently be scheduled correctly (i.e.
they will be scheduled without zone-awareness).
2015-12-31 12:27:01 -05:00
Jan Safranek 815d1e0865 Tag OpenStack Cinder volumes created by Kubernetes.
This synchronizes Cinder with AWS EBS code, where we already tag volumes with
claim.Namespace and claim.Name (and pv.Name, as suggested in separate PR).
2015-12-21 11:36:42 +01:00
Jan Safranek 2f06ebf9b7 Implement Creater and Deleter interfaces for Cinder. 2015-12-16 14:23:14 -05:00
Jan Safranek 1b7445a6e2 Use SSD as default volume type.
General purpose SSD ('gp2') volume type is just slightly more expensive than
Magnetic ('standard' / default in AWS), while the performance gain is pretty
significant.

So far, the volumes were created only during testing, where the extra cost
won't make any difference. In future, we plan to introduce QoS classes, where
users could choose SSD/Magnetic depending on their use cases.

'gp2' is just the default volume type for a (hopefully) short period before these
QoS classes are implemented.
2015-12-15 12:14:48 +01:00
Jan Safranek 6ff5286df9 Implement Creater and Deleter interfaces for AWS EBS.
Also mark the created EBS volumes with tags, so the admin knows
who/what created the volumes.
2015-12-15 10:22:49 +01:00
Jan Safranek 700d92c2a8 AWS: Use GiB as units for disk sizes.
For some reason, MiBs were used for public functions and the AWS cloud provider
converted them to GiB. Let's expose what AWS really supports and not hide the
real allocation units.
2015-12-15 10:18:00 +01:00
k8s-merge-robot 56cd501598 Merge pull request #18427 from mesosphere/sttts-cloud-provider-npe
Auto commit by PR queue bot
2015-12-09 08:45:06 -08:00
Dr. Stefan Schimanski 60ce27cb50 cloudprovider/mesos: fix panics when the Mesos master cannot be reached 2015-12-09 12:58:38 +01:00
Robert Bailey 2ecf504a2e Change the gce constant for session affinity to have the capitalization
shown in the documentation.

Fixes #18347
2015-12-08 09:36:49 -08:00
Mike Danese dcdd7f1ca8 remove vagrant cloud provider 2015-12-02 13:20:54 -08:00
Sebastien LAWNICZAK 3eae5895f8 Passing DomainID/DomainName to AuthOptions
To be able to use Domains with IdentityV3, domain-id/domain-name in provider config should be passed to gophercloud.AuthOptions
2015-12-01 23:12:25 +01:00
k8s-merge-robot 74049947d2 Merge pull request #12589 from slaws/os-vip-with-floatingip
Auto commit by PR queue bot
2015-12-01 02:01:39 -08:00
saadali 42b200a0a0 Refactor GCE wrapper library to allow execution from E2E test suite
This reverts commit 147b6911f5, reversing
changes made to 6fd986065b.
2015-11-25 11:48:06 -08:00
k8s-merge-robot 9a4a8075ed Merge pull request #15537 from jsafrane/devel/cinder-hostname
Auto commit by PR queue bot
2015-11-24 06:47:40 -08:00
Jerome Touffe-Blin 4a01539ded Fix #17647 - AWS add tag to SG only if existing tag 2015-11-23 22:08:17 +11:00
Jerzy Szczepkowski 8a922e22be Revert "Refactor GCE wrapper library to allow execution from E2E test suite" 2015-11-23 09:24:32 +01:00
k8s-merge-robot 3fbf0cb810 Merge pull request #17276 from saad-ali/fixErrorCreatingPD
Auto commit by PR queue bot
2015-11-21 23:32:30 -08:00
saadali 882469dd7b Refactor GCE wrapper library to allow execution from E2E test suite 2015-11-20 11:41:10 -08:00
Brendan Burns 4903474bad Remove container api, it's no longer generated and is breaking godeps. 2015-11-19 09:20:28 -05:00