Commit Graph

698 Commits (31de62216dea226f0553e698da38000b73684d49)

Author SHA1 Message Date
Christopher Batey aebd4c95e1 Use testify for AWS provider
This has two main advantages:

* The use of the mock package to verify API calls against the aws SDK
* Nicer error messages for asserts without having to use if statements
2016-03-01 14:32:45 +00:00
k8s-merge-robot 6e6550a105 Merge pull request #21989 from justinsb/fix_21986
Auto commit by PR queue bot
2016-03-01 03:51:43 -08:00
Justin Santa Barbara 818925cc25 Openstack null-support for load balancer source
We return an error if the user specifies a non 0.0.0.0/0 load balancer
source restriction on OpenStack, where we can't enforce the restriction
(currently).
2016-02-29 19:32:15 -05:00
Justin Santa Barbara 49e1149227 AWS: Add support for load balancer source ranges
This refactors #21431 to pull a lot of the code into cloudprovider so it
can be reused by AWS.

It also changes the name of the annotation to be non-GCE specific:
service.beta.kubernetes.io/load-balancer-source-ranges

Fix #21651
2016-02-29 19:32:08 -05:00
k8s-merge-robot fe03c663d9 Merge pull request #22094 from alex-mohr/routes
Auto commit by PR queue bot
2016-02-29 05:46:51 -08:00
k8s-merge-robot 99eaf73f0c Merge pull request #22019 from sky-uk/aws-handle-implicit-routing-tables
Auto commit by PR queue bot
2016-02-27 06:53:41 -08:00
k8s-merge-robot 394d5da23c Merge pull request #21319 from Clarifai/ensure-lb-servicename
Auto commit by PR queue bot
2016-02-27 02:03:14 -08:00
Vishnu kannan 85efe33c16 Use local metadata server, if available, for GCE compute API invocations.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-02-26 16:54:22 -08:00
Alex Mohr 0816fa2072 Add support for more than 500 results to GCE cloud provider API calls
for Instance.List and Routes.List which we will definitely have
more than 500 of when supporting 1000 nodes.

Add TODOs for other GCE List API calls to do similar fixes.

Add more logging to GCE's routecontroller.go when creating or deleting routes.
2016-02-26 16:03:01 -08:00
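The paging fix described in commit 0816fa2072 can be sketched as a loop that keeps requesting pages until no continuation token comes back. This is a minimal sketch and not the GCE client code: `page`, `listAll`, and `fakeFetch` are hypothetical names, and the 500-item page size is taken from the commit message.

```go
package main

import (
	"fmt"
	"strconv"
)

// page is a hypothetical stand-in for one List response.
type page struct {
	Items         []string
	NextPageToken string
}

// listAll requests pages until the service stops returning a token.
func listAll(fetch func(pageToken string) (page, error)) ([]string, error) {
	var all []string
	token := ""
	for {
		p, err := fetch(token)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Items...)
		if p.NextPageToken == "" {
			return all, nil
		}
		token = p.NextPageToken
	}
}

// fakeFetch simulates a service holding `total` items that returns at most
// `pageSize` items per call. The token is simply the next start index.
func fakeFetch(total, pageSize int) func(string) (page, error) {
	return func(token string) (page, error) {
		start := 0
		if token != "" {
			n, err := strconv.Atoi(token)
			if err != nil {
				return page{}, err
			}
			start = n
		}
		end := start + pageSize
		if end > total {
			end = total
		}
		var p page
		for i := start; i < end; i++ {
			p.Items = append(p.Items, "node-"+strconv.Itoa(i))
		}
		if end < total {
			p.NextPageToken = strconv.Itoa(end)
		}
		return p, nil
	}
}

func main() {
	items, err := listAll(fakeFetch(1200, 500))
	if err != nil {
		panic(err)
	}
	fmt.Println("items fetched:", len(items))
}
```

A caller that read only the first response would see 500 of the 1200 simulated nodes; the loop collects all of them in three requests.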
James Ravn f568b6511a Handle aws implicit and shared routing tables
Fix the AWS subnet lookup that checks if a subnet is public, which was
missing a few cases:

- Subnets without explicit routing tables, which use the main VPC
  routing table.
- Routing tables not tagged with KubernetesCluster. The filter for this
  is now removed.
2016-02-25 22:52:26 +00:00
Justin Santa Barbara 1cdfc9ad84 AWS: Find the correct security group by looking at tags
As with everything else on AWS, we differentiate between k8s-owned security
groups and k8s-not-owned security groups using tags.

When we are setting up the ingress rule for ELBs, pick the security
group that is tagged over any others.

We continue to tolerate a single security group being untagged, but
having multiple security groups without tagging is now an error, as it
leads to undefined behaviour.

We also log at startup if the cluster tag is not defined.

Fix #21986
2016-02-25 11:20:58 -05:00
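The selection rule from commit 1cdfc9ad84 can be sketched as follows. This is an illustration under assumptions, not the provider's code: `securityGroup` and `pickSecurityGroup` are hypothetical, and treating more than one tagged group as an error is an assumption, since the commit message does not cover that case.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterTagKey is the tag name mentioned elsewhere in this log.
const clusterTagKey = "KubernetesCluster"

// securityGroup is a hypothetical, simplified view of an EC2 security group.
type securityGroup struct {
	ID   string
	Tags map[string]string
}

// pickSecurityGroup prefers the group tagged for this cluster, tolerates a
// single untagged group, and rejects several untagged groups as ambiguous.
func pickSecurityGroup(groups []securityGroup, cluster string) (securityGroup, error) {
	var tagged, untagged []securityGroup
	for _, g := range groups {
		if v, ok := g.Tags[clusterTagKey]; ok && v == cluster {
			tagged = append(tagged, g)
		} else {
			untagged = append(untagged, g)
		}
	}
	switch {
	case len(tagged) == 1:
		return tagged[0], nil
	case len(tagged) > 1:
		// Assumption: the commit message does not say what happens here.
		return securityGroup{}, fmt.Errorf("%d security groups carry the cluster tag", len(tagged))
	case len(untagged) == 1:
		return untagged[0], nil
	case len(untagged) > 1:
		return securityGroup{}, fmt.Errorf("%d untagged security groups, cannot choose", len(untagged))
	default:
		return securityGroup{}, errors.New("no security groups found")
	}
}

func main() {
	groups := []securityGroup{
		{ID: "sg-1"},
		{ID: "sg-2", Tags: map[string]string{clusterTagKey: "prod"}},
	}
	g, err := pickSecurityGroup(groups, "prod")
	fmt.Println(g.ID, err)
}
```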
k8s-merge-robot 2a58c0062d Merge pull request #17913 from jtblin/jtblin/17912-pick-public-subnets
Auto commit by PR queue bot
2016-02-24 23:48:15 -08:00
Justin Santa Barbara e50ae40301 AWS: Wrap AWS error when failing to create security group ingress
All AWS errors should be wrapped in a user-friendly error before
returning.
2016-02-24 14:13:44 -05:00
Rudi Chiarito af4507b1ae Add service name to descriptions for GCE LB resources
Follow up from #20731. I have no way of testing this.

There's an additional group of functions (Get|Delete|Reserve)GlobalStaticIP that can create an IP without the
service description, but those are not called anywhere in the Kubernetes codebase and are probably for the
Ingress project. I'm leaving those alone for now.
2016-02-24 10:21:31 -05:00
k8s-merge-robot 9c1d8bf99d Merge pull request #21399 from sky-uk/disable-ingress-sg
Auto commit by PR queue bot
2016-02-24 00:05:47 -08:00
James Ravn and Yoseph Samuel 9f62e81be5 Disable aws node security group ingress creation
Add aws cloud config:

    [global]
    disableSecurityGroupIngress = true

The aws provider creates an inbound rule per load balancer on the node
security group. However, this can quickly run into the AWS security
group rule limit of 50.

This disables the automatic ingress creation. It requires that the user
has set up a rule that allows inbound traffic on kubelet ports from the
local VPC subnet (so load balancers can access it). E.g.  `10.82.0.0/16
30000-32000`.

Limits: http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Appendix_Limits.html#vpc-limits-security-groups

Authors: @jsravn, @balooo
2016-02-23 15:24:50 +00:00
Chris Batey, James Ravn and Yoseph Samuel 087ff78cf9 Only find running aws hosts by nodename
When finding instance by node name in AWS, only retrieve running
instances.  Otherwise terminated, old nodes can show up with the same
tag when rebuilding nodes in the cluster.

Another improvement made is to filter instances by the node names
provided, rather than selecting all instances and filtering in code.

Authors: @jsravn, @chbatey, @balooo
2016-02-23 14:47:16 +00:00
k8s-merge-robot 9470a7e61c Merge pull request #21627 from justinsb/fix_11324
Auto commit by PR queue bot
2016-02-23 03:45:10 -08:00
Justin Santa Barbara 7e69426b8b AWS: Only warn about nameless instances if they are running
Otherwise this is just confusing logspam.

Fix #20912
2016-02-22 10:27:13 -05:00
k8s-merge-robot 84891cabad Merge pull request #19335 from justinsb/aws_delay_when_requestlimitexceeded
Auto commit by PR queue bot
2016-02-21 18:38:18 -08:00
Justin Santa Barbara f8af47b645 AWS: Pass globals into Backoff struct
Thanks for the suggestion bprashanth!
2016-02-21 20:17:22 -05:00
k8s-merge-robot 2e3053a204 Merge pull request #21431 from freehan/sourcerange
Auto commit by PR queue bot
2016-02-21 16:14:42 -08:00
Justin Santa Barbara b269e8f43c AWS: Delay all AWS calls when we observe RequestLimitExceeded errors
This applies a cross-request time delay when we observe
RequestLimitExceeded errors, unlike the default library behaviour which
only applies a *per-request* backoff.

Issue #12121
2016-02-21 15:09:42 -05:00
Justin Santa Barbara 22d719018a AWS: Recover if tags missing on security group
In the AWS API (generally) we tag things we create, and then we filter
to find them.  However, creation & tagging are typically two separate
calls.  So there is a chance that we will create an object, but fail to
tag it.

We fix this (done here in the case of security groups, but we can do
this more generally) by retrieving the resource without a tag filter.
If the retrieved resource has the correct tags, great.  If it has the
tags for another cluster, that's a problem, and we raise an error.  If
it has no tags at all, we add the tags.

This only works where the resource is uniquely named (or we can
otherwise retrieve it uniquely).  For security groups, the SG name comes
from the service UUID, so that's unique.

Fixes #11324
2016-02-20 14:58:19 -05:00
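The three outcomes listed in commit 22d719018a (correct tags, another cluster's tags, no tags) can be sketched as a small decision function. This is a hedged illustration: `reconcileTags` and the `tagAction` values are hypothetical, and a resource lacking the cluster tag is treated here as "untagged", which simplifies the commit's "no tags at all".

```go
package main

import "fmt"

// clusterTagKey is the tag name mentioned elsewhere in this log.
const clusterTagKey = "KubernetesCluster"

// tagAction says what the caller should do with a resource that was
// retrieved by its unique name, without a tag filter.
type tagAction int

const (
	useAsIs tagAction = iota // tags already match this cluster
	addTags                  // creation succeeded earlier but tagging did not
)

// reconcileTags inspects the tags found on the resource.
func reconcileTags(actual map[string]string, cluster string) (tagAction, error) {
	owner, found := actual[clusterTagKey]
	switch {
	case !found:
		// Simplification: a missing cluster tag is treated as untagged.
		return addTags, nil
	case owner != cluster:
		return useAsIs, fmt.Errorf("resource is tagged for cluster %q, not %q", owner, cluster)
	default:
		return useAsIs, nil
	}
}

func main() {
	action, err := reconcileTags(map[string]string{}, "prod")
	fmt.Println(action == addTags, err)

	_, err = reconcileTags(map[string]string{clusterTagKey: "staging"}, "prod")
	fmt.Println(err)
}
```

As the commit notes, this only works when the resource can be looked up uniquely, for example by a name derived from the service UUID.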
k8s-merge-robot f08a8f23c1 Merge pull request #20959 from justinsb/fix_20911
Auto commit by PR queue bot
2016-02-20 00:56:53 -08:00
k8s-merge-robot 6be4417aff Merge pull request #17649 from jtblin/jtblin/17647-aws-elb-creation-no-tags
Auto commit by PR queue bot
2016-02-18 19:41:15 -08:00
Minhan Xia 7ffb123abe add source range support for loadbalancer on gce 2016-02-18 17:05:02 -08:00
Jerome Touffe-Blin 62a8b3d44c Fix #17912 - pick public subnets only on ELB creation 2016-02-16 23:16:38 +11:00
Rudi Chiarito da4ba61232 Fix AWS IPPermission check for case with preexisting groups and ranges 2016-02-15 11:41:18 -05:00
k8s-merge-robot d6b4ff3884 Merge pull request #20909 from Clarifai/instance-type-label
Auto commit by PR queue bot
2016-02-13 18:51:42 -08:00
Rudi Chiarito b3863eae82 Add instance-type label to cloud providers
Fully implemented for AWS and GCE
2016-02-12 15:02:03 -05:00
Jan Safranek 1d0b1c227b Add PV.Name into names of generated GCE/AWS/OSP volumes.
Volume names have now format <cluster-name>-dynamic-<pv-name>.

pv-name is guaranteed to be unique in Kubernetes cluster, adding
<cluster-name> ensures we don't conflict with any running cluster
in the cloud project (kube-controller-manager --cluster-name=XXX).

'kubernetes' is the default cluster name.
2016-02-12 09:46:59 +01:00
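The naming scheme from commit 1d0b1c227b, `<cluster-name>-dynamic-<pv-name>`, can be illustrated with a one-line helper. `dynamicVolumeName` is a hypothetical name, and any length limits or sanitising a cloud provider may apply to volume names are not modelled here.

```go
package main

import "fmt"

// defaultClusterName is the default stated in the commit message.
const defaultClusterName = "kubernetes"

// dynamicVolumeName builds <cluster-name>-dynamic-<pv-name>.
func dynamicVolumeName(clusterName, pvName string) string {
	if clusterName == "" {
		clusterName = defaultClusterName
	}
	return fmt.Sprintf("%s-dynamic-%s", clusterName, pvName)
}

func main() {
	fmt.Println(dynamicVolumeName("", "pv-gce-oxwts"))
	fmt.Println(dynamicVolumeName("staging", "pv-gce-oxwts"))
}
```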
Justin Santa Barbara a0093eb503 e2e: Don't try to create a UDP LoadBalancer on AWS
AWS doesn't support type=LoadBalancer with UDP services.  For now, we
simply skip over the test with type=LoadBalancer on AWS for the UDP
service.

Fix #20911
2016-02-10 12:31:18 -05:00
k8s-merge-robot 9520bb5ddf Merge pull request #20731 from Clarifai/ensure-lb-servicename
Auto commit by PR queue bot
2016-02-10 01:25:07 -08:00
k8s-merge-robot 2ec49efd54 Merge pull request #19945 from Clarifai/fix-formatting
Auto commit by PR queue bot
2016-02-09 16:05:00 -08:00
David Pratt 57782459e6 Log missing public IP from AWS metadata. 2016-02-09 12:06:02 -06:00
David Pratt adde7f548f Fix AWS kubelet registration.
This commit allows the AWS cloud provider plugin to work on EC2 instances
that do not have a public IP. The EC2 metadata service returns a 404 for the
'public-ipv4' endpoint for private instances, and the plugin was bubbling this
up as a fatal error.
2016-02-09 11:46:53 -06:00
Rudi Chiarito 5874b0cb9d Pass namespaced service name to cloudprovider's EnsureLoadBalancer
Also has an AWS implementation that plugs the service name into the ELB and SG.
Log the service name under GCE and OpenStack.
Fixes #20668
2016-02-09 06:50:53 -05:00
k8s-merge-robot 318705feb9 Merge pull request #20378 from jtblin/jtblin/11543-aws-cloud-provider-private-hosted-dns-zone-via-api
Auto commit by PR queue bot
2016-02-08 23:43:59 -08:00
k8s-merge-robot fec0d127b3 Merge pull request #15938 from justinsb/aws_ebs_cleanup
Auto commit by PR queue bot
2016-02-08 21:42:52 -08:00
Tim Hockin fecb71420c Demote static IPs ASAP for easier cleanup
This exposed bugs in the IP promotion/demotion logic.
2016-02-06 21:15:06 -08:00
k8s-merge-robot 257c3ad776 Merge pull request #20153 from sky-uk/fix-sg-comparison
Auto commit by PR queue bot
2016-02-06 12:25:26 -08:00
k8s-merge-robot 1b52e0ec3a Merge pull request #20210 from jsafrane/devel/gce-tags
Auto commit by PR queue bot
2016-02-05 21:36:25 -08:00
Tim Hockin a6cb6d76e3 Use defer for IP release 2016-02-04 16:40:28 -08:00
Jerome Touffe-Blin 0a36db19bf Fix #11543 - use DescribeInstances API to retrieve the correct private DNS name 2016-02-04 08:48:25 +11:00
Justin Santa Barbara f61a5d0400 AWS: Switch arguments to AttachDisk/DetachDisk to match GCE 2016-02-03 20:43:23 +00:00
Justin Santa Barbara 6c87a4be7c AWS: Handle deleting volume that no longer exists
The tests in particular double-delete volumes, so we need to handle this
gracefully.
2016-02-03 20:43:14 +00:00
Justin Santa Barbara 1ae1db6027 AWS: Update copy-paste of GCE PD code to latest version
We are (sadly) using a copy-and-paste of the GCE PD code for AWS EBS.
This code hasn't been updated in a while, and it seems that the GCE code
has some code to make volume mounting more robust that we should copy.
2016-02-03 20:43:14 +00:00
Rudi Chiarito a0831a2378 Mass fix of Infof and co. missing the trailing "f", even when formatting placeholders are used 2016-02-03 11:34:59 -05:00
k8s-merge-robot df4d50a7ac Merge pull request #20098 from brendandburns/flake2
Auto commit by PR queue bot
2016-02-02 04:22:45 -08:00
k8s-merge-robot bf9523a03a Merge pull request #20195 from bprashanth/fix_zone_url
Auto commit by PR queue bot
2016-02-02 03:50:05 -08:00
k8s-merge-robot 5a6cf15c09 Merge pull request #19874 from justinsb/aws_fix_findinstancesbynodenames
Auto commit by PR queue bot
2016-02-01 21:38:46 -08:00
Brendan Burns 2aa5dc317b Try harder to delete cloud resources in service tests 2016-02-01 15:34:55 -08:00
k8s-merge-robot 3db1a6c3ce Merge pull request #19976 from aarondav/elb-timeout
Auto commit by PR queue bot
2016-01-30 18:57:56 -08:00
k8s-merge-robot d63398a543 Merge pull request #18829 from lebauce/openstack-use-os-ext-ips
Auto commit by PR queue bot
2016-01-29 17:49:43 -08:00
Fabio Yeon eb2c2d1af4 Merge pull request #20111 from fabioy/fix-tmp-tests
Add temp directory creation method for tests.
2016-01-29 09:51:12 -08:00
Fabio Yeon 7205a160ac Remove all instances of "/tmp" from unit tests and replace with a common
tmp directory creator. Exception is documented.
2016-01-27 16:11:22 -08:00
Prashanth Balasubramanian bcd39900b3 Modify Create/SetGlobalForwardingRule to just take a link. 2016-01-27 16:00:45 -08:00
Prashanth Balasubramanian cba768322a Add cloudprovider methods for ssl. 2016-01-27 16:00:45 -08:00
Jan Safranek 23cd0913f7 Tag dynamically created GCE PD disks.
GCE disks don't have tags, we must encode the tags into Description field.
It's encoded as JSON, which is both human and machine readable:
description: '{"kubernetes.io/created-for/pv/name":"pv-gce-oxwts","kubernetes.io/created-for/pvc/name":"myclaim","kubernetes.io/created-for/pvc/namespace":"default"}'
2016-01-27 15:16:05 +01:00
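The encoding described in commit 23cd0913f7 can be reproduced with the standard `encoding/json` package. This is a minimal sketch with hypothetical helper names (`encodeDescription`, `decodeDescription`); the tag keys and values are the ones shown in the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeDescription serialises tags into a string suitable for a
// free-text Description field. json.Marshal sorts map keys, so the
// output is stable for a given set of tags.
func encodeDescription(tags map[string]string) (string, error) {
	b, err := json.Marshal(tags)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeDescription reads the tags back from the Description field.
func decodeDescription(desc string) (map[string]string, error) {
	tags := map[string]string{}
	if err := json.Unmarshal([]byte(desc), &tags); err != nil {
		return nil, err
	}
	return tags, nil
}

func main() {
	desc, err := encodeDescription(map[string]string{
		"kubernetes.io/created-for/pv/name":       "pv-gce-oxwts",
		"kubernetes.io/created-for/pvc/name":      "myclaim",
		"kubernetes.io/created-for/pvc/namespace": "default",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(desc)

	tags, err := decodeDescription(desc)
	if err != nil {
		panic(err)
	}
	fmt.Println(tags["kubernetes.io/created-for/pvc/name"])
}
```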
Prashanth Balasubramanian 5dec84c0dc Fix zone in cloudprovider method. 2016-01-26 19:12:31 -08:00
James Ravn c3383b3422 Fix formatting of aws_test.go 2016-01-26 15:13:25 +00:00
James Ravn b4bbc3b3ef Fix removal of aws security group ingress
The ip permission method now checks for containment, not equality, so
the order of parameters matters. This change fixes
`removeSecurityGroupIngress` to pass in the removal permission first to
compare against the existing permission.
2016-01-26 13:57:52 +00:00
Dogan Narinc and James Ravn 190c829ac5 Fix AWS ip permission comparison on security group
Change isEqualIPPermission to consider the entire list of security group
ids when checking if a security group id has already been added.

This is used for example when adding and removing ingress rules to the
cluster nodes from an elastic load balancer. Without this, once there
are multiple load balancers, the method as it stands incorrectly returns
false even if the security group id is in the list of group ids. This
causes a few problems: dangling security groups which fill up an
account's limit since they don't get removed, and inability to recreate
load balancers in certain situations (receiving an
InvalidPermission.Duplicate from AWS when adding the same security
group).
2016-01-26 11:30:27 +00:00
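The containment check described in commits 190c829ac5 and b4bbc3b3ef can be sketched as below. `ipPermission` and `permissionContained` are hypothetical and much simpler than the AWS SDK types; the port range and group IDs are made-up values. The point of the sketch is that the check is asymmetric, so argument order matters.

```go
package main

import "fmt"

// ipPermission is a hypothetical, simplified ingress rule.
type ipPermission struct {
	Protocol string
	FromPort int
	ToPort   int
	GroupIDs []string
}

// permissionContained reports whether every group ID in candidate is already
// present in existing, for the same protocol and port range.
func permissionContained(candidate, existing ipPermission) bool {
	if candidate.Protocol != existing.Protocol ||
		candidate.FromPort != existing.FromPort ||
		candidate.ToPort != existing.ToPort {
		return false
	}
	have := make(map[string]bool, len(existing.GroupIDs))
	for _, id := range existing.GroupIDs {
		have[id] = true
	}
	for _, id := range candidate.GroupIDs {
		if !have[id] {
			return false
		}
	}
	return true
}

func main() {
	existing := ipPermission{Protocol: "tcp", FromPort: 30000, ToPort: 32000,
		GroupIDs: []string{"sg-elb-a", "sg-elb-b"}}
	removal := ipPermission{Protocol: "tcp", FromPort: 30000, ToPort: 32000,
		GroupIDs: []string{"sg-elb-b"}}

	// The permission being added or removed goes first.
	fmt.Println(permissionContained(removal, existing))
	// Swapping the arguments asks a different question.
	fmt.Println(permissionContained(existing, removal))
}
```

With two load balancers on the node security group, a plain equality comparison against the whole rule would report that `sg-elb-b` is absent, which is the failure mode the commit message describes.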
Rudi Chiarito 76e29ed455 Register ECR credential plugin only when an AWS cloud instance is created 2016-01-25 22:18:45 -05:00
Quinton Hoole 10f7985dfb Merge pull request #19995 from justinsb/gce_label_pd
Ubernetes-Lite GCE: Label volumes with zone information
2016-01-25 10:34:10 -08:00
Aaron Davidson 97689c326d Reduce healthy threshold and check interval for Amazon ELBs
According to AWS, the ELB healthy threshold is "Number of consecutive health check successes before declaring an EC2 instance healthy." It has an unusual interaction with Kubernetes, since all nodes will enter either an unhealthy state or a healthy state together depending on the service's healthiness as a whole.

We have observed that if our service goes down for the unhealthy threshold (which is 2 checks at 30 second intervals = 60 seconds), then the ELB will stop serving traffic to all nodes in the cluster, and will wait for the healthy threshold (currently 10 * 30 = 300 seconds) AFTER the service is restored to add back the cluster nodes, meaning it remains unreachable for an extra 300 seconds.

With the new settings, the ELB will continue to time out dead nodes after 60 seconds, but will restore healthy nodes after 20 seconds. The minimum value for healthyThreshold is 2, and the minimum value for interval is 5 seconds. I went for 10 seconds instead of the minimum sort of arbitrarily because I was not sure how much this value may affect the scalability of clusters in EC2, as it does put some extra load on the kube-proxy.
2016-01-23 11:10:37 -08:00
Justin Santa Barbara 1276675512 Ubernetes-Lite: Error if a PD name is ambiguous
We don't cope well if a PD is in multiple zones, but this is actually
fairly easy to detect.  This is probably justified purely on the basis
that we never want to delete the wrong volume (DeleteDisk), but also
because this means that we now warn on creation if a disk is in multiple
zones (with the labeling admission controller).

This also means that, with the scheduling predicate in place, many
of our volume problems "go away" in practice: you still can't create or
delete a volume when it is ambiguous, but thereafter the volume will be
labeled with the zone, that will match it only to nodes with the same
zone, and then we query for the volume in that zone when we
attach/detach it.
2016-01-22 17:16:38 -05:00
Justin Santa Barbara 900567288b Ubernetes Lite: Label volumes with zone information
When volumes are labeled, they will only be scheduled onto nodes in the
same zone.
2016-01-22 17:16:31 -05:00
Zach Loafman 7189db3701 Merge pull request #19396 from justinsb/aws_mountdevice
AWS: Use a strongly typed mountDevice
2016-01-22 11:04:23 -08:00
Justin Santa Barbara 2201c631b3 Fix for panic when instance not found
This removes a panic I mistakenly introduced when an instance is not
found, and also restores the exact prior behaviour for
getInstanceByName, where it returns cloudprovider.InstanceNotFound when
the instance is not found.
2016-01-21 19:15:09 -05:00
Alex Mohr f64a40f315 Merge pull request #19863 from justinsb/aws_fix_loadbalancer_tcp_check
AWS: Eliminate assumptions about all load-balancer ports matching
2016-01-21 10:52:20 -08:00
Alex Mohr b8f8a62775 Merge pull request #19862 from justinsb/aws_fix_tcploadbalancer_comment
AWS: Fix comment to reflect new method name
2016-01-21 10:51:37 -08:00
Alex Mohr 8e9bebb424 Merge pull request #19861 from justinsb/aws_remove_dead_code
AWS: Remove dead code
2016-01-21 10:51:06 -08:00
Alex Mohr 819cdc85d3 Merge pull request #19390 from danielschonfeld/optimize-kubelet-heartbeat-aws
optimize kubelet heartbeats instance data to be cached for AWS
2016-01-21 10:33:00 -08:00
Alex Mohr d2d349bc84 Merge pull request #19334 from resouer/network
Networking should be used to hold network related pkgs
2016-01-21 10:26:13 -08:00
Alex Mohr eaa61a72b0 Merge pull request #17919 from justinsb/multizone_gce
Ubernetes Lite support for GCE
2016-01-21 10:22:34 -08:00
k8s-merge-robot 0f6f521beb Merge pull request #18959 from jsafrane/devel/cinder-tags
Auto commit by PR queue bot
2016-01-21 03:33:58 -08:00
Justin Santa Barbara 43cbfb74fe Ubernetes Lite GCE: Support multiple zones in GCE cloud provider
We adapt the existing code to work across all zones in a region.

We require a feature-flag to enable Ubernetes-Lite.

Reasons:

* There are some behavioural changes if users create volumes with
the same name in two zones.
* We don't want to make one API call per zone if we're not running
Ubernetes-Lite.
* Ubernetes-Lite is still experimental.

There isn't a parallel flag implemented for AWS, because at the moment
there would be no behaviour changes from this.
2016-01-20 23:04:53 -05:00
Justin Santa Barbara 9f44c72ba9 AWS: findInstancesByNodeNames should not make O(N) API calls
findInstancesByNodeNames was a simple loop around
findInstanceByNodeName, which made an EC2 API call for each call.

We've had trouble with this sort of behaviour hitting EC2 rate limits on
bigger clusters (e.g. #11979).

Instead, change this method to fetch _all_ the tagged EC2 instances, and
then loop through the local results.  This is one API call (modulo
paging).

We are currently only using findInstancesByNodeNames for the load
balancer, where we attach every node, so we were fetching all but one of
the instances anyway.

Issue #11979
2016-01-20 12:37:22 -05:00
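The change described in commit 9f44c72ba9 replaces one API call per node name with a single listing followed by local filtering. The sketch below illustrates that idea with a fake client that counts calls; `fakeEC2`, `describeTagged`, and `lookupBatch` are hypothetical names, and paging is ignored.

```go
package main

import "fmt"

// instance is a hypothetical, simplified EC2 instance record.
type instance struct {
	ID       string
	NodeName string
}

// fakeEC2 simulates the remote API and counts how often it is called.
type fakeEC2 struct {
	instances []instance
	calls     int
}

// describeTagged stands in for "fetch all tagged instances".
func (f *fakeEC2) describeTagged() []instance {
	f.calls++
	return f.instances
}

// lookupBatch makes one API call and then filters the results locally.
func lookupBatch(api *fakeEC2, nodeNames []string) []instance {
	wanted := make(map[string]bool, len(nodeNames))
	for _, n := range nodeNames {
		wanted[n] = true
	}
	var out []instance
	for _, inst := range api.describeTagged() {
		if wanted[inst.NodeName] {
			out = append(out, inst)
		}
	}
	return out
}

func main() {
	api := &fakeEC2{instances: []instance{
		{ID: "i-1", NodeName: "node-a"},
		{ID: "i-2", NodeName: "node-b"},
		{ID: "i-3", NodeName: "node-c"},
	}}
	found := lookupBatch(api, []string{"node-a", "node-c"})
	fmt.Println("matched:", len(found), "API calls:", api.calls)
}
```

The call count stays at one regardless of how many node names are requested, which is what keeps larger clusters away from the rate limits the commit mentions.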
Justin Santa Barbara 0586d866de AWS: Eliminate assumptions about all load-balancer ports matching
It costs us basically nothing to just check all the ports, and
protects us against future changes to the controller.
2016-01-20 09:30:46 -05:00
Justin Santa Barbara 12dd568662 AWS: Fix comment to reflect new method name
TCPLoadBalancer.EnsureTCPLoadBalancer => LoadBalancer.EnsureLoadBalancer
2016-01-20 09:28:28 -05:00
Justin Santa Barbara 30882265b6 AWS: Remove dead code
I think I added these functions by mistake; they aren't used and
apparently never were.
2016-01-20 09:23:53 -05:00
k8s-merge-robot 4969f11089 Merge pull request #19439 from bprashanth/compute_dep
Auto commit by PR queue bot
2016-01-16 10:38:11 -08:00
Harry Zhang 936a11e775 Use networking to hold network related pkgs
Change names of unclear methods

Use net as pkg name for short
2016-01-15 13:46:16 +08:00
Mike Danese 1f0b10bd22 Merge pull request #19538 from mesosphere/jdef_mesos_026_compat
MESOS: compatibility w/ mesos v0.26
2016-01-14 13:44:42 -08:00
Daniel Schonfeld 12cbc9ff89 optimize ExternalId() and InstanceId() to return cached results directly from the metadata service 2016-01-13 23:36:57 -05:00
James DeFelice ad1803a4ce construct master URIs from MasterInfo.Address if present; prefer /state over /state.json 2016-01-12 17:49:19 +00:00
Sylvain Baubeau b9dfe1b737 Use Openstack os-ext-ips extension to qualify IP address types
fixes #18409
2016-01-12 15:03:47 +01:00
David Oppenheimer 8ac484793d Comment out calls to httptest.Server.Close() to work around
https://github.com/golang/go/issues/12262 . See #19254 for
more details. This change should be reverted when we upgrade
to Go 1.6.
2016-01-11 23:02:11 -08:00
Prashanth Balasubramanian cc09a603dd Code changes 2016-01-11 16:27:12 -08:00
k8s-merge-robot 37b5726716 Merge pull request #14431 from Defensative/UDP-LB
Auto commit by PR queue bot
2016-01-08 12:39:02 -08:00
k8s-merge-robot e0e305c6be Merge pull request #19337 from danielschonfeld/optimize-list-routes
Auto commit by PR queue bot
2016-01-08 10:19:47 -08:00
Justin Santa Barbara 03900b1dc9 AWS: Use a strongly typed mountDevice
We've had problems in the past from using a string with passing the
wrong value when detaching; stronger typing would have caught this for
us.
2016-01-08 00:25:11 -05:00
Daniel Schonfeld 24c44e7a8e optimize ListRoutes to fetch instances only once per call
Issue #12121 - fixes courtesy of @justinsb - thank you
2016-01-07 14:32:37 -05:00
Justin Santa Barbara e7c3a08947 AWS: Provide newly required initialization arguments
It seems that some formerly optional arguments are now required in the
latest aws-sdk-go, see e.g.
https://github.com/aws/aws-sdk-go/issues/452.
2016-01-06 13:37:02 -05:00
Kenneth Shelton 9e6c45c395 Updated comments
Updated documentation
Fixed e2e test
2016-01-05 20:51:21 +00:00
Kenneth Shelton d399a8f8cc * Added UDP LB support (for GCE) 2016-01-05 20:51:21 +00:00
Trevor Pounds bbc181d1f8 Remove unused EC2 metadata functions. 2016-01-04 16:10:23 -08:00
Trevor Pounds 89d7eb050a Update AWS cloud provider to aws-sdk-go v1.0.2. 2016-01-04 16:10:23 -08:00