Automatic merge from submit-queue (batch tested with PRs 47851, 47824, 47858, 46099)
Revert "[Federation] Fix federated service reconcilation issue due to addition of External…"
Reverts kubernetes/kubernetes#45798
Reverting the temporary fix, as the problem is fixed in #45869.
With that fix, federation can also default ExternalTrafficLocalOnly if it is not set.
Issue: #45812
cc @MrHohn @madhusudancs @kubernetes/sig-federation-bugs
Automatic merge from submit-queue (batch tested with PRs 47451, 47410, 47598, 47616, 47473)
Revert "Ignore `daemonset-controller-hash` label key in federation before comparing the federated object with its cluster equivalent."
This reverts commit 3530c9ce87.
~This needs to wait for #47258, otherwise federation test won't pass~ (merged)
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add NodeInternalIP as a fallback to federation api-server nodeport service
Previously NodeLegacyHostIP was used as a fallback (see #41243) but in 1.7 it was removed (#44830)
Now clusters whose nodes do not have ExternalIP set cannot be used by kubefed to set up federation.
cc @shashidharatd
```release-note
kubefed will now configure NodeInternalIP as the federation API server endpoint when NodeExternalIP is unavailable for federation API servers exposed as NodePort services
```
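The fallback described above can be sketched roughly as follows. This is only an illustration, not kubefed's actual code: the function name `pick_node_address` and the dict-based address shape are assumptions made for the example, while `ExternalIP` and `InternalIP` are the node address types the description refers to.

```python
from typing import Dict, List, Optional

# Preference order assumed for this sketch: external address first,
# internal address only as a fallback.
ADDRESS_PREFERENCE = ("ExternalIP", "InternalIP")


def pick_node_address(addresses: List[Dict[str, str]]) -> Optional[str]:
    """Return the address with the most preferred type, or None if none match."""
    for wanted_type in ADDRESS_PREFERENCE:
        for addr in addresses:
            if addr.get("type") == wanted_type and addr.get("address"):
                return addr["address"]
    return None
```

With this ordering, a node that reports only an `InternalIP` still yields a usable endpoint instead of no endpoint at all.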
Kubernetes daemonset controller writes a daemonset's hash to the object
label as an optimization to avoid recomputing it every time. Adding a
new label to the object that the federation is unaware of causes
problems because federated controllers compare the objects in
federation and their equivalents in clusters and try to reconcile them.
This leads to a constant fight between the federated daemonset
controller and the cluster controllers, and they never reach a stable
state.
Ideally, cluster components should not update an object's spec or
metadata in a way federation cannot replicate. They can update an
object's status though. Therefore, this daemonset hash should be a
field in daemonset's status, not a label in object meta. @janetkuo says
that this label is only a short term solution. In the near future, they
are going to replace it with revision numbers in daemonset status. We
can then rip this bandaid out.
Automatic merge from submit-queue (batch tested with PRs 45871, 46498, 46729, 46144, 46804)
Fix some comments in dnsprovider
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 36721, 46483, 45500, 46724, 46036)
Federation: Minor corrections in service controller and add a unit testcase
**What this PR does / why we need it**:
This PR fixes a few outdated comments in the federation service controller and makes a few other minor fixes.
This also adds a unit test case to test federated service deletion.
/assign @quinton-hoole
/cc @marun @kubernetes/sig-federation-pr-reviews
```release-note
NONE
```
Automatic merge from submit-queue
Add initializer support to admission and uninitialized filtering to rest storage
Initializers are the opposite of finalizers - they allow API clients to react to object creation and populate fields prior to other clients seeing them.
High level description:
1. Add `metadata.initializers` field to all objects
2. By default, filter objects with > 0 initializers (known as partially-initialized objects) from LIST and WATCH to preserve legacy client behavior
3. Add an admission controller that populates .initializer values per type, and denies mutation of initializers except by certain privilege levels (you must have the `initialize` verb on a resource)
4. Allow partially-initialized objects to be viewed via LIST and WATCH for initializer types
5. When creating objects, the object is "held" by the server until the initializers list is empty
6. Allow some creators to bypass initialization (set initializers to `[]`), or to have the result returned immediately when the object is created.
The code here should be backwards compatible for all clients because they do not see partially initialized objects unless they GET the resource directly. The watch cache makes checking for partially initialized objects cheap. Some reflectors may need to change to ask for partially-initialized objects.
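The filtering behavior in the list above can be illustrated with a small sketch. It is not the API server's implementation: objects are plain dicts, `metadata.initializers` is modeled as a simple list of pending initializer names (an assumption for this example), and the function names are hypothetical.

```python
def is_initialized(obj):
    """An object is initialized once its pending initializer list is empty."""
    return not obj.get("metadata", {}).get("initializers")


def list_objects(objects, include_uninitialized=False):
    """Hide partially-initialized objects unless the caller opts in."""
    if include_uninitialized:
        return list(objects)
    return [obj for obj in objects if is_initialized(obj)]


def complete_initializer(obj, name):
    """Remove one initializer from the pending list (what an initializer does when done)."""
    pending = obj["metadata"].get("initializers") or []
    obj["metadata"]["initializers"] = [n for n in pending if n != name]
```

A legacy client calling `list_objects(objects)` only ever sees fully initialized objects, while an initializer passes `include_uninitialized=True` to find the objects it still has to process.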
```release-note
Kubernetes resources, when the `Initializers` admission controller is enabled, can be initialized (defaulting or other additive functions) by other agents in the system prior to those resources being visible to other clients. An uninitialized resource is not visible to clients unless they request (for get, list, or watch) to see uninitialized resources with the `?includeUninitialized=true` query parameter. Once the initializers have completed, the resource is then visible. Clients must have the ability to perform the `initialize` action on a resource in order to modify it prior to initialization being completed.
```
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
move labels to components which own the APIs
During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery. They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet. This pull moves the labels into their owning components and out of API machinery.
@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers
@derekwaynecarr since most of these are related to the kubelet
Automatic merge from submit-queue (batch tested with PRs 46801, 45184, 45930, 46192, 45563)
[Federation] Add a SchedulingAdapter that can extend the FederatedTypeAdapter and that provides hooks for scheduling objects into clusters.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46686, 45049, 46323, 45708, 46487)
[Federation][kubefed]: Use StorageClassName for etcd pvc
This PR updates kubefed to use the StorageClassName field [added in 1.6](http://blog.kubernetes.io/2017/03/dynamic-provisioning-and-storage-classes-kubernetes.html) for etcd's pvc to allow the user to specify which storage class they want to use. If no value is provided to ``kubefed init``, the field will not be set, and initialization of the pvc may fail on a cluster without a default storage class configured.
The alpha annotation that was previously used (``volume.alpha.kubernetes.io/storage-class``) was deprecated as of 1.4 according to the following blog post:
http://blog.kubernetes.io/2016/10/dynamic-provisioning-and-storage-in-kubernetes.html
**Release note**:
```
'kubefed init' has been updated to support specification of the storage class (via --etcd-pv-storage-class) for the Persistent Volume Claim (PVC) used for etcd storage. If --etcd-pv-storage-class is not specified, the default storage class configured for the cluster will be used.
```
cc: @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 46394, 46650, 46436, 46673, 46212)
Write "kubectl options" help message to stdout, not stderr
Fix a very minor issue causing `kubectl` to write its help messages to `stderr` instead of `stdout`.
Try this:
`kubectl options | grep log`
It should print only the options related to logging, but right now it prints the entire help menu (since it's printing to stderr).
This patch brings us closer to unix convention and reduces user friction.
~~Another use case (if a user can't remember whether it's `-r` or `-R` for recursion):~~
~~`kubectl patch -h | grep recursive`~~
Update: this patch only affects `kubectl options`. The other commands are working as intended.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45534, 37212, 46613, 46350)
check err
Signed-off-by: yupengzte <yu.peng36@zte.com.cn>
**Release note**:
```release-note
```
Previous behavior was to write to stderr (thanks to the fallback system
in the Cobra library), which made it difficult to grep for flags.
For example:
kubectl options | grep recursive
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
[Federation] Refactor the cluster selection logic in the sync controller
This is intended to make it easier to define the interaction between cluster selection and scheduling preferences in the sync controller when used for workload types.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
check err
Signed-off-by: yupengzte <yu.peng36@zte.com.cn>
**What this PR does / why we need it**:
When `err` is not nil, `podStatus` is nil, so accessing `podStatus[cluster.Name].RunningAndReady` is dangerous.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
Add ClusterSelector to Ingress Controller
This pull request adds ClusterSelector to the Federated Ingress Controller (ref: design #29887).
This backports the same functionality from the sync controller (merged pull #40234) in order to make this feature available across all controllers for the 1.7 release.
cc: @kubernetes/sig-federation-pr-reviews @shashidharatd
**Release note**:
```
The annotation `federation.alpha.kubernetes.io/cluster-selector` can be used with Ingress objects to target federated clusters by label.
```
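As a rough illustration of targeting clusters by label, the sketch below evaluates a selector stored in the annotation against each cluster's labels. The annotation key comes from the release note; everything else is an assumption for the example: the value is taken to be a JSON list of `key`/`operator`/`values` requirements, only `in` and `notin` are handled, an object without the annotation is treated as targeting every cluster, and the function names are hypothetical.

```python
import json

ANNOTATION = "federation.alpha.kubernetes.io/cluster-selector"


def matches(cluster_labels, requirements):
    """Return True when the cluster's labels satisfy every requirement."""
    for req in requirements:
        value = cluster_labels.get(req["key"])
        operator = req["operator"].lower()
        if operator == "in":
            if value not in req["values"]:
                return False
        elif operator == "notin":
            if value in req["values"]:
                return False
        else:
            raise ValueError("unsupported operator in this sketch: " + req["operator"])
    return True


def select_clusters(annotations, clusters):
    """Return the sorted names of clusters the object should be sent to."""
    raw = annotations.get(ANNOTATION)
    if raw is None:
        # Assumption: no selector means the object goes to all clusters.
        return sorted(clusters)
    requirements = json.loads(raw)
    return sorted(name for name, labels in clusters.items() if matches(labels, requirements))
```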
Automatic merge from submit-queue
[Federation] Move service dns controller to its own package
This PR only moves the service DNS controller code to its own package.
**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-federation-pr-reviews
/assign @marun
Automatic merge from submit-queue
Fix typo in test_helper
`CompareObjectMeta` is comparing the Name attribute, but
logging Namespace. Looks like a copy/paste error.
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)
deduplicate endpoints before DNS registration
**What this PR does / why we need it**: Multizone clusters will return duplicated endpoints to the federation controller manager. The FCM will then attempt to create an A record with duplicate entries, which will fail. As a result, federated services on multi-AZ clusters don't work right now. This PR deduplicates the endpoint IPs before attempting the DNS record registration.
**Which issue this PR fixes**: fixes #35997
**Special notes for your reviewer**:
I believe there is a lot of refactoring required with multizone federated clusters, most notably with regard to AWS and optimising for ALIAS records rather than A, but this PR will at least allow basic functionality to work.
```release-note
NONE
```
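The deduplication step described in this PR amounts to something like the following sketch. The helper name `dedupe_endpoints` is hypothetical, and the input is assumed to be a flat list of IP strings gathered from all zones.

```python
def dedupe_endpoints(ips):
    """Drop repeated endpoint IPs while keeping first-seen order."""
    # dict preserves insertion order, so this removes duplicates stably.
    return list(dict.fromkeys(ips))
```

Keeping the original order makes the resulting record set deterministic for a given input, which avoids spurious record updates between reconcile passes.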
Automatic merge from submit-queue (batch tested with PRs 44774, 46266, 46248, 46403, 46430)
[Federation] ClusterSelector for Service Controller
This pull request adds ClusterSelector to the Federated Service Controller (ref: design #29887). This backports the same functionality from the sync controller (merged pull #40234).
cc: @nikhiljindal @marun
Automatic merge from submit-queue (batch tested with PRs 42042, 46139, 46126, 46258, 46312)
[Federation] Use service accounts instead of the user's credentials when accessing joined clusters' API servers.
Fixes #41267.
Release notes:
```release-note
Modifies kubefed to create and the federation controller manager to use credentials associated with a service account rather than the user's credentials.
```
Automatic merge from submit-queue
[Federation][kubefed]: Move server image definition to cmd
This enables consumers like openshift to provide a different default without editing the kubefed init logic.
cc: @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 40234, 45885, 42975)
Fed target cluster by label for sync controller
[use clusterselector w/ federated configmap deploys](667dc77444)
**What this PR does / why we need it**: adds the ability to indicate objects are sent to subsets of federated clusters ref #29887
**Release note**:
```release-note
```
fix test error formatting
updates from comments
update gofmt
simplify tests
add to new sync controller
add tests
remove configmap changes due to rebase
updates from review
refactor tests to be based on operations
improvements from review
updates from rebase
rebase to #45374
updates from review
refactor SendToCluster for tests
fix import order
rebase to upstream
Automatic merge from submit-queue (batch tested with PRs 42895, 45940)
[Federation] Automate configuring nameserver in cluster-dns for CoreDNS provider
Addresses issues #42894 and #42822
**Release note**:
```
[Federation] The CoreDNS server will be automatically added to the nameserver resolv.conf chain during federation join when CoreDNS is used as the DNS provider for federation.
```
cc @madhusudancs @kubernetes/sig-federation-bugs
Automatic merge from submit-queue (batch tested with PRs 45247, 45810, 45034, 45898, 45899)
[Federation] Segregate DNS related code to separate controller
**What this PR does / why we need it**:
This is the continuation of the service controller refactoring work outlined in #41253.
This PR segregates DNS-related code from the service controller into a separate controller, `service-dns controller`, which manages the DNS records on the configured DNS provider.
`service-dns controller` monitors the federated services for the ingress annotations and creates, updates, or deletes DNS records accordingly.
`service-dns controller` can be optionally disabled, and DNS record management can then be done by third-party components that monitor the ingress annotations on federated services. (This would enable something like federation middleware for CoreDNS, where the federation API server could be used as a backend to CoreDNS, eliminating the need for etcd storage.)
**Special notes for your reviewer**:
**Release note**:
```
Federation: A new controller for managing DNS records is introduced which can be optionally disabled to enable third party components to manage DNS records for federated services.
```
cc @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 45374, 44537, 45739, 44474, 45888)
[Federation] Refactor sync controller's reconcile method for maintainability
This PR refactors the sync controller's reconcile method for maintainability with the goal of eliminating the need for type-specific controller unit tests. The unit test coverage for reconcile is not complete, but I think it's a good start.
cc: @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 45860, 45119, 44525, 45625, 44403)
coredns: support IPv6 record set
Added support for AAAA record for coredns and included unit test.
Refactored common test code to reduce duplication from added test and
existing tests.
Fixed function names in comments for Google and AWS tests to match
actual test name in this area.
**What this PR does / why we need it**:
Adding IPv6 support to kubernetes, one piece at a time. :)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #44351
**Special notes for your reviewer**:
In addition to the change and unit test method, I did some minor refactoring (since the UT was a near clone of an existing test). Fixed typos in related test methods' comment lines. Please let me know if this is OK (I was thinking it was a small change, but don't know the protocol here), or if I need to break it into multiple commits.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45860, 45119, 44525, 45625, 44403)
[Federation] Move annotations and related parsing code as common code
This PR moves some duplicated code into common code.
Changes the names of structures used for annotations to common names.
s/FederatedReplicaSetPreferences/ReplicaAllocationPreferences/
s/ClusterReplicaSetPreferences/PerClusterPreferences/
This can be reused in job controller and hpa controller code.
**Special notes for your reviewer**:
@kubernetes/sig-federation-misc
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44626, 45641)
Update Google Cloud DNS provider Rrset.Get(name) method to return a list and change the `Rrset.List()` implementation to perform a paged walk
Some federated service e2e tests and a few ingress tests would become flaky after a few hundred runs. @csbell spent quite a lot of time debugging this and found out that this flakiness was due to a bug in the federated service controller deletion logic. Deletion of a federated service object triggers logic in the controller to update the DNS records corresponding to that object. In failed runs, this DNS record update logic would return an error, which would in turn cause the controller to reschedule the operation. This led to an infinite retry-failure cycle that never gave the API server a chance to garbage collect the deleted service object.
A couple of days ago we started seeing a correlation between the number of resource records in a DNS managed zone and these test failures. If you look at the test runs before and after run 2900 in the test grid - https://k8s-testgrid.appspot.com/cluster-federation#gce, you will notice that the grid became super green at 2900. That's when I deleted all the dangling DNS records from the past runs.
After some investigation yesterday, we found that the `ResourceRecordSet.Get()` interface and its implementation, and the `ResourceRecordSet.List()` implementation, at least for Google Cloud DNS, were incorrect.
This PR makes a minimal set of changes (read: least invasive) in the Google Cloud DNS provider implementation to fix these problems:
1. Modifies DNS provider Rrset.Get(name) interface to return multiple records and updates federated service controller.
There can be multiple DNS resource records for a given name. They can vary by type, ttl, rrdata and a number of other parameters. It is incorrect to return a single resource record for a given name.
This change updates the Get interface to return multiple records for a given name and uses this list in the federated service controller to perform DNS operations.
2. Update Google Cloud DNS List implementation to perform a paged walk of lists to aggregate all the DNS records.
The current `List()` implementation just lists the DNS resource records in a given managed zone once and returns the list. It neither performs a paged walk nor does it consider the `page_token` in the returned response.
This change walks all the pages and aggregates the records in the pages and returns the aggregated list. This is potentially dangerous as it can blow up memory if there are a huge number of records in the given managed zone. But this is the best we can do without changing the provider interface too much.
Next step is to define a new paged list interface and implement it.
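The paged walk in item 2 can be sketched as below. This is not the Google Cloud DNS client API: `fetch_page` is a hypothetical callable standing in for one list request, assumed to take a page token (or `None` for the first page) and return the page's records together with the next page token.

```python
def list_all_records(fetch_page):
    """Walk every page and return all records in one aggregated list.

    fetch_page(page_token) -> (records, next_page_token) is an assumed
    interface; a falsy next_page_token marks the last page.
    """
    records = []
    token = None
    while True:
        page, token = fetch_page(token)
        records.extend(page)
        if not token:
            return records
```

As noted above, aggregating everything in memory is the trade-off of keeping the existing interface; a paged list interface would let callers consume one page at a time instead.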
**Release note**:
```release-note
NONE
```
/assign @csbell
cc @justinsb @shashidharatd @quinton-hoole @kubernetes/sig-federation-pr-reviews
When we fetch the dns records by name, we get a list of records that match
the given name. As an optimization we look up to see if the new record we
want to create is already in the returned list to avoid performing any updates.
However, when the new record we want to create isn't in the returned list, it
is hard to say if the returned list contains the list of records that we want
to retain. For example, we might get a list of A records and we want to create
a CNAME record. Creating a new CNAME record without removing the A records is
a DNS misconfiguration. So to play it safe we just remove all the existing records
in the list and create the new desired record.
**Note**: This is the opposite of what I said here - https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/44626#-Ki9xQOzybryHvsxNrra.
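The decision described in this commit message can be sketched as a small planning function. The record representation (plain tuples of name, type, TTL and data) and the function name `plan_record_changes` are assumptions made for illustration; the real controller works against the DNS provider interface.

```python
def plan_record_changes(existing, desired):
    """Return (records_to_remove, records_to_add) for one DNS name.

    existing: records currently returned for the name.
    desired:  the single record we want to end up with.
    """
    if desired in existing:
        # Optimization: the desired record is already present, so do nothing.
        return [], []
    # We cannot tell which existing records are safe to keep (for example,
    # A records next to a new CNAME), so replace everything.
    return list(existing), [desired]
```

For example, with existing A records and a desired CNAME, the plan removes every A record and adds only the CNAME, which avoids the misconfiguration described above.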
Automatic merge from submit-queue (batch tested with PRs 45556, 45561, 45256)
[Federation] Replace the indexing lister with a regular store in the replicaset controller
This is part of the refactoring work to allow the replicaset controller to use the generic sync controller.
None of the other controllers use a lister, including the deployment controller.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45382, 45384, 44781, 45333, 45543)
[Federation] Provide updater timeout to instance rather than to Update()
This PR changes the federated updater to receive its timeout at construction rather than on every call to Update(). This provides a slight decrease in coupling by removing the need for the deletion handler to be provided the timeout along with the updater.
cc: @kubernetes/sig-federation-pr-reviews @perotinus
Automatic merge from submit-queue
[Federation] Improve the logging and user feedback in 'kubefed init'
This is a follow-up to #41849, which added some status information. This PR is based off of that one, and includes its changes as well.
See #41725.
```release-note
None
```