Automatic merge from submit-queue
genericapiserver: unify swagger and openapi in config
- make swagger config customizable
- remove superfluous `Config.Enable*` flags for OpenAPI and Swagger.
This is necessary for downstream projects to tweak the swagger spec.
Automatic merge from submit-queue (batch tested with PRs 37032, 38119, 38186, 38200, 38139)
Detect long-running requests from parsed request info
Follow up to https://github.com/kubernetes/kubernetes/pull/36064
Uses parsed request info to more tightly match verbs and subresources
Removes regex-based long-running request path matching (which is easily fooled)
```release-note
The --long-running-request-regexp flag to kube-apiserver is deprecated and will be removed in a future release. Long-running requests are now detected based on specific verbs (watch, proxy) or subresources (proxy, portforward, log, exec, attach).
```
Automatic merge from submit-queue (batch tested with PRs 38194, 37594, 38123, 37831, 37084)
remove unnecessary fields from genericapiserver config
Cleans up some unnecessary fields in the genericapiserver config.
This removes all dependencies on Config during cert generation, only operating
on ServerRunOptions. This way we get rid of the repeated call of Config.Complete
and cleanly stratify the GenericApiServer bootstrapping.
Automatic merge from submit-queue
specify custom ca file to verify the keystone server
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
Sometimes the keystone server's certificate is self-signed, mainly used for internal development, testing and etc.
For this kind of ca, we need a way to verify the keystone server.
Otherwise, below error will occur.
> x509: certificate signed by unknown authority
This patch provide a way to pass in a ca file to verify the keystone server when starting `kube-apiserver`.
**Which issue this PR fixes** : fixes#22695, #24984
**Special notes for your reviewer**:
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
``` release-note
```
Automatic merge from submit-queue
Remove static kubelet client, refactor ConnectionInfoGetter
Follow up to https://github.com/kubernetes/kubernetes/pull/33718
* Collapses the multi-valued return to a `ConnectionInfo` struct
* Removes the "raw" connection info method and interface, since it was only used in a single non-test location (by the "real" connection info method)
* Disentangles the node REST object from being a ConnectionInfoProvider itself by extracting an implementation of ConnectionInfoProvider that takes a node (using a provided NodeGetter) and determines ConnectionInfo
* Plumbs the KubeletClientConfig to the point where we construct the helper object that combines the config and the node lookup. I anticipate adding a preference order for choosing an address type in https://github.com/kubernetes/kubernetes/pull/34259
Automatic merge from submit-queue
default serializer
Everyone uses the same serializer. Set it as the default, but still allow someone to take control if they want.
Found while trying to use genericapiserver for composition.
Automatic merge from submit-queue
Add support for admission controller based on namespace node selectors.
This work is to upstream openshift's project node selectors based admission controller.
Fixes https://github.com/kubernetes/kubernetes/issues/17151
Automatic merge from submit-queue
Run rbac authorizer from cache
RBAC authorization can be run very effectively out of a cache. The cache is a normal reflector backed cache (shared informer).
I've split this into three parts:
1. slim down the authorizer interfaces
1. boilerplate for adding rbac shared informers and associated listers which conform to the new interfaces
1. wiring
@liggitt @ericchiang @kubernetes/sig-auth
Automatic merge from submit-queue
WantsAuthorizer admission plugin support
The next step of PSP admission is to be able to limit the PSPs used based on user information. To do this the admission plugin would need to make authz checks for the `user.Info` in the request. This code allows a plugin to request the injection of an authorizer to allow it to make the authz checks.
Note: this could be done with a SAR, however since admission is running in the api server using the SAR would incur an extra hop vs using the authorizer directly.
@deads2k @derekwaynecarr
Automatic merge from submit-queue
clean api server cruft
Some cruft has developed over refactors. Remove that cruft.
@liggitt probably last in the chain so far
Automatic merge from submit-queue
Enable service account signing key rotation
fixes#21007
```release-note
The kube-apiserver --service-account-key-file option can be specified multiple times, or can point to a file containing multiple keys, to enable rotation of signing keys.
```
This PR enables the apiserver authenticator to verify service account tokens signed by different private keys. This can be done two different ways:
* including multiple keys in the specified keyfile (e.g. `--service-account-key-file=keys.pem`)
* specifying multiple key files (e.g. `--service-account-key-file current-key.pem --service-account-key-file=old-key.pem`)
This is part of enabling signing key rotation:
1. update apiserver(s) to verify tokens signed with a new public key while still allowing tokens signed with the current public key (which is what this PR enables)
2. give controllermanager the new private key to sign new tokens with
3. remove old service account tokens (determined by verifying signature or by checking creationTimestamp) once they are no longer in use (determined using garbage collection or magic) or some other algorithm (24 hours after rotation, etc). For the deletion to immediately revoke the token, `--service-account-lookup` must be enabled on the apiserver.
4. once all old tokens are gone, update apiservers again, removing the old public key.
Automatic merge from submit-queue
Set deserialization cache size based on target memory usage
**Special notes for your reviewer**:
This is the PR we talked about yesterday.
**Release note**:
```release-note
To reduce memory usage to reasonable levels in smaller clusters, kube-apiserver now sets the deserialization cache size based on the target memory usage.
```
Automatic merge from submit-queue
pass loopback config to poststart hooks
Updates post start hooks to take a clientconfig with the new loopback credentials for bootstrapping.
@ericchiang This is a little bit of plumbing, but mainly auth I think.
Automatic merge from submit-queue
Add ECDSA support for service account tokens
Fixes#28180
```release-note
ECDSA keys can now be used for signing and verifying service account tokens.
```
Automatic merge from submit-queue
Allow anonymous API server access, decorate authenticated users with system:authenticated group
When writing authorization policy, it is often necessary to allow certain actions to any authenticated user. For example, creating a service or configmap, and granting read access to all users
It is also frequently necessary to allow actions to any unauthenticated user. For example, fetching discovery APIs might be part of an authentication process, and therefore need to be able to be read without access to authentication credentials.
This PR:
* Adds an option to allow anonymous requests to the secured API port. If enabled, requests to the secure port that are not rejected by other configured authentication methods are treated as anonymous requests, and given a username of `system:anonymous` and a group of `system:unauthenticated`. Note: this should only be used with an `--authorization-mode` other than `AlwaysAllow`
* Decorates user.Info returned from configured authenticators with the group `system:authenticated`.
This is related to defining a default set of roles and bindings for RBAC (https://github.com/kubernetes/features/issues/2). The bootstrap policy should allow all users (anonymous or authenticated) to request the discovery APIs.
```release-note
kube-apiserver learned the '--anonymous-auth' flag, which defaults to true. When enabled, requests to the secure port that are not rejected by other configured authentication methods are treated as anonymous requests, and given a username of 'system:anonymous' and a group of 'system:unauthenticated'.
Authenticated users are decorated with a 'system:authenticated' group.
NOTE: anonymous access is enabled by default. If you rely on authentication alone to authorize access, change to use an authorization mode other than AlwaysAllow, or or set '--anonymous-auth=false'.
```
c.f. https://github.com/kubernetes/kubernetes/issues/29177#issuecomment-244191596
Automatic merge from submit-queue
Allow secure access to apiserver from Admission Controllers
* Allow options.InsecurePort to be set to 0 to switch off insecure access
* In NewSelfClient, Set the TLSClientConfig to the cert and key files
if InsecurePort is switched off
* Mint a bearer token that allows the client(s) created in NewSelfClient
to talk to the api server
* Add a new authenticator that checks for this specific bearer token
Fixes#13598
* Allow options.InsecurePort to be set to 0 to switch off insecure access
* In NewSelfClient, Set the TLSClientConfig to the cert and key files
if InsecurePort is switched off
* Mint a bearer token that allows the client(s) created in NewSelfClient
to talk to the api server
* Add a new authenticator that checks for this specific bearer token
Fixes#13598
Automatic merge from submit-queue
Cleanup non-rest apiserver handlers
- rename MuxHelper -> PathRecorderMux
- move non-rest handlers into routes packages within genericapiserver and `pkg/routes` (those from master)
- move ui and logs handlers out of genericapiserver (they are
not generic)
- make version handler configurable (`config.EnableVersion`)
- rename MuxHelper -> PathRecorderMux
- move non-rest handlers into routes packages within genericapiserver and
`pkg/routes` (those from master)
- move ui and logs handlers out of genericapiserver (they are
not generic)
- make version handler configurable (`config.EnableVersion`)
Automatic merge from submit-queue
Add admission controller for default storage class.
The admission controller adds a default class to PVCs that do not require any
specific class. This way, users (=PVC authors) do not need to care about
storage classes, administrator can configure a default one and all these PVCs
that do not care about class will get the default one.
The marker of default class is annotation "volume.beta.kubernetes.io/storage-class", which must be set to "true" to work. All other values (or missing annotation) makes the class non-default.
Based on @thockin's code, added tests and made it not to reject a PVC when no class is marked as default.
.
@kubernetes/sig-storage
Automatic merge from submit-queue
Remove implicit Prometheus metrics from client
**What this PR does / why we need it**: This PR starts to cut away at dependencies that the client has.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
The implicit registration of Prometheus metrics for request count and latency have been removed, and a plug-able interface was added. If you were using our client libraries in your own binaries and want these metrics, add the following to your imports in the main package: "k8s.io/pkg/client/metrics/prometheus".
```
cc: @kubernetes/sig-api-machinery @kubernetes/sig-instrumentation @fgrzadkowski @wojtek-t
Automatic merge from submit-queue
pkg/genericapiserver/options: don't import pkg/apiserver
Refactor the authorization options for the API server so
pkg/apiserver isn't directly imported by the options package.
Closes#28544
cc @smarterclayton
@madhusudancs, @nikhiljindal I've updated `federation/cmd/federation-apiserver/app/server.go` to include the RBAC options with this change. I don't know if this was intentionally left out in the first place but would like your feedback.
The admission controller adds a default class to PVCs that do not require any
specific class. This way, users (=PVC authors) do not need to care about
storage classes, administrator can configure a default one and all these PVCs
that do not care about class will get the default one.
Automatic merge from submit-queue
Cut the client repo, staging it in the main repo
Tracking issue: #28559
ref: https://github.com/kubernetes/kubernetes/pull/25978#issuecomment-232710174
This PR implements the plan a few of us came up with last week for cutting client into its own repo:
1. creating "_staging" (name is tentative) directory in the main repo, using a script to copy the client and its dependencies to this directory
2. periodically publishing the contents of this staging client to k8s.io/client-go repo
3. converting k8s components in the main repo to use the staged client. They should import the staged client as if the client were vendored. (i.e., the import line should be `import "k8s.io/client-go/<pacakge name>`). This requirement is to ease step 4.
4. In the future, removing the staging area, and vendoring the real client-go repo.
The advantage of having the staging area is that we can continuously run integration/e2e tests with the latest client repo and the latest main repo, without waiting for the client repo to be vendored back into the main repo. This staging area will exist until our test matrix is vendoring both the client and the server.
In the above plan, the tricky part is step 3. This PR achieves it by creating a symlink under ./vendor, pointing to the staging area, so packages in the main repo can refer to the client repo as if it's vendored. To prevent the godep tool from messing up the staging area, we export the staged client to GOPATH in hack/godep-save.sh so godep will think the client packages are local and won't attempt to manage ./vendor/k8s.io/client-go.
This is a POC. We'll rearrange the directory layout of the client before merge.
@thockin @lavalamp @bgrant0607 @kubernetes/sig-api-machinery
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29147)
<!-- Reviewable:end -->
Automatic merge from submit-queue
make the resource prefix in etcd configurable for cohabitation
This looks big, its not as bad as it seems.
When you have different resources cohabiting, the resource name used for the etcd directory needs to be configurable. HPA in two different groups worked fine before. Now we're looking at something like RC<->RS. They normally store into two different etcd directories. This code allows them to be configured to store into the same location.
To maintain consistency across all resources, I allowed the `StorageFactory` to indicate which `ResourcePrefix` should be used inside `RESTOptions` which already contains storage information.
@lavalamp affects cohabitation.
@smarterclayton @mfojtik prereq for our rc<->rs and d<->dc story.
Automatic merge from submit-queue
Format apiserver options
Trivial change, code formatting only: it was hard to read long lines, and my editor was really slow when scrolling through them.
Automatic merge from submit-queue
Cache Webhook Authentication responses
Add a simple LRU cache w/ 2 minute TTL to the webhook authenticator.
Kubectl is a little spammy, w/ >= 4 API requests per command. This also prevents a single unauthenticated user from being able to DOS the remote authenticator.
A few months ago we refactored options to keep it independent of the
implementations, so that it could be used in CLI tools to validate
config or to generate config, without pulling in the full dependency
tree of the master. This change restores that by separating
server_run_options.go back to its own package.
Also, options structs should never contain non-serializable types, which
storagebackend.Config was doing with runtime.Codec. Split the codec out.
Fix a typo on the name of the etcd2.go storage backend.
Finally, move DefaultStorageMediaType to server_run_options.
Automatic merge from submit-queue
Webhook Token Authenticator
Add a webhook token authenticator plugin to allow a remote service to make authentication decisions.
The codec factory should support two distinct interfaces - negotiating
for a serializer with a client, vs reading or writing data to a storage
form (etcd, disk, etc). Make the EncodeForVersion and DecodeToVersion
methods only take Encoder and Decoder, and slight refactoring elsewhere.
In the storage factory, use a content type to control what serializer to
pick, and use the universal deserializer. This ensures that storage can
read JSON (which might be from older objects) while only writing
protobuf. Add exceptions for those resources that may not be able to
write to protobuf (specifically third party resources, but potentially
others in the future).
Automatic merge from submit-queue
Use correct defaults when binding apiserver flags
defaults should be set in the struct-creating function, then the current struct field value used as the default when binding the flag
Automatic merge from submit-queue
Make etcd cache size configurable
Instead of the prior 50K limit, allow users to specify a more sensible size for their cluster.
I'm not sure what a sensible default is here. I'm still experimenting on my own clusters. 50 gives me a 270MB max footprint. 50K caused my apiserver to run out of memory as it exceeded >2GB. I believe that number is far too large for most people's use cases.
There are some other fundamental issues that I'm not addressing here:
- Old etcd items are cached and potentially never removed (it stores using modifiedIndex, and doesn't remove the old object when it gets updated)
- Cache isn't LRU, so there's no guarantee the cache remains hot. This makes its performance difficult to predict. More of an issue with a smaller cache size.
- 1.2 etcd entries seem to have a larger memory footprint (I never had an issue in 1.1, even though this cache existed there). I suspect that's due to image lists on the node status.
This is provided as a fix for #23323
Pass down into the server initialization the necessary interface for
handling client/server content type negotiation. Add integration tests
for the negotiation.
A NegotiatedSerializer is passed into the API installer (and
ParameterCodec, which abstracts conversion of query params) that can be
used to negotiate client/server request/response serialization. All
error paths are now negotiation aware, and are at least minimally
version aware.
Watch is specially coded to only allow application/json - a follow up
change will convert it to use negotiation.
Ensure the swagger scheme will include supported serializations - this
now includes application/yaml as a negotiated option.
For AWS EBS, a volume can only be attached to a node in the same AZ.
The scheduler must therefore detect if a volume is being attached to a
pod, and ensure that the pod is scheduled on a node in the same AZ as
the volume.
So that the scheduler need not query the cloud provider every time, and
to support decoupled operation (e.g. bare metal) we tag the volume with
our placement labels. This is done automatically by means of an
admission controller on AWS when a PersistentVolume is created backed by
an EBS volume.
Support for tagging GCE PVs will follow.
Pods that specify a volume directly (i.e. without using a
PersistentVolumeClaim) will not currently be scheduled correctly (i.e.
they will be scheduled without zone-awareness).
Public utility methods and JWT parsing, and controller specific logic.
Also remove the coupling between ServiceAccountTokenGetter and the
authenticator class.
Add a HostIPC field to the Pod Spec to create containers sharing
the same ipc of the host.
This feature must be explicitly enabled in apiserver using the
option host-ipc-sources.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
With some regularity, if the root certificate file needs to be generated
the apiserver could come up on the non-secure port before the cert
was generated.
`hack/local-up-cluster.sh` requires that apiserver.crt exists
before the replication controller starts. Otherwise service accounts
and secrets don't work.
This change just takes the certificate handling code out of the `go`.
Also add related new flags to apiserver:
"--oidc-issuer-url", "--oidc-client-id", "--oidc-ca-file", "--oidc-username-claim",
to enable OpenID Connect authentication.
Since pflag can handle net.IPNet arguements use that code. This means
that our code no longer has casts back and forth and just natively uses
net.IPNet.
pflag can handle IP addresses so use the pflag code instead of doing it
ourselves. This means our code just uses net.IP and we don't have all of
the useless casting back and forth!
PR #10643 Started adding the dns names for the kubernetes master to self
sign certs which were created. The kubelet uses this same code, and thus
the kubelet cert started saying it was valid for these name as well.
While hardless, the kubelet cert shouldn't claim to be these things. So
make the caller explicitly list both their ip and dns subject alt names.
A cert from GCE shows:
- IP Address:23.236.49.122
- IP Address:10.0.0.1
- DNS:kubernetes,
- DNS:kubernetes.default
- DNS:kubernetes.default.svc
- DNS:kubernetes.default.svc.cluster.local
- DNS:e2e-test-zml-master
A similarly configured self signed cert shows:
- IP Address:23.236.49.122
- IP Address:10.0.0.1
- DNS:kubernetes
- DNS:kubernetes.default
- DNS:kubernetes.default.svc
So we are missing the fqdn kubernetes.default.svc.cluster.local. The
apiserver does not even know the fqdn! it's defined entirely by the
kubelet! We also do not have the cluster name certificate. This may be
--cluster-name= argument to the apiserver but will take a bit more
research.
pkg/service:
There were a couple of references here just as a reminder to change the
behavior of findPort. As of v1beta3, TargetPort was always defaulted, so
we could remove findDefaultPort and related tests.
pkg/apiserver:
The tests were using versioned API codecs for some of their encoding
tests. Necessary API types had to be written and registered with the
fake versioned codecs.
pkg/kubectl:
Some tests were converted to current versions where it made sense.
Use the systemd $NOTIFY_SOCKET convention for kube-apiserver
startup. This allows it to be part of dependency trees and for
consumers to wait until it is listening on its ports.
The $NOTIFY_SOCKET protocol is described here:
http://www.freedesktop.org/software/systemd/man/sd_notify.html
Currently this is limited to the kube-apiserver process. Other
kube processes are internal kubernetes moving points. The API
server is the entry point relied on by callers.
100% stolen from Stef Walter from:
https://github.com/GoogleCloudPlatform/kubernetes/pull/8316
* Add an allocator which saves state in etcd
* Perform PortalIP allocation check on startup and periodically afterwards
Also expose methods in master for downstream components to handle IP allocation
/ master registration themselves.
I had the kublet die on startup and the only error was "0x401da0" Which
I assume is an address of the err.Error function. The other way to fix
this, I think, would be to use err.Error(), however that could cause
fmt.Fprintf() problems, debuging on the error message people used.
Now I get a nice clean error I can understand:
"cAdvisor.New() err = mountpoint for cpu not found"
Part of #108.
Also:
* Added hyperkube cmd (not built by default yet).
* Added version support to hyperkube
* Remove health_check_minions flag from apiserver as it is no longer used with #3733
This exposes the proper v1beta3 API endpoint when the user specifies
the --runtime_config=api/v1beta3 argument to the apiserver. v1beta3
is still considered experimental and subject to change.
--runtime_config is a map of string keys and values, that can be
specified by providing
--runtime_config=a=b,b=c,d,e
Only the key must be specified, the value can be omitted.
Enables v1beta3 in hack/local-up-cluster.sh and hack/test-cmd.sh
OpenShift would like to also enable swagger, but we need to register our
services as swagger services prior to the SwaggerAPI being started. I've
added a bool (default false) to master.Config to enable swagger, and split
the method in master out so that a downstream consumer can call it.
apiserver becomes kube-apiserver
controller-manager -> kube-controller-manager
scheduler and proxy similarly.
Only thing I promise is that right now hack/build-go.sh and
build/release.sh exit with 0. That's it. Who knows if any of this
actually works....