Implements part of #24071
I am not familiar with the scheduler enough to know what to do with the scores. Punting for now.
Missing items from the implementation plan: limitranger, rkt support, kubectl
support and user docs
Automatic merge from submit-queue
cluster/images/hyperkube: create symlink for each server
Add a kubelet symlink so that the hyperkube image can appear as a kubelet image. https://github.com/kubernetes/kubernetes/issues/24510
Introduce DescriberSettings for Describer display options
Introduce --show-events flag and DescriberSettings in Describer methods
Introduce unit-tests
Regenerated kubectl describe docs
Add events flag tests to test-cmd.sh
Signed-off-by: dhodovsk@redhat.com
Signed-off-by: jchaloup@redhat.com
The codec factory should support two distinct interfaces - negotiating
for a serializer with a client, vs reading or writing data to a storage
form (etcd, disk, etc). Make the EncodeForVersion and DecodeToVersion
methods only take Encoder and Decoder, and slight refactoring elsewhere.
In the storage factory, use a content type to control what serializer to
pick, and use the universal deserializer. This ensures that storage can
read JSON (which might be from older objects) while only writing
protobuf. Add exceptions for those resources that may not be able to
write to protobuf (specifically third party resources, but potentially
others in the future).
Automatic merge from submit-queue
kubectl rolling-update support for same image
Fixes#23497.
Enables `kubectl rolling-update --image` to the same image, adding a `--image-pull-policy` flag to remove ambiguity. This allows rolling-update to behave as an "update and/or restart" (https://github.com/kubernetes/kubernetes/issues/23497#issuecomment-212349730), or as a forced update when the same tag can mean multiple versions (e.g. `:latest`). cc @janetkuo @nikhiljindal
Automatic merge from submit-queue
Make etcd cache size configurable
Instead of the prior 50K limit, allow users to specify a more sensible size for their cluster.
I'm not sure what a sensible default is here. I'm still experimenting on my own clusters. 50 gives me a 270MB max footprint. 50K caused my apiserver to run out of memory as it exceeded >2GB. I believe that number is far too large for most people's use cases.
There are some other fundamental issues that I'm not addressing here:
- Old etcd items are cached and potentially never removed (it stores using modifiedIndex, and doesn't remove the old object when it gets updated)
- Cache isn't LRU, so there's no guarantee the cache remains hot. This makes its performance difficult to predict. More of an issue with a smaller cache size.
- 1.2 etcd entries seem to have a larger memory footprint (I never had an issue in 1.1, even though this cache existed there). I suspect that's due to image lists on the node status.
This is provided as a fix for #23323
- Add README.md for node e2e tests
- Add support for --cleanup=false to leave test files on remote hosts and temporary instances for debugging
- Add ubuntu trusty instances for docker 1.8 and docker 1.9 to jenkins pr builder
- Disable coreos-beta for jenkins ci since it is failing - need to investigate
- support allocating gce instances from images and running tests against them
- set --hostname-override to match node name
- add jenkins script to source to reproduce jenkins build locally
- Add Makefile targets
- Start services in the test harness and connect locally
- Build test into binary and copy to remote host to start services
- Use tar to copy binaries to remote hosts to simplify design
controller-manager in k8sm:
- rename host_port_endpoints flag to host-port-endpoints
- host port mapping strategy in scheduler should be driven by host-port-endpoints flag
- added host-port-endpoints to known flags
- docs: scheduler should also be configured with host-port-endpoints
- task recovery: be explicit about excluding mirror pods
`--kubelet-cgroups` and `--system-cgroups` respectively.
Updated `--runtime-container` to `--runtime-cgroups`.
Cleaned up most of the kubelet code that consumes these flags to match
the flag name changes.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
I can't revert with github which says "Sorry, this pull request couldn’t be
reverted automatically. It may have already been reverted, or the content may
have changed since it was merged."
Reverts commit: 0c191e787b
* Metrics will not be expose until they are hooked up to a handler
* Metrics are not cached and expose a dos vector, this must be fixed before release or the stats should not be exposed through an api endpoint
Recycle controller tries to recycle or delete a PV several times.
It stores count of failed attempts and timestamp of the last attempt in
annotations of the PV.
By default, the controller tries to recycle/delete a PV 3 times in
10 minutes interval. These values are configurable by
kube-controller-manager --pv-recycler-maximum-retry=X --pvclaimbinder-sync-period=Y
arguments.
We do this because they will be recreated immediately by the
DaemonSet Controller. In addition, we also require a specific flag
(--ignore-daemonsets) when there are DaemonSet pods on the node.
The protobuf tags contain the assigned tag id, which then ensures
subsequent generation is consistently tagging (tags don't change across
generations unless someone deletes the protobuf tag).
In addition, generate final proto IDL that is free of gogoproto
extensions for ease of generation into other languages.
Add a flag --keep-gogoproto which preserves the gogoproto extensions in
the final IDL.
Add `kube-reserved` and `system-reserved` flags for configuration
reserved resources for usage outside of kubernetes pods. Allocatable is
provided by the Kubelet according to the formula:
```
Allocatable = Capacity - KubeReserved - SystemReserved
```
Also provides a method for estimating a reasonable default for
`KubeReserved`, but the current implementation probably is low and needs
more tuning.
Implement a flag that defines the frequency at which a node's out of
disk condition can change its status. Use this flag to suspend out of
disk status changes in the time period specified by the flag, after
the status is changed once.
Set the flag to 0 in e2e tests so that we can predictably test out of
disk node condition.
Also, use util.Clock interface for all time related functionality in
the kubelet. Calling time functions in unversioned package or time
package such as unversioned.Now() or time.Now() makes it really hard
to test such code. It also makes the tests flaky and sometimes
unnecessarily slow due to time.Sleep() calls used to simulate the
time elapsed. So use util.Clock interface instead which can be faked
in the tests.
Add flags to control max connections (set to 256k vs 64k default) and TCP
established timeout (set to 1 day vs 5 day default). Flags can be set to 0 to
mean "don't change it".
This is only set at startup, and not wrapped in a rectifier loop.
Tested manually.
1. Name default scheduler with name `kube-scheduler`
2. The default scheduler only schedules the pods meeting the following condition:
- the pod has no annotation "scheduler.alpha.kubernetes.io/name: <scheduler-name>"
- the pod has annotation "scheduler.alpha.kubernetes.io/name: kube-scheduler"
update gofmt
update according to @david's review
run hack/test-integration.sh, hack/test-go.sh and local e2e.test
Currently if a pod is being scheduled with no meta.RolesKey label
attached to it, per convention the first configured mesos (framework)
role is being used.
This is quite limiting and also lets e2e tests fail. This commit
introduces a new configuration option "--mesos-default-pod-roles" defaulting to
"*" which defines the default pod roles in case the meta.RolesKey pod
label is missing.
Contains the following fixes for Windows users of kubectl edit:
* Defaults to notepad as the default Windows editor
* Uses CRLF line endings
* Ensures a file lock is freed
Default to hardcodes for components that had them, and 5.0 qps, 10 burst
for those that relied on client defaults
Unclear if maybe it'd be better to just assume these are set as part of
the incoming kubeconfig. For now just exposing them as flags since it's
easier for me to manually tweak.
This changes the --legacy-userspace-proxy flag to be a string flag
--proxy-mode. If specified, the flag will be respected ('userspace' and
'iptables' being valid values). If left blank (default) we will choose the
"best". best means userspace for now UNLESS the user adds an annotation
(net.experimental.kubernetes.io/proxy-mode) to their node, in which case we
will try to use that.
This allows people to try it on a single machine without fear of global failure
and without it getting rolled back on reboots. It is a poor-man's config blob.
Increase the supported controls on pod logging. Add validaiton to pod
log options. Ensure the Kubelet is using a consistent, structured way to
process pod log arguments.
Add ?sinceSeconds=<durationInSeconds>, &sinceTime=<RFC3339>, ?timestamps=<bool>,
?tailLines=<number>, and ?limitBytes=<number>
Add a HostIPC field to the Pod Spec to create containers sharing
the same ipc of the host.
This feature must be explicitly enabled in apiserver using the
option host-ipc-sources.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
A lot of packages use StringSet, but they don't use anything else from
the util package. Moving StringSet into another package will shrink
their dependency trees significantly.
1. Add EvnetRecordQps and EventBurst parameter in kubelet.
2. If EvnetRecordQps and EventBurst was set, rate limit events in kubelet
with a independent ratelimiter as setted.
- new: introduce AllocationStrategy, Predicate, and Procurement to scheduler pkg
- new: --contain-pod-resources flag (workaround for docker+systemd+mesos problems)
- new: --account-for-pod-resources flag (for testing overcommitment)
- bugfix: forward -v flag from minion controller to executor
Allow the user to specify the resolver configuration file that is used
to determine the default DNS parameters. This defaults to the system's
/etc/resolv.conf.
Also add related new flags to apiserver:
"--oidc-issuer-url", "--oidc-client-id", "--oidc-ca-file", "--oidc-username-claim",
to enable OpenID Connect authentication.
We know there are some flags (declared with an _) which we wish to
ignore. These flags are used by container definitions, e2e, etc. By
explicitly ignoring those flags we can cut the amount of noise in the
whitelist.
This works by defining two 'static' lists in hack. The first is the list
of all flags in the project which use a `-` or an `_` in their name. All
files being processed by verify-flags-underscore.py (or all files in the
repo if no filename arguments are given) will be searched for flag
declaration using a simple regex. Its not super smart. If a flag is
found which is not in the static list it will complain/reject the commit
until a human adds it to the list. If we do not keep a static list of
flags it takes >.2 seconds to find them 'all' at runtime. Since this is
run in pre-commit saving every part of a second helps.
After it finds all of the flags it runs all of the arguments (or all
files in repo if no arguments) looking for usage of those flags which
includes an `_`. There are lots of places where these are false
positives. For example we have a flag named oom-adj-score but the kernel
calls it oom_adj_score. To handle this we keep a second 'whitelist' of
lines which are allowed to use these flag names with an `_`.
Running the entire git repo looking for flags in every golang file and
looking in every single file for bad usage takes about 8.75 seconds.
Running it in the precommit hook where we only check things that changed
takes about .06 seconds.