Automatic merge from submit-queue
Fix node e2e coreos kubelet cgroup detection
Fixes#26979#26431
The root issue, as best I can tell, is that cgroup detection does not work when the kubelet is started under an ssh session and the systemd `*Accounting` variables are set. I added additional logging and noted some differences in the cgroup slice names between those cadvisor returns and the kubelet detects for itself.
This difference does not occur if the kubelet is properly running under a unit. That environment is also a more common and sane environment.
See also discussion in #26903
cc @derekwaynecarr @vishh @pwittrock
Note that these tests are untested and there is no guarantee that they work.
The ongoing auth problems is blocking these e2es from being tested and upon
@quinton-hoole's request I am submitting them now.
Automatic merge from submit-queue
Add pending pod check in cluster autoscaler e2e tests
The tests should wait until all pods are running before declaring a success and resizing the mig.
cc: @fgrzadkowski @piosz @jszczepkowski
Automatic merge from submit-queue
volume integration: wait for PVs before creating PVCs
The test should wait until all volumes are processed by volume controller (i.e. in the controller cache) before creating a PVC.
Without that, the "best" matching PV could not be in the cache and controller might bind the PVC to suboptiomal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity 50000000000 but got pvc-2 capacity 52000000000".
Fixes#27179 (together with #26894)
Automatic merge from submit-queue
Fix integration pv flakes
There are two fixes in this PR:
- run tests in separarate functions and use objects with different names, otherwise events from the beginning of the function are caught later when we watch for events of a different PV/PVC
- don't set PV.Spec.ClaimRef.UID of pre-bound PVs. PVs with UID set are considered as bound and they are deleted/recycled when appropriate PVC does not exists yet.
Fixes#26730 and probably also ~~#26894~~ #26256
The test should wait until all volumes are processed by volume controller (i.e.
in the controller cache) before creating a PVC.
Without that, the "best" matching PV could not be in the cache and controller
might bind the PVC to suboptiomal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity
50000000000 but got pvc-2 capacity 52000000000".
Automatic merge from submit-queue
Let kubelet log the DeletionTimestamp if it's not nil in update
This helps to debug if it's the kubelet to blame when a pod is not deleted.
Example output:
```
SyncLoop (UPDATE, "api"): "redis-master_default(c6782276-2dd4-11e6-b874-64510650ab1c):DeletionTimestamp=2016-06-08T23:58:12Z"
```
ref #26290
cc @Random-Liu
Automatic merge from submit-queue
Update reason_cache.go, Get method operate lru cache not threadsafe
The reason_cache wrapped lru cache , lru cache modies linked list even for a get, should use WLock for both read and write
Automatic merge from submit-queue
Fix docker api version in kubelet
There are two variables `dockerv110APIVersion` and `dockerV110APIVersion` with
the same purpose, but different values. Remove the incorrect one and fix usage
in the file.
/cc @dchen1107 @Random-Liu
Automatic merge from submit-queue
processor listener: fix locking in pop()
Currently the lock in processorListener is used to guard pendingNotifications. But in pop, it also locks around on select chan. This will block the goroutine with lock acquired.
This PR changes the lock to guard the correct section only.
Automatic merge from submit-queue
pkg/kubectl: add resource printers for rbac api group
This PR adds the necessary kubectl printers for the rbac api group which we overlooked in previous PRs.
cc @erictune
Automatic merge from submit-queue
ResourceQuota BestEffort scope aligned with Pod level QoS
This aligns quota with the changes in kubelet and CLI.
So if quota allows 10 `BestEffort` pods, it will now track properly with what the user sees with changes in 1.3.
```
apiVersion: v1
kind: ResourceQuota
metadata:
name: best-effort
spec:
hard:
pods: "10"
scopes:
- BestEffort
```
/cc @vishh @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
AWS: cache instances during service reload to avoid rate limiting on restart
Fixes#25610 by reducing redundant calls to DescribeInstances()
```release-note
* The AWS cloudprovider will cache results from DescribeInstances() if the set of nodes hasn't changed
```
Also move int/stringSlicesEqual from servicecontroller.go to pkg/util/slice
Automatic merge from submit-queue
e2e: actually check error when we fail to GET the scheduled pod
and GET the pod before we try to gracefully delete it.
and add a debug log.
ref #26224
Automatic merge from submit-queue
Extract interface for master endpoints reconciler.
Make the master endpoints reconciler an interface so its implementation can be overridden, if
desired.
xref #20975#26574
cc @kubernetes/sig-api-machinery @lavalamp @smarterclayton @pmorie @DirectXMan12 @wojtek-t @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
fix recursive & non-recursive kubectl get of generic output format
This PR fixes the issues with `kubectl get` in https://github.com/kubernetes/kubernetes/issues/26466
Changes made:
- fix printing when using the generic output format in both non-recursive & recurvise settings to ensure that errors are being shown
- add tests to check printing generic output in a **non-recursive** setting with non-existent pods
- clean up the **recursive** `kubectl get` tests
/cc @janetkuo
Automatic merge from submit-queue
Considering all nodes for the scheduler cache to allow lookups
Fixes the actual issue that led me to create https://github.com/kubernetes/kubernetes/issues/22554
Currently the nodes in the cache provided to the predicates excludes the unschedulable nodes using field level filtering for the watch results. This results in the above issue as the `ServiceAffinity` predicate uses the cached node list to look up the node metadata for a peer pod (another pod belonging to the same service). Since this peer pod could be currently hosted on a node that is currently unschedulable, the lookup could potentially fail, resulting in the pod failing to be scheduled.
As part of the fix, we are now including all nodes in the watch results and excluding the unschedulable nodes using `NodeCondition`
@derekwaynecarr PTAL
Automatic merge from submit-queue
Port the downward api test to the node e2e suite
Also extend the framework to allow a custom client config loading function, so
that the node e2e suite can reuse the same framework across tests.
This fixes#26609
/cc @timstclair @pwittrock