Automatic merge from submit-queue
Remove the status of the terminated containers in the summary endpoint
Ref: https://github.com/kubernetes/kubernetes/issues/47853
- When building the summary, a container is considered terminated if it has an older creation time and reports no instantaneous CPU usage and no memory RSS usage.
- We remove the terminated containers from the summary by grouping the containers with the same name in the same pod, sorting each group by creation time, and skipping the oldest entries in each group that report no usage (see the sketch below). Let me know if there's a simpler way.
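Roughly, the filtering step looks like the following sketch. The types and field names here are simplified stand-ins for illustration, not the actual kubelet structs:
```go
// Hypothetical sketch of the de-duplication step: for each (pod, container name)
// group, keep the newest entry plus any older entries that still report usage.
package main

import (
	"fmt"
	"sort"
	"time"
)

type containerInfo struct {
	PodUID    string
	Name      string
	CreatedAt time.Time
	CPUUsage  uint64 // instantaneous CPU usage
	MemoryRSS uint64 // memory RSS usage
}

// removeTerminated drops containers that have an older creation time and no
// CPU or memory usage, keeping at least the newest entry per (pod, name) group.
func removeTerminated(containers []containerInfo) []containerInfo {
	groups := map[string][]containerInfo{}
	for _, c := range containers {
		key := c.PodUID + "/" + c.Name
		groups[key] = append(groups[key], c)
	}
	var result []containerInfo
	for _, group := range groups {
		// Newest first, so everything after index 0 is an "older" duplicate.
		sort.Slice(group, func(i, j int) bool {
			return group[i].CreatedAt.After(group[j].CreatedAt)
		})
		for i, c := range group {
			if i == 0 || c.CPUUsage > 0 || c.MemoryRSS > 0 {
				result = append(result, c)
			}
		}
	}
	return result
}

func main() {
	now := time.Now()
	containers := []containerInfo{
		{PodUID: "pod1", Name: "app", CreatedAt: now, CPUUsage: 100, MemoryRSS: 4096},
		{PodUID: "pod1", Name: "app", CreatedAt: now.Add(-time.Hour)}, // terminated duplicate
	}
	fmt.Println(removeTerminated(containers)) // only the live container remains
}
```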
**Release note**:
```
None
```
/assign @yujuhong
Automatic merge from submit-queue
Add assert.NotNil for test case
I hard-coded `DefaultInterfaceName` from `eth0` to `eth-k8sdefault` in release 1.2.0 in order to test my CNI plugins. When running the test, it panics and prints the wrongly formatted messages shown below.
In the test case `TestBuildSummary`, `containerInfoV2ToNetworkStats` returns `nil` if `DefaultInterfaceName` is not `eth0`, so we should probably add `assert.NotNil` to the test case; a sketch follows the log output below.
```
ok k8s.io/kubernetes/pkg/kubelet/server 0.591s
W0523 03:25:28.257074 2257 summary.go:311] Missing default interface "eth-k8sdefault" for s%!(EXTRA string=node:FooNode)
W0523 03:25:28.257322 2257 summary.go:311] Missing default interface "eth-k8sdefault" for s%!(EXTRA string=pod:test0_pod1)
W0523 03:25:28.257361 2257 summary.go:311] Missing default interface "eth-k8sdefault" for s%!(EXTRA string=pod:test0_pod0)
W0523 03:25:28.257419 2257 summary.go:311] Missing default interface "eth-k8sdefault" for s%!(EXTRA string=pod:test2_pod0)
--- FAIL: TestBuildSummary (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x471817]
goroutine 16 [running]:
testing.func·006()
/usr/src/go/src/testing/testing.go:441 +0x181
k8s.io/kubernetes/pkg/kubelet/server/stats.checkNetworkStats(0xc20806d3b0, 0x140bbc0, 0x4, 0x0, 0x0)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/server/stats/summary_test.go:296 +0xc07
k8s.io/kubernetes/pkg/kubelet/server/stats.TestBuildSummary(0xc20806d3b0)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/server/stats/summary_test.go:124 +0x11d2
testing.tRunner(0xc20806d3b0, 0x1e43180)
/usr/src/go/src/testing/testing.go:447 +0xbf
created by testing.RunTests
/usr/src/go/src/testing/testing.go:555 +0xa8b
```
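A minimal illustration of the suggested check, using the testify `assert` package. The stand-in function and stats type below are placeholders for illustration, not the exact test code:
```go
// Illustrative only: guard against a nil return before dereferencing it,
// mirroring what the test case should do with the network stats.
package stats

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

type networkStats struct {
	RxBytes uint64
	TxBytes uint64
}

// buildNetworkStats stands in for containerInfoV2ToNetworkStats, which
// returns nil when the default interface is missing.
func buildNetworkStats(ifaceName string) *networkStats {
	if ifaceName != "eth0" {
		return nil
	}
	return &networkStats{}
}

func TestNetworkStatsNotNil(t *testing.T) {
	stats := buildNetworkStats("eth0")
	// Fail the test with a clear message instead of panicking on a nil pointer.
	if assert.NotNil(t, stats) {
		assert.Equal(t, uint64(0), stats.RxBytes)
	}
}
```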
Automatic merge from submit-queue
Expose SummaryProvider for reuse by other parts of kubelet
To support out-of-resource killing in the kubelet, we will introduce a new top-level module that ensures node stability by checking whether eviction thresholds for memory and file-system usage on the node have been met. It will then need information about pod memory and disk usage in order to make an eviction selection. Currently, this information is collected in `SummaryProvider`, but it is hidden away and not available for reuse by other top-level modules of the kubelet. This initial refactor adds the ability to get summary stat information from the `ResourceAnalyzer` so it can be reused by other top-level modules.
I suspect we will further refactor this area as the code evolves, but this unblocks further progress on out-of-resource killing.
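Conceptually, the refactor amounts to something like the following. The interface and method names here are approximations for illustration, not the exact kubelet API:
```go
package stats

// SummaryProvider builds the stats summary for the node and its pods.
// Signature simplified for illustration.
type SummaryProvider interface {
	Get() (*Summary, error)
}

// ResourceAnalyzer aggregates the existing resource-analysis hooks and,
// after this refactor, also exposes the summary provider so other
// top-level kubelet modules (e.g. the eviction manager) can reuse it.
type ResourceAnalyzer interface {
	SummaryProvider
	// ... other existing resource analysis methods
}

// Summary is a placeholder for the stats summary type.
type Summary struct {
	NodeName string
	// pod and container stats omitted
}
```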
/cc @vishh @timothysc @kubernetes/sig-node @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Add memory available to summary stats provider
To support out-of-resource killing when a node is low on memory, we want to let operators specify eviction thresholds based on available memory instead of memory usage, for ease of use when working with heterogeneous nodes.
So for example, a valid eviction threshold would be the following:
* If node.memory.available < 200Mi for 30s, then evict pod(s)
For the node, `memory.availableBytes` is always known, since `memory.limit_in_bytes` is always known for the root cgroup. For individual containers in pods, we only populate `availableBytes` if the container was launched with a memory limit specified. When no memory limit is specified, cgroupfs sets `memory.limit_in_bytes` to a value around 1 << 63, so we check for a similarly large value to detect unbounded limits and skip setting `memory.availableBytes`.
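In code, the intent is roughly the following. The cutoff constant and field names are illustrative, not the actual kubelet/cAdvisor values:
```go
package main

import "fmt"

// maxMemoryLimit approximates the "unbounded" value cgroupfs reports in
// memory.limit_in_bytes when no limit is set (on the order of 1 << 63).
const maxMemoryLimit = uint64(1) << 62 // illustrative cutoff, not the exact kubelet constant

// availableBytes returns the available memory and whether it is known.
// When the limit is effectively unbounded, availability is left unset.
func availableBytes(limitBytes, workingSetBytes uint64) (uint64, bool) {
	if limitBytes >= maxMemoryLimit {
		return 0, false // unbounded limit: do not report availableBytes
	}
	if workingSetBytes > limitBytes {
		return 0, true
	}
	return limitBytes - workingSetBytes, true
}

func main() {
	// Node root cgroup: the limit is always known, so availability is always reported.
	if avail, ok := availableBytes(8<<30, 6<<30); ok {
		fmt.Printf("node memory.availableBytes = %d\n", avail)
	}
	// Container with no memory limit: availability is omitted.
	if _, ok := availableBytes(uint64(1)<<63, 512<<20); !ok {
		fmt.Println("container availableBytes omitted (unbounded limit)")
	}
}
```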
FYI @vishh @timstclair - as discussed on Slack.
/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra
`--kubelet-cgroups` and `--system-cgroups` respectively.
Updated `--runtime-container` to `--runtime-cgroups`.
Cleaned up most of the kubelet code that consumes these flags to match
the flag name changes.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
* Metrics will not be exposed until they are hooked up to a handler
* Metrics are not cached and expose a DoS vector; this must be fixed before release, or the stats should not be exposed through an API endpoint