[BACKLOG] Monitoring Systems (metrics collection and visualization) #459

bombnp · 2023-01-10T02:07:26Z

Problem

There's currently no monitoring dashboard for our system in the following categories:

Resource usage (CPU, Memory, Disk, I/O) -> can utilize open-source metric exporters
Performance (Latency, Error rates) -> requires custom logging?

We want to identify where resource usage can be optimized, since our resources are starting to run out.

Task Description

Create a monitoring system consisting of metrics collection (through exporters and Prometheus) and visualization (through Grafana?). Visualize metrics by resources, nodes, pods, or other API objects as needed.
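As a rough sketch of how per-node and per-pod views could be driven, the snippet below lists a few PromQL queries and builds instant-query URLs for the Prometheus HTTP API (`/api/v1/query`). These are assumptions rather than a confirmed setup: the base URL `http://localhost:9090` (for example via a port-forward), the `QUERIES` names, and the availability of the `container_*` metrics, which come from the kubelet's cAdvisor endpoint and not from node-exporter or kube-state-metrics. The same query strings could be pasted into Grafana panels.

```python
from urllib.parse import urlencode

# Assumption: Prometheus is reachable here, e.g. through a local port-forward.
PROMETHEUS_URL = "http://localhost:9090"

# Hypothetical query catalogue; the keys are illustrative names only.
QUERIES = {
    # node-exporter: non-idle CPU cores in use per node over the last 5 minutes
    "cpu_by_node": 'sum by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))',
    # node-exporter: memory still available per node, in bytes
    "memory_available_by_node": "node_memory_MemAvailable_bytes",
    # cAdvisor (assumed to be scraped): CPU cores in use per pod
    "cpu_by_pod": "sum by (namespace, pod) (rate(container_cpu_usage_seconds_total[5m]))",
    # cAdvisor (assumed to be scraped): working-set memory per pod, in bytes
    "memory_by_pod": "sum by (namespace, pod) (container_memory_working_set_bytes)",
}


def instant_query_url(name, base_url=PROMETHEUS_URL):
    """Build a Prometheus HTTP API instant-query URL for a named query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': QUERIES[name]})}"


if __name__ == "__main__":
    for name in QUERIES:
        print(name, instant_query_url(name))
```

Grouping with `sum by (...)` is what allows several nodes or pods to be shown on one panel at once.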

Additional Context

For now, I've enabled the Prometheus + node-exporter + kube-state-metrics stack integration from Lens (in the lens-metrics namespace). It can visualize usage of a specific node or pod, but not multiple at the same time. You'll likely be using the same stack, but some metrics will require exporters that we install ourselves.

Related Teams

Task Advisors

@bombnp

bombnp added the "new feature" label on Jan 10, 2023