Kubernetes metrics
Some metrics collected by Kubernetes are the same as those of server and application monitoring.
Container (container) metric
The container category collects all custom labels set on the container's pods as tags.
- Target: Cluster project, Namespace project
- Collection interval: 5 seconds
- Statistical data: 5 minutes
Tags
Tags | Type | Unit | Description |
---|---|---|---|
agentOid | - | - | Node agent ID (unique) |
agentPcode | - | - | Project code (unique) |
command | - | - | Execution command |
containerId | - | - | Container ID (unique) |
containerKey | - | - | Container key |
created | - | - | Timestamp of when the container was created |
image | - | - | Container image name |
imageHash | - | - | Image hash value |
imageId | - | - | Image ID |
k8s-app | - | - | Value for the pod's label k8s-app |
microOid | - | - | Unique ID of the WhaTap APM agent installed in the container |
name | - | - | Container name |
namespace | - | - | Namespace to which the container belongs |
namespaceHash | - | - | Hash value of the namespace to which the container belongs |
okind | - | - | Unique ID of OKIND specified in the WhaTap APM agent installed in the container |
okindName | - | - | Name of OKIND specified in the WhaTap APM agent installed in the container |
oname | - | - | Name of the WhaTap APM agent installed in the container |
onode | - | - | Unique ID of the node agent on which the container is running |
onodeName | - | - | Node name on which the container is running |
podHash | - | - | Hash value of the container's Pod |
podName | - | - | Container's Pod name |
replicaSetHash | - | - | Hash value of the container's replica set |
replicaSetName | - | - | Name of the container's replica set |
whatap_project | - | - | Name of the WhaTap project to which the container belongs |
Fields
Field | Type | Unit | Shortname/Name/Description |
---|---|---|---|
blkio_rbps | - | byte | IoReadBytes |
Container Block I/O Read Byte | |||
Sum of bytes read per second across all block devices in the container | |||
blkio_riops | - | count | IoReadIops |
Container Block I/O Read IOPS | |||
Sum of read operations per second across all block devices in the container | |||
blkio_wbps | - | byte | IoWriteBytes |
Container Block I/O Write Byte | |||
Sum of bytes written per second across all block devices in the container | |||
blkio_wiops | - | count | IoWriteIops |
Container Block I/O Write IOPS | |||
Sum of write operations per second across all block devices in the container | |||
cpu_per_quota | - | percent | CpuByLimit |
Container CPU usage by limit (%) | |||
Container CPU utilization by limit | |||
cpu_quota | - | millicores | CpuLimit |
Container CPU Limit (core) | |||
Container CPU limit quota. If the limit is not set, the total CPU cores of the node where the container is running appear in millicores. | |||
cpu_quota_percent | - | percent | CpuLimitByNode |
Container CPU Limit by Node(%) | |||
Container CPU limit quota against node CPU - If the limit is not set, the total CPU cores of the node where the container is running appear as a percentage. | |||
cpu_sys | - | percent | CpuSysByNode |
Container CPU Sys Usage by Node(%) | |||
Container CPU System Utilization against Node CPU | |||
cpu_throttledperiods | - | count | CpuThrottledCnt |
Container CPU Throttling Count | |||
Container CPU Throttled Count | |||
cpu_throttledtime | - | nanosecond(ns) | CpuThrottledTime |
Container CPU Throttling Time | |||
Container CPU Throttled Time | |||
cpu_total | - | percent | CpuByNode |
Container CPU Usage by Node(%) | |||
Container CPU Utilization against Node CPU | |||
cpu_total_milli | - | millicores | CpuTotUsage |
Container CPU Usage (millicore) | |||
Container CPU Usage | |||
cpu_user | - | percent | CpuUserByNode |
Container CPU User Usage by Node(%) | |||
Container CPU User Utilization against Node CPU | |||
cpu_request | - | millicores | CpuRequest |
Container CPU Request (core) | |||
Container CPU Request | |||
cpu_per_request | - | percent | CpuByRequest |
Container CPU Usage by Request(%) | |||
Utilization against Container CPU Request = cpu_total_milli / cpu_request * 100 | |||
mem_failcnt | - | count | MemFailCnt |
Container Memory Failure Count | |||
Container Memory Limit reached Count | |||
mem_limit | - | byte | MemLimit |
Container Memory Limit (byte) | |||
Container Memory Limit Size | |||
mem_maxusage | - | byte | MemMaxUsage |
Container Memory Max Usage (byte) | |||
Recorded maximum container memory usage. See the guide below for details. | |||
mem_percent | - | percent | MemWsByLimit |
Container Memory Working Set by Limit(%) | |||
Working Set Usage based on Container Memory Limit = mem_usage / mem_limit * 100 | |||
mem_totalcache | - | byte | MemTotCache |
Container Memory Total Cache (byte) | |||
Container's Total Cache Size | |||
mem_totalpgfault | - | count | MemTotPageFaultCnt |
Container Memory Total Page Fault Count | |||
Container's Page Fault Count | |||
mem_totalrss | - | byte | MemTotRss |
Container Memory Total RSS (byte) | |||
Container's Total RSS Memory Size | |||
mem_totalrss_percent | - | percent | MemTotRssByLimit |
Container Memory Total RSS By Limit (%) | |||
Container's Total RSS Memory Utilization | |||
mem_totalunevictable | - | byte | MemTotUnevictable |
Container Memory Total Unevictable (byte) | |||
Container's Total Unevictable Memory Size | |||
mem_usage | - | byte | MemUsage |
Container Memory Usage (byte) | |||
Container Memory Usage | |||
mem_working_set | - | byte | MemWs |
Container Memory Working Set (byte) | |||
Container memory working set = mem_usage - inactive file | |||
mem_working_set_percent | - | percent | MemWsByLimit |
Container Memory Working Set by Limit (%) | |||
Working Set Usage based on Container Memory Limit = mem_usage / mem_limit * 100 | |||
mem_request | - | byte | MemRequest |
Container Memory Request (byte) | |||
Container Memory Request Size | |||
mem_per_request | - | percent | MemWsByRequest |
Container Memory Working Set by Request (%) | |||
Working Set Usage based on Container Memory Request = mem_working_set / mem_request * 100 | |||
network_rbps | - | byte | NetRxBytes |
Container Network Receive Byte | |||
Container Network Receive Data Size | |||
network_rdropped | - | count | NetRxDropped |
Container Network Receive Dropped | |||
Container Network Receive Dropped Count | |||
network_rerror | - | count | NetRxError |
Container Network Receive Error | |||
Container Network Receive Error Count | |||
network_riops | - | count | NetRxIops |
Container Network Receive IOPS | |||
Container Network Receive Count | |||
network_wbps | - | byte | NetTxByes |
Container Network Transmit Byte | |||
Container Network Transmit Data Size | |||
network_wdropped | - | count | NetTxDropped |
Container Network Transmit Dropped | |||
Container Network Transmit Dropped Count | |||
network_werror | - | count | NetTxError |
Container Network Transmit Error | |||
Container Network Transmit Error Count | |||
network_wiops | - | count | NetTxIops |
Container Network Transmit IOPS | |||
Container Network Transmit Count | |||
node_cpu | - | percent | ConNodeCpu |
Container Work Node CPU Usage (%) | |||
CPU Usage of the Node where the container is running | |||
node_mem | - | percent | ConNodeMem |
Container Work Node Memory Usage (%) | |||
Memory Usage of the Node where the container is running | |||
phase | string | - | Pod lifecycle ① PENDING ② RUNNING ③ SUCCEEDED ④ FAILED ⑤ UNKNOWN |
restart_count | integer | - | ConRestartCnt |
Container Restart Count | |||
Number of container restarts | |||
state | integer | - | ConState |
Container Current State | |||
Container State Code ① RUNNING = 114 ② PAUSE = 112 ③ RESTARTING = 101 ④ OOMKILLED = 111 ⑤ DEAD = 100 ⑥ WAITING = 119 | |||
status | string | - | ConStatus |
Container Current Status | |||
Container State Information ① running: Displays the uptime information ② waiting/terminated: Displays the reason of the state |
mem_maxusage indicates the maximum memory usage recorded while the container was running. However, if the Linux kernel version is lower than 5.19, the raw data for this metric may not be supported. In this case, the value may be displayed as 0. Update the Linux kernel version to 5.19 or later to collect this metric.
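The percentage fields in the container table are simple ratios of the raw fields. The following is a minimal sketch that recomputes cpu_per_request, mem_working_set, and mem_per_request from one sample. The function names, the sample values, the inactive_file input, and the choice to return None when a request is not set are assumptions for illustration, not the agent's actual implementation.

```python
from typing import Optional


def percent_of(value: float, base: float) -> Optional[float]:
    """Return value / base * 100, or None when the base is unset (0 or less)."""
    if base <= 0:
        return None
    return value / base * 100


def working_set(mem_usage: int, inactive_file: int) -> int:
    """mem_working_set = mem_usage - inactive file.

    Clamping at zero is an assumption made for this sketch.
    """
    return max(mem_usage - inactive_file, 0)


# Hypothetical sample values for one container.
sample = {
    "cpu_total_milli": 250,       # millicores in use
    "cpu_request": 500,           # millicores requested
    "mem_usage": 800 * 1024**2,   # bytes
    "mem_request": 1024**3,       # bytes
}
inactive_file = 288 * 1024**2     # bytes of inactive file cache (assumed input)

ws = working_set(sample["mem_usage"], inactive_file)
# cpu_per_request = cpu_total_milli / cpu_request * 100
cpu_per_request = percent_of(sample["cpu_total_milli"], sample["cpu_request"])
# mem_per_request = mem_working_set / mem_request * 100
mem_per_request = percent_of(ws, sample["mem_request"])
print(cpu_per_request, mem_per_request)
```

Returning None rather than 0 for an unset request keeps "no request configured" distinguishable from "no usage" when the values are charted or alerted on.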
Kubernetes node (kube_node) metric
The kube_node category collects all custom labels set on the node as tags.
- Target: Cluster project, Namespace project
- Collection interval: 5 seconds
- Statistical data: 5 minutes, 1 hour
Tags
Tags | Type | Unit | Description |
---|---|---|---|
nodeName | - | - | Node name |
Fields
Field | Type | Unit | Description |
---|---|---|---|
allocatable_cpu | - | millicores | CPU size that can be assigned to node |
allocatable_memory | - | byte | Memory size that can be assigned to node |
allocatable_pods | integer | - | Number of Pods that can be assigned to node |
limit_cpu | - | millicores | Sum of node CPU limits |
limit_memory | - | byte | Sum of node memory limits |
pods | integer | - | Total number of node Pods |
request_cpu | - | millicores | Sum of node CPU requests |
request_memory | - | byte | Sum of node memory requests |
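One common use of these node fields is to check how much of a node's allocatable capacity is already committed. The sketch below assumes a single kube_node sample held in a dictionary; the sample values and the helper name commitment_percent are hypothetical.

```python
def commitment_percent(committed: float, allocatable: float) -> float:
    """Share of allocatable capacity that is committed, as a percentage."""
    if allocatable <= 0:
        raise ValueError("allocatable must be positive")
    return committed / allocatable * 100


# Hypothetical kube_node sample (CPU values in millicores).
node = {
    "allocatable_cpu": 4000,
    "request_cpu": 3000,
    "limit_cpu": 6000,
    "allocatable_pods": 110,
    "pods": 55,
}

cpu_request_pct = commitment_percent(node["request_cpu"], node["allocatable_cpu"])
cpu_limit_pct = commitment_percent(node["limit_cpu"], node["allocatable_cpu"])
pod_slots_left = node["allocatable_pods"] - node["pods"]

# A limit percentage above 100 means the limits on the node are overcommitted.
print(cpu_request_pct, cpu_limit_pct, pod_slots_left)
```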
Kubernetes event (kube_event) metric
The kube_event category collects cluster-wide data for cluster projects, and collects data only for events that occurred in the namespace for namespace projects.
- Target: Cluster project, Namespace project
- Collection interval: 5 seconds
- Statistical data: 5 minutes, 1 hour
Tags
Tags | Type | Unit | Description |
---|---|---|---|
field_path | - | - | Field Path |
kind | - | - | Object type on which the event occurred |
name | - | - | Kubernetes object name on which the event occurred |
namespace | - | - | Namespace on which the event occurred |
reason | - | - | Event occurrence cause |
type | - | - | Event type - Warning or Normal |
uid | - | - | UID - Object where an event occurred |
Fields
Field | Type | Unit | Description |
---|---|---|---|
action | string | - | Action name |
count | - | count | Event occurrence count |
event_time | Integer | - | Time stamp for the first event |
first_timestamp | Integer | - | First event occurrence time |
last_timestamp | Integer | - | Last event occurrence time |
message | string | - | Event Message |
reasonFiled | string | - | Event reason |
reporting_component | string | - | Component that reports the current event |
reporting_instance | string | - | Instance that reports the current event |
series_last_observed_time | Integer | - | Series last observed time |
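As an illustration of how the type and reason tags combine with the count field, the sketch below totals Warning events per reason. The records are hypothetical, and the sketch assumes each record carries an independent count; if the collected count is cumulative per event, summing it across samples would overstate the totals.

```python
from collections import Counter

# Hypothetical kube_event records (tags and fields flattened together).
events = [
    {"type": "Warning", "reason": "BackOff", "count": 3},
    {"type": "Normal", "reason": "Scheduled", "count": 1},
    {"type": "Warning", "reason": "FailedScheduling", "count": 2},
    {"type": "Warning", "reason": "BackOff", "count": 4},
]


def warning_counts(records):
    """Sum the count field per reason, keeping only Warning events."""
    totals = Counter()
    for record in records:
        if record["type"] == "Warning":
            totals[record["reason"]] += record["count"]
    return totals


totals = warning_counts(events)
print(totals.most_common())
```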
Kubernetes Cluster (kube_stat) metric
The kube_stat category collects cluster-wide data for cluster projects, and collects only the objects associated with the namespace for namespace projects.
- Target: Cluster project, Namespace project
- Collection interval: 5 seconds
- Statistical data: 5 minutes, 1 hour
Tags
Tags | Type | Unit | Description |
---|---|---|---|
name | - | - | kube_stat (fixed value) |
Fields
Field | Type | Unit | Description |
---|---|---|---|
alloctable_cpu | - | millicores | Number of all cores in the cluster (cluster project only) |
alloctable_ephemeral-storage | - | byte | Cluster-wide allocatable ephemeral storage (cluster project only) |
alloctable_hugepages-1gi | - | byte | Cluster-wide allocatable hugepages-1Gi (cluster project only) |
alloctable_hugepages-2mi | - | byte | Cluster-wide allocatable hugepages-2Mi (cluster project only) |
alloctable_memory | - | byte | Total memory allocatable in the cluster (cluster project only) |
alloctable_pods | Integer | - | Number of pods that can be allocated |
available_pod | Integer | - | Number of pods whose phase is in Running state |
desired_pod | Integer | - | Sum of the number of pods deployed without metadata.ownerReferences and the number of desired pods defined in Kubernetes objects (ReplicaSet, Daemonset, StatefulSet) |
Same as the number of pods retrieved by kubectl get pods -A | |||
nodes | Integer | - | Number of nodes |
pod_phase_Pending | Integer | - | Number of pending pods |
pod_phase_Running | Integer | - | Number of running pods |
running_containers | Integer | - | Number of running containers |
stopped_containers | Integer | - | Number of stopped containers |
total_available_cpu | Integer | - | Total allocatable CPU |
total_available_memory | Integer | - | Total sum of allocatable memory |
total_limit_cpu | - | millicores | Total sum of limit CPU |
total_limit_memory | - | byte | Total sum of limit memory |
total_request_cpu | - | millicores | Total sum of request CPU |
total_request_memory | - | byte | Total sum of request memory |
unavailable_pod | Integer | - | Number of pods whose phase is not in Running state (Pending, Failed, Succeeded) |
waiting_containers | Integer | - | Waiting container count |
Pod (kube_pod) metric
The kube_pod category collects all custom labels set on the Pod as tags.
- Target: Master (cluster) project, Namespace project
- Collection interval: 5 seconds
- Statistical data: 5 minutes
Tags
Tags | Type | Unit | Description |
---|---|---|---|
agentOid | - | - | Node agent ID (unique) |
agentPcode | - | - | Project code (unique) |
command | - | - | Execution command |
containerIds | - | - | Container ID that belongs to the Pod |
containerIdsCount | - | - | Number of containerIds |
containerKeys | - | - | Hash value for the container ID that belongs to the pod |
containerKeysCount | - | - | Number of containerKeys |
DaemonSet | - | - | DaemonSet name of the pod |
Deployment | - | - | Deployment name of the pod |
k8s-app | - | - | Value for the pod's label k8s-app |
microOid | - | - | ID of the agent running on the applications inside the Pod's container. |
microOids | - | - | Multiple IDs of the agents running on applications inside multiple containers in the pod |
microOidsCount | - | - | Number of microOids |
name | - | - | Pod Name |
onames | - | - | Name of the agent running on the applications inside the Pod's container. |
onamesCount | - | - | Number of onames |
podName | - | - | Pod Name |
namespace | - | - | Namespace to which the Pod belongs |
namespaceHash | - | - | Hash value of the namespace to which the Pod belongs |
replicaSetHash | - | - | Hash value of ReplicaSet of the Pod |
replicaSetName | - | - | ReplicaSet name of the Pod |
whatap_project | - | - | Name of the WhaTap project to which the Pod belongs |
Fields
Field | Type | Unit | Shortname, Name, Description |
---|---|---|---|
blkio_rbps | - | byte | IoReadBytes |
Pod Block I/O Read Byte | |||
Sum of bytes read per second across all block devices in the Pod | |||
blkio_riops | - | count | IoReadIops |
Pod Block I/O Read IOPS | |||
Sum of read operations per second across all block devices in the Pod | |||
blkio_wbps | - | byte | IoWriteBytes |
Pod Block I/O Write Byte | |||
Sum of bytes written per second across all block devices in the Pod | |||
blkio_wiops | - | count | IoWriteIops |
Pod Block I/O Write IOPS | |||
Sum of write operations per second across all block devices in the Pod | |||
cpu_per_limit | - | percent | CpuByLimit |
Pod CPU Usage by Limit (%) | |||
Pod CPU utilization by limit | |||
cpu_per_request | - | percent | CpuByRequest |
Pod CPU Usage by Request (%) | |||
Total CPU utilization based on the CPU requests | |||
cpu_quota_percent | - | percent | CpuLimitByNode |
Pod CPU Limit by Node (%) | |||
Pod CPU limit quota against the node - If the limit is not set, the total CPU cores of the node where the Pod is running appear as a percentage. | |||
cpu_sys | - | percent | CpuSysByNode |
Pod CPU Sys Usage by Node (%) | |||
Pod CPU System Utilization against Node CPU | |||
cpu_throttledperiods | - | count | CpuThrottledCnt |
Pod CPU Throttling Count | |||
Pod CPU Throttled Count | |||
cpu_throttledtime | - | nanosecond(ns) | CpuThrottledTime |
Pod CPU Throttling Time | |||
Pod CPU Throttled Time | |||
cpu_total | - | percent | CpuByNode |
Pod CPU Usage by Node (%) | |||
Pod CPU Utilization against Node CPU | |||
cpu_total_milli | - | millicores | CpuTotUsage |
Pod CPU Usage (millicore) | |||
Pod CPU usage | |||
cpu_user | - | percent | CpuUserByNode |
Pod CPU User Usage by Node (%) | |||
Pod CPU User Utilization against Node CPU | |||
cpu_request | - | millicores | CpuRequest |
Pod CPU Request (core) | |||
Pod CPU Request | |||
cpu_per_request | - | percent | CpuByRequest |
Pod CPU Usage by Request (%) | |||
Utilization against Pod CPU Request = cpu_total_milli/cpu_request * 100 | |||
mem_totalcache | - | byte | MemTotCache |
Pod Memory Total Cache (byte) | |||
Total Pod Cache Size | |||
mem_totalpgfault | - | count | MemTotPageFaultCnt |
Pod Memory Total Page Fault Count | |||
Pod's Page Fault Count | |||
mem_totalrss | - | byte | MemTotRss |
Pod Memory Total RSS (byte) | |||
Pod's Total RSS Memory Size | |||
mem_totalrss_percent | - | percent | MemTotRssByLimit |
Pod Memory Total RSS by Limit (%) | |||
Pod's Total RSS Memory Utilization | |||
mem_totalunevictable | - | byte | MemTotUnevictable |
Pod Memory Total Unevictable (byte) | |||
Pod's Total Unevictable Memory Size | |||
mem_usage | - | byte | MemUsage |
Pod Memory Usage (byte) | |||
Pod Memory Usage | |||
mem_working_set | - | byte | MemWs |
Pod Memory Working Set (byte) | |||
Pod Memory working set = mem_usage - inactive file | |||
memory_request | - | byte | MemRequest |
Pod Memory Request (byte) | |||
Pod memory requests | |||
memory_limit | - | byte | MemLimit |
Pod Memory Limit (byte) | |||
Pod memory limit quota | |||
memory_per_request | - | percent | MemByRequest |
Pod Memory Working Set by Request (%) | |||
Working Set usage based on the Pod memory request | |||
memory_per_limit | - | percent | MemByLimit |
Pod Memory Working Set by Limit (%) | |||
Working Set usage based on the Pod memory limit | |||
network_rbps | - | byte | NetRxBytes |
Pod Network Receive Byte | |||
Pod Network Receive Data Size | |||
network_rdropped | - | count | NetRxDropped |
Pod Network Receive Dropped | |||
Pod Network Receive Dropped Count | |||
network_rerror | - | count | NetRxError |
Pod Network Receive Error | |||
Pod Network Receive Error Count | |||
network_riops | - | count | NetRxIops |
Pod Network Receive IOPS | |||
Pod Network Receive Count | |||
network_wbps | - | byte | NetTxByes |
Pod Network Transmit Byte | |||
Pod Network Transmit Data Size | |||
network_wdropped | - | count | NetTxDropped |
Pod Network Transmit Dropped | |||
Pod Network Transmit Dropped Count | |||
network_werror | - | count | NetTxError |
Pod Network Transmit Error | |||
Pod Network Transmit Error Count | |||
network_wiops | - | count | NetTxIops |
Pod Network Transmit IOPS | |||
Pod Network Transmit Count | |||
phase | string | - | Phase |
Pod Current Phase | |||
Pod lifecycle ① PENDING ② RUNNING ③ SUCCEEDED ④ FAILED ⑤ UNKNOWN |
The following fields are reserved for internal use.
Field | Type | Unit | Description |
---|---|---|---|
kube_sless_normal | - | - | Number of Kubernetes informative events |
kube_sless_warning | - | - | Number of Kubernetes warning events |
micro_sful_critical | - | - | Number of APM events that are critical |
micro_sful_info | - | - | APM informative event count |
micro_sful_warning | - | - | APM warning event count |
micro_sless_critical | - | - | Number of APM events that are not critical |
micro_sless_info | - | - | Number of APM events that are not informative |
micro_sless_warning | - | - | Number of APM events that are not for warning |
sful_critical | - | - | Number of events that are critical in the metric |
sful_info | - | - | Number of events that are informative in the metric |
sful_warning | - | - | Number of events that are for warning in the metric |
sless_critical | - | - | Number of events that are not critical in the metric |
sless_info | - | - | Number of events that are not informative in the metric |
sless_warning | - | - | Number of events that are not for warning in the metric |
Kubernetes Pod Statistics (kube_pod_stat) metric
The kube_pod_stat category collects cluster-wide data for cluster projects, and collects data only for pods that belong to the namespace for namespace projects.
- Target: Cluster project, Namespace project
- Collection interval: 5 seconds
- Statistical data: 5 minutes, 1 hour
Tags
Tags | Type | Unit | Description |
---|---|---|---|
kind | - | - | Type - A cluster project has its fixed value, and a namespace project collects only the deployment or ReplicaSet. |
name | - | - | Kubernetes resource name - A cluster project has no name value and a namespace project has the name for Deployment or ReplicaSet. |
Fields
Field | Type | Unit | Description |
---|---|---|---|
available_pod | integer | - | Number of pods whose phase is in Running state |
desired_pod | integer | - | Sum of the number of pods deployed without metadata.ownerReferences and the number of desired pods defined in Kubernetes objects (ReplicaSet, Daemonset, StatefulSet) |
Same as the number of pods retrieved by kubectl get pods -A | |||
limit_cpu | - | millicores | CPU Limit Usage |
limit_memory | - | byte | Memory Limit Usage |
request_cpu | - | millicores | CPU Request Usage |
request_memory | - | byte | Memory Request Usage |
running_container | integer | - | Running Container Count |
stopped_container | integer | - | Stopped Container Count |
waiting_container | integer | - | Waiting container count |
Kubernetes Horizontal Pod Autoscaler (HPA) (kube_hpa_stat) metric
Metric collection starts only when HPA is added to the ClusterRole used by WhaTap.
- Target: Cluster project
- Collection interval: 5 seconds
- Statistical data: 5 minutes, 1 hour
Tags
Tags | Type | Unit | Description |
---|---|---|---|
name | - | - | HPA name |
Fields
Field | Type | Unit | Description |
---|---|---|---|
currentReplicas | integer | count | Current Replica Count |
desiredReplicas | integer | count | Desired Replica Count |
lastScaleTime | integer | count | Last scaled TimeStamp |
maxReplicas | integer | count | Maximum Replica Count |
minReplicas | integer | count | Minimum Replica Count |
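The replica fields are most useful when read together, for example to see whether an HPA has reached its ceiling. The sketch below derives a few flags from one kube_hpa_stat sample; the function name, the flag names, and the sample values are assumptions for illustration.

```python
def hpa_summary(current: int, desired: int, min_replicas: int, max_replicas: int) -> dict:
    """Derive simple flags from the HPA replica fields."""
    return {
        # The controller wants a different replica count than currently exists.
        "scaling_pending": desired != current,
        # No further scale-out is possible once the maximum is reached.
        "at_max": current >= max_replicas,
        "at_min": current <= min_replicas,
        # Replicas that can still be added before hitting maxReplicas.
        "headroom": max(max_replicas - current, 0),
    }


# Hypothetical sample: currentReplicas=8, desiredReplicas=10, minReplicas=2, maxReplicas=10.
summary = hpa_summary(current=8, desired=10, min_replicas=2, max_replicas=10)
print(summary)
```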
Process (kube_process) metrics
Kubernetes agent 1.7.12 or later is required. For more information about agent updates, see the following.
Kubernetes-related processes that exist in the node are collected during monitoring.
- Target: Cluster project, Namespace project
- Collection interval: 5 seconds
- Statistical data: 5 minutes
Tags
Tags | Type | Unit | Description |
---|---|---|---|
ppid | string | - | Parent process ID - /proc/[pid]/status::PPid |
pid | string | - | Process ID - /proc/[pid]/status::Pid |
cmd1 | string | - | Command name - /proc/[pid]/status::Name |
cmd2 | string | - | Command line (all commands and arguments) - /proc/[pid]/cmdline |
user | string | - | User ID or user name - /proc/[pid]/status::Uid |
onodeName | string | - | Node name of the process - Container system's environment variable ( NODE_IP ) |
createTime | TimeStamp | - | Process start time - Field calculated through /proc/uptime |
Fields
Field | Type | Unit | Description |
---|---|---|---|
cpu | float | percent(%) | CPU utilization - Field calculated through /proc/[pid]/stat |
memory | float | percent(%) | Memory utilization - Field calculated through /proc/[pid]/statm |
rss | long | byte | Actual memory usage (Resident Set Size) - VmRSS of /proc/[pid]/status |
uid | string | - | User ID or name - Uid of /proc/[pid]/status |
state | string | - | Process state - State of /proc/[pid]/status |
sharedMemory | long | byte | Shared memory size - Field calculated through /proc/[pid]/statm |
openFileDescriptors | integer | - | Number of file descriptors that the process has open - Field calculated through /proc/[pid]/fd |
vmSize | long | byte | Virtual memory size - VmSize of /proc/[pid]/status |
threads | integer | - | Number of threads created by processes - Threads of /proc/[pid]/status |
Linux process status in the Kubernetes environment
On Linux, the State field in the /proc/[pid]/status file displays the current state of the process. The meaning of each state is as follows:
Code | Description |
---|---|
R (Running) | The process is running or ready to run. |
S (Sleeping) | Interruptible sleep state, waiting for an event. |
D (Disk Sleep) | Non-interruptible sleep state, waiting for an I/O operation. |
Z (Zombie) | The process has been terminated, but the parent process has not yet collected its termination status. |
T (Stopped) | The process is stopped by a job control signal (such as SIGSTOP) or debugger. |
t (Tracing stop) | Tracing stopped - State being traced by the debugger (indicated by a lowercase t) |
X (Dead) | Dead state - The process is dead (commonly unseen) |
x (Dead) | Dead state - The kernel thread is dead (commonly unseen) |
K (WakeKill) | Forcibly terminated - It ignores any wakeup signal and is immediately dead. |
W (Waking) | Waking up - State of being woken up after receiving a wake-up signal |
I (Idle) | Kernel thread is idle (usually invisible to user space processes). |
Because Kubernetes manages the resources of containers and nodes efficiently, many processes running in containers are actually in a waiting state. As a result, most processes may be in the Sleeping state.
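To make the State field concrete, the sketch below extracts the state code from the text of a /proc/[pid]/status file and maps it to a readable name. It parses a hard-coded sample string so that it runs on any platform; the sample content and the function name are illustrative, and this is not how the agent itself is implemented.

```python
# State codes as reported in the State line of /proc/[pid]/status.
STATE_NAMES = {
    "R": "Running",
    "S": "Sleeping",
    "D": "Disk Sleep",
    "Z": "Zombie",        # terminated, not yet reaped by the parent
    "T": "Stopped",
    "t": "Tracing stop",
    "X": "Dead",
    "x": "Dead",
    "K": "WakeKill",
    "W": "Waking",
    "I": "Idle",
}


def parse_state(status_text: str):
    """Return (code, name) taken from the State line of a status file."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # The line looks like "State:\tS (sleeping)"; the code is the first token.
            code = line.split(":", 1)[1].split()[0]
            return code, STATE_NAMES.get(code, "Unknown")
    raise ValueError("no State line found")


# Hypothetical excerpt of a /proc/[pid]/status file.
sample_status = "Name:\tnginx\nUmask:\t0022\nState:\tS (sleeping)\nPid:\t1234\nPPid:\t1\n"
code, name = parse_state(sample_status)
print(code, name)
```

On a Linux host, the same function can be applied to the contents of a real status file, for example the text read from /proc/self/status.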
Agent status (agent_status_summary) metrics
This category collects metrics related to agent status every 10 seconds.
Fields
Field | Type | Unit | Description |
---|---|---|---|
inActTime | - | millisecond(ms) | Amount of time the agent remains inactive |
isActive | Boolean | - | Whether the current agent is active or not (true / false ) |
isRestart | Boolean | - | Whether the agent was restarted (true / false ) |
lastActTime | - | millisecond(ms) | Last time when the agent was activated - 0 : Disabled |
oid | - | - | Unique IDs for each agent in the project |
oType | - | - | Agent type - 1 : Application agent - 2 : See subType |
startTime | - | millisecond(ms) | Timestamp indicating the time when the agent was started |
subType | - | - | Agent type - 9 : Node agent - 10 : Master agent |
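The oType and subType fields have to be read together to identify an agent. A minimal sketch of that lookup, using only the codes listed in the table, is shown below; the function name and the fallback strings for unlisted codes are assumptions.

```python
from typing import Optional

# subType codes listed in the table; they apply only when oType is 2.
SUB_TYPES = {9: "Node agent", 10: "Master agent"}


def agent_kind(o_type: int, sub_type: Optional[int] = None) -> str:
    """Resolve the agent kind from the oType and subType fields."""
    if o_type == 1:
        return "Application agent"
    if o_type == 2:
        return SUB_TYPES.get(sub_type, "Unknown subType")
    return "Unknown oType"


print(agent_kind(1), agent_kind(2, 9), agent_kind(2, 10))
```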
Ingress (kube_ingress) metric
Kubernetes agent 1.7.13 or later is required. For more information about agent updates, see the following.
It collects metadata and related information for Ingress resources.
- Target: Cluster project, Namespace project
- Collection interval: 30 seconds
- Statistical data: 5 minutes
Tags
Tags | Type | Unit | Description |
---|---|---|---|
ingressUid | string | - | Unique ID of the Ingress resource |
ingressName | string | - | Name of the Ingress resource |
ingressNamespace | string | - | Namespace of the Ingress resource |
creationTimeMillis | Long | millisecond(ms) | Created time of the Ingress resource |
ingressClassName | string | - | Name of the Ingress class |
ingressLoadBalancerIps | List | - | IP of the Ingress load balancer |
Fields
Field | Type | Unit | Description |
---|---|---|---|
host | List | - | Host name that the Ingress resource listens to (if * , it applies to all hosts) |
path | List | - | Request path under a specific host |
backendServiceName | List | - | Name of the service passed to the backend |
backendServicePort | List | - | Port number passed to the backend |
backendServiceUid | List | - | UID of the service passed to the backend |
pathType | List | - | Path matching method (e.g. Prefix , Exact ) |
Kubernetes cron job (kube_cronjob) metric
It collects metadata and status information for CronJob resources by using the tags.
Tags
Tags | Type | Unit | Description |
---|---|---|---|
cronJobName | string | - | CronJob name (metadata.name ) |
namespace | string | - | CronJob namespace (metadata.namespace ) |
cronJobUid | string | - | CronJob UID (metadata.uid ) |
Fields
Field | Type | Unit | Description |
---|---|---|---|
lastScheduleTime | string | - | Last time when the CronJob was scheduled (status.lastScheduleTime ) |
lastSuccessfulTime | string | - | Last time when the CronJob succeeded (status.lastSuccessfulTime ) |
cronJobSpecSchedule | string | - | Scheduling cycle of the CronJob (spec.schedule , e.g. "0 */5 * * *" ) |
successfulJobsHistoryLimit | integer | count | Maximum number of successful jobs to keep (spec.successfulJobsHistoryLimit ) |
failedJobsHistoryLimit | integer | count | Maximum number of failed jobs to keep (spec.failedJobsHistoryLimit ) |
concurrencyPolicy | string | - | Concurrency policy (Allow , Forbid , Replace ) (spec.concurrencyPolicy ) |
startingDeadlineSeconds | integer | second | Maximum delay time when the job can be run after its scheduled time (spec.startingDeadlineSeconds ) |
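The cronJobSpecSchedule value uses the standard five-field cron layout. The sketch below splits such a string into named fields; the field labels and the function name are chosen for illustration, and the sketch deliberately rejects anything that is not a five-field expression (macros such as @hourly are not handled).

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_schedule(schedule: str) -> dict:
    """Split a five-field cron expression into named parts."""
    parts = schedule.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {schedule!r}")
    return dict(zip(CRON_FIELDS, parts))


fields = split_schedule("0 */5 * * *")
# minute "0" with hour "*/5": at minute 0 of every fifth hour.
print(fields)
```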
Kubernetes deployment (kube_deployment) metric
It collects metadata and status information for deployment resources by using the tags.
Tags
Tags | Type | Unit | Description |
---|---|---|---|
deployName | String | - | Deployment name (metadata.name ) |
namespace | String | - | Deployment namespace (metadata.namespace ) |
deployUid | String | - | Deployment UID (metadata.uid ) |
Fields
Field | Type | Unit | Description |
---|---|---|---|
deployReadyReplicas | integer | count | Number of Pods that are in ready status (status.readyReplicas ) |
deployTotalReplicas | integer | count | Total number of set Pods (spec.replicas ) |
deployUpdatedReplicas | integer | count | Number of Pods updated to the latest version (status.updatedReplicas ) |
deployAvailableReplicas | integer | count | Number of Pods in available state (status.availableReplicas ) |
deployUnavailableReplicas | integer | count | Number of Pods in unavailable state (status.unavailableReplicas ) |
deployCreationTime | integer | millisecond(ms) | Deployment creation time (Unix Epoch milliseconds, metadata.creationTimestamp ) |
deploySelector | string | - | Selector label of the deployment (spec.selector.matchLabels ) |
deployStrategy | string | - | Deployment strategy (spec.strategy.type , e.g. RollingUpdate , Recreate ) |
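The replica fields can be combined to judge whether a rollout has settled. The sketch below is one possible check, assuming a single kube_deployment sample in a dictionary; the function name, the sample values, and the rule itself (all replicas updated and available, none unavailable) are assumptions rather than a definition used by the product.

```python
def rollout_complete(deploy: dict) -> bool:
    """True when every desired replica is updated and available."""
    total = deploy["deployTotalReplicas"]
    return (
        deploy["deployUpdatedReplicas"] == total
        and deploy["deployAvailableReplicas"] == total
        # Treat a missing unavailable count as zero.
        and deploy.get("deployUnavailableReplicas", 0) == 0
    )


# Hypothetical samples: one mid-rollout, one settled.
rolling = {
    "deployTotalReplicas": 4,
    "deployUpdatedReplicas": 2,
    "deployAvailableReplicas": 3,
    "deployUnavailableReplicas": 1,
}
settled = {
    "deployTotalReplicas": 4,
    "deployUpdatedReplicas": 4,
    "deployAvailableReplicas": 4,
}
print(rollout_complete(rolling), rollout_complete(settled))
```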
Kubernetes daemonset (kube_daemonset) metric
It collects metadata and status information for daemonset resources by using the tags.
Tags
Tags | Type | Unit | Description |
---|---|---|---|
daemonSetName | string | - | DaemonSet name (metadata.name ) |
namespace | string | - | DaemonSet namespace (metadata.namespace ) |
daemonSetUid | string | - | DaemonSet UID (metadata.uid ) |
creationTime | integer | millisecond(ms) | DaemonSet creation time (metadata.creationTimestamp ) |
age | string | - | Elapsed time since DaemonSet creation |
Fields
Field | Type | Unit | Description |
---|---|---|---|
currentNumberScheduled | integer | count | Number of scheduled Pods (status.currentNumberScheduled ) |
desiredNumberScheduled | integer | count | Number of desired Pods (status.desiredNumberScheduled ) |
numberReady | integer | count | Number of Pods in ready state (status.numberReady ) |
numberAvailable | integer | count | Number of available Pods (status.numberAvailable ) |
numberMisscheduled | integer | count | Number of wrongly scheduled Pods (status.numberMisscheduled ) |
updatedNumberScheduled | integer | count | Number of updated Pods (status.updatedNumberScheduled ) |
daemonSetSelector | string | - | Selector label (spec.selector.matchLabels ) |
Kubernetes replica set (kube_replicaset) metric
It collects metadata and status information for ReplicaSet resources with the tags.
Tags
Tags | Type | Unit | Description |
---|---|---|---|
replicaSetName | string | - | ReplicaSet name (metadata.name ) |
namespace | string | - | ReplicaSet namespace (metadata.namespace ) |
replicaSetUid | string | - | ReplicaSet UID (metadata.uid ) |
ownerKind | string | - | Type of the upper layer object (e.g. Deployment ) (metadata.ownerReferences.kind ) |
ownerName | string | - | Name of the upper layer object (metadata.ownerReferences.name ) |
ownerUid | string | - | UID of the upper layer object (metadata.ownerReferences.uid ) |
creationTime | integer | millisecond(ms) | ReplicaSet creation time (metadata.creationTimestamp ) |
age | string | - | Elapsed time since creation of ReplicaSet |
Fields
Field | Type | Unit | Description |
---|---|---|---|
replicaSetReplicas | integer | count | Number of set Pods (spec.replicas ) |
replicaSetReadyReplicas | integer | count | Number of Pods in ready state (status.readyReplicas ) |
replicaSetAvailableReplicas | integer | count | Number of available Pods (status.availableReplicas ) |
replicaSetSelector | string | - | Selector label (spec.selector.matchLabels ) |
Kubernetes stateful set (kube_statefulSet) metric
It collects metadata and status information for StatefulSet resources with the tags.
Tags
Tags | Type | Unit | Description |
---|---|---|---|
statefulSetName | string | - | Name of StatefulSet (metadata.name ) |
namespace | string | - | Namespace to which the StatefulSet belongs (metadata.namespace ) |
statefulSetUid | string | - | UID of StatefulSet (metadata.uid ) |
creationTime | integer | millisecond(ms) | StatefulSet creation time (metadata.creationTimestamp ) |
age | string | - | Elapsed time since StatefulSet creation |
Fields
Field | Type | Unit | Description |
---|---|---|---|
replicas | integer | count | Number of set Pods (spec.replicas ) |
readyReplicas | integer | count | Number of Pods in ready state (status.readyReplicas ) |
currentReplicas | integer | count | Number of created Pods (status.currentReplicas ) |
updatedReplicas | integer | count | Number of updated Pods (status.updatedReplicas ) |
statefulSetSelector | string | - | Selector label (spec.selector.matchLabels ) |
Kubernetes job (kube_job) metric
It collects metadata and status information for Job resources by using the tags.
Tags
Tags | Type | Unit | Description |
---|---|---|---|
jobName | string | - | Job name (metadata.name ) |
namespace | string | - | Namespace of the Job (metadata.namespace ) |
jobUid | string | - | Job UID (metadata.uid ) |
ownerKind | string | - | Parent object kind (e.g. CronJob ) |
ownerName | string | - | Parent object name |
ownerUid | string | - | Parent object UID |
Fields
Field | Type | Unit | Description |
---|---|---|---|
succeededPods | integer | count | Number of succeeded Pods (status.succeeded ) |
failedPods | integer | count | Number of failed Pods (status.failed ) |
completionTime | string | - | Job completion time (status.completionTime.toString() ) |
parallelism | integer | count | Number of Pods set to run concurrently (spec.parallelism ) |
completions | integer | count | Number of Pod executions required for a job to be completed (spec.completions ) |
backoffLimit | integer | count | Maximum number of possible retries upon failure (spec.backoffLimit ) |
activeDeadlineSeconds | integer | second(sec) | Maximum time for a job to run (spec.activeDeadlineSeconds ) |
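The Job fields can be read together to estimate where a Job stands. The sketch below is a simplified reading that ignores activeDeadlineSeconds and parallelism; the function name, the outcome labels, and the thresholds are assumptions for illustration and do not reproduce the Kubernetes Job controller's own conditions.

```python
def job_outcome(succeeded: int, failed: int, completions: int, backoff_limit: int) -> str:
    """Classify a Job from succeededPods, failedPods, completions, and backoffLimit."""
    if succeeded >= completions:
        return "complete"
    # Simplification: more failed Pods than allowed retries counts as failed.
    if failed > backoff_limit:
        return "failed"
    return "running"


# Hypothetical samples with completions=3 and backoffLimit=6.
print(job_outcome(3, 0, 3, 6), job_outcome(1, 7, 3, 6), job_outcome(1, 2, 3, 6))
```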