PostgreSQL metrics
Database metrics are a comprehensive collection of database workloads and resource usages. The WhaTap monitoring service continuously collects metrics to identify bottlenecks in the database and provides data to help identify the root causes of performance issues.
For more information about other metrics, see the link.
agent_status_summary
This category collects metrics related to agent status every 10 seconds.
Fields
Field name | Unit | Description | Remarks |
---|---|---|---|
inActTime | Millisecond(ms) | Amount of time the agent remains inactive | - |
isActive | - | Whether the agent is active or not | true / false |
isRestart | - | Whether the agent has been recently restarted | true / false |
lastActTime | Millisecond(ms) | Time when the agent was last activated | 0 : If disabled |
oid | - | Unique IDs for each agent in the project | - |
startTime | Millisecond(ms) | Timestamp when the agent was started. | - |
db_postgresql_counter
Tags
The following lists the tags that are collected in the db_postgresql_counter
category.
Tag name | Description | Remarks |
---|---|---|
agentIp | IP address per agent. | - |
dbIsMulti | Whether to support the multitenant or not. | - |
dbName | Database name. | - |
dbType | Database type. | - |
dbVersion | Version of the database in use. | - |
oname | Agent name | Unique value |
Field
The following lists the fields that are collected in the db_postgresql_counter
category.
Metrics | Description | Category |
---|---|---|
active_sessions | Number of active sessions. | Connection |
CPUUtilization | Percentage of CPU resources being used by the database instances. | Performance |
FreeStorageSpace | Available storage space for the instance. | I/O |
FreeableMemory | Amount of memory that can be used immediately by the system. | I/O |
NetworkReceiveThroughput | Speed of network data received by the instance | Troughput |
NetworkTransmitThroughput | Speed of network data transmitted by the instance | Troughput |
blk_read_time | Time taken to read a data block in the backend (in milliseconds). | I/O |
blk_write_time | Time taken to write a data block in the backend (milliseconds). | I/O |
blks_hit | Total count where a disk block was found in the buffer cache and did not require a read. | I/O |
blks_hit_ratio | Percentage of queries retrieved from the cache out of the total queries. Calculation method: sum( blks_hit )/sum(blks_hit + blks_read ) * 100 | I/O |
blks_read | Number of data blocks read directly from disk. | I/O |
buffers_checkpoint | Number of buffers flushed during checkpoint. | Info |
check_point | Number of cases a database checkpoint occurs. | Troughput |
checkpoint_write_time | Time (milliseconds) taken to write data during checkpoint. | Troughput |
checkpoints_timed | Number of checkpoints occurred according to the checkpoint_timeout setting. | Troughput |
commit_count | Number of transaction commits. | Troughput |
confl_bufferpin | Number of session conflicts due to buffer pin contention. | Troughput |
confl_deadlock | Number of deadlock conflicts. | Troughput |
confl_lock | Number of session conflicts due to lock contention. | Troughput |
confl_snapshot | Number of session conflicts due to snapshot contention. | Troughput |
confl_tablespace | Number of session conflicts due to tablespace contention. | Troughput |
conflicts | Number of queries that were aborted because they conflicted with database recovery operations. | Troughput |
connect_error | Number of database connection failures. | Connection |
deadlock_count | Total number of deadlocks occurred. | Performance |
dml_count | Total number of DML (Data Manipulation Language) operations. | Troughput |
fatal_count | Number of fatal errors. | Info |
idle_trx | Number of idle transactions. | Connection |
idle_trx_time | Cumulative time of idle transactions. | Connection |
idx_scan | With this method, table records are accessed using the keys stored in index leaf blocks. This metric has the values queried every minute. | Performance |
index_hit_ratio | Percentage of index queries retrieved from the cache. | Performance |
lock_wait_sessions | Number of pending sessions. | Performance |
long_running_sessions | Number of long running sessions. | Performance |
max_connections | Maximum number of connections allowed. Value set in pg_settings . If this value is insufficient, the 'FATAL ERROR: too many clients already.' Errors may occur. | Info |
max_runtime | Execution time of the longest running query. | Performance |
node_state | This database consists of replication relationships. 1 : Master, 2 : Slave. | Info |
numbackends | Number of backend processes connected to the database. | Info |
oldest_cur_xid | The oldest active transaction ID. | Info |
per_to_emergency_autovac | Number of emergency automatic vacuum occurrences. | Performance |
per_to_wraparound | Number of vacuum operations to prevent transaction ID wraparound. | Performance |
replication_count | Number of active replication databases (Slave). | Performance |
replication_delay | Replication delay time. Delay time reflected in the replication database for the replication structure, this is a metric measured in the slave DB. | Performance |
replication_lag | Extent to which replicated data lags behind the original data. | Performance |
rollback_count | Number of transaction rollbacks. | Performance |
seq_scan | Method to read records by performing a full scan on the table. Select this when there is no index or when the range to be read is wide even if an index exists. This metric has the values queried every minute. | Performance |
table_hit_ratio | Percentage of queries retrieved from the cache. | Performance |
temp_bytes | Total amount of bytes written to temporary files by queries in the database. | Performance |
temp_files | Number of temporary files created by queries in the database. | Performance |
total_runtime | Total query execution time. | Performance |
total_sessions | Total number of connected sessions. | Connection |
tps | Number of processed transactions per second. | Performance |
tup_deleted | Number of tuples deleted by queries in the database. | Troughput |
tup_fetched | Number of tuples retrieved from the database by queries. | Troughput |
tup_inserted | Number of tuples inserted by queries in the database. | Troughput |
tup_returned | Number of tuples returned by queries in the database. | Troughput |
tup_updated | Number of tuples updated by queries in the database. | Troughput |
uptime | Database uptime. | Info |
vacuum_running_count | Number of vacuum operations running. | Performance |
wait_sessions | Number of waiting sessions. | Performance |
wal_count | Number of Write-Ahead Logging (WAL) records written. Spikes in the WAL file counts are usually caused by the large amount of changes in a short period or vacuum-like maintenance operations that generate temporary tables and objects. They should slowly return to normal levels. This usually results in lots of disk I/O and CPU activity, slowing the application query's execution until things get back to normal. This metric has the values queried every minute. | Performance |
warning_count | Number of warnings occurred. | Info |
XOS-related metrics
The following describes the metrics collected from the XOS agent.
Metric | Unit | Description |
---|---|---|
cpu(xos) | pct | CPU utilization, cpu_user + cpu_sys + cpu_iowait . |
cpu_user(xos) | pct | Time spent for CPU to execute user code. |
cpu_sys(xos) | pct | Time spent for CPU to execute the kernel code. |
cpu_iowait(xos) | pct | CPU time waited for IO. |
cpu_idle(xos) | pct | CPU processor's idle time. |
disk_reads(xos) | block | Amount of data read from disk. |
disk_read_bytes(xos) | byte | Amount of data read from disk. |
disk_writes(xos) | block | Amount of data written to disk. |
disk_write_bytes(xos) | byte | Amount of data written to disk. |
mem(xos) | pct | Memory utilization. |
mem_total(xos) | MB | Total amount of memory. |
mem_free(xos) | MB | Amount of memory not in use by the system. |
mem_used(xos) | MB | Amount of memory in use by the system. |
mem_buffcache(xos) | MB | Amount of memory used by the buff/cache. |
mem_swaptotal(xos) | MB | Set this in preparation for the case where the total Swap size and physical memory are insufficient. |
mem_swapfree(xos) | MB | Because the unused Swap area and Swap area are secured in disk space, the system performance may drop drastically while using Swap. |
mem_available(xos) | MB | Available memory area. |
net_recv_bytes(xos) | byte | Amount of data received over the network. |
net_recv_packets(xos) | - | Number of data packets received over the network. |
net_send_bytes(xos) | byte | Amount of data transmitted over the network. |
net_send_packets(xos) | - | Number of data packets transmitted over the network. |