Skip to main content

PostgreSQL metrics

Database metrics are a comprehensive collection of database workloads and resource usages. The WhaTap monitoring service continuously collects metrics to identify bottlenecks in the database and provides data to help identify the root causes of performance issues.

For more information about other metrics, see the link.

agent_status_summary

This category collects metrics related to agent status every 10 seconds.

Fields

Field nameUnitDescriptionRemarks
inActTimeMillisecond(ms)Amount of time the agent remains inactive-
isActive-Whether the agent is active or nottrue / false
isRestart-Whether the agent has been recently restartedtrue / false
lastActTimeMillisecond(ms)Time when the agent was last activated0: If disabled
oid-Unique IDs for each agent in the project-
startTimeMillisecond(ms)Timestamp when the agent was started.-

db_postgresql_counter

Tags

The following lists the tags that are collected in the db_postgresql_counter category.

Tag nameDescriptionRemarks
agentIpIP address per agent.-
dbIsMultiWhether to support the multitenant or not.-
dbNameDatabase name.-
dbTypeDatabase type.-
dbVersionVersion of the database in use.-
onameAgent nameUnique value

Field

The following lists the fields that are collected in the db_postgresql_counter category.

MetricsDescriptionCategory
active_sessionsNumber of active sessions.Connection
CPUUtilizationPercentage of CPU resources being used by the database instances.Performance
FreeStorageSpaceAvailable storage space for the instance.I/O
FreeableMemoryAmount of memory that can be used immediately by the system.I/O
NetworkReceiveThroughputSpeed of network data received by the instanceTroughput
NetworkTransmitThroughputSpeed of network data transmitted by the instanceTroughput
blk_read_timeTime taken to read a data block in the backend (in milliseconds).I/O
blk_write_timeTime taken to write a data block in the backend (milliseconds).I/O
blks_hitTotal count where a disk block was found in the buffer cache and did not require a read.I/O
blks_hit_ratioPercentage of queries retrieved from the cache out of the total queries.
Calculation method: sum(blks_hit)/sum(blks_hit + blks_read) * 100
I/O
blks_readNumber of data blocks read directly from disk.I/O
buffers_checkpointNumber of buffers flushed during checkpoint.Info
check_pointNumber of cases a database checkpoint occurs.Troughput
checkpoint_write_timeTime (milliseconds) taken to write data during checkpoint.Troughput
checkpoints_timedNumber of checkpoints occurred according to the checkpoint_timeout setting.Troughput
commit_countNumber of transaction commits.Troughput
confl_bufferpinNumber of session conflicts due to buffer pin contention.Troughput
confl_deadlockNumber of deadlock conflicts.Troughput
confl_lockNumber of session conflicts due to lock contention.Troughput
confl_snapshotNumber of session conflicts due to snapshot contention.Troughput
confl_tablespaceNumber of session conflicts due to tablespace contention.Troughput
conflictsNumber of queries that were aborted because they conflicted with database recovery operations.Troughput
connect_errorNumber of database connection failures.Connection
deadlock_countTotal number of deadlocks occurred.Performance
dml_countTotal number of DML (Data Manipulation Language) operations.Troughput
fatal_countNumber of fatal errors.Info
idle_trxNumber of idle transactions.Connection
idle_trx_timeCumulative time of idle transactions.Connection
idx_scanWith this method, table records are accessed using the keys stored in index leaf blocks. This metric has the values queried every minute.Performance
index_hit_ratioPercentage of index queries retrieved from the cache.Performance
lock_wait_sessionsNumber of pending sessions.Performance
long_running_sessionsNumber of long running sessions.Performance
max_connectionsMaximum number of connections allowed. Value set in pg_settings. If this value is insufficient, the 'FATAL ERROR: too many clients already.' Errors may occur.Info
max_runtimeExecution time of the longest running query.Performance
node_stateThis database consists of replication relationships. 1: Master, 2: Slave.Info
numbackendsNumber of backend processes connected to the database.Info
oldest_cur_xidThe oldest active transaction ID.Info
per_to_emergency_autovacNumber of emergency automatic vacuum occurrences.Performance
per_to_wraparoundNumber of vacuum operations to prevent transaction ID wraparound.Performance
replication_countNumber of active replication databases (Slave).Performance
replication_delayReplication delay time. Delay time reflected in the replication database for the replication structure, this is a metric measured in the slave DB.Performance
replication_lagExtent to which replicated data lags behind the original data.Performance
rollback_countNumber of transaction rollbacks.Performance
seq_scanMethod to read records by performing a full scan on the table. Select this when there is no index or when the range to be read is wide even if an index exists. This metric has the values queried every minute.Performance
table_hit_ratioPercentage of queries retrieved from the cache.Performance
temp_bytesTotal amount of bytes written to temporary files by queries in the database.Performance
temp_filesNumber of temporary files created by queries in the database.Performance
total_runtimeTotal query execution time.Performance
total_sessionsTotal number of connected sessions.Connection
tpsNumber of processed transactions per second.Performance
tup_deletedNumber of tuples deleted by queries in the database.Troughput
tup_fetchedNumber of tuples retrieved from the database by queries.Troughput
tup_insertedNumber of tuples inserted by queries in the database.Troughput
tup_returnedNumber of tuples returned by queries in the database.Troughput
tup_updatedNumber of tuples updated by queries in the database.Troughput
uptimeDatabase uptime.Info
vacuum_running_countNumber of vacuum operations running.Performance
wait_sessionsNumber of waiting sessions.Performance
wal_countNumber of Write-Ahead Logging (WAL) records written. Spikes in the WAL file counts are usually caused by the large amount of changes in a short period or vacuum-like maintenance operations that generate temporary tables and objects. They should slowly return to normal levels. This usually results in lots of disk I/O and CPU activity, slowing the application query's execution until things get back to normal. This metric has the values queried every minute.Performance
warning_countNumber of warnings occurred.Info

XOS-related metrics

The following describes the metrics collected from the XOS agent.

MetricUnitDescription
cpu(xos)pctCPU utilization, cpu_user + cpu_sys + cpu_iowait.
cpu_user(xos)pctTime spent for CPU to execute user code.
cpu_sys(xos)pctTime spent for CPU to execute the kernel code.
cpu_iowait(xos)pctCPU time waited for IO.
cpu_idle(xos)pctCPU processor's idle time.
disk_reads(xos)blockAmount of data read from disk.
disk_read_bytes(xos)byteAmount of data read from disk.
disk_writes(xos)blockAmount of data written to disk.
disk_write_bytes(xos)byteAmount of data written to disk.
mem(xos)pctMemory utilization.
mem_total(xos)MBTotal amount of memory.
mem_free(xos)MBAmount of memory not in use by the system.
mem_used(xos)MBAmount of memory in use by the system.
mem_buffcache(xos)MBAmount of memory used by the buff/cache.
mem_swaptotal(xos)MBSet this in preparation for the case where the total Swap size and physical memory are insufficient.
mem_swapfree(xos)MBBecause the unused Swap area and Swap area are secured in disk space, the system performance may drop drastically while using Swap.
mem_available(xos)MBAvailable memory area.
net_recv_bytes(xos)byteAmount of data received over the network.
net_recv_packets(xos)-Number of data packets received over the network.
net_send_bytes(xos)byteAmount of data transmitted over the network.
net_send_packets(xos)-Number of data packets transmitted over the network.