Skip to main content
Version: Cloud

Monitor Postgres DB in AWS RDS

Overview

The metrics of Postgres DB running in AWS are gathered by sfPoller and displayed within the dashboard of SnappyFlow.

Prerequisites

To collect metrics of Postgres DB hosted on AWS, it is necessary to have a sfpoller set up within your AWS environment.

Click here to learn more about setting up sfpoller in your AWS environment.

Configure sfPoller

Ensure that both the project and application are created within sfPoller. If they have not been created yet, you can click here to learn how to create a project and application in sfPoller.

Follow the below step to add endpoints and plugins:

  1. In the Application tab of sfPoller, navigate to your Project > Application.

  2. Click on the Application, it will take you to the Endpoint page.

  3. Click the Add Endpoint button, add the following data, and save.

    • Account Type: Select account type as AWS

    • Account Name: Name of the AWS account

    • Endpoint Type: Postgres

    • Name: Give a meaningful name to the endpoint

    • IP: Add the application IP address

  4. In the Plugins window, click the +Add button.

  5. In the Add Plugin window, select the below details and save.

    • Plugin Type: Metric
    • Plugin: Postgres
    • Interval: Choose an interval value. The minimum value for the interval is 300
  6. Click the global Save button in the window's top right corner to save all the changes made so far.

Replication

PostgreSQL includes built-in binary replication based on shipping the changes write ahead logs(WAL) to replica nodes asynchronously, with the ability to run read-only queries against these replicated nodes. This allows splitting read traffic among multiple nodes efficiently.

PostgreSQL includes built-in synchronous replication that ensures that, for each write transaction, the master waits until at least one replica node has written the data to its transaction log. This can be useful for workloads that do not require such guarantees, and may not be wanted for all data as it slows down performance due to the requirement of the confirmation of the transaction reaching the synchronous standby.

note

Replication is supported only on AWS RDS PostgreSQL versions between 10 and 13.

User with root access is required to collect replication metrics from PostgreSQL.

View Database Metrics

Follow the below steps to view the metrics collected from Postgres DB.

  1. Go to the Application tab in SnappyFlow and navigate to your Project > Application > Dashboard.

  2. You can view the database metrics in the Metrics section.

    note

    Once plugins are added to sfPoller, they will be automatically detected within the Metrics section. However, if the plugins are not detected, you can import templates to view the corresponding metrics.

  3. To access the unprocessed data gathered from the plugins, navigate to the Browse data section and choose the Index: Metric, Instance: Endpoint, Plugin, and Document Type.

Template Details

TemplatePluginsDocument Type
PostgreSQLpostgresserverDetails, databaseDetails, tableDetails, indexDetails, queryDetails
PostgreSQL_ReplicationpostgresmasterReplicationDetails, replicationSlotDetails, slaveReplicationDetails

Metrics list

Server Details
MetricDescription
numCreatedTempFilesTotal number of temporary files created .
numTransactionsTotal number of transactions that have been committed and rolled back.
archiveFailCountNumber of failed attempts for archiving WAL files.
activeProcessesNumber of active process running in the server.
archivedCountNumber of WAL files that have been successfully archived.
numUpdateTotal number of rows updated by queries.
numRollbackTotal number of transactions that have been rolled back.
hostThe host address of the instance.
cacheHitsNumber of times disk blocks were found already in the buffer cache.
idle_in_transactionNumber of idle_in_transaction process running in the server.
numSelectTotal number of live rows fetched by select queries.
maxConnectionsThe maximum number of concurrent connections to the database server.
tempFileSizeTotal amount of data written to temporary files by queries.
idleProcessesNumber of idle process waiting in the server.
numCommitTotal number of transactions that have been committed.
cacheSizeThe effective size of the disk cache that is available to a single query.
numConnectionsTotal number of backends currently connected to this server.
indexSizeTotal size of the indices.
bufferSizeAmount of memory the database server uses for shared memory buffers.
dbSizeTotal size of the databases.
upTimeTime difference between start time and current time.
abortedConnectionNumber of aborted process in the server.
numInsertTotal number of rows inserted by queries in this database.
numDeleteTotal number of rows deleted by queries in this database.
cacheHitRatioCache Hit Percentage on the server.
numDatabasesNumber of Databases.
versionPostgreSQL version of the server.
Database Details
MetricDescription
blkReadTimeTime spent writing data file blocks by backends in this database, in milliseconds.
blocksReadNumber of disk blocks read in this database.
transPerSecNumber of transaction in this database per second.
numDeadTupleEstimated number of dead rows in database.
numTransactionsNumber of transactions in this database that have been committed and rolled back.
numLiveTupleEstimated number of live rows in database.
numFetchNumber of live rows fetched by index scans in this database.
numRollbackNumber of transactions in this database that have been rolled back.
numUpdateNumber of rows updated by queries in this database.
hostThe host address of the instance.
numReturnNumber of live rows fetched by sequential scans and index entries returned by index scans in this database.
tempFileSizeTotal amount of data written to temporary files by queries in this database.
_dbNameName of the database.
numCommitNumber of transactions in this database that have been committed.
blocksHitNumber of times disk blocks were found already in the buffer cache, so that a read was not necessary (this only includes hits in the PostgreSQL buffer cache, not the operating system's file system cache).
blkWriteTimeTime spent writing data file blocks by backends in this database, in milliseconds.
numTempFileNumber of temporary files created by queries in this database.
indexSizeTotal disk space used by all indexes in the database.
dbSizeSize of the database.
numInsertNumber of rows inserted by queries in this database.
numDeleteNumber of rows deleted by queries in this database.
cacheHitRatioPercentage of cacheHit of this database.
indexHitRatioPercentage of indexHit of this table.
numTablesNumber of tables in this database.
Table Details
MetricDescription
numDeadTupleEstimated number of dead rows.
numLiveTupleEstimated number of live rows.
numUpdateNumber of rows updated.
_tableNameName of the table.
_dbNameName of the database for this table.
tableSizeTotal disk space used by the specified table.
idxBlksReadNumber of disk blocks read from all indexes on this table.
hostThe host address of the instance.
seqScanNumber of sequential scans initiated on this table.
indexScanFetchNumber of live rows fetched by index scans.
heapBlksReadNumber of disk blocks read from this table.
idxBlksHitNumber of buffer hits in all indexes on this table.
heapBlksHitNumber of buffer hits in this table.
seqScanFetchNumber of live rows fetched by sequential scans.
indexSizeTotal disk space used by indexes attached to the specified table.
numInsertNumber of rows inserted.
indexScanNumber of index scans initiated on this table.
numDeleteNumber of rows deleted.
cacheHitRatioPercentage of cacheHit of this table.
_schemaNameName of the schema of this table.
indexHitRatioPercentage of indexHit of this table.
Index Details
MetricDescription
_indexNameName of this index.
blksHitNumber of buffer hits in this index.
_tableNameName of the table for this index.
blksReadNumber of disk blocks read from this index.
numFetchNumber of live table rows fetched by simple index scans using this index.
indexScanNumber of index scans initiated on this index.
_schemaNameName of the schema of this index.
_dbNameName of the database for this index.
hostThe host address of the instance.
numReturnNumber of index entries returned by scans on this index.
Query Details
MetricDescription
runtimePeriod of time for the current query is running.
userNameName of the user logged into this backend.
wait_eventTrue if backend is currently waiting, otherwise False.
_dbNameName of the database this backend is connected to.
hostThe host address of the instance.
stateCurrent overall state of this backend. Possible values are:active, idle, idle in transaction, idle in transaction (aborted), fastpath function call, disabled.
_queryNameIf state is active this field shows the currently executing query. In all other states, it shows the last query that was executed.
Master Replication Details
MetricDescription
writeLagTime elapsed between flushing recent WAL locally and receiving notification that this standby server has written it.
replayLSNLast write-ahead log location replayed into the database on this standby server.
sentLSNLast write-ahead log location sent on this connection.
pidProcess ID of a WAL sender process.
clientAddressIP address of the client connected to this WAL sender. If this field is null, it indicates that the client is connected via a Unix socket on the server machine.
replayLagTime elapsed between flushing recent WAL locally and receiving notification that this standby server has written, flushed and applied it.
syncPriorityPriority of this standby server for being chosen as the synchronous standby in a priority-based synchronous replication. This has no effect in a quorum-based synchronous replication.
writeLSNLast write-ahead log location written to disk by this standby server.
hostThe host address of the instance.
stateCurrent WAL sender state. Possible values are startup, catchup, streaming, backup, stopping.
applicationNameName of the application that is connected to this WAL sender.
flushLSNLast write-ahead log location flushed to disk by this standby server.
syncStateSynchronous state of this standby server.
flushLagTime elapsed between flushing recent WAL locally and receiving notification that this standby server has written and flushed it.
userNameName of the user logged into this WAL sender process.
backendStartTime when this process was started, i.e., when the client connected to this WAL sender.
Replication Slot Details
MetricDescription
temporaryTrue if this is a temporary replication slot. Temporary slots are not saved to disk and are automatically dropped on error or when the session has finished.
slotTypeThe slot type: physical or logical.
activeTrue if this slot is currently actively being used.
active_pidThe process ID of the session using this slot if the slot is currently actively being used. NULL if inactive.
slotPluginThe base name of the shared object containing the output plugin this logical slot is using, or null for physical slots.
slotNameA unique, cluster-wide identifier for the replication slot.
databaseThe name of the database this slot is associated with, or null. Only logical slots have an associated database.
confirmed_flush_lsnThe address (LSN) up to which the logical slot's consumer has confirmed receiving data. Data older than this is not available anymore. NULL for physical slots.
hostThe host address of the instance.
restart_lsnThe address (LSN) of oldest WAL which still might be required by the consumer of this slot and thus won't be automatically removed during checkpoints unless this LSN gets behind more than max_slot_wal_keep_size from the current LSN. NULL if the LSN of this slot has never been reserved.
Slave Replication Details
MetricDescription
receivedLSNLast write-ahead log location already received and written to disk, but not flushed.
latestEndLSNLast write-ahead log location reported to origin WAL sender.
conninfoConnection string used by this WAL receiver, with security-sensitive fields obfuscated.
receiveStartTLIFirst timeline number used when WAL receiver is started.
pidProcess ID of the WAL receiver process.
senderHostHost of the PostgreSQL instance this WAL receiver is connected to. This can be a host name, an IP address, or a directory path if the connection is via Unix socket.
slotNameReplication slot name used by this WAL receiver.
hostThe host address of the instance.
lastMsgSendTimeSend time of last message received from origin WAL sender.
lastMsgReceiptTimeReceipt time of last message received from origin WAL sender.
latestEndTimeTime of last write-ahead log location reported to origin WAL sender.
senderPortPort number of the PostgreSQL instance this WAL receiver is connected to.
receivedTLITimeline number of last write-ahead log location received and flushed to disk, the initial value of this field being the timeline number of the first log location used when WAL receiver is started.
receiveStartLSNFirst write-ahead log location used when WAL receiver is started.
statusActivity status of the WAL receiver process.