Monitor Document Database

Overview

Amazon DocumentDB makes it easy to set up, operate, and scale MongoDB-compatible databases in the cloud.

Prerequisites

CloudWatch Access for IAM Role

Grant read-only CloudWatch access to the dedicated IAM role used for APM. AWS managed policies are standalone IAM policies, created and administered by AWS, that cover many common use cases. Attach the AWS managed policy CloudWatchReadOnlyAccess to the IAM role to get read access to all CloudWatch data.

AmazonDocDBReadOnlyAccess

Provides read-only access to Amazon DocumentDB (with MongoDB compatibility). Note that this policy also grants access to Amazon RDS and Amazon Neptune resources.

Required Permissions:

  • "rds:DescribeDBClusters"
  • "rds:DescribeDBInstances"
  • "rds:ListTagsForResource"
  • "rds:DescribeCertificates"
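
If attaching a managed policy is broader than you want, the four actions above can also be granted through a custom IAM policy. The following is a minimal sketch, using only the Python standard library, that builds such a policy document. The function name and the wildcard `Resource` scope are assumptions made for illustration; this page does not specify them.

```python
import json

# Actions listed under "Required Permissions" above.
REQUIRED_ACTIONS = [
    "rds:DescribeDBClusters",
    "rds:DescribeDBInstances",
    "rds:ListTagsForResource",
    "rds:DescribeCertificates",
]


def build_docdb_read_policy(resource="*"):
    """Return an IAM policy document (as a dict) allowing the required actions.

    The wildcard resource is an assumption; narrow it if your setup allows.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(REQUIRED_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON, ready to paste into the IAM console.
    print(json.dumps(build_docdb_read_policy(), indent=2))
```

The printed JSON can be pasted into the IAM policy editor and the resulting policy attached to the same role that holds CloudWatchReadOnlyAccess.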

SF Poller Configuration

In Add Endpoints, select the DocumentDB endpoint type and enter the cluster ID:

  • Add Endpoint

  • Select DocumentDB Endpoint

  • Enter the ClusterName

Select the plugin from the dropdown under the Plugins tab and configure the polling interval. The plugin configuration for DocumentDB services includes the cloudwatch-docdb plugin. You can enable or disable any plugin based on your needs and instance support.

cloudwatch-docdb: Monitoring support for an AWS DocumentDB cluster. It collects all cluster-level and instance-level metrics.
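
To check that the IAM role can read DocumentDB metrics before enabling the plugin, you can query CloudWatch directly. The sketch below builds the parameters for a CloudWatch `GetMetricStatistics` request against the `AWS/DocDB` namespace. It is not how the cloudwatch-docdb plugin is implemented; the helper names, the ten-minute window, and the `Average` statistic are assumptions. The `fetch_datapoints` function additionally assumes that `boto3` is installed and that AWS credentials are configured.

```python
from datetime import datetime, timedelta, timezone


def build_metric_request(cluster_id, metric_name, minutes=10, period=60, now=None):
    """Build keyword arguments for a CloudWatch GetMetricStatistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DocDB",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,  # seconds per datapoint
        "Statistics": ["Average"],
    }


def fetch_datapoints(cluster_id, metric_name):
    """Query CloudWatch and return datapoints ordered by timestamp.

    Requires boto3 and valid AWS credentials.
    """
    import boto3  # imported lazily so build_metric_request works without boto3

    client = boto3.client("cloudwatch")
    response = client.get_metric_statistics(
        **build_metric_request(cluster_id, metric_name)
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

For example, `fetch_datapoints("my-cluster", "CPUUtilization")` (with a hypothetical cluster ID) should return recent datapoints if the role's CloudWatch read access is set up correctly.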

Metrics list:

Resource Utilization Stats:

| Metric | Description |
|---|---|
| BackupRetentionPeriodStorageUsed | The total amount of backup storage in GiB used to support the point-in-time restore feature within the Amazon DocumentDB's retention window. |
| ChangeStreamLogSize | The amount of storage used by your cluster to store the change stream log in megabytes. |
| CPUUtilization | The percentage of CPU used by an instance. |
| DatabaseConnections | The number of connections open on an instance taken at a one-minute frequency. |
| DatabaseConnectionsMax | The maximum number of open database connections on an instance in a one-minute period. |
| DatabaseCursors | The number of cursors open on an instance taken at a one-minute frequency. |
| DatabaseCursorsMax | The maximum number of open cursors on an instance in a one-minute period. |
| DatabaseCursorsTimedOut | The number of cursors that timed out in a one-minute period. |
| FreeableMemory | The amount of available random access memory, in bytes. |
| FreeLocalStorage | The amount of storage available to each instance for temporary tables and logs. |
| LowMemThrottleQueueDepth | The queue depth for requests that are throttled due to low available memory taken at a one-minute frequency. |
| LowMemThrottleMaxQueueDepth | The maximum queue depth for requests that are throttled due to low available memory in a one-minute period. |
| LowMemNumOperationsThrottled | The number of requests that are throttled due to low available memory in a one-minute period. |
| SnapshotStorageUsed | The total amount of backup storage in GiB consumed by all snapshots for a given Amazon DocumentDB cluster outside its backup retention window. |
| SwapUsage | The amount of swap space used on the instance. |
| TotalBackupStorageBilled | The total amount of backup storage in GiB for which you are billed for a given Amazon DocumentDB cluster. |
| TransactionsOpen | The number of transactions open on an instance taken at a one-minute frequency. |
| TransactionsOpenMax | The maximum number of transactions open on an instance in a one-minute period. |
| VolumeBytesUsed | The amount of storage used by your cluster in bytes. This value affects the cost of the cluster. For pricing information |

Operation Stats:

| Metric | Description |
|---|---|
| DocumentsDeleted | The number of deleted documents in a one-minute period. |
| DocumentsInserted | The number of inserted documents in a one-minute period. |
| DocumentsReturned | The number of returned documents in a one-minute period. |
| DocumentsUpdated | The number of updated documents in a one-minute period. |
| OpcountersCommand | The number of commands issued in a one-minute period. |
| OpcountersDelete | The number of delete operations issued in a one-minute period. |
| OpcountersGetmore | The number of getmores issued in a one-minute period. |
| OpcountersInsert | The number of insert operations issued in a one-minute period. |
| OpcountersQuery | The number of queries issued in a one-minute period. |
| OpcountersUpdate | The number of update operations issued in a one-minute period. |
| TransactionsStarted | The number of transactions started on an instance in a one-minute period. |
| TransactionsCommitted | The number of transactions committed on an instance in a one-minute period. |
| TransactionsAborted | The number of transactions aborted on an instance in a one-minute period. |
| TTLDeletedDocuments | The number of documents deleted by a TTLMonitor in a one-minute period. |
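
Because the operation metrics are counts over a one-minute period, it can be convenient to express them as average per-second rates when comparing them with the per-second throughput metrics. A minimal sketch of that conversion follows; the function name and the sample values are hypothetical.

```python
def to_per_second(counts_per_minute):
    """Convert one-minute operation counts to average per-second rates."""
    return {name: count / 60.0 for name, count in counts_per_minute.items()}


if __name__ == "__main__":
    # Hypothetical one-minute sample of two operation counters.
    sample = {"OpcountersQuery": 1200, "OpcountersInsert": 300}
    for name, rate in to_per_second(sample).items():
        print(f"{name}: {rate:.1f} ops/s")
```

The result is an average over the minute, so short bursts within that minute are smoothed out.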

Throughput Stats:

| Metric | Description |
|---|---|
| NetworkReceiveThroughput | The amount of network throughput, in bytes per second, received from clients by each instance in the cluster. |
| NetworkThroughput | The amount of network throughput, in bytes per second, both received from and transmitted to clients by each instance in the Amazon DocumentDB cluster. |
| NetworkTransmitThroughput | The amount of network throughput, in bytes per second, sent to clients by each instance in the cluster. |
| ReadIOPS | The average number of disk read I/O operations per second. |
| ReadThroughput | The average number of bytes read from disk per second. |
| VolumeReadIOPs | The average number of billed read I/O operations from a cluster volume, reported at 5-minute intervals. Billed read operations are calculated at the cluster volume level. |
| VolumeWriteIOPs | The average number of billed write I/O operations from a cluster volume, reported at 5-minute intervals. Billed write operations are calculated at the cluster volume level, aggregated from all instances in the cluster. |
| WriteIOPS | The average number of disk write I/O operations per second. |
| WriteThroughput | The average number of bytes written to disk per second. |

System Stats:

| Metric | Description |
|---|---|
| BufferCacheHitRatio | The percentage of requests that are served by the buffer cache. |
| DiskQueueDepth | The number of concurrent write requests to the distributed storage volume. |
| EngineUptime | The amount of time, in seconds, that the instance has been running. |
| IndexBufferCacheHitRatio | The percentage of index requests that are served by the buffer cache. |

Latency Stats:

| Metric | Description |
|---|---|
| DBClusterReplicaLagMaximum | The maximum amount of lag, in milliseconds, between the primary instance and each Amazon DocumentDB instance in the cluster. |
| DBClusterReplicaLagMinimum | The minimum amount of lag, in milliseconds, between the primary instance and each replica instance in the cluster. |
| DBInstanceReplicaLag | The amount of lag, in milliseconds, when replicating updates from the primary instance to a replica instance. |
| ReadLatency | The average amount of time taken per disk I/O operation. |
| WriteLatency | The average amount of time, in milliseconds, taken per disk I/O operation. |
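
The replica lag metrics are often used to spot replicas that are falling behind the primary. The sketch below takes per-instance DBInstanceReplicaLag values in milliseconds and reports which instances exceed a threshold, along with the spread between the largest and smallest lag. The 1000 ms threshold, the function names, and the instance names are hypothetical; choose a threshold that matches your own read-consistency needs.

```python
def lagging_replicas(lag_ms_by_instance, threshold_ms=1000.0):
    """Return the names of instances whose replica lag exceeds threshold_ms."""
    return sorted(
        name for name, lag in lag_ms_by_instance.items() if lag > threshold_ms
    )


def lag_spread(lag_ms_by_instance):
    """Return the difference between the largest and smallest lag, in ms."""
    values = list(lag_ms_by_instance.values())
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical DBInstanceReplicaLag readings, in milliseconds.
    lags = {"replica-a": 20.0, "replica-b": 1500.0, "replica-c": 45.0}
    print("Lagging:", lagging_replicas(lags))
    print("Spread (ms):", lag_spread(lags))
```

The spread computed here corresponds to the gap between the cluster-level maximum and minimum lag metrics for the same sample.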

View Data and Dashboards:

Data collected by the plugins can be viewed in SnappyFlow’s Browse Data section:

  • Plugin = DocDB
  • DocumentType = OperationStats, clusterDetails, instanceDetails, latencyStats, resourceUtilizationStats, systemStats, throughputStats
  • Dashboard template: DocDB