This is an English translation of a Japanese blog. Some content may not be fully translated.
Building a Data Replication Environment from DynamoDB to Aurora PostgreSQL

Introduction

Stream Data from Amazon DynamoDB to Amazon Aurora Using AWS Lambda and Amazon Kinesis Firehose | Amazon Web Services Blog https://aws.amazon.com/blogs/database/how-to-stream-data-from-amazon-dynamodb-aws-lambda-amazon-kinesis-firehose/

I used the 2017 article above as a reference, but the implementation options have changed since then, so I built the environment from scratch. This setup may become unnecessary if AWS Glue Elastic Views reaches GA. Note that it covers simple replication only; to handle updates and deletes, you would need to change how the data is loaded into Aurora.

DynamoDB to Aurora Data Replication Approach

There are several ways to get data from DynamoDB into Aurora, so choose the one that fits your requirements. See below for reference.

This time the flow is: ① DynamoDB -> ② DynamoDB Streams -> ③ Amazon Kinesis Data Streams -> ④ Amazon Kinesis Firehose -> ⑤ Lambda -> ⑥ S3 -> ⑦ Lambda -> ⑧ Aurora.

Before the data is written to S3, the ⑤ Lambda converts the streamed JSON to CSV. The ⑥ S3 event notification then triggers the ⑦ Lambda, which loads the data into ⑧ Aurora PostgreSQL.

④ Kinesis Firehose can deliver data to S3 directly without the ⑤ Lambda, but I added the ⑤ Lambda so that the data is stored as CSV, ready for loading.

① DynamoDB

aws dynamodb create-table \
    --table-name dynamotest \
    --attribute-definitions \
      AttributeName=id,AttributeType=S \
      AttributeName=datetime,AttributeType=S \
    --key-schema AttributeName=id,KeyType=HASH AttributeName=datetime,KeyType=RANGE \
    --billing-mode PAY_PER_REQUEST

② DynamoDB Streams

Enable data streaming for the dynamotest table.

③ Amazon Kinesis Data Streams

Choose on-demand capacity mode, or provisioned mode with an explicit number of shards. Decide on the data retention period in advance.
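
If you go with provisioned mode, a rough shard count can be estimated from the expected write load. The sketch below is only a back-of-the-envelope helper, assuming the documented per-shard write limits of 1 MB/s and 1,000 records/s; the function name and the sample numbers are made up for illustration.

```python
import math

# Per-shard write limits for Kinesis Data Streams (provisioned mode).
# 1 MB is treated as 1,000,000 bytes here, which errs on the safe side.
SHARD_WRITE_BYTES_PER_SEC = 1_000_000
SHARD_WRITE_RECORDS_PER_SEC = 1_000


def estimate_shards(records_per_sec: float, avg_record_bytes: float) -> int:
    """Return the minimum shard count that covers the expected write load."""
    # Shards needed to stay under the byte throughput limit.
    by_bytes = math.ceil(records_per_sec * avg_record_bytes / SHARD_WRITE_BYTES_PER_SEC)
    # Shards needed to stay under the record count limit.
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    # Hypothetical load: 2,500 small change records per second.
    print(estimate_shards(records_per_sec=2500, avg_record_bytes=200))
```

This ignores read-side limits and traffic spikes, so treat the result as a lower bound. On-demand mode avoids the calculation altogether.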

④ Amazon Kinesis Firehose

Adjust the backup settings, buffer size, and buffer interval as needed.

Under "Transform source records with AWS Lambda", specify the ⑤ Lambda function.

⑤ Lambda

As described in the blog below:
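
For orientation, a minimal sketch of such a Firehose transformation function might look like the following. It is not the code from the referenced post. It assumes the incoming records are DynamoDB change records with an eventName and a dynamodb.NewImage, that only INSERT events are kept (matching the simple-replication scope of this article), and that all attributes are scalars. The COLUMNS list is hypothetical: id and datetime come from the table definition above, while payload is an invented attribute.

```python
import base64
import csv
import io
import json

# Hypothetical CSV column order. "id" and "datetime" are the table keys;
# "payload" is an invented attribute used only for illustration.
COLUMNS = ["id", "datetime", "payload"]


def to_csv_line(change: dict) -> str:
    """Flatten the NewImage of one DynamoDB change record into a CSV line."""
    image = change.get("dynamodb", {}).get("NewImage", {})
    row = []
    for col in COLUMNS:
        # Attributes arrive typed, e.g. {"S": "abc"} or {"N": "1"}.
        # Take the single value; missing attributes become empty fields.
        attr = image.get(col, {})
        row.append(next(iter(attr.values()), ""))
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(row)
    return buf.getvalue()


def lambda_handler(event, context):
    """Firehose data transformation: JSON change records in, CSV lines out."""
    output = []
    for record in event["records"]:
        change = json.loads(base64.b64decode(record["data"]))
        if change.get("eventName") != "INSERT":
            # Assumption: updates and deletes are out of scope, so drop them.
            output.append({
                "recordId": record["recordId"],
                "result": "Dropped",
                "data": record["data"],
            })
            continue
        line = to_csv_line(change)
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(line.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```

Each returned record keeps its original recordId and reports Ok or Dropped, which is what Firehose expects from a transformation function. Nested attribute types (maps, lists) would need their own flattening rules.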

⑥ S3

Nothing special is needed here, apart from the event notification that triggers the ⑦ Lambda.

⑦ Lambda, ⑧ Aurora

As described in the blogs below:
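
As a rough illustration only (not the code from the referenced posts), the ⑦ Lambda could turn each S3 event notification into an import call for Aurora PostgreSQL. This sketch assumes the aws_s3 extension is installed on the cluster and uses its aws_s3.table_import_from_s3 function; the article itself does not state which loading method was used. The target table name, the column list, and the %s placeholder style (as used by drivers such as psycopg2) are all assumptions. Executing the statement against the database is left out.

```python
from urllib.parse import unquote_plus

# Hypothetical target table and column list in Aurora PostgreSQL.
TARGET_TABLE = "dynamotest"
COLUMN_LIST = "id,datetime,payload"

# Arguments: table_name, column_list, options, bucket, file_path, region.
# Placeholders follow the %s style of common PostgreSQL drivers (assumption).
IMPORT_SQL = (
    "SELECT aws_s3.table_import_from_s3("
    "%s, %s, '(format csv)', %s, %s, %s)"
)


def build_import_calls(event: dict) -> list:
    """Return one (sql, params) pair per object in an S3 event notification."""
    calls = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 events are URL-encoded, so decode them first.
        key = unquote_plus(record["s3"]["object"]["key"])
        region = record["awsRegion"]
        calls.append((IMPORT_SQL, (TARGET_TABLE, COLUMN_LIST, bucket, key, region)))
    return calls


def lambda_handler(event, context):
    calls = build_import_calls(event)
    # A real function would open a database connection here and run
    # cursor.execute(sql, params) for each pair; that part is omitted.
    return {"imports": len(calls)}
```

Because this only appends rows, it fits the insert-only scope described in the introduction. Updates and deletes would need a different load step, for example staging the file and merging it into the target table.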
