DynamoDB Best Practices for Optimizing Cost and Performance

DynamoDB is a popular NoSQL database service from AWS that offers scalability, low latency, and high availability. However, as usage and stored data grow, costs can rise and performance can degrade. This article covers some best practices for optimizing the cost and performance of DynamoDB.

Best Practices

1. Use Sparse Indexes

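A sparse index is a secondary index whose key attribute is present on only some items. DynamoDB writes an item to an index only if the item contains the index key attributes, so a sparse index stays much smaller than the table. Creating only the indexes you need, and projecting only the attributes your queries read, further reduces index storage and write costs and keeps index queries fast.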

Syntax

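dynamodb.Table('table_name').update(
    # Every attribute used in the new index key must be declared here
    AttributeDefinitions=[
        {'AttributeName': 'attribute_name', 'AttributeType': 'S'}
    ],
    GlobalSecondaryIndexUpdates=[{
        'Create': {
            'IndexName': 'index_name',
            'KeySchema': [
                {'AttributeName': 'attribute_name', 'KeyType': 'HASH'}
            ],
            'Projection': {
                'ProjectionType': 'INCLUDE',
                'NonKeyAttributes': ['attribute_name_1', 'attribute_name_2']
            }
            # Add 'ProvisionedThroughput' here if the table uses provisioned capacity
        }
    }]
)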

Example

import boto3

dynamodb = boto3.resource('dynamodb')

# Create the table and its global secondary index in a single call.
table = dynamodb.create_table(
    TableName='users',
    KeySchema=[
        {'AttributeName': 'user_id', 'KeyType': 'HASH'}
    ],
    # Declare only attributes used in a key schema (table or index);
    # 'state' is a non-key attribute, so it is not listed here.
    AttributeDefinitions=[
        {'AttributeName': 'user_id', 'AttributeType': 'S'},
        {'AttributeName': 'name', 'AttributeType': 'S'},
        {'AttributeName': 'city', 'AttributeType': 'S'}
    ],
    GlobalSecondaryIndexes=[
        {
            'IndexName': 'name-city-state-index',
            'KeySchema': [
                {'AttributeName': 'name', 'KeyType': 'HASH'},
                {'AttributeName': 'city', 'KeyType': 'RANGE'}
            ],
            'Projection': {
                'ProjectionType': 'INCLUDE',
                'NonKeyAttributes': ['state']
            },
            'ProvisionedThroughput': {
                'ReadCapacityUnits': 5,
                'WriteCapacityUnits': 5
            }
        }
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)

# Block until the table has been created
table.wait_until_exists()

Output

The above code creates a DynamoDB table named 'users' with a global secondary index named 'name-city-state-index' and projects only 'state' as the non-key attribute.
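
To make the sparse behavior concrete, the following is a minimal sketch in plain Python (no AWS calls) of which items would end up in an index. It assumes a hypothetical index keyed on an attribute named `premium_since`, which is not part of the example table; the function `items_in_sparse_index` and the sample items are illustrative only and are not a boto3 API.

```python
def items_in_sparse_index(items, index_key):
    """Return the items DynamoDB would copy into an index keyed on index_key.

    An item is written to a secondary index only if it contains the
    index key attribute, which is what makes the index sparse.
    """
    return [item for item in items if index_key in item]


# Hypothetical sample data: only one user has the 'premium_since' attribute.
users = [
    {'user_id': '1', 'name': 'John', 'premium_since': '2023-01-15'},
    {'user_id': '2', 'name': 'Maria'},
    {'user_id': '3', 'name': 'Li'},
]

indexed = items_in_sparse_index(users, 'premium_since')
print(len(indexed), 'of', len(users), 'items are in the index')
```

Queries against such an index read only the items that carry the attribute, so the index costs far less to store and scan than the full table.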

2. Use Proper Partition Keys

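Choose a partition key that has many distinct values and whose requests are spread evenly across those values. DynamoDB distributes items across partitions by hashing the partition key, so a key that concentrates traffic on a few values creates hot partitions, which can throttle requests and force you to provision more capacity than the workload needs.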

Syntax

DynamoDB.Table('table_name').put_item(
    Item={
        'partition_key': 'value',
        'sort_key': 'value',
        'attribute_name': 'value'
    }
)

Example

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.create_table(
    TableName='users2',
    KeySchema=[
        {
            'AttributeName': 'user_id',
            'KeyType': 'HASH'  
        },
        {
            'AttributeName': 'name',
            'KeyType': 'RANGE'  
        },
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'user_id',
            'AttributeType': 'S'  
        },
        {
            'AttributeName': 'name',
            'AttributeType': 'S'  
        },
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)

Output

The above code creates a table with a partition key 'user_id' and a sort key 'name'.
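
When a single partition key value receives most of the traffic, one common mitigation is to spread it over several logical keys by appending a suffix, often called write sharding. The sketch below shows one way to do this under stated assumptions: the helper `sharded_partition_key`, the `#` separator, the date-based key, and the shard count of 10 are all hypothetical choices, not DynamoDB requirements.

```python
import hashlib


def sharded_partition_key(base_key, item_id, shard_count=10):
    """Append a deterministic shard suffix to a hot partition key."""
    # Hash the item id so the same item always maps to the same shard.
    digest = hashlib.sha256(item_id.encode('utf-8')).hexdigest()
    shard = int(digest, 16) % shard_count
    return f'{base_key}#{shard}'


# 1000 hypothetical orders written on the same day are spread over
# at most 10 partition key values instead of a single one.
keys = {sharded_partition_key('2024-05-01', f'order-{i}') for i in range(1000)}
print(sorted(keys))
```

The trade-off is on the read side: to fetch everything for one base key, the application has to query each shard suffix and merge the results.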

3. Use Projections

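A projection expression returns only the attributes you ask for in a query, which reduces the size of the response and the data sent over the network. Note that DynamoDB calculates consumed read capacity from the size of the items it reads, not from the attributes it returns, so a projection expression alone does not lower read charges. To read less data, query an index that projects only the attributes you need.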

Syntax

DynamoDB.Table('table_name').query(
    KeyConditionExpression=Key('partition_key').eq('value'),
    ProjectionExpression='attribute_name_1, attribute_name_2'
)

Example

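import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('users')
response = table.query(
    KeyConditionExpression=Key('user_id').eq('1'),
    # 'name' is a DynamoDB reserved word, so it must be aliased with a
    # placeholder; aliasing 'age' as well is harmless.
    ProjectionExpression='#n, #a',
    ExpressionAttributeNames={'#n': 'name', '#a': 'age'}
)
items = response['Items']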

Output

The above code fetches the 'name' and 'age' attributes of a user with 'user_id' equal to '1'.
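
The effect of projections on read cost can be estimated with a little arithmetic. The sketch below assumes the Query accounting described in the DynamoDB documentation: the sizes of all items read are summed and rounded up to the next 4 KB, and an eventually consistent read costs half as much as a strongly consistent one. The function name and the item sizes are made up for illustration.

```python
import math


def query_read_capacity_units(item_sizes_bytes, consistent_read=False):
    """Estimate the read capacity units consumed by one Query page."""
    total_bytes = sum(item_sizes_bytes)
    units = math.ceil(total_bytes / 4096)  # one unit per 4 KB read
    # Eventually consistent reads are charged at half the rate.
    return units if consistent_read else units / 2


# Ten 2 KB items read from the table: returning two attributes or all of
# them makes no difference, because the full items are read either way.
full_items = query_read_capacity_units([2048] * 10)

# The same ten items read from an index that projects about 100 bytes each.
slim_index = query_read_capacity_units([100] * 10)

print(full_items, slim_index)
```

Under these assumptions, the saving comes from the smaller items stored in the index, not from the projection expression itself.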

4. Use DynamoDB Streams for Triggers

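DynamoDB Streams capture item-level changes (inserts, updates, and deletes) to a table as an ordered log. Use them to trigger Lambda functions, replicate data to other databases, and perform other change-driven operations, instead of polling the table with repeated queries or scans that consume read capacity.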

Syntax

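table = dynamodb.Table('table_name')
# Requires a stream to be enabled on the table; otherwise this is None
stream = table.latest_stream_arn

client = boto3.client('lambda')
response = client.create_event_source_mapping(
    EventSourceArn=stream,
    FunctionName='function-name',
    StartingPosition='LATEST'  # required for DynamoDB stream sources
)

# Items written once the mapping is active invoke the function
response = table.put_item(
    Item={
        'partition_key': 'value',
        'attribute_name': 'value'
    }
)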

Example

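import boto3

dynamodb = boto3.resource('dynamodb')
# Assumes the 'users' table already has a stream enabled
table = dynamodb.Table('users')
stream = table.latest_stream_arn

# Connect the stream to the Lambda function first
client = boto3.client('lambda')
response = client.create_event_source_mapping(
    EventSourceArn=stream,
    FunctionName='my-function',
    StartingPosition='LATEST'  # required for DynamoDB stream sources
)

# Items written once the mapping is active invoke the function
response = table.put_item(
    Item={
        'user_id': '1',
        'name': 'John',
        'age': '25'
    }
)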

Output

The above code connects the latest stream of the 'users' table to a Lambda function named 'my-function', so that the function is triggered when a new item is added to the table.
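
On the Lambda side, the function receives batches of stream records. The handler below is a minimal sketch of what 'my-function' might look like, not the actual function. It assumes the stream view type includes new item images (NEW_IMAGE or NEW_AND_OLD_IMAGES); the name `lambda_handler`, the returned summary, and the sample event are illustrative choices.

```python
def lambda_handler(event, context):
    """Collect the user_id of every item inserted in this batch."""
    inserted = []
    for record in event.get('Records', []):
        if record.get('eventName') != 'INSERT':
            continue  # skip MODIFY and REMOVE events
        new_image = record['dynamodb'].get('NewImage', {})
        # Stream images use DynamoDB's typed JSON, e.g. {'S': '1'}
        user_id = new_image.get('user_id', {}).get('S')
        if user_id is not None:
            inserted.append(user_id)
    return {'inserted_user_ids': inserted}


# Hand-written event in the shape Lambda passes for DynamoDB streams.
sample_event = {
    'Records': [
        {'eventName': 'INSERT',
         'dynamodb': {'NewImage': {'user_id': {'S': '1'},
                                   'name': {'S': 'John'}}}},
        {'eventName': 'REMOVE',
         'dynamodb': {'Keys': {'user_id': {'S': '2'}}}},
    ]
}
result = lambda_handler(sample_event, None)
print(result)
```

Calling the handler locally with a hand-written event, as done here, is a convenient way to test the logic before deploying it.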

Important Points

  • Use sparse indexes and minimal projections to avoid paying to store and write unnecessary index data.
  • Choose a partition key with many evenly accessed values to avoid hot partitions and throttling.
  • Use projections to fetch only the required attributes and reduce response size.
  • Use DynamoDB Streams for triggers, data replication, and other change-driven operations.

Summary

In this article, we looked into some best practices for optimizing the cost and performance of DynamoDB. Using sparse indexes, proper partition keys, projections, and DynamoDB streams can help reduce costs and improve performance.
