How to Retrieve Data Efficiently From DynamoDB in AWS

Feb 9, 2021

Introduction

DynamoDB is a fast, flexible NoSQL database with built-in security and backup features. It automatically distributes a table's data and traffic across multiple servers. AWS manages the hardware setup, configuration, and replication.

Methods Available To Retrieve Data

The main purpose of adding data to a database is to retrieve it when needed.

DynamoDB provides two data retrieval methods:

  • Scan
  • Query          

Scan

A Scan operation always reads the entire table before it filters out the desired values. It therefore takes more time and consumes more read capacity than a targeted read, especially on large tables.

[Image: the userDetails table in the DynamoDB console]

The image above shows the DynamoDB table userDetails and the list of users stored in it.

Implementing Scan in DynamoDB

Scan filtering is simple: you can filter on any column (partition key, sort key, index key, index sort key, or a plain column) and combine multiple filters in the same API call.

[Image: scan filtered on email and firstName]
As the image above shows, the scan is filtered on the email and firstName columns.
TIP: A Scan operation does not require a partition key or sort key to fetch results from the table.

Scan Syntax For Above Implementation

Model.scan()
  .where('email')
  .equals('usertwo@gmail.com')
  .where('firstName')
  .equals('user')
  .exec((err, response) => {
    // logic here
  });
In the above syntax, Model is the schema you defined in the project.
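For comparison, the same filter can be written as low-level Scan request parameters, the plain object that the AWS SDK for JavaScript accepts for a scan call. This is only a sketch: `buildScanParams` is a hypothetical helper, not part of any library, and it builds the object without sending a request.

```javascript
// Builds Scan parameters that filter on any set of attributes.
// `buildScanParams` is a hypothetical helper; nothing is sent to AWS here.
function buildScanParams(tableName, equalsFilters) {
  const names = {};
  const values = {};
  const conditions = [];
  Object.entries(equalsFilters).forEach(([attribute, value], i) => {
    // Placeholders avoid clashes with DynamoDB reserved words.
    names[`#a${i}`] = attribute;
    values[`:v${i}`] = value;
    conditions.push(`#a${i} = :v${i}`);
  });
  return {
    TableName: tableName,
    FilterExpression: conditions.join(' AND '),
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

const scanParams = buildScanParams('userDetails', {
  email: 'usertwo@gmail.com',
  firstName: 'user',
});
console.log(JSON.stringify(scanParams, null, 2));
```

Note that no key condition appears anywhere in the object, which matches the tip above: a scan needs neither a partition key nor a sort key.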

Query

Query lets you select only the items that match a partition key value, optionally narrowed by further filters, which makes it more efficient than a Scan operation.
TIP: To query a table you must pass the partition key, so choosing a proper partition key for the table is important. A Query operation returns all items that match that partition key.
To query by an attribute other than the table's own partition key, you must create an index on that attribute.
[Image: table view with the Create index button]

In the above image, click the Create index button; a popup opens as shown below.

[Image: create index popup]

In the popup, enter as the partition key any attribute of the table (for example, in the userDetails table: firstName, email, createdAt, gender, phoneNumber, position, or organizationId), then click the Submit button.

[Image: list of created indexes]
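The console steps above can also be expressed as request parameters for DynamoDB's UpdateTable operation. The sketch below only builds that object. `buildCreateIndexParams` is a hypothetical helper, and it assumes a string attribute, a projection of all attributes, and a table in on-demand mode (a provisioned table would also need `ProvisionedThroughput` inside `Create`).

```javascript
// Builds UpdateTable parameters that add a global secondary index on one attribute.
// Assumptions: string attribute ('S'), all attributes projected, on-demand table.
function buildCreateIndexParams(tableName, attributeName) {
  return {
    TableName: tableName,
    // Every attribute used in a key schema must be declared here.
    AttributeDefinitions: [{ AttributeName: attributeName, AttributeType: 'S' }],
    GlobalSecondaryIndexUpdates: [
      {
        Create: {
          IndexName: `${attributeName}-index`,
          // HASH marks the attribute as the index's partition key.
          KeySchema: [{ AttributeName: attributeName, KeyType: 'HASH' }],
          Projection: { ProjectionType: 'ALL' },
        },
      },
    ],
  };
}

const createIndexParams = buildCreateIndexParams('userDetails', 'position');
console.log(JSON.stringify(createIndexParams, null, 2));
```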

Implementing Query in DynamoDB

As shown below, select Query in the first dropdown and position-index in the second dropdown. Four indexes have already been created; you can see them in the index list image above.

[Image: query against position-index]

Once you select Query in the dropdown, the partition key field appears below it.

[Image: query results]

The image above shows the list of users with the position ADMINISTRATOR.

Query Syntax For Above Implementation

Model.query('ADMINISTRATOR')
  .usingIndex('position-index')
  .exec((err, response) => {
    // logic here
  });

In the above syntax, Model is the schema you defined in the project. You can also add filters to a query, as was done in the scan syntax.

Also add the indexes to the schema file you defined in the project, as shown below:
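As a rough equivalent, the query against the index can be written as low-level Query request parameters. This is a sketch under assumptions: `buildIndexQueryParams` is a hypothetical helper, and it only builds the object without sending a request.

```javascript
// Builds Query parameters that look up one partition key value in an index.
// `buildIndexQueryParams` is a hypothetical helper; nothing is sent to AWS here.
function buildIndexQueryParams(tableName, indexName, keyAttribute, keyValue) {
  return {
    TableName: tableName,
    IndexName: indexName,
    // The key condition is mandatory for a Query, unlike a Scan filter.
    KeyConditionExpression: '#pk = :pk',
    ExpressionAttributeNames: { '#pk': keyAttribute },
    ExpressionAttributeValues: { ':pk': keyValue },
  };
}

const queryParams = buildIndexQueryParams(
  'userDetails',
  'position-index',
  'position',
  'ADMINISTRATOR'
);
console.log(JSON.stringify(queryParams, null, 2));
```

A `FilterExpression` could be added to the same object to narrow the results further, in the same way as for a scan.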

indexes: [
  {
    hashKey: 'email', name: 'email-index', type: 'global',
  },
  {
    hashKey: 'phoneNumber', name: 'phoneNumber-index', type: 'global',
  },
  {
    hashKey: 'position', name: 'position-index', type: 'global',
  },
  {
    hashKey: 'organizationId', name: 'organizationId-index', type: 'global',
  },
]

Advantages of querying over scanning

  • A scan operation reads the entire table to retrieve data, which makes it slower than a query
  • For large tables and indexes, the query method is preferred over the scan method
  • A scan can direct many requests at the same partition, consuming all of its capacity units so that further requests to that partition are throttled
  • Isolating scan operations can reduce this drawback by dividing the processing between two scan methods
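The difference in work can be illustrated with a small in-memory sketch. This is not the DynamoDB API: the sample data, the MEMBER position value, and all function names are hypothetical, and a `Map` stands in for an index keyed on position.

```javascript
// Hypothetical sample data: one administrator among 1,000 users.
const users = Array.from({ length: 1000 }, (_, i) => ({
  email: `user${i}@example.com`,
  position: i === 0 ? 'ADMINISTRATOR' : 'MEMBER',
}));

// Scan-style lookup: reads every item, then keeps the matches.
function scanByPosition(items, position) {
  let itemsRead = 0;
  const matches = items.filter((item) => {
    itemsRead += 1;
    return item.position === position;
  });
  return { matches, itemsRead };
}

// Index-style lookup: group items by key once, then read only one group.
function buildPositionIndex(items) {
  const index = new Map();
  for (const item of items) {
    if (!index.has(item.position)) index.set(item.position, []);
    index.get(item.position).push(item);
  }
  return index;
}

function queryByPosition(index, position) {
  const matches = index.get(position) || [];
  return { matches, itemsRead: matches.length };
}

const scanned = scanByPosition(users, 'ADMINISTRATOR');
const queried = queryByPosition(buildPositionIndex(users), 'ADMINISTRATOR');
console.log(`scan read ${scanned.itemsRead} items, query read ${queried.itemsRead}`);
```

Both lookups return the same single user, but the scan touches every item while the indexed lookup touches only the matching group.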

DynamoDB Provisioned Capacity Vs On-Demand Capacity

  • On-demand capacity does not require you to specify read and write request units, whereas with provisioned capacity you specify the units, based on application usage, when creating the table.
  • On-demand capacity is a good option for applications with sudden spikes and unpredictable data usage, whereas provisioned capacity suits smaller applications whose data usage can be predicted.
  • On-demand capacity scales automatically with traffic, whereas provisioned capacity can lag while a table is being scaled.
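In the CreateTable request, the two modes differ mainly in the `BillingMode` setting and in whether `ProvisionedThroughput` is supplied. The sketch below builds both variants. `buildCreateTableParams` is a hypothetical helper, and the string partition key named `id` is an assumption, since the article does not name the table's key.

```javascript
// Builds CreateTable parameters for either capacity mode.
// Assumption: a string partition key named `id` (hypothetical).
function buildCreateTableParams(tableName, capacity) {
  const params = {
    TableName: tableName,
    AttributeDefinitions: [{ AttributeName: 'id', AttributeType: 'S' }],
    KeySchema: [{ AttributeName: 'id', KeyType: 'HASH' }],
  };
  if (capacity.mode === 'on-demand') {
    // On-demand: no read or write units are declared.
    params.BillingMode = 'PAY_PER_REQUEST';
  } else {
    // Provisioned: units are declared when the table is created.
    params.BillingMode = 'PROVISIONED';
    params.ProvisionedThroughput = {
      ReadCapacityUnits: capacity.readUnits,
      WriteCapacityUnits: capacity.writeUnits,
    };
  }
  return params;
}

const onDemandTable = buildCreateTableParams('userDetails', { mode: 'on-demand' });
const provisionedTable = buildCreateTableParams('userDetails', {
  mode: 'provisioned',
  readUnits: 5,
  writeUnits: 5,
});
console.log(onDemandTable.BillingMode, provisionedTable.BillingMode);
```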

Cost benefits of Provisioned Capacity over On-Demand Capacity

On-demand capacity is generally costlier than provisioned capacity because of its scaling advantages. For small applications with predictable usage, provisioned capacity is therefore preferred. Please refer here for more details.
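Because the cost of provisioned capacity follows the number of units you declare, a rough estimate of the units an application needs can help when sizing a table. The sketch below uses the standard unit definitions: one read unit covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one write unit covers one write per second of an item up to 1 KB. The function names are hypothetical, and the workload figures are illustrative only.

```javascript
// Estimates the read capacity units needed for a steady workload.
function estimateReadUnits(itemSizeBytes, readsPerSecond, stronglyConsistent = true) {
  // Item size is rounded up to the next 4 KB block.
  const unitsPerRead = Math.ceil(itemSizeBytes / (4 * 1024));
  const units = unitsPerRead * readsPerSecond;
  // Eventually consistent reads cost half as much.
  return stronglyConsistent ? units : Math.ceil(units / 2);
}

// Estimates the write capacity units needed for a steady workload.
function estimateWriteUnits(itemSizeBytes, writesPerSecond) {
  // Item size is rounded up to the next 1 KB block.
  return Math.ceil(itemSizeBytes / 1024) * writesPerSecond;
}

// Illustrative workload: 6,000-byte items read 10 times per second,
// 1,500-byte items written 5 times per second.
const strongReads = estimateReadUnits(6000, 10);
const eventualReads = estimateReadUnits(6000, 10, false);
const writes = estimateWriteUnits(1500, 5);
console.log({ strongReads, eventualReads, writes });
```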

Conclusion

In this blog, we briefly covered the Scan and Query operations in DynamoDB in simple steps.

WRITTEN BY

Ganesh RP

REVIEWED BY

Naveen Lingam
