Introduction
Amazon DynamoDB is a fast and flexible NoSQL database on AWS with built-in security and backup features. It automatically spreads the data and traffic for tables across multiple servers, while AWS manages hardware setup, configuration, and replication.
Methods Available To Retrieve Data
The primary purpose of adding data to a database is to retrieve that data when needed.
DynamoDB offers this through two data retrieval methods:
- Scan
- Query
Scan
The Scan operation always reads the entire table first and only then filters out the desired values. Hence it takes more time and consumes more read capacity as the table grows in DynamoDB.
The above image shows the DynamoDB table userDetails, which stores a list of users.
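To make the "read everything, then filter" behaviour concrete, here is a minimal in-memory sketch. It is not the DynamoDB API: the array stands in for the userDetails table, and the sample users and the scanTable helper are hypothetical names introduced only for illustration.

```javascript
// In-memory stand-in for a table; NOT the DynamoDB API.
const sampleUsers = [
  { email: 'userone@gmail.com', firstName: 'user', position: 'ADMINISTRATOR' },
  { email: 'usertwo@gmail.com', firstName: 'user', position: 'MEMBER' },
  { email: 'userthree@gmail.com', firstName: 'guest', position: 'MEMBER' },
];

// Mimics Scan: every item is read first, the filter is applied afterwards.
function scanTable(items, filter) {
  let itemsRead = 0;
  const matches = [];
  for (const item of items) {
    itemsRead += 1; // each item is read regardless of the filter
    if (filter(item)) matches.push(item);
  }
  return { matches, itemsRead };
}

const scanResult = scanTable(sampleUsers, (u) => u.email === 'usertwo@gmail.com');
console.log(scanResult.matches.length, scanResult.itemsRead); // 1 match, 3 items read
```

Even though only one user matches, all three items are read, which is why a scan gets slower as the table grows.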
Implementing Scan in DynamoDB
Scan filtering is simple: you can apply multiple filters in the same API call and filter on any attribute (partition key, sort key, index key, index sort key, or a plain attribute).
As you can see in the above image, we have filtered on the attributes email and firstName using Scan in DynamoDB.
TIP: A Scan operation does not require a partition key or a sort key to retrieve results from a table.
Scan Syntax For Above Implementation
Model.scan()
  .where('email')
  .equals('usertwo@gmail.com')
  .where('firstName')
  .equals('user')
  .exec((err, response) => {
    // logic here
  });
In the above syntax, Model is the model built from the schema you defined in your project.
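If the set of filters varies, the same chain can be built in a loop. The following is a minimal sketch that assumes Model exposes the chainable scan(), where(), equals() and exec() calls used above. scanWithFilters and createFakeModel are hypothetical helpers, and the fake model only records calls so the sketch runs without a database.

```javascript
// Builds Model.scan().where(k).equals(v)... for each filter, then runs it.
function scanWithFilters(Model, filters, callback) {
  let chain = Model.scan();
  for (const [attribute, value] of Object.entries(filters)) {
    chain = chain.where(attribute).equals(value);
  }
  chain.exec(callback);
}

// Hypothetical stand-in for Model: it records the chained calls instead of
// contacting DynamoDB. Replace it with your real model in a project.
function createFakeModel() {
  const calls = [];
  const chain = {
    where(attribute) { calls.push(['where', attribute]); return chain; },
    equals(value) { calls.push(['equals', value]); return chain; },
    exec(callback) { callback(null, { calls }); },
  };
  return { scan: () => chain };
}

let recordedCalls = [];
scanWithFilters(
  createFakeModel(),
  { email: 'usertwo@gmail.com', firstName: 'user' },
  (err, response) => {
    if (err) throw err;
    recordedCalls = response.calls;
  }
);
console.log(recordedCalls);
```

With a real model, the callback would receive the matching items instead of the recorded calls.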
Query
Query lets you use a key condition, plus optional filters, to select the range of data to be returned, which makes it more efficient than a Scan operation.
TIP: To query the table, we must provide the partition key, so selecting a proper partition key for the table is important. A Query operation returns all items that match the given partition key.
To query in DynamoDB on an attribute other than the table's own partition key, you must create an index on the table.
When you click the Create Index button in the preceding image, a popup window will appear, as shown below.
In the above popup, enter as the partition key any attribute name that exists in the table, for example firstName, email, createdAt, gender, phoneNumber, position, or organizationId in the userDetails table, then press the submit button.
Implementing Query in DynamoDB
Select Query in the first dropdown and position-index in the second dropdown, as shown below. Four indexes have already been created, as shown in the image above.
When you select Query from the dropdown, the partition key field appears below it.
In the above image you can see the list of users with the position ADMINISTRATOR.
Query Syntax For Above Implementation
Model.query('ADMINISTRATOR')
  .usingIndex('position-index')
  .exec((err, response) => {
    // logic here
  });
In the above syntax, Model is the model built from the schema you defined in the project. You can also add filters to a query, as we did in the scan syntax.
Also add the indexes to the schema file you defined in the project, as shown below:
indexes: [
  {
    hashKey: 'email', name: 'email-index', type: 'global',
  },
  {
    hashKey: 'phoneNumber', name: 'phoneNumber-index', type: 'global',
  },
  {
    hashKey: 'position', name: 'position-index', type: 'global',
  },
  {
    hashKey: 'organizationId', name: 'organizationId-index', type: 'global',
  },
]
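Conceptually, an index such as position-index groups items by the value of its hashKey, so a query only touches the matching group. The sketch below illustrates that idea with an in-memory Map. It is not how DynamoDB stores indexes internally, and buildIndex, queryIndex and the sample data are hypothetical.

```javascript
// In-memory illustration of an index keyed on one attribute.
function buildIndex(items, hashKey) {
  const index = new Map();
  for (const item of items) {
    const key = item[hashKey];
    if (key === undefined) continue; // items lacking the attribute are not indexed
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(item);
  }
  return index;
}

// Mimics Query: only the items stored under the given key are read.
function queryIndex(index, hashKeyValue) {
  const matches = index.get(hashKeyValue) || [];
  return { matches, itemsRead: matches.length };
}

const indexedUsers = [
  { email: 'userone@gmail.com', position: 'ADMINISTRATOR' },
  { email: 'usertwo@gmail.com', position: 'MEMBER' },
  { email: 'userthree@gmail.com', position: 'ADMINISTRATOR' },
  { email: 'userfour@gmail.com' }, // no position attribute
];

const positionIndex = buildIndex(indexedUsers, 'position');
const queryResult = queryIndex(positionIndex, 'ADMINISTRATOR');
console.log(queryResult.matches.length, queryResult.itemsRead); // 2 matches, 2 items read
```

Unlike the scan, the number of items read here equals the number of matches rather than the size of the table.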
Advantages of querying over scanning
- The scan operation reads the entire table to retrieve data, so it is slower than querying.
- Compared with the scan method, the query method is preferred for processing large tables and indexes.
- A scan can direct many requests at the same partition, consuming all of that partition's capacity units so that other requests to it are throttled.
- Isolating scan operations can reduce this drawback, for example by sharing traffic between two tables so that scans do not compete with critical requests.
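The capacity difference can be estimated with simple arithmetic. The sketch below assumes DynamoDB's documented read unit of 4 KB per strongly consistent read, with eventually consistent reads costing half, and that a scan is charged for the data it reads rather than the data it returns. The table size and item sizes are made-up illustration values, and readCapacityUnits is a hypothetical helper.

```javascript
const KB_PER_READ_UNIT = 4;

// Estimates read capacity units for a single request that reads `kilobytesRead`.
function readCapacityUnits(kilobytesRead, { stronglyConsistent = false } = {}) {
  const units = Math.ceil(kilobytesRead / KB_PER_READ_UNIT);
  return stronglyConsistent ? units : units / 2;
}

// Illustration only: 1000 items of 1 KB each, 8 of which match the key.
const scanUnits = readCapacityUnits(1000 * 1);  // the scan reads the whole table
const queryUnits = readCapacityUnits(8 * 1);    // the query reads only the matches
console.log(scanUnits, queryUnits); // 125 1
```

Adding a filter to the scan would shrink the response but not the estimate, because the filter runs after the items have been read.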
DynamoDB Provisioned Capacity Vs On-Demand Capacity
- In On-Demand Capacity, the read and write request units are not specified. In Provisioned Capacity, the units are specified when the table is created, based on application usage.
- On-Demand Capacity is a good option for applications with sudden spikes and unpredictable data usage, whereas Provisioned Capacity suits smaller applications whose data usage can be predicted.
- On-Demand Capacity adapts to changes in traffic automatically, whereas Provisioned Capacity can lag behind while the table scales.
Cost benefits of Provisioned Capacity over On-Demand Capacity
On-Demand Capacity is considered more expensive than Provisioned Capacity because of its advantages in database scaling. For small applications, however, Provisioned Capacity is preferred over On-Demand Capacity. Please refer here for more details.
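A rough way to compare the two modes for reads is to price the same workload both ways. The sketch below is arithmetic only: monthlyReadCosts is a hypothetical helper, and every price and workload figure in the usage example is a placeholder, not an actual AWS price, so substitute the current figures for your region.

```javascript
// Compares read costs for one month under both capacity modes.
// All inputs are supplied by the caller; nothing here is an AWS price.
function monthlyReadCosts({
  readsPerMonth,          // total read requests in the month
  provisionedReadUnits,   // read capacity units kept provisioned
  hoursPerMonth,          // hours the table is provisioned
  pricePerMillionReads,   // on-demand price per million read requests
  pricePerReadUnitHour,   // provisioned price per read unit per hour
}) {
  const onDemand = (readsPerMonth / 1e6) * pricePerMillionReads;
  const provisioned = provisionedReadUnits * hoursPerMonth * pricePerReadUnitHour;
  return { onDemand, provisioned };
}

// Placeholder figures for a steady, predictable workload.
const costs = monthlyReadCosts({
  readsPerMonth: 100e6,
  provisionedReadUnits: 40,
  hoursPerMonth: 720,
  pricePerMillionReads: 0.25,    // placeholder, not a real price
  pricePerReadUnitHour: 0.0001,  // placeholder, not a real price
});
console.log(costs.provisioned < costs.onDemand);
```

For a spiky workload the provisioned units would have to be sized for the peak, which can reverse the comparison.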
Conclusion
In this blog, we walked through a brief summary of the Scan and Query operations in DynamoDB in simple steps.