Components of NoSQL
To understand the architecture of Catalyst NoSQL databases and learn about working with them, you must first understand the components that make up NoSQL databases.
Basic Components
The fundamental components involved in Catalyst NoSQL are explained below.
Terminology
Component Name | Description |
---|---|
Table | A table is the fundamental structure that NoSQL stores data in. Similar to relational databases, tables are composed of records or items. The sample table Travel shown below holds data about travel destinations that one could visit, with key information about each place. |
Attributes | An attribute represents a characteristic of a data value in a table. Attributes in NoSQL are like fields or columns in a relational database, wherein each attribute holds data of a specific data type. The sample table Travel below contains the attributes DestinationName, Distance, BestTimeToVisit, Location, and EstimatedCost. |
Items | An item is a collection of attributes that hold the data of a single data point. Items in NoSQL are similar to rows or records in a relational database, wherein they can contain values for different attributes. In the sample table Travel, each collection of data uniquely identified by the DestinationName is an item. For example, the record "Honolulu", which also contains a set of values for Distance, BestTimeToVisit, and EstimatedCost, is an item. Similarly, the record identified by "Prague" with different attribute values is an item. Note: The maximum total size of an item provisioned in Catalyst NoSQL is 400 KB. |
Data | In Catalyst NoSQL's terms, the data of a table is composed of all the attributes and items in it, present in the Catalyst Custom JSON format. The JSON code given below for the Travel table represents the data of the table. The Working with Data help section elaborates on the custom JSON data format and the supported data types. |
Sample Table Representation
Table Name: Travel
{
  "item": {
    "DestinationName": { "S": "Honolulu" },
    "Distance": { "N": 4960 },
    "BestTimeToVisit": { "L": [ { "S": "September" }, { "S": "October" }, { "S": "November" } ] },
    "EstimatedCost": { "N": 2500 }
  },
  "item": {
    "DestinationName": { "S": "Prague" },
    "Distance": { "N": 4081 },
    "Location": { "M": { "Country": { "S": "Czech Republic" }, "Continent": { "S": "Europe" } } },
    "EstimatedCost": { "N": 3000 }
  },
  "item": {
    "DestinationName": { "S": "Marrakesh" },
    "Distance": { "N": 3700 },
    "BestTimeToVisit": { "L": [ { "S": "March" }, { "S": "April" }, { "S": "September" }, { "S": "October" } ] },
    "Location": { "M": { "Country": { "S": "Morocco" }, "Continent": { "S": "Africa" } } },
    "EstimatedCost": { "N": 1700 }
  }
}
The Travel table represents a simple NoSQL table in the Custom JSON format. The structure includes nested attributes and arrays: the BestTimeToVisit attribute is stored as an array of the data type list (“L”), and the Location attribute is a nested attribute of the data type map (“M”) containing the keys Country and Continent.
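To make the typed structure concrete, here is a minimal Python sketch that converts one item from the Custom JSON format into plain values. It is not a Catalyst API: the helper names `decode_value` and `decode_item` are hypothetical, and only the four type tags that appear in the sample ("S", "N", "L", "M") are handled. The Working with Data help section lists the full set of supported data types.

```python
from typing import Any


def decode_value(typed: dict) -> Any:
    """Convert one typed value, such as {"S": "Prague"}, into a plain value."""
    if len(typed) != 1:
        raise ValueError("a typed value must contain exactly one type tag")
    type_tag, raw = next(iter(typed.items()))
    if type_tag in ("S", "N"):
        # Strings and numbers are already plain values.
        return raw
    if type_tag == "L":
        # A list holds further typed values.
        return [decode_value(element) for element in raw]
    if type_tag == "M":
        # A map holds named, typed values (a nested attribute).
        return {key: decode_value(value) for key, value in raw.items()}
    raise ValueError(f"type tag not covered by this sketch: {type_tag!r}")


def decode_item(item: dict) -> dict:
    """Decode every attribute of a single item."""
    return {name: decode_value(value) for name, value in item.items()}


prague = {
    "DestinationName": {"S": "Prague"},
    "Distance": {"N": 4081},
    "Location": {"M": {"Country": {"S": "Czech Republic"},
                       "Continent": {"S": "Europe"}}},
    "EstimatedCost": {"N": 3000},
}

plain_prague = decode_item(prague)
print(plain_prague["Location"]["Country"])
```

The recursion mirrors the nesting in the data: a list or a map simply contains more typed values, so the same function handles any depth.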
Table Keys
Following a typical NoSQL architecture, Catalyst NoSQL stores data in partitions, also known as shards, that are distributed across clusters. These shards can also be replicated across multiple servers as mirrors. Distributing storage across multiple nodes breaks datasets down into smaller chunks, which helps optimize data access.
When querying data in a NoSQL table, we must be able to locate the partition in which a specific item is stored. Locating an item within that partition then takes additional effort. Both tasks are accomplished with the help of distinct keys that identify partitions or sort through them, as described below.
Terminology
Component Name | Description |
---|---|
Partition Key | The partition key identifies the logical partition in distributed storage in which a data item is stored. These logical partitions are mapped to physical storage nodes in the backend through hash functions. In the sample table Menu discussed below, the attribute Category is a partition key. |
Sort Key | The primary sort key sorts through a specific partition to locate a data item, and returns the results in sorted order if more than one item matches the partition key. This is done by identifying the item from a range of key hashes. Catalyst enables you to optionally add a sort key while creating a NoSQL table. In the sample table Menu discussed below, the attribute DishName is regarded as a sort key. |
Simple Primary Key | A simple primary key in Catalyst NoSQL denotes the partition key alone. When you choose to only configure a partition key for a NoSQL table without a sort key, the partition key will act as the primary key of the table by uniquely identifying an item in it. This is similar to the concept of a primary key in a relational database. In the sample table Travel discussed in the previous section, the attribute DestinationName is regarded as the primary key, as this attribute must be present in all items and no two items with the same values for this attribute can exist. However, in the sample table Menu discussed below, the partition key Category does not uniquely identify an item, because multiple items can exist with the same value for it. This is because the Menu table uses a composite primary key. |
Composite Primary Key | The composite primary key includes both a partition key and a sort key to uniquely identify an item, i.e., the combination of both keys acts as a primary key. Querying a table with the composite primary key is global, as the data spans across multiple partitions. Unlike the simple primary key, which requires the partition key to be unique, you can store non-unique values for the partition key here, as long as you define a sort key as well. Note: For all the items that have the same partition key in a table, the sort keys must be unique. The sample table Menu discussed below contains a composite primary key wherein the partition key Category and the sort key DishName together uniquely identify a data item (a dish). |
Additional Sort Keys | Catalyst NoSQL allows you to configure additional sort keys besides the main sort key, which is a part of the composite primary key. Additional sort keys are used if you require the sorting to be performed based on a different key besides the main sort key, or if the main sort key is unknown for a specific item. In this case, a combination of the partition key and an additional sort key will be used as a composite primary key. Querying a table with additional sort keys is similar to local indexing, as they are all used with the same partition key to sort in that specific partition alone. Note: Catalyst allows you to configure a maximum of 5 additional sort keys for a table. In the sample table Menu discussed below, the attributes PreparationTime and Calories are configured as additional sort keys. |
Sample Table Representation
Table Name: Menu
{
  "item": {
    "Category": { "S": "Pizza" },
    "DishName": { "S": "Vegan" },
    "Calories": { "N": 350 },
    "PreparationTime": { "N": 20 },
    "Cost": { "N": 25 },
    "DiscountPercentage": { "N": 5 },
    "KeyIngredients": { "L": [ { "S": "Pizza Base" }, { "S": "Pizza Sauce" }, { "S": "Olives" }, { "S": "Bell Peppers" }, { "S": "Corn" }, { "S": "Onions" } ] }
  },
  "item": {
    "Category": { "S": "Pizza" },
    "DishName": { "S": "Chicken Overload" },
    "Calories": { "N": 450 },
    "PreparationTime": { "N": 30 },
    "KeyIngredients": { "L": [ { "S": "Pizza Base" }, { "S": "Pizza Sauce" }, { "S": "Fried Chicken" }, { "S": "Paprika" } ] }
  },
  "item": {
    "Category": { "S": "Sandwich" },
    "DishName": { "S": "Vegan" },
    "Calories": { "N": 300 },
    "PreparationTime": { "N": 20 },
    "Cost": { "N": 25 },
    "DiscountPercentage": { "N": 15 }
  }
}
The Menu table includes various attributes of which Category is the partition key, DishName is the sort key, and PreparationTime and Calories are additional sort keys.
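The mapping from a partition key to a partition can be illustrated with a short Python sketch. The hash function Catalyst actually uses and the number of partitions are not documented here, so SHA-256 and a partition count of 4 are assumptions chosen purely for illustration, and `partition_for` is a hypothetical helper.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key value to a partition number with a stable hash."""
    # Assumption: SHA-256 stands in for whatever hash the backend really uses.
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# (Category, DishName) pairs from the Menu table.
menu_keys = [
    ("Pizza", "Vegan"),
    ("Pizza", "Chicken Overload"),
    ("Sandwich", "Vegan"),
]

# Group the items by the partition that their Category hashes to.
placement = {}
for category, dish_name in menu_keys:
    partition = partition_for(category, 4)
    placement.setdefault(partition, []).append((category, dish_name))

for partition, keys in sorted(placement.items()):
    print(partition, keys)
```

Because only Category is hashed, both "Pizza" items always land in the same partition, which is why a sort key is needed to tell them apart.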
The composite primary key enhances data querying. If this table were configured with just a simple primary key, i.e., with Category alone (example: “Pizza”), querying it would fetch all items of that Category (all the “pizzas” in the menu). However, because it is configured with a composite primary key, i.e., with Category and DishName, querying it with both keys retrieves the unique item that matches those values (example: “Pizza” and “Vegan”).
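The difference between the two kinds of lookup can be sketched in Python with an in-memory dictionary standing in for the table. This is not the Catalyst SDK: `query_partition` and `get_item` are hypothetical helpers, and plain values are used instead of the Custom JSON format for brevity.

```python
# The Menu table, keyed by the composite primary key (Category, DishName).
menu = {
    ("Pizza", "Vegan"): {"Calories": 350, "PreparationTime": 20},
    ("Pizza", "Chicken Overload"): {"Calories": 450, "PreparationTime": 30},
    ("Sandwich", "Vegan"): {"Calories": 300, "PreparationTime": 20},
}


def query_partition(table: dict, category: str) -> list:
    """Return every item in one partition, ordered by the sort key DishName."""
    matches = [(key, attributes) for key, attributes in table.items()
               if key[0] == category]
    return sorted(matches, key=lambda pair: pair[0][1])


def get_item(table: dict, category: str, dish_name: str):
    """Return the single item identified by both keys, or None if absent."""
    return table.get((category, dish_name))


all_pizzas = query_partition(menu, "Pizza")
vegan_pizza = get_item(menu, "Pizza", "Vegan")

print([key[1] for key, _ in all_pizzas])
print(vegan_pizza)
```

Querying with the partition key alone returns a sorted set of items, while supplying both keys returns at most one item.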
Similarly, the additional sort keys provide further flexibility while querying the table within the same partition. You can choose to query the table with the additional sort key PreparationTime in combination with the partition key Category (example: “Sandwich” and “20”).
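A query on an additional sort key can be pictured as a lookup in a per-partition index. The Python sketch below is only an illustration of that idea under stated assumptions: `build_local_index` and `query_local` are hypothetical names, and the index layout is not a description of how Catalyst stores data.

```python
from collections import defaultdict

menu_items = [
    {"Category": "Pizza", "DishName": "Vegan",
     "Calories": 350, "PreparationTime": 20},
    {"Category": "Pizza", "DishName": "Chicken Overload",
     "Calories": 450, "PreparationTime": 30},
    {"Category": "Sandwich", "DishName": "Vegan",
     "Calories": 300, "PreparationTime": 20},
]


def build_local_index(items: list, sort_attribute: str) -> dict:
    """Index items per partition (Category), ordered by the given attribute."""
    index = defaultdict(list)
    for item in items:
        index[item["Category"]].append((item[sort_attribute], item["DishName"]))
    for entries in index.values():
        entries.sort()
    return index


def query_local(index: dict, category: str, low: int, high: int) -> list:
    """Return dish names in one partition whose indexed value is in [low, high]."""
    return [dish for value, dish in index.get(category, [])
            if low <= value <= high]


by_preparation_time = build_local_index(menu_items, "PreparationTime")

# Partition key "Sandwich" combined with PreparationTime 20.
print(query_local(by_preparation_time, "Sandwich", 20, 20))
```

The lookup never leaves the chosen partition, which is what makes it comparable to local indexing.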
Before discussing indexing, let’s understand a special attribute that Catalyst NoSQL provides for all tables.
Time To Live (TTL)
Time To Live (TTL) provides an automated way of deleting items in Catalyst NoSQL tables after a specific point in time. You can configure an attribute to hold the expiration timestamp in your table while you create it. You can then add the expiration timestamp of each item in the Unix epoch time format while adding the data. Catalyst will automatically and permanently delete the items in the table once they are past their expiration time, saving you from performing manual updates and consuming write throughput.
The TTL attribute is handy for removing items that will not be needed beyond a specific point in time or whose validity is short-lived. The items that Catalyst deletes automatically after their expiration time are removed from the table and from anywhere they are indexed.
Catalyst operates a built-in scheduler once every 24 hours to check for any items whose TTL attribute values indicate that they have expired. This check is performed in all NoSQL tables of all your projects, and the items past their expiration are deleted permanently.
Let’s understand TTL with an example. Assume we create an attribute named Expiry to store the TTL timestamp in the Menu table given in the previous section. We can then define the expiration timestamp for all the items in the table in the following manner.
Sample Table Representation
Table Name: Menu
{
  "item": {
    "Category": { "S": "Pizza" },
    "DishName": { "S": "Vegan" },
    "Calories": { "N": 350 },
    "PreparationTime": { "N": 20 },
    "Cost": { "N": 25 },
    "DiscountPercentage": { "N": 5 },
    "Expiry": { "N": 1715755926 }
  },
  "item": {
    "Category": { "S": "Pizza" },
    "DishName": { "S": "Chicken Overload" },
    "Calories": { "N": 450 },
    "PreparationTime": { "N": 30 },
    "Expiry": { "N": 1715565610 }
  },
  "item": {
    "Category": { "S": "Sandwich" },
    "DishName": { "S": "Vegan" },
    "Calories": { "N": 300 },
    "PreparationTime": { "N": 20 },
    "Cost": { "N": 25 },
    "DiscountPercentage": { "N": 15 },
    "Expiry": { "N": 1715700869 }
  }
}
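The Expiry values above are Unix epoch timestamps. As a rough sketch, they could be produced and inspected in Python as follows. The sample values have ten digits, so the sketch assumes the timestamps are in seconds; `expiry_timestamp` and `describe_expiry` are hypothetical helper names, not part of any Catalyst SDK.

```python
from datetime import datetime, timedelta, timezone


def expiry_timestamp(created_at: datetime, lifetime: timedelta) -> int:
    """Return the Unix epoch time, in whole seconds, at which an item expires."""
    return int((created_at + lifetime).timestamp())


def describe_expiry(epoch_seconds: int) -> str:
    """Render an epoch timestamp as a readable UTC date and time."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


# An item created now that should live for seven days.
created_at = datetime.now(timezone.utc)
expiry = expiry_timestamp(created_at, timedelta(days=7))
typed_expiry = {"Expiry": {"N": expiry}}  # same shape as in the sample table

print(typed_expiry)
print(describe_expiry(1715755926))  # first Expiry value from the sample table
```

Using timezone-aware datetimes avoids ambiguity, because epoch time is always counted from midnight UTC on 1 January 1970.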
Points to remember:
- You will be able to update the TTL attribute value for items in a NoSQL table anytime you require. When the built-in Catalyst scheduler runs, it will identify and process the items based on the latest TTL value.
- You can also delete the TTL attribute for items as per your needs, based on the given values.
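The effect of these points on a scheduler run can be sketched in Python. This is an illustration only, not Catalyst's implementation: `run_ttl_sweep` is a hypothetical function, the sweep time is made up, plain values replace the Custom JSON format for brevity, and the sketch assumes that an item without a TTL value is never expired.

```python
import time
from typing import Optional


def run_ttl_sweep(items: list, ttl_attribute: str,
                  now: Optional[float] = None) -> list:
    """Return the items that survive one sweep; expired items are dropped."""
    current_time = time.time() if now is None else now
    survivors = []
    for item in items:
        expiry = item.get(ttl_attribute)
        # Assumption: an item with no TTL value is left in place.
        if expiry is None or expiry >= current_time:
            survivors.append(item)
    return survivors


menu = [
    {"Category": "Pizza", "DishName": "Vegan", "Expiry": 1715755926},
    {"Category": "Pizza", "DishName": "Chicken Overload", "Expiry": 1715565610},
    {"Category": "Sandwich", "DishName": "Vegan", "Expiry": 1715700869},
]

# Update one TTL value and delete another before the scheduler runs.
menu[1]["Expiry"] = 1715800000
del menu[2]["Expiry"]

sweep_time = 1715760000  # hypothetical moment at which the scheduler runs
remaining = run_ttl_sweep(menu, "Expiry", now=sweep_time)

print([(item["Category"], item["DishName"]) for item in remaining])
```

Only the TTL values present at sweep time matter: the updated item survives because of its new, later timestamp, and the item whose TTL attribute was deleted is not considered for expiry under the stated assumption.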
Last Updated 2025-05-30 16:54:59 +0530