Indexing in NoSQL

Catalyst NoSQL lets you create indexes on NoSQL tables so that you can run alternate queries on the table data without using the primary keys. This means that you can create and use secondary indexes to search for data items by attributes other than the partition key and sort key.

Indexes add convenience and flexibility to data retrieval operations, and come in handy in scenarios that require you to prioritize non-conventional attributes in a table. Querying through an index speeds up data reads and promotes efficiency. You can also modify or delete indexes in a table based on your requirements without affecting the main table in any manner.

Architecture of Indexes

Catalyst NoSQL indexes span across the entire table and all partitions that the table data is stored in. This cross-partition or global indexing feature proves highly beneficial as it allows you to execute queries across all partition keys of a table.

Catalyst enables you to create a maximum of 20 indexes for a table. You can configure individual partition keys and sort keys for each index you create. Because the indexing is global, you will be able to configure partition and sort keys different from the ones configured in the main table.

Essentially, an index can be regarded as a copy of the main table created with different partition and sort keys for different requirements in querying data. In terms of the architecture, when an index is created for a table, a copy of it is stored in different storage nodes with the same partitioning rules as applicable for the main table. Therefore, when you execute queries based on a secondary index, you can still access all the attributes and data of the table.

Note: Catalyst will automatically update indexes when you add, update, or delete data from its base table.
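To make the idea of an automatically maintained copy more concrete, here is a minimal sketch in plain Python. It is a conceptual model only and does not use the Catalyst SDK: the class `IndexedTable`, its methods, and the plain-dictionary item shape are hypothetical, and it assumes Category and DishName are the table's partition and sort keys. It shows every write to the table being mirrored into an index that is grouped by a different attribute.

```python
# Conceptual model only: plain Python, not the Catalyst SDK.
from collections import defaultdict


class IndexedTable:
    """Toy table that keeps one secondary index in sync with its items."""

    def __init__(self, table_keys, index_partition_key):
        self.table_keys = table_keys
        self.index_partition_key = index_partition_key
        self.rows = {}                  # primary key tuple -> item
        self.index = defaultdict(dict)  # index key value -> {primary key: item}

    def _primary_key(self, item):
        return tuple(item[name] for name in self.table_keys)

    def _unindex(self, pk):
        # Remove the index entry that belongs to the currently stored item.
        old = self.rows.get(pk)
        if old is not None and self.index_partition_key in old:
            self.index[old[self.index_partition_key]].pop(pk, None)

    def put(self, item):
        pk = self._primary_key(item)
        self._unindex(pk)               # drop any stale index entry first
        self.rows[pk] = item
        if self.index_partition_key in item:
            self.index[item[self.index_partition_key]][pk] = item

    def delete(self, pk):
        self._unindex(pk)
        self.rows.pop(pk, None)

    def query_index(self, value):
        return list(self.index.get(value, {}).values())


menu = IndexedTable(("Category", "DishName"), "Cost")
menu.put({"Category": "Pizza", "DishName": "Chicken Overload", "Cost": 25})
menu.put({"Category": "Sandwich", "DishName": "Vegan", "Cost": 20})
# Hypothetical price change: the index entry moves from 20 to 22.
menu.put({"Category": "Sandwich", "DishName": "Vegan", "Cost": 22})
menu.delete(("Pizza", "Chicken Overload"))
```

After these calls, looking up Cost 20 or 25 in the toy index finds nothing, while Cost 22 finds the sandwich. The caller never touches the index directly, which is the behavior the note above describes.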

Index Types

Catalyst provides three different ways to index a table in NoSQL:

  • Indexing All Attributes: This will create an index of all the attributes in a table. When you choose this option, Catalyst will essentially replicate all the table attributes and store them based on the partition key you configure for the index.

  • Indexing Only Keys: This will index only the partition key, sort key, and the additional sort key attributes, if any, in a table. These key attributes are stored based on the partition key you configure for the index.

  • Indexing Specific Attributes: This will index specific attributes of a table that you select. The selected attributes are replicated and stored based on the partition key you configure for the index.

When you execute a query based on an index, the partition key and sort key attributes of the table and the index will be returned along with other attributes based on the type you configured.
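As a rough illustration of how the three index types differ in what they store, consider the following sketch in plain Python. It is a conceptual model, not the Catalyst SDK: the function `project_item`, the type labels `"all"`, `"keys_only"`, and `"specific"`, and the plain-dictionary item shape are hypothetical, and it assumes Category and DishName are the table's partition and sort keys.

```python
# Conceptual model only: plain Python, not the Catalyst SDK.
def project_item(item, table_keys, index_keys, index_type, selected=()):
    """Return the attributes an index entry would hold for one table item."""
    if index_type == "all":
        return dict(item)
    # The key attributes of the table and of the index are always kept.
    keep = set(table_keys) | set(index_keys)
    if index_type == "specific":
        keep |= set(selected)
    elif index_type != "keys_only":
        raise ValueError(f"unknown index type: {index_type}")
    return {name: value for name, value in item.items() if name in keep}


item = {"Category": "Pizza", "DishName": "Chicken Overload", "Cost": 25,
        "DiscountPercentage": 15, "Calories": 450, "PreparationTime": 30}
table_keys = ("Category", "DishName")
index_keys = ("Cost", "DiscountPercentage")

keys_only = project_item(item, table_keys, index_keys, "keys_only")
specific = project_item(item, table_keys, index_keys, "specific",
                        selected=("Calories",))
everything = project_item(item, table_keys, index_keys, "all")
```

Here `keys_only` holds just the four key attributes, `specific` adds Calories, and `everything` is a full copy of the item.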

Let’s understand indexing better with an example. The Menu table discussed in the previous sections is updated and shown below.

Sample Table Representation

Table Name: Menu

    
{
  "item": {
    "Category": { "S": "Pizza" },
    "DishName": { "S": "Vegan" },
    "Calories": { "N": 350 },
    "PreparationTime": { "N": 20 },
    "DiscountPercentage": { "N": 12 },
    "KeyIngredients": { "L": [ { "S": "Pizza Base" }, { "S": "Pizza Sauce" }, { "S": "Olives" }, { "S": "Bell Peppers" }, { "S": "Corn" }, { "S": "Onions" } ] }
  },
  "item": {
    "Category": { "S": "Pizza" },
    "DishName": { "S": "Chicken Overload" },
    "Cost": { "N": 25 },
    "DiscountPercentage": { "N": 15 },
    "Calories": { "N": 450 },
    "PreparationTime": { "N": 30 }
  },
  "item": {
    "Category": { "S": "Sandwich" },
    "DishName": { "S": "Vegan" },
    "Calories": { "N": 300 },
    "PreparationTime": { "N": 20 },
    "Cost": { "N": 20 },
    "DiscountPercentage": { "N": 10 }
  }
}

Note: The table presented here is a representation of the Catalyst Custom JSON format. The formats for constructing data in this JSON format vary across the server-side SDKs. To learn about the general syntax of the Catalyst Custom JSON format present in this table, as well as the supported data types and their notations, refer to this section.

Assume you create an index, Pricing, for this table with the following configurations:

  • Partition Key: Cost
  • Sort Key: DiscountPercentage
  • Index Type: Specific Attributes (Calories and PreparationTime)

Given below is a representation of the created index.

Sample Index Representation

Index Name: Pricing

    
{
  "item": {
    "Cost": { "N": 25 },
    "DiscountPercentage": { "N": 15 },
    "Category": { "S": "Pizza" },
    "DishName": { "S": "Chicken Overload" },
    "Calories": { "N": 450 },
    "PreparationTime": { "N": 30 }
  },
  "item": {
    "Cost": { "N": 20 },
    "DiscountPercentage": { "N": 10 },
    "Category": { "S": "Sandwich" },
    "DishName": { "S": "Vegan" },
    "Calories": { "N": 300 },
    "PreparationTime": { "N": 20 }
  }
}

Note: Data items that do not contain the partition key of an index are not indexed. In this example, the item identified by "Pizza" and "Vegan" in the main table is not indexed because it has no value for the index's partition key, Cost.
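The way the Pricing index is derived from the Menu table can be sketched in plain Python as follows. This is a conceptual model and not the Catalyst SDK: the function `build_index` and the simplified dictionaries (without the "S", "N", and "L" type wrappers) are hypothetical, and treating Category and DishName as the table's keys is an assumption drawn from the example above.

```python
# Conceptual model only: plain Python, not the Catalyst SDK.
MENU = [
    {"Category": "Pizza", "DishName": "Vegan", "Calories": 350,
     "PreparationTime": 20, "DiscountPercentage": 12,
     "KeyIngredients": ["Pizza Base", "Pizza Sauce", "Olives",
                        "Bell Peppers", "Corn", "Onions"]},
    {"Category": "Pizza", "DishName": "Chicken Overload", "Cost": 25,
     "DiscountPercentage": 15, "Calories": 450, "PreparationTime": 30},
    {"Category": "Sandwich", "DishName": "Vegan", "Calories": 300,
     "PreparationTime": 20, "Cost": 20, "DiscountPercentage": 10},
]


def build_index(items, partition_key, sort_key, table_keys, selected):
    """Copy the index keys, table keys, and selected attributes of each item."""
    keep = [partition_key, sort_key, *table_keys, *selected]
    entries = []
    for item in items:
        if partition_key not in item:
            continue  # items without the index partition key are skipped
        entries.append({name: item[name] for name in keep if name in item})
    return entries


pricing = build_index(MENU, "Cost", "DiscountPercentage",
                      ("Category", "DishName"),
                      ("Calories", "PreparationTime"))
for entry in pricing:
    print(entry)
```

Only two entries are produced: the "Pizza" and "Vegan" item is left out because it has no Cost, and KeyIngredients is dropped because it is not among the selected attributes.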

When you execute a query based on the index Pricing, Catalyst will search for items against the attribute Cost and will sort the results based on the attribute DiscountPercentage if more than one result matches. The selected attributes (Calories and PreparationTime) will be returned in the query for the matching items, along with the primary keys of the table and the index.
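The lookup described here can be modeled with a short plain-Python sketch. It is not the Catalyst SDK: the function `query_index` is hypothetical, the third entry is an invented item added only so that two results share the same Cost, and ascending sort order is an assumption, since this page does not state the direction.

```python
# Conceptual model only: plain Python, not the Catalyst SDK.
PRICING = [
    {"Cost": 25, "DiscountPercentage": 15, "Category": "Pizza",
     "DishName": "Chicken Overload", "Calories": 450, "PreparationTime": 30},
    {"Cost": 20, "DiscountPercentage": 10, "Category": "Sandwich",
     "DishName": "Vegan", "Calories": 300, "PreparationTime": 20},
    # Hypothetical entry, not part of the sample table above.
    {"Cost": 20, "DiscountPercentage": 5, "Category": "Salad",
     "DishName": "Greek", "Calories": 200, "PreparationTime": 10},
]


def query_index(entries, partition_key, sort_key, value):
    """Match on the index partition key, then order by the index sort key."""
    matches = [e for e in entries if e.get(partition_key) == value]
    # Ascending order is an assumption made for this sketch.
    return sorted(matches, key=lambda e: e[sort_key])


results = query_index(PRICING, "Cost", "DiscountPercentage", 20)
names = [r["DishName"] for r in results]
```

With these entries, a query for Cost 20 matches two items and orders them by DiscountPercentage, and each result carries Calories and PreparationTime along with the table and index keys.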

Last Updated 2025-05-30 16:54:59 +0530