Realtime Cloud Storage
Working with tables

  • Feb 27, 2015
  • Starting Guide

In Realtime Cloud Storage your data is kept as items in tables. When you create a table, in addition to the table name, you must specify the key schema of the table. Realtime Cloud Storage supports the following two types of key schemas:

Primary Key – The key is made of one attribute, a hash attribute. For example, a ProductCatalog table could have ProductID as its primary key. Realtime Cloud Storage builds an unordered hash index on this primary key attribute.

Primary Key + Secondary Key – The key is made of two attributes. The first attribute is the hash attribute (the primary key) and the second attribute is the range attribute (the secondary key). For example, the forum Thread table could have ForumName and Subject as its key schema, where ForumName is the hash attribute (the primary key) and Subject is the range attribute (the secondary key). Realtime Cloud Storage builds an unordered hash index on the primary key and a sorted range index on the secondary key.
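
To make the two schemas concrete, here is what an item might look like in each of those tables. The key attribute names (ProductID, ForumName, Subject) come from the examples above; the values and the extra Title and Price attributes are made up for this illustration:

  // ProductCatalog item: primary key only (hash attribute: ProductID).
  // Title and Price are illustrative attributes, not part of the key schema.
  { "ProductID": 101, "Title": "Realtime starter guide", "Price": 25 }

  // Thread item: primary key + secondary key (hash: ForumName, range: Subject).
  { "ForumName": "Realtime Cloud Storage", "Subject": "Working with tables" }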

To ensure high availability and low latency responses, Realtime Cloud Storage requires that you specify your required read and write throughput values when you create a table. Cloud Storage uses this information to reserve sufficient Amazon DynamoDB hardware resources and appropriately partitions your data over multiple servers to meet your throughput requirements. As your application data and access requirements change, you can easily increase or decrease your provisioned throughput using the Realtime Cloud Storage Web Console or the API.

After specifying the key schema, you just need to select the provision type (the required throughput) and its expected usage (more reads, more writes, or a balanced mix of reads and writes).

After you click SAVE, your table will be provisioned; until it's ready, the console will show it in the CREATING state. Allow a few seconds for the provisioning process to finish before you start using the table.

Whenever you want to use a Realtime Cloud Storage table, you must have a table reference object. Here's an example that references the ProductCatalog table using the JavaScript SDK:

  // storageRef is the storage reference obtained in the Connecting step
  var tableRef = storageRef.table("ProductCatalog");
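
If you reference a table right after creating it, you may want to confirm it has left the CREATING state before issuing requests. The following is only a sketch: the meta method and the metadata fields it returns are assumptions, so check the SDK reference for the actual status call:

  // Hypothetical readiness check; the method name and fields are assumptions.
  tableRef.meta(function(metadata) {
      if (metadata.status !== "CREATING") {
          // The table is provisioned and ready for reads and writes.
      }
  });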

Limits when increasing provisioned throughput:

You can update a table as often as necessary to increase provisioned throughput. You can increase the read capacity or write capacity for a table, subject to these conditions:

- You can call the update method to increase READ or WRITE capacity (or both), up to twice their current values;
- The new provisioned throughput settings do not take effect until the update operation is complete;
- You can call update multiple times, until you reach the desired throughput capacity for your table, as the sketch below shows.
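
For example, because each update can at most double the current values, raising a table from 10 to 40 read operations/sec takes two successive updates (10 to 20, then 20 to 40). Here is a minimal sketch using the same update call demonstrated below; the completion callback is an assumption, so confirm the exact signature in the SDK reference:

  // Step read capacity from 10 to 40 ops/sec in two updates (each at most 2x).
  // Assumption: update accepts a completion callback; check the SDK reference.
  tableRef.update({
      provisionType: Realtime.Storage.provisionType.Custom,
      throughput: { read: 20, write: 5 }    // first step: 10 -> 20 reads/sec
  }, function() {
      // The new settings only take effect once the first update completes.
      tableRef.update({
          provisionType: Realtime.Storage.provisionType.Custom,
          throughput: { read: 40, write: 5 }    // second step: 20 -> 40 reads/sec
      });
  });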

Limits when decreasing provisioned throughput:

You can reduce the provisioned throughput of a table no more than four times in a single UTC calendar day. These reductions can be any of the following operations:

- Decrease READ operations;
- Decrease WRITE operations;
- Decrease both READ and WRITE in a single request. This counts as one of your allowed reductions for the day.

In the following example, we use the Node.js SDK to update the throughput of the ProductCatalog table to 20 read operations/sec and 5 write operations/sec:

  var tableRef = storageRef.table("ProductCatalog");

  tableRef.update({
      provisionType: Realtime.Storage.provisionType.Custom,  // custom throughput values
      throughput: {
          read: 20,   // read operations per second
          write: 5    // write operations per second
      }
  });

Your application can keep reading from and writing to the table while a throughput update is being performed. There is no downtime.

A note about the performance of your tables:

Provisioned throughput is dependent on the primary key selection and the workload patterns on individual items. When storing data, Realtime Cloud Storage divides a table's items into multiple partitions, and distributes the data primarily based on the hash key element. The provisioned throughput associated with a table is also divided evenly among the partitions, with no sharing of provisioned throughput across partitions.
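
As a rough illustration of this division (the partition count is internal to the service; the numbers below are assumptions for the example):

  // Illustrative arithmetic only: partition counts are managed internally.
  var provisionedReads = 1000;                       // reads/sec provisioned for the table
  var partitions = 4;                                // hypothetical internal partition count
  var perPartition = provisionedReads / partitions;  // 250 reads/sec available per partition
  // A single hot hash key maps to one partition, so requests for it can use
  // at most ~250 of the 1000 provisioned reads/sec.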

Consequently, to achieve the full amount of request throughput you have provisioned for a table, keep your workload spread evenly across the hash key values. Distributing requests across hash key values distributes the requests across partitions.

For example, if a table has a very small number of heavily accessed hash key elements, possibly even a single very heavily used hash key element, traffic is concentrated on a small number of partitions, potentially only one. If the workload is heavily unbalanced, meaning disproportionately focused on one or a few partitions, the operations will not achieve the overall provisioned throughput level. To get the most out of Realtime Cloud Storage throughput, build tables where the hash key element has a large number of distinct values and those values are requested fairly uniformly, as close to randomly as possible.

Back to Connecting or proceed to Working with items
