Zone maps
As Redshift writes blocks for each object, it stores some metadata. An important part of the data stored in each block is the minimum and maximum values that produce maps of the region. Zone maps will dramatically speed queries by allowing Redshift to skip entire blocks while searching for information.
In the next video, our SME will explain the concept of zone maps.
The query below expects: fetch the result from a table RS_TABLE and the date will be ‘09 June 2020’.
How many ZONES are created for unsorted data, i.e., data without the SORT key?
Here, you will get three zones, which are basically three groups in which your desired result will fall.
Now, using the SORT key, how will you derive results quickly, and how many zones will be created?
Here, you will get a single zone, and the result can be easily sent back to the end-user in a short time.
Let’s summarise zone maps into the following three points:
| Goal | To run queries faster by increasing the effectiveness of zone maps and reducing I/O |
| Impact | Enables range-restricted scans to prune blocks by leveraging zone maps. |
| Zone maps | In-memory block metadata Tracks the minimum and maximum values for each block Effectively punes blocks that do not contain data for a given query |
While creating a table, you can specify one or more columns as the sort key. Amazon Redshift holds the data on the disk in sorted order. Whether or not the data is organised has a significant impact on the disk I/O, column compression and query performance. Let’s hear from Kamlesh what are the different types of SORT Keys available in Amazon Redshift and their usage.
Sort types
Now that you have understood the concept of the SORT key and its workings, let’s learn about the different types of Sort keys. They are summarised in the image given below.
Note:
Compound sort key type is the default sort key in Redshift.
While doing a regular range filtering or just filtering on one column, you need to define the column as the sort key so that Amazon Redshift can skip reading the entire data blocks for that particular column because it keeps track of the minimum and maximum column data stored on each block.
Additional Reading
Choosing sort keys
Comparing sort styles