| Criteria | Meets expectations | Does not meet expectations |
| --- | --- | --- |
| Data Ingestion (5%) | Has imported data from Amazon RDS to HDFS using Sqoop<br>Has listed the imported data inside HDFS | Could not import data from Amazon RDS to HDFS using Sqoop<br>Could not list the imported data inside HDFS |
| Reading the data from files in HDFS (5%) | Has created an input schema using StructType<br>Has read the data using that input schema and verified it with the count function | Could not create an input schema using StructType<br>Could not read the data using the input schema<br>Final count is not equal to the expected count |
| Creation of Dimension tables (20%) | Has created a dataframe for the Location dimension according to the target dimension model<br>Has cleaned and transformed the data for the Location dimension<br>Has created a dataframe for the Card Type dimension according to the target dimension model<br>Has cleaned and transformed the data for the Card Type dimension<br>Has created a dataframe for the ATM dimension according to the target dimension model<br>Has cleaned and transformed the data for the ATM dimension<br>Has created a dataframe for the Date dimension according to the target dimension model<br>Has cleaned and transformed the data for the Date dimension | Could not create a dataframe for the Location dimension according to the target dimension model (duplicate records are present; the primary key has not been created according to the target dimension model)<br>Could not create a dataframe for the Card Type dimension according to the target dimension model (duplicate records are present; the primary key has not been created according to the target dimension model)<br>Could not create a dataframe for the ATM dimension according to the target dimension model (duplicate records are present; the primary key has not been created according to the target dimension model)<br>Could not create a dataframe for the Date dimension according to the target dimension model (duplicate records are present; the primary key has not been created according to the target dimension model) |
| Creation of Transaction Fact table (10%) | Has created the stage 1 dataframe, where the original dataframe is joined with one of the dimension tables<br>Has created the stage 2 dataframe, where the original dataframe is joined with one of the dimension tables<br>Has created the stage 3 dataframe, where the original dataframe is joined with one of the dimension tables<br>Has created the stage 4 dataframe, where the original dataframe is joined with one of the dimension tables<br>Has transformed the data for this fact table | Could not create the stage 1 dataframe: the join method was not correct and unnecessary columns were selected<br>Could not create the stage 2 or stage 3 dataframe: the join method was not correct and unnecessary columns were selected<br>Could not create the stage 4 dataframe: the join method was not correct and unnecessary columns were selected<br>Could not create an appropriate primary key and transform the fact table according to the target dimension model |
| Loading dimension and fact tables to S3 bucket (5%) | Has used appropriate commands to load the various tables to an S3 bucket in CSV format | Could not load the various tables to an S3 bucket in CSV format |
| Creation of Redshift cluster (5%) | Has created a Redshift cluster | Could not create the Redshift cluster with the appropriate configuration |
| Setting up the database in the Redshift cluster and running queries to create the dimension and fact tables (5%) | Has set up the database in the Redshift cluster and run the appropriate queries for each dimension and fact table | Could not set up the database in the Redshift cluster, and the various tables could not be created |
| Loading data to the Redshift cluster from the Amazon S3 bucket (5%) | Has loaded the tables from the S3 bucket to the Redshift cluster | Could not load the tables from the S3 bucket to the Redshift cluster |
| Analytical Queries (40%) | Has written a correct solution for the 1st analytical query<br>Has written a correct solution for the 2nd analytical query<br>Has written a correct solution for the 3rd analytical query<br>Has written a correct solution for the 4th analytical query<br>Has written a correct solution for the 5th analytical query<br>Has written a correct solution for the 6th analytical query<br>Has written a correct solution for the 7th analytical query<br>Has written a correct solution for the 8th analytical query | Could not write a correct solution for the 1st query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns<br>Could not write a correct solution for the 2nd query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns<br>Could not write a correct solution for the 3rd query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns<br>Could not write a correct solution for the 4th query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns<br>Could not write a correct solution for the 5th query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns<br>Could not write a correct solution for the 6th query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns<br>Could not write a correct solution for the 7th query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns<br>Could not write a correct solution for the 8th query: the syntax, clauses, or subqueries are incorrect, or the result table has unnecessary columns |
Note
Proper documentation for the above is also very important. Please refer to the submission guidelines for more details.
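
The dimension and fact criteria above come down to two checks: each dimension holds one row per distinct value with its own primary key, and the fact table carries only those keys rather than the descriptive columns. The following is a minimal plain-Python sketch of that logic, not the expected Spark solution. The column names (`trans_id`, `card_type`, `amount`, `card_type_id`) and both helper functions are hypothetical and chosen only for illustration; in the project itself the same steps would be done with Spark DataFrame operations.

```python
from typing import Dict, List

Row = Dict[str, object]


def build_dimension(rows: List[Row], attrs: List[str], key_name: str) -> List[Row]:
    """Return one row per distinct combination of `attrs`, with a 1-based key."""
    seen: Dict[tuple, int] = {}
    dim: List[Row] = []
    for row in rows:
        natural = tuple(row[a] for a in attrs)
        if natural not in seen:  # skip duplicates so the key stays unique
            seen[natural] = len(seen) + 1
            dim.append({key_name: seen[natural], **{a: row[a] for a in attrs}})
    return dim


def attach_key(rows: List[Row], dim: List[Row], attrs: List[str], key_name: str) -> List[Row]:
    """Join `rows` to `dim` on `attrs`, keeping the key and dropping the attributes."""
    lookup = {tuple(d[a] for a in attrs): d[key_name] for d in dim}
    staged: List[Row] = []
    for row in rows:
        natural = tuple(row[a] for a in attrs)
        slim = {k: v for k, v in row.items() if k not in attrs}
        slim[key_name] = lookup[natural]
        staged.append(slim)
    return staged


# Hypothetical sample records standing in for the source transactions.
raw = [
    {"trans_id": 1, "card_type": "Visa", "amount": 100},
    {"trans_id": 2, "card_type": "MasterCard", "amount": 250},
    {"trans_id": 3, "card_type": "Visa", "amount": 75},
]

card_dim = build_dimension(raw, ["card_type"], "card_type_id")
stage1 = attach_key(raw, card_dim, ["card_type"], "card_type_id")
```

Here `card_dim` contains two rows even though "Visa" appears twice in the input, and each row of `stage1` keeps `trans_id`, `amount`, and `card_type_id` but no longer has `card_type`. Repeating `attach_key` once per dimension mirrors the stage 1 to stage 4 progression described in the fact table criterion.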