IKH

AWS EMR

Session Overview

In the previous session, you learnt about cloud computing and understood how it plays a significant role in various industries today.

In this session

The following topics will be covered:

  • In the segment ‘Introduction to AWS’, you will get a brief overview of Amazon Web Services (AWS). You will look at the services provided by AWS and those that we will be using in this program.
  • Once you are familiar with some common AWS services, you will learn the steps to access them using the Nuvepro dashboard in the following segment, ‘Introduction to Nuvepro’.
  • In the next segment, ‘Virtual Machine on Cloud-EC2’, you will learn the steps to launch an EC2 t2.micro type instance. This instance will give you hands-on experience of the EC2 environment. Because the t2.micro instance is created for practice purposes only, it will be terminated afterwards, and the steps to terminate it will be discussed in the same segment.
  • In the next segment, ‘Amazon EMR’, you will learn about Amazon EMR and its features. You will learn about its use cases and benefits as well as the different big data tools that are available and can be installed on EMR instances during setup. You will also learn about EMR Notebooks, which can be used to create Jupyter Notebooks that run on EMR clusters.
  • In the following segment ‘Setting Up an Amazon EMR Instance’, you will learn how the EMR dashboard looks and then proceed to set up an EMR cluster. You will also learn how to set up an EMR Notebook and link it with the EMR cluster that you have created. Then, you will understand how you can configure the YARN parameters for your EMR cluster, which is important for the optimal performance of your EMR cluster. Finally, you will learn how you can clone your EMR cluster.
  • In the next segment, we will summarise the steps for logging in to the EMR instance. You will be using these steps throughout the program. You will also learn how to transfer files from your local machine to an EMR cluster, and vice versa.
  • Once you are done launching your instance and logging in to it, you can try the basic shell commands in the instance. The commands are provided in the segment ‘Practising Linux Commands’ and will be used throughout the program, so it is highly recommended that you try each of them yourself.
  • Finally, in the segment ‘EMR – Instance Termination’, you will learn how to terminate an EMR instance after usage. It is extremely important that you follow these steps after every coding session on Amazon EMR, as this service is expensive and can adversely affect the allocated budget of your Nuvepro account if left unchecked.

This session provides a step-by-step guide to setting up an EMR instance, and you will be following similar steps in future modules where different configurations of an EMR cluster may be required. So, follow all the steps carefully to avoid any inconvenience in the upcoming modules. For any issues faced during the setup, please feel free to raise a ticket on the DF.
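
In this session, clusters are set up through the EMR dashboard. For readers curious how the same choices (applications, instance type, YARN parameters) look when written down as data, the following is a minimal sketch, not the method used in the segments. It only builds a dictionary shaped like the arguments of the `run_job_flow` call in the AWS SDK for Python (boto3); nothing is sent to AWS. The function name `build_emr_request`, the cluster name, the release label, the key name, the instance type and the YARN value are all illustrative assumptions, and the two role names are the AWS default EMR roles, which may differ in a Nuvepro account.

```python
import json


def build_emr_request(name, release_label, key_name, yarn_properties,
                      instance_type="m4.large", instance_count=1):
    """Return keyword arguments shaped for boto3's EMR run_job_flow call.

    Hypothetical helper: every concrete value is a placeholder supplied
    by the caller, not a setting prescribed by this session.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # Big data tools to install on the cluster during setup.
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            "Ec2KeyName": key_name,
            # Keep the cluster running after setup so you can log in.
            # This is also why it must be terminated explicitly later.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # YARN parameters go under the 'yarn-site' classification;
        # property values are passed as strings.
        "Configurations": [
            {
                "Classification": "yarn-site",
                "Properties": {k: str(v) for k, v in yarn_properties.items()},
            }
        ],
        # Assumed AWS default role names; your account may use others.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


if __name__ == "__main__":
    request = build_emr_request(
        name="practice-cluster",                 # placeholder
        release_label="emr-5.29.0",              # placeholder release
        key_name="my-key-pair",                  # placeholder key pair name
        yarn_properties={"yarn.nodemanager.resource.memory-mb": 4096},
    )
    print(json.dumps(request, indent=2))
```

If boto3 is installed and credentials are configured, a dictionary like this could be passed as `boto3.client("emr").run_job_flow(**request)`, and the cluster could later be shut down with `terminate_job_flows(JobFlowIds=[...])`. Whichever route you use, terminating the cluster after each session remains the step that protects your budget.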