Introduction to Hadoop and MapReduce Programming

Introduction to Hadoop

Module Introduction

Welcome to the module on ‘Hadoop and MapReduce Programming’.

In the previous module on ‘Introduction to Cloud Computing and AWS Setup’, you learnt about cloud computing, the different types of cloud deployment models and their use cases. You also learnt about AWS EC2 and how to set up an AWS EC2 instance.

In this module

In this module, you will learn about distributed systems and, more specifically, Hadoop, which is an open-source framework for storing and processing big data on clusters of commodity hardware. First, you will learn about Hadoop, its features and its components, and understand why it is used. In the next session, you will learn in depth about the Hadoop Distributed File System (HDFS), which is the storage component of Hadoop.

You will also learn how to navigate the Hadoop cluster and how to modify files. You will then learn about the read and write operations in HDFS and their features. Next, you will learn about MapReduce, which is the programming framework of Hadoop. You will first understand the workings of MapReduce and then learn how to write MapReduce programs. You will also learn about the Combiner and the Partitioner, which are useful components of MapReduce programs. Finally, you will learn how MapReduce programs are executed in Hadoop.

Guidelines for this module

As this module contains highly detailed content on numerous concepts of Hadoop and MapReduce programming, it is advised that you start the module early and attempt the various in-segment questions as well as the practice questions for MapReduce at the end of the module. Also, please read the platform text before attempting the in-segment questions. The videos in this module are presentation-based. The presentations used in each session are provided in the corresponding segments labelled ‘Session Summary’.

Guidelines for in-segment & graded questions

There will be a separate session for graded questions. The questions in the other sessions will not be graded. Each graded question in this module carries 10 marks for a correct answer and 0 for an incorrect answer. Each graded question allows only one attempt, whereas non-graded questions may allow one or two attempts, depending on the question type and the number of options.