I. Introduction

A. Hadoop history, concepts
B. Ecosystem
C. Distributions
D. High level architecture
E. Hadoop challenges (hardware / software)

Hands on Session: Preparing to install Hadoop

II. Planning and Installation

A. Selecting software, Hadoop distributions
B. Sizing the cluster, planning for growth
C. Rack topology
D. Installation of core Hadoop and ecosystem tools
E. Directory structure, logs

Hands on Session: Cluster installation

III. HDFS

A. Concepts (horizontal scaling, block replication, data locality, rack awareness)
B. Nodes and daemons (NameNode, Secondary NameNode, HA Standby NameNode, DataNode)
C. Health monitoring
D. Command-line and browser-based administration
E. Adding storage, replacing defective drives, commissioning / decommissioning of DataNodes

Hands on Session: Getting familiar with HDFS commands

IV. MapReduce2

A. MapReduce1
B. Terminology and data flow (Map – Shuffle – Reduce)
C. YARN Architecture
D. MapReduce Essential Configuration

Hands on Session: MapReduce UI walkthrough

V. Schedulers

A. Working with Jobs
B. Scheduling Concepts
C. FIFO Scheduler
D. Fair Scheduler
E. Capacity Scheduler and its configuration

Hands on Session: Working with Schedulers

VI. Data Ingestion & Security

A. Flume for logs and other data ingestion into HDFS
B. Sqoop for importing from SQL databases to HDFS, as well as exporting back to SQL
C. Overview of Hive
D. Copying data between clusters (distcp)
E. Ranger installation and configuration for HDFS, Hive, HBase

Hands on Session: Set up and configure Flume, Sqoop, Ranger

Installing Hadoop with Ambari Lab Tasks
A. Install Ambari, HDP (optional)