Hadoop Classroom Training

The Big Data and Hadoop Certification course is designed to prepare you for a job in the Big Data field. It covers the essential Hadoop 2.7 skills and gives you practical experience through long-term, real-world projects. You'll complete your hands-on project work on Hadoop 2.7 in CloudLab, a cloud-based Hadoop lab environment.

Schedules for Hadoop Classroom Training

S No    Timings                 Demo Date    Start Date
1       11:15 AM to 12:15 PM    Apr 24th     Apr 25th

Trainer : Mr Karthik (8+ Yrs Exp)

Course Fee: INR 12,000/-

Duration: 6 Weeks (Mon - Fri) / 8 Weekends (Sat - Sun)


✔ Daily Tasks ✔ Weekly Interviews
✔ Real-time Project ✔ Resume Guidance
✔ Certification Guidance ✔ Placement Services

Hadoop Training Course Contents:

  • What is Cloud Computing
  • What is Grid Computing
  • What is Virtualization
  • How the above three are related to each other
  • What is Big Data
  • Introduction to Analytics and the need for big data analytics
  • Hadoop Solutions - Big Picture
  • Hadoop distributions
  • Comparing Hadoop Vs. Traditional systems
  • Volunteer Computing
  • Data Retrieval - Random Access Vs. Sequential Access
  • NoSQL Databases


  • Problems with traditional large-scale systems
  • Data Storage literature survey
  • Data Processing literature survey
  • Network Constraints
  • Requirements for a new approach


  • What is Hadoop?
  • The Hadoop Distributed File System
  • How MapReduce Works
  • Anatomy of a Hadoop Cluster


  • Master Daemons
      • Name node
      • Job tracker
      • Secondary name node
  • Slave Daemons
      • Data node
      • Task tracker


  • Blocks and Splits
  • Input Splits
  • HDFS Splits
  • Data Replication
  • Hadoop Rack Awareness
  • Data high availability
  • Data Integrity
  • Cluster architecture and block placement
  • Accessing HDFS
      • Java Approach
      • CLI Approach
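
The block and replication topics above come down to simple arithmetic. The following is a minimal sketch, assuming the Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; the function name `hdfs_footprint` is hypothetical and chosen only for illustration.

```python
import math

BLOCK_SIZE_MB = 128  # assumption: Hadoop 2.x default block size
REPLICATION = 3      # assumption: default replication factor


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (block count, block replicas, raw storage in MB) for one file."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    replicas = blocks * replication
    # The last block only occupies the bytes it holds, so raw storage is
    # file size times replication, not block count times block size.
    raw_storage_mb = file_size_mb * replication
    return blocks, replicas, raw_storage_mb


blocks, replicas, raw_mb = hdfs_footprint(1000)
print(blocks, replicas, raw_mb)
```

Under these assumptions a 1000 MB file is stored as 8 blocks, each kept on 3 data nodes, and occupies about 3000 MB of raw cluster storage.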


  • Developing MapReduce Programs in
      • Local Mode
          • Running without HDFS and MapReduce
      • Pseudo-distributed Mode
          • Running all daemons in a single node
      • Fully distributed Mode
          • Running daemons on dedicated nodes

CHAPTER 7 : HADOOP ADMINISTRATIVE TASKS - Setting up Hadoop clusters with Apache, Cloudera and Hortonworks

  • Install and configure Apache Hadoop
  • Make a Hadoop cluster on a single laptop/desktop (Pseudo-distributed Mode)
  • Install and configure Cloudera Hadoop distribution in fully distributed mode
  • Install and configure Hortonworks Hadoop distribution in fully distributed mode
  • Monitoring the cluster
  • Getting used to the management consoles of Cloudera and Hortonworks
  • Name Node in Safe mode
  • Meta Data Backup
  • Integrating Kerberos security in Hadoop
  • Ganglia and Nagios Cluster monitoring
  • Benchmarking the Cluster
  • Commissioning/Decommissioning Nodes.

CHAPTER 8 : HADOOP DEVELOPER TASKS - Writing a Map Reduce Program

  • Examining a Sample Map Reduce Program
  • With Several Examples
  • Basic API Concepts
  • The Driver Code
  • The Mapper
  • The Reducer
  • Hadoop's Streaming API
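
The driver, mapper and reducer roles listed above can be illustrated without a cluster. The following is a minimal in-process sketch in plain Python, not Hadoop API code; the function names `mapper`, `reducer` and `run_job` are hypothetical, and the sort stands in for the shuffle that Hadoop performs between the map and reduce phases.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce step: sum all counts that share the same key."""
    return word, sum(counts)


def run_job(lines):
    """Driver: run the map step, sort by key (the shuffle), then reduce."""
    mapped = [pair for line in lines for pair in mapper(line)]
    mapped.sort(key=itemgetter(0))
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(mapped, key=itemgetter(0))
    )


result = run_job(["big data on hadoop", "hadoop stores big files"])
print(result)
```

With Hadoop's Streaming API the mapper and reducer would typically be separate scripts that read lines from standard input and write tab-separated key/value pairs to standard output; the sketch keeps everything in one process so it can be run directly.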

CHAPTER 9 : Performing several Hadoop Jobs

  • The configure and close Methods
  • Sequence Files
  • Record Reader
  • Record Writer
  • Role of Reporter
  • Output Collector
  • Processing video files and audio files
  • Processing image files
  • Processing XML files
  • Processing Zip files
  • Counters
  • Directly Accessing HDFS
  • Tool Runner
  • Using The Distributed Cache.

CHAPTER 10 : Common Map Reduce Algorithms

  • Sorting and Searching
  • Indexing
  • Classification/Machine Learning
  • Term Frequency - Inverse Document Frequency
  • Word Co-Occurrence
  • Hands-On Exercise: Creating an Inverted Index
  • Identify Mapper
  • Identify Reducer
  • Exploring well-known problems using Map Reduce applications.
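
As a rough illustration of the inverted index exercise listed above, the following sketch mimics the map and reduce steps in plain Python. It assumes whitespace tokenization and in-memory documents; the names `map_document` and `build_inverted_index` are hypothetical and not part of any Hadoop API.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map step: emit (term, doc_id) for every term in the document."""
    for term in text.lower().split():
        yield term, doc_id


def build_inverted_index(documents):
    """Group mapped pairs by term, then reduce each group to a sorted posting list."""
    grouped = defaultdict(set)
    for doc_id, text in documents.items():
        for term, emitted_id in map_document(doc_id, text):
            grouped[term].add(emitted_id)
    return {term: sorted(doc_ids) for term, doc_ids in grouped.items()}


docs = {"d1": "hadoop stores data", "d2": "hive queries data on hadoop"}
index = build_inverted_index(docs)
print(index["hadoop"], index["hive"])
```

Each term maps to the list of documents that contain it, so a lookup such as `index["hive"]` returns only the matching document IDs.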

CHAPTER 11 : Debugging Map Reduce Programs

  • Testing with MR Unit
  • Logging
  • Other Debugging Strategies.

CHAPTER 12 : Advanced Map Reduce Programming

  • A Recap of the Map Reduce Flow
  • Custom Writables and Writable Comparables
  • The Secondary Sort
  • Creating Input Formats and Output Formats
  • Pipelining Jobs With Oozie
  • Map-Side Joins
  • Reduce-Side Joins.

CHAPTER 13 : Monitoring and debugging on a Production Cluster

  • Counters
  • Skipping Bad Records
  • Rerunning failed tasks with Isolation Runner

CHAPTER 14 : Tuning for Performance

  • Reducing network traffic with combiner
  • Reducing the amount of input data
  • Using Compression
  • Running with speculative execution
  • Refactoring code and rewriting algorithms
  • Parameters affecting Performance
  • Other Performance Aspects
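
The effect of a combiner on network traffic can be seen by counting the records that would cross the shuffle. The following is a minimal sketch in plain Python, assuming a word count job over two input splits; the names `map_split`, `combine` and `reduce_all` are hypothetical.

```python
from collections import Counter


def map_split(lines):
    """Map step for one input split: emit (word, 1) per word."""
    return [(word, 1) for line in lines for word in line.split()]


def combine(pairs):
    """Combiner: pre-sum counts on the map side, before the shuffle."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())


def reduce_all(pairs):
    """Reduce step: sum the counts for each word across all splits."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


splits = [["a b a a", "b a"], ["a a b"]]
without_combiner = [p for split in splits for p in map_split(split)]
with_combiner = [p for split in splits for p in combine(map_split(split))]
print(len(without_combiner), len(with_combiner))
```

In this toy input the combiner cuts the shuffled records from 9 to 4 while the reducer output stays the same. This works because addition is associative and commutative; a combiner is only safe for reduce functions with that property.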

CHAPTER 15 : Hadoop Ecosystem- Hive

  • Hive concepts
  • Hive architecture
  • Install and configure hive on cluster
  • Create a database and access it from the console
  • Buckets, Partitions
  • Joins in Hive
  • Inner joins
  • Outer joins
  • Hive UDF
  • Hive UDAF
  • Hive UDTF
  • Develop and run sample applications in Java to access hive
  • Load Data into Hive and process it using Hive

CHAPTER 16 : Pig

  • Pig basics
  • Install and configure PIG on a cluster
  • PIG Vs MapReduce and SQL
  • PIG Vs Hive
  • Write sample Pig Latin scripts
  • Modes of running PIG
  • Running in Grunt shell
  • Programming in Eclipse
  • Running as Java program
  • PIG UDFs
  • PIG Macros
  • Load data into Pig and process it using Pig

CHAPTER 17 : Sqoop

  • Install and configure Sqoop on cluster
  • Connecting to RDBMS
  • Installing MySQL
  • Import data from Oracle/MySQL to Hive
  • Export data to Oracle/MySQL
  • Internal mechanism of import/export
  • Import millions of records into HDFS from RDBMS using Sqoop

CHAPTER 18 : HBase

  • HBase concepts
  • HBase architecture
  • Region server architecture
  • File storage architecture
  • HBase basics
  • Column access
  • Scans
  • HBase Use Cases
  • Install and configure HBase on cluster
  • Create database, Develop and run sample applications
  • Access data stored in HBase using clients like Java
  • MapReduce client to access the HBase data
  • HBase and Hive Integration
  • HBase admin tasks
  • Defining schema and basic operations

CHAPTER 19 : Cassandra

  • Cassandra core concepts
  • Install and configure Cassandra on cluster
  • Create a database and tables, and access them from the console
  • Developing applications to access data in Cassandra through Java
  • Install and configure OpsCenter to access Cassandra data using a browser

CHAPTER 20 : Oozie

  • Oozie architecture
  • XML file specifications
  • Install and configure Oozie on cluster
  • Specifying workflows
  • Action nodes
  • Control nodes
  • Oozie job coordinator
  • Accessing Oozie jobs from the command line and the web console
  • Create sample workflows in Oozie and run them on a cluster

CHAPTER 21 : ZooKeeper, Flume, Chukwa, Avro, Scribe, Thrift, HCatalog

  • Flume and Chukwa Concepts
  • Use cases of Thrift, Avro and Scribe
  • Install and configure Flume on cluster
  • Create a sample application to capture logs from Apache using Flume


  • Analytics and big data analytics
  • Commonly used analytics algorithms
  • Analytics tools like R and Weka
  • R language basics
  • Mahout


  • Name Node High Availability
  • Name Node federation
  • Fencing
  • YARN

All classes are instructor-led and LIVE. They are completely practical and real-time, with study material, session notes, tasks and a 24x7 LIVE server.

Hadoop Training - Highlights :

  • Completely Practical and Real-time
  • Suitable for Starters + Working Professionals
  • Session wise Handouts and Tasks + Solutions
  • TWO Real-time Case Studies, One Project
  • Weekly Mock Interviews, Certifications
  • Certification & Interview Guidance
  • Learning the fundamentals of Hadoop and YARN and writing applications with them
  • Setting up pseudo-node and multi-node clusters on Amazon EC2
  • HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, Flume, ZooKeeper and HBase
  • Configuring ETL tools like Pentaho/Talend to work with MapReduce, Hive, Pig, etc.
  • Testing Hadoop applications using MRUnit and other automation tools
  • Practicing real-life projects using Hadoop and Apache Spark

Job-Oriented Real-time Training @ SQL School Training Institute - Trainer: Mr. Sai Phanindra T