• support@conveytechlabs.com

Hadoop Training Plan and Contents

Duration: 40 – 45 hrs
Type: Online

The demand for Big Data Hadoop professionals is increasing across the globe, and it is a great opportunity for IT professionals to move into one of the most sought-after technologies in today's market. ExcelR offers a Big Data & Hadoop course in Bangalore, with instructor-led live online sessions delivered by industry experts who are considered among the best trainers in the field. The training is studded with practical assignments, case studies, and project work, which ensures hands-on experience for the participants. The program is meticulously designed to make you a professional Big Data Hadoop developer and to help you crack a job in the Big Data space.

    • Duration: 40 – 45 hrs
    • Timings: Week days 1-2 Hours per day (or) Weekends: 2-3 Hours per day
    • Method: Online/Classroom Training
    • Study Material: Soft Copy
    Week 1:
    Day 1:
    • Understanding Big Data.
      • What is Big Data?
      • Big Data characteristics.
    • Hadoop Distributions:
      • Hortonworks
      • Cloudera
      • Pivotal HD
    • Introduction to Apache Hadoop.
      • Flavors of Hadoop: BigInsights, Google BigQuery, etc.
    • Hadoop Eco-system components: Introduction
      • MapReduce
      • HDFS
      • Apache Pig
      • Apache Hive
      • HBASE
      • Apache Oozie
      • FLUME
      • SQOOP
      • Apache Mahout
      • KIJI
      • LUCENE
      • SOLR
      • Impala
      • Chukwa
      • Shark
    Day 2:
    • Understanding Hadoop Cluster
    • Hadoop Core-Components.
    • HDFS Architecture
      • Why 64MB?
      • Why Block?
      • Why replication factor 3?
    • Discuss NameNode and DataNode.
    • Discuss JobTracker and TaskTracker.
    • Typical workflow of Hadoop application
    • Rack Awareness.
      • Network Topology.
      • Assignment of Blocks to Racks and Nodes.
      • Block Reports
      • Heart Beat
      • Block Management Service.
    • Anatomy of File Write.
    • Anatomy of File Read.
    • Heart Beats and Block Reports
    • Discuss Secondary NameNode and Usage of FsImage and Edits log.
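The block-size and replication questions above can be made concrete with a little arithmetic. The following is a hedged Python sketch (not Hadoop code); 64 MB and replication factor 3 are the classic Hadoop 1.x defaults discussed in this day's topics.

```python
# Sketch: how HDFS (classic 64 MB blocks, replication factor 3) would
# lay out a 1 GB file. Function names are illustrative, not a Hadoop API.
import math

BLOCK_SIZE = 64 * 1024 * 1024      # 64 MB, the Hadoop 1.x default
REPLICATION = 3                    # default replication factor

def hdfs_layout(file_size_bytes):
    """Return (number of blocks, total replicated copies stored)."""
    blocks = math.ceil(file_size_bytes / BLOCK_SIZE)
    return blocks, blocks * REPLICATION

one_gb = 1024 * 1024 * 1024
blocks, copies = hdfs_layout(one_gb)
print(blocks, copies)   # 16 blocks, 48 stored copies across the cluster
```

This is why block size matters: fewer, larger blocks mean less NameNode metadata and fewer map tasks per file.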
    Day 3:
    • Map Reduce Overview
    • Best Practices to setup Hadoop cluster
    • Cluster Configuration
      • Core-default.xml
      • Hdfs-default.xml
      • Mapred-default.xml
      • Hadoop-env.sh
      • Slaves
      • Masters
    • Need of *-site.xml
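The *-site.xml files hold site-specific overrides of the shipped *-default.xml values. As a hedged illustration (the host name and port are placeholders, not values from this course), a minimal core-site.xml might look like:

```xml
<?xml version="1.0"?>
<!-- core-site.xml: site-specific overrides of core-default.xml -->
<configuration>
  <property>
    <!-- fs.defaultFS in Hadoop 2.x; older releases used fs.default.name -->
    <name>fs.defaultFS</name>
    <!-- placeholder NameNode host and port -->
    <value>hdfs://namenode-host:9000</value>
  </property>
</configuration>
```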
    Day 4:
    • Map Reduce Framework
    • Why Map Reduce?
    • Use cases where Map Reduce is used.
    • Hello world program with Weather Use Case.
      • Setup environment for the programs.
      • Possible ways of writing a MapReduce program, with sample code; compare the approaches and discuss the best one.
      • Configured, Tool, GenericOptionsParser, and queue usage.
      • Demo for calculating maximum temperature and Minimum temperature.
    • Limitations of traditional way of solving word count with large dataset.
    • Map Reduce way of solving the problem.
    • Complete overview of MapReduce.
    • Split Size
    • Combiners
    • Multi Reducers
    • Parts of Map Reduce
    • Algorithms
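The map/shuffle/reduce flow covered above can be sketched in plain Python for the weather use case. This is a conceptual analogue only, not the Hadoop Java API, and the sample records are invented for illustration.

```python
# Conceptual sketch of the max-temperature MapReduce job discussed above.
# Pure Python stand-in, not Hadoop code; the input lines are invented.
from collections import defaultdict

def mapper(line):
    """Map phase: emit (year, temperature) pairs from a 'year,temp' record."""
    year, temp = line.split(",")
    yield year, int(temp)

def reducer(year, temps):
    """Reduce phase: keep the maximum temperature seen for the year."""
    return year, max(temps)

def run_job(lines):
    # Shuffle phase: group all mapper outputs by key.
    grouped = defaultdict(list)
    for line in lines:
        for year, temp in mapper(line):
            grouped[year].append(temp)
    # Reduce phase: one reducer call per distinct key.
    return dict(reducer(y, ts) for y, ts in sorted(grouped.items()))

records = ["1950,22", "1950,31", "1951,28", "1951,17"]
print(run_job(records))   # {'1950': 31, '1951': 28}
```

A combiner for this job would apply the same `max` locally on each mapper's output before the shuffle, shrinking the data sent across the network.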
    Day 5:
    • Apache Hadoop – Single Node Installation Demo
    • Apache Hadoop – Multi Node Installation Demo
    • NameNode format.
    • Add nodes dynamically to a cluster with Demo
    • Remove nodes dynamically from a cluster with Demo.
    • Safe Mode.
    • Hadoop cluster modes.
      • Standalone Mode
      • Pseudo-distributed Mode
      • Fully distributed Mode
    Week 2:
    Day 1:
    • Revision
    • HDFS Practicals
    • Map Reduce Anatomy
      • Job Submission.
      • Job Initialization.
      • Task Assignments.
      • Task Execution.
    • Schedulers
    • Quiz
    Day 2:
    • Map Reduce Failure Scenarios
    • Speculative Execution
    • Sequence File
    • Input File Formats
    • Output File Formats
    • Writable DataTypes
    • Custom Input Formats
    • Custom keys, Values usage of writables.
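A Hadoop Writable is essentially a compact binary serialization contract for keys and values. As a hedged Python analogue (not the real org.apache.hadoop.io API), the idea behind something like IntWritable plus Text can be sketched with struct:

```python
# Sketch of the Writable idea: fixed binary serialization of a key/value
# pair, loosely analogous to IntWritable + Text. Not the Hadoop API.
import struct

def write_record(key: int, value: str) -> bytes:
    """Serialize an (int key, utf-8 string value) record."""
    data = value.encode("utf-8")
    # 4-byte big-endian key, 4-byte value length, then the value bytes.
    return struct.pack(">ii", key, len(data)) + data

def read_record(buf: bytes):
    """Deserialize one record written by write_record."""
    key, length = struct.unpack(">ii", buf[:8])
    return key, buf[8:8 + length].decode("utf-8")

rec = write_record(42, "hadoop")
print(read_record(rec))   # (42, 'hadoop')
```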
    Day 3:
    • Walk through the installation process using Cloudera Manager.
    • Show the sample example list for the installation.
    • Demo on TeraGen, WordCount, inverted index examples…
    • Debugging Map Reduce Programs
    Day 4:
    • MapReduce Advanced Concepts
    • Partitioning and Custom Partitioner
    • Joins
    • Multi outputs
    • Counters
    • MR unit testcases
    • MR Design patterns
    • Distributed Cache
      • Command line implementation
    • MapReduce API implementation
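Partitioning decides which reducer receives each key; Hadoop's default HashPartitioner takes the key's hash modulo the reducer count. A minimal Python sketch of the same idea (illustrative only, not the Hadoop API):

```python
# Sketch of hash partitioning: route each key to one of N reducers,
# mirroring the behavior of Hadoop's default HashPartitioner.
from collections import defaultdict
import zlib

def partition(key: str, num_reducers: int) -> int:
    """Stable hash of the key, folded into the reducer range."""
    # zlib.crc32 is deterministic across runs, unlike Python's hash().
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def route(pairs, num_reducers=3):
    """Group (key, value) pairs by the reducer that will receive them."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return dict(buckets)

routed = route([("alpha", 1), ("beta", 2), ("alpha", 3)])
# Every occurrence of the same key lands in the same reducer's bucket.
print(all(partition(k, 3) == b for b, kvs in routed.items() for k, _ in kvs))  # True
```

A custom partitioner replaces the hash with domain logic, e.g. routing by the first letter of the key, while keeping the same key-to-one-reducer guarantee.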
    Day 5:
    • MapReduce advanced concepts: examples.
    • Introduction to course Project.
    Week 3:
    Day 1:
    • Data loading techniques.
      • Hadoop Copy commands
        • put, get, copyFromLocal, copyToLocal, mv, chmod, rmr, rmr -skipTrash, distcp, ls, lsr, df, du, cp, moveFromLocal, moveToLocal, text, touchz, tail, mkdir, help.
    • Demo for Hadoop Copy Commands
    • Sqoop Theory
    • Demo for Sqoop.
    Day 2:
    • Introduction to Apache Pig.
    • Atom
    • DataTypes
      • Complex
        • Bag
        • Tuple
    • Dump vs. Store
    • Operators:
      • Load
      • Store
      • Dump
      • Distinct
      • Group
      • CoGroup
      • Join
      • Stream
      • Foreach Generate
      • Limit
      • ORDER
      • CROSS
      • UNION
      • SPLIT
      • Sampling
    • Pig store schema.
    • Pig built in operators
    • Pig use cases.
    • Why go for Pig when Map Reduce is there?
    • Introduction to skew Join.
    • Why was Pig created?
    • The need for Pig.
    • Map
    • Integer
    • Float
    • Double
    • ByteArray
    • CharArray
    • Diagnostic Operators
      • Describe
      • Explain
      • Illustrate
      • Filter Function
      • Eval Function
      • Macros
      • Demo
    • Storage Handlers.
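Pig's GROUP and FOREACH … GENERATE operators can be mimicked in plain Python to show what they do. This is a conceptual analogue only; the relation and field names are invented, and it is not Pig Latin itself.

```python
# Conceptual analogue of the Pig Latin:
#   grouped = GROUP sales BY item;
#   totals  = FOREACH grouped GENERATE group, SUM(sales.amount);
# Plain Python with invented data; not Pig itself.
from collections import defaultdict

sales = [("apple", 3), ("pear", 5), ("apple", 4)]   # (item, amount) tuples

# GROUP sales BY item -> a bag of amounts per group key
grouped = defaultdict(list)
for item, amount in sales:
    grouped[item].append(amount)

# FOREACH grouped GENERATE group, SUM(...)
totals = {item: sum(amounts) for item, amounts in grouped.items()}
print(totals)   # {'apple': 7, 'pear': 5}
```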
    Day 3:
    • Pig Practicals and Usecases.
    • Demo with schema.
    • Demo without schema.
    Day 4:
    • Hive Background.
    • What is Hive?
    • Pig vs. Hive
    • Where to Use Hive?
    • Hive Architecture
    • Metastore
    • Hive execution modes.
    • External, Managed, Native, and Non-native tables.
    • Partitions:
      • Dynamic Partitions
      • Static Partitions
    • Buckets
    • Hive DataModel
    • Hive DataTypes
      • Primitive
      • Complex
    • Queries:
      • Create Managed Table
      • Load Data
      • Insert overwrite table
      • Insert into Local directory.
      • Insert Overwrite table select.
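With dynamic partitioning, Hive picks the partition directory from each row's partition-column value at insert time. A hedged Python sketch of that routing follows; the directory names mirror Hive's key=value layout convention, and the data is invented.

```python
# Sketch of Hive-style dynamic partitioning: rows are routed into
# key=value directories based on a partition column. Invented data;
# not the Hive API.
from collections import defaultdict

rows = [
    {"id": 1, "country": "IN"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": "IN"},
]

partitions = defaultdict(list)
for row in rows:
    # Hive lays partitions out as .../table/country=IN/, country=US/, ...
    partitions[f"country={row['country']}"].append(row["id"])

print(dict(partitions))   # {'country=IN': [1, 3], 'country=US': [2]}
```

Static partitioning, by contrast, names the target partition explicitly in the INSERT statement instead of deriving it per row.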
    Day 5:
    • Hive Practicals
    • Joins
      • Inner Joins
      • Outer Joins
      • Skew Joins
    • Multi-table Inserts
    • Multiple files, directories, table inserts.
    • View
    • Index
    • UDF
    • UDAF
    Week 4:
    Day 1:
    • Introduction to NoSQL Databases.
    • The NoSQL Landscape
    • Introduction to HBase
    • HBase vs RDBMS
    • Create a table in HBase using the HBase shell
    • Where to use HBase?
    • Where not to use HBase?
    • Write files to HBase.
    • Major components of HBase:
      • HBase Master
      • HBase RegionServer
      • HBase Client
    Day 2:
    • HBase Practicals
    • HBase -ROOT- catalog table
    • CAP Theorem
    • Compaction
    • Sharding
    • Sparse Datastore.
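"Sparse datastore" means each row stores only the cells it actually has, addressed by row key and column family:qualifier. A minimal Python analogue of that HBase-style layout (table and column names are invented; this is not the HBase client API):

```python
# Sketch of an HBase-like sparse store: nested dicts keyed by row key
# and 'family:qualifier'; absent cells simply don't exist. Not the real
# HBase client API; table and column names are invented.
table = {}

def put(row, column, value):
    """Write one cell; rows are created lazily, so the store stays sparse."""
    table.setdefault(row, {})[column] = value

def get(row, column, default=None):
    """Read one cell, or default if the row/cell was never written."""
    return table.get(row, {}).get(column, default)

put("user#1", "info:name", "asha")
put("user#2", "info:email", "asha@example.com")  # different columns per row
print(get("user#1", "info:name"), get("user#1", "info:email"))  # asha None
```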
    Day 3:
    • YARN
    • Oozie
    • Flume
    • Demos
    Day 4:
    • Cassandra Architecture
    • Big Table and Dynamo
    • Distributed Hash Table, P2P & Fault Tolerant
    • Data Modelling
    • Column Families
    • Installation Demo on Cassandra
    • Spark and Scala
    • Practicals
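The distributed hash table behind Cassandra places nodes at positions on a hash ring, and each key belongs to the first node clockwise from its hash. A hedged Python sketch of that idea (node names are invented; this is not the Cassandra API):

```python
# Sketch of the hash-ring idea behind Cassandra's distributed hash table:
# nodes own ring positions, and a key goes to the first node at or after
# its token. Node names are invented; not the Cassandra API.
import bisect
import hashlib

def token(s: str) -> int:
    """Deterministic ring position (md5-based, as classic Cassandra used)."""
    return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.tokens = sorted((token(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        """First node clockwise from the key's token, wrapping around."""
        i = bisect.bisect(self.tokens, (token(key), ""))
        return self.tokens[i % len(self.tokens)][1]

ring = Ring(["node-a", "node-b", "node-c"])
# The same key always maps to the same node, so no central lookup is needed.
print(ring.owner("user:42") == ring.owner("user:42"))   # True
```

Fault tolerance comes from also replicating each key to the next nodes on the ring, so one node's failure does not lose data.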
    Day 5:
    • Real time Project Analysis
    • Design
    • Implementation
    • Execution
    • Debugging
    • Optimization Techniques
    • Which one to use where
    • Career-oriented training.
    • One-to-one live interaction with the trainer.
    • End-to-end explanation of a demo project.
    • Interview guidance with resume preparation.
    • Support from the trainer through e-mail.