The Big Data Hadoop Administrator online training course from Layman Learning Courses prepares you for the role of Hadoop administrator through hands-on implementation of real-time Big Data projects.
Module 1: HADOOP CLUSTER ADMINISTRATION
Learning Objectives: In this module, you will learn what Big Data and Apache Hadoop are, how Hadoop solves Big Data problems, the Hadoop cluster architecture, the MapReduce framework, Hadoop data loading techniques, and the role of a Hadoop cluster administrator.
- Introduction to Big Data
- Use cases where Big Data is used
- Introduction to Hadoop framework
- Companies using Hadoop
- HDFS File system
- Hadoop Architecture
- MapReduce Framework
- A typical Hadoop Cluster
- Hadoop Cluster Administrator: Roles and Responsibilities
- Current Job Market
Module 2: HADOOP ARCHITECTURE AND CLUSTER SETUP
Learning Objectives: After this module, you will understand the Hadoop server roles, such as NameNode and DataNode, and MapReduce data processing. You will also understand Hadoop 1.0 cluster setup and configuration, setting up Hadoop clients using Hadoop 1.0, and the important Hadoop configuration files and parameters.
- Hadoop server roles and their usage
- Hadoop Installation and Initial Configuration
- Understand NameNode and DataNode Communication Channels
- Set up a Single-Node Cluster
- NameNode Metadata Details
- Set up a Multi-Node Cluster
- Deploying Hadoop in pseudo-distributed mode
- Set up Passphraseless SSH Access
- Rack Awareness
- Anatomy of Write and Read
- Replication Pipeline, Data Processing
- Installing Hadoop Clients
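As an illustration of the Rack Awareness topic in this module, here is a minimal sketch of a topology script. Hadoop can call an external script (set through `net.topology.script.file.name`, or `topology.script.file.name` in Hadoop 1.x) with one or more host addresses as arguments and read one rack path per address from standard output. The IP addresses and rack names below are hypothetical and only show the shape of such a script.

```python
#!/usr/bin/env python3
"""Sketch of a rack-awareness topology script (host table is hypothetical)."""
import sys

# Hypothetical mapping from DataNode address to rack path.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

# Rack path returned for any host that is not in the table.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hosts arrive as command-line arguments; print one rack path per line.
    print("\n".join(resolve(sys.argv[1:])))
```

A real cluster would usually load the mapping from a file maintained by the administrator rather than hard-coding it.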
Module 3: HADOOP CLUSTER: PLANNING AND MANAGING
Learning Objectives: In this module, you will understand Planning and Managing a Hadoop Cluster, Hadoop Cluster Monitoring and Troubleshooting, Analysing logs, and Auditing. You will also understand Scheduling and Executing MapReduce Jobs, and different Schedulers.
- Planning the Hadoop Cluster
- Cluster Sizing
- Hardware considerations
- Software considerations
- Managing Jobs
- Scheduling Jobs
- Types of schedulers in Hadoop: FIFO, Fair, and Capacity
- Set up Queues and Pools for Jobs
- Configuring the schedulers and running MapReduce jobs
- Cluster Monitoring
- Cluster Troubleshooting
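The Cluster Sizing topic in this module usually starts from a back-of-the-envelope storage estimate. The sketch below is one common way to do that arithmetic, not a formula prescribed by the course: the function name, the 25% headroom for temporary data, and the 24 TB of disk per DataNode are all assumptions chosen for illustration.

```python
import math


def datanodes_needed(data_tb, replication=3, headroom=0.25, disk_tb_per_node=24.0):
    """Estimate the DataNode count needed to store data_tb of user data.

    replication      -- HDFS replication factor (3 is the HDFS default)
    headroom         -- assumed fraction of raw disk kept free for temp data
    disk_tb_per_node -- assumed usable disk per DataNode, in TB
    """
    # Raw capacity: every block is stored `replication` times, and only
    # (1 - headroom) of the raw disk is available for HDFS data.
    raw_tb = data_tb * replication / (1.0 - headroom)
    # Round up, since a fraction of a node still means one more machine.
    return math.ceil(raw_tb / disk_tb_per_node)


# 100 TB of data -> 400 TB raw -> 17 nodes at 24 TB each
print(datanodes_needed(100))
```

CPU, memory, and network requirements also drive sizing, so the result is only a starting point.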
Module 4: BACKUP, RECOVERY AND MAINTENANCE
Learning Objectives: In this module, you will understand day-to-day cluster administration tasks such as adding and removing DataNodes, NameNode recovery, configuring backup and recovery in Hadoop, diagnosing node failures in the cluster, and upgrading Hadoop.
- Configure Rack Awareness
- Hadoop Balancer
- Setting up Secondary Namenode
- Hadoop Backup
- How to Whitelist and Blacklist data nodes in a Cluster
- Add Storage to Datanodes
- Set up Users and Quotas
- Set up Trash for Hadoop
- Understand the Cluster's Safemode
- Details on fsck
- Namenode Recovery using checkpoint
- Upgrade Hadoop cluster
- Copy data across clusters using distcp
- Diagnostics and Recovery, Cluster Maintenance
Module 5: HADOOP 2.0 AND HIGH AVAILABILITY
Learning Objectives: In this module, you will understand Secondary NameNode setup and checkpointing, the new features in Hadoop 2.0, HDFS High Availability, the YARN framework, MRv2, and Hadoop 2.0 cluster setup in pseudo-distributed and distributed mode.
- Introduction to Hadoop 2.0
- Understand YARN framework
- Understand High Availability for Namenode
- Understand Federation
- Introduction to Quorum Journal Manager (QJM) and automatic failover methods
- Hadoop 2.0 Cluster setup
- Deploying Hadoop 2.0 in pseudo-distributed mode
- Deploying a multi-node Hadoop 2.0 Cluster
Module 6: CONFIGURING MAPREDUCE, CAPACITY SCHEDULER, HDFS
Learning Objectives: In this module, you will understand the YARN execution and workflow. You will also learn to configure MapReduce jobs, configure capacity scheduler and HDFS.
- YARN Execution
- YARN Workflow
- MapReduce Job Configuration
- Configure Capacity Scheduler
- Configuring HDFS HA
- Hadoop Log Management
- Hadoop Auditing and Alerts
Module 7: ADVANCED TOPICS: QJM, HDFS FEDERATION AND SECURITY
Learning Objectives: In this module, you will understand the basics of Hadoop security, managing security with Kerberos, HDFS Federation setup, and log management. You will also understand HDFS High Availability using Quorum Journal Manager (QJM).
- Configure Hadoop Federation
- Basics of Hadoop Platform Security
- Securing the Platform
- Understand Kerberos
- Configuring Kerberos on the Cluster
Module 8: OOZIE, PIG CONFIGURATION AND EXAMPLES
Learning Objectives: In this module, you will learn to set up the Apache Oozie workflow scheduler for Hadoop jobs and get started with Pig scripting.
- Introduction to Oozie / Configure Oozie
- Introduction to Pig Scripting
- Write Pig Scripts / Process Web logs using Pig
- Introduction to Hive and HBase
- Introduction to NoSQL
- Introduction to MongoDB
- Understand Indexes
Module 9: HIVE, HBASE ADMINISTRATION
Learning Objectives: In this module, you will understand HCatalog/Hive administration, deploying HBase with other Hadoop components, using HBase effectively to load data, and writing to and reading from HBase.
- Hive Administration
- HBase Architecture
- HBase setup
- HBase and Hive Integration
- HBase performance optimization and tools
Module 10: CLOUDERA SETUP AND PERFORMANCE TUNING
Learning Objectives: In this module, you will understand how to configure a Hadoop cluster using Cloudera Manager. We will also look into performance tuning parameters and the intermediate phases of MapReduce.
- Look at the important performance tuning parameters
- Intermediate phases of MapReduce
- Details of tuning the intermediate phases
- Hadoop Cluster installation using Cloudera Manager
- Introduction to alternatives to HDFS and MapReduce
Module 11: GANGLIA
Learning Objectives: In this module, you will understand how to configure Ganglia for monitoring your Hadoop Cluster.
- Introduction to Ganglia
- Why Ganglia
- Components of Ganglia: Gmond, Gmetad, RRDtool
- Installation and Configuration: Gmond configuration, Gmetad configuration, PHP web frontend configuration
- Set up Monitoring for the Hadoop Cluster: command-line tools, Gmetric, Gstat
- How to automate deployments in your infrastructure
Module 12: PUPPET
Learning Objectives: In this module, you will understand how to use Puppet to automate repetitive tasks.
- Introduction to Puppet
- How Puppet works
- Puppet components: Puppet Master, Puppet Agents
- Puppet Manifests and Classes
- Puppet installation and Configuration
- Deploy configuration for Nodes
Module 13: AMBARI
Learning Objectives: In this module, you will understand how to use Ambari to manage, provision, and monitor your clusters.
- Introduction to Ambari
- Installing and starting Ambari Server
- Configuring and Deploying the cluster
- Choosing and Customizing services
- Assigning Masters, Slaves and Clients
- Troubleshooting Ambari deployments
Module 14: AMAZON WEB SERVICES – AWS
Learning Objectives: In this module, you will understand how to deploy your Hadoop cluster on AWS.
- Introduction to AWS
- Different Instance types
- Get familiar with common AWS terms
- Components of Hadoop on AWS
- Deploy Hadoop cluster on AWS
- Explore scalability options
