Course Objective Summary

Hadoop Corporate Training:

  • Introduction to Big Data and Hadoop
  • Hadoop ecosystem – Concepts
  • Hadoop MapReduce concepts and features
  • Developing MapReduce applications
  • Pig concepts
  • Hive concepts
  • Oozie workflow concepts
  • Sqoop concepts
  • Flume concepts
  • Hue concepts
  • HBase concepts
  • ZooKeeper concepts
  • Real Life Use Cases

Hadoop Administration:

  • Set up Hadoop cluster with
  • Configure SSH
  • Install a plain-vanilla cluster
  • Designing the node cluster setup with Hortonworks/Cloudera
  • Add users to the cluster
  • Granting access to users
  • Creating groups and best practices
  • Creating Data Lake clusters
  • Day-to-day cluster admin activities
  • Profiling and project maintenance in Big Data systems
  • Folder structure and its significance

Scala:

  • Introduction to Scala
  • Creating a Scala Project
  • Basic Object-Oriented Programming
  • Immutable and Mutable Fields
  • Companion Objects
  • Case Classes and Case Objects

Spark:

  • Spark Core
  • Spark integration with NoSQL (Cassandra) and Amazon EC2
  • Spark Streaming
  • Spark SQL

  • Project 1 - Ingest data from a database into HDFS using Sqoop
  • Project 2 - Ingest data using Flume
  • Project 3 - Processing a dataset using Pig
  • Project 4 - Processing trading data using Hive
  • Project 5 - Querying a healthcare dataset using Hive
  • Project 6 - Workflow and coordination for various data sets using Oozie
  • Project 7 - Spark and Kafka integration

    We will show how to build a real-time stateful streaming application using Kafka and Spark, storing the results in HBase as they are produced. Spark Streaming and Kafka are a well-suited combination for building real-time applications. Spark is an in-memory processing engine that can run on top of the Hadoop ecosystem, and Kafka is a distributed publish-subscribe messaging system. Kafka can stream data continuously from a source, and Spark can process this stream of data with low latency using its in-memory processing primitives.
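
The idea of "stateful" stream processing described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the course's project code: it uses no Spark, Kafka, or HBase APIs. In-memory lists stand in for messages arriving from a topic, a dictionary stands in for the running state, and the list of snapshots stands in for writes to a result store. The names `update_state`, `run_stream`, and `sample_batches` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]  # (key, value), e.g. (symbol, traded quantity)


def update_state(state: Dict[str, int], batch: Iterable[Record]) -> Dict[str, int]:
    """Fold one micro-batch into the running per-key totals."""
    new_state = dict(state)  # copy so earlier snapshots are not mutated
    for key, value in batch:
        new_state[key] = new_state.get(key, 0) + value
    return new_state


def run_stream(batches: Iterable[List[Record]]) -> List[Dict[str, int]]:
    """Process batches in arrival order; return a state snapshot after each one."""
    state: Dict[str, int] = {}
    snapshots: List[Dict[str, int]] = []
    for batch in batches:
        state = update_state(state, batch)
        snapshots.append(dict(state))  # stand-in for a write to the result store
    return snapshots


# Hypothetical sample data standing in for messages read from a topic.
sample_batches = [
    [("AAPL", 10), ("MSFT", 5)],
    [("AAPL", 3)],
    [("GOOG", 7), ("MSFT", 1)],
]
snapshots = run_stream(sample_batches)
for i, snap in enumerate(snapshots, start=1):
    print(f"after batch {i}: {snap}")
```

The point of the sketch is that each batch is combined with the state carried over from earlier batches rather than processed in isolation; a real streaming engine applies the same pattern while also handling distribution and fault tolerance.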