Comprehensive Pig Certification Training

About the Course

Businesses around the world are looking for ways to leverage data for business continuity. Apache Pig runs on Hadoop and was developed for running queries on large data sets stored in HDFS. It is best known for its simple syntax and the time it saves, which is why it is widely used.
This training will introduce you to the world of Hadoop and MapReduce. You’ll learn through a series of practical, hands-on exercises that cover working with HDFS, writing complex MapReduce transformations, and writing scripts using the advanced features of Pig.

Who Should Attend This Training

Analytics Professionals
BI/ETL/DW Professionals
Project Managers
Testing Professionals
Mainframe Professionals
Software Developers and Architects
Graduates aiming to build a career in Big Data and Hadoop

What You Will Learn

Hadoop Ecosystem
Get introduced to the world of Hadoop. Master the key concepts of the Hadoop ecosystem and architecture.
Analyse Data Sets
Analyse large sets of data in a short time by using Pig Latin scripts. Use MapReduce for data processing.
Big Data Analytics
Discover the different advantages of Pig and learn how to leverage Pig efficiently for Big Data analytics.
Implement Pig
Learn from expert-led training how to implement Pig efficiently in future projects.
Data Flows with Pig
Ace key Pig configurations and understand Pig use cases to execute data flows with the Pig technology.
Advanced Pig
Gain a complete understanding of advanced concepts like Pig Latin relational operators, Pig UDF, and more.

We provide the course in English.

Curriculum

Module 1: The Hadoop Ecosystem

Hadoop Overview
Surveying the Hadoop components
Defining the Hadoop Architecture

Module 2: Exploring HDFS and MapReduce

Storing data in HDFS
Achieving reliable and secure storage
Monitoring storage metrics
Controlling HDFS from the Command Line
Parallel processing with MapReduce
Detailing the MapReduce approach
Transferring algorithms, not data
Dissecting the key stages of a MapReduce job
Automating data transfer
Facilitating data ingress and egress
Aggregating data with Flume
Configuring data fan-in and fan-out
Moving relational data with Sqoop

Module 3: Executing Data Flows with Pig

Describing characteristics of Apache Pig
Contrasting Pig with MapReduce
Identifying Pig use cases
Pinpointing key Pig configurations

Module 4: Advanced Pig

Pig Latin: Relational Operators
File Loaders
Group Operator
COGROUP Operator
Joins and COGROUP
Union, Diagnostic Operators
Pig UDF
Structuring unstructured data
Representing data in Pig's data model
Running Pig Latin commands at the Grunt Shell
Expressing transformations in Pig Latin Syntax
Invoking Load and Store functions

Module 5: Performing ETL with Pig

Transforming data with Relational Operators
Creating new relations with joins
Reducing data size by sampling
Extending Pig with user-defined functions
Filtering data with Pig
Consolidating data sets with unions
Partitioning data sets with splits
Injecting parameters into Pig scripts

Prerequisites

There are no prerequisites to attend this course.

Related Courses

Big Data Analytics Training Course

Big Data and Hadoop Training Course

Hadoop Administration Course Certification Training

Apache Kafka Course Certification Training

Apache Spark and Scala Course Training

Comprehensive Hive Certification Training