Comprehensive Pig Certification Training

CPC-HV
6 days
737,590 HUF + VAT

About the Course

Businesses around the world are looking for ways to leverage data for business continuity. Apache Pig runs on Hadoop and was developed for querying large data sets stored in HDFS. It is best known for its simple syntax and the development time it saves, which is why it is widely used.
This training introduces you to the world of Hadoop and MapReduce. Through a series of practical, hands-on exercises, you will learn about HDFS, write complex MapReduce transformations, and write scripts that use the advanced features of Pig.

Who Should Attend This Training

  • Analytics Professionals
  • BI/ETL/DW Professionals
  • Project Managers
  • Testing Professionals
  • Mainframe Professionals
  • Software Developers and Architects
  • Graduates aiming to build a career in Big Data and Hadoop

What You Will Learn

  • Hadoop Ecosystem
    Get introduced to the world of Hadoop. Master the key concepts of the Hadoop ecosystem and architecture.
  • Analyse Data Sets
    Analyse large sets of data in a short time by using Pig Latin scripts. Use MapReduce for data processing.
  • Big Data Analytics
    Discover the different advantages of Pig and learn how to leverage Pig efficiently for Big Data analytics.
  • Implement Pig
    Learn from expert-led training how to implement Pig efficiently in future projects.
  • Data Flows with Pig
    Master key Pig configurations and understand Pig use cases so that you can execute data flows with Pig.
  • Advanced Pig
    Gain a complete understanding of advanced concepts like Pig Latin relational operators, Pig UDF, and more.

We provide the course in English.

Curriculum

Module 1: The Hadoop Ecosystem

  • Hadoop Overview
  • Surveying the Hadoop components
  • Defining the Hadoop Architecture

Module 2: Exploring HDFS and MapReduce

  • Storing data in HDFS
  • Achieving reliable and secure storage
  • Monitoring storage metrics
  • Controlling HDFS from the command line
  • Parallel processing with MapReduce
  • Detailing the MapReduce approach
  • Transferring algorithms not data
  • Dissecting the key stages of a MapReduce job
  • Automating data transfer
  • Facilitating data ingress and egress
  • Aggregating data with Flume
  • Configuring data fan-in and fan-out
  • Moving relational data with Sqoop
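
To give a feel for the "key stages of a MapReduce job" topic in this module, here is a minimal sketch of a word count split into map, shuffle, and reduce steps. It is plain in-memory Python, not Hadoop code, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle/sort: collect all values that share a key, ordered by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Chain the three stages, as a MapReduce framework would do across a cluster."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["pig runs on hadoop", "hadoop stores data in hdfs"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on the nodes that hold the data, which is the idea behind "transferring algorithms, not data"; the sketch only shows the logical flow.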

Module 3: Executing Data Flows with Pig

  • Describing characteristics of Apache Pig
  • Contrasting Pig with MapReduce
  • Identifying Pig use cases
  • Pinpointing key Pig configurations

Module 4: Advanced Pig

  • Pig Latin: Relational Operators
  • File Loaders
  • GROUP Operator
  • COGROUP Operator
  • Joins and COGROUP
  • Union and Diagnostic Operators
  • Pig UDF
  • Structuring unstructured data
  • Representing data in Pig's data model
  • Running Pig Latin commands in the Grunt shell
  • Expressing transformations in Pig Latin syntax
  • Invoking Load and Store functions
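
The difference between grouping, cogrouping, and joining, covered in this module, can be sketched outside Pig. The Python below is only an in-memory approximation of what those relational operators produce; it is not Pig Latin, and the helper names (group, cogroup, inner_join) and the sample relations are assumptions made for the illustration.

```python
from collections import defaultdict


def group(relation, key_index):
    """Like GROUP: one entry per key, holding a bag (list) of the matching tuples."""
    bags = defaultdict(list)
    for row in relation:
        bags[row[key_index]].append(row)
    return dict(bags)


def cogroup(left, left_key, right, right_key):
    """Like COGROUP: per key, one bag from each relation (a bag may be empty)."""
    grouped_left = group(left, left_key)
    grouped_right = group(right, right_key)
    keys = sorted(set(grouped_left) | set(grouped_right))
    return {k: (grouped_left.get(k, []), grouped_right.get(k, [])) for k in keys}


def inner_join(left, left_key, right, right_key):
    """Like an inner JOIN: flatten each key's two bags into their cross product."""
    joined = []
    for left_bag, right_bag in cogroup(left, left_key, right, right_key).values():
        for left_row in left_bag:
            for right_row in right_bag:
                joined.append(left_row + right_row)
    return joined


if __name__ == "__main__":
    # Hypothetical sample relations: (name, city) and (name, amount).
    customers = [("alice", "budapest"), ("bob", "szeged")]
    orders = [("alice", 30), ("alice", 12), ("carol", 7)]
    print(cogroup(customers, 0, orders, 0))
    print(inner_join(customers, 0, orders, 0))
```

The cogroup result keeps keys that appear in only one relation (with an empty bag on the other side), while the inner join drops them.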

Module 5: Performing ETL with Pig

  • Transforming data with relational operators
  • Creating new relations with joins
  • Reducing data size by sampling
  • Extending Pig with user-defined functions
  • Filtering data with Pig
  • Consolidating data sets with unions
  • Partitioning data sets with splits
  • Injecting parameters into Pig scripts
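
As a rough illustration of the ETL steps listed in this module (filtering, unions, splits, and sampling), the sketch below mimics their behavior on small in-memory lists. It is Python rather than Pig Latin, and every function name and data value in it is hypothetical.

```python
import random


def filter_rows(relation, predicate):
    """Filtering: keep only the rows for which the predicate is true."""
    return [row for row in relation if predicate(row)]


def union(*relations):
    """Union: concatenate relations; duplicates are kept, as with bags."""
    combined = []
    for relation in relations:
        combined.extend(relation)
    return combined


def split(relation, **conditions):
    """Split: route rows into named outputs; a row may match several or none."""
    return {
        name: [row for row in relation if condition(row)]
        for name, condition in conditions.items()
    }


def sample(relation, fraction, seed=None):
    """Sampling: keep each row with the given probability (size is approximate)."""
    rng = random.Random(seed)
    return [row for row in relation if rng.random() < fraction]


if __name__ == "__main__":
    # Hypothetical (name, amount) records from two sources.
    january = [("alice", 30), ("bob", 5)]
    february = [("alice", 30), ("carol", 120)]
    all_rows = union(january, february)
    print(filter_rows(all_rows, lambda row: row[1] >= 10))
    print(split(all_rows, small=lambda r: r[1] < 50, large=lambda r: r[1] >= 50))
    print(sample(all_rows, 0.5, seed=42))
```

Because sampling is probabilistic, the number of rows returned varies from run to run unless a seed is fixed.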

Prerequisites

There are no prerequisites to attend this course.
