DevOps: CP-HDF (Certified Professional - Hadoop Foundation)
About The Event
CP-HDF stands for “Certified Professional – Hadoop Foundation”, a certification developed and awarded by DevOps++ Alliance and Agile Testing Alliance.
The course is applicable to all roles: the knowledge, experience, and certification are consciously designed for anyone who wants to learn Big Data in a practical manner.
How is it useful?
Big Data is not a technology per se; it is a phrase used to describe data sets that are so large and complex that they become difficult to exchange, secure, and analyze with typical tools. This course on Big Data shows you how to solve these problems, and many more, with leading IT tools and techniques.
Organizations are learning that important forecasts can be made by sorting through and analyzing Big Data. As more than 75% of this data is “unstructured”, it must be formatted in a way that makes it suitable for data mining and further analysis.
Hadoop is a fundamental platform for structuring Big Data, and resolves the problem of formatting it for subsequent analytics. It is an Apache open-source framework written in Java that permits distributed processing of large datasets across clusters of computers with the help of simple programming models. It provides massive storage for any type of data, huge processing power, and the capability to handle virtually limitless parallel tasks or jobs.
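The “simple programming models” mentioned above refers chiefly to MapReduce. As a rough illustration only (a minimal, single-machine sketch in plain Python, not Hadoop itself), a word count can be expressed as a map phase that emits (word, 1) pairs, a shuffle that groups the pairs by key, and a reduce phase that sums each group:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input split.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as Hadoop does between
    # the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data is big", "hadoop processes big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["big"])  # -> 3
```

In a real Hadoop cluster each phase runs in parallel across many nodes over HDFS data; the shape of the computation, however, is exactly this.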
Am I Eligible?
There are no prerequisites for this certification. Any IT professional interested in truly understanding Big Data in a practical manner can take this course.
CP-HDF is designed for corporates and working professionals alike. If you cannot dedicate full days to training, you can opt for either a six-half-day or a three-full-day course, each followed by theory and practical exams.
- Introduction to Hadoop, HDFS and Big Data
- What is Big Data
- 3V’s of Big Data
- Why Big Data? Existing analytics architectures and their problems
- Hadoop as a solution for data analytics architecture
- Introduction to HDFS
- File and Block concept
- Data backup using fault tolerance (replication)
- Rack awareness
- Understanding Hadoop Distributed File System
- Scalability w.r.t. HDFS storage and adding nodes to the cluster
- Purpose of YARN
- Processing component of Hadoop Framework.
- Difference between Hadoop versions 1.x and 2.x
- Understanding Resource Manager
- Tools and Utilities on top of Hadoop Framework.
- Data Access Mechanism Using PIG and HIVE
- Recognizing the Use cases for PIG involving building data pipelines and iterative data processing.
- Pig execution modes in HADOOP cluster
- Introduction to PIG Latin
- PIG with HDFS and Local file system.
- PIG usage as Data Flow Language.
- Map Reduce jobs with PIG SCRIPTS.
- Introduction to HIVE.
- Introduction to Hive QL.
- Difference between HIVE and Data Warehouse.
- Understanding how HIVE works on top of Hadoop.
- Understanding difference between HIVE managed and HIVE external tables.
- Understanding use cases for HIVE bucketed and HIVE partitioned tables.
- Understanding the purpose of HCatalog in HIVE.
- File formats supported in HIVE and their uses
- Processing frameworks in HADOOP – Map Reduce vs Spark, and data workflow management
- Introduction to NoSQL databases.
- What is the CAP theorem?
- Introduction to the NoSQL database HBASE (columnar database).
- Significance of NoSQL
- Understanding the Map Reduce framework.
- Stages in the Map Reduce framework.
- Understanding the purpose and capabilities of Sqoop.
- Export and Import jobs using Sqoop.
- Capabilities of using Apache Spark Streaming and windowing functions.
- Understanding how Spark applications execute on YARN.
- Understanding various components in Spark and their uses.
- Understanding the role of Zookeeper in the Cluster.
- Understanding the purpose and capabilities of OOZIE workflow scheduler.
- Understanding OOZIE web console
- Case Study
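Several of the HDFS topics in the syllabus above (the file and block concept, replication, rack awareness) fit together in one simple picture: a file is split into fixed-size blocks, and each block is copied onto several distinct nodes so that losing a node loses no data. The sketch below models this in plain Python; the tiny block size, node names, and round-robin placement rule are invented purely for illustration (real HDFS defaults are 128 MB blocks and a replication factor of 3, with rack-aware replica placement):

```python
BLOCK_SIZE = 4          # bytes; tiny so the example stays visible
REPLICATION = 3         # copies of each block
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical cluster

def split_into_blocks(data, block_size=BLOCK_SIZE):
    # A file is stored as a sequence of fixed-size blocks;
    # the last block may be shorter.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes=NODES, replication=REPLICATION):
    # Toy placement: put each block's replicas on `replication`
    # distinct nodes, round-robin. Real HDFS also considers which
    # rack each node is in when choosing replica locations.
    return {
        i: [nodes[(i + r) % len(nodes)] for r in range(replication)]
        for i in range(len(blocks))
    }

blocks = split_into_blocks(b"hadoop hdfs")   # 11 bytes -> 3 blocks
placement = place_blocks(blocks)
print(len(blocks))      # -> 3
print(placement[0])     # three replicas on three distinct nodes
```

The fault-tolerance bullet above follows directly from this layout: with three replicas per block, any single node (or, with rack awareness, any single rack) can fail without data loss.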
Online Certification Exam