Oracle Big Data Fundamentals Ed 2

Duration: 5 Days (40 Hours)

Oracle Big Data Fundamentals Ed 2 Course Overview:

Oracle Big Data Fundamentals Ed 2 equips learners with essential skills for working with Big Data, covering the principles of aggregation, analysis, and interaction with Apache Hadoop, Apache Spark, and Oracle Big Data Cloud Service. Learners gain experience with core Big Data technologies such as HDFS, MapReduce, and Hive, enabling them to set up, use, and troubleshoot Big Data services. The course caters to developers, data analysts, database administrators, system administrators, and other professionals who need a comprehensive understanding of managing Big Data for analytical purposes in their organizations.

Intended Audience:

  1. Technology professionals who want to gain a better understanding of working with the Oracle Big Data platform.
  2. Data engineers, analytics professionals, and system administrators aiming to utilize the capabilities of the Oracle Big Data cloud platform for developing innovative solutions.
  3. Individuals with basic knowledge of SQL, Linux, and basic Python coding.
  4. Those interested in leveraging the Oracle Big Data platform for enterprise-level initiatives such as analytics applications and complex data models.
  5. Individuals looking to expand their skillsets and customize advanced data processing workflows with the Oracle Big Data platform.

Learning Objectives of Oracle Big Data Fundamentals Ed 2:

The Oracle Big Data Fundamentals Ed 2 training teaches participants the fundamentals of working with big data. Upon completion, learners will be able to:
1. Understand the components of the Oracle Big Data environment.
2. Install and configure key components in the Oracle Big Data platform.
3. Manipulate, transform, and store data using HiveQL.
4. Provision big data clusters and use different big data tools such as Pig and Sqoop.
5. Utilize Oracle Big Data Discovery to visually explore and analyze big data sets.
6. Integrate big data sources and apply machine learning algorithms with Oracle Big Data Spatial and Graph.
7. Store and analyze big data using the Oracle NoSQL Database.
8. Monitor, assess, and optimize big data performance.

 Module 1: Introduction

• Questions About You

• Course Objectives

• Course Road Map

• Oracle Big Data Lite (BDLite) Virtual Machine (VM) Home Page

• Starting the Oracle BDLite VM and Accessing the Practice Files

• Reviewing the Available Big Data Documentation, Tutorials, and Other Resources

 Module 2: Introducing Oracle Big Data Strategy

• Characteristics of Big Data

• Importance of Big Data

• Big Data Opportunities: Some Examples

• Big Data Challenges

• Big Data Implementation Examples

• Oracle Strategy for Big Data: Combining Big Data Processing Engines (Hadoop, NoSQL, RDBMS)

 Module 3: Using Oracle Big Data Lite Virtual Machine and Movieplex Application

• Oracle Big Data Lite VM Used in this Course

• Oracle Big Data Lite VM Home Page Sections

• Reviewing the Deployment Guide

• Downloading and Installing Oracle VM VirtualBox and Its Extension Pack

• Downloading and Running 7-zip Files to Create the VirtualBox Appliance File

• Importing the Appliance File

• Starting the Big Data Lite VM and Starting and Stopping Services

• Introducing the Oracle Movieplex Case Study

 Module 4: Introduction to the Big Data Ecosystem

• Computer Clusters and Distributed Computing

• Apache Hadoop

• Types of Analysis That Use Hadoop

• Types of Data Generated

• Apache Hadoop Core Components: HDFS, MapReduce (MR1), and YARN (MR2)

• Apache Hadoop Ecosystem

• Cloudera’s Distribution Including Apache Hadoop (CDH)

• CDH Architecture and Components

 Module 5: Introduction to the Hadoop Distributed File System

• Hadoop Distributed Filesystem (HDFS) Design Principles, Characteristics, and Key Definitions

• Sample Hadoop High Availability (HA) Cluster

• HDFS Files and Blocks

• Active and Standby Daemons (Services) Functions

• DataNodes (DN) Daemons Functions

• Writing a File to HDFS: Example

• Interacting With Data Stored in HDFS: Hue, Hadoop Client, WebHDFS, and HttpFS

 Module 6: Acquire Data Using CLI, Fuse, Flume, and Kafka

• Reviewing the Command Line Interface (CLI)

• Viewing File System Contents Using the CLI

• FS Shell Commands

• Loading Data Using the CLI

• Overview of FuseDFS

• What is Flume?

• Kafka Topics

• Additional Resources

 Module 7: Acquire and Access Data Using Oracle NoSQL Database

• What is a NoSQL Database?

• RDBMS Compared to NoSQL

• HDFS Compared to NoSQL

• Define Oracle NoSQL Database

• Oracle NoSQL models: Key-Value and Table

• Acquiring and Accessing Data in a NoSQL DB

• Accessing the CLIs (Data, Admin, SQL)

• Accessing the KVStore

 Module 8: Introduction to MapReduce and YARN Processing Frameworks

• MapReduce Framework Features, Benefits, and Jobs

• Parallel Processing with MapReduce

• Word Count Examples

• Data Locality Optimization in Hadoop

• Submitting and Monitoring a MapReduce Job

• YARN Architecture, Features, and Daemons

• YARN Application Workflow

• Hadoop Basic Cluster: MapReduce 1 Versus YARN (MR2)

 Module 9: Resource Management Using YARN

• Job Scheduling in YARN

• First In, First Out (FIFO) Scheduler, Capacity Scheduler, and Fair Scheduler

• Cloudera Manager Resource Management Features

• Static Service Pools

• Working with the Fair Scheduler

• Cloudera Manager Dynamic Resource Management: Example

• Submitting and Monitoring a MapReduce Job Using YARN

• Using the YARN application Command

 Module 10: Overview of Apache Spark

• Benefits of Using Spark

• Spark Architecture

• Spark Application Components: Driver, Master, Cluster Manager, and Executors

• Running a Spark Application on YARN (yarn-cluster Mode)

• Resilient Distributed Dataset (RDD)

• Spark Interactive Shells: spark-shell and pyspark

• Word Count Example by Using Interactive Scala

• Monitoring Spark Jobs Using YARN’s ResourceManager Web UI

 Module 11: Overview of Apache Hive

• What is Hive?

• Use Case: Storing Clickstream Data

• Hadoop Architecture

• How is Data Stored in HDFS?

• Organizing and Describing Data With Hive

• Big Data SQL on Top of Hive Data

• Defining Tables Over HDFS

• Hive Queries

 Module 12: Overview of Cloudera Impala

• Overview of Cloudera Impala

• Hadoop: Some Data Access/Processing Options

• Cloudera Impala

• Cloudera Impala: Key Features

• Cloudera Impala: Supported Data Formats

• Cloudera Impala: Programming Interfaces

• How Impala Fits Into the Hadoop Ecosystem

• How Impala Works with Hive

 Module 13: Using Oracle XQuery for Hadoop

• XML Review

• Oracle XQuery for Hadoop (OXH)

• OXH Features

• OXH Data Flow

• Using OXH: Installation, Functions, Adapters, and Configuration Properties

• Running an OXH Query

• XQuery Transformation and Basic Filtering

• Viewing the Completed Query in YARN’s ResourceManager

 Module 14: Overview of Solr

• Overview of Solr

• Apache Solr (Cloudera Search)

• Cloudera Search: Key Capabilities

• Cloudera Search: Features

• Cloudera Search Tasks

• Indexing in Cloudera Search

• Types of Indexing

• The solrctl Command

 Module 15: Integrating Your Big Data

• Unifying Data: A Typical Requirement

• Comparing Big Data Processing Engines

• Introducing Data Unification Options

 Module 16: Batch Loading Options

• Apache Sqoop

• Oracle Loader for Hadoop

• Oracle Copy to Hadoop

 Module 17: Using Oracle SQL Connector for HDFS

• Batch and Dynamic Loading: Oracle SQL Connector for HDFS

• OSCH Architecture

• Using OSCH

• Features

• Parallelism and Performance

• Performance Tuning

• Key Benefits

• Loading: Choosing a Connector

 Module 18: Using Oracle Data Integrator and Oracle GoldenGate for Big Data

• ETL and Synchronization: Oracle Data Integrator

• ODI’s Declarative Design

• ODI Knowledge Modules (KMs): Simpler Physical Design and Shorter Implementation Time

• Using ODI with Big Data: Heterogeneous Integration with Hadoop Environments

• Using ODI Studio

• ODI Studio Components: Overview

• ODI Studio: Big Data Knowledge Modules

• Oracle GoldenGate for Big Data

 Module 19: Using Oracle Big Data SQL

• Barriers to Effective Big Data Adoption

• Overcoming Big Data Barriers

• Oracle Big Data SQL: The Hybrid Solution

• Benefits: Virtualizes data access across Oracle Database, Hadoop and NoSQL stores

• Using Oracle Big Data SQL

• Query Performance Overview

• Deployment Options

 Module 20: Using Oracle Big Data Spatial and Graph

• Graph and Spatial Analysis: All About Relationships

• What is Oracle Big Data Spatial and Graph (BDSG)?

• Strategy (Supported Platforms, etc.)

• BDSG: Graph Analysis

• Oracle BDSG: Spatial Analysis

• Multimedia Analytics Framework

• Deployment Options for Oracle BDSG

• Additional Resources

 Module 21: Using Oracle Advanced Analytics

• Oracle Advanced Analytics (OAA)

• OAA: Oracle Data Mining

 Module 22: Oracle Big Data Deployment Options

• Introduction to the Oracle Big Data Appliance

• Running the Oracle BDA Configuration Generation Utility

• Oracle BDA Mammoth Software Deployment Bundle

• Using the Oracle BDA mammoth Utility

• BDA Hardware and Integrated and Optional Software

• Administering and Securing the Oracle BDA

• Introduction to the Oracle Big Data Cloud Service

• Introduction to the Oracle Big Data Cloud Service – Compute Edition

Oracle Big Data Fundamentals Ed 2 Course Prerequisites:

This course assumes basic knowledge of big data and the technologies used to manage it. Students should have some experience with, or understanding of, the core elements of a relational database. An understanding of Structured Query Language (SQL) is also helpful. Knowledge of data presentation and visualization concepts and tools, cloud infrastructure and cloud services security, data warehousing, and distributed systems is beneficial but not required.

Training Exclusives

This course comes with the following benefits:

  • Practice labs
  • Training by certified trainers
  • Access to the recordings of your class sessions for 90 days
  • Digital courseware
  • 24/7 learner support
