Created
February 10, 2022 18:25
Data Engineering Course Structure
/****************************** | |
Big Data for Data Engineering | |
******************************/ | |
Lesson 1 | |
1.1 Introduction to Big Data 02:40 | |
1.2 Welcome 01:09 | |
Lesson 2 What_is_Big_data 09:56 | |
What is Big data 05:49 | |
Big data in Business 04:07 | |
Big Data and Business Analytics comes of age | |
Lesson 3 Beyond_the_Hype 05:57 | |
Beyond the Hype 05:57 | |
Facebook joins Google in HPC Computing Architectures for Big Data | |
Lesson 4 Big_data_and_data_Science 05:57 | |
Big data and data science 05:57 | |
Climate Change and Big Data_Dec 2012 | |
Lesson 5 Big_Data_use_Cases 05:36 | |
Big data use cases 05:36 | |
Big Data and Sensors_Jan 2013 | |
Lesson 6 Processing_Big_Data 05:55 | |
Processing big data 05:55 | |
Hadoop and Lustre - Some Thoughts | |
/****************************** | |
HADOOP | |
******************************/ | |
Lesson 2 Introduction to Hadoop 08:10 | |
What is Hadoop Part-A 03:53 | |
What is Hadoop Part-B 04:17 | |
Lesson 3 Hadoop Architecture and HDFS 15:07 | |
Hadoop Architecture Part-A 07:03 | |
Hadoop Architecture Part-B 04:49 | |
HDFS CommandLine 03:15 | |
Lesson 4 Hadoop administration 05:52 | |
Hadoop Administration 05:52 | |
Lesson 5 Hadoop Components 12:17 | |
MapReduce 04:30 | |
Pig and Hive 03:56 | |
Flume, Sqoop, and Oozie 03:51 | |
/****************************** | |
Data Engineering with Scala | |
******************************/ | |
Lesson 2 Introduction 27:36 | |
2.1 Learning Objectives | |
2.2 Introduction to Scala 03:47 | |
2.3 Getting Started with Scala 05:02 | |
2.4 Creating a Scala Project 06:49 | |
2.5 The Scala REPL 05:49 | |
2.6 Scala Documentation 06:09 | |
2.7 Introduction | |
Lesson 3 Basic Object Oriented Programming 23:57 | |
3.1 Learning Objectives | |
3.2 Classes 05:20 | |
3.3 Immutable and Mutable Fields 05:12 | |
3.4 Methods 05:12 | |
3.5 Default and Named Arguments 03:39 | |
3.6 Objects 04:34 | |
Classes | |
Lesson 4 Case Objects and Classes 24:11 | |
4.1 Learning Objectives | |
4.2 Companion Objects 03:44 | |
4.3 Case Classes and Case Objects 04:55 | |
4.4 Apply and Unapply 04:43 | |
4.5 Synthetic Methods 05:16 | |
4.6 Immutability and Thread Safety 05:33 | |
Case Objects and Classes | |
Lesson 5 Collections 31:20 | |
5.1 Learning Objectives | |
5.2 Collections Overview 05:16 | |
5.3 Sequences and Sets 08:09 | |
5.4 Options 03:29 | |
5.5 Tuples and Maps 06:05 | |
5.6 Higher Order Functions 08:21 | |
Collections | |
Lesson 6 Idiomatic Scala 25:34 | |
6.1 Learning Objectives | |
6.2 For Expressions 06:01 | |
6.3 Pattern Matching 04:49 | |
6.4 Handling Options 03:55 | |
6.5 Handling Failures 05:06 | |
6.6 Handling Futures 05:43 | |
Idiomatic Scala | |
/****************************** | |
Big Data Hadoop and Spark Developer | |
******************************/ | |
Big Data Hadoop and Spark Developer | |
Lesson 1 Course Introduction 08:51 | |
1.1 Course Introduction 05:52 | |
1.2 Accessing Practice Lab 02:59 | |
Lesson 2 Introduction to Big Data and Hadoop 43:59 | |
Lesson 3 Hadoop Architecture, Distributed Storage (HDFS) and YARN 57:50 | |
Lesson 4 Data Ingestion into Big Data Systems and ETL 01:04:02 | |
Lesson 5 Distributed Processing - MapReduce Framework and Pig 01:01:09 | |
Lesson 6 Apache Hive 57:45 | |
Lesson 7 NoSQL Databases - HBase 21:41 | |
Lesson 8 Basics of Functional Programming and Scala 44:59 | |
Lesson 9 Apache Spark Next Generation Big Data Framework 36:54 | |
Lesson 10 Spark Core Processing RDD 01:16:31 | |
Lesson 11 Spark SQL - Processing DataFrames 26:50 | |
Lesson 12 Spark MLLib - Modelling BigData with Spark 32:54 | |
Lesson 13 Stream Processing Frameworks and Spark Streaming 01:13:16 | |
Lesson 14 Spark GraphX | |
Linux Training | |
Lesson 01 - Course Introduction 05:15 | |
Lesson 02 - Introduction to Linux 04:35 | |
Lesson 03 - Ubuntu 16:24 | |
Lesson 04 - Ubuntu Dashboard 17:53 | |
Lesson 05 - File System Organization 31:22 | |
Lesson 06 - Introduction to CLI 01:15:45 | |
Lesson 07 - Editing Text Files and Search Patterns 27:19 | |
Lesson 08 - Package Management | |
/****************************** | |
Apache Kafka | |
******************************/ | |
Section 01 - Introduction to Apache Kafka | |
Lesson 01 - Course Introduction 07:16 | |
Course Introduction 07:16 | |
Lesson 02 - Big Data Overview 03:07 | |
Lesson 03 - Big Data Analytics 02:55 | |
Lesson 04 - Messaging System 05:48 | |
Lesson 05 - Kafka Overview 08:33 | |
Lesson 06 - Kafka Components and Architecture 09:16 | |
Lesson 07 - Kafka Clusters 01:27 | |
Lesson 08 - Kafka Industry Usecases 02:27 | |
Lesson 09 - Demo: Install Kafka and Zookeeper 04:58 | |
Lesson 10 - Demo: Single Node Single-Multi Broker Cluster 05:38 | |
Lesson 11 - Key Takeaways | |
Section 02 - Kafka Producer | |
Lesson 01 - Overview of Producer and Its Architecture 04:51 | |
Lesson 02 - Kafka Producer Configuration 14:33 | |
Lesson 03 - Send Messages 04:50 | |
Lesson 04 - Serializers 13:51 | |
Lesson 05 - Partitions 08:50 | |
Lesson 06 - Key Takeaways | |
Section 03 - Kafka Consumer | |
Lesson 01 - Kafka Consumer - Overview, Consumer Groups and Partitioners 12:27 | |
Lesson 02 - Poll Loop 02:42 | |
Lesson 03 - Configuring Consumer 12:26 | |
Lesson 04 - Commit and Offset 13:59 | |
Lesson 05 - Rebalance Listeners 01:45 | |
Lesson 06 - Consuming Records with Specific Offset 04:13 | |
Lesson 07 - Deserializers 05:32 | |
Lesson 08 - Key Takeaways | |
Section 04 - Kafka Operations and Performance Tuning | |
Lesson 01 - Learning Objectives 04:46 | |
Lesson 02 - Replications 14:53 | |
Lesson 04 - Storage 09:59 | |
Lesson 05 - Configuration in Reliable System 18:18 | |
Lesson 05 - Key Takeaways | |
Section 05 - Kafka Cluster Architecture and Administering Kafka | |
Lesson 01 - Learning Objectives 05:22 | |
Lesson 02 - Multi Cluster Architecture 08:45 | |
Lesson 03 - MirrorMaker 17:41 | |
Lesson 04 - Administering Kafka 09:50 | |
Lesson 05 - Dynamic Configuration Changes 09:20 | |
Lesson 06 - Console Producer Tool 01:27 | |
Lesson 07 - Console Consumer Tool 02:36 | |
Lesson 08 - Key Takeaways | |
Section 06 - Kafka Monitoring and Schema Registry | |
Lesson 01 - Monitoring 47:23 | |
Lesson 02 - Kafka Schema Registry and Avro 06:27 | |
Lesson 03 - Kafka Schema Registry Components 08:14 | |
Lesson 04 - Kafka Schema Registry Working 08:25 | |
Lesson 05 - Key Takeaways | |
Section 07 - Kafka Streams and Kafka Connectors | |
Lesson 01 - Kafka Stream Overview 09:49 | |
Lesson 02 - Kafka Stream Architecture, Working and Components 50:42 | |
Lesson 03 - Stream Concepts and Working 15:30 | |
Lesson 04 - Kafka Connectors 06:08 | |
Lesson 05 - Kafka Connector Configuration 25:08 | |
Lesson 06 - Key Takeaways | |
Section 08 - Integration of Kafka with Storm | |
Lesson 01 - Apache Storm 09:10 | |
Lesson 02 - Apache Storm Architecture and Components 08:34 | |
Lesson 03 - Apache Storm Topology 10:44 | |
Lesson 04 - Kafka Spout 03:54 | |
Lesson 05 - Integration of Apache Storm and Kafka 10:19 | |
Lesson 06 - Key Takeaways | |
Section 09 - Kafka Integration with Spark and Flume | |
Lesson 01 - Introduction to Spark and Its Components 10:59 | |
Lesson 02 - Basics of Spark - RDD, Data Sets, and Transformation and Actions 24:46 | |
Lesson 03 - Spark Stream 03:09 | |
Lesson 04 - Spark Integration with Kafka 06:26 | |
Lesson 05 - Flume 08:03 | |
Lesson 06 - Flume Kafka to HDFS Configuration 13:28 | |
Lesson 07 - Key Takeaways | |
Section 10 - Admin Client and Securing Kafka | |
Lesson 01 - Admin Client 11:59 | |
Lesson 02 - Kafka Security 01:36 | |
Lesson 03 - Kafka Security Components 08:58 | |
Lesson 04 - Configure SSL in Kafka 01:50 | |
Lesson 05 - Secure using ACLs 05:12 | |
Lesson 06 - Key Takeaways | |
/****************************** | |
AWS BigData | |
******************************/ | |
Section 2 - Live Virtual Class Curriculum | |
Lesson 01 - Course Introduction | |
Overview of AWS Certified Data Analytics - Specialty Course | |
Overview of the Certification | |
Overview of the Course | |
Project highlights | |
Course Completion Criteria | |
Lesson 02 AWS in Big Data Introduction | |
Introduction to Cloud Computing | |
Cloud Computing Deployment Models | |
Types of Cloud Computing Services | |
AWS Fundamentals | |
AWS Cloud Economics | |
AWS Virtuous Cycle | |
AWS Cloud Architecture Design Principles | |
Why AWS for Big Data - Challenges | |
Databases in AWS | |
Relational vs Non Relational Databases | |
Data Warehousing in AWS | |
AWS Services for collecting, processing, storing, and analyzing big data | |
Key Takeaways | |
Deploy a Data Warehouse Using Amazon Redshift | |
Lesson 03 Collection | |
AWS Big Data Collection Services | |
Fundamentals of Amazon Kinesis | |
Loading Data into Kinesis Stream | |
Assisted Practice: Loading Data into Amazon Storage | |
Kinesis Data Stream High-Level Architecture | |
Kinesis Stream Core Concepts | |
AWS Services and Amazon Kinesis Data Stream | |
How to Put Data into Kinesis Stream? | |
Kinesis Connector Library | |
Amazon Kinesis Data Firehose | |
Assisted Practice: Transfer Data into Delivery Stream using Firehose | |
Assisted Practice: Transfer VPC Flow log to Splunk using Firehose | |
Data Transfer using AWS Lambda | |
Assisted Practice: Backing up data in Amazon S3 using AWS Lambda | |
Amazon SQS | |
IoT and Big Data | |
Amazon IoT Greengrass | |
AWS Data Pipeline | |
Components of Data Pipeline | |
Assisted Practice: Export MySQL Data to Amazon S3 Using AWS Data Pipeline | |
Key Takeaways | |
Streaming Data with Kinesis Data Analytics | |
Lesson 04 Storage | |
AWS Bigdata Storage services | |
Data lakes and Analytics | |
Data Management | |
Data Life Cycle | |
Fundamentals of Amazon Glacier | |
Glacier and Big Data | |
DynamoDB Introduction | |
DynamoDB: Core Components | |
Assisted Practice: Perform operations on DynamoDB table | |
DynamoDB in AWS Eco-System | |
DynamoDB Partitions | |
Data Distribution | |
DynamoDB GSI and LSI | |
DynamoDB Streams | |
Use cases: Capturing Table Activity with DynamoDB Streams | |
Cross-Region Replication | |
Assisted Practice: Create a Global Table using DynamoDB | |
DynamoDB Performance: Deep Dive | |
Partition Key Selection | |
Snowball & AWS BigData | |
Assisted Practice: Data Migration using AWS Snowball | |
AWS DMS | |
AWS Aurora in BigData | |
Assisted Practice: Create and Modify Aurora DB Cluster | |
Storing and Retrieving the Data from DynamoDB | |
Lesson 05 Processing I | |
AWS Bigdata Processing Services | |
Overview of Amazon Elastic MapReduce (EMR) | |
EMR Cluster Architecture | |
Apache Hadoop | |
Apache Hadoop Architecture | |
Storage Options | |
EMR Operations | |
AWS Cluster | |
Assisted Practice: Create a cluster in S3 | |
Assisted Practice: Monitor a Cluster in S3 | |
Using Hue with EMR | |
Assisted Practice: Launch HUE Web Interface on Amazon EMR | |
Setup Hue for LDAP | |
Assisted Practice: Configure HUE for LDAP Users | |
Hive on EMR | |
Assisted Practice: Set Up a Hive Table to Run Hive Commands | |
Key Takeaways | |
Lesson 06: Processing II | |
Using HBase with EMR | |
HBase Architecture | |
Assisted Practice: Create a cluster with HBase | |
HBase and EMRFS | |
Presto with EMR | |
Presto Architecture | |
Fundamentals of Apache Spark | |
Apache Spark Architecture | |
Assisted Practice: Create a cluster with Spark | |
Apache Spark Integration with EMR | |
Fundamentals of EMR File System | |
Amazon Simple Workflow | |
AWS Lambda in Big Data Ecosystem | |
AWS Lambda and Kinesis Stream | |
AWS Lambda and RedShift | |
HCatalog | |
Key Takeaways | |
Real-Time Application with Apache Spark and AWS EMR | |
Lesson 07 ETL with Redshift | |
Introduction to AWS Bigdata Analysis Services | |
Fundamentals of Amazon Redshift | |
Amazon RedShift Architecture | |
Assisted Practice: Launch a Cluster, Load Dataset, and Execute Queries | |
RedShift in the AWS Ecosystem | |
Columnar Databases | |
Assisted Practice: Monitor RedShift Maintenance and Operations | |
RedShift Table Design | |
Choosing the Distribution Style | |
Redshift Data types | |
RedShift Data Loading | |
COPY Command for Data Loading | |
RedShift Loading Data | |
Key Takeaways | |
Lesson 08: Analysis with Machine Learning | |
Fundamentals of Machine Learning | |
Workflow of Amazon Machine Learning | |
Use cases | |
Machine learning Algorithms | |
Amazon SageMaker | |
Machine learning with Amazon Sagemaker | |
Assisted Practice: Build, Train, and Deploy a Machine Learning Model | |
Elasticsearch | |
Amazon Elasticsearch Service | |
Zone Awareness | |
Logstash | |
RStudio | |
Assisted Practice: Fetch the File and Run Analysis using RStudio | |
Amazon Athena | |
Assisted Practice: Execute Interactive SQL Queries in Athena | |
AWS Glue | |
Key Takeaways | |
Fraud Detection Using Classification Algorithms on AWS Sagemaker | |
Lesson 09 Analysis and Visualization | |
Introduction to AWS Bigdata Visualization Services | |
Amazon QuickSight | |
Amazon QuickSight - Workflow and Use Cases | |
Assisted Practice: Analyze the marketing campaign | |
Working with data | |
Assisted Practice: Analyze the marketing campaign using data from Amazon S3 | |
Assisted Practice: Analyze the marketing campaign using data from Presto | |
Amazon QuickSight: Visualization | |
Assisted Practice: Create Visuals | |
Amazon QuickSight: Stories | |
Assisted Practice: Create a Storyboard | |
Amazon QuickSight: Dashboard | |
Assisted Practice: Create a Dashboard | |
Data Visualization: Other Tools | |
Kibana | |
Assisted Practice: Create a Dashboard on Kibana | |
Key Takeaways | |
Exploratory Data Analysis Using AWS QuickSight | |
Lesson 10: Security | |
Introduction to AWS Bigdata Security | |
EMR Security | |
EMR Security: Best Practices | |
Roles | |
Fundamentals of Redshift Security | |
Data Protection and Encryption | |
Master Key, Encryption, and Decryption Process | |
Amazon Redshift Database Encryption | |
Key Management Service (KMS) Overview | |
Encryption using Hardware Security Modules | |
STS and Cross Account Access | |
CloudTrail | |
Key Takeaways | |
Practice Projects | |
Real-time Analytics on Streaming Data | |
Truegate S3 Replication Big Data Assignment | |
/****************************** | |
Azure BigData | |
******************************/ |