Course curriculum

Course Overview

    1. Course Objectives

    2. Target Audience

    3. Course Prerequisites

    4. Value to the Professionals

    5. Value to the Professionals-2

    6. Value to the Professionals-3

    7. Lessons Covered

    8. Conclusion

Introduction to Apache Spark

    1. Objectives

    2. Need for New-Generation Distributed Systems

    3. Limitations of MapReduce in Hadoop

    4. Limitations of MapReduce in Hadoop-2

    5. Batch vs Real-Time Processing

    6. Application of Stream Processing

    7. Application of In-Memory Processing

    8. Introduction to Apache Spark

    9. History of Spark

    10. Language Flexibility in Spark

    11. Spark Execution Architecture

    12. Automatic Parallelization of Complex Flows

    13. Automatic Parallelization of Complex Flows-Important Points

    14. APIs That Match User Goals

    15. Apache Spark - A Unified Platform for Big Data Apps

    16. More Benefits of Apache Spark

    17. Running Spark in Different Modes

    18. Installing Spark as a Standalone Cluster - Configuration

    19. Demo - Install Apache Spark

    20. Overview of Spark on a Cluster

    21. Demo - Install Apache Spark-1

    22. Tasks of Spark on a Cluster

    23. Companies Using Spark - Use Cases

    24. Hadoop Ecosystem vs Apache Spark

    25. Hadoop Ecosystem vs Apache Spark-2

    26. Summary

    27. Summary-2

    28. Conclusion
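
For a concrete feel for what this module covers, the sketch below shows a minimal Spark driver program in Scala and how the master URL selects one of the running modes discussed above; the application name, the local[*] master, and the small computation are assumptions made for the example, not part of the course material.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FirstSparkApp {
  def main(args: Array[String]): Unit = {
    // The master URL chooses the running mode:
    //   "local[*]"          -> run inside the driver JVM using all cores
    //   "spark://host:7077" -> submit to a standalone cluster
    //   "yarn"              -> run on a YARN-managed Hadoop cluster
    val conf = new SparkConf()
      .setAppName("FirstSparkApp")   // illustrative application name
      .setMaster("local[*]")         // assumption: local mode for a quick test

    val sc = new SparkContext(conf)

    // A trivial parallelized computation to confirm the installation works.
    val data = sc.parallelize(1 to 1000)
    println(s"Sum computed by Spark: ${data.reduce(_ + _)}")

    sc.stop()
  }
}
```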

Introduction to Programming in Scala

    1. Objectives

    2. Introduction to Scala

    3. Basic Data Types

    4. Basic Literals

    5. Basic Literals-2

    6. Basic Literals-3

    7. Introduction to Operators

    8. Use Basic Literals and the Arithmetic Operator

    9. Demo - Use Basic Literals and the Arithmetic Operator

    10. Use the Logical Operator

    11. Demo - Use the Logical Operator

    12. Introduction to Type Inference

    13. Type Inference for Recursive Methods

    14. Type Inference for Polymorphic Methods and Generic Classes

    15. Unreliability of the Type Inference Mechanism

    16. Mutable Collection vs Immutable Collection

    17. Functions

    18. Anonymous Functions

    19. Objects

    20. Classes

    21. Use Type Inference, Functions, Anonymous Function and Class

    22. Demo - Use Type Inference, Functions, Anonymous Function and Class

    23. Traits as Interfaces

    24. Traits - Example

    25. Collections

    26. Types of Collections

    27. Types of Collections-2

    28. Lists

    29. Perform Operations on Lists

    30. Demo - Use Data Structures

    31. Maps

    32. Pattern Matching

    33. Implicits

    34. Implicits-2

    35. Streams

    36. Use Data Structures

    37. Demo - Perform Operations on Lists

    38. Summary

    39. Summary-2

    40. Conclusion
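
The short, self-contained Scala sketch below illustrates several of the features listed above: type inference, named and anonymous functions, a class and a trait used as an interface, immutable collections, and pattern matching. All identifiers in it (Greeter, FriendlyGreeter, and the sample data) are invented for the example and can be pasted into the Scala REPL.

```scala
// Type inference: the compiler infers Int and String here.
val count = 42
val name  = "Spark"

// A named function and an anonymous function.
def square(x: Int): Int = x * x
val double = (x: Int) => x * 2

// A trait used as an interface, and a class implementing it.
trait Greeter { def greet(who: String): String }
class FriendlyGreeter extends Greeter {
  def greet(who: String): String = s"Hello, $who"
}

// Immutable collections and higher-order operations.
val nums    = List(1, 2, 3, 4, 5)
val squares = nums.map(square)
val evens   = nums.filter(_ % 2 == 0)

// A Map and pattern matching on its Option result.
val scores = Map("alice" -> 90, "bob" -> 75)
scores.get("alice") match {
  case Some(s) => println(s"alice scored $s")
  case None    => println("no score found")
}

println(new FriendlyGreeter().greet(name))
println(s"squares: $squares, evens: $evens, double of count: ${double(count)}")
```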

Using RDDs for Creating Applications in Spark

    1. Objectives

    2. RDD API

    3. Creating RDDs

    4. Creating RDDs by Referencing an External Dataset

    5. Referencing an External Dataset - Text Files

    6. Referencing an External Dataset - Text Files-2

    7. Referencing an External Dataset - Sequence Files

    8. Referencing an External Dataset - Other Hadoop Input Formats

    9. Creating RDDs - Important Points

    10. RDD Operations

    11. RDD Operations - Transformations

    12. Features of RDD Persistence

    13. Storage Levels of RDD Persistence

    14. Invoking the Spark Shell

    15. Importing Spark Classes

    16. Creating the Spark Context

    17. Creating the Spark Context

    18. Loading a File in Shell

    19. Performing Some Basic Operations on Files in Spark Shell RDDs

    20. Packaging a Spark Project with SBT

    21. Running a Spark Project with SBT

    22. Demo - Build a Scala Project

    23. Build a Scala Project-1

    24. Demo - Build a Spark Java Project

    25. Build a Spark Java Project-1

    26. Shared Variables - Broadcast

    27. Shared Variables - Accumulators

    28. Writing a Scala Application

    29. Demo - Run a Scala Application

    30. Run a Scala Application

    31. Write a Scala Application Reading the Hadoop Data

    32. Write a Scala Application Reading the Hadoop Data

    33. Demo - Run a Scala Application Reading the Hadoop Data

    34. Run a Scala Application Reading the Hadoop Data

    35. DoubleRDD Methods

    36. PairRDD Methods - Join

    37. PairRDD Methods - Others

    38. JavaPairRDD Methods

    39. JavaPairRDD Methods-2

    40. General RDD Methods

    41. General RDD Methods-2

    42. Java RDD Methods

    43. Common Java RDD Methods

    44. Spark Java Function Classes

    45. Method for Combining JavaPairRDD Functions

    46. Transformations in RDD

    47. Other Methods

    48. Actions in RDD

    49. Key-value Pair RDD in Scala

    50. Key-value Pair RDD in Java

    51. Using MapReduce and Pair RDD Operations

    52. Reading Text File from HDFS

    53. Reading Sequence File from HDFS

    54. Writing Text Data to HDFS

    55. Writing Sequence File to HDFS

    56. Using GroupBy

    57. Using GroupBy-2

    58. Demo - Run a Scala Application Performing GroupBy Operation

    59. Run a Scala Application Performing GroupBy Operation-1

    60. Demo - Write and Run a Java Application

    61. Write and Run a Java Application

    62. Summary

    63. Summary-2

    64. Conclusion
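
The sketch below ties together several of the RDD lessons above: creating RDDs from a collection and from an external dataset, lazy transformations and actions, pair RDDs with reduceByKey, persistence, and the shared variables (broadcast variables and accumulators). The HDFS paths are placeholders, and sc.accumulator reflects the Spark 1.x-style API that the SQLContext-era lesson titles suggest.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddBasics {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RddBasics").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // Creating RDDs: from a parallelized collection and from an external dataset.
    val numbers = sc.parallelize(1 to 100)
    val lines   = sc.textFile("hdfs:///data/input.txt")   // placeholder path

    // Transformations are lazy; actions such as count() trigger execution.
    val bigSquares = numbers.map(n => n * n).filter(_ > 1000)
    println(s"Squares greater than 1000: ${bigSquares.count()}")

    // Shared variables: a broadcast variable and an accumulator (Spark 1.x API).
    val stopWords  = sc.broadcast(Set("the", "a", "an"))
    val blankLines = sc.accumulator(0)

    // A pair RDD and a key-based aggregation: word count without stop words.
    val counts = lines
      .flatMap { line =>
        if (line.trim.isEmpty) blankLines += 1
        line.split("\\s+")
      }
      .filter(word => word.nonEmpty && !stopWords.value.contains(word))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Persistence: cache an RDD that is reused by more than one action.
    counts.cache()
    println(s"Distinct words: ${counts.count()}")
    counts.saveAsTextFile("hdfs:///data/wordcounts")       // placeholder output path
    println(s"Blank lines seen: ${blankLines.value}")

    sc.stop()
  }
}
```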

Running SQL Queries Using Spark SQL

    1. Objectives

    2. Importance of Spark SQL

    3. Benefits of Spark SQL

    4. DataFrames

    5. SQLContext

    6. SQLContext-2

    7. Creating a DataFrame

    8. Using DataFrame Operations

    9. Using DataFrame Operations-2

    10. Demo - Run Spark SQL with a DataFrame

    11. Run Spark SQL Programmatically-1

    12. Save Modes

    13. Saving to Persistent Tables

    14. Parquet Files

    15. Partition Discovery

    16. Schema Merging

    17. JSON Data

    18. Hive Table

    19. DML Operation - Hive Queries

    20. Demo - Run Hive Queries Using Spark SQL

    21. JDBC to Other Databases

    22. Supported Hive Features

    23. Supported Hive Features-2

    24. Supported Hive Data Types

    25. Case Classes

    26. Case Classes-2

    27. Summary

    28. Summary-2

    29. Conclusion
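
A compact sketch of the SQLContext and DataFrame workflow covered above: building a DataFrame from a case class, applying DataFrame operations, running SQL against a temporary table, and saving to Parquet. The Employee case class, the sample rows, and the file name are invented for the example, and the entry point shown is the SQLContext/registerTempTable style rather than the newer SparkSession API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// A case class gives the DataFrame its schema (see "Case Classes" above).
case class Employee(name: String, dept: String, salary: Double)

object SparkSqlBasics {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkSqlBasics").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // SQLContext is the Spark SQL entry point used in these lessons.
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val employees = sc.parallelize(Seq(
      Employee("Asha",  "eng",   95000),
      Employee("Ravi",  "eng",   88000),
      Employee("Meera", "sales", 72000)
    )).toDF()

    // DataFrame operations.
    employees.filter($"salary" > 80000).show()
    employees.groupBy("dept").avg("salary").show()

    // Running SQL against a registered temporary table.
    employees.registerTempTable("employees")
    sqlContext.sql("SELECT dept, COUNT(*) AS n FROM employees GROUP BY dept").show()

    // Saving to and reading back from Parquet (save modes, Parquet files).
    employees.write.mode("overwrite").parquet("employees.parquet")
    sqlContext.read.parquet("employees.parquet").show()

    sc.stop()
  }
}
```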

Spark Streaming

    1. Objectives

    2. Introduction to Spark Streaming

    3. How Spark Streaming Works

    4. Streaming Word Count

    5. Micro Batch

    6. DStreams

    7. DStreams-2

    8. Input DStreams and Receivers

    9. Input DStreams and Receivers-2

    10. Basic Sources

    11. Advanced Sources

    12. Transformations on DStreams

    13. Output Operations on DStreams

    14. Design Patterns for Using foreachRDD

    15. DataFrame and SQL Operations

    16. DataFrame and SQL Operations-2

    17. Checkpointing

    18. Enabling Checkpointing

    19. Socket Stream

    20. File Stream

    21. Stateful Operations

    22. Window Operations

    23. Types of Window Operations

    24. Types of Window Operations-2

    25. Join Operations - Stream-Dataset Joins

    26. Monitoring Spark Streaming Application

    27. Performance Tuning - High Level

    28. Demo - Capture and Process the Netcat Data

    29. Capture and Process the Flume Data

    30. Demo - Capture the Twitter Data

    31. Capture the Twitter Data

    32. Summary

    33. Summary-2

    34. Conclusion
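
The streaming word-count sketch below shows how the pieces above fit together: a StreamingContext with a micro-batch interval, a socket-based input DStream, transformations and output operations, checkpointing, and a window operation. The host, port, batch interval, and checkpoint directory are assumptions chosen for the example.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    // At least two local threads: one for the receiver, one for processing.
    val conf = new SparkConf().setAppName("StreamingWordCount").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(10))   // 10-second micro-batches
    ssc.checkpoint("checkpoint")   // enables checkpointing, needed by stateful operations

    // Socket stream: feed text with `nc -lk 9999` on the same host (assumed).
    val lines = ssc.socketTextStream("localhost", 9999)
    val pairs = lines.flatMap(_.split("\\s+")).map(word => (word, 1))

    // Per-batch counts, plus a sliding window: the last 60 seconds, every 20 seconds.
    val counts   = pairs.reduceByKey(_ + _)
    val windowed = pairs.reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(60), Seconds(20))

    // Output operations print a sample of each batch to the driver log.
    counts.print()
    windowed.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```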

About this course

  • Free
  • 281 lessons
  • 4 hours of video content