Apache Spark 3 for Data Engineering and Analytics with Python
Master Python and PySpark 3.0.1 for Data Engineering / Analytics (Databricks)
Lectures: 80
Duration: 8 hours
Lifetime Access
30-day Money-Back Guarantee
Course Description
Apache Spark 3 is an open-source distributed engine for querying and processing data. This course gives you a detailed understanding of PySpark and its stack, and guides you through the process of data analytics with PySpark. The author takes an interactive approach to explaining key PySpark concepts such as the Spark architecture, Spark execution, and transformations and actions using the structured API. You will be able to leverage the power of Python, Java, and SQL within the Spark ecosystem.
You will start by getting a firm understanding of the Apache Spark architecture and how to set up a Python environment for Spark. You will then learn techniques for collecting, cleaning, and visualizing data by creating dashboards in Databricks, and how to use SQL to interact with DataFrames. The author also provides an in-depth review of RDDs and contrasts them with DataFrames.
Problem challenges appear at intervals throughout the course so that you get a firm grasp of the concepts taught.
The code bundle for this course is available here: https://github.com/PacktPublishing/Apache-Spark-3-for-Data-Engineering-and-Analytics-with-Python
Target audience:
This course is designed for Python developers who wish to learn how to use the language for data engineering and analytics with PySpark, and for any aspiring data engineering and analytics professionals.
Goals
- Learn Spark architecture, transformations, and actions using the structured API.
- Learn to set up your own local PySpark environment.
- Learn to interpret the DAG (Directed Acyclic Graph) for Spark execution.
- Learn to interpret the Spark web UI.
- Learn the RDD (Resilient Distributed Dataset) API.
- Learn to visualize data (graphs and dashboards) on Databricks.
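As a rough illustration of the "transformations and actions" idea named in the goals, here is a minimal plain-Python sketch. It is an analogy only, not PySpark code and not taken from the course: `LazyDataset`, `double`, and `calls` are hypothetical names invented for this example. It mimics the general behaviour that transformations (such as `map` and `filter`) only record a step in a plan, while actions (such as `collect` and `count`) run that plan.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class LazyDataset:
    """Toy stand-in for a distributed dataset (hypothetical, NOT a PySpark class)."""

    def __init__(self, source: Iterable,
                 steps: Optional[List[Callable[[Iterator], Iterator]]] = None):
        self._source = list(source)
        self._steps = steps or []  # the recorded plan, in order

    # --- Transformations: return a new dataset, do no work yet ---
    def map(self, fn: Callable) -> "LazyDataset":
        step = lambda it: (fn(x) for x in it)
        return LazyDataset(self._source, self._steps + [step])

    def filter(self, pred: Callable) -> "LazyDataset":
        step = lambda it: (x for x in it if pred(x))
        return LazyDataset(self._source, self._steps + [step])

    # --- Actions: execute the recorded plan and return a result ---
    def collect(self) -> list:
        it: Iterator = iter(self._source)
        for step in self._steps:
            it = step(it)
        return list(it)

    def count(self) -> int:
        return len(self.collect())


calls = []  # records every time double() actually runs


def double(x: int) -> int:
    calls.append(x)
    return x * 2


numbers = LazyDataset(range(1, 6))                   # 1, 2, 3, 4, 5
plan = numbers.map(double).filter(lambda x: x > 4)   # nothing executed yet
calls_before_action = len(calls)                     # still 0 at this point
result = plan.collect()                              # the action triggers the work
print(calls_before_action, result)
```

The design point is that building `plan` costs nothing until an action is called, which is what lets an engine inspect the whole chain of steps before running it. The real Spark APIs and their execution model are covered in the course itself.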
Prerequisites
- Data scientists/analysts who wish to learn an analytical processing strategy that can be deployed over a big data cluster. Data managers who want to gain a deeper understanding of managing data over a cluster.

Curriculum
Check out the detailed breakdown of what’s inside the course
Introduction to Spark and Installation
15 Lectures
- Introduction 04:43
- The Spark Architecture 03:39
- The Spark Unified Stack 03:38
- Java Installation 06:29
- Hadoop Installation 05:26
- Python Installation 04:23
- PySpark Installation 07:56
- Install Microsoft Build Tools 02:35
- MacOS - Java Installation 03:45
- MacOS - Python Installation 04:17
- MacOS - PySpark Installation 07:16
- MacOS - Testing the Spark Installation 05:07
- Install Jupyter Notebooks 09:18
- The Spark Web UI 11:19
- Section Summary 02:23
Spark Execution Concepts
5 Lectures

RDD Crash Course
10 Lectures

Structured API - Spark DataFrame
32 Lectures

Introduction to Spark SQL and Databricks
18 Lectures

Instructor Details

Packt Publishing
Packt are an established, trusted, and innovative global technical learning publisher, founded in Birmingham, UK, with over eighteen years' experience delivering rich premium content from ground-breaking authors and lecturers on a wide range of emerging and established technologies for professional development.
Packt’s purpose is to help technology professionals advance their knowledge and support the growth of new technologies by publishing vital, user-focused, knowledge-based content faster than any other tech publisher. With a growing library of over 9,000 titles in book, e-book, audio, and video learning formats, our multimedia content is valued as a vital learning tool and offers exceptional support for the development of technology knowledge.
We publish on topics that are at the very cutting edge of technology, helping IT professionals learn about the newest tools and frameworks in a way that suits them.
Course Certificate
Use your certificate to make a career change or to advance in your current career.
