The ‘Apache Spark 3 – Spark Programming in Python for Beginners’ course teaches the fundamentals of Apache Spark and the Spark architecture. It also teaches you how to use the PyCharm IDE for Spark development and debugging.
The course is designed for software engineers who want to develop data engineering pipelines and applications using Apache Spark. It is usually priced at INR 2,799 on Udemy, but you can click on the link and get ‘Apache Spark 3 – Spark Programming in Python for Beginners’ for INR 499.
Who can opt for this course?
- Software architects and engineers who want to use Apache Spark to plan and build big data engineering projects
- Developers and programmers who want to advance their knowledge of Apache Spark-based data engineering
Course Highlights
Key Highlights | Details |
---|---|
Registration Link | Apply Now! |
Price | INR 499 |
Duration | 09 Hours |
Rating | 4.5/5 |
Student Enrollment | 29,696 students |
Instructor | Prashant Kumar Pandey https://www.linkedin.com/in/prashantkumarpandey |
Topics Covered | PyCharm, Cluster Deployment, Data Engineering, Spark Programming |
Course Level | Intermediate |
Total Student Reviews | 5,279 |
Learning Outcomes
- Spark Architecture and the Apache Spark Foundation
- Data processing and engineering in Spark
- Using Data Sinks and Sources
- Working with Spark SQL and Data Frames
- Developing and debugging Spark code using the PyCharm IDE
- Cluster deployment, managing application logs, and unit testing
Course Content
S.No. | Module (Duration) | Topics |
---|---|---|
1. | Understanding Big Data and Data Lake (01 hour 18 minutes) | Section Overview |
| | | What is Big Data and How it Started |
| | | Hadoop Architecture, History, and Evolution |
| | | What is Data Lake and How it works |
| | | Introducing Apache Spark and Databricks Cloud |
2. | Installing and Using Apache Spark (01 hour 00 minutes) | Section Overview |
| | | Spark Development Environments |
| | | Setup your Databricks Community Cloud Environment |
| | | Introduction to Databricks Workspace |
| | | Create your First Spark Application in Databricks Cloud |
| | | Setup your Local Development IDE |
| | | Mac Users – Setup your Local Development IDE |
| | | Create your First Spark Application using IDE |
| | | Source Code and Other Resources |
3. | Spark Execution Model and Architecture (37 minutes) | Execution Methods – How to Run Spark Programs? |
| | | Check your knowledge |
| | | Spark Distributed Processing Model – How your program runs? |
| | | Spark Execution Modes and Cluster Managers |
| | | Check your knowledge |
| | | Summarizing Spark Execution Models – When to use What? |
| | | Working with PySpark Shell – Demo |
| | | Installing Multi-Node Spark Cluster – Demo |
| | | Working with Notebooks in Cluster – Demo |
| | | Working with Spark Submit – Demo |
| | | Section Summary |
| | | Check your knowledge |
4. | Spark Programming Model and Developer Experience (01 hour 27 minutes) | Creating Spark Project Build Configuration |
| | | Configuring Spark Project Application Logs |
| | | Check your knowledge |
| | | Creating Spark Session |
| | | Check your knowledge |
| | | Configuring Spark Session |
| | | Data Frame Introduction |
| | | Data Frame Partitions and Executors |
| | | Spark Transformations and Actions |
| | | Spark Jobs Stages and Task |
| | | Understanding your Execution Plan |
| | | Unit Testing Spark Application |
| | | Rounding off Summary |
5. | Spark Structured API Foundation (25 minutes) | Introduction to Spark APIs |
| | | Introduction to Spark RDD API |
| | | Working with Spark SQL |
| | | Spark SQL Engine and Catalyst Optimizer |
| | | Section Summary |
6. | Spark Data Sources and Sinks (59 minutes) | Spark Data Sources and Sinks |
| | | Spark DataFrameReader API |
| | | Reading CSV, JSON and Parquet files |
| | | Creating Spark DataFrame Schema |
| | | Spark DataFrameWriter API |
| | | Writing Your Data and Managing Layout |
| | | Spark Databases and Tables |
| | | Working with Spark SQL Tables |
7. | Spark Dataframe and Dataset Transformations (54 minutes) | Introduction to Data Transformation |
| | | Working with Dataframe Rows |
| | | DataFrame Rows and Unit Testing |
| | | Dataframe Rows and Unstructured data |
| | | Working with Dataframe Columns |
| | | Creating and Using UDF |
| | | Misc Transformations |
8. | Aggregations in Apache Spark (18 minutes) | Aggregating Dataframes |
| | | Grouping Aggregations |
| | | Windowing Aggregations |
9. | Spark Dataframe Joins (45 minutes) | Dataframe Joins and column name ambiguity |
| | | Outer Joins in Dataframe |
| | | Internals of Spark Join and shuffle |
| | | Optimizing your joins |
| | | Implementing Bucket Joins |
10. | Keep Learning (01 minute) | Final Word |
| | | Bonus Lecture : Get Extra |
11. | Archived – Apache Spark Introduction (21 minutes) | Big Data History and Primer |
| | | Understanding the Data Lake Landscape |
| | | What is Apache Spark – An Introduction and Overview |
| | | Check your knowledge |
12. | Archived – Installing and Using Apache Spark (46 minutes) | Spark Development Environments |
| | | Mac Users – Apache Spark in Local Mode Command Line REPL |
| | | Windows Users – Apache Spark in Local Mode Command Line REPL |
| | | Did you notice? |
| | | Mac Users – Apache Spark in the IDE – PyCharm |
| | | Windows Users – Apache Spark in the IDE – PyCharm |
| | | Did you notice? |
| | | Apache Spark in Cloud – Databricks Community and Notebooks |
| | | Check your knowledge |
| | | Apache Spark in Anaconda – Jupyter Notebook |
Resources Required
- Understanding of the Python programming language
- A modern 64-bit Windows, Mac, or Linux computer with 8 GB of RAM
Featured Review
McKenna Magoffin (4/5) : I especially liked the background to what’s going on ‘under the hood’ of spark and its operations
Pros
- Lei Lu (5/5) : It sets the bar as the best training instructor on Udemy.
- Aditya (5/5) : One need to know the under-the-hood mechanics to put it to best use.
- Charlene Johnson (5/5) : This course was great!! Thorough explanations of the history and under-workings of Spark.
- Venkat Somireddy (5/5) : One of best courses I have taken on Udemy and the best course on Spark.
Cons
- Biswajit (1/5) : very bad course and one of the worst teacher i have ever seen
- Felix Goins III (1/5) : Topics move very slow – not learning much other than the history – very boring
- Shardul P (2/5) : unfortunately, in Udemy we don’t get the playback speed for 0.85 or 0.90.
- Loic Villepinte (2/5) : Too much time spent on installation and outdated functions that are not needed anymore.
About the Author
The instructor of this course is Prashant Kumar Pandey, an architect, author, consultant, and trainer at Learning Journal. With a 4.6 instructor rating and 16,215 reviews on Udemy, he offers 12 courses and has taught 90,173 students so far.
- Prashant Kumar Pandey is dedicated to bridging the gap between people’s current skills and what they need for their future careers
- To that end, he writes books, publishes technical articles, and produces training videos to help IT professionals and students excel in the field
- He has over 18 years of experience in IT and has worked on numerous data-centric and big data projects with worldwide software services firms as a developer, architect, consultant, trainer, and mentor
- Prashant is a strong proponent of ongoing skill improvement and learning throughout one’s life
- He began posting free training videos on his YouTube channel to raise awareness of the value of lifelong learning, and he conceived the idea of a Learning Diary to record his own learning
- He founded the Learning Journal portal, which has offered a variety of skill development courses, training, and technical publications since early 2018
- He also serves as the site’s main editor and lead author
Comparison Table
Parameters | Apache Spark 3 – Spark Programming in Python for Beginners | Apache Spark 3 – Real-time Stream Processing using Python | Apache Spark 3 – Beyond Basics and Cracking Job Interviews |
---|---|---|---|
Offers | INR 499 | INR 455 | INR 455 |
Duration | 9 hours | 4.5 hours | 4 hours |
Rating | 4.5/5 | 4.6/5 | 4.6/5 |
Student Enrollments | 29,696 | 8,121 | 8,955 |
Instructors | Prashant Kumar Pandey | Prashant Kumar Pandey | Prashant Kumar Pandey |
Register Here | Apply Now! | Apply Now! | Apply Now! |