The ‘Modern Web Scraping with Python using Scrapy Splash Selenium’ course teaches the fundamentals of web scraping. In this course, you will learn how to build a complete spider from A to Z.
The course also teaches how to store the extracted data in MongoDB and SQLite3 and how to scrape JavaScript websites using Splash and Selenium. You will also learn how to deploy spiders, as well as Splash, to Heroku. The course usually costs INR 2,799 on Udemy, but you can click on the link and get ‘Modern Web Scraping with Python using Scrapy Splash Selenium’ for INR 449.
Who can opt for this course?
- Anyone who wants to scrape information from any website
- Anyone who wants to learn Scrapy
- Anyone who wants to automate the process of copying content from websites
- Anyone who wants to learn how to use Scrapy-Splash and Selenium to scrape JavaScript websites
Course Highlights
Key Highlights | Details |
---|---|
Registration Link | Apply Now! |
Price | INR 449 |
Duration | 8.5 Hours |
Rating | 4.3/5 |
Student Enrollment | 22,790 students |
Instructor | Ahmed Rafik https://www.linkedin.com/in/ahmedrafik |
Topics Covered | Scrapy Fundamentals, XPath, CSS Selectors, JSON, Scrapy Splash & Selenium |
Course Level | Intermediate |
Total Student Reviews | 3,583 |
Learning Outcomes
- Understand the fundamentals of web scraping
- Scrape websites using Scrapy
- Understand XPath and CSS selectors
- Build a complete spider from A to Z
- Store the extracted data in MongoDB and SQLite3
- Scrape JavaScript websites using Splash and Selenium
- Build a CrawlSpider
- Understand crawling behaviour
- Build a custom middleware
- Follow web scraping best practices
- Avoid getting banned while scraping websites
- Bypass Cloudflare
- Scrape APIs
- Scrape websites with infinite scroll
- Work with cookies
- Deploy spiders locally and to the cloud
- Run spiders periodically
- Build datasets
- Log in to websites using Scrapy
- Download files and images using Scrapy
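To give a feel for one of these outcomes, here is a minimal sketch of how scraped items might be stored in SQLite3. It is not taken from the course. It assumes the usual shape of a Scrapy item pipeline (a plain class with `open_spider`, `process_item` and `close_spider` methods), and the class name `SQLitePipeline`, the `products` table and the `title` and `price` fields are hypothetical names chosen for illustration.

```python
import sqlite3


class SQLitePipeline:
    """Scrapy-style item pipeline that writes each scraped item to SQLite.

    The table name and item fields are assumptions made for this sketch.
    """

    def __init__(self, db_path=":memory:"):
        # ":memory:" keeps the sketch self-contained; a real project would
        # normally point this at a file on disk.
        self.db_path = db_path
        self.connection = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the database and make
        # sure the target table exists.
        self.connection = sqlite3.connect(self.db_path)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS products (title TEXT, price REAL)"
        )

    def process_item(self, item, spider):
        # Called for every scraped item: insert it, then return the item so
        # that any later pipeline can keep processing it.
        self.connection.execute(
            "INSERT INTO products (title, price) VALUES (?, ?)",
            (item["title"], item["price"]),
        )
        self.connection.commit()
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: release the connection.
        self.connection.close()


if __name__ == "__main__":
    # Stand-alone demonstration without Scrapy; the spider argument is unused here.
    pipeline = SQLitePipeline()
    pipeline.open_spider(spider=None)
    pipeline.process_item({"title": "Sample product", "price": 9.5}, spider=None)
    print(pipeline.connection.execute("SELECT * FROM products").fetchall())
    pipeline.close_spider(spider=None)
```

In a Scrapy project, a class like this would typically be enabled through the `ITEM_PIPELINES` setting; the demonstration at the bottom only calls the methods by hand.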
Course Content
S.No. | Module (Duration) | Topics |
---|---|---|
1. | Introduction (17 minutes) | Intro to Web Scraping & Scrapy |
| | | Setting up Scrapy the Development Environment (Updated) |
| | | Add VSCODE to path (Mac users) |
| | | Udemy 101 (Please don’t skip*) |
| | | Asking questions |
2. | Scrapy Fundamentals (30 minutes) | Scrapy fundamentals PART 1 |
| | | Scrapy fundamentals PART 2 |
| | | Scrapy fundamentals PART 3 |
| | | Scrapy fundamentals PART 4 |
| | | Scrapy fundamentals PART 5 |
3. | XPath expressions & CSS Selectors (36 minutes) | Downloadable files |
| | | XPath & CSS Selectors |
| | | CSS Selectors fundamentals |
| | | CSS selectors in theory |
| | | XPath fundamentals |
| | | Navigating using XPath (Going UP) |
| | | Navigating using XPath (Going DOWN) |
| | | XPath in theory |
4. | Project 1 Spiders from A to Z (21 minutes) | Worldometers PART 1 |
| | | Worldometers PART 2 |
| | | Worldometers PART 3 |
| | | Worldometers PART 4 |
| | | Project source code |
| | | Exercise |
5. | Building Datasets (04 minutes) | Building datasets |
6. | Project 2 Dealing with Multiple pages (23 minutes) | Website URL (Please do not skip) |
| | | Setting up the project |
| | | Setting up the project – Code update – |
| | | Building the spider |
| | | Dealing with pagination |
| | | Spoofing request headers |
| | | TinyDeal project source code |
| | | Exercise 2 |
7. | Debugging spiders (15 minutes) | What is debugging? |
| | | Debugging spiders PART 1 |
| | | Debugging spiders PART 2 |
8. | Let’s take a break ! (04 minutes) | The “whys” & “whens” of web scraping |
| | | Web scraping challenges |
9. | Project 3 Build Crawlers using Scrapy (21 minutes) | Website URL update |
| | | Crawl spider structure |
| | | The Rule object |
| | | Following links in pagination |
| | | Spoofing request headers |
| | | Project source code |
| | | Exercise |
10. | Splash crash course (30 minutes) | What dilemma splash came to solve |
| | | Setting up Splash (Windows Pro/Enterprise edition & Mac OS) |
| | | Setting up Splash (Windows Home Edition) |
| | | Setting up Splash (Linux) |
| | | Introduction to Splash |
| | | Working with elements |
| | | Spoofing request headers |
11. | Project 4 Scraping JavaScript websites using Splash (16 minutes) | Website URL update |
| | | Splash incognito mode |
| | | Using Splash with Scrapy |
| | | Parsing (BAD HTML MARKUP) |
| | | Project source code |
| | | Exercise |
12. | Project 5 Scraping JavaScript websites using Selenium (54 minutes) | Selenium basics |
| | | ElementNotInteractable Exception |
| | | Selenium with Scrapy |
| | | Selenium Middleware PART 1 (NEW) |
| | | Selenium Middleware PART 2 (NEW) |
| | | Project source code |
13. | Working with Pipelines (22 minutes) | Pipelines |
| | | Storing data in MongoDB |
| | | Storing data in SQLite3 |
| | | Project source code |
14. | Scraping APIs (NEW) (23 minutes) | Scraping APIs PART 1 |
| | | Scraping APIs PART 2 |
| | | Scraping APIs PART 3 |
| | | Scraping APIs PART 4 |
| | | Scraping APIs PART 5 |
| | | Project source code |
15. | Log in to websites (NEW) (19 minutes) | Log in to websites PART 1 |
| | | Log in to websites PART 2 |
| | | Log in to websites PART 3 (JavaScript required) |
| | | Project source code |
16. | Project 6 Bypass Cloudflare (13 minutes) | Website URL update |
| | | Bypass Cloudflare PART 1 |
| | | Bypass Cloudflare PART 2 |
| | | Project source code |
17. | APPENDIX (OLDER SCRAPY 1.5 CONTENT) (02 hours 57 minutes) | *IMPORTANT* |
| | | Avoid getting banned PART 1 |
| | | Avoid getting banned PART 2 |
| | | Avoid getting banned PART 3 |
| | | Scraping APIs PART 1 |
| | | Scraping APIs PART 2 |
| | | Scraping APIs PART 3 |
| | | Scraping APIs PART 4 |
| | | Hidden XHR |
| | | Scraping APIs PART 5 |
| | | IMPORTANT NOTE |
| | | Scraping APIs PART 6 |
| | | Spider Arguments |
| | | Scraping APIs PART 7 |
| | | *IMPORTANT* |
| | | Another way to scrape Airbnb restaurant detail page |
| | | Deploying spiders PART 1 |
| | | Deploying spiders PART 2 |
| | | Deploying spiders PART 3 |
| | | Deploying spiders PART 4 |
| | | Execute spiders periodically |
| | | Deploy Splash to Heroku |
| | | *IMPORTANT* |
| | | Project source code |
| | | Project source code |
| | | Challenge for those who are adventurous |
| | | Login to websites using FormRequest |
| | | XML Http Post Requests |
| | | Project source code |
| | | Code UPDATE XHR repeated data (Assignment) |
| | | Media Pipelines |
| | | The Images Pipeline |
| | | Extending The Images Pipeline (Store images with custom names) |
| | | *IMPORTANT* |
| | | Files Pipeline (Article) |
| | | Challenge (Files Pipeline) |
| | | Project source code |
| | | Using Crawlera with Scrapy |
| | | Using Crawlera with Splash |
| | | Using Heroku as a Proxy (FREE) |
| | | Using FREE Proxies with the CrawlSpider |
| | | *IMPORTANT* |
| | | Challenge |
| | | Project source code |
18. | BONUS (18 seconds) | Files Pipeline |
| | | Bonus Lecture |
Resources Required
- Basic knowledge of Python
- A computer with an internet connection
Featured Review
Meeran Muhaiyaddin Muhammed (5/5) : One of the best ever Web scraping course I ever took on Udemy. It helped me alot in developing my web scraping skills in my work environment.
Pros
- Shlomit Dror (5/5) : I am deeply impressed by the quality of this course; the videos and teacher’s explanations.
- Misha Student (5/5) : Really excellent course that walks you through setting up web scraping spiders for your projects.
- Okodi Ataime Benson (5/5) : This is the best web scraping course I have ever taken.
- Rémi Caland (5/5) : He takes time to explain a lot of part of his work, and this course is perfectly made.
Cons
- Priyadarshan Gupta (2/5) : Felt like, we were supposed to just follow instructions blindly without questioning.
- Unnamed Student (1/5) : Already left my comments about this horrible course in the Q&A section already.
- Utku Çamlıdağ (1/5) : Well due to lack of update of splash this course is now basicly useless.
- Unnamed Student (1/5) : I’m not one to disparage instructors on here because Udemy has been a value-added for me for the past one and a half years but dude, u really do not know how to teach.
About the Author
The instructor of this course is Ahmed Rafik, a developer and online instructor. With a 4.4 instructor rating and 4,380 reviews on Udemy, he offers 3 courses and has taught 31,518 students so far.
- Ahmed Rafik is a self-taught developer who teaches online courses through Udemy
- Using tools such as Scrapy, Splash, and Selenium, he has helped hundreds of people learn web scraping with Python
- He believes learning to code should be simple for everyone, provided you pick a teacher with the right skills
- His courses give you the practical knowledge you need to launch a successful web scraping business right away
- He keeps his courses up to date and does his best to avoid dull theoretical explanations wherever feasible, which is what has made his web scraping courses the highest rated and best-selling on Udemy
- He looks forward to having you enrolled in one of his courses and will make sure to support you at every turn and respond to your questions
Comparison Table
Parameters | Modern Web Scraping with Python using Scrapy Splash Selenium | Scrapy: Powerful Web Scraping & Crawling with Python | Advanced Web Scraping with Python using Scrapy & Splash |
---|---|---|---|
Offers | INR 449 | INR 455 | INR 455 |
Duration | 8.5 hours | 11 hours | 5.5 hours |
Rating | 4.3/5 | 4.3/5 | 4.8/5 |
Student Enrollments | 22,790 | 16,133 | 6,160 |
Instructors | Ahmed Rafik | GoTrained Academy | Ahmed Rafik |
Register Here | Apply Now! | Apply Now! | Apply Now! |