Python Web Scraping Projects

Packages

Beautiful Soup
Selenium
Scrapy
Pymongo
Inquirer

Get Top 100 and Top Genre-specific films from Letterboxd's Top 250

- Used BeautifulSoup and Selenium WebDriver to scrape HTML content
- Used Inquirer to prompt for user input
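
A minimal sketch of the parsing and filtering step. The HTML snippet, the CSS classes (`film`, `title`), the `data-genre` attribute, and the helper names `parse_films` and `top_films` are all made up for illustration and are not taken from Letterboxd or from this repo. In practice the HTML string would likely come from Selenium's `driver.page_source`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a rendered ranking page.
SAMPLE_HTML = """
<ul class="film-list">
  <li class="film" data-genre="drama"><span class="title">Film A</span></li>
  <li class="film" data-genre="horror"><span class="title">Film B</span></li>
  <li class="film" data-genre="drama"><span class="title">Film C</span></li>
</ul>
"""


def parse_films(html):
    """Return one dict per film, ranked by position in the page."""
    soup = BeautifulSoup(html, "html.parser")
    films = []
    for rank, node in enumerate(soup.select("li.film"), start=1):
        films.append(
            {
                "rank": rank,
                "title": node.select_one(".title").get_text(strip=True),
                "genre": node["data-genre"],
            }
        )
    return films


def top_films(films, limit, genre=None):
    """Keep the first `limit` films, optionally restricted to one genre."""
    matches = [f for f in films if genre is None or f["genre"] == genre]
    return matches[:limit]


if __name__ == "__main__":
    for film in top_films(parse_films(SAMPLE_HTML), limit=2, genre="drama"):
        print(film["rank"], film["title"])
```

The genre passed to `top_films` could be the answer returned by an Inquirer prompt.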

Get Top 1000 Fiction books from Goodreads community

- Used Scrapy to extract data from the HTML and save it to JSON
- Stored items in MongoDB through a Scrapy pipeline
- Re-running the spider does not store duplicate items in MongoDB
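
One possible way to get duplicate-free storage is to upsert on a unique key, sketched below. This is not necessarily how the pipeline in this repo does it. The class name `MongoUpsertPipeline`, the `books` collection, the `url` key field, and the `MONGO_URI` / `MONGO_DATABASE` setting names are assumptions chosen for the example.

```python
class MongoUpsertPipeline:
    """Scrapy item pipeline that upserts each item into MongoDB by a unique key."""

    def __init__(self, mongo_uri, mongo_db, collection_name="books", key_field="url"):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db
        self.collection_name = collection_name  # hypothetical collection name
        self.key_field = key_field  # hypothetical field that identifies an item

    @classmethod
    def from_crawler(cls, crawler):
        # Setting names are assumptions; define them in settings.py.
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI", "mongodb://localhost:27017"),
            mongo_db=crawler.settings.get("MONGO_DATABASE", "scraping"),
        )

    def open_spider(self, spider):
        # Imported here so the class itself loads without the driver installed.
        import pymongo

        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.mongo_db][self.collection_name]
        # A unique index makes MongoDB itself reject duplicate keys.
        self.collection.create_index(self.key_field, unique=True)

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        doc = dict(item)
        # Insert if the key is new, otherwise update the existing document.
        self.collection.update_one(
            {self.key_field: doc[self.key_field]},
            {"$set": doc},
            upsert=True,
        )
        return item
```

Because `update_one(..., upsert=True)` is idempotent for a given key, running the spider again refreshes existing documents instead of adding new ones. The pipeline would still need to be enabled through `ITEM_PIPELINES` in the project settings.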

Get all NY Mets hats from MLBShop

- Used Scrapy to extract data
- Saved items to MongoDB from a pipeline and wrote the log output to a file
- Used spider contracts to assert on the scraped data
