Python Web Scraping Projects

Packages

- Used BeautifulSoup and Selenium WebDriver to scrape HTML content
- Used Inquirer to prompt for user input<br>
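
As a rough illustration of the BeautifulSoup step, the sketch below parses a small HTML string and pulls out a few fields. The markup, the `SAMPLE_HTML` constant, and the `extract_products` helper are hypothetical and not taken from the project; in practice the HTML would more likely come from Selenium (for example `driver.page_source`).

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page fetched with Selenium
# (for example, the value of driver.page_source).
SAMPLE_HTML = """
<ul id="products">
  <li class="product"><a href="/item/1">Keyboard</a><span class="price">$30</span></li>
  <li class="product"><a href="/item/2">Mouse</a><span class="price">$15</span></li>
</ul>
"""


def extract_products(html):
    """Return one dict per product with its name, link, and price."""
    # "html.parser" is the parser bundled with the Python standard library.
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for li in soup.find_all("li", class_="product"):
        link = li.find("a")
        price = li.find("span", class_="price")
        products.append({
            "name": link.get_text(strip=True),
            "url": link["href"],
            "price": price.get_text(strip=True),
        })
    return products


if __name__ == "__main__":
    for product in extract_products(SAMPLE_HTML):
        print(product)
```

Keeping the parsing in a function that takes an HTML string makes it easy to try out on saved pages without starting a browser.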

- Used Scrapy to extract HTML data and save it to JSON
- Stored items in MongoDB with a Scrapy item pipeline
- Prevented duplicate items from being stored in MongoDB when running the spider <br>
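
One possible way to keep duplicates out of MongoDB is to upsert each item on a unique key inside the pipeline. The sketch below assumes that approach; the class name `MongoDedupPipeline`, the default connection settings, and the choice of `url` as the unique key are illustrative assumptions, not details from the project.

```python
class MongoDedupPipeline:
    """Scrapy item pipeline sketch: upsert on a unique key to avoid duplicates."""

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="scraping", collection_name="items",
                 unique_key="url"):
        # All defaults here are hypothetical placeholders.
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.unique_key = unique_key
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Imported lazily so the class can be loaded without pymongo installed.
        import pymongo
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        data = dict(item)
        # upsert=True inserts the document if no match exists and updates
        # the existing one otherwise, so re-running the spider adds no copies.
        self.collection.update_one(
            {self.unique_key: data[self.unique_key]},
            {"$set": data},
            upsert=True,
        )
        return item
```

To enable a pipeline like this, it would be registered under `ITEM_PIPELINES` in the Scrapy project's `settings.py`. An alternative design is to create a unique index on the key and drop items that raise a duplicate-key error.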

- Used Scrapy to extract data
- Saved items to MongoDB from the pipeline and wrote the log output to a file
- Used Spider Contracts to make assertions about the scraped data <br>