py-image-crawling

General-purpose Python image crawler — Selenium-based scraper that downloads images from search engines based on any keyword list. Build custom image datasets for ML training, research, or visual analysis.

License: MIT | Python | Selenium


Overview

A general-purpose Python utility that uses Selenium WebDriver to scrape images from search engines. Pass any list of keywords and the crawler downloads dozens of representative photos per keyword into category folders — making it easy to build a labeled image dataset for any domain.

The tool is keyword-agnostic: animals, products, places, faces, objects, paintings, anything indexed by image search engines.
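As a rough illustration of the scraping step, the sketch below opens a results page with Selenium and gathers candidate image URLs. It is not the code in index.py: the function names `keep_downloadable` and `collect_image_urls`, the `search_url` parameter, and the default limit of 50 are all assumptions made for this example, and the caller is expected to supply the search engine's results URL.

```python
from urllib.parse import urlparse


def keep_downloadable(srcs, limit=50):
    """Keep unique http(s) image URLs, preserving order, up to `limit`."""
    seen = set()
    kept = []
    for src in srcs:
        if not src:
            continue  # <img> tags without a src attribute yield None
        if urlparse(src).scheme not in ("http", "https"):
            continue  # skip inline data: URIs and other non-fetchable schemes
        if src in seen:
            continue  # the same thumbnail often appears more than once
        seen.add(src)
        kept.append(src)
        if len(kept) >= limit:
            break
    return kept


def collect_image_urls(search_url, limit=50):
    """Open `search_url` (hypothetical, caller-supplied) and return image URLs."""
    # Imported lazily so keep_downloadable() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(search_url)
        images = driver.find_elements(By.TAG_NAME, "img")
        srcs = [img.get_attribute("src") for img in images]
        return keep_downloadable(srcs, limit)
    finally:
        driver.quit()  # always close the browser, even if the page fails to load
```

Keeping the filtering in a plain function separate from the browser code makes it possible to check the URL handling without launching Chrome. A real results page usually also needs scrolling or waiting before all thumbnails are present, which this sketch leaves out.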

Use Cases

  • 📊 ML / AI training data collection — Build labeled image datasets for classification or detection models
  • 🔬 Visual research — Bulk-collect images for academic analysis
  • 🛒 Product / market analysis — Scrape product images by category
  • 🎨 Reference libraries — Build mood boards or visual archives

Real-world Example

Originally built as the data pipeline for the kpopface AI face matcher — given a list of K-Pop idol names, it downloaded hundreds of representative photos per idol to train a Teachable Machine model. The same crawler works just as well with any other keyword set.

Tech Stack

Layer              | Technology
Language           | Python 3.9
Browser Automation | Selenium WebDriver

Local Development

git clone https://github.com/moony01/py-image-crawling.git
cd py-image-crawling

pip install -r requirements.txt

# Edit search keywords in index.py, then run
python index.py

Downloaded images land in the dataset/ directory, organized into one folder per keyword.
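The one-folder-per-keyword layout can be sketched as follows. This is a hypothetical illustration rather than the repository's own code: the helper names, the keyword-to-folder sanitizing rule, and the zero-padded file naming are assumptions, and index.py may name folders and files differently.

```python
import re
from pathlib import Path


def keyword_to_dirname(keyword):
    """Turn a keyword into a filesystem-friendly folder name (assumed rule)."""
    # Replace runs of characters other than letters, digits, "_" and "-".
    cleaned = re.sub(r"[^\w\-]+", "_", keyword.strip())
    return cleaned.strip("_") or "unnamed"


def save_image(root, keyword, index, data, ext=".jpg"):
    """Write image bytes to <root>/<keyword folder>/<index><ext>."""
    folder = Path(root) / keyword_to_dirname(keyword)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{index:04d}{ext}"
    path.write_bytes(data)
    return path


def count_images(root):
    """Return {folder name: number of files} for each keyword folder."""
    return {
        folder.name: sum(1 for f in folder.iterdir() if f.is_file())
        for folder in sorted(Path(root).iterdir())
        if folder.is_dir()
    }
```

After a crawl, calling `count_images("dataset")` gives a quick per-keyword tally, which helps spot keywords that returned too few images before training on the dataset.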

License

MIT License © 2024–2026 moony01
