py-image-crawling

A general-purpose Python image crawler: a Selenium-based scraper that downloads images from search engines for any list of keywords. Use it to build custom image datasets for ML training, research, or visual analysis.

Overview

py-image-crawling uses Selenium WebDriver to scrape images from search engines. Given a list of keywords, the crawler downloads dozens of representative photos per keyword into one category folder per keyword, which makes it easy to build a labeled image dataset for any domain.

The tool is keyword-agnostic: it works for animals, products, places, faces, objects, paintings, or anything else indexed by image search engines.
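
The keyword-to-folder flow described above can be sketched roughly as follows. This is an illustrative outline, not the code in index.py: the helper names `folder_for` and `collect_image_urls`, the `url_template` parameter, and the folder-naming rule are all hypothetical, and the sketch assumes Selenium 4 with a Chrome driver available on the machine.

```python
import re
from pathlib import Path
from urllib.parse import quote_plus


def folder_for(keyword, root="dataset"):
    """Map a keyword to its category folder (hypothetical naming rule)."""
    # Collapse anything that is not a word character or hyphen into "_".
    safe = re.sub(r"[^\w\-]+", "_", keyword.strip()).strip("_")
    return Path(root) / safe


def collect_image_urls(keyword, url_template, limit=50):
    """Open a search results page and return up to `limit` image URLs.

    `url_template` is an assumption: a string such as
    "https://example.com/images?q={query}" for whichever engine is used.
    """
    # Imported lazily so folder_for() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url_template.format(query=quote_plus(keyword)))
        urls = []
        for img in driver.find_elements(By.TAG_NAME, "img"):
            src = img.get_attribute("src")
            # Keep only absolute http(s) sources and skip duplicates.
            if src and src.startswith("http") and src not in urls:
                urls.append(src)
            if len(urls) >= limit:
                break
        return urls
    finally:
        # Always close the browser, even if the page fails to load.
        driver.quit()
```

With these helpers, `folder_for("red panda")` resolves to `dataset/red_panda`, and `collect_image_urls("red panda", "https://example.com/images?q={query}")` would return the image sources found on that (placeholder) results page. Real search engines typically load more results on scroll, which this sketch does not handle.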

Use Cases

📊 ML / AI training data collection — Build labeled image datasets for classification or detection models
🔬 Visual research — Bulk-collect images for academic analysis
🛒 Product / market analysis — Scrape product images by category
🎨 Reference libraries — Build mood boards or visual archives
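
For the ML training use case, a folder-per-keyword layout means each folder name can serve as the class label. The snippet below is a minimal sketch of reading such a layout back as (path, label) pairs. The function name `labeled_samples` and the list of accepted file extensions are assumptions, not part of the repository.

```python
from pathlib import Path

# File extensions treated as images (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def labeled_samples(root="dataset"):
    """Return (image_path, label) pairs, using each folder name as the label."""
    samples = []
    root_path = Path(root)
    if not root_path.is_dir():
        return samples
    for folder in sorted(p for p in root_path.iterdir() if p.is_dir()):
        for image in sorted(folder.iterdir()):
            # Skip non-image files such as notes or logs.
            if image.is_file() and image.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image, folder.name))
    return samples
```

The resulting list can be shuffled and split into training and validation sets, or simply used to check how many images each keyword produced.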

Real-world Example

The crawler was originally built as the data pipeline for the kpopface AI face matcher: given a list of K-Pop idol names, it downloaded hundreds of representative photos per idol to train a Teachable Machine model. The same crawler works just as well with any other keyword set.

Tech Stack

Layer	Technology
Language	Python 3.9
Browser Automation	Selenium WebDriver

Local Development

git clone https://github.com/moony01/py-image-crawling.git
cd py-image-crawling

pip install -r requirements.txt

# Edit search keywords in index.py, then run
python index.py

Downloaded images land in the dataset/ directory, organized into one folder per keyword.
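
As a rough illustration of how a single image could end up in its keyword folder, here is a standard-library sketch. The function `save_image`, the zero-padded file naming, and the extension table are hypothetical. The actual naming used by index.py may differ.

```python
import urllib.request
from pathlib import Path

# Content types mapped to file extensions; anything else falls back to .jpg.
EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
}


def save_image(url, folder, index):
    """Fetch one image URL (http(s) or data: URI) and write it into `folder`."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=10) as response:
        content_type = response.headers.get_content_type()
        data = response.read()
    # Name files 0001.jpg, 0002.png, ... inside the keyword folder.
    target = folder / f"{index:04d}{EXTENSIONS.get(content_type, '.jpg')}"
    target.write_bytes(data)
    return target
```

For example, `save_image(url, "dataset/red_panda", 1)` would write `dataset/red_panda/0001.jpg` for a JPEG response. Because `urllib.request` also understands `data:` URIs, the same call handles inline base64 thumbnails, which image search pages often use.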

License

See the LICENSE file in the repository root for the license terms.

Contact

👤 @moony01
💖 github.com/sponsors/moony01
