This repository implements the three components of CS50’s Tiny Search Engine:
- crawler — web crawler that pulls pages from a seed URL
- indexer — builds an inverted index from the crawled pages
- querier — answers search queries against the index

Requirements:
- A UNIX-compatible shell (macOS / Linux)
- make, gcc, and standard build tools
- Internet connection (for crawling)

From the top-level directory:

    # build libcs50 and all three tools
    make all

- Crawl

      # <pagedir> must not exist or be empty
      ./crawler/crawler <seedURL> <pagedir> <maxDepth>

  Example:

      ./crawler/crawler http://cs50tse.cs.dartmouth.edu/tse/letters pages 2

- Indexer

      mkdir indexdir
      ./indexer/indexer pages indexdir

- Querier

      ./querier/querier indexdir

- Clean

      make clean
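
The crawl step above requires that the page directory not exist or be empty. A minimal sketch of a pre-flight check for that condition, assuming a POSIX shell; pagedir_ok is a hypothetical helper name and is not part of this repository:

```shell
# Hypothetical helper (an illustration, not part of the repository).
# Succeeds when the path is absent, or is a directory with no entries.
pagedir_ok() {
    # A path that does not exist yet is acceptable.
    [ -e "$1" ] || return 0
    # An existing path must be a directory ...
    [ -d "$1" ] || return 1
    # ... and must list no entries, hidden files included.
    [ -z "$(ls -A -- "$1")" ]
}

# Possible use before crawling (arguments as in the Crawl step):
#   pagedir_ok pages && ./crawler/crawler <seedURL> pages <maxDepth>
```

The check only reports the state of the directory; it does not create or clear it, so an existing crawl is never deleted by accident.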