This repository provides a tool that uses a large language model (LLM) to help you choose a suitable professor for your PhD program. You enter keywords or a description of the professor you're looking for, and the Python code determines which professor in the AI area of CSRankings best matches your needs.
The tool leverages a RAG (Retrieval-Augmented Generation) approach to compare and match your preferences with professors. The repository includes scripts to help you gather and analyze professor data, focusing on artificial intelligence (AI) professors listed on CSRankings.
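To make the retrieval step of a RAG pipeline concrete, here is a minimal sketch that ranks professor profiles against a query. It is not the repository's implementation: it swaps the LLM embeddings for a plain bag-of-words cosine similarity so that it runs without an API key, and the function names and sample profiles are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_professors(query: str, profiles: dict[str, str], top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k (name, score) pairs, best match first."""
    query_vector = tokenize(query)
    scored = [(name, cosine(query_vector, tokenize(text))) for name, text in profiles.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical profiles, invented for illustration only.
profiles = {
    "Professor A": "reinforcement learning for robot manipulation and control",
    "Professor B": "natural language processing and machine translation",
    "Professor C": "computer vision for medical image segmentation",
}

matches = rank_professors("robot learning and control", profiles, top_k=2)
for name, score in matches:
    print(f"{name}: {score:.2f}")
```

In a full RAG setup, the retrieved profiles would then be passed to the LLM as context for generating the final recommendation.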
- Professor Matching: Input keywords or descriptions to find the best-matching professors.
- Data Collection: Scripts to gather and store information about AI professors.
- Information Extraction: Tools to extract and process professor data for analysis.
- `content.py`: Script to collect and store professor information.
- `professorinfo.py`: Script to include additional professors and their home pages.
- `storage.py`: Manages data storage.
- `input.py`: Handles user inputs and interactions.
- `llamaindex.ipynb`: Jupyter notebook for information extraction.
- `university_faculty_XXX.json`: Intermediate JSON files containing professor data.
- API Key Configuration: Add your OpenAI API key to the environment file to access the LLM services.
- Prepare Data: Decompress the `default_Vector_store.zip` file to avoid spending tokens on re-embedding the plain text from the professors' home pages.
- Run Scripts: Use the provided Python scripts and Jupyter notebook to collect, store, and analyze professor information.
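The first two setup steps can also be scripted. The sketch below is one possible way to do it, under two assumptions that the repository does not confirm: the key is exposed through an environment variable named `OPENAI_API_KEY`, and the archive should be extracted into the current directory. The helper names are hypothetical.

```python
import os
import zipfile
from pathlib import Path


def api_key_is_configured(var_name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the given environment variable is set and non-empty."""
    return bool(os.environ.get(var_name))


def prepare_vector_store(zip_path: str, target_dir: str) -> list[str]:
    """Extract the archive into target_dir and return the names of its members."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    if not api_key_is_configured():
        print("OPENAI_API_KEY is not set; add it to your environment file first.")

    archive_path = "default_Vector_store.zip"
    if Path(archive_path).exists():
        # Assumption: the scripts expect the extracted store in the working directory.
        names = prepare_vector_store(archive_path, ".")
        print(f"Extracted {len(names)} entries from {archive_path}")
    else:
        print(f"{archive_path} not found; run this from the repository root.")
```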
- Collect Professor Data: Use `content.py` and `professorinfo.py` to gather and manage professor information.
- Extract Information: Run the `llamaindex.ipynb` notebook to process and analyze the data.
- Match Professors: Input your preferences and descriptions to find the best-matching professor based on the analyzed data.
Feel free to contribute by adding more professors, updating the data collection scripts, or improving the matching algorithm. Pull requests are welcome!
Free use, no license.
Here's a BibTeX citation if you need one:
@misc{dong2024professor,
author = {Z. Dong},
title = {ProfessorGPT},
year = {2024},
howpublished = {\url{https://github.com/Zdong104/ProfessorGPT}},
note = {Accessed: [date accessed]}
}