A comprehensive, production-ready guide to fine-tuning Large Language Models with practical implementations and real-world techniques.
This repository contains 29 complete modules covering everything from fundamental concepts to cutting-edge fine-tuning techniques for Large Language Models. Each module includes detailed Jupyter notebooks with theoretical explanations, practical code examples, and best practices.
Whether you're fine-tuning BERT, LLaMA, GPT, Gemini, or training custom embeddings, you'll find comprehensive guidance here.
- 29 Progressive Modules - From basics to advanced techniques
- 98.8% Jupyter Notebooks - Hands-on, executable code examples
- Multiple LLM Frameworks - HuggingFace, Axolotl, Unsloth, LLaMA-Factory
- Production-Ready - Industry best practices and optimization techniques
- Complete Toolchain - LoRA, QLoRA, Quantization, RLHF, DPO, ORPO, and more
- Multimodal Training - Image-text model fine-tuning
- Multiple Model APIs - OpenAI, Google Gemini, and open-source models
| # | Module | Topic |
|---|---|---|
| 01 | LLM Fine-Tuning-01 | Foundation Concepts & Introduction |
| 02 | LLM Fine-Tuning-02 | Core Fine-Tuning Principles |
| 04 | LLM Fine-Tuning-04 | Advanced Foundations |
| 05 | LLM Fine-Tuning-05 | Why Fine-Tuning is Hard in LSTMs |
| # | Module | Topic |
|---|---|---|
| 08 | LLM Fine-Tuning-08 | HuggingFace Transformers Guide |
| 09 | LLM Fine-Tuning-09 | BERT Fine-Tuning Deep Dive |
| # | Module | Topic |
|---|---|---|
| 10-11 | LLM Fine-Tuning-10-11 | Knowledge Distillation |
| 12-13 | LLM Fine-Tuning-12-13 | LLM Quantization Strategies |
| 14 | LLM Fine-Tuning-14 | Domain-Specific Fine-Tuning with PDF Data |
| 15 | LLM Fine-Tuning-15 | Instruction Fine-Tuning Explained |
| 16 | LLM Fine-Tuning-16 | Preference-Based Training |
| # | Module | Topic |
|---|---|---|
| 17 | LLM Fine-Tuning-17 | LLaMA-Factory Complete Guide |
| 18 | LLM Fine-Tuning-18 | Unsloth - Fast Fine-Tuning Framework |
| 19 | LLM Fine-Tuning-19 | Axolotl Training Framework |
| # | Module | Topic |
|---|---|---|
| 20 | LLM Fine-Tuning-20 | OpenAI GPT Fine-Tuning |
| 21 | LLM Fine-Tuning-21 | Google GEMINI Fine-Tuning |
| 22 | LLM Fine-Tuning-22 | Fine-Tune Any Small Language Model (SLM) |
| 23 | LLM Fine-Tuning-23 | Multimodal LLM Fine-Tuning |
| 24 | LLM Fine-Tuning-24 | Embedding Models & Embedding Fine-Tuning |
| # | Module | Topic |
|---|---|---|
| 25 | LLM Fine-Tuning-25 | LoRA (Low-Rank Adaptation) |
| 26 | LLM Fine-Tuning-26 | RLHF (Reinforcement Learning from Human Feedback) |
| 27 | LLM Fine-Tuning-27 | GRPO (Group Relative Policy Optimization) |
| 28 | LLM Fine-Tuning-28 | DPO (Direct Preference Optimization) |
| 29 | LLM Fine-Tuning-29 | ORPO (Odds Ratio Preference Optimization) |
| Module | Topic |
|---|---|
| Unsloth vs HuggingFace | Performance & Framework Comparison |
| Crash Course | Quick Start Guide for Rapid Learning |
- Python 3.8 or higher
- CUDA 11.8+ (for GPU acceleration, recommended)
- 8GB+ GPU memory (4GB minimum for quantized models)
- Jupyter Notebook or JupyterLab
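Before installing anything heavy, it can help to check what is already present. The snippet below is a minimal sketch using only the standard library; the package list is taken from the install commands in this README, and the `check_environment` helper is a hypothetical name introduced here for illustration. It does not check CUDA or GPU memory.

```python
import importlib.util
import sys

# Package names taken from the pip install commands in this README.
REQUIRED = ["torch", "transformers", "datasets", "peft", "accelerate", "bitsandbytes"]


def check_environment(min_python=(3, 8), packages=REQUIRED):
    """Return (python_ok, missing_packages) without importing the packages."""
    python_ok = sys.version_info[:2] >= tuple(min_python)
    # find_spec returns None when a top-level package cannot be located.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing packages:", ", ".join(missing) or "none")
```

Anything reported as missing can then be installed with `pip`.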
# Clone the repository
git clone https://github.com/mdzaheerjk/Complete-LLM-Finetuning.git
cd Complete-LLM-Finetuning
# Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt # If available
# Or install individually:
pip install torch transformers datasets jupyter pandas numpy scikit-learn
pip install bitsandbytes peft accelerate
# Start Jupyter
jupyter notebook
# Navigate to LLM-Finetuning-Crash-Course for quick start
# Or begin with LLM Fine-Tuning-01 for foundations
Learning Path
Start here → LLM Fine-Tuning Crash Course
Then → LLM Fine-Tuning-01 (Fundamentals)
Then → LLM Fine-Tuning-08 (HuggingFace)
Then → Your specific interest (25-29)
Memory-Constrained Fine-Tuning
- Module 25: LoRA (Low memory!)
- Module 12-13: Quantization
- Module 18: Unsloth (Ultra-fast)
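To see why quantization matters on a small GPU, a back-of-the-envelope estimate of weight memory is useful. This is a rough sketch, not a measurement: it counts only the bytes needed to store the weights and ignores activations, gradients, optimizer state, and quantization overhead. The 7B parameter count is an illustrative assumption, and `weight_memory_gb` is a hypothetical helper.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params, dtype):
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Illustrative assumption: a model with 7 billion parameters.
    for dtype in BYTES_PER_PARAM:
        print(f"7B model, {dtype}: {weight_memory_gb(7e9, dtype):.1f} GB")
```

Actual training memory is higher than these figures, since the optimizer and activations also need space.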
Production Deployment
- Module 14: Domain-Specific Fine-Tuning
- Module 15: Instruction Fine-Tuning
- Module 17/19: Professional Frameworks (LLaMA-Factory, Axolotl)
Alignment & Safety
- Module 26: RLHF (Industry standard)
- Module 28: DPO (Simpler alternative to RLHF)
- Module 29: ORPO (Preference optimization folded into SFT, no reference model)
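To make "simpler alternative to RLHF" concrete, here is a minimal sketch of the DPO loss for a single preference pair, written in plain Python. It assumes you already have the summed log-probabilities of the chosen and rejected responses under both the policy being trained and a frozen reference model; the function names and the `beta=0.1` default are illustrative choices, not taken from the modules.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return softplus(-logits)


if __name__ == "__main__":
    # Policy identical to the reference: loss equals log(2).
    print(dpo_loss(-12.0, -12.0, -12.0, -12.0))
    # Policy favours the chosen response more than the reference does: lower loss.
    print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

No reward model or RL loop appears here: the preference signal enters directly through the log-probability margins, which is what makes DPO simpler to run than RLHF.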
Comparing Models
- Module 20: OpenAI GPT
- Module 21: Google GEMINI
- Module 22: Open-source SLMs
Advanced Applications
- Module 23: Multimodal LLM Fine-Tuning
- Module 24: Embedding & Vector Search Fine-Tuning
| Technique | Module | Level |
|---|---|---|
| LoRA | 25 | Intermediate |
| QLoRA | 12-13 | Advanced |
| RLHF | 26 | Advanced |
| DPO | 28 | Advanced |
| ORPO | 29 | Advanced |
| GRPO | 27 | Advanced |
| Knowledge Distillation | 10-11 | Advanced |
| Instruction Tuning | 15 | Intermediate |
| Preference Training | 16 | Advanced |
| Framework | Module | Best For |
|---|---|---|
| HuggingFace | 08 | Flexibility & Community |
| LLaMA-Factory | 17 | Production-grade training |
| Unsloth | 18 | Speed & Efficiency |
| Axolotl | 19 | Complex configurations |
| PEFT (LoRA) | 25 | Memory efficiency |
- LLaMA & LLaMA 2/3
- BERT & RoBERTa
- GPT-2, GPT-3, GPT-4
- GEMINI
- Mistral, Zephyr
- Custom embeddings
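- Full Fine-Tuning: Update all model parameters (expensive, high quality)
- LoRA: Train only small low-rank adapter matrices while the base weights stay frozen (memory efficient)
- QLoRA: LoRA on top of a quantized (typically 4-bit), frozen base model (ultra memory efficient)
- Prompt Tuning: Train only a small set of soft prompt embeddings; the model weights stay frozen
- Adapter Tuning: Insert small trainable adapter modules between frozen layers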
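The memory saving of LoRA comes from simple arithmetic: instead of updating a full `d_out x d_in` weight matrix, it trains two factors of shapes `d_out x r` and `r x d_in`. The sketch below counts parameters for one layer; the 4096 x 4096 shape and rank 8 are illustrative assumptions, not values from the modules.

```python
def full_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for LoRA factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 8  # illustrative layer shape and rank
    full = full_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

The ratio shrinks further as layers get wider, because the full count grows with `d_out * d_in` while the LoRA count grows only with `d_out + d_in`.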
- Supervised Fine-Tuning (SFT): Learn from labeled examples
- Reinforcement Learning from Human Feedback (RLHF): Align with human preferences
- Direct Preference Optimization (DPO): Simpler alignment without RL
- Knowledge Distillation: Transfer knowledge from large to small models
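As an illustration of the last item, the core of knowledge distillation can be sketched as a KL divergence between temperature-softened teacher and student distributions. This is a minimal plain-Python sketch for a single prediction; the temperature of 2.0 and the function names are assumptions for illustration, and in practice this term is usually combined with the ordinary cross-entropy on the true labels.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]
    print(distillation_loss([4.0, 1.0, 0.5], teacher))  # identical logits give 0.0
    print(distillation_loss([1.0, 1.0, 1.0], teacher))  # uninformed student gives a positive loss
```

A higher temperature spreads probability over more classes, which exposes more of the teacher's relative preferences to the student.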
- Perplexity: How well the model predicts held-out text (lower is better)
- BLEU/ROUGE: N-gram overlap between generated text and reference texts
- Accuracy/F1: Task-specific performance
- Speed & Memory: Efficiency metrics
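Perplexity is the easiest of these to compute by hand: it is the exponential of the average negative log-likelihood per token. The sketch below assumes you already have per-token natural-log probabilities from a model; the `perplexity` helper is a hypothetical name.

```python
import math


def perplexity(token_logprobs):
    """exp of the mean negative log-probability (natural log) per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token is as uncertain
    # as a uniform choice among 4 options, so its perplexity is about 4.
    print(perplexity([math.log(0.25)] * 10))
```

Perplexities are only comparable between models that share the same tokenizer and evaluation text.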
Module: LLM Fine-Tuning-09 (BERT Fine-Tuning)
- Text classification, NER, Sentiment analysis
- Quick convergence, small datasets
Module: LLM Fine-Tuning-15 (Instruction Fine-Tuning)
- ChatGPT-like models, Q&A systems
- Requires instruction-response pairs
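Instruction-response pairs are usually rendered into a single training string with a fixed template. The sketch below shows one way to do that; the template text, the `</s>` end-of-sequence token, and the `format_example` helper are all illustrative assumptions, and the right template depends on the base model you are tuning.

```python
# Hypothetical template; many chat models ship their own required format.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def format_example(instruction, response, eos_token="</s>"):
    """Turn one instruction-response pair into prompt, completion, and full text."""
    prompt = PROMPT_TEMPLATE.format(instruction=instruction.strip())
    completion = response.strip() + eos_token
    return {"prompt": prompt, "completion": completion, "text": prompt + completion}


if __name__ == "__main__":
    example = format_example("Summarize LoRA in one sentence.",
                             "LoRA trains small low-rank matrices on top of frozen weights.")
    print(example["text"])
```

Keeping the prompt and completion separate makes it possible to compute the loss on the response tokens only, which is a common choice for instruction tuning.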
Module: LLM Fine-Tuning-25 (LoRA)
- Limited GPU memory constraints
- Often stays close to full fine-tuning quality while using far less memory (roughly 10x less, depending on model and rank)
Module: LLM Fine-Tuning-14 (PDF/Custom Data)
- Legal, medical, financial documents
- Domain-specific terminology
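Text extracted from long documents generally has to be split into pieces that fit the model's context window before it can be used for training. The sketch below shows word-based chunking with overlap, assuming the PDF text has already been extracted into a string; the chunk size, overlap, and `chunk_text` helper are illustrative assumptions rather than the approach used in Module 14.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of up to chunk_size words, sharing overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks. Token-based chunking with the model's own tokenizer is more precise when context limits are tight.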
Module: LLM Fine-Tuning-26 (RLHF) or 28 (DPO)
- Make models follow instructions better
- Reduce harmful outputs
- 29 Complete Modules covering the full fine-tuning spectrum
- 98.8% Jupyter Notebooks for hands-on learning
- Production-Ready Code with error handling and best practices
- MIT License - Free to use commercially
Contributions are welcome! Please feel free to:
- Report bugs and issues
- Suggest improvements
- Add documentation
- Submit pull requests
- Share your experiences
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
You're free to use this for:
- Commercial projects
- Educational purposes
- Research
- Personal learning
- Start with the relevant module for your use case
- Check module prerequisites before starting
- Ensure GPU availability for faster training
- Open an issue for bugs or questions
If you find this repository helpful, please consider:
- Giving it a star to support the project
- Sharing with your network
- Providing feedback for improvements
- Contributing your insights
- GitHub: @mdzaheerjk
- Repository: Complete-LLM-Finetuning
Planned Updates:
- Links to video tutorials
- Benchmark comparisons
- Cost analysis per method
- Additional model support
- Community contributions section
- This repository provides educational materials for LLM fine-tuning
- Always respect model licenses and terms of service
- Large model training requires significant computational resources
- Some techniques may have licensing implications - verify before commercial use
Made with ❤️ for the LLM community
Happy Fine-Tuning!