Generate professional Minutes of Meeting (MOM) from text, images, PDFs, and documents using Google Gemini 2.5 Flash AI
Quick Start • Documentation • Features • Tech Stack • Contributing

Multi-Format Input Support
- Text Input: Paste meeting notes directly
- File Upload: Support for multiple file types:
  - Images: PNG, JPG, GIF, WEBP (OCR + Vision AI)
  - PDFs: Automatic text extraction
  - Word Documents: DOCX file processing
  - Text Files: Direct TXT file support
- Mixed Content: Process images and documents together (up to 10 files)

AI-Powered Processing
- Google Gemini 2.5 Flash integration
- Advanced OCR for handwritten text and images
- Smart text extraction from PDFs and DOCX files
- Structured MOM generation with professional formatting

Multiple Download Formats
- Markdown (.md): Original format with full formatting
- Plain Text (.txt): Clean text without formatting
- Word Document (.docx): Professional document with styling
- Smart Naming: Files named after meeting agenda/title

Modern Interface
- Responsive design for all devices
- Clean, professional UI with Tailwind CSS
- Real-time processing status and progress indicators
- Smart file previews with type-specific icons
- Drag & drop file upload with validation

Privacy & Security
- Secure, temporary processing
- IST timezone handling
- No data storage or retention
- Client-side file processing where possible

Supports images, PDFs, DOCX, and TXT files with smart previews
- Python 3.8 or higher
- Google Gemini API key (available from Google AI Studio)

git clone https://github.com/yourusername/mom-builder-free.git
cd mom-builder-free

# Install all required dependencies automatically
python install_dependencies.py

cd backend
# Create .env file
echo "GEMINI_API_KEY=your_gemini_api_key_here" > .env
echo "PORT=8000" >> .env
# Start backend server
python main.py

# In a new terminal
cd frontend
# Create .env file
echo "SECRET_KEY=your-secret-key-here" > .env
echo "BACKEND_URL=http://localhost:8000" >> .env
echo "PORT=5000" >> .env
# Start frontend server
python app.py

Open your browser and navigate to: http://localhost:5000

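
The two `.env` files created in the setup steps are plain `KEY=VALUE` lines. As a quick sanity check before starting the servers, the sketch below parses such text and reports missing keys. It is a standard-library illustration only: `parse_env_text` and `check_required` are hypothetical helper names, and the project itself may load its configuration differently (for example through a dotenv library).

```python
from typing import Dict, Iterable, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_required(values: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the names from `required` that are missing or empty."""
    return [name for name in required if not values.get(name)]


# Example with the backend keys from the setup steps.
backend_env = parse_env_text("GEMINI_API_KEY=your_gemini_api_key_here\nPORT=8000\n")
print(check_required(backend_env, ["GEMINI_API_KEY", "PORT"]))
```

To check a real file, pass its contents instead, for example `parse_env_text(open("backend/.env").read())`.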

mom-builder-free/
├── backend/                      # FastAPI Backend
│   ├── models/                   # Pydantic models
│   │   └── requests.py           # Request/response models
│   ├── services/                 # Business logic
│   │   ├── gemini_service.py     # AI integration
│   │   ├── file_processor.py     # Multi-format file processing
│   │   └── file_converter.py     # Download format conversion
│   ├── utils/                    # Helper utilities
│   │   └── timezone_helper.py    # IST timezone handling
│   ├── main.py                   # FastAPI app entry point
│   └── requirements.txt          # Backend dependencies
├── frontend/                     # Flask Frontend
│   ├── static/                   # CSS, JS, assets
│   │   ├── css/                  # Tailwind CSS styling
│   │   └── js/                   # Interactive JavaScript
│   ├── templates/                # Jinja2 templates
│   │   └── index.html            # Main application interface
│   ├── app.py                    # Flask app entry point
│   └── requirements.txt          # Frontend dependencies
├── assets/                       # Documentation & UI assets
│   ├── txt.png                   # TXT file icon
│   ├── docx.png                  # DOCX file icon
│   └── substance.png             # Markdown file icon
├── install_dependencies.py       # Automated dependency installer
├── main.py                       # Alternative backend entry point
├── README.md                     # Project documentation
└── .gitignore                    # Git ignore rules

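
The tree lists `utils/timezone_helper.py` for IST handling, but its contents are not shown in this README. As a rough idea of what such a helper could look like, here is a minimal sketch using a fixed UTC+05:30 offset, which works on Python 3.8 without extra packages. The names `to_ist` and `format_ist` are hypothetical, and the output format is an assumption modeled on the date and time style in the sample MOM.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time is a fixed offset of UTC+05:30 (no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def to_ist(moment: datetime) -> datetime:
    """Convert a datetime to IST; naive values are assumed to be UTC."""
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(IST)


def format_ist(moment: datetime) -> str:
    """Format as DD-Mon-YYYY HH:MM IST (assumed display format)."""
    return to_ist(moment).strftime("%d-%b-%Y %H:%M IST")


print(format_ist(datetime.now(timezone.utc)))
```

A fixed offset is enough here because IST has no daylight saving transitions; a time zone database would only be needed for zones that do.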
- FastAPI - Modern, fast web framework
- Google Generative AI - Gemini 2.5 Flash integration
- Pydantic - Data validation and serialization
- Uvicorn - ASGI server
- python-docx - Word document processing
- PyPDF2 - PDF text extraction
- pdfplumber - Advanced PDF processing
- Flask - Lightweight web framework
- Jinja2 - Template engine
- Tailwind CSS - Utility-first CSS framework
- Marked.js - Markdown parser and renderer
- Vanilla JavaScript - Interactive file handling and UI
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Root endpoint |
| GET | /api/health | Health check with IST timestamp |
| POST | /api/process-text | Process text input for MOM generation |
| POST | /api/process-images | Process files (images, PDFs, DOCX, TXT) |
| POST | /api/download-mom/txt | Download MOM as plain text |
| POST | /api/download-mom/docx | Download MOM as Word document |
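
For callers who prefer Python over curl, the two processing endpoints can be reached with the standard library alone. The sketch below only builds the requests, using the same JSON payload shapes as the curl examples that follow (`{"text": ...}` and `{"images": [...]}` with base64 data URLs). The helper names are hypothetical, and `BASE_URL` assumes the backend runs locally on port 8000 as in the setup steps.

```python
import base64
import json
import urllib.request
from typing import List

BASE_URL = "http://localhost:8000"  # assumed local backend address


def to_data_url(data: bytes, mime_type: str) -> str:
    """Encode raw file bytes as a data URL such as data:image/jpeg;base64,..."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_json_post(path: str, payload: dict) -> urllib.request.Request:
    """Create a POST request carrying a JSON body."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_text_request(text: str) -> urllib.request.Request:
    return build_json_post("/api/process-text", {"text": text})


def process_files_request(data_urls: List[str]) -> urllib.request.Request:
    return build_json_post("/api/process-images", {"images": data_urls})


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs the backend running)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(process_text_request("Meeting notes here..."))` should return a dictionary; judging by the sample response, the generated Markdown would sit under `["data"]["content"]`.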

curl -X POST "http://localhost:8000/api/process-text" \
  -H "Content-Type: application/json" \
  -d '{"text": "Meeting notes here..."}'

curl -X POST "http://localhost:8000/api/process-images" \
  -H "Content-Type: application/json" \
  -d '{"images": ["data:image/jpeg;base64,/9j/4AAQ...", "data:application/pdf;base64,JVBERi0x..."]}'

{
  "success": true,
  "data": {
    "content": "# Minutes of Meeting - Project Kickoff\n\n**Date:** 30-Sep-2025 **Time:** 16:20 IST **Mode:** Hybrid\n\n## Agenda\n1. Project overview\n2. Timeline discussion\n3. Resource allocation\n\n## Key Discussion Points\n- Budget approved for Q4\n- Team expansion planned\n- New technology stack evaluation\n\n## Decisions Made\n- Go-live date: December 15, 2025\n- Weekly sprint reviews\n- Remote work policy updated\n\n## Action Items\n| Task | Assignee | Due Date |\n|------|----------|----------|\n| Setup development environment | John Doe | Oct 5, 2025 |\n| Finalize requirements | Jane Smith | Oct 10, 2025 |\n\n## Next Meeting\n**Date:** October 7, 2025 **Time:** 2:00 PM IST",
    "format": "markdown"
  }
}

One-Click Deployment:
- Click the "Deploy with Vercel" button above
- Connect your GitHub account
- Add environment variable: GEMINI_API_KEY (your Google Gemini API key)
- Click "Deploy"
- Your app will be live in 2-3 minutes!
Manual Deployment: For detailed step-by-step instructions, see VERCEL_DEPLOYMENT.md
Required for Vercel:
GEMINI_API_KEY=your_gemini_api_key_here
ENVIRONMENT=production
SECRET_KEY=your-secret-key-here

Add these in Vercel Dashboard → Project Settings → Environment Variables
- Click on "Text Input" tab
- Paste your meeting notes, agendas, or discussions
- Click "Generate MOM"
- Download in your preferred format (MD, TXT, DOCX)
- Click on "File Upload" tab
- Upload files (up to 10):
  - Images: Meeting photos, whiteboard captures, handwritten notes
  - PDFs: Meeting agendas, presentation slides
  - DOCX: Word documents with meeting content
  - TXT: Plain text files with notes
  - Mix different file types as needed
- Click "Generate MOM from Files"
- Download with smart filename based on meeting title
- Markdown (.md): Full formatting preserved
- Text (.txt): Clean, readable plain text
- Word (.docx): Professional document with styling
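
Downloads are named after the meeting title, but the README does not say how the name is derived. One plausible approach, sketched below purely as an assumption, is to take the first Markdown heading of the generated MOM and turn it into a filesystem-safe slug. `filename_from_mom` and its fallback name are hypothetical and not taken from the project code.

```python
import re


def filename_from_mom(markdown: str, extension: str = "md",
                      fallback: str = "minutes-of-meeting") -> str:
    """Build a safe download filename from the first Markdown heading."""
    title = ""
    for line in markdown.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#"):
            # Remove the leading # markers and surrounding whitespace.
            title = stripped.lstrip("#").strip()
            break
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug or fallback}.{extension}"


print(filename_from_mom("# Minutes of Meeting - Project Kickoff\n\n## Agenda", "docx"))
# -> minutes-of-meeting-project-kickoff.docx
```

Falling back to a fixed name keeps the download working even when the model output has no heading.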
We welcome contributions! Here's how you can help:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request

# Install all dependencies
python install_dependencies.py
# Run backend tests (subshell, so the working directory stays at the repo root)
(cd backend && python -m pytest)
# Run frontend in development mode
(cd frontend && python app.py)

- PDF processing works best with text-based PDFs
- Handwritten text recognition depends on image quality
- Maximum file size: 10 MB per file
- DOCX processing supports basic formatting
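
The documented limits (up to 10 files, 10 MB per file, and the supported file types) can be checked before anything is uploaded. The following is a minimal sketch of such a pre-check, not the project's actual validation code. `validate_uploads` is a hypothetical name, "10 MB" is interpreted as 10 × 1024 × 1024 bytes, and `.jpeg` is accepted alongside `.jpg` as an assumption.

```python
import os
from typing import List, Tuple

MAX_FILES = 10
MAX_FILE_BYTES = 10 * 1024 * 1024  # assumes "10 MB" means 10 MiB
# Types listed in this README; ".jpeg" is an added assumption.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".pdf", ".docx", ".txt"}


def validate_uploads(files: List[Tuple[str, int]]) -> List[str]:
    """Check (filename, size_in_bytes) pairs and return a list of problems."""
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"too many files: {len(files)} > {MAX_FILES}")
    for name, size in files:
        ext = os.path.splitext(name)[1].lower()
        if ext not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported type {ext or '(none)'}")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: exceeds 10 MB")
    return problems


print(validate_uploads([("whiteboard.png", 2_000_000), ("agenda.pdf", 12_000_000)]))
```

An empty list means the batch is within the documented limits; the backend may still reject files for other reasons.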
This project is licensed under the MIT License - see the LICENSE file for details.
- Google Gemini 2.5 Flash for providing advanced AI capabilities
- FastAPI for the excellent web framework
- Flask for the lightweight frontend framework
- Tailwind CSS for the beautiful UI components
- Open Source Community for the amazing tools and libraries
- Added support for PDF, DOCX, and TXT files
- Multiple download formats (MD, TXT, DOCX)
- Smart file previews with type-specific icons
- Enhanced UI with dropdown download menu
- Automated dependency installation script
- Text input processing
- Image upload with OCR
- Basic markdown download
- Responsive web interface
- Real-time Collaboration: Multi-user editing
- Meeting Templates: Pre-defined MOM formats
- Calendar Integration: Automatic meeting scheduling
- Audio Processing: Voice-to-MOM conversion
- Export Options: PowerPoint, Excel formats
- Meeting Analytics: Insights and reporting
- Mobile App: Native iOS/Android applications
Star this repository if you find it helpful!
Built with cutting-edge AI technology for modern teams
Made with ❤️ by Krishn Jatav
