SahilChachra/README.md

👋  Hey there! I'm Sahil Chachra

👨🏻‍💻  About Me

🔭  AI Architect @ BLUE — building a real-time, multi-camera VLM pipeline for warehouse & retail vision analytics.
🧠  I work across LLMs, VLMs, multi-agent systems, and on-device inference — taking models from research to production at the edge.
🤗  Publishing open-source MLX-quantized LLMs on HuggingFace, making frontier models runnable locally on Apple Silicon.

Previously:
💻  Founding AI Engineer @ Stealth — a 27-agent orchestration framework + multi-stage document enrichment pipeline for AI-powered factory floor planning.
💻  Senior AI Engineer @ Avathon — scaled a CV platform to 600+ cameras across enterprise deployments, and added LLM + VLM layers to a pure-CV stack.
💻  Deep Learning Engineer @ TCS — shipped CV models for Smart Mobility & automotive data pipelines.
🛰️  Started out interning at ISRO's Regional Remote Sensing Centre.
🎓  B.Tech in Computer Science & Engineering, 2021.

💬  Reach out for projects, collabs, or just an interesting discussion.

Languages and Tools

Python | Docker | PyTorch | CUDA | DeepStream | Generative AI | Transformers | RAG | AI Safety | AGI Curious

🚀  Open Source Contributions

🤗  HuggingFace Models

23 MLX-quantized & uncensored LLMs — making frontier models runnable locally on Apple Silicon.

📕  Writing

A few favorites — full archive on Medium.

LLMs & Fine-Tuning

Edge AI & Deployment

Paper Summaries

Read all my articles on Medium

⚙️  GitHub Analytics

🖥️ Dev Environment

macOS | VS Code | MacBook Air | MacBook Pro | Server

🤝🏻  Connect with Me

SahilChachra | LinkedIn SahilChachra | Medium ChachraSahil | Twitter sahilchachra | Hugging Face

Pinned

  1. LLM-Safety-Middleware Public

    A production-grade proxy that puts safety checks between your users and any LLM backend.

    Python 1

  2. Refusal-Finetuning Public

    Explains how to fine-tune an LLM to shift its decision boundary so that it correctly refuses to answer when the context lacks the required data, which helps prevent hallucinations.

    Python

  3. LLM-Merging Public

    What happens if we merge LLMs fine-tuned on different datasets? Let's find out!

    Python

  4. gpu-pilot Public

    GPU selection, vLLM config generator, and inference cost advisor — find the right hardware and optimal settings for your LLM deployment.

    JavaScript

  5. ADAS_GTAV Public

    Incorporating ADAS (Advanced Driver Assistance Systems) into GTA V

    Python 3