DhruvGarg111/README.md

Dhruv Garg

AI / ML Engineer • Computer Vision • Generative AI

Building practical AI systems, one focused iteration at a time.

Roles

Computer Vision • Deep Learning • Backend Systems

Building intelligent systems that see, understand, and create.


🔬 Engineering Profile

I am a Machine Learning Engineer focused on Computer Vision and Agentic AI, with a strong foundation in scalable backend systems. My engineering philosophy revolves around translating complex research papers into optimized, production-ready code.

  • 🎯 Focus: Bypassing computational bottlenecks in high-resolution (4K) object detection using Explainable AI (XAI).
  • 🤖 AI Engineering: Building local LLM agents that interact with third-party ecosystems (Google APIs, etc.).
  • ⚙️ Infrastructure: Architecting robust database migrations and building backend profilers.
  • 💡 Goal: I build systems that are not just intelligent, but fast, scalable, and resilient.

🚀 Featured Projects

🌟 Flagship Projects

⚡ PixelQueue

Vision Intelligence Infrastructure: A high-performance, async control panel for human-in-the-loop AI annotation.

A sleek, dark-themed control panel designed for decoupled ML microservices and robust task queues, eliminating UX bottlenecks with pure speed and instantaneous rendering.

Key Innovations:

  • 🚀 Asynchronous ML: Non-blocking AI auto-labeling via PyTorch, LayerCAM & YOLO.
  • ⚡ Zero-Latency UI: Hardware-accelerated React-Konva staging canvas.
  • 🔄 Decoupled Workers: Infinite horizontal scaling using Celery message brokers.
  • 🔒 Isolated Workspaces: Robust Role-Based Access Control (RBAC) circuits.

The Searchlight Protocol

"Finding the needle in the haystack, from 400ft above."

A novel coarse-to-fine computer vision pipeline designed for efficient small object detection in high-resolution (2K/4K) aerial imagery. Tackles the critical trade-off between resolution and latency in drone forensics.

Key Innovations:

  • Uses LayerCAM to identify semantic "hotspots" before processing.
  • Intelligently slices and zooms into regions of interest, skipping 80%+ of empty background.
  • Outperforms blind sliding-window approaches (SAHI) in both speed and accuracy.
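
The coarse-to-fine step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's implementation: the coarse heatmap is assumed to come from a LayerCAM pass over a downscaled frame, with one cell per candidate tile, and `hotspot_tiles`, `skipped_fraction`, and the 0.5 threshold are hypothetical choices for the example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in full-resolution pixels


def hotspot_tiles(heatmap: List[List[float]], image_w: int, image_h: int,
                  threshold: float = 0.5) -> List[Box]:
    """Return full-resolution crop boxes for heatmap cells at or above the threshold."""
    rows, cols = len(heatmap), len(heatmap[0])
    tile_w, tile_h = image_w // cols, image_h // rows
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if heatmap[r][c] >= threshold:
                x0, y0 = c * tile_w, r * tile_h
                boxes.append((x0, y0,
                              min(x0 + tile_w, image_w),
                              min(y0 + tile_h, image_h)))
    return boxes


def skipped_fraction(heatmap: List[List[float]], threshold: float = 0.5) -> float:
    """Fraction of tiles that never reach the fine detector."""
    total = sum(len(row) for row in heatmap)
    hot = sum(1 for row in heatmap for value in row if value >= threshold)
    return 1.0 - hot / total


if __name__ == "__main__":
    # Made-up 4x4 activation grid for a 3840x2160 frame.
    demo = [
        [0.0, 0.1, 0.0, 0.0],
        [0.0, 0.2, 0.9, 0.1],
        [0.1, 0.0, 0.7, 0.0],
        [0.0, 0.0, 0.1, 0.0],
    ]
    print(hotspot_tiles(demo, 3840, 2160))
    print(skipped_fraction(demo))
```

Only the returned boxes would be cropped at full resolution and passed to the detector; a sliding-window baseline would process all sixteen tiles.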

🎨 Neural Canvas

Transform any image into a masterpiece, in real time.

A fast neural style transfer implementation that generates stylized images using a feed-forward CNN trained with perceptual loss. Performs instant stylization in a single forward pass.

Key Features:

  • 🚀 Real-time inference with a custom residual architecture.
  • 🧠 Perceptual content & style loss using a pretrained VGG-16 network.
  • Instance Normalization integrated for high-quality, artifact-free outputs.
  • 📦 ONNX export supported, ready for edge deployment.
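
The style half of a perceptual loss is commonly built on Gram matrices of feature maps. The sketch below shows that general formulation in plain Python and is not taken from the Neural Canvas code: nested lists stand in for VGG-16 activations (one row per channel, flattened over space), the normalization by C·N is one common convention, and `gram_matrix` and `style_loss` are hypothetical names.

```python
from typing import List

Matrix = List[List[float]]


def gram_matrix(features: Matrix) -> Matrix:
    """features has C rows (channels), each holding N = H*W activations.

    Returns the C x C matrix of channel correlations, scaled by 1 / (C * N).
    """
    c, n = len(features), len(features[0])
    scale = 1.0 / (c * n)
    return [
        [scale * sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
        for row_i in features
    ]


def style_loss(generated: Matrix, style: Matrix) -> float:
    """Squared Frobenius distance between the two Gram matrices."""
    g, s = gram_matrix(generated), gram_matrix(style)
    return sum(
        (gv - sv) ** 2
        for g_row, s_row in zip(g, s)
        for gv, sv in zip(g_row, s_row)
    )


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0]]   # toy 2-channel, 2-pixel feature map
    print(gram_matrix(feats))
    print(style_loss(feats, feats))    # identical features give zero loss
```

In a training loop the same computation would run on tensors from several VGG-16 layers and be summed with a content term; the toy lists here only make the arithmetic easy to follow.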

📦 More Projects

🧭 pygog (Google CLI Agent)
A powerful CLI for Google services (Gmail, Drive, Calendar). Features a built-in natural language AI agent supporting Gemini, DeepSeek, & OpenAI.
<Python> <Google APIs> <LLM Agents>

πŸ“ Depth Estimation + Semantic Seg.
Multi-modal depth completion using RGB + sparse depth + semantic maps. Features a DepthNet-style encoder-decoder trained on NYU Depth v2 with multi-scale supervision.
<PyTorch> <NYU-Depth-v2> <Encoder-Decoder>
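
Multi-scale supervision with sparse ground truth can be illustrated as a weighted sum of masked losses, one per output scale. This is a generic sketch and an assumption about the setup, not the project's training code: it uses an L1 loss, treats a target value of 0 as "no depth measured", and the names `masked_l1`, `multiscale_loss`, and the scale weights are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def masked_l1(pred: Grid, target: Grid) -> float:
    """Mean absolute error over pixels that have ground-truth depth."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t > 0:  # 0 marks a pixel with no ground-truth depth
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0


def multiscale_loss(preds: Sequence[Grid], targets: Sequence[Grid],
                    weights: Sequence[float]) -> float:
    """Weighted sum of masked L1 losses, one term per decoder scale."""
    return sum(w * masked_l1(p, t) for p, t, w in zip(preds, targets, weights))


if __name__ == "__main__":
    pred_full = [[1.0, 2.0], [3.0, 4.0]]
    target_full = [[1.5, 0.0], [3.0, 5.0]]   # one pixel has no depth
    pred_half = [[2.0]]
    target_half = [[3.0]]
    print(multiscale_loss([pred_full, pred_half],
                          [target_full, target_half],
                          [1.0, 0.5]))
```

Masking keeps missing depth readings from pulling predictions toward zero, and the coarser scales give the decoder a training signal before the full-resolution output is accurate.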


🛠️ Stack Matrix

Vision • Modeling • Serving • Interface

🌐 Open Source Contributions

I actively contribute to the broader developer ecosystem, with recent merged work spanning agent frameworks, AI infrastructure, developer tooling, and performance-focused ML apps. I am also an active collaborator in the SynapseKit organization:

  • SynapseKit/SynapseKit: Shipped 109 PRs covering native observability, VoiceAgent audio pipelines, graph-builder tooling, benchmark suites, CronTrigger scheduling, self-healing cost-aware agents, persistent agent memory, multimodal RAG ingestion, knowledge graph retrievers, Discord automation, cloud/data loaders, and local/self-hosted model integrations across 15+ LLM providers.
  • lancedb/lancedb: Updated LanceDB's Python Gemini embedding provider to the newer google-genai SDK and opened a fix for async event loop blocking in AsyncTable.add embeddings.
  • Nikolaev3Artem/fastapi-silk: Merged 5 PRs adding per-endpoint database trigger counters, multi-version compatibility matrix, SQLite+Alembic setup, SQL profiler tests, and comprehensive README documentation.
  • Bessouat40/RAGLight: Submitted 3 PRs adding MCP server configuration CLI support (in review) and Docling-based high-fidelity PDF ingestion (open).
  • pydantic/pydantic-ai: Merged an upgrade to the Anthropic code execution tool.

🔗 Connect & Explore

Website • Searchlight Live App • Email Me


Built by DhruvGarg111

Pinned

  1. Neural-Style-Transfer (Public, Python)

  2. The-Searchlight-Protocol (Public, Python)