osamaaltaf-pk/README.md

LinkedIn Email HuggingFace WhatsApp


🚀 Professional Profile

I am a specialized AI Systems & Full-Stack Engineer focused on building production-grade LLM architectures, real-time Voice AI streaming systems, and high-throughput inference pipelines. I design software that bridges the gap between state-of-the-art AI research and practical, scalable engineering.

  • 30+ Active Codebases covering LLM optimization, offline voice interfaces, and high-performance WebRTC streaming.
  • Deep Core Systems Integration: Orchestrating complex pipelines with Redis, Kafka, and hardware-aware mutual exclusion algorithms.
  • Production AI Deployments: Veteran developer of enterprise SaaS backends, browser automation agents, and localized AI portals.

🎯 Target Role Alignment & Capabilities

To provide immediate clarity for technical stakeholders and recruitment reviewers, my work directly maps onto the following target positions:

| Target Pillar | Core Alignment & Competencies | Key Evidence (In my Repos) |
| --- | --- | --- |
| Remote AI Engineer | Real-time audio processing, WebRTC audio streaming, Speech-to-Speech orchestration, and low-latency voice assistants. | Aura-TTS, OrpheusAssistant |
| LLM System Engineer | Edge model inference, localized LLM routing engines, Silero VAD integration, Kafka pipeline messaging, and local API optimization. | QuickCall, OrpheusAssistant |
| AI Research / Alignment | Fine-tuning using Unsloth LoRA/QLoRA, RLHF/GRPO logic training without critic models (DeepSeek-R1 style), and model alignment. | LLMs-Unsloth, smol-course |
| AI Full Stack Engineer | Enterprise web dashboards, modular NestJS/FastAPI backends, secure multi-tier authentication, PostgreSQL/Supabase DBs, and global semantic caching. | ASK ILM, OmniSupport AI, Gold-Arbitrage |

🛠️ Technical Ecosystem

AI & LLM Systems

PyTorch HuggingFace FastAPI Transformers Whisper Gemini API Anthropic SDK

Real-Time & Backend Infrastructure

Node.js NestJS Apache Kafka Redis Docker PostgreSQL Supabase

Frontend Development

React 19 TypeScript Vite Tailwind CSS Framer Motion


📂 Featured Deep-Dives

Here is a curated overview of my primary repositories representing deep engineering focus and production capabilities:

A specialized research-to-production pipeline repository detailing high-throughput fine-tuning, reasoning models, and edge optimizations.

  • Engineering Highlights:
    • Fine-tuning pipelines utilizing Unsloth kernels for parameter-efficient optimizations (LoRA / QLoRA).
    • Reinforcement-learning fine-tuning using Group Relative Policy Optimization (GRPO), which needs no separate critic model (introduced in DeepSeekMath and popularized by DeepSeek-R1), targeting Qwen3 8B.
    • Reusable recipes for Mixture-of-Experts (MoE) kernels and ModernBERT dense classification systems.
  • Key Stack: Unsloth, PyTorch, LoRA, QLoRA, HuggingFace Transformers, TRL, JAX.
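
To make the critic-free idea behind GRPO concrete, here is a minimal sketch of the group-relative advantage step only. It is not the training code from the repository: the function name, the `eps` value, and the sample rewards are illustrative assumptions, and a full pipeline would normally use a library trainer instead.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    `rewards` holds scalar rewards for several completions of the SAME
    prompt. The group mean acts as the baseline that a learned critic
    would otherwise supply.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std here; implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average get a positive advantage and the rest get a negative one, so no value network is needed to estimate a baseline.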

A suite of voice engineering portals dedicated to running low-latency, state-of-the-art offline speech synthesis and real-time assistants.

  • Engineering Highlights:
    • Aura-TTS: Created a unified speech workstation running locally under RAM/VRAM resource boundaries using custom mutual exclusion locking to manage engine lifecycles (Kokoro-TTS, Pocket-TTS, Supertonic).
    • OrpheusAssistant: Advanced offline voice agent linking real-time bidirectional WebRTC streams (via aiortc), Whisper Large STT, and Llama 3.2 3B. Integrates directly with n8n to execute workflow tool actions.
  • Key Stack: Python, FastAPI, WebRTC, WebSockets, Pipecat, PyTorch, n8n, Docker.
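
The mutual-exclusion approach to engine lifecycles mentioned for Aura-TTS could look roughly like the sketch below. It is an assumption about the general pattern, not the project's actual code: `EngineManager`, `make_fake_engine`, and the engine names used as dictionary keys are hypothetical, and the "engines" are plain functions standing in for real TTS models.

```python
import threading


class EngineManager:
    """Keeps at most one engine resident at a time (hypothetical helper)."""

    def __init__(self, loaders):
        self._loaders = loaders        # engine name -> zero-argument loader
        self._lock = threading.Lock()  # serializes load, unload, and use
        self._name = None
        self._engine = None

    def synthesize(self, name, text):
        with self._lock:
            if self._name != name:
                loader = self._loaders[name]  # fail before unloading anything
                # Drop the previous engine first so its memory can be
                # reclaimed before the next one is loaded.
                self._engine = None
                self._name = None
                self._engine = loader()
                self._name = name
            return self._engine(text)


def make_fake_engine(label):
    """Stand-in loader; a real one would load model weights."""
    def load():
        return lambda text: f"[{label}] {text}"
    return load


manager = EngineManager({
    "kokoro": make_fake_engine("kokoro"),
    "pocket": make_fake_engine("pocket"),
})
out_a = manager.synthesize("kokoro", "hello")
out_b = manager.synthesize("pocket", "hello")
```

Holding a single lock across the swap and the call trades some concurrency for a hard guarantee that two engines never occupy RAM/VRAM at once.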

An event-driven speech-to-speech architecture demonstrating production backend streaming logic.

  • Engineering Highlights:
    • Parallel Producer-Consumer architecture isolating audio recording from backend text transcription.
    • Integrates Silero Voice Activity Detection (VAD) to segment speech in real time, feeding files to a background transcription worker.
    • Transcriptions are published directly to Apache Kafka event streams, triggering downstream TTS and post-processing APIs.
  • Key Stack: Python, PyTorch, Silero VAD, Faster Whisper, Apache Kafka, Pydantic.
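
The producer-consumer split described above can be sketched with the standard library alone. Everything here is a simplified assumption: `is_speech` is a crude amplitude threshold standing in for Silero VAD, the `published` list stands in for a Kafka producer, the topic name is made up, and no real transcription happens.

```python
import queue
import threading

SENTINEL = None  # signals the consumer that no more segments will arrive


def is_speech(frame, threshold=0.1):
    # Stand-in for a real VAD model: mean absolute amplitude of the frame.
    return sum(abs(s) for s in frame) / len(frame) > threshold


def producer(frames, segments):
    """Group consecutive speech frames into segments and enqueue them."""
    current = []
    for frame in frames:
        if is_speech(frame):
            current.extend(frame)
        elif current:
            segments.put(current)  # silence closes the current segment
            current = []
    if current:
        segments.put(current)
    segments.put(SENTINEL)


def consumer(segments, published):
    """Drain segments; a real worker would transcribe and publish to Kafka."""
    while True:
        segment = segments.get()
        if segment is SENTINEL:
            break
        published.append({"topic": "transcripts", "num_samples": len(segment)})


# Hypothetical audio: two samples per frame.
frames = [[0.0, 0.0], [0.5, -0.4], [0.3, 0.6], [0.0, 0.01], [0.9, -0.9]]
segments = queue.Queue()
published = []
worker = threading.Thread(target=consumer, args=(segments, published))
worker.start()
producer(frames, segments)
worker.join()
```

Because the queue decouples the two sides, recording never blocks on a slow transcription step.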

A massive offline-first, open-source educational OS designed specifically for classrooms in developing regions.

  • Engineering Highlights:
    • Multi-tier secure roles (Super Admin, Principal, Teacher, Student) with isolated school dashboards.
    • Tri-layer semantic cache and LLM router that serve cached school content where possible, reducing AI generation quota use and cost.
    • Integrated text-to-speech (Deepgram), document rendering (PDF, Excel tables), and local Firebase overrides.
  • Key Stack: React 19, TypeScript, Express, Supabase, Firebase, Anthropic SDK, Google GenAI SDK, Vite, Motion.
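
The three cache layers are not spelled out above, so the sketch below assumes one plausible arrangement: exact match, normalized-text match, then similarity match, with the LLM called only on a full miss. The class name, the threshold, and the Jaccard token overlap (a toy substitute for embedding similarity) are all illustrative assumptions, and the example is written in Python for brevity although the project itself uses TypeScript.

```python
import re


def normalize(text):
    """Lowercase and strip punctuation and extra whitespace."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-set overlap; a toy stand-in for embedding similarity."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


class LayeredCache:
    def __init__(self, generate, threshold=0.6):
        self._generate = generate    # hypothetical LLM call
        self._threshold = threshold
        self._exact = {}             # layer 1: raw question -> answer
        self._normalized = {}        # layers 2 and 3: normalized question -> answer

    def ask(self, question):
        if question in self._exact:
            return self._exact[question], "exact"
        key = normalize(question)
        if key in self._normalized:
            return self._normalized[key], "normalized"
        for cached_key, answer in self._normalized.items():
            if jaccard(key, cached_key) >= self._threshold:
                return answer, "similar"
        answer = self._generate(question)  # full miss: spend quota
        self._exact[question] = answer
        self._normalized[key] = answer
        return answer, "generated"


calls = []


def fake_llm(question):
    calls.append(question)
    return f"answer #{len(calls)}"


cache = LayeredCache(fake_llm)
first = cache.ask("What is photosynthesis?")
second = cache.ask("What is photosynthesis?")
third = cache.ask("what is  PHOTOSYNTHESIS")
fourth = cache.ask("What is photosynthesis exactly?")
```

In this run only the first question reaches the model; the other three are served from progressively looser layers.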

An enterprise-grade productivity suite automating prompt queues for high-volume video/image generation pipelines.

  • Engineering Highlights:
    • Extension Code: Chrome extension running background batch automation scripts with organic delay controls (stealth pacing) to circumvent platform limits.
    • Licensing System: Fully integrated with a serverless backend that binds license keys to secure client-side Hardware Device IDs using Supabase PostgreSQL.
  • Key Stack: JavaScript, HTML5, Express, Supabase PostgreSQL, Vercel Serverless Functions.
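
The license-to-device binding can be illustrated with a small in-memory model. This is only a sketch of the general idea, assuming first-activation binding: `LicenseStore`, the status strings, and the sample key are hypothetical, the dictionary stands in for the Supabase PostgreSQL table, and the example is in Python although the project's stack is JavaScript.

```python
import hashlib


class LicenseStore:
    """In-memory stand-in for a licenses table (hypothetical schema)."""

    def __init__(self, valid_keys):
        # license key -> device fingerprint, or None while unbound
        self._bindings = {key: None for key in valid_keys}

    @staticmethod
    def _fingerprint(device_id):
        # Store a hash rather than the raw device identifier.
        return hashlib.sha256(device_id.encode("utf-8")).hexdigest()

    def activate(self, key, device_id):
        if key not in self._bindings:
            return "invalid_key"
        fingerprint = self._fingerprint(device_id)
        bound = self._bindings[key]
        if bound is None:
            self._bindings[key] = fingerprint  # first use binds the key
            return "bound"
        return "ok" if bound == fingerprint else "device_mismatch"


store = LicenseStore(["KEY-123"])
r1 = store.activate("KEY-123", "device-A")
r2 = store.activate("KEY-123", "device-A")
r3 = store.activate("KEY-123", "device-B")
r4 = store.activate("NOPE", "device-A")
```

A production version would perform the check-and-bind step atomically in the database so that two devices cannot race to claim the same key.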

📊 Developer Metrics & Impact

  • 🛠️ Strong System Architect: Proficient in event sourcing (Kafka), in-memory caching (Redis), and model optimization (Unsloth, Triton).
  • 🌐 Clean API Designer: Expert in RESTful architecture, WebRTC bi-directional streams, and secure serverless gateways.
  • 📦 Docker-first Deployer: Containerizing complex AI stacks with multi-stage builds and clean Docker Compose networking configs.

💼 Open to Remote Opportunities

I am actively exploring Remote AI Engineer, Junior LLM System Engineer, AI Research, and AI Full-Stack opportunities globally. Let's build the future of localized, low-latency, and event-driven AI systems.

📧 Get In Touch · 💬 Chat on WhatsApp

Pinned

  1. LLMs-Unsloth (Public)

    LLM fine-tuning pipeline using Unsloth, LoRA and QLoRA for efficient domain-specific training

    Jupyter Notebook

  2. OrpheusAssistant (Public)

    AI assistant built with LLM orchestration, tool use, and conversational memory

    Python 1

  3. Pocket_TTS (Public)

    Lightweight Python TTS pipeline with multi-engine voice support and real-time audio processing

    Python

  4. davidbrowne17/csm-streaming (Public)

    Forked from SesameAILabs/csm

    Realtime demo, Streaming and Finetuning code for CSM

    Python 455 74