VISHAL SINGH vishalsingha

Vishal Singh

Senior Data Scientist · LLM Researcher · IIT BHU

Building reasoning-centric AI for education at scale

About Me

I'm a Senior Data Scientist currently at Physics Wallah, where I build and train large language models for competitive exam preparation (IIT-JEE & NEET). My work sits at the intersection of LLM research, Reinforcement Learning, and scalable ML systems — with a focus on making high-quality AI affordable for education in India.

Previously at Infoedge (Naukri), where I built LLM-powered products at production scale. B.Tech from IIT (BHU), 2023.

Featured Work

🏆 Aryabhata 1.0 and Aryabhata 2.0 — Reasoning LLM for JEE & NEET (Physics Wallah)

A reasoning-centric LLM achieving 87.8% on JEE Mains '25, 93% on JEE Mains '26, and 86.51% on JEE Advanced — matching or outperforming GPT-4o-mini and Gemini 2.5 Flash at 10× lower cost.

Trained using GRPO (Reinforcement Learning) with curriculum learning, rejection sampling, and checkpoint merging
Developed a verified reward framework using math-verify, LLM-based rewards, and length penalties
📄 Aryabhata 2: Scaling Reinforcement Learning for Advanced STEM Reasoning
🏅 Second Runner-Up at NASSCOM Open-Source GenAI Grand Challenge 2025
Presented at AI4India Summit and AI Impact Summit
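
As a rough illustration of how a verified reward with a length penalty can feed GRPO, here is a minimal sketch. It is not the Aryabhata training code: the function names, the linear penalty shape, and the `max_tokens` and `length_weight` values are all assumptions made for illustration. The correctness flag is assumed to come from an external verifier such as math-verify, and the advantage step uses the standard GRPO idea of normalising each reward against the other samples drawn for the same prompt.

```python
from statistics import mean, pstdev


def combined_reward(is_correct: bool, num_tokens: int,
                    max_tokens: int = 4096, length_weight: float = 0.2) -> float:
    """Correctness reward minus a linear length penalty (hypothetical shaping)."""
    correctness = 1.0 if is_correct else 0.0
    # Fraction of the token budget used, capped at 1.0 (assumed penalty form).
    used = min(num_tokens / max_tokens, 1.0)
    return correctness - length_weight * used


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalise rewards within one prompt's group of sampled responses."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt: (verifier says correct?, response length)
samples = [(True, 800), (True, 3500), (False, 1200), (False, 4096)]
rewards = [combined_reward(ok, n) for ok, n in samples]
advantages = group_relative_advantages(rewards)
```

In this toy group, the short correct answer ends up with the largest advantage and the long incorrect one with the smallest, which is the behaviour a length penalty is meant to encourage.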

🎓 Concept LLM — Compact EdTech Model (Physics Wallah)

A 4B-parameter model delivering near-SOTA performance on conceptual questions for JEE, NEET & Foundation prep.

Trained on NCERT-aligned and teacher-transcript data
Supports explanations in 10 Indian languages with mnemonic-based reasoning
Achieves high accuracy at 15× lower inference cost than SOTA models

💼 LLM Products @ Infoedge (Naukri)

Built fine-tuning & RL workflows for JD generation, CV summarization, job entity extraction and AskNaukri.
Deployed high-speed inference APIs with vLLM (paged attention, KV-caching, continuous batching) for production use cases.
Fine-tuned Llama 3 to build AskNaukri, an agentic chatbot for jobseekers with RAG and tool calling (Job Search, Salary Insights, Company Perks).
Built a personalized job requirement suggestion system for the Naukri AI suite, using ANN-based retrieval and a ranking system backed by Milvus DB.

Technical Skills

Domain	Tools & Frameworks
LLMs & GenAI	Transformers, LoRA, RLHF, GRPO, DPO, RLVR, RAG, Agentic AI, vLLM
MLOps & Backend	LangChain, FastAPI, Docker, PySpark, AWS, Azure, Langfuse, Nginx
Vector DBs	Milvus, FAISS, ChromaDB
ML & DL	PyTorch, Transformers, TRL, VERL, Scikit-learn, OpenCV, NumPy, Pandas
Data & Systems	MySQL, Postgres, Spark, Locust, Supervisor

Publications & Recognition

📄 Aryabhata 2: Scaling Reinforcement Learning for Advanced STEM Reasoning
📄 Aryabhata 1.0: Contributed to the evaluation of the JEE math paper; presented at a NeurIPS Workshop
🥈 2nd Runner-Up — NASSCOM Open-Source GenAI Grand Challenge, 2025
🥉 3rd Place — Recognizance ML Challenge Series, National Level, 2021
