Theoretical ML/DL · Interpretability · Information Theory

Yug D Oswal

I'm a final-year CS undergrad at VIT Vellore and an incoming MSc student at the University of Oxford (Advanced Computer Science, Oct 2026). My research sits at the intersection of theoretical ML/DL, mechanistic interpretability, and information theory. Broadly, I'm interested in understanding the mechanisms underlying learning and in developing more capable and reliable intelligence.

Currently, I'm working with the Torr Vision Group and OATML at Oxford, headed by Prof. Philip Torr and Prof. Yarin Gal respectively. I'm also working with Prof. Ravid Shwartz-Ziv at NYU. I'm always happy to connect, so please feel free to reach out!

00 Education

University of Oxford (Incoming)
MSc in Advanced Computer Science
Oxford, England
Oct 2026 – Oct 2027
VIT Vellore
BTech in Computer Science and Engineering
CGPA: 9.53 / 10 · Vellore, India
Sep 2022 – Jul 2026

01 News & Updates

02 Papers & Preprints

Beyond the Loss Curve
Yug D Oswal, PI: Ravid Shwartz-Ziv
✦ ICLR DeLTa Workshop 2026
Constructed the first large-scale tractable-distribution dataset for ImageNet-64 enabling exact uncertainty decomposition. Discovered epistemic scaling power-laws and a constant aleatoric floor. Demonstrated that epistemic-based active learning outperforms entropy-based sampling, requiring 47.8% fewer samples for equivalent performance.
Loss Switching
Yug D Oswal, PI: Mathew Mithra Noel
Under Review at Applied Soft Computing, 2025
Gradient-based loss scheduling method (loss switching) paired with statistically optimized classification and regression losses. Achieves ≥ 3% top-1 ImageNet gain and ≥ 1.4% RMSE improvement across four regression benchmarks under asymmetric outlier shifts.
Cone-Class
Yug D Oswal, PI: Mathew Mithra Noel
arXiv, 2024
Cone activations compute hyperstrip representations, making them effective classification heads. ≥ 4.6% accuracy gain on ImageNet with 46.4% parameter reduction in VGG19. ≤ 6× neuron compression yields only ≈ 2% drop for cone vs. ≈ 8% for ReLU.
QNN
Yug D Oswal, PI: Mathew Mithra Noel
arXiv, 2023–2025
Vectorized forward/backward matrix algorithms resolving the core computational bottleneck in QNNs. O(n²) reduced-parameter RP-QNN variants with systematic expressiveness–efficiency ablations.

03 Research Experience

SPAR
Research Fellow · Sep 2025 – Jan 2026 · with Shivam Raval, Harvard University · Remote
Mechanistic Interpretability · AI Safety · Reward Hacking
Led funded research on causal induction of reward hacking in LLMs via activation steering.
Selected among top 10 of 90+ teams to present to AI safety orgs including Constellation, UK AISI, and BlueDot Impact. EMNLP '26 submission in preparation.
NYU
Research Intern · Sep 2025 – Present · with Prof. Ravid Shwartz-Ziv · Remote
Uncertainty Decomposition · Normalizing Flows · Deep Learning · Scaling Laws · Active Learning
Led experimental pipeline for the first large-scale tractable-distribution dataset at ImageNet-scale with exact posterior, marginal, and KL oracle queries.
Discovered epistemic scaling laws, theoretical limits of learning, and improved active learning policies. Results led to a paper accepted at the ICLR DeLTa Workshop 2026.
William & Mary
Research Intern · Apr 2025 – Dec 2025 · with Prof. Jindong Wang & Hao Chen · Remote
CoT Alignment · Statistical Interventions · Causal Tracing · Cognitive Theory
Formulated conditioning of LLM reasoning on outputs via a KL-divergence-based self-alignment framework for chain-of-thought. Achieved 25–35% task-accuracy gain and reduced bias over base DeepSeek Qwen 1.5B on the Bias Benchmark for QA. Integrated causal activation patching to trace how biased reasoning propagates into model outputs.

04 Professional Experience

BDL
AI/ML Engineer Intern · Aug 2024 – Oct 2024 · Hyderabad
Curated 85,000-sample IR-optical hybrid UAV dataset. Built first prototype multimodal thermal-optical anti-UAV surveillance system (YOLOv8-based), successfully tested in 4 field scenarios. Engineered novel containerized client-server deployment on air-gapped defence systems.
Synergetics
ML Engineering Intern · Feb 2024 – Jun 2024 · Bangalore
Led the full ML lifecycle for a humanoid, speech-capable therapeutic agent. Engineered and deployed agentic workflows, RAG pipelines, guardrails, and context-aware chat with real-time client integration, reducing latency by 53%.
UoA
Project Lead (Applied ML) · Jun 2023 – Aug 2023 · New Zealand
Led an international applied ML team to resolve 5 real-world issues in Signal's threat intelligence system. Built scalable pipelines for NER, geocoding, and incremental clustering over live threat data streams for high-profile executive clients.

05 Community & Leadership

CSI VIT
Board Member & Head of Research & Development · 2023 – 2025
Directed technical strategy and operations for a 100-member chapter. Mentored juniors in ML and research. Organised large-scale events: Rural ML Outreach (40+ rural students), Riddler (1000+ participants, 25+ countries), LaserTag (1000+), and Init ML workshops (200+). Intel Developer Spotlight feature for Rekindle.

06 Selected Projects

Rekindle
Founder & Lead Developer · 2nd place, Intel BOLT Hackathon · Intel Developer Spotlight
End-to-end AI memory-support service for dementia patients: custom emotion-extraction model (outperforming baselines on Google GoEmotions), emotion-conditioned memory retrieval, local LLMs for privacy-preserving inference, and full-stack life-logging platform.

07 Writing & Blog