Theoretical ML/DL · Interpretability · Information Theory

Yug D Oswal

I'm a final-year CS undergrad at VIT Vellore and an incoming MSc student at the University of Oxford (Advanced Computer Science, Oct 2026). My research sits at the intersection of theoretical ML/DL, mechanistic interpretability, and information theory. Broadly, I'm interested in understanding the mechanisms underlying learning and in developing more capable and reliable intelligence.

Currently I'm working with the Torr Vision Group and OATML at Oxford, headed by Prof. Philip Torr and Prof. Yarin Gal respectively. I'm also working with Prof. Ravid Shwartz-Ziv at NYU. I'm always happy to connect, so please feel free to reach out!

Email · CV / Résumé · Scholar · LinkedIn · GitHub · Blog

00 Education

University of Oxford · Incoming

MSc in Advanced Computer Science

Oxford, England

Oct 2026 – Oct 2027

VIT Vellore

BTech in Computer Science and Engineering

CGPA: 9.53 / 10 · Vellore, India

Sep 2022 – Jul 2026

01 News & Updates

Apr 2026 Invited to Y Combinator Startup School India 2026, among the Top 2000 builders, founders, and engineers in India.
Mar 2026 Paper "Beyond the Loss Curve" accepted to the ICLR DeLTa Workshop.
Mar 2026 Admitted to the MSc Advanced Computer Science program at the University of Oxford, starting Oct 2026.
Jan 2026 Selected among the top 10 of 90+ teams at SPAR Fall 2025; presented a lightning talk on causal reward-hacking induction to researchers at Constellation, UK AISI, and BlueDot Impact.
Sep 2025 Joined NYU as a research intern with Prof. Ravid Shwartz-Ziv (currently at Meta) on tractable-distribution datasets and uncertainty decomposition.
Sep 2025 Started as a SPAR Research Fellow (Supervised Program for Alignment Research) with Shivam Raval, Harvard University.

02 Papers & Preprints

Beyond the Loss Curve: Scaling Laws, Active Learning, and the Limits of Learning

Yug D Oswal, PI: Ravid Shwartz-Ziv

✦ ICLR DeLTa Workshop 2026

Constructed the first large-scale tractable-distribution dataset for ImageNet-64 enabling exact uncertainty decomposition. Discovered epistemic scaling power-laws and a constant aleatoric floor. Demonstrated that epistemic-based active learning outperforms entropy-based sampling, requiring 47.8% fewer samples for equivalent performance.

OpenReview

Loss Switching, Novel Classification and Regression Losses

Yug D Oswal, PI: Mathew Mithra Noel

Under Review at Applied Soft Computing, 2025

Gradient-based loss scheduling method (loss switching) paired with statistically optimized classification and regression losses. Achieves ≥ 3% top-1 ImageNet gain and ≥ 1.4% RMSE improvement across four regression benchmarks under asymmetric outlier shifts.

arXiv

A Significantly Better Class of Activation Functions Than ReLU-Like Functions

Yug D Oswal, PI: Mathew Mithra Noel

arXiv, 2024

Cone activations compute hyperstrip representations, making them effective classification heads. ≥ 4.6% accuracy gain on ImageNet with 46.4% parameter reduction in VGG19. Up to 6× neuron compression yields only a ≈ 2% drop for cone activations vs. ≈ 8% for ReLU.

arXiv

Computationally Efficient Quadratic Neural Networks

Yug D Oswal, PI: Mathew Mithra Noel

arXiv, 2023–2025

Vectorized forward/backward matrix algorithms resolving the core computational bottleneck in QNNs. O(n²) reduced-parameter RP-QNN variants with systematic expressiveness–efficiency ablations.

arXiv

03 Research Experience

Supervised Program for Alignment Research (SPAR)

Research Fellow · Sep 2025 – Jan 2026 · with Shivam Raval, Harvard University · Remote

Mechanistic Interpretability · AI Safety · Reward Hacking

Led funded research on causal induction of reward hacking in LLMs via activation steering.
Selected among top 10 of 90+ teams to present to AI safety orgs including Constellation, UK AISI, and BlueDot Impact. EMNLP '26 submission in preparation.

Fellowship Certificate · GitHub

New York University

Research Intern · Sep 2025 – Present · with Prof. Ravid Shwartz-Ziv · Remote

Uncertainty Decomposition · Normalizing Flows · Deep Learning · Scaling Laws · Active Learning

Led the experimental pipeline for the first large-scale tractable-distribution dataset at ImageNet scale with exact posterior, marginal, and KL oracle queries.
Discovered epistemic scaling laws, theoretical limits of learning, and improved active learning policies. Results led to a paper accepted at the ICLR DeLTa Workshop 2026.

William & Mary

Research Intern · Apr 2025 – Dec 2025 · with Prof. Jindong Wang & Hao Chen · Remote

CoT Alignment · Statistical Interventions · Causal Tracing · Cognitive Theory

Formulated conditioning of LLM reasoning on outputs via a KL-divergence-based self-alignment framework for chain-of-thought. Achieved 25–35% task-accuracy gain and reduced bias over base DeepSeek Qwen 1.5B on the Bias Benchmark for QA. Integrated causal activation patching to trace how biased reasoning propagates into model outputs.

GitHub

04 Professional Experience

Bharat Dynamics Limited - Ministry of Defence, India

AI/ML Engineer Intern · Aug 2024 – Oct 2024 · Hyderabad

Curated 85,000-sample IR-optical hybrid UAV dataset. Built first prototype multimodal thermal-optical anti-UAV surveillance system (YOLOv8-based), successfully tested in 4 field scenarios. Engineered novel containerized client-server deployment on air-gapped defence systems.

Certificate

WebTiga (renamed Synergetics.AI)

ML Engineering Intern · Feb 2024 – Jun 2024 · Bangalore

Led the full ML lifecycle for a humanoid, speech-capable therapeutic agent. Engineered and deployed agentic workflows, RAG pipelines, guardrails, and context-aware chat with real-time client integration - reducing latency by 53%.

Certificate

University of Auckland & Signal Corporation Ltd

Project Lead (Applied ML) · Jun 2023 – Aug 2023 · New Zealand

Led an international applied ML team to resolve 5 real-world issues in Signal's threat intelligence system. Built scalable pipelines for NER, geocoding, and incremental clustering over live threat data streams for high-profile executive clients.

Certificate

05 Community & Leadership

Computer Society of India, VIT Student Chapter

Board Member & Head of Research & Development · 2023 – 2025

Directed technical strategy and operations for a 100-member chapter. Mentored juniors in ML and research. Organised large-scale events: Rural ML Outreach (40+ rural students), Riddler (1000+ participants, 25+ countries), LaserTag (1000+), and Init ML workshops (200+). Intel Developer Spotlight feature for Rekindle.

06 Selected Projects

Rekindle: AI Companion for Alzheimer's & Dementia Care

Founder & Lead Developer · 2nd place, Intel BOLT Hackathon · Intel Developer Spotlight

End-to-end AI memory-support service for dementia patients: custom emotion-extraction model (outperforming baselines on Google GoEmotions), emotion-conditioned memory retrieval, local LLMs for privacy-preserving inference, and full-stack life-logging platform.

Live Demo · Refurbished Version

07 Writing & Blog

How To Be Genuine And Why We Fear It

Authenticity, social fear, and emotional vulnerability during my first year of university · 2022

Our Greatest Treasure — Our Parents

Reflections and gratitude, inspired by a survey from Our World in Data · 2023

Teaching Machine Learning Where GPUs Don't Exist (upcoming)

Lessons from running ML outreach in rural low-resource settings · 2024