PhD Student at MIT
pschro@mit.edu
I am a PhD student at MIT in Electrical Engineering & Computer Science, advised by Dr. Jim Glass.
My work has focused on improving the reasoning capabilities of large language models (LLMs) and vision-language models (VLMs) in challenging embodied settings.
My early projects introduced recursive architectures for transformer decoding that improve the performance of LLMs and VLMs when interacting with external environments through text or video. This work led to first-author papers at NeurIPS 2025 (introducing ROVER) and NAACL 2025 (introducing THREAD).
My most recent work, done during my internship at the Boston Dynamics AI Institute (now RAI Institute), introduces SOLE-R1: a new foundation model with video-language reasoning designed for guiding on-robot reinforcement learning. In our paper (under review at ICML 2026), we show that SOLE-R1 significantly outperforms state-of-the-art reasoning models and enables learning over 20 unseen tasks through zero-shot online RL: robots learn without access to ground-truth rewards, success indicators, demonstrations, or task-specific tuning.