About Me
I am a Research Scientist at Google Research in Atlanta. I recently completed my PhD in Computer Science at the Georgia Institute of Technology, where I was advised by James M. Rehg. At Georgia Tech I also had the pleasure of working closely with Dhruv Batra and Devi Parikh, and I collaborated with Peter Anderson and Stefan Lee. My research focuses on multi-modal modeling of vision and natural language for applications in artificial intelligence. My long-term goal is to develop multi-modal systems that support robotic or AR assistants capable of seamlessly interacting with humans. My current work revolves around training embodied agents (in simulation) to perform complex semantic grounding tasks.
In the summer of 2020, I was a research intern at Facebook AI Research (FAIR), working with Abhinav Gupta. In the summer of 2019, I was a research intern at Facebook Reality Labs (FRL), working with James Hillis and Dhruv Batra. In the summer of 2018, I was a research intern at NEC Labs, working with Asim Kadav and Hans Peter Graf.
As an undergraduate at Emory University, I worked in a Natural Language Processing lab for two years under Dr. Jinho Choi. I also spent a summer working with Dr. Mubarak Shah at the University of Central Florida.
Education
- Ph.D. in Computer Science, Georgia Institute of Technology, July 2022 (Presidential PhD Fellowship, 2016–2020)
- B.S. in Computer Science and Mathematics, Emory University, 2016
Publications
Which way is 'right'?: Uncovering Limitations of Vision-and-Language Navigation Models

Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

No RL, No Simulation: Learning to Navigate without Navigating

Where Are You? Localization from Embodied Dialog

Learning a Visually Grounded Memory Assistant

Tripping through time: Efficient Localization of Activities in Videos

Action2Vec: A Crossmodal Embedding Approach to Action Learning

Localizing and Aligning Fine-Grained Actions to Sparse Instructions

Situated Bayesian Reasoning Framework for Robots Operating in Diverse Everyday Environments

Deep Tracking: Visual Tracking Using Deep Convolutional Networks

Advances in Methods and Evaluations for Distributional Semantic Models

Talks
No RL, No Simulation: Learning to Navigate without Navigating
NeurIPS 2021
Where Are You? Localization from Embodied Dialog
EMNLP 2020