Where Are You? Localization from Embodied Dialog

We present WHERE ARE YOU? (WAY), a dataset of ∼6k dialogs in which two humans – an Observer and a Locator – complete a cooperative localization task.

Hahn, Meera, Jacob Krantz, Dhruv Batra, Devi Parikh, James Rehg, Stefan Lee, and Peter Anderson. "Where Are You? Localization from Embodied Dialog." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 806-822. 2020.

Tripping through time: Efficient Localization of Activities in Videos

We present TripNet, an end-to-end system that aligns text queries with video content. TripNet uses reinforcement learning to efficiently localize relevant activity clips in long videos by learning how to intelligently skip around the video.

Hahn, Meera, Asim Kadav, James M. Rehg, and Hans Peter Graf. "Tripping through time: Efficient localization of activities in videos." Proceedings of the British Machine Vision Conference (2019).

Action2Vec: A Crossmodal Embedding Approach to Action Learning

We present a novel cross-modal embedding space for actions, named Action2Vec, which combines linguistic cues from class labels with spatio-temporal features derived from video clips.

Meera Hahn, Andrew Silva, and James M. Rehg. "Action2Vec: A Crossmodal Embedding Approach to Action Learning." arXiv preprint arXiv:1901.00484 (2019).

Situated Bayesian Reasoning Framework for Robots Operating in Diverse Everyday Environments

We present an approach for automatically generating a compact semantic knowledge base, relevant to a robot’s particular operating environment, given only a small number of object labels obtained from object recognition or a robot’s task description.

Chernova, Sonia, Vivian Chu, Angel Daruna, Haley Garrison, Meera Hahn, Priyanka Khante, Weiyu Liu, and Andrea Thomaz. "Situated Bayesian reasoning framework for robots operating in diverse everyday environments." In Robotics Research, pp. 353-369. Springer, Cham, 2020.

Localizing and Aligning Fine-Grained Actions to Sparse Instructions

We present a system for automatically generating an alignment between a recipe and a first-person video demonstrating how to prepare the dish. Our approach uses egocentric cues to generate a concise set of action proposals, which are then matched to recipe steps using object detections and computational linguistic techniques.

Hahn, Meera, et al. "Localizing and Aligning Fine-Grained Actions to Sparse Instructions." arXiv preprint.

Advances in Methods and Evaluations for Distributional Semantic Models

The goal of this work was to create more semantically rich embeddings for verbs. The approach modifies the word embedding architecture to incorporate semantic role labels and dependencies, and introduces novel quantitative evaluations of embeddings for all parts of speech. This was my undergraduate thesis at Emory University, completed under Dr. Jinho Choi.

Meera Hahn and Jinho Choi. "Advances in Methods and Evaluations for Distributional Semantic Models." Emory University (2016).

Deep Tracking: Visual Tracking Using Deep Convolutional Networks

We present a novel and successful approach to object tracking using convolutional neural networks. This work was done in a Research Experience for Undergraduates (REU) program at the University of Central Florida under Dr. Mubarak Shah.

Meera Hahn, Si Chen, and Afshin Dehghan. "Deep tracking: Visual tracking using deep convolutional networks." arXiv preprint arXiv:1512.03993 (2015).