Dataset: Where Are You? (WAY)


WAY dataset

The Where Are You? (WAY) dataset contains ~6k dialogs in which two humans -- an Observer and a Locator -- complete a cooperative localization task. The Observer is spawned at random in a 3D environment and can navigate from first-person views while answering questions from the Locator. The Locator must localize the Observer in a map by asking questions and giving instructions. Based on this dataset, we define three challenging tasks: Localization from Embodied Dialog or LED (localizing the Observer from dialog history), Embodied Visual Dialog (modeling the Observer), and Cooperative Localization (modeling both agents).


Task: Localization from Embodied Dialog (LED)


Localization from Embodied Dialog (LED) is the state estimation problem of localizing the Observer given a map and a partial or complete dialog between the Locator and the Observer. This task specifically tests a model's ability to accurately encode a dialog and effectively ground it in the visual representation of an environment.
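As a rough illustration of the task's inputs and outputs, the sketch below treats LED as a function from a map and a dialog history to a predicted location, scored by distance to the Observer's true position. Everything in it is an assumption made for illustration: the class and field names, the pixel-based map coordinates, the made-up dialog, and the Euclidean error are hypothetical and are not taken from the WAY codebase or its data format. The predictor is a trivial placeholder, not one of the baseline LED models.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass
class Turn:
    speaker: str  # "Locator" or "Observer"
    text: str


@dataclass
class LEDExample:
    # Hypothetical container; field names are illustrative, not the WAY schema.
    map_size: Tuple[int, int]            # (width, height) of the map, assumed in pixels
    dialog: List[Turn]                   # partial or complete dialog history
    true_location: Tuple[float, float]   # Observer position on the map, assumed in pixels


def predict_center(example: LEDExample) -> Tuple[float, float]:
    """Placeholder predictor: ignores the dialog and returns the map center."""
    width, height = example.map_size
    return (width / 2.0, height / 2.0)


def localization_error(pred: Tuple[float, float], true: Tuple[float, float]) -> float:
    """Euclidean distance between predicted and true location, in map units."""
    return hypot(pred[0] - true[0], pred[1] - true[1])


# Made-up example data, for illustration only.
example = LEDExample(
    map_size=(400, 300),
    dialog=[
        Turn("Locator", "What do you see around you?"),
        Turn("Observer", "I am in a kitchen next to a long table."),
    ],
    true_location=(230.0, 110.0),
)

pred = predict_center(example)
error = localization_error(pred, example.true_location)
print(f"predicted {pred}, error {error:.1f}")
```

A real LED model would replace predict_center with something that encodes the dialog and grounds it in the map; the interface (dialog and map in, location out) is the part this sketch is meant to convey.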



LED Task

The WAY codebase and baseline LED models are available at:
github.com/batra-mlp-lab/WAY/


November 2020: Paper accepted to EMNLP 2020!
November 2020: PyTorch code for training and evaluating LED models is now available!

Where Are You? Localization from Embodied Dialog

Meera Hahn, Jacob Krantz, Dhruv Batra, Devi Parikh, James M. Rehg, Stefan Lee, Peter Anderson
EMNLP 2020 [Bibtex] [PDF] [Code]




People


Meera Hahn
Georgia Tech
Jacob Krantz
Oregon State University
Dhruv Batra
Georgia Tech & Facebook AI Research
Devi Parikh
Georgia Tech & Facebook AI Research
James M. Rehg
Georgia Tech
Stefan Lee
Oregon State University

Email — meerahahn@gatech.edu