Dataset: Where Are You? (WAY)

WAY dataset

The Where Are You? (WAY) dataset contains ~6k dialogs in which two humans -- an Observer and a Locator -- complete a cooperative localization task. The Observer is spawned at random in a 3D environment and can navigate from first-person views while answering questions from the Locator. The Locator must localize the Observer in a map by asking questions and giving instructions. Based on this dataset, we define three challenging tasks: Localization from Embodied Dialog or LED (localizing the Observer from dialog history), Embodied Visual Dialog (modeling the Observer), and Cooperative Localization (modeling both agents).

Task: Localization from Embodied Dialog (LED)

Localization from Embodied Dialog (LED) is the state estimation problem of localizing the Observer given a map and a partial or complete dialog between the Locator and the Observer. This task specifically tests a model's ability to accurately encode a dialog and effectively ground it in the visual representation of an environment.
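To make the task setup concrete, here is a minimal Python sketch of what a LED example and a simple scoring function could look like. Everything in it is an assumption made for illustration: the class and field names (`Turn`, `LEDExample`, `map_id`, `true_location`) are hypothetical and do not reflect the WAY file schema, and the error is plain Euclidean distance in 2D map coordinates, whereas a real evaluation might instead measure distance along the navigation graph.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Turn:
    speaker: str  # "Locator" or "Observer"
    text: str


@dataclass
class LEDExample:
    # Hypothetical record layout for illustration, not the WAY schema.
    map_id: str
    dialog: List[Turn]
    true_location: Tuple[float, float]  # (x, y) in map units


def dialog_prefix(example: LEDExample, num_turns: int) -> str:
    """Join the first `num_turns` turns into one string (a partial dialog)."""
    turns = example.dialog[:num_turns]
    return " ".join(f"{t.speaker}: {t.text}" for t in turns)


def localization_error(pred: Tuple[float, float],
                       true: Tuple[float, float]) -> float:
    """Euclidean distance between predicted and true location (assumed metric)."""
    return math.dist(pred, true)


def accuracy_at(errors: Sequence[float], threshold: float) -> float:
    """Fraction of predictions whose error is within `threshold`."""
    return sum(e <= threshold for e in errors) / len(errors)


example = LEDExample(
    map_id="demo_map",
    dialog=[
        Turn("Locator", "What do you see?"),
        Turn("Observer", "A kitchen with a long table."),
    ],
    true_location=(3.0, 4.0),
)
error = localization_error((0.0, 0.0), example.true_location)
```

Passing a smaller `num_turns` to `dialog_prefix` corresponds to the partial-dialog setting, where a model has to localize the Observer before the conversation has finished.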

LED Task

The WAY codebase and most recent LED models are available at:

The test server and leaderboard are live on EvalAI:


July 2021: The public leaderboard is now live on EvalAI!
July 2021: PyTorch code and models for the LED task based on the navigation graph are now available! The release contains multiple models, including cross-modal embeddings and LingUNet-Skip.
November 2020: Paper accepted to EMNLP 2020!
November 2020: WAY dataset is now available!


Where Are You? Localization from Embodied Dialog

Meera Hahn, Jacob Krantz, Dhruv Batra, Devi Parikh, James M. Rehg, Stefan Lee, Peter Anderson
EMNLP 2020 [Bibtex] [PDF] [Code]


Meera Hahn, Georgia Tech
Jacob Krantz, Oregon State University
Dhruv Batra, Georgia Tech & Facebook AI Research
Devi Parikh, Georgia Tech & Facebook AI Research
James M. Rehg, Georgia Tech
Stefan Lee, Oregon State University
