A team of researchers at Facebook AI has built a text-based adventure game to study the impact of grounding on dialogue in Artificial Intelligence interactions. In particular, the researchers examined how grounding (the body of mutual knowledge, beliefs, and assumptions that is essential for conversation between two people) affects an AI agent's understanding of the virtual world around it. Toward that end, they developed a research ecosystem in the form of a large-scale, crowd-sourced text adventure named LIGHT, in which AI systems and humans interact as player characters.
In their statement, the researchers noted that current state-of-the-art frameworks exploit only the statistical regularities of language data, without comprehending the world that language describes. LIGHT enables learning from both actions and dialogue, and the researchers hope it will be fun for humans to interact with, allowing future engagement with their models. They further noted that all utterances in LIGHT are created by human annotators and therefore inherit properties of natural language such as ambiguity and co-reference, making it a challenging platform for grounded learning of language and actions. To build the framework, human annotators were tasked with creating location names and backstories, character categories, and a list of characters with details, personas, and sets of belongings. Afterward, the researchers crowdsourced objects and accompanying descriptions, along with a number of actions and emotions. With these efforts, LIGHT now contains natural-language descriptions of 663 locations based on a set of regions and biomes, together with 3,462 objects and 1,755 characters.
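The article does not give the actual schema of the LIGHT dataset; as a rough illustration, the crowdsourced world elements described above (locations with backstories and regions, objects with descriptions, characters with personas and belongings) might be represented along these lines. All field names and example values here are invented for illustration, not taken from the dataset release.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class GameObject:
    name: str
    description: str  # crowdsourced natural-language description

@dataclass
class Character:
    name: str
    category: str     # character category label (illustrative)
    persona: str      # short first-person backstory written by an annotator
    belongings: List[GameObject] = field(default_factory=list)

@dataclass
class Location:
    name: str
    description: str  # crowdsourced natural-language description
    backstory: str
    region: str       # one of the region/biome labels
    objects: List[GameObject] = field(default_factory=list)
    characters: List[Character] = field(default_factory=list)

# Invented example entry, purely to show how the pieces fit together:
sword = GameObject("rusty sword", "A dull blade pitted with age.")
knight = Character("knight", "person",
                   "I guard the bridge and speak to no one.", [sword])
bridge = Location("troll bridge", "A mossy stone bridge over a ravine.",
                  "Built by giants long ago.", "wilderness",
                  objects=[], characters=[knight])
```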
With the game world established, the Facebook AI researchers set out to compile a dataset of character-driven interactions. To do so, they placed two human-controlled characters in a random location, each complete with an assigned persona and objects, and had them take turns in which they could perform one action and say one thing. Altogether, the researchers recorded 10,777 such episodes of actions, emotes, and dialogue, which they used to train several Artificial Intelligence models.
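The turn-taking setup described above (two personas in a shared location, alternating turns that can each carry an utterance plus an action or emote) suggests a simple episode record. The following is a hypothetical sketch under that reading; the field names are assumptions, not the dataset's actual format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Turn:
    speaker: str                  # which of the two characters is acting
    utterance: str                # the one thing the character says this turn
    action: Optional[str] = None  # optional game action, e.g. "draw sword"
    emote: Optional[str] = None   # optional emote, e.g. "nod"

@dataclass
class Episode:
    location: str
    personas: Dict[str, str]      # character name -> persona text
    turns: List[Turn]

# Invented example episode, for illustration only:
episode = Episode(
    location="troll bridge",
    personas={"knight": "I guard the bridge and speak to no one.",
              "traveler": "I seek passage to the city."},
    turns=[
        Turn("traveler", "Good sir, may I cross?", emote="nod"),
        Turn("knight", "None shall pass without tribute.",
             action="draw sword"),
    ],
)
```

A corpus of 10,777 such episodes could then be fed to a dialogue model as (context, next-turn) training pairs.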