In collaboration with AI research company DeepMind, search engine giant Google has released PlaNet, a Deep Planning Network that learns a world model from image inputs and leverages it for planning. PlaNet is able to solve various image-based tasks with up to 5,000 percent the data efficiency of advanced model-free agents while remaining competitive with them. According to the company, the source code is available on the code hosting platform GitHub.
As explained by Danijar Hafner, co-author of the academic paper on PlaNet’s architecture and a student researcher at Google AI, PlaNet works by learning a dynamics model from image inputs and planning with that model to collect new experience. Specifically, it uses a latent dynamics model, which predicts the latent state forward in time and generates an image and a reward at each step from the corresponding latent state, allowing it to capture abstract properties such as the velocities of objects. Through this predictive approach, the PlaNet agent learns and plans quickly: to evaluate an action sequence, it only needs to predict future rewards in the compact latent state space rather than generate images. Contrary to previous approaches, the model works without a policy network; instead, it selects actions purely by planning.
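To make the planning idea concrete, here is a minimal, hypothetical sketch of how an agent can choose actions in a learned latent space without a policy network. It uses cross-entropy-method search over candidate action sequences, the general style of planner PlaNet employs; the `transition` and `reward` functions below are toy linear/quadratic stand-ins for the learned networks, and all names and dimensions are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 4
ACTION_DIM = 2
HORIZON = 12        # how many steps ahead to imagine
CANDIDATES = 1000   # action sequences sampled per iteration
TOP_K = 100         # elite sequences kept to refit the proposal
ITERATIONS = 10     # refinement iterations

# Toy latent dynamics: a contracting linear map plus action influence.
# In PlaNet these would be learned neural networks.
A = 0.9 * np.eye(LATENT_DIM)
B = rng.normal(size=(LATENT_DIM, ACTION_DIM))

def transition(z, a):
    """Predict the next latent state (stand-in for the learned model)."""
    return z @ A.T + a @ B.T

def reward(z):
    """Predicted reward from a latent state: prefer states near the origin."""
    return -np.sum(z ** 2, axis=-1)

def plan(z0):
    """Cross-entropy-method planning over action sequences in latent space."""
    mean = np.zeros((HORIZON, ACTION_DIM))
    std = np.ones((HORIZON, ACTION_DIM))
    for _ in range(ITERATIONS):
        # Sample candidate action sequences from the current proposal.
        actions = mean + std * rng.normal(size=(CANDIDATES, HORIZON, ACTION_DIM))
        # Roll every sequence forward through the latent model, summing
        # predicted rewards -- no image generation is needed for planning.
        z = np.tile(z0, (CANDIDATES, 1))
        returns = np.zeros(CANDIDATES)
        for t in range(HORIZON):
            z = transition(z, actions[:, t])
            returns += reward(z)
        # Refit the proposal distribution to the best-performing sequences.
        elite = actions[np.argsort(returns)[-TOP_K:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean  # the agent executes mean[0], then replans next step

z0 = rng.normal(size=LATENT_DIM)
planned_sequence = plan(z0)
first_action = planned_sequence[0]
```

In this scheme the agent executes only the first planned action and then replans from the newly observed state, so the planner, not a policy network, is what produces behavior.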
Google tested PlaNet on six continuous control tasks, including one in which a simulated robot lying on the ground had to learn to stand up and walk, and another that required the model to predict multiple possible futures. On these tasks, Google said its model outperformed model-free methods such as A3C and D4PG on image-based inputs. Further, when a single PlaNet agent was placed randomly into a range of environments without being told the task, it managed to learn all six tasks without modification within 2,000 attempts. Hafner and his co-authors believe that scaling up the processing power could yield an even more robust model.