The Artificial Intelligence unit at search engine giant Google has open-sourced GPipe, a library for efficiently training Deep Neural Networks (DNNs), under Lingvo, its TensorFlow framework. According to one of the company’s AI software engineers, GPipe can be applied to any network made up of multiple sequential layers and lets researchers scale performance easily.
Google AI software engineer Yanping Huang said that DNNs have advanced several Machine Learning tasks, including speech recognition, visual recognition, and language processing. Huang added that larger DNN models tend to deliver better task performance, and that past progress in visual recognition has shown a strong relationship between model size and classification accuracy. Larger models, however, can outgrow the memory available on a single accelerator. To overcome this limitation, GPipe uses pipeline parallelism to scale DNN training. It combines two training techniques: synchronous stochastic gradient descent, an optimization algorithm used to update a model’s parameters, and pipeline parallelism, a task execution scheme in which the output of one step is streamed as input to the next, as Huang described in “GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism”.
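The way these two techniques fit together can be illustrated with a small Python sketch. It is not GPipe's implementation or API. It assumes, following the GPipe paper and not this article, that each mini-batch is split into smaller micro-batches that flow through the stages one after another, and that a single parameter update is applied only after every micro-batch has contributed its gradient. The names `forward_schedule`, `idle_fraction`, and `sync_sgd_step`, and the toy model y = w * x, are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def forward_schedule(num_stages: int, num_microbatches: int) -> List[List[Optional[int]]]:
    """Return grid[t][k]: the micro-batch that stage k processes at step t (None = idle).

    Stage k can start micro-batch m only after stage k - 1 has produced its
    output, so stage k handles micro-batch m at step t = k + m.
    """
    total_steps = num_stages + num_microbatches - 1
    grid: List[List[Optional[int]]] = []
    for t in range(total_steps):
        row: List[Optional[int]] = []
        for k in range(num_stages):
            m = t - k
            row.append(m if 0 <= m < num_microbatches else None)
        grid.append(row)
    return grid


def idle_fraction(grid: List[List[Optional[int]]]) -> float:
    """Share of (step, stage) slots in which a stage has nothing to do."""
    slots = [cell for row in grid for cell in row]
    return sum(cell is None for cell in slots) / len(slots)


def sync_sgd_step(
    w: float,
    microbatches: Sequence[Sequence[Tuple[float, float]]],
    lr: float,
) -> float:
    """One synchronous SGD update for the toy model y = w * x with squared error.

    Gradients from all micro-batches are accumulated first and applied once,
    so the result does not depend on how the mini-batch was split.
    """
    grad_sum = 0.0
    count = 0
    for batch in microbatches:
        for x, y in batch:
            grad_sum += 2.0 * (w * x - y) * x  # d/dw of (w*x - y)^2
            count += 1
    return w - lr * grad_sum / count


if __name__ == "__main__":
    grid = forward_schedule(num_stages=4, num_microbatches=8)
    for t, row in enumerate(grid):
        print(t, ["-" if m is None else m for m in row])
    print("idle fraction:", round(idle_fraction(grid), 3))
```

In this toy model, with 4 stages and 8 micro-batches, 3 of every 11 slots are idle; using more micro-batches per mini-batch shrinks that share. The sketch covers only the forward pass and ignores communication cost, so it should be read as an intuition aid, not as a performance model.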
Many of GPipe’s performance gains come from better memory allocation for Artificial Intelligence models. On Google Cloud Tensor Processing Units (TPUs), each of which has 8 processor cores and 64GB of memory (8GB per core), GPipe reduced intermediate memory usage from 6.26GB to 3.46GB, allowing 318 million parameters on a single accelerator core. According to Huang, without GPipe a single core can train only up to 82 million model parameters. In one experiment, Google trained a Deep Learning model, AmoebaNet-B, with 557 million model parameters on sample images, using TPUs that each accommodated 1.8 billion parameters. Huang said that the model performed well on popular datasets, pushing single-crop ImageNet accuracy to 84.3 percent, CIFAR-10 accuracy to 99 percent, and CIFAR-100 accuracy to 91.3 percent, and that training speed improved as well.
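To put the reported figures in proportion, the short calculation below derives the relative memory saving and the growth in parameters per core. It uses only the numbers quoted above; the helper names `fractional_reduction` and `scale_factor` are hypothetical.

```python
def fractional_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity shrank, e.g. 0.25 means 25 percent less."""
    return (before - after) / before


def scale_factor(with_gpipe: float, without_gpipe: float) -> float:
    """How many times larger the first quantity is than the second."""
    return with_gpipe / without_gpipe


# Figures as reported in the article.
memory_cut = fractional_reduction(before=6.26, after=3.46)  # intermediate memory, GB
param_gain = scale_factor(with_gpipe=318e6, without_gpipe=82e6)  # parameters per core
per_core_gb = 64 / 8  # total TPU memory divided by core count

print(f"intermediate memory reduced by {memory_cut:.1%}")  # 44.7%
print(f"parameters per core grew {param_gain:.1f}x")       # 3.9x
print(f"memory per core: {per_core_gb:.0f}GB")             # 8GB
```

By this arithmetic, cutting intermediate memory by a little under half went along with nearly four times as many parameters on one core.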