Apache Hadoop 3.2.0, the latest version of the open source software platform for distributed storage and processing of large data sets, has been released. This version is the first in the 3.2 release series and is not generally available yet. Its features include Node Attributes, which support tagging nodes with multiple labels based on their traits and placing containers according to expressions over those labels. Node attributes are not tied to any queue, so they require no queue resource planning or authorization.
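The idea of attribute-based placement can be illustrated with a small toy model. The sketch below is not YARN's API: the `Node` class, the `candidate_nodes` helper, and the attribute names (`os`, `gpu`) are hypothetical, and it assumes the simplest kind of expression, in which every requested attribute must equal the node's value.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node tagged with attributes."""
    name: str
    attributes: dict = field(default_factory=dict)


def matches(node, required):
    """Return True if the node carries every required attribute value."""
    return all(node.attributes.get(key) == value
               for key, value in required.items())


def candidate_nodes(nodes, required):
    """List the names of nodes on which a container could be placed."""
    return [node.name for node in nodes if matches(node, required)]


# Illustrative cluster; the attribute names and values are assumptions.
nodes = [
    Node("node1", {"os": "centos", "gpu": "true"}),
    Node("node2", {"os": "ubuntu", "gpu": "false"}),
    Node("node3", {"os": "centos", "gpu": "false"}),
]

print(candidate_nodes(nodes, {"os": "centos", "gpu": "true"}))
```

Note that the selection depends only on the nodes' own attributes; no queue is consulted, which mirrors the point that attributes need no queue-level planning.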
This release ships with Hadoop Submarine, which allows data engineers to develop, train and deploy deep learning models in TensorFlow on the same Hadoop YARN cluster where the data resides. It also lets jobs work with data or models stored in the Hadoop Distributed File System (HDFS) and other storage systems. Another new feature is the Storage Policy Satisfier, which lets HDFS applications move blocks between storage types as they set storage policies on files or directories. The release also offers a solution for decoupling storage capacity from computing capacity. It arrives with an enhanced S3A connector, including better resilience to throttled AWS S3 and DynamoDB IO. In addition, the ABFS filesystem connector supports the latest Azure Datalake Gen2 Storage.
Major improvements in Apache Hadoop 3.2.0 include the following: the jdk1.7 profile has been removed from the hadoop-annotations module; unnecessary logging related to tags has been removed from the configuration; the ADLS connector has been updated to use the current SDK version (2.2.7); LocalizedResource size information is now included in the NM download log for localization; auxiliary services can be configured from HDFS-based JAR files; user environment variables can be defined individually; the debug messages in MetricsConfig.java have been improved; Capacity Scheduler performance metrics have been added; and node labels are now supported in opportunistic scheduling.