Build a single-node Hadoop from scratch

August 23, 2021

Environment dependencies

First, you need Java support. Download version 1.8 of the JDK, decompress it after downloading, and declare JAVA_HOME in the environment variables:
JAVA_HOME=/usr/local/java/jdk1.8.0_161
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export PATH
After saving, use the source command to make the environment variables take effect.
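For example, if the variables were added to /etc/profile (which file you edited is an assumption; adjust accordingly):

# Reload the profile so the new variables apply to the current shell
source /etc/profile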

Download the Hadoop release

In this post, we will be using version 2.10. Unzip it to whatever directory you want to place it in, the same as with the Java program. Execute XXX/bin/hadoop version (where XXX is the directory created by decompression). If the version number appears, the decompression was correct.
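A minimal sketch of the download-and-verify flow; the mirror URL is an assumption (the Apache archive), and the target directory matches the HADOOP_HOME used later in this post:

mkdir -p /usr/local/hadoop
cd /usr/local/hadoop
# Fetch and unpack the 2.10.1 release
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz
tar -xzf hadoop-2.10.1.tar.gz
# Print the version to confirm the unpacked tree is intact
./hadoop-2.10.1/bin/hadoop version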

Configure the stand-alone version

1. Under XXX/etc/hadoop, two configuration files, core-site.xml and hdfs-site.xml, need to be modified. This is core-site.xml:
<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/usr/local/hadoop/tmp</value>
                <description>Abase for other temporary directories.</description>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://localhost:9000</value>
        </property>
</configuration>

This is hdfs-site.xml
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/usr/local/hadoop/tmp/dfs/data</value>
        </property>
</configuration>

Note: dfs.replication is the number of replicas; dfs.namenode.name.dir and dfs.datanode.data.dir are the storage paths of the NameNode and the DataNode respectively.
2. Perform initialization. Go back to the Hadoop home directory and execute ./bin/hdfs namenode -format.
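Concretely, assuming Hadoop was unpacked to /usr/local/hadoop/hadoop-2.10.1 (the path used elsewhere in this post):

cd /usr/local/hadoop/hadoop-2.10.1
# Format the HDFS NameNode; this initializes the directories configured above
./bin/hdfs namenode -format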
If the initialization succeeds, output like the following appears at the end:
18/08/20 11:07:16 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/08/20 11:07:16 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at phenix/127.0.1.1
************************************************************/
3. Manually add JAVA_HOME. Configure the JAVA_HOME path in the file XXX/etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/local/java/jdk1.8.0_161
4. Start the NameNode and DataNode daemons:
Execute ./sbin/start-dfs.sh
5. Start the YARN resource manager:
Execute ./sbin/start-yarn.sh
Finally, execute jps. The following six background processes prove that the startup was successful:
[root@VM-0-16-centos hadoop-2.10.1]# jps
14880 NameNode
15220 SecondaryNameNode
15384 ResourceManager
15690 NodeManager
20814 Jps
15038 DataNode
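As an extra sanity check (not in the original steps, just a sketch), HDFS should now accept reads and writes; run from the Hadoop home directory:

# Create a home directory in HDFS for the current user
./bin/hdfs dfs -mkdir -p /user/root
# Upload a local file and list it back
./bin/hdfs dfs -put etc/hadoop/core-site.xml /user/root
./bin/hdfs dfs -ls /user/root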

Add Hadoop to the environment variables

This lets you use the hadoop command directly:
JAVA_HOME=/usr/local/java/jdk1.8.0_161
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.10.1
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
export JAVA_HOME
export PATH
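After sourcing the profile again, the commands resolve from any directory; for example (assuming the variables were added to /etc/profile):

source /etc/profile
hadoop version
hdfs dfs -ls /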

Web UI

HDFS portal: http://localhost:50070
YARN portal: http://localhost:8088/cluster
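If the server has no browser, the portals can be probed from the shell (a quick sketch):

# Both should return HTML once the daemons are up
curl -s http://localhost:50070 | head
curl -s http://localhost:8088/cluster | head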

Stopping the services

./sbin/stop-dfs.sh
./sbin/stop-yarn.sh