Hadoop 3 Standalone Deployment
Hadoop 3 single-node pseudo-distributed deployment
Applies to all Linux distributions, including: CentOS, Ubuntu, Deepin, Manjaro.
Hadoop version: 3.3.4
Java 8 is required
All Hadoop JARs are compiled against the Java 8 runtime (Hadoop 3.3 can also run on Java 11, which is why the startup log below shows java = 11.0.8).
Default ports of several services have changed
Many Hadoop default ports changed in Hadoop 3; the main ones are listed below:
Service | Hadoop 2 port | Hadoop 3 port |
---|---|---|
NameNode | 8020 | 9820 |
NameNode HTTP UI | 50070 | 9870 |
DataNode | 50010 | 9866 |
Secondary NameNode HTTP UI | 50090 | 9868 |
DataNode IPC | 50020 | 9867 |
DataNode HTTP UI | 50075 | 9864 |
Configure the Java environment
Details omitted; a quick sketch follows.
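Assuming a JDK 8 tarball unpacked at /usr/java/jdk1.8.0_152 (the same path used later in hadoop-env.sh; adjust to your actual install), the environment can be set like this:
# Assumed JDK location; change to wherever your JDK 8 lives
export JAVA_HOME=/usr/java/jdk1.8.0_152
export PATH=${JAVA_HOME}/bin:${PATH}
# Verify the runtime is picked up
java -version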
Deployment
# Download from the Tsinghua mirror (fast inside mainland China)
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz
# Extract
mkdir -p /opt/modules /opt/shortcut
tar -zxf hadoop-3.3.4.tar.gz -C /opt/modules
# Create a symlink (the rest of this guide refers to /opt/shortcut/hadoop)
ln -s /opt/modules/hadoop-3.3.4 /opt/shortcut/hadoop
# Configure the environment variables (e.g. append to ~/.bashrc)
# set hadoop environment
HADOOP_HOME=/opt/shortcut/hadoop
if [[ -d ${HADOOP_HOME} ]]; then
export HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_PID_DIR=${HADOOP_HOME}/pids
export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
fi
# Remove the Windows scripts
rm -f ${HADOOP_HOME}/etc/hadoop/*.cmd
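After sourcing the profile, the install can be sanity-checked with the stock CLI:
source ~/.bashrc    # or wherever the variables above were added
hadoop version      # should report Hadoop 3.3.4
which hdfs yarn     # both should resolve under ${HADOOP_HOME}/bin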
Editing the configuration files
The following configuration files usually need to be modified:
- etc/hadoop/hadoop-env.sh
- etc/hadoop/core-site.xml
- etc/hadoop/hdfs-site.xml
- etc/hadoop/mapred-site.xml
- etc/hadoop/yarn-site.xml
1. hadoop-env.sh
Usually only the JAVA_HOME path needs to be defined here:
export JAVA_HOME=/usr/java/jdk1.8.0_152
If JAVA_HOME is already defined in the surrounding environment, this file can be left unchanged.
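As a non-interactive alternative (same assumed JDK path as above), the line can simply be appended; a definition later in hadoop-env.sh overrides the commented-out default:
echo 'export JAVA_HOME=/usr/java/jdk1.8.0_152' >> ${HADOOP_CONF_DIR}/hadoop-env.sh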
2. core-site.xml
# Create the temporary storage directory
mkdir -p /opt/shortcut/hadoop/data/tmp
# Edit core-site.xml
vim ${HADOOP_CONF_DIR}/core-site.xml
<!-- Add the following properties; the file already contains an empty <configuration> element, so do not paste a second one -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://wedo:9820</value>
        <description>HDFS access URI (hostname and NameNode RPC port)</description>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>wedo</value>
        <description>User that the web UIs act as when browsing HDFS</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/shortcut/hadoop/data/tmp</value>
        <description>Base directory for Hadoop temporary files</description>
    </property>
    <property>
        <name>fs.trash.interval</name>
        <value>7200</value>
        <description>Minutes before trash checkpoints are deleted (7200 min = 5 days); 0 disables trash</description>
    </property>
</configuration>
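The merged configuration can be sanity-checked without starting any daemon; hdfs getconf is part of the stock CLI (the expected values are simply the ones configured above):
hdfs getconf -confKey fs.defaultFS      # expect hdfs://wedo:9820
hdfs getconf -confKey hadoop.tmp.dir    # expect /opt/shortcut/hadoop/data/tmp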
3. hdfs-site.xml
Create the local directories where the NameNode and DataNode store their data:
mkdir -p /opt/shortcut/hadoop/data/namenode
mkdir -p /opt/shortcut/hadoop/data/datanode
Then edit hdfs-site.xml:
<configuration>
    <!-- Permission checking -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
        <description>Disable HDFS permission checking (acceptable for a single-node dev setup)</description>
    </property>
    <!-- Replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Replication factor (1 on a single node)</description>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/shortcut/hadoop/data/namenode</value>
        <description>NameNode data directory</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/shortcut/hadoop/data/datanode</value>
        <description>DataNode data directory</description>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>0.0.0.0:9870</value>
        <description>Bind the NameNode web UI to 0.0.0.0 rather than the loopback address, so port 9870 is reachable from other machines (dfs.http.address and port 50070 are the deprecated Hadoop 2 equivalents)</description>
    </property>
</configuration>
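The same getconf check works for the HDFS keys:
hdfs getconf -confKey dfs.replication            # expect 1
hdfs getconf -confKey dfs.namenode.http-address  # expect 0.0.0.0:9870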
4. yarn-site.xml
<configuration>
    <!-- Which machine runs the ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>wedo.com</value>
        <description>Host on which the ResourceManager runs</description>
    </property>
    <!-- Run the MapReduce shuffle service inside the NodeManager -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Auxiliary shuffle service required by MapReduce jobs</description>
    </property>
    <!-- Log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
        <description>Enable log aggregation</description>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
        <description>How long to keep aggregated logs, in seconds (604800 s = 7 days)</description>
    </property>
</configuration>
5. mapred-site.xml
Hadoop 3 ships etc/hadoop/mapred-site.xml directly (the mapred-site.xml.template file from Hadoop 2 no longer exists), so edit it in place:
vim ${HADOOP_CONF_DIR}/mapred-site.xml
Add the following configuration:
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- Point the MR ApplicationMaster and tasks at the Hadoop installation;
         without these, jobs typically fail with "Could not find or load main class ... MRAppMaster" -->
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=/opt/shortcut/hadoop</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=/opt/shortcut/hadoop</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=/opt/shortcut/hadoop</value>
    </property>
</configuration>
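As an optional check, the Hadoop 3 CLIs (hadoop/hdfs/yarn/mapred) provide an envvars subcommand that prints the environment they compute:
mapred envvars    # should show HADOOP_MAPRED_HOME pointing at /opt/shortcut/hadoop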
Startup
Format HDFS:
hdfs namenode -format
If no error is reported, the format succeeded; the output looks like this:
2023-10-03 14:26:31,131 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = I75930/192.168.3.29
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.3.4
STARTUP_MSG: classpath = xxx
compiled by 'stevel' on 2022-07-29T12:32Z
STARTUP_MSG: java = 11.0.8
************************************************************/
2023-10-03 14:26:31,138 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2023-10-03 14:26:31,208 INFO namenode.NameNode: createNameNode [-format]
2023-10-03 14:26:31,532 INFO common.Util: Assuming 'file' scheme for path /opt/shortcut/hadoop/data/namenode in configuration.
2023-10-03 14:26:31,532 INFO common.Util: Assuming 'file' scheme for path /opt/shortcut/hadoop/data/namenode in configuration.
2023-10-03 14:26:31,595 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2023-10-03 14:26:31,626 INFO namenode.FSNamesystem: fsOwner = wedo (auth:SIMPLE)
2023-10-03 14:26:31,626 INFO namenode.FSNamesystem: supergroup = supergroup
2023-10-03 14:26:31,626 INFO namenode.FSNamesystem: isPermissionEnabled = false
2023-10-03 14:26:31,626 INFO namenode.FSNamesystem: isStoragePolicyEnabled = true
2023-10-03 14:26:31,626 INFO namenode.FSNamesystem: HA Enabled: false
...
...
...
2023-10-03 14:26:31,847 INFO common.Storage: Storage directory /warehouse/modules/hadoop-3.3.4/data/namenode has been successfully formatted.
2023-10-03 14:26:32,014 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2023-10-03 14:26:32,015 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at I75930/192.168.3.29
************************************************************/
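A formatted NameNode directory can also be inspected directly; after a successful format the current/ subdirectory contains the initial fsimage (the path below follows this guide's hdfs-site.xml; the log above was captured on a machine with a different layout):
ls /opt/shortcut/hadoop/data/namenode/current
# expect VERSION, seen_txid and an fsimage_0000000000000000000 pair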
Starting and stopping the services
hdfs --daemon start namenode
hdfs --daemon start datanode
yarn --daemon start resourcemanager
yarn --daemon start nodemanager
yarn --daemon start timelineserver
mapred --daemon start historyserver
Replace start with stop to shut a daemon down. The legacy Hadoop 2 scripts still work but print a deprecation warning:
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
sbin/mr-jobhistory-daemon.sh start historyserver
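Alternatively, the grouped scripts in ${HADOOP_HOME}/sbin start a whole service set at once (they SSH to the hosts listed in etc/hadoop/workers, so passwordless SSH to localhost is needed for this single-node setup):
start-dfs.sh     # NameNode, SecondaryNameNode and DataNode
start-yarn.sh    # ResourceManager and NodeManager
stop-dfs.sh and stop-yarn.sh are the counterparts.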
Verification
Check whether the daemons started successfully:
jps
The process list should look like this:
# wedo @ I75930 in ~ [15:41:26]
$ jps
13635 NameNode
13939 DataNode
32774 Jps
1800 QuorumPeerMain
14187 ResourceManager
14479 NodeManager
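With everything up, a quick HDFS smoke test (the QuorumPeerMain entry above is an unrelated ZooKeeper process on the same machine; the /user/wedo path is just an example):
hdfs dfs -mkdir -p /user/wedo    # create a home directory
hdfs dfs -ls /                   # should list /user
hdfs dfsadmin -report            # should report one live DataNode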
Browser access
HDFS web UI: http://wedo.com:9870
YARN web UI: http://wedo.com:8088
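Finally, an end-to-end MapReduce check using the examples jar that ships with the distribution (the jar path follows the install layout above):
hadoop jar /opt/shortcut/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 2 10
# the job should run on YARN and print an estimate of Pi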