Setting Up a Hadoop 3.1.2 Cluster on Debian
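<h2 id="debian系统配置">Debian System Configuration</h2><p>I run four Debian systems as VMware virtual machines: one master and three solvers (the worker nodes). Their hostnames are <strong>master, solver1, solver2, and solver3</strong>. Note that all of the JDK and Hadoop installation and configuration steps below are run as the <strong>hadoop user</strong>, not as root.</p>
<h4 id="1-静态网络的配置">1. Static network configuration</h4>
<p>Edit <code>/etc/network/interfaces</code>, comment out the DHCP lines, and add the following. The address shown is for master; the solvers use 192.168.20.102 to 192.168.20.104, matching the <code>/etc/hosts</code> entries in step 2.</p>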
<pre><code class="language-shell"># The primary network interface
#allow-hotplug ens33
#iface ens33 inet dhcp
# static IP address
auto ens33
iface ens33 inet static
address 192.168.20.101
netmask 255.255.255.0
gateway 192.168.20.2
dns-nameservers 192.168.20.2 114.114.114.114
</code></pre>
<h4 id="2-修改etchosts文件添加如下内容">2. Edit <code>/etc/hosts</code> and add the following entries</h4>
<pre><code class="language-shell"># Hadoop
192.168.20.101 master
192.168.20.102 solver1
192.168.20.103 solver2
192.168.20.104 solver3
</code></pre>
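<p>Because the same entries must be present on all four machines, it can help to generate them from one place and then confirm that every name resolves. The following is a minimal sketch, not part of the original setup: the function names <code>hosts_block</code> and <code>check_hosts</code> are made up for illustration, and the addresses are the ones used in this article.</p>
<pre><code class="language-shell"># Hypothetical helper: print the /etc/hosts entries used in this article.
hosts_block() {
    printf '%s\n' '# Hadoop'
    printf '%s\t%s\n' 192.168.20.101 master
    printf '%s\t%s\n' 192.168.20.102 solver1
    printf '%s\t%s\n' 192.168.20.103 solver2
    printf '%s\t%s\n' 192.168.20.104 solver3
}

# Hypothetical helper: look each hostname up through the system resolver.
# getent prints the matching entry and returns non-zero when there is none.
check_hosts() {
    for h in master solver1 solver2 solver3; do
        if getent hosts "$h"; then
            echo "ok: $h"
        else
            echo "missing: $h"
        fi
    done
}
</code></pre>
<p>For example, <code>hosts_block | sudo tee -a /etc/hosts</code> appends the entries, and <code>check_hosts</code> then reports any name that still fails to resolve.</p>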
<h4 id="3-openssh-server安装和vim的安装">3. Install openssh-server and vim</h4>
<pre><code class="language-shell">sudo apt-get install openssh-server vim
</code></pre>
<h4 id="4-生成ssh密钥">4. Generate SSH keys</h4>
<pre><code class="language-shell"># Run `ssh-keygen` on each host, using that host's name as the key comment
# master
ssh-keygen -t rsa -C "master"
# solver1
ssh-keygen -t rsa -C "solver1"
# solver2
ssh-keygen -t rsa -C "solver2"
# solver3
ssh-keygen -t rsa -C "solver3"
</code></pre>
<h4 id="5-免密码登录">5. Passwordless login</h4>
<pre><code class="language-shell"># Run on every host:
ssh-copy-id -i ~/.ssh/id_rsa.pub master
ssh-copy-id -i ~/.ssh/id_rsa.pub solver1
ssh-copy-id -i ~/.ssh/id_rsa.pub solver2
ssh-copy-id -i ~/.ssh/id_rsa.pub solver3
</code></pre>
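<p>Before moving on, it is worth confirming that key-based login works from the current host to every node, since the Hadoop start scripts rely on it. A small sketch with hypothetical helper names <code>run</code> and <code>check_ssh</code>: <code>BatchMode=yes</code> makes <code>ssh</code> fail instead of prompting for a password, so a missing key shows up as a failure rather than a hang. Setting <code>DRY_RUN=1</code> only prints the commands.</p>
<pre><code class="language-shell">HOSTS="master solver1 solver2 solver3"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Try a non-interactive login to every node and run `hostname` there.
check_ssh() {
    for h in $HOSTS; do
        if run ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" hostname; then
            echo "ok: $h"
        else
            echo "FAILED: $h"
        fi
    done
}
</code></pre>
<p>Running <code>check_ssh</code> on each of the four hosts should print one <code>ok:</code> line per node; a <code>FAILED:</code> line points at a host that still needs <code>ssh-copy-id</code>.</p>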
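<h4 id="6-创建用户和用户组">6. Create the user and group</h4>
<pre><code class="language-shell"># Run on every host (requires root, hence sudo).
# On Debian, useradd also creates a matching "hadoop" group by default.
sudo useradd -m -s /bin/bash hadoop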
</code></pre>
<h2 id="jdk-安装与配置">JDK Installation and Configuration</h2>
<h4 id="1-手动安装jdk">1. Install the JDK manually</h4>
<p>Extract the JDK archive into <code>/usr/lib/jvm/</code>, then create a <code>jdk</code> symlink:</p>
<pre><code class="language-shell">sudo ln -sf /usr/lib/jvm/jdk1.8.0_202 /usr/lib/jvm/jdk
</code></pre>
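<p>The extraction itself is not shown above; it might look like the sketch below. The archive name <code>jdk-8u202-linux-x64.tar.gz</code> is an assumption (the article does not name the file), and <code>run</code> and <code>install_jdk</code> are hypothetical helpers. With <code>DRY_RUN=1</code> the commands are printed instead of executed.</p>
<pre><code class="language-shell"># Assumed archive name; override by setting JDK_TARBALL.
JDK_TARBALL="${JDK_TARBALL:-jdk-8u202-linux-x64.tar.gz}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

install_jdk() {
    run sudo mkdir -p /usr/lib/jvm || return 1
    # -C extracts into /usr/lib/jvm instead of the current directory.
    run sudo tar -xzf "$JDK_TARBALL" -C /usr/lib/jvm || return 1
    # -n replaces an existing jdk link instead of descending into it.
    run sudo ln -sfn /usr/lib/jvm/jdk1.8.0_202 /usr/lib/jvm/jdk || return 1
}
</code></pre>
<p>Keeping the versioned directory behind a stable <code>jdk</code> link means a later JDK upgrade only has to repoint the link; <code>JAVA_HOME</code> stays unchanged.</p>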
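<h4 id="2-jdk环境变量的配置">2. Configure the JDK environment variables</h4>
<ul>
<li>Create the file <code>jdk.sh</code> (writing to <code>/etc/profile.d</code> requires root, hence <code>sudo</code>)</li>
</ul>
<pre><code class="language-shell">sudo vi /etc/profile.d/jdk.sh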
</code></pre>
<ul>
<li>Add the following content:</li>
</ul>
<pre><code class="language-shell"># JDK environment settings
export JAVA_HOME=/usr/lib/jvm/jdk
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
</code></pre>
<ul>
<li>Verify the Java environment</li>
</ul>
<pre><code class="language-shell">$ java -version
java version "1.8.0_202"
Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)
</code></pre>
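<p>Copy the JDK archive and <code>jdk.sh</code> to every host with scp and repeat the steps above on each one.</p>
<h2 id="hadoop-安装与配置">Hadoop Installation and Configuration</h2>
<h4 id="hadoop-安装">Hadoop installation</h4>
<h4 id="1-解压hadoop安装包到opt修改hadoop-312的拥有者">1. Extract the Hadoop archive into <code>/opt</code> and change the owner of hadoop-3.1.2:</h4>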
<pre><code class="language-shell">sudo chown -R hadoop:hadoop /opt/hadoop-3.1.2
</code></pre>
<h4 id="2-然后创建hadoop软链接">2. Create the <code>hadoop</code> symlink</h4>
<pre><code class="language-shell">sudo ln -sf /opt/hadoop-3.1.2 /opt/hadoop
</code></pre>
<h4 id="3-在hadoop下创建logshdfsnamehdfsdata文件夹">3. Create the <code>logs</code>, <code>hdfs/name</code>, and <code>hdfs/data</code> directories under <code>hadoop</code></h4>
<pre><code class="language-shell">mkdir /opt/hadoop/logs
mkdir -p /opt/hadoop/hdfs/name
mkdir -p /opt/hadoop/hdfs/data
</code></pre>
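<h4 id="4-hadoop环境变量的配置">4. Configure the Hadoop environment variables</h4>
<ul>
<li>Create the file <code>hadoop.sh</code> (again with <code>sudo</code>, since it lives in <code>/etc/profile.d</code>)</li>
</ul>
<pre><code class="language-shell">sudo vi /etc/profile.d/hadoop.sh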
</code></pre>
<ul>
<li>Add the following content:</li>
</ul>
<pre><code class="language-shell"># Hadoop environment settings
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
</code></pre>
<ul>
<li>Reload the profile</li>
</ul>
<pre><code class="language-shell"># Apply the new profile settings to the current shell
source /etc/profile
</code></pre>
<h4 id="hadoop文件配置">Hadoop configuration files</h4>
<p>All configuration files are in the <code>etc/hadoop/</code> directory under <code>/opt/hadoop</code>.</p>
<h4 id="1-hadoop-envsh">1. <code>hadoop-env.sh</code></h4>
<pre><code class="language-shell"># Set JAVA_HOME explicitly: the daemons are started over SSH, where ${JAVA_HOME} from the login profile is not found
export JAVA_HOME=/usr/lib/jvm/jdk
</code></pre>
<h4 id="2-workers">2. <code>workers</code></h4>
<pre><code class="language-shell"># Add the hostname of every solver machine
solver1
solver2
solver3
</code></pre>
<h4 id="3-core-sitexml-">3. <code>core-site.xml</code></h4>
<pre><code class="language-xml"><configuration>
<!-- Address of the default file system (the HDFS NameNode) -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<!-- Base directory for temporary files that Hadoop creates at runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
</property>
</configuration>
</code></pre>
<h4 id="4-hdfs-sitexml">4. <code>hdfs-site.xml</code></h4>
<pre><code class="language-xml"><configuration>
<!-- Number of replicas kept for each HDFS block -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!-- Directory where the NameNode stores the HDFS namespace metadata -->
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoop/hdfs/name</value>
</property>
<!-- Directory where each DataNode physically stores its blocks -->
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hadoop/hdfs/data</value>
</property>
</configuration>
</code></pre>
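<p>Once <code>core-site.xml</code> and <code>hdfs-site.xml</code> are saved, the values Hadoop actually picks up can be read back with <code>hdfs getconf -confKey</code>, which helps catch a typo in a property name early. The sketch below assumes the <code>hdfs</code> command is already on the <code>PATH</code>; <code>run</code> and <code>check_conf</code> are hypothetical helper names, and <code>DRY_RUN=1</code> only prints the commands.</p>
<pre><code class="language-shell"># Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Read back the core-site.xml and hdfs-site.xml keys set in this article.
check_conf() {
    for key in fs.defaultFS dfs.replication dfs.namenode.name.dir dfs.datanode.data.dir; do
        printf '%s = ' "$key"
        run hdfs getconf -confKey "$key"
    done
}
</code></pre>
<p>If everything is in place, <code>check_conf</code> should echo the values configured above, such as <code>hdfs://master:9000</code> for <code>fs.defaultFS</code>; a built-in default appearing instead suggests the property was not read from the edited file.</p>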
<h4 id="5-mapred-sitexml">5. <code>mapred-site.xml</code></h4>
<pre><code class="language-xml"><configuration>
<!-- Framework that MapReduce runs on; the default is local mode -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- MapReduce JobHistory Server web UI address -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
</code></pre>
<h4 id="6--yarn-sitexml">6. <code>yarn-site.xml</code></h4>
<pre><code class="language-xml"><configuration>
<!-- Site specific YARN configuration properties -->
<!-- Hostname of the YARN ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<!-- ResourceManager web UI address -->
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<!-- How reducers fetch map output (the shuffle auxiliary service) -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
</code></pre>
<p>Pack <code>/opt/hadoop-3.1.2</code> and <code>hadoop.sh</code>, copy them to every machine with scp, and repeat the Hadoop installation steps there.</p>
<h2 id="hadoop-的验证">Verifying Hadoop</h2>
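<p>The pack-and-copy step could be scripted roughly as follows. This is only a sketch: the archive name under <code>/tmp</code> and the helpers <code>run</code> and <code>distribute_hadoop</code> are made up for illustration, and the files land in <code>/tmp</code> on each solver so that no root access is needed for the copy itself. <code>DRY_RUN=1</code> prints the commands without running them.</p>
<pre><code class="language-shell">SOLVERS="solver1 solver2 solver3"
ARCHIVE=/tmp/hadoop-3.1.2-configured.tar.gz   # hypothetical archive name

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

distribute_hadoop() {
    # Pack the configured tree once; -C keeps the paths relative to /opt.
    run tar -czf "$ARCHIVE" -C /opt hadoop-3.1.2 || return 1
    # Copy the archive and the profile script to /tmp on every solver.
    for h in $SOLVERS; do
        run scp "$ARCHIVE" /etc/profile.d/hadoop.sh "$h:/tmp/" || return 1
    done
}
</code></pre>
<p>On each solver the archive is then extracted with <code>sudo tar -xzf</code> into <code>/opt</code>, <code>hadoop.sh</code> is moved into <code>/etc/profile.d</code>, and the <code>chown</code> and symlink steps from the installation section are repeated. Doing this before HDFS is formatted keeps NameNode metadata out of the copied tree.</p>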
<ul>
<li>First, format HDFS (once, on master)</li>
</ul>
<pre><code class="language-shell">hdfs namenode -format
</code></pre>
<ul>
<li>Start and stop the JobHistory Server</li>
</ul>
<pre><code class="language-shell"># mr-jobhistory-daemon.sh still works in Hadoop 3 but is deprecated in favor of:
mapred --daemon start historyserver
mapred --daemon stop historyserver
</code></pre>
<ul>
<li>Start and stop YARN</li>
</ul>
<pre><code class="language-shell">start-yarn.sh
stop-yarn.sh
</code></pre>
<ul>
<li>Start and stop HDFS</li>
</ul>
<pre><code class="language-shell">start-dfs.sh
stop-dfs.sh
</code></pre>
<ul>
<li>Start and stop everything with one command</li>
</ul>
<pre><code class="language-shell">start-all.sh
stop-all.sh
</code></pre>
<ul>
<li><strong>Verification</strong> (output of <code>jps</code> on master)</li>
</ul>
<pre><code class="language-shell">$ jps
13074 SecondaryNameNode
14485 Jps
10441 JobHistoryServer
12876 NameNode
13341 ResourceManager
</code></pre>
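<p>Checking the <code>jps</code> listing by eye gets tedious across four machines. The sketch below, with a hypothetical function name <code>check_daemons</code>, reads <code>jps</code> output on standard input and reports which of the expected daemons are present. The name is compared against the whole second field, so <code>NameNode</code> is not satisfied by <code>SecondaryNameNode</code>.</p>
<pre><code class="language-shell"># Read `jps` output on stdin; the arguments are the daemon names to expect.
# Returns non-zero if any expected daemon is missing.
check_daemons() {
    listing=$(cat)
    missing=0
    for d in "$@"; do
        # jps prints "PID Name"; require an exact match on the name field.
        if printf '%s\n' "$listing" | awk -v name="$d" '$2 == name { found = 1 } END { exit !found }'; then
            echo "running: $d"
        else
            echo "MISSING: $d"
            missing=1
        fi
    done
    return "$missing"
}
</code></pre>
<p>On master this could be run as <code>jps | check_daemons NameNode SecondaryNameNode ResourceManager JobHistoryServer</code>. On the solvers, which run the worker daemons, the corresponding check is <code>jps | check_daemons DataNode NodeManager</code>.</p>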
<p>Access the web UIs:</p>
<table>
<thead>
<tr>
<th>Daemon</th>
<th>Web Interface</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>NameNode</td>
<td>http://192.168.20.101:9870</td>
<td>Default HTTP port is 9870.</td>
</tr>
<tr>
<td>ResourceManager</td>
<td>http://192.168.20.101:8088</td>
<td>Default HTTP port is 8088.</td>
</tr>
<tr>
<td>MapReduce JobHistory Server</td>
<td>http://192.168.20.101:19888</td>
<td>Default HTTP port is 19888.</td>
</tr>
</tbody>
</table><br><br>
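<p>The three web interfaces can also be probed from the command line, which is handy when no browser is available on the host network. A minimal sketch, assuming <code>curl</code> is installed and that master keeps the address 192.168.20.101 used in this article; <code>run</code> and <code>check_web_ui</code> are hypothetical helper names, and <code>DRY_RUN=1</code> only prints the commands.</p>
<pre><code class="language-shell">MASTER_IP="${MASTER_IP:-192.168.20.101}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Probe the NameNode, ResourceManager and JobHistory Server ports in turn.
check_web_ui() {
    for port in 9870 8088 19888; do
        url="http://$MASTER_IP:$port/"
        # -s: no progress output, -o /dev/null: discard the body,
        # -w: print only the HTTP status code, --max-time: give up after 5 s.
        if run curl -s -o /dev/null -w '%{http_code}\n' --max-time 5 "$url"; then
            echo "reachable: $url"
        else
            echo "unreachable: $url"
        fi
    done
}
</code></pre>
<p>Here "reachable" only means that the connection succeeded and some HTTP status came back; a redirect status is normal for pages that forward to their dashboard.</p>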
Source: https://www.cnblogs.com/hxca/p/11301376.html