戏真多哦 posted on 2019-8-25 21:24:00

Setting Up a Hadoop 3.1.2 Cluster on Debian

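<h2 id="debian系统配置">Debian System Configuration</h2>
<p>I run four Debian virtual machines in VMware: one master and three solver (worker) nodes. Their hostnames are <strong>master, solver1, solver2, and solver3</strong>. Note that the JDK and Hadoop installation and configuration steps below are run as the <strong>hadoop user</strong>, not as root.</p>
<h4 id="1-静态网络的配置">1. Configure a static network address</h4>
<p>Edit <code>/etc/network/interfaces</code>, comment out the DHCP configuration, and add the following:</p>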
<pre><code class="language-shell"># The primary network interface
#allow-hotplug ens33
#iface ens33 inet dhcp

# static IP address
auto ens33
iface ens33 inet static
address 192.168.20.101
netmask 255.255.255.0
gateway 192.168.20.2
# List all name servers on one line; this option takes effect only if the resolvconf package is installed
dns-nameservers 192.168.20.2 114.114.114.114
</code></pre>
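<p>Each VM needs its own address in this stanza. The following is only a sketch: <code>print_static_stanza</code> is a hypothetical helper (not part of Debian) that prints the static block for a given interface and address, reusing the netmask and gateway shown above.</p>

```shell
#!/bin/sh
# Hypothetical helper: print a static-address stanza for one node.
# Usage: print_static_stanza <interface> <address>
print_static_stanza() {
    iface=$1
    address=$2
    # Netmask and gateway are the values used in this tutorial's network.
    cat <<EOF
auto $iface
iface $iface inet static
address $address
netmask 255.255.255.0
gateway 192.168.20.2
EOF
}
```

<p>For example, <code>print_static_stanza ens33 192.168.20.102</code> prints the block for solver1. Review the output before pasting it into <code>/etc/network/interfaces</code>.</p>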
<h4 id="2-修改etchosts文件添加如下内容">2. Add the following entries to <code>/etc/hosts</code></h4>
<pre><code class="language-shell"># Hadoop
192.168.20.101 master
192.168.20.102 solver1
192.168.20.103 solver2
192.168.20.104 solver3
</code></pre>
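<p>Since the same four entries go on every host, the edit can be scripted. This is a minimal sketch, assuming the address plan above; <code>hadoop_hosts_entries</code> and <code>add_missing_hosts</code> are hypothetical helper names. It appends only the entries whose hostname is not already in the target file, so running it twice does not create duplicates.</p>

```shell
#!/bin/sh
# Print the host entries used in this tutorial, one "IP hostname" pair per line.
hadoop_hosts_entries() {
    printf '%s\n' \
        '192.168.20.101 master' \
        '192.168.20.102 solver1' \
        '192.168.20.103 solver2' \
        '192.168.20.104 solver3'
}

# Append each entry to the given hosts file unless its hostname is already listed.
# Usage: add_missing_hosts <hosts-file>
add_missing_hosts() {
    target=$1
    hadoop_hosts_entries | while read -r ip name; do
        # Match the hostname as a whole word preceded by whitespace.
        if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$target"; then
            printf '%s %s\n' "$ip" "$name" >> "$target"
        fi
    done
}
```

<p>Try it on a copy first, for example <code>cp /etc/hosts /tmp/hosts.test</code> followed by <code>add_missing_hosts /tmp/hosts.test</code>. Writing to <code>/etc/hosts</code> itself needs root privileges.</p>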
<h4 id="3-openssh-server安装和vim的安装">3. Install openssh-server and vim</h4>
<pre><code class="language-shell">sudo apt-get install openssh-server vim
</code></pre>
<h4 id="4-生成ssh密钥">4. Generate SSH keys</h4>
<pre><code class="language-shell"># Run `ssh-keygen` on each host

# master
ssh-keygen -t rsa -C "master"

# solver1
ssh-keygen -t rsa -C "solver1"

# solver2
ssh-keygen -t rsa -C "solver2"

# solver3
ssh-keygen -t rsa -C "solver3"
</code></pre>
<h4 id="5-免密码登录">5. Set up passwordless login</h4>
<pre><code class="language-shell"># Run on every host:
ssh-copy-id -i ~/.ssh/id_rsa.pub master
ssh-copy-id -i ~/.ssh/id_rsa.pub solver1
ssh-copy-id -i ~/.ssh/id_rsa.pub solver2
ssh-copy-id -i ~/.ssh/id_rsa.pub solver3
</code></pre>
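<p>It is worth confirming that passwordless login really works before moving on, because the Hadoop start scripts rely on it. A minimal sketch, assuming the four hostnames above; <code>check_passwordless_ssh</code> is a hypothetical helper name. <code>BatchMode=yes</code> makes <code>ssh</code> fail instead of prompting for a password.</p>

```shell
#!/bin/sh
# Hostnames from this tutorial.
HADOOP_NODES="master solver1 solver2 solver3"

# Try a non-interactive login to every node and report the result.
# Returns non-zero if any node still asks for a password or is unreachable.
check_passwordless_ssh() {
    failed=0
    for node in $HADOOP_NODES; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
            echo "ok   $node"
        else
            echo "FAIL $node"
            failed=1
        fi
    done
    return "$failed"
}
```

<p>Run <code>check_passwordless_ssh</code> as the user that will start Hadoop. A host that has never been contacted will fail under <code>BatchMode</code> until its key is in <code>known_hosts</code>, which <code>ssh-copy-id</code> normally takes care of.</p>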
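<h4 id="6-创建用户和用户组">6. Create the user and group</h4>
<pre><code class="language-shell"># Run on every host (root privileges required).
# Steps 4 and 5 should be run as this user so that its keys are the ones used when Hadoop starts.
sudo useradd -m -s /bin/bash hadoop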
</code></pre>
<h2 id="jdk-安装与配置">JDK Installation and Configuration</h2>
<h4 id="1-手动安装jdk">1. Install the JDK manually</h4>
<p>Extract the JDK archive into <code>/usr/lib/jvm/</code>, then create a <code>jdk</code> symlink:</p>
<pre><code class="language-shell">sudo ln -sf /usr/lib/jvm/jdk1.8.0_202 /usr/lib/jvm/jdk
</code></pre>
<h4 id="2-jdk环境变量的配置">2. Configure the JDK environment variables</h4>
<ul>
<li>Create the file <code>jdk.sh</code></li>
</ul>
<pre><code class="language-shell">sudo vi /etc/profile.d/jdk.sh
</code></pre>
<ul>
<li>Add the following:</li>
</ul>
<pre><code class="language-shell"># JDK environment settings
export JAVA_HOME=/usr/lib/jvm/jdk
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
</code></pre>
<ul>
<li>Verify the Java environment</li>
</ul>
<pre><code class="language-shell">$ java -version
java version "1.8.0_202"
Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)
</code></pre>
<p>Copy the JDK archive and <code>jdk.sh</code> to every host with scp, then repeat the steps above on each one.</p>
<h2 id="hadoop-安装与配置">Hadoop Installation and Configuration</h2>
<h4 id="hadoop-安装">Hadoop Installation</h4>
<h4 id="1-解压hadoop安装包到opt修改hadoop-312的拥有者">1. Extract the Hadoop archive into <code>/opt</code> and change the owner of hadoop-3.1.2:</h4>
<pre><code class="language-shell">sudo chown -R hadoop:hadoop /opt/hadoop-3.1.2
</code></pre>
<h4 id="2-然后创建hadoop软链接">2. Create the <code>hadoop</code> symlink</h4>
<pre><code class="language-shell">sudo ln -sf /opt/hadoop-3.1.2 /opt/hadoop
</code></pre>
<h4 id="3-在hadoop下创建logshdfsnamehdfsdata文件夹">3. Create the <code>logs</code>, <code>hdfs/name</code>, and <code>hdfs/data</code> directories under <code>hadoop</code></h4>
<pre><code class="language-shell">mkdir /opt/hadoop/logs
mkdir -p /opt/hadoop/hdfs/name
mkdir -p /opt/hadoop/hdfs/data
</code></pre>
<h4 id="4-hadoop环境变量的配置">4. Configure the Hadoop environment variables</h4>
<ul>
<li>Create the file <code>hadoop.sh</code></li>
</ul>
<pre><code class="language-shell">sudo vi /etc/profile.d/hadoop.sh
</code></pre>
<ul>
<li>Add the following:</li>
</ul>
<pre><code class="language-shell"># Hadoop environment settings
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
</code></pre>
<ul>
<li>Reload the profile variables</li>
</ul>
<pre><code class="language-shell"># Apply the profile changes to the current shell
source /etc/profile
</code></pre>
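<p>After reloading the profile, a quick check can confirm that both variables point at real directories. This is a sketch only; <code>check_home_var</code> is a hypothetical helper name, and it tests nothing beyond "set and is a directory".</p>

```shell
#!/bin/sh
# Report whether the named environment variable is set and points to a directory.
# Usage: check_home_var <VARIABLE_NAME>
check_home_var() {
    name=$1
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value"
        return 1
    fi
    echo "$name=$value"
}
```

<p>For example, <code>check_home_var JAVA_HOME &amp;&amp; check_home_var HADOOP_HOME &amp;&amp; hadoop version</code> prints both paths and then the Hadoop version if everything is on the <code>PATH</code>.</p>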
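<h4 id="hadoop文件配置">Hadoop Configuration Files</h4>
<p>All configuration files are in the <code>etc/hadoop/</code> directory.</p>
<h4 id="1-hadoop-envsh">1. <code>hadoop-env.sh</code></h4>
<pre><code class="language-shell"># JDK location (set explicitly because daemons are launched over SSH, where ${JAVA_HOME} is not found)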
export JAVA_HOME=/usr/lib/jvm/jdk
</code></pre>
<h4 id="2-workers">2. <code>workers</code></h4>
<pre><code class="language-shell"># Add the hostname of every solver machine
solver1
solver2
solver3
</code></pre>
<h4 id="3-core-sitexml-">3. <code>core-site.xml</code></h4>
<pre><code class="language-xml">&lt;configuration&gt;

&lt;!-- Address of the HDFS NameNode --&gt;
&lt;property&gt;
    &lt;name&gt;fs.defaultFS&lt;/name&gt;
    &lt;value&gt;hdfs://master:9000&lt;/value&gt;
&lt;/property&gt;

&lt;!-- Directory for temporary files that Hadoop creates at runtime --&gt;
&lt;property&gt;
    &lt;name&gt;hadoop.tmp.dir&lt;/name&gt;
    &lt;value&gt;/opt/hadoop/tmp&lt;/value&gt;
&lt;/property&gt;

&lt;/configuration&gt;
</code></pre>
<h4 id="4-hdfs-sitexml">4. <code>hdfs-site.xml</code></h4>
<pre><code class="language-xml">&lt;configuration&gt;

&lt;!-- Number of replicas kept for each HDFS block --&gt;
&lt;property&gt;
    &lt;name&gt;dfs.replication&lt;/name&gt;
    &lt;value&gt;1&lt;/value&gt;
&lt;/property&gt;

&lt;!-- Where the NameNode stores HDFS namespace metadata --&gt;
&lt;property&gt;
    &lt;name&gt;dfs.namenode.name.dir&lt;/name&gt;
    &lt;value&gt;/opt/hadoop/hdfs/name&lt;/value&gt;
&lt;/property&gt;

&lt;!-- Where each DataNode physically stores its data blocks --&gt;
&lt;property&gt;
    &lt;name&gt;dfs.datanode.data.dir&lt;/name&gt;
    &lt;value&gt;/opt/hadoop/hdfs/data&lt;/value&gt;
&lt;/property&gt;

&lt;/configuration&gt;
</code></pre>
<h4 id="5-mapred-sitexml">5. <code>mapred-site.xml</code></h4>
<pre><code class="language-xml">&lt;configuration&gt;

&lt;!-- Framework that runs MapReduce jobs; the default is local mode --&gt;
&lt;property&gt;
    &lt;name&gt;mapreduce.framework.name&lt;/name&gt;
    &lt;value&gt;yarn&lt;/value&gt;
&lt;/property&gt;

&lt;!-- MapReduce JobHistory web UI address --&gt;
&lt;property&gt;
    &lt;name&gt;mapreduce.jobhistory.webapp.address&lt;/name&gt;
    &lt;value&gt;master:19888&lt;/value&gt;
&lt;/property&gt;

&lt;/configuration&gt;
</code></pre>
<h4 id="6--yarn-sitexml">6. <code>yarn-site.xml</code></h4>
<pre><code class="language-xml">&lt;configuration&gt;
&lt;!-- Site specific YARN configuration properties --&gt;
   
&lt;!-- Hostname of the YARN ResourceManager --&gt;
&lt;property&gt;
    &lt;name&gt;yarn.resourcemanager.hostname&lt;/name&gt;
    &lt;value&gt;master&lt;/value&gt;
&lt;/property&gt;

&lt;!--yarn Web UI address --&gt;
&lt;property&gt;
    &lt;name&gt;yarn.resourcemanager.webapp.address&lt;/name&gt;
    &lt;value&gt;${yarn.resourcemanager.hostname}:8088&lt;/value&gt;
&lt;/property&gt;

&lt;!-- How reducers fetch map output --&gt;
&lt;property&gt;
    &lt;name&gt;yarn.nodemanager.aux-services&lt;/name&gt;
    &lt;value&gt;mapreduce_shuffle&lt;/value&gt;
&lt;/property&gt;

&lt;/configuration&gt;
</code></pre>
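<p>A typo in one of these XML files usually shows up only when a daemon refuses to start, so it helps to ask Hadoop which values it actually loaded. The sketch below uses <code>hdfs getconf -confKey</code>; <code>expect_conf</code> is a hypothetical helper name, and I only use it for keys from <code>core-site.xml</code> and <code>hdfs-site.xml</code>.</p>

```shell
#!/bin/sh
# Compare one effective configuration value with the value we expect.
# Usage: expect_conf <key> <expected-value>
expect_conf() {
    # `hdfs getconf -confKey` prints the value Hadoop resolved for the key.
    actual=$(hdfs getconf -confKey "$1") || return 1
    if [ "$actual" = "$2" ]; then
        echo "ok       $1=$actual"
    else
        echo "MISMATCH $1=$actual (expected $2)"
        return 1
    fi
}
```

<p>With the files above, <code>expect_conf fs.defaultFS hdfs://master:9000</code> and <code>expect_conf dfs.replication 1</code> should both report <code>ok</code>.</p>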
<p>Pack <code>/opt/hadoop-3.1.2</code> and <code>hadoop.sh</code>, copy them to every machine with scp, and then repeat the Hadoop installation steps there.</p>
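<p>The copy step can be looped over the worker hostnames. A minimal sketch, assuming passwordless SSH is already working; <code>copy_to_workers</code> is a hypothetical helper name, and the archive you pass in is whatever you packed yourself.</p>

```shell
#!/bin/sh
# Worker hostnames from this tutorial.
WORKERS="solver1 solver2 solver3"

# Copy one local file to the same remote directory on every worker.
# Stops at the first failed copy.
# Usage: copy_to_workers <local-file> <remote-directory>
copy_to_workers() {
    for node in $WORKERS; do
        echo "copying $1 to $node:$2"
        scp "$1" "$node:$2" || return 1
    done
}
```

<p>For example, <code>copy_to_workers hadoop-bundle.tar.gz /tmp/</code>, where <code>hadoop-bundle.tar.gz</code> is a made-up archive name. Extracting into <code>/opt</code> on each machine still needs sudo there.</p>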
<h2 id="hadoop-的验证">Verifying Hadoop</h2>
<ul>
<li>First, format HDFS</li>
</ul>
<pre><code class="language-shell">hdfs namenode -format
</code></pre>
<ul>
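<li>Start and stop the JobHistory server</li>
</ul>
<pre><code class="language-shell"># Hadoop 3 marks this script as deprecated; `mapred --daemon start historyserver` is the replacement
mr-jobhistory-daemon.sh start historyserver
mr-jobhistory-daemon.sh stop historyserver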
</code></pre>
<ul>
<li>Start and stop YARN</li>
</ul>
<pre><code class="language-shell">start-yarn.sh
stop-yarn.sh
</code></pre>
<ul>
<li>Start and stop HDFS</li>
</ul>
<pre><code class="language-shell">start-dfs.sh
stop-dfs.sh
</code></pre>
<ul>
<li>Start and stop everything at once</li>
</ul>
<pre><code class="language-shell">start-all.sh
stop-all.sh
</code></pre>
<ul>
<li><strong>Verification</strong></li>
</ul>
<pre><code class="language-shell">$ jps
13074 SecondaryNameNode
14485 Jps
10441 JobHistoryServer
12876 NameNode
13341 ResourceManager
</code></pre>
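<p>The output above is from the master. On the solver nodes, <code>jps</code> should list DataNode and NodeManager instead. Checking by eye gets tedious, so here is a small sketch that prints whichever expected daemons are missing; <code>missing_daemons</code> is a hypothetical helper name. It compares whole names, so NameNode is not confused with SecondaryNameNode.</p>

```shell
#!/bin/sh
# Print each expected daemon name that does not appear in the local `jps` output.
# Prints nothing when every daemon is running.
# Usage: missing_daemons <name> [<name> ...]
missing_daemons() {
    # The second column of `jps` output is the main class name.
    running=$(jps | awk '{print $2}')
    for daemon in "$@"; do
        # -x matches the whole line, so "NameNode" does not match "SecondaryNameNode".
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}
```

<p>On the master, run <code>missing_daemons NameNode SecondaryNameNode ResourceManager JobHistoryServer</code>; on a solver node, <code>missing_daemons DataNode NodeManager</code>.</p>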
<p>Open the web UIs</p>
<table>
<thead>
<tr>
<th>Daemon</th>
<th>Web Interface</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>NameNode</td>
<td>http://192.168.20.101:9870</td>
<td>Default HTTP port is 9870.</td>
</tr>
<tr>
<td>ResourceManager</td>
<td>http://192.168.20.101:8088</td>
<td>Default HTTP port is 8088.</td>
</tr>
<tr>
<td>MapReduce JobHistory Server</td>
<td>http://192.168.20.101:19888</td>
<td>Default HTTP port is 19888.</td>
</tr>
</tbody>
</table><br><br>
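<p>The same three pages can be probed from a terminal. A minimal sketch, assuming <code>curl</code> is installed; <code>check_web_ui</code> is a hypothetical helper name. It treats any 2xx or 3xx status as "up".</p>

```shell
#!/bin/sh
# Request a URL and report whether the server answered with a 2xx or 3xx status.
# Usage: check_web_ui <url>
check_web_ui() {
    # -w '%{http_code}' prints only the status code; the body is discarded.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1") || code=000
    case $code in
        2*|3*)
            echo "up   $1"
            ;;
        *)
            echo "down $1 (HTTP $code)"
            return 1
            ;;
    esac
}
```

<p>For example, <code>check_web_ui http://192.168.20.101:9870</code> checks the NameNode page; repeat with ports 8088 and 19888 for the other two daemons.</p>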
Source: https://www.cnblogs.com/hxca/p/11301376.html