Posted by 豪情男儿 on 2019-5-15 00:57:00

Installing Pseudo-Distributed Hadoop 3.1.2 on Ubuntu

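<p>This article sets up pseudo-distributed Hadoop on an Ubuntu system that is already installed. For installing Ubuntu in VirtualBox, see the author's earlier post</p>
<p>"<strong>Installing and Configuring Ubuntu 16.04 LTS in VirtualBox</strong>".</p>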
<p>&nbsp;</p>
<h1><strong>1. The Three Hadoop Run Modes (Startup Modes)</strong></h1>
<h2><strong>1.1 Standalone Mode (Local or Standalone Mode)</strong></h2>
<p class="15">- Hadoop runs in this mode by default. It is used for development and debugging.</p>
<p class="15">- No configuration files are modified.<br>- It uses the local file system rather than a distributed file system.<br>- Hadoop starts no daemons (NameNode, DataNode, ResourceManager, NodeManager; JobTracker and TaskTracker in Hadoop 1.x). The Map() and Reduce() tasks run as different parts of a single process.<br>- It is used to debug the logic of a MapReduce program and confirm that the program is correct.</p>
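<h2><strong>1.2 Pseudo-Distributed Mode</strong></h2>
<p class="15">- The Hadoop daemons run on the local machine and simulate a small cluster.</p>
<p class="15">- One host simulates several hosts.<br>- Hadoop starts the NameNode, DataNode, ResourceManager, and NodeManager daemons (JobTracker and TaskTracker in Hadoop 1.x) on the same machine, each as an independent Java process.<br>- In this mode Hadoop uses the distributed file system, and each job is an independent process managed by the ResourceManager. Compared with standalone mode it adds more debugging options: you can inspect memory usage, HDFS input and output, and the interaction with the other daemons. Because it behaves like fully distributed mode, it is commonly used to develop and test whether a Hadoop program runs correctly.</p>
<p class="15">- Modify 3 configuration files: core-site.xml (properties of the Hadoop cluster, applied to all processes and clients), hdfs-site.xml (working properties of the HDFS cluster), and mapred-site.xml (properties of the MapReduce cluster). With YARN, yarn-site.xml is edited as well.<br>- Format the file system.</p>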
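<h2><strong>1.3 Fully Distributed Mode</strong></h2>
<p class="15">- The Hadoop daemons run on a cluster.</p>
<p class="15">- The daemons run on a cluster built from multiple hosts. This is the real production environment.<br>- Install the JDK and Hadoop on every host and connect the hosts in one network.<br>- Set up passwordless SSH login between the hosts by adding the master node's public key to the authorized keys of each worker node, so that the start scripts on the master can log in to the workers.<br>- Modify 3 configuration files: core-site.xml, hdfs-site.xml, and mapred-site.xml. Specify the location and port of the NameNode and the ResourceManager (the JobTracker in Hadoop 1.x), and set parameters such as the file replication factor.<br>- Format the file system.</p>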
<p class="15">&nbsp;</p>
<h1 class="15">2. Preparing the System Environment</h1>
<h2><strong><span style="font-family: 宋体">2.1 Start the virtual machine and configure a static network</span></strong></h2>
<p><span style="font-family: 宋体">Run ifconfig -a in a terminal to list the network interfaces. My host has three: enp0s3 (bridged adapter), enp0s8 (NAT), and lo (the loopback interface).</span></p>
<div class="cnblogs_code">
<pre># <span style="color: rgba(0, 0, 255, 1)">ifconfig</span> -a</pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514204510539-263440594.png" alt=""></p>
<p>&nbsp;</p>
<p>Edit the /etc/network/interfaces file. Run the following command in a terminal:</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">sudo</span> vim /etc/network/interfaces</pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514205006098-333942167.png" alt=""></p>
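<p>The screenshot above shows the default content of Ubuntu's /etc/network/interfaces file, which is <strong>the default configuration for obtaining an address dynamically</strong>.</p>
<p>For this setup the Ubuntu host needs a static IP address. Here I change only enp0s3. The following is the <strong>static configuration (adjust the values to your own network)</strong>:</p>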
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 0, 1)">auto enp0s3
iface enp0s3 inet static
address </span><span style="color: rgba(128, 0, 128, 1)">192.168</span>.<span style="color: rgba(128, 0, 128, 1)">87.138</span><span style="color: rgba(0, 0, 0, 1)">
netmask </span><span style="color: rgba(128, 0, 128, 1)">255.255</span>.<span style="color: rgba(128, 0, 128, 1)">255.0</span><span style="color: rgba(0, 0, 0, 1)">
gateway </span><span style="color: rgba(128, 0, 128, 1)">192.168</span>.<span style="color: rgba(128, 0, 128, 1)">87.254</span></pre>
</div>
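<p>As a small illustration, the stanza above can be generated from its four values instead of being typed by hand. This is only a sketch: the function name render_static_stanza is made up for this example, and the values passed to it are the ones used in this article.</p>

```shell
#!/usr/bin/env bash
# Print an ifupdown "static" stanza for one interface.
# Arguments: interface name, IPv4 address, netmask, gateway.
# render_static_stanza is a hypothetical helper, not part of any tool.
render_static_stanza() {
    local iface="$1" address="$2" netmask="$3" gateway="$4"
    printf 'auto %s\n' "$iface"
    printf 'iface %s inet static\n' "$iface"
    printf 'address %s\n' "$address"
    printf 'netmask %s\n' "$netmask"
    printf 'gateway %s\n' "$gateway"
}

# Values from this article; adjust them to your own network.
render_static_stanza enp0s3 192.168.87.138 255.255.255.0 192.168.87.254
```

<p>The function only prints to the terminal. Review the output and back up /etc/network/interfaces before copying anything into it.</p>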
<p>Next, add a DNS server by <strong>editing</strong> the /etc/resolv.conf file. Here I <strong>chose</strong> a <strong>public DNS server</strong>. 114.114.114.114 is recommended for users in China, where it responds quickly.</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">sudo</span> vim /etc/resolv.conf</pre>
</div>
<div class="cnblogs_code">
<pre>nameserver 114.114.114.114<br># or<br>nameserver 8.8.8.8</pre>
</div>
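<p>The configuration is done. Now restart the network. There are several ways to do this; two are listed here, and either one will do.</p>
<p><strong>1. Restart the networking service</strong></p>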
<div class="cnblogs_code">
<pre>/etc/init.d/networking restart</pre>
</div>
<p><strong>2. Restart a single network interface (a system may have several)</strong></p>
<div class="cnblogs_code">
<pre># ifdown enp0s3<br># ifup enp0s3</pre>
</div>
<p>Check that the network settings are correct:</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)"># ifconfig</span></pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514211009389-1070553204.png" alt=""></p>
<p>&nbsp;</p>
<p>Check that an external host can be pinged:</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">ping</span> www.qq.com</pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514211131266-918455616.png" alt=""></p>
<p>&nbsp;</p>
<p>The ping succeeds, so the static network is configured.</p>
<p>&nbsp;</p>
<h2>2.2 Map the hostname to the IP address</h2>
<p>Show the hostname:</p>
<div class="cnblogs_code">
<pre># hostname</pre>
</div>
<p><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514214015483-1969886923.png" alt=""></p>
<p>Edit the /etc/hosts file:</p>
<div class="cnblogs_code">
<pre># vim /etc/hosts</pre>
</div>
<p><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514213829124-1283040588.png" alt=""></p>
<p>  </p>
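<p>The screenshot above shows the default /etc/hosts. Change it to the content shown next: comment out the 127.0.1.1 line and add the host's static address together with its hostname:</p>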
<p><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514214301486-1233957169.png" alt=""></p>
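<p>Because the edit is shown only as a screenshot, here is a hedged sketch of the same change as a script. It assumes GNU sed (as shipped with Ubuntu). The function name update_hosts_file is made up for this example, and the demo runs on a scratch file with sample content rather than on the real /etc/hosts. The address and hostname are the ones that appear elsewhere in this article.</p>

```shell
#!/usr/bin/env bash
# Comment out any active 127.0.1.1 line and map a static IP to a hostname.
# Arguments: hosts file path, static IPv4 address, hostname.
# update_hosts_file is a hypothetical helper written for this example.
update_hosts_file() {
    local file="$1" ip="$2" name="$3"
    # Prefix uncommented 127.0.1.1 entries with '#'.
    sed -i 's/^127\.0\.1\.1[[:space:]]/#&/' "$file"
    # Append the static mapping only if that exact line is not already present.
    grep -qxF "$ip $name" "$file" || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demo on a scratch file; the sample content here is made up.
tmp_hosts="$(mktemp)"
printf '127.0.0.1\tlocalhost\n127.0.1.1\tluengmingbiao\n' > "$tmp_hosts"
update_hosts_file "$tmp_hosts" 192.168.87.138 luengmingbiao
cat "$tmp_hosts"
```

<p>Running the function a second time changes nothing, since the 127.0.1.1 line is already commented out and the mapping line already exists.</p>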
<p>&nbsp;</p>
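<h2>2.3 Configure passwordless SSH login to the local machine</h2>
<p>On a single machine, the following commands are enough to set up passwordless SSH login.</p>
<p>When prompted for input, press Enter each time to accept the defaults.</p>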
<div class="cnblogs_code">
<pre># <span style="color: rgba(0, 0, 255, 1)">ssh-keygen</span> -<span style="color: rgba(0, 0, 0, 1)">t rsa
# </span><span style="color: rgba(0, 0, 255, 1)">cat</span> ~/.<span style="color: rgba(0, 0, 255, 1)">ssh</span>/id_rsa.pub &gt;&gt; ~/.<span style="color: rgba(0, 0, 255, 1)">ssh</span>/<span style="color: rgba(0, 0, 0, 1)">authorized_keys
# </span><span style="color: rgba(0, 0, 255, 1)">chmod</span> <span style="color: rgba(128, 0, 128, 1)">600</span> ~/.<span style="color: rgba(0, 0, 255, 1)">ssh</span>/authorized_keys</pre>
</div>
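<p>Running the cat command above more than once appends the same key again each time. The sketch below appends a key only when it is missing. The function name authorize_key is made up for this example, and the demo uses a placeholder string in a temporary directory rather than a real key.</p>

```shell
#!/usr/bin/env bash
# Append a public key to an authorized_keys file only if it is not there yet.
# Arguments: public key file, authorized_keys file.
# authorize_key is a hypothetical helper written for this example.
authorize_key() {
    local pubkey_file="$1" auth_file="$2"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from a file
    grep -qxFf "$pubkey_file" "$auth_file" || cat "$pubkey_file" >> "$auth_file"
}

# Demo with a placeholder "key" in a scratch directory.
demo_dir="$(mktemp -d)"
printf 'ssh-rsa AAAAexample user@host\n' > "$demo_dir/id_rsa.pub"
authorize_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"
authorize_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"  # no-op
grep -c . "$demo_dir/authorized_keys"
```

<p>On a real system the call would be authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys.</p>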
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514212404801-1423789947.png" alt=""></p>
<p>  </p>
<p><span style="font-family: PTSans">After that, log in as root and edit the ssh configuration file:</span></p>
<p>&nbsp;</p>
<div class="cnblogs_code">
<pre>vim /etc/<span style="color: rgba(0, 0, 255, 1)">ssh</span>/sshd_config</pre>
</div>
<p><span style="font-family: PTSans">Uncomment the following lines in the file, as shown in the screenshot:</span></p>
<div class="cnblogs_code">
<pre># Enable RSA authentication (SSH protocol 1 only; newer OpenSSH releases treat it as deprecated)
RSAAuthentication yes
# Enable public/private key pair authentication
PubkeyAuthentication yes
# Path to the public key file (the same file written in the previous step)
AuthorizedKeysFile .ssh/authorized_keys</pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514213404399-1855593749.png" alt=""></p>
<p>&nbsp;</p>
<p>Then restart the service:</p>
<div class="cnblogs_code">
<pre># service sshd restart</pre>
</div>
<p> </p>
<p><span style="font-family: PTSans">Run ssh localhost to verify. If the screen below appears without a password prompt, the configuration is complete.</span></p>
<p>&nbsp;</p>
<div class="cnblogs_code">
<pre># <span style="color: rgba(0, 0, 255, 1)">ssh</span> localhost</pre>
</div>
<p>&nbsp;</p>
<p><span style="font-family: PTSans"><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514213530260-1155950310.png" alt=""></span></p>
<p>&nbsp;</p>
<h2>2.4 Install Oracle Java and configure environment variables</h2>
<p><strong>1. Download Oracle JDK 1.8 from the official website</strong></p>
<p><strong>2. Extract the tarball into the /usr/local/ directory</strong></p>
<div class="cnblogs_code">
<pre># <span style="color: rgba(0, 0, 255, 1)">tar</span> -zxvf jdk-8u211-linux-x64.<span style="color: rgba(0, 0, 255, 1)">tar</span>.gz -C /usr/local/</pre>
</div>
<p><strong>3. Configure the environment variables</strong></p>
<div class="cnblogs_code">
<pre># vim /etc/profile</pre>
</div>
<p>Then append the following settings to the end of the file:</p>
<div class="cnblogs_code">
<pre>export JAVA_HOME=/usr/local/jdk1.<span style="color: rgba(128, 0, 128, 1)">8</span><span style="color: rgba(0, 0, 0, 1)">.0_211
export PATH</span>=$PATH:$JAVA_HOME/<span style="color: rgba(0, 0, 0, 1)">bin
export JRE_HOME</span>=$JAVA_HOME/<span style="color: rgba(0, 0, 0, 1)">jre
export CLASSPATH</span>=.:$JAVA_HOME/lib:$JRE_HOME/lib</pre>
</div>
<p>Save and exit.</p>
<p><strong>4. Test whether the JDK is configured correctly</strong></p>
<p>Reload the environment variables:</p>
<div class="cnblogs_code">
<pre># source /etc/profile</pre>
</div>
<p>Run java -version. If the configuration succeeded, the terminal prints the Java version as in the screenshot below:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514221907552-37777031.png" alt=""></p>
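<p>If java -version fails, a quick check of JAVA_HOME can narrow the problem down. The following is a minimal sketch: the function name is_jdk_home is made up for this example, and it tests only that bin/java exists and is executable under the given directory.</p>

```shell
#!/usr/bin/env bash
# Return 0 if the given directory looks like a JDK home,
# meaning that bin/java under it is an executable file.
# is_jdk_home is a hypothetical helper written for this example.
is_jdk_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if is_jdk_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks valid: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no executable bin/java: ${JAVA_HOME:-unset}"
fi
```

<p>A common cause of failure is a mismatch between the directory name in /etc/profile and the directory that tar actually created under /usr/local/.</p>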
<p>&nbsp;</p>
<p>The system environment that Hadoop needs is now ready. Next comes the pseudo-distributed Hadoop cluster itself.</p>
<p>&nbsp;</p>
<h1>3. <strong>Setting Up the Pseudo-Distributed Hadoop Cluster</strong></h1>
<h2><strong>3.1 Install Hadoop</strong></h2>
<p>Download Hadoop 3.1.2 from the official website and extract the archive into /usr/local/:</p>
<div class="cnblogs_code">
<pre># <span style="color: rgba(0, 0, 255, 1)">tar</span> -zxvf hadoop-<span style="color: rgba(128, 0, 128, 1)">3.1</span>.<span style="color: rgba(128, 0, 128, 1)">2</span>.<span style="color: rgba(0, 0, 255, 1)">tar</span>.gz -C /usr/local  </pre>
</div>
<p>Add Hadoop to the environment variables:</p>
<div class="cnblogs_code">
<pre># vim /etc/profile</pre>
</div>
<p>Then append the following settings to the end of the file:</p>
<div class="cnblogs_code">
<pre>export HADOOP_HOME=/usr/local/hadoop-<span style="color: rgba(128, 0, 128, 1)">3.1</span>.2<span style="color: rgba(0, 0, 0, 1)">
export PATH</span>=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/<span style="color: rgba(0, 0, 0, 1)">sbin
export HADOOP_HDFS_HOME</span>=/usr/local/hadoop-<span style="color: rgba(128, 0, 128, 1)">3.1</span>.2<span style="color: rgba(0, 0, 0, 1)">
export HADOOP_CONF_DIR</span>=/usr/local/hadoop-<span style="color: rgba(128, 0, 128, 1)">3.1</span>.2/etc/hadoop</pre>
</div>
<p>Reload the environment variables with source /etc/profile, then run hadoop version to test whether the installation succeeded:</p>
<div class="cnblogs_code">
<pre># source /etc/<span style="color: rgba(0, 0, 0, 1)">profile
# hadoop version</span></pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190514223225416-2139757341.png" alt=""></p>
<p>&nbsp;</p>
<h2>3.2 Pseudo-distributed Hadoop configuration</h2>
<p>Hadoop keeps all of its configuration files in the $HADOOP_HOME/etc/hadoop directory. Only five files need to be changed here: hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.</p>
<p><strong>1. hadoop-env.sh</strong></p>
<p>Set the following in the file:</p>
<div class="cnblogs_code">
<pre>export JAVA_HOME=/usr/local/jdk1.<span style="color: rgba(128, 0, 128, 1)">8</span><span style="color: rgba(0, 0, 0, 1)">.0_211
export HADOOP_HOME</span>=/usr/local/hadoop-<span style="color: rgba(128, 0, 128, 1)">3.1</span>.2<span style="color: rgba(128, 0, 128, 1)"><br></span></pre>
</div>
<p>&nbsp;</p>
<p>  <strong>2. core-site.xml</strong></p>
<div class="cnblogs_code">
<pre>&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;fs.defaultFS&lt;/name&gt;
    &lt;value&gt;hdfs://localhost:9000/&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;hadoop.tmp.dir&lt;/name&gt;
    &lt;value&gt;/usr/local/hadoop/data/&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;fs.checkpoint.dir&lt;/name&gt;
    &lt;value&gt;file:///usr/local/hadoop/data/dfs/namesecondary&lt;/value&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</pre>
</div>
<p>&nbsp;</p>
<p>  <strong>3. hdfs-site.xml</strong></p>
<div class="cnblogs_code">
<pre>&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;dfs.replication&lt;/name&gt;
    &lt;value&gt;1&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;dfs.http.address&lt;/name&gt;
    &lt;value&gt;luengmingbiao:50070&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;dfs.namenode.name.dir&lt;/name&gt;
    &lt;value&gt;file:///usr/local/hadoop/data/dfs/name&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;dfs.datanode.data.dir&lt;/name&gt;
    &lt;value&gt;file:///usr/local/hadoop/data/dfs/data&lt;/value&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</pre>
</div>
<p>&nbsp;</p>
<p>  <strong>4. mapred-site.xml</strong></p>
<div class="cnblogs_code">
<pre>&lt;configuration&gt;
    &lt;property&gt;
      &lt;name&gt;mapreduce.framework.name&lt;/name&gt;
      &lt;value&gt;yarn&lt;/value&gt;
    &lt;/property&gt;
&lt;/configuration&gt;</pre>
</div>
<p>&nbsp;</p>
<p>  <strong>5.</strong>&nbsp;<strong>yarn-site.xml</strong></p>
<div class="cnblogs_code">
<pre>&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;yarn.resourcemanager.hostname&lt;/name&gt;
    &lt;value&gt;luengmingbiao&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;yarn.nodemanager.aux-services&lt;/name&gt;
    &lt;value&gt;mapreduce_shuffle&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;yarn.application.classpath&lt;/name&gt;
    &lt;value&gt;/usr/local/hadoop-3.1.2/etc/hadoop:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/*:/usr/local/hadoop-3.1.2/share/hadoop/common/*:/usr/local/hadoop-3.1.2/share/hadoop/hdfs:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/*:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/*:/usr/local/hadoop-3.1.2/share/hadoop/yarn:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.1.2/share/hadoop/yarn/*&lt;/value&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</pre>
</div>
<p><strong>Note:</strong>&nbsp;the value for "yarn.application.classpath" can be obtained by running the following command in a terminal:</p>
<div class="cnblogs_code">
<pre># hadoop classpath</pre>
</div>
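<p>To avoid copying the long classpath by hand, the property element can be printed by a small script and pasted into yarn-site.xml. This is a sketch only: render_property is a made-up helper, and the placeholder string is used when the hadoop command is not on the PATH.</p>

```shell
#!/usr/bin/env bash
# Print one Hadoop <property> element. The value is inserted as-is, so it
# must not contain XML special characters such as & or <.
# render_property is a hypothetical helper written for this example.
render_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Use the real classpath if hadoop is installed, else a placeholder string.
if command -v hadoop >/dev/null 2>&1; then
    classpath="$(hadoop classpath)"
else
    classpath="HADOOP_CLASSPATH_PLACEHOLDER"
fi
render_property yarn.application.classpath "$classpath"
```

<p>The output is meant to be pasted between the &lt;configuration&gt; tags of yarn-site.xml after a visual check.</p>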
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>Format HDFS (Hadoop Distributed File System), the distributed file system that stores the data.</p>
<div class="cnblogs_code">
<pre># hdfs namenode -format</pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515001544158-1326601174.png" alt="" width="742" height="642"></p>
<p>&nbsp;</p>
<p>Output like the screenshot above means the format succeeded.</p>
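<p>Formatting is a one-time step, and formatting a NameNode that already holds data discards its metadata. The sketch below shows one way to guard against that. The function name namenode_is_formatted is made up for this example. It relies on the formatted NameNode directory containing a current/VERSION file, and the path is the dfs.namenode.name.dir value from hdfs-site.xml above. The script only prints a message and never runs the format itself.</p>

```shell
#!/usr/bin/env bash
# Return 0 if the NameNode directory already holds formatted metadata,
# detected by the presence of current/VERSION under it.
# namenode_is_formatted is a hypothetical helper written for this example.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Path taken from dfs.namenode.name.dir in hdfs-site.xml.
name_dir="/usr/local/hadoop/data/dfs/name"
if namenode_is_formatted "$name_dir"; then
    echo "NameNode already formatted at $name_dir; skipping format"
else
    echo "No metadata found at $name_dir; run: hdfs namenode -format"
fi
```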
<p>Hadoop 3.x<span style="font-family: 宋体"> has a pitfall at startup. Without the settings described next, starting the daemons fails and prints the following error to the terminal:</span></p>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515001713471-1525255770.png" alt=""></p>
<p>&nbsp;</p>
<p>Fix (you can define only the variables named in the <strong>ERROR</strong> lines first, and define all of them if that is not enough):</p>
<div class="cnblogs_code">
<pre># vim $HADOOP_HOME/sbin/start-dfs.<span style="color: rgba(0, 0, 255, 1)">sh</span></pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515001900605-571964255.png" alt=""></p>
<p>&nbsp;</p>
<div class="cnblogs_code">
<pre># vim $HADOOP_HOME/sbin/stop-dfs.<span style="color: rgba(0, 0, 255, 1)">sh</span></pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515001938857-50166690.png" alt=""></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<div class="cnblogs_code">
<pre># vim $HADOOP_HOME/sbin/start-yarn.<span style="color: rgba(0, 0, 255, 1)">sh</span></pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515002005341-1067225797.png" alt=""></p>
<p style="text-align: center">&nbsp;</p>
<div class="cnblogs_code">
<pre># vim $HADOOP_HOME/sbin/stop-yarn.<span style="color: rgba(0, 0, 255, 1)">sh</span></pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515002045766-1056236891.png" alt=""></p>
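<p>The screenshots are not reproduced as text here. Assuming the error is the usual Hadoop 3.x message of the form "ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation." that appears when the daemons are started as root, the lines added near the top of the four scripts typically look like the sketch below. The variable names follow Hadoop's (command)_(subcommand)_USER pattern. The exact set shown in the screenshots may differ, and root is used as the value only because this article runs everything as root.</p>

```shell
# Assumed content, reconstructed from the common fix rather than from the
# screenshots. Each variable names the user allowed to run one daemon.

# For start-dfs.sh and stop-dfs.sh:
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root

# For start-yarn.sh and stop-yarn.sh:
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
```

<p>An alternative that avoids editing the scripts under sbin is to put the same export lines into $HADOOP_HOME/etc/hadoop/hadoop-env.sh, which the start and stop scripts read.</p>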
<p>&nbsp;</p>
<h2>3.3 Start Hadoop</h2>
<p><strong>1. Start HDFS</strong></p>
<div class="cnblogs_code">
<pre># hdfs --<span style="color: rgba(0, 0, 0, 1)">daemon start namenode   
# hdfs </span>--daemon start datanode   <br># hdfs --daemon start secondarynamenode</pre>
</div>
<p>or</p>
<div class="cnblogs_code">
<pre># start-dfs.<span style="color: rgba(0, 0, 255, 1)">sh</span></pre>
</div>
<p>&nbsp;</p>
<p><strong>2. Start the YARN cluster</strong></p>
<div class="cnblogs_code">
<pre># yarn --daemon<span style="color: rgba(0, 0, 0, 1)"> start resourcemanager
# yarn -</span>-daemon start nodemanager</pre>
</div>
<p>or</p>
<div class="cnblogs_code">
<pre># start-yarn.<span style="color: rgba(0, 0, 255, 1)">sh</span></pre>
</div>
<p>&nbsp;</p>
<p><strong>3. Use the jps command to check that everything started</strong></p>
<p><strong><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515002941098-935223699.png" alt=""></strong></p>
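<p>The jps output can also be checked by a script. The sketch below reads jps-style lines and reports which of the five daemons started in this section are missing. The function name missing_daemons is made up for this example, and the demo feeds it sample lines with invented PIDs instead of calling jps.</p>

```shell
#!/usr/bin/env bash
# Read jps-style output ("<pid> <name>" per line) on stdin and print every
# expected daemon that is missing. Returns 0 only if none are missing.
# missing_daemons is a hypothetical helper written for this example.
missing_daemons() {
    local expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"
    local running missing="" name
    running="$(awk '{print $2}')"
    for name in $expected; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode
        printf '%s\n' "$running" | grep -qx "$name" || missing="$missing $name"
    done
    [ -z "$missing" ] && return 0
    echo "missing:$missing"
    return 1
}

# Demo with made-up PIDs; on a live system use: jps | missing_daemons
printf '101 NameNode\n102 DataNode\n103 Jps\n' | missing_daemons || true
```

<p>When a daemon is reported missing, its log file under $HADOOP_HOME/logs is the first place to look.</p>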
<p>&nbsp;</p>
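<p><strong>4. HDFS and YARN both provide a web UI</strong></p>
<p>HDFS: http://&lt;host-ip&gt;:50070 (the port set by dfs.http.address in hdfs-site.xml above; without that setting, Hadoop 3.x serves the NameNode UI on port 9870)</p>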
<p><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515003122543-823400990.png" alt="" width="1026" height="843"></p>
<p>&nbsp;</p>
<p>YARN: http://&lt;host-ip&gt;:8088</p>
<p><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515003249669-1241266639.png" alt="" width="960" height="415"></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<h2>3.4 Test Hadoop</h2>
<p>Create a test file:</p>
<div class="cnblogs_code">
<pre># vim test.txt</pre>
</div>
<p>Then enter the following data:</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 0, 1)">hello hadoop
hello World
Hello Java
Hey </span><span style="color: rgba(0, 0, 255, 1)">man</span><span style="color: rgba(0, 0, 0, 1)">
i am a programmer</span></pre>
</div>
<p>Put the test file into a test directory in HDFS:</p>
<div class="cnblogs_code">
<pre># hdfs dfs -<span style="color: rgba(0, 0, 255, 1)">mkdir</span> hdfs:<span style="color: rgba(128, 128, 128, 1)">///</span><span style="color: rgba(0, 128, 0, 1)">hadoop</span>
# hdfs dfs -<span style="color: rgba(0, 0, 255, 1)">mkdir</span> hdfs:<span style="color: rgba(128, 128, 128, 1)">///</span><span style="color: rgba(0, 128, 0, 1)">hadoop/input</span>
# hdfs dfs -put ./test.txt hdfs:<span style="color: rgba(128, 128, 128, 1)">///</span><span style="color: rgba(0, 128, 0, 1)">hadoop/input</span></pre>
</div>
<p><span style="font-family: 宋体">Run the wordcount program that ships with Hadoop:</span></p>
<div class="cnblogs_code">
<pre># hadoop jar /usr/local/hadoop-<span style="color: rgba(128, 0, 128, 1)">3.1</span>.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-<span style="color: rgba(128, 0, 128, 1)">3.1</span>.2.jar wordcount hdfs:<span style="color: rgba(128, 128, 128, 1)">///</span><span style="color: rgba(0, 128, 0, 1)">hadoop/input hdfs:</span><span style="color: rgba(128, 128, 128, 1)">///</span><span style="color: rgba(0, 128, 0, 1)">output</span></pre>
</div>
<p><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515005110114-1614952286.png" alt=""></p>
<p>&nbsp;</p>
<p><img src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515005315742-208032031.png" alt="" width="1221" height="373"></p>
<p>&nbsp;</p>
<p><span style="font-family: 宋体">Then run</span>&nbsp;hdfs dfs -cat hdfs:///output/part-r-00000 <span style="font-family: 宋体">on the command line to view the word counts:</span></p>
<div class="cnblogs_code">
<pre># hdfs dfs -<span style="color: rgba(0, 0, 255, 1)">cat</span> hdfs:<span style="color: rgba(128, 128, 128, 1)">///</span><span style="color: rgba(0, 128, 0, 1)">output/part-r-00000</span></pre>
</div>
<p><img style="display: block; margin-left: auto; margin-right: auto" src="https://img2018.cnblogs.com/blog/1426803/201905/1426803-20190515005425694-1892050826.png" alt=""></p>
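<p>For a small input, the result can be cross-checked locally. The sketch below counts whitespace-separated, case-sensitive words in the same way the bundled wordcount example tokenizes its input, and prints word and count separated by a tab. The function name local_wordcount is made up for this example, and the sample text is the test data from this section.</p>

```shell
#!/usr/bin/env bash
# Count whitespace-separated, case-sensitive words on stdin and print
# "word<TAB>count" lines sorted in byte order.
# local_wordcount is a hypothetical helper written for this example.
local_wordcount() {
    tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}'
}

printf 'hello hadoop\nhello World\nHello Java\nHey man\ni am a programmer\n' | local_wordcount
```

<p>Since counting is case-sensitive, "hello" and "Hello" are separate entries, and "hello" is the only word in this data that occurs twice.</p>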
<p>&nbsp;</p>
<h2>With that, pseudo-distributed Hadoop is up and running.</h2>
<p>&nbsp;</p>

</div>
<div id="MySignature" role="contentinfo">
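    <p>Author: buildings<br>Notice: I have no objection to reposting or sharing. Out of respect for the cnblogs community and the author, please keep the link to the original post.<br>To readers: keeping up a blog is not easy, and writing a high-quality one is harder. I am still learning and improving, and I hope to use technology to change life together with everyone on the same path. If you found this useful, please give it a like.</p><br><br>
Source: https://www.cnblogs.com/luengmingbiao/p/10865145.html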