Python Word Cloud Visualization
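<p>I have recently seen word cloud images in quite a few WeChat official accounts, so I wanted to learn how to generate word clouds with Python. While searching Bilibili for tutorials, I found an uploader whose explanation is very good and who also published the source code on GitHub. This post is mainly a repost of that tutorial with some notes, kept for future reference.</p><p><em><strong><span style="color: rgba(51, 102, 255, 1)">Bilibili video link: <span style="color: rgba(51, 102, 255, 1)">https://www.bilibili.com/video/av53917673/?p=1</span></span></strong></em></p>
<p><em><strong><span style="color: rgba(51, 102, 255, 1)">GitHub source code: <span style="color: rgba(51, 102, 255, 1)">https://github.com/TommyZihao/zihaowordcloud</span></span></strong></em></p>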
<p> </p>
<h2>Course Overview</h2>
<p> </p>
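<p>Word clouds are an important way to visualize large bodies of text: they highlight the key phrases and words in a long passage.</p>
<p>Starting from four lines of code, this course walks you step by step through making polished word cloud images that show the core concepts behind dry text in a vivid, intuitive way. It then moves on to more advanced techniques: changing the font, font size, background color, and cloud shape, outlining borders, color gradients, coloring by category, and sentiment analysis.</p>
<p>After finishing this course, you can turn long texts such as the Four Great Classical Novels, classical poetry, current news, laws and regulations, government reports, novels, and poems into polished word clouds. You can also export the personal signatures of your WeChat friends and see what their "style" looks like.</p>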
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/7685b271a8b63f423ee43c85fdcf08263551112b/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d653661316334633862666266303437312e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="三国演艺词云" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-e6a1c4c8bfbf0471.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
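<p>From ancient cave paintings to WeChat stickers, humans have always been lazy, visual creatures. Page after page of dense text is tedious to read. In an age where "looks are everything", big data also needs good looks to show the appeal of data mining.</p>
<p>For programming beginners, this skill is a way to play with text and get started with Chinese word segmentation and sentiment analysis. For experienced programmers, the course is a chance to become more familiar with Python's open-source community, its computing ecosystem, and object-oriented programming, and to build word clouds in a style of your own.</p>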
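<p><strong>Where word clouds are used</strong></p>
<ul>
<li>Meeting minutes</li>
<li>Poster design</li>
<li>Slide decks (PPT)</li>
<li>Birthday wishes and love confessions</li>
<li>Data mining</li>
<li>Sentiment analysis</li>
<li>User profiling</li>
<li>WeChat chat history analysis</li>
<li>Weibo sentiment analysis</li>
<li>Bilibili bullet-comment sentiment analysis</li>
<li>Year-end summaries</li>
</ul>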
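<h2>Installing the third-party Python modules required for this course</h2>
<h3>One-line installation (recommended; works in 99.999% of cases)</h3>
<p>Open a command line, type the following command, and press Enter to run it.</p>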
<div class="cnblogs_code">
<pre>pip install numpy matplotlib pillow wordcloud imageio jieba snownlp itchat -i https://pypi.tuna.tsinghua.edu.cn/simple</pre>
</div>
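<h3>If installation fails with an error (happens in 0.001% of cases)</h3>
<blockquote>
<p>If you get the error: <code>Microsoft Visual C++ 14.0 is required.</code></p>
<p>Solution:</p>
<p>Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/#wordcloud, download the .whl file of the wordcloud module that matches your system, and install the downloaded file with pip.</p>
<p>For example, on a 64-bit Windows machine with Python 3.6, you should download the file</p>
<p><code>wordcloud-1.4.1-cp36-cp36m-win_amd64.whl</code></p>
<p>After downloading, open a command line, use the cd command to switch to the folder containing the file, and run <code>pip install wordcloud-1.4.1-cp36-cp36m-win_amd64.whl</code> to complete the installation.</p>
</blockquote>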
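<p>To confirm that the installation worked, a small check like the following can help. It is only a sketch using the standard library: the list of import names is an assumption derived from the pip command above (note that the pillow package is imported as <code>PIL</code>), and the helper name <code>missing_modules</code> is made up for this example.</p>

```python
import importlib.util

# Import names for the packages installed above. The pillow package is
# imported as PIL; the other import names match their package names.
REQUIRED_MODULES = ["numpy", "matplotlib", "PIL", "wordcloud",
                    "imageio", "jieba", "snownlp", "itchat"]


def missing_modules(names):
    """Return the names that Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

<p>If a name is reported as not installed, rerun the pip command for that package.</p>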
<h2>Getting started with word clouds in four lines of Python</h2>
<h3>Word cloud 1: the Gettysburg Address on a black background (four lines of code)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> wordcloud
w </span>=<span style="color: rgba(0, 0, 0, 1)"> wordcloud.WordCloud()
w.generate(</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">and that government of the people, by the people, for the people, shall not perish from the earth.</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
w.to_file(</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output1.png</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
</div>
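<p>After the script finishes, an image file named <code>output1.png</code> appears in the folder containing the code. As you can see, wordcloud automatically filters out filler words such as <code>and that by the not from</code> and displays <code>people</code>, the most frequent word, in the largest size.</p>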
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/069bb87c311933b46d0230f0b62d59bc8cc9acff/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d343638643266383933616232306462622e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="1号词云:葛底斯堡演说黑色背景词云" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-468d2f893ab20dbb.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
<h3>Reading the code line by line with Zihao</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre># Word cloud 1: the Gettysburg Address on a black background
# Bilibili column: 同济子豪兄 2019-5-23
# Import the third-party word cloud library wordcloud
import wordcloud
# Create a word cloud object and assign it to w; w now refers to a word cloud object
w = wordcloud.WordCloud()
# Call the generate method of the word cloud object and pass in the text
w.generate('and that government of the people, by the people, for the people, shall not perish from the earth.')
# Save the generated word cloud as output1.png in the current folder
w.to_file('output1.png')</pre>
</div>
</div>
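<p>The <code>wordcloud</code> library represents each word cloud as a WordCloud object (note that the W and the C are uppercase).</p>
<p>In other words, <code>wordcloud.WordCloud()</code> creates a word cloud object, which we assign to <code>w</code>.</p>
<p>Now <code>w</code> is the word cloud object, and we can call its methods.</p>
<p>We can pass various arguments inside the parentheses of <code>WordCloud()</code> to control the font, font size, text color, background color, and so on.</p>
<p>The wordcloud library splits the text into words on whitespace and punctuation, counts how often each word occurs, and draws more frequent words in a larger size.</p>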
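<p>The split-and-count idea can be sketched with the standard library alone. This is a simplified illustration, not the library's actual implementation: the stopword set <code>DEMO_STOPWORDS</code> and the helper <code>count_words</code> are made up for this example, and wordcloud itself ships a much longer stopword list and does additional processing.</p>

```python
import re
from collections import Counter

# Tiny stopword set for illustration only (an assumption of this sketch).
DEMO_STOPWORDS = {"and", "that", "of", "the", "by", "for", "shall", "not", "from"}


def count_words(text, stopwords=DEMO_STOPWORDS):
    """Split text into lowercase words and count those that are not stopwords."""
    words = re.findall(r"\w[\w']*", text.lower())
    return Counter(word for word in words if word not in stopwords)


text = ('and that government of the people, by the people, for the people, '
        'shall not perish from the earth.')
counts = count_words(text)
print(counts.most_common(1))  # the most frequent remaining word and its count
```

<p>In this sentence <code>people</code> occurs three times and every other remaining word once, which is why it is drawn largest in the image.</p>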
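<h2>Polishing the word cloud</h2>
<h3>Word cloud 2: "Facing the Sea, with Spring Blossoms" (configuring word cloud parameters)</h3>
<p>Add parameters such as width, height, font, and background color.</p>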
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre># Word cloud 2: "Facing the Sea, with Spring Blossoms"
# Bilibili column: 同济子豪兄 2019-5-23
import wordcloud
# Build the word cloud object w and set the image width, height, font, and background color
w = wordcloud.WordCloud(width=1000,height=700,background_color='white',font_path='msyh.ttc')
# Call the generate method of the word cloud object and pass in the text
w.generate('从明天起,做一个幸福的人。喂马、劈柴,周游世界。从明天起,关心粮食和蔬菜。我有一所房子,面朝大海,春暖花开')
# Save the generated word cloud as output2-poem.png in the current folder
w.to_file('output2-poem.png')</pre>
</div>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/8fd70d681b80ec5c853757a56b1d4f8818bde539/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d303938346538396234613664393438622e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="2号词云:面朝大海,春暖花开" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-0984e89b4a6d948b.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
<p>If there are many parameters, a single long line looks untidy; you can split the call across several lines to keep the code neat.</p>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre># Word cloud 2: "Facing the Sea, with Spring Blossoms"
# Bilibili column: 同济子豪兄 2019-5-23
import wordcloud
# Build the word cloud object w and set the image width, height, font, and background color
w = wordcloud.WordCloud(width=1000,
                        height=700,
                        background_color='white',
                        font_path='msyh.ttc')
w.generate('从明天起,做一个幸福的人。喂马、劈柴,周游世界。从明天起,关心粮食和蔬菜。我有一所房子,面朝大海,春暖花开')
w.to_file('output2-poem.png')</pre>
</div>
</div>
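<h3>Common parameters</h3>
<ul>
<li>
<p>width: width of the word cloud image, 400 pixels by default</p>
</li>
<li>
<p>height: height of the word cloud image, 200 pixels by default</p>
</li>
<li>
<p>background_color: background color of the image, black by default</p>
<p><code>background_color='white'</code></p>
</li>
<li>
<p>font_step: step size by which the font size changes, 1 by default</p>
</li>
<li>
<p>font_path: path to the font file, None by default; for Chinese text on Windows you can use <code>font_path='msyh.ttc'</code> (Microsoft YaHei)</p>
</li>
<li>
<p>min_font_size: smallest font size, 4 by default</p>
</li>
<li>
<p>max_font_size: largest font size; None by default, in which case it is derived from the image height</p>
</li>
<li>
<p>max_words: maximum number of words, 200 by default</p>
</li>
<li>
<p>stopwords: words that should not be displayed, for example <code>stopwords={"python","java"}</code></p>
</li>
<li>
<p>scale: 1 by default. The image is rendered at scale times the configured size, so a larger value gives a larger, sharper image</p>
</li>
<li>
<p>prefer_horizontal: float, 0.90 by default. The ratio of attempts to place a word horizontally rather than vertically; when the value is below 1, a word that does not fit horizontally is rotated to the vertical direction</p>
</li>
<li>
<p>relative_scaling: float, 0.5 by default. Controls how strongly the font size depends on word frequency: with 0, only the frequency rank matters; with 1, a word that is twice as frequent is drawn twice as large</p>
</li>
<li>
<p>mask: image that defines the shape of the word cloud; rectangular by default</p>
<p>The following code reads an external shape image (install imageio first with <code>pip install imageio</code>)</p>
</li>
</ul>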
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">picture.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
w </span>= wordcloud.WordCloud(mask=mk)</pre>
</div>
</div>
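<p>In other words, we can construct the word cloud object w as follows. Each argument is set to the default value of one of the common parameters and can be customized:</p>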
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre>w = wordcloud.WordCloud(
    width=400,
    height=200,
    background_color='black',
    font_path=None,
    font_step=1,
    min_font_size=4,
    max_font_size=None,
    max_words=200,
    stopwords=None,  # None selects the built-in English stopword list
    scale=1,
    prefer_horizontal=0.9,
    relative_scaling=0.5,
    mask=None)</pre>
</div>
</div>
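<p>The effect of <code>relative_scaling</code> is easier to see with numbers. The function below is a simplified model, assuming the size of each word is interpolated linearly between "same size as the previous word" and "proportional to the frequency ratio"; the name <code>next_font_size</code> is made up for this sketch, and the real library additionally shrinks words until they fit on the canvas.</p>

```python
def next_font_size(prev_size, freq, prev_freq, relative_scaling=0.5):
    """Estimate the font size of a word from the size of the previous, more frequent word.

    relative_scaling=0 ignores the frequency ratio (only the rank matters);
    relative_scaling=1 makes the size proportional to the frequency ratio.
    """
    if relative_scaling == 0:
        return prev_size
    ratio = freq / float(prev_freq)
    return int(round((relative_scaling * ratio + (1 - relative_scaling)) * prev_size))


# A word half as frequent as the previous one, starting from size 100:
for rs in (0, 0.5, 1):
    print(rs, next_font_size(100, freq=1, prev_freq=2, relative_scaling=rs))
```

<p>With the default of 0.5, a word that is half as frequent is drawn at three quarters of the size in this model, a compromise between the two extremes.</p>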
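<h2>Reading text from an external file</h2>
<h3>Word cloud 3: central government document on the rural revitalization strategy (sentence cloud)</h3>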
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre># Word cloud 3: central government document on the rural revitalization strategy
# Bilibili column: 同济子豪兄 2019-5-23
import wordcloud
# Read the long text from an external .txt file into the variable txt
with open('关于实施乡村振兴战略的意见.txt',encoding='utf-8') as f:
    txt = f.read()
# Build the word cloud object w and set the image width, height, font, and background color
w = wordcloud.WordCloud(width=1000,
                        height=700,
                        background_color='white',
                        font_path='msyh.ttc')
# Pass txt to the generate() method of w to supply the text for the word cloud
w.generate(txt)
# Export the word cloud image to the current folder
w.to_file('output3-sentence.png')</pre>
</div>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/97d91604ae489bddf0512e8fda30fa4a256edb5c/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d643163353238633234633662353534662e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="3号词云:乡村振兴战略中央文件" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-d1c528c24c6b554f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
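<p>Why does this produce a cloud of sentences rather than words? Chinese text has no spaces between words, so splitting on whitespace and punctuation yields whole clauses. The sketch below illustrates this with a plain regular expression; it is an approximation of the idea rather than the library's exact tokenizer, and the space-separated string is segmented by hand for illustration.</p>

```python
import re


def naive_tokens(text):
    """Keep runs of word characters; punctuation and spaces act as separators."""
    return re.findall(r"\w+", text)


raw = "从明天起,做一个幸福的人。"
spaced = "从 明天 起 , 做 一个 幸福 的 人 。"  # hand-segmented for illustration

print(naive_tokens(raw))     # whole clauses
print(naive_tokens(spaced))  # individual words
```

<p>The first call returns two clauses, the second returns eight separate words, which is why the text needs to be segmented before it is passed to <code>generate()</code>.</p>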
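<h2>Chinese word segmentation</h2>
<h3>The third-party Chinese word segmentation module <code>jieba</code></h3>
<h4>Chinese word segmentation: a first try</h4>
<p>Install the Chinese word segmentation library jieba by entering <code>pip install jieba</code> on the command line.</p>
<p>Open Python's <code>interactive shell</code>, the interface with the three greater-than signs <code>>>></code>, and enter the following commands one by one.</p>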
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre>>>> <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span>>>> textlist = jieba.lcut(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学和电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span>>>><span style="color: rgba(0, 0, 0, 1)"> textlist
[</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">和</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">]
</span>>>> string = <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(textlist)
</span>>>><span style="color: rgba(0, 0, 0, 1)"> string
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学 和 电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span></pre>
</div>
</div>
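<p>The code above turns <code>a complete Chinese string</code> into <code>a string of words separated by spaces</code>, which is the form that the <code>generate()</code> method expects when drawing a word cloud.</p>
<h4>Common methods of the Chinese word segmentation library <code>jieba</code></h4>
<p><code>Precise mode (the most common; knowing this one is enough)</code>: each character is used only once, with no redundant words. <code>jieba.lcut('动力学和电磁学')</code></p>
<p><code>Full mode</code>: extracts every word that the characters could form, so the result contains redundancy. <code>jieba.lcut('动力学和电磁学',cut_all=True)</code></p>
<p><code>Search engine mode</code>: starts from the precise-mode result and further splits long words into shorter ones, listing the shorter words before the long word they come from. <code>jieba.lcut_for_search('动力学和电磁学')</code></p>
<p>The following commands demonstrate the three segmentation modes and their results; precise mode is the most commonly used.</p>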
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre>>>> <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span>>>> textlist1 = jieba.lcut(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学和电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span>>>><span style="color: rgba(0, 0, 0, 1)"> textlist1
[</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">和</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">]
</span>>>> textlist2 = jieba.lcut(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学和电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span>,cut_all=<span style="color: rgba(0, 0, 0, 1)">True)
</span>>>><span style="color: rgba(0, 0, 0, 1)"> textlist2
[</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">力学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">和</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">电磁</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">]
</span>>>> textlist3 = jieba.lcut_for_search(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学和电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span>>>><span style="color: rgba(0, 0, 0, 1)"> textlist3
[</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">力学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">动力学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">和</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">电磁</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">电磁学</span><span style="color: rgba(128, 0, 0, 1)">'</span>]</pre>
</div>
</div>
<p>A complete script that runs all of this in one go is available as <code>test1-jieba.py</code> in the zihaowordcloud repository on GitHub.</p>
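<p>To build intuition for the difference between full mode and precise mode, here is a toy sketch based on a hand-picked dictionary. It is not jieba's algorithm (jieba uses a large dictionary with statistical models); the second function is plain greedy forward maximum matching, swapped in because it is easy to follow. <code>TOY_DICT</code>, <code>full_mode</code>, and <code>longest_match</code> are names made up for this example.</p>

```python
# Toy dictionary chosen by hand for this one sentence (an assumption of this sketch).
TOY_DICT = {"动力", "动力学", "力学", "和", "电磁", "电磁学", "磁学"}


def full_mode(text, dictionary):
    """List every dictionary word found in text, ordered by start position."""
    found = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in dictionary:
                found.append(text[start:end])
    return found


def longest_match(text, dictionary):
    """Greedy forward maximum matching: each character ends up in exactly one word."""
    max_len = max(len(word) for word in dictionary)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


sentence = "动力学和电磁学"
print(full_mode(sentence, TOY_DICT))
print(longest_match(sentence, TOY_DICT))
```

<p>For this sentence the toy functions reproduce the full-mode and precise-mode lists shown in the shell session above; on real text, jieba's results will generally differ from such a simple matcher.</p>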
<h3>Word cloud 4: introduction to Tongji University (Chinese word segmentation)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre># Word cloud 4: introduction to Tongji University
# Bilibili column: 同济子豪兄 2019-5-23
# Import the word cloud library wordcloud and the Chinese word segmentation library jieba
import jieba
import wordcloud
# Build and configure the word cloud object w
w = wordcloud.WordCloud(width=1000,
                        height=700,
                        background_color='white',
                        font_path='msyh.ttc')
# Segment the original text with jieba's lcut() method and join the words into string
txt = '同济大学(Tongji University),简称“同济”,是中华人民共和国教育部直属,由教育部、国家海洋局和上海市共建的全国重点大学,历史悠久、声誉卓著,是国家“双一流”、“211工程”、“985工程”重点建设高校,也是收生标准最严格的中国大学之一'
txtlist = jieba.lcut(txt)
string = " ".join(txtlist)
# Pass string to the generate() method of w to supply the text for the word cloud
w.generate(string)
# Export the word cloud image to the current folder
w.to_file('output4-tongji.png')</pre>
</div>
<p> </p>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/a31dc4dd854f622c36dfbfb21771d346a302dda7/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d383235333336643332613566333531612e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="4号词云:同济大学介绍词云" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-825336d32a5f351a.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
<h3>Word cloud 5: central government document on the rural revitalization strategy (word cloud)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre># Word cloud 5: central government document on the rural revitalization strategy (word cloud)
# Bilibili column: 同济子豪兄 2019-5-23
# Import the word cloud library wordcloud and the Chinese word segmentation library jieba
import jieba
import wordcloud
# Build and configure the word cloud object w
w = wordcloud.WordCloud(width=1000,
                        height=700,
                        background_color='white',
                        font_path='msyh.ttc')
# Segment the text read from the external file and join the words into string
with open('关于实施乡村振兴战略的意见.txt',encoding='utf-8') as f:
    txt = f.read()
txtlist = jieba.lcut(txt)
string = " ".join(txtlist)
# Pass string to the generate() method of w to supply the text for the word cloud
w.generate(string)
# Export the word cloud image to the current folder
w.to_file('output5-village.png')</pre>
</div>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/e850adf8756088df9a50471c33fafa4ec4f3e2d9/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d333764326266633562303438393433642e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="5号词云:乡村振兴战略中央文件(词云)" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-37d2bfc5b048943d.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
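<p>The script above passes every token to <code>generate()</code>, including function words such as 和 and 的. As far as I know, the stopword list built into wordcloud is English, so for Chinese text you may want to supply your own through the <code>stopwords</code> parameter or filter the token list before joining it. The sketch below shows the second option on a hand-written token list; <code>DEMO_STOPWORDS</code> and <code>clean_tokens</code> are hypothetical names, and the stopword set is only an example.</p>

```python
# Example stopword set (an assumption; choose your own for real text).
DEMO_STOPWORDS = {"和", "的", "是"}


def clean_tokens(tokens, stopwords=DEMO_STOPWORDS):
    """Drop empty tokens, pure punctuation, and stopwords."""
    kept = []
    for token in tokens:
        token = token.strip()
        if not token or token in stopwords:
            continue
        if not any(ch.isalnum() for ch in token):
            continue  # token consists of punctuation only
        kept.append(token)
    return kept


tokens = ["动力学", "和", "电磁学", ",", " "]  # hand-written stand-in for jieba.lcut output
string = " ".join(clean_tokens(tokens))
print(string)
```

<p>In the script above, the same filter would be applied to <code>txtlist</code> just before the <code>" ".join(...)</code> line.</p>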
<h2>Advanced word clouds: drawing a word cloud in a given shape</h2>
<p>The following code reads an external shape image (install imageio first with <code>pip install imageio</code>)</p>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">picture.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
w </span>= wordcloud.WordCloud(mask=mk)</pre>
</div>
</div>
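<p>What should the shape image look like? In a wordcloud mask, pure white pixels (value 255) are treated as masked out, and words are drawn on all other pixels. The sketch below builds a small circular mask as plain nested lists so the idea is visible without an image file. The helpers <code>make_circle_mask</code> and <code>render_with_mask</code> are made up for this example, and the second one is only defined, not called, because it needs numpy and wordcloud installed.</p>

```python
def make_circle_mask(size):
    """Return a size x size grid: 0 inside the circle (drawable), 255 outside (masked out)."""
    center = (size - 1) / 2
    radius = size / 2
    rows = []
    for y in range(size):
        row = []
        for x in range(size):
            inside = (x - center) ** 2 + (y - center) ** 2 <= radius ** 2
            row.append(0 if inside else 255)
        rows.append(row)
    return rows


def render_with_mask(text, mask_rows, out_path):
    """Draw a word cloud inside the mask. Requires numpy and wordcloud."""
    import numpy as np
    import wordcloud
    mask = np.array(mask_rows, dtype=np.uint8)
    w = wordcloud.WordCloud(background_color='white', mask=mask)
    w.generate(text)
    w.to_file(out_path)


small = make_circle_mask(5)
for row in small:
    print(row)
```

<p>For a real image you would call something like <code>render_with_mask(string, make_circle_mask(600), 'circle.png')</code>, where the output file name is arbitrary. When a mask is given, the output takes the size of the mask, and the <code>width</code> and <code>height</code> arguments are ignored.</p>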
<h3>Word cloud 6: central government document on the rural revitalization strategy (five-pointed star shape)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 6号词云:乡村振兴战略中央文件(五角星形状)</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> B站专栏:同济子豪兄 2019-5-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 导入词云制作库wordcloud和中文分词库jieba</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> wordcloud
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 导入imageio库中的imread函数,并用这个函数读取本地图片,作为词云形状图片</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">wujiaoxing.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
w </span>= wordcloud.WordCloud(mask=<span style="color: rgba(0, 0, 0, 1)">mk)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 构建并配置词云对象w,注意要加scale参数,提高清晰度</span>
w = wordcloud.WordCloud(width=1000<span style="color: rgba(0, 0, 0, 1)">,
height</span>=700<span style="color: rgba(0, 0, 0, 1)">,
background_color</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
font_path</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">msyh.ttc</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
scale</span>=15<span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 对来自外部文件的文本进行中文分词,得到string</span>
f = open(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">关于实施乡村振兴战略的意见.txt</span><span style="color: rgba(128, 0, 0, 1)">'</span>,encoding=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">utf-8</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
txt </span>=<span style="color: rgba(0, 0, 0, 1)"> f.read()
txtlist </span>=<span style="color: rgba(0, 0, 0, 1)"> jieba.lcut(txt)
string </span>= <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(txtlist)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pass string to w.generate() to feed the text into the word cloud</span>
<span style="color: rgba(0, 0, 0, 1)">w.generate(string)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Export the word cloud image to the current folder</span>
w.to_file(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output6-village.png</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
<p> </p>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/68bf3b4c033c11afb83a617703b184072d69308d/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d356530343238613939613938383435342e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="6号词云:乡村振兴战略中央文件(五角星)" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-5e0428a99a988454.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
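<p>A mask is simply an array in which pure white (255) cells are blocked and every other cell may receive words, which is how the wordcloud documentation describes its <code>mask</code> parameter. The sketch below builds such a grid for a circle in plain Python. The helper name <code>circle_mask</code> is hypothetical and not part of the course code.</p>

```python
def circle_mask(size):
    """Return a size x size grid: 0 inside the circle (drawable), 255 outside (blocked)."""
    center = (size - 1) / 2
    radius = size / 2
    rows = []
    for y in range(size):
        row = []
        for x in range(size):
            inside = (x - center) ** 2 + (y - center) ** 2 <= radius ** 2
            row.append(0 if inside else 255)
        rows.append(row)
    return rows


mask_rows = circle_mask(7)
# Print a small preview: '#' marks cells where words may be placed.
for row in mask_rows:
    print("".join("#" if value == 0 else "." for value in row))
```

<p>Assuming NumPy is installed, <code>numpy.array(circle_mask(600), dtype="uint8")</code> could be passed as <code>mask</code> in place of the array read from <code>wujiaoxing.png</code>.</p>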
<h3>Word Cloud 7: Socialism with Chinese Characteristics for a New Era (China-map shape)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Word Cloud 7: Socialism with Chinese Characteristics for a New Era (China-map shape)</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Bilibili column: 同济子豪兄 2019-5-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the word cloud library wordcloud and the Chinese word segmentation library jieba</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> wordcloud
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import imageio and use its imread function to read a local image that defines the word cloud's shape</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">chinamap.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build and configure the word cloud object w; the scale parameter raises the output resolution</span>
w = wordcloud.WordCloud(width=1000<span style="color: rgba(0, 0, 0, 1)">,
height</span>=700<span style="color: rgba(0, 0, 0, 1)">,
background_color</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
font_path</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">msyh.ttc</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
scale</span>=15<span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Read the text from an external file and segment it with jieba to build string</span>
f = open(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">新时代中国特色社会主义.txt</span><span style="color: rgba(128, 0, 0, 1)">'</span>,encoding=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">utf-8</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
txt </span>=<span style="color: rgba(0, 0, 0, 1)"> f.read()
txtlist </span>=<span style="color: rgba(0, 0, 0, 1)"> jieba.lcut(txt)
string </span>= <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(txtlist)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pass string to w.generate() to feed the text into the word cloud</span>
<span style="color: rgba(0, 0, 0, 1)">w.generate(string)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Export the word cloud image to the current folder</span>
w.to_file(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output7-chinamap.png</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
</div>
<p>Result with the scale parameter set to 15:</p>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/6a6b7f83555da7bb2450f8a084ddd22d11a14dac/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d363663373734633330313362643764632e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="7号词云:新时代中国特色社会主义(中国地图形状)" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-66c774c3013bd7dc.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
<p>Result without the scale parameter, which looks slightly blurry:</p>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/86863617950eca7a6e18da89603de6d393eb639a/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d303032653662633030383563306264302e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="中国地图词云" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-002e6bc0085c0bd0.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
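<p>The wordcloud documentation notes that when a mask is supplied, <code>width</code> and <code>height</code> are ignored in favor of the mask's shape, and <code>scale</code> then multiplies both dimensions of the drawn image. The hypothetical helper below (not part of the course code) estimates the exported size under those assumptions. The 600 x 800 mask shape is made up for illustration.</p>

```python
def output_size(width, height, scale=1, mask_shape=None):
    """Estimate the exported image size in pixels as (width, height).

    mask_shape is the (rows, columns) shape of the mask array, if one is used.
    """
    if mask_shape is not None:
        # With a mask, the canvas takes the mask's shape instead of width/height.
        height, width = mask_shape[0], mask_shape[1]
    return int(width * scale), int(height * scale)


no_mask = output_size(1000, 700, scale=15)
with_mask = output_size(1000, 700, scale=15, mask_shape=(600, 800))
print(no_mask, with_mask)

# Total pixel count of the masked export, in megapixels.
megapixels = with_mask[0] * with_mask[1] / 1_000_000
print(megapixels)
```

<p>Even a modest mask ends up above 100 megapixels at <code>scale=15</code>, so a smaller value may be a more practical starting point when export time or file size matters.</p>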
<h3>Word Cloud 8: Romance of the Three Kingdoms (removing words with the stopwords parameter)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Word Cloud 8: Romance of the Three Kingdoms (the stopwords parameter removes the two words 曹操 and 孔明)</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Bilibili column: 同济子豪兄 2019-5-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the word cloud library wordcloud and the Chinese word segmentation library jieba</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> wordcloud
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import imageio and use its imread function to read a local image that defines the word cloud's shape</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">chinamap.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build and configure the word cloud object w. Words that should not appear in the cloud go into the stopwords set; here 曹操 and 孔明 are removed</span>
w = wordcloud.WordCloud(width=1000<span style="color: rgba(0, 0, 0, 1)">,
height</span>=700<span style="color: rgba(0, 0, 0, 1)">,
background_color</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
font_path</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">msyh.ttc</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
scale</span>=15<span style="color: rgba(0, 0, 0, 1)">,
stopwords</span>={<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">曹操</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">孔明</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">})
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Read the text from an external file and segment it with jieba to build string</span>
f = open(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">threekingdoms.txt</span><span style="color: rgba(128, 0, 0, 1)">'</span>,encoding=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">utf-8</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
txt </span>=<span style="color: rgba(0, 0, 0, 1)"> f.read()
txtlist </span>=<span style="color: rgba(0, 0, 0, 1)"> jieba.lcut(txt)
string </span>= <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(txtlist)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pass string to w.generate() to feed the text into the word cloud</span>
<span style="color: rgba(0, 0, 0, 1)">w.generate(string)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Export the word cloud image to the current folder</span>
w.to_file(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output8-threekingdoms.png</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
<p> </p>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/d672fb167bfd1863e7fdeb09157c9d3c45635f7f/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d393634346234393661663264383734622e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="Word Cloud 8: Romance of the Three Kingdoms" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-9644b496af2d874b.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
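<p>Conceptually, the <code>stopwords</code> set drops the listed tokens before word frequencies are counted. The sketch below reproduces that idea in plain Python with <code>collections.Counter</code>. The helper name <code>count_words</code> and the sample token list are hypothetical, and this is not the library's internal implementation.</p>

```python
from collections import Counter


def count_words(tokens, stopwords):
    """Count tokens, skipping blank tokens and anything listed in stopwords."""
    return Counter(t for t in tokens if t.strip() and t not in stopwords)


# A made-up token list standing in for the output of jieba.lcut().
tokens = ["曹操", "孔明", "刘备", "曹操", "关羽", "刘备", " "]
freq = count_words(tokens, {"曹操", "孔明"})
print(freq.most_common())
```

<p>A frequency mapping like this can be handed to <code>generate_from_frequencies()</code> instead of <code>generate()</code>. In that case the filtering has to happen beforehand, because the wordcloud documentation states that <code>stopwords</code> is ignored on that path.</p>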
<h3>Word Cloud 9: Hamlet (outlined contour)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Word Cloud 9: Hamlet (outlined contour)</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Bilibili column: 同济子豪兄 2019-5-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the word cloud library wordcloud</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> wordcloud
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Read the text of the external file into the string variable</span>
string = open(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">hamlet.txt</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">).read()
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import imageio and use its imread function to read a local image that defines the word cloud's shape</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">alice.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build the word cloud object w; contour_width and contour_color set the width and color of the outline</span>
w = wordcloud.WordCloud(background_color=<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
contour_width</span>=1<span style="color: rgba(0, 0, 0, 1)">,
contour_color</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">steelblue</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pass string to w.generate() to feed the text into the word cloud</span>
<span style="color: rgba(0, 0, 0, 1)">w.generate(string)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Export the word cloud image to the current folder</span>
w.to_file(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output9-contour.png</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
<p> </p>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/a56ab17604695f013b7ac41b75203b30a87c11a7/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d343337323031356135663538383831322e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="Word Cloud 9: Hamlet (outlined contour)" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-4372015a5f588812.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
<h3>Word Cloud 10: Alice's Adventures in Wonderland (colored from a template image)</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Word Cloud 10: Alice's Adventures in Wonderland (colored from a template image)</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Bilibili column: 同济子豪兄 2019-5-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the plotting library matplotlib and the word cloud library wordcloud</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> matplotlib.pyplot as plt
</span><span style="color: rgba(0, 0, 255, 1)">from</span> wordcloud <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> WordCloud,ImageColorGenerator
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Read the text of the external file into the text variable</span>
text = open(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">alice.txt</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">).read()
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import imageio and use its imread function to read the local image alice_color.png, which provides both the shape and the colors</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">alice_color.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build the word cloud object wc</span>
wc = WordCloud(background_color=<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pass the text string to wc.generate() to feed the text into the word cloud</span>
<span style="color: rgba(0, 0, 0, 1)">wc.generate(text)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Use ImageColorGenerator from the wordcloud library to extract the colors of each region of the template image</span>
image_colors =<span style="color: rgba(0, 0, 0, 1)"> ImageColorGenerator(mk)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Show the default word cloud, the word cloud recolored from the template, and the template image, from left to right</span>
fig, axes = plt.subplots(1, 3<span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Left: the word cloud with its default colors</span>
<span style="color: rgba(0, 0, 0, 1)">axes[0].imshow(wc)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Middle: the word cloud recolored from the template image, displayed with bilinear interpolation</span>
axes[1].imshow(wc.recolor(color_func=image_colors), interpolation=<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">bilinear</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Right: the template image</span>
axes[2].imshow(mk, cmap=<span style="color: rgba(0, 0, 0, 1)">plt.cm.gray)
</span><span style="color: rgba(0, 0, 255, 1)">for</span> ax <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> axes:
    ax.set_axis_off()
plt.show()
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Recolor the word cloud object with the colors of the template image</span>
wc_color = wc.recolor(color_func=<span style="color: rgba(0, 0, 0, 1)">image_colors)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Export the word cloud image to the current folder</span>
wc_color.to_file(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output10-alice.png</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/da60e9b2139ab420202340fe421341ec1a26f72d/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d386638386533663934656234336233322e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="Word Cloud 10: Alice's Adventures in Wonderland (colored from a template image)" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-8f88e3f94eb43b32.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/6e2aa9ce1c4facc54c4514cb7d2056b4ac22cae2/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d626232343336623833663630336231392e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="image.png" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-bb2436b83f603b19.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/3c78d0cbb90f1d80bb3ce790f6f1cd5730d439d8/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d363235393763363063653566343432662e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="加模板图片颜色的词云" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-62597c60ce5f442f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
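<p>The recoloring step gives each word a color taken from the part of the template image it sits on. As far as I understand it, <code>ImageColorGenerator</code> does this by averaging the template pixels under the word's bounding box. The plain-Python sketch below illustrates only that averaging idea. The helper name <code>average_color</code> and the 2 x 2 sample image are hypothetical.</p>

```python
def average_color(image, top, left, height, width):
    """Return the mean (r, g, b) of a rectangular patch.

    image is a list of rows, each row a list of (r, g, b) tuples.
    """
    pixels = [image[y][x]
              for y in range(top, top + height)
              for x in range(left, left + width)]
    count = len(pixels)
    return tuple(sum(pixel[channel] for pixel in pixels) // count
                 for channel in range(3))


# A tiny made-up template: red on the top row, blue on the bottom row.
template = [
    [(255, 0, 0), (255, 0, 0)],
    [(0, 0, 255), (0, 0, 255)],
]
top_row = average_color(template, 0, 0, 1, 2)
whole = average_color(template, 0, 0, 2, 2)
print(top_row, whole)
```

<p>A word lying entirely on the red row stays red, while a word that spans both rows gets a blend. This is why templates with large, uniformly colored regions tend to give the cleanest results.</p>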
<h2>Advanced Word Clouds: Data-Driven Projects and the Open-Source Community</h2>
<h3>Word Cloud 11: Your WeChat Friends' Signatures</h3>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Word Cloud 11: your WeChat friends' signatures</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Bilibili column: 同济子豪兄 2019-05-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the WeChat library itchat and the Chinese word segmentation library jieba</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> itchat
</span><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Log in to WeChat first; a login QR code pops up</span>
<span style="color: rgba(0, 0, 0, 1)">itchat.login()
tList </span>=<span style="color: rgba(0, 0, 0, 1)"> []
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Get the friend list</span>
friends = itchat.get_friends(update=<span style="color: rgba(0, 0, 0, 1)">True)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build tList, one big list holding every friend's signature</span>
<span style="color: rgba(0, 0, 255, 1)">for</span> i <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> friends:
    </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Get this friend's signature</span>
    signature = i[<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">Signature</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">]
    </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Skip signatures that contain emoji markup</span>
    <span style="color: rgba(0, 0, 255, 1)">if</span> <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">emoji</span><span style="color: rgba(128, 0, 0, 1)">'</span> <span style="color: rgba(0, 0, 255, 1)">not</span> <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> signature:
        tList.append(signature)
text </span>= <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(tList)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Segment the signatures with jieba</span>
wordlist_jieba = jieba.lcut(text, cut_all=<span style="color: rgba(0, 0, 0, 1)">True)
wl_space_split </span>= <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(wordlist_jieba)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import imageio and use its imread function to read a local image that defines the word cloud's shape</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">chinamap.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the word cloud library wordcloud</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> wordcloud
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build and configure the word cloud object my_wordcloud; the scale parameter raises the output resolution</span>
my_wordcloud = wordcloud.WordCloud(background_color=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
width</span>=1000<span style="color: rgba(0, 0, 0, 1)">,
height</span>=700<span style="color: rgba(0, 0, 0, 1)">,
font_path</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">msyh.ttc</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
max_words</span>=2000<span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
scale</span>=20<span style="color: rgba(0, 0, 0, 1)">)
my_wordcloud.generate(wl_space_split)
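</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> The first entry of the friend list is the logged-in account itself; use its nickname in the file name</span>
nickname = friends[0][<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">NickName</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">]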
filename </span>= <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">output11-{}的微信好友个性签名词云图.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.format(nickname)
my_wordcloud.to_file(filename)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Display the word cloud image</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> matplotlib.pyplot as plt
plt.imshow(my_wordcloud)
plt.axis(</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">off</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
plt.show()
</span><span style="color: rgba(0, 0, 255, 1)">print</span>(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">程序结束</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span></pre>
</div>
<p> </p>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/6650cbba9ebcc33591da6229842a1a28d1919c98/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d663063363332373762656436653333362e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="微信好友个性签名词云" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-f0c63277bed6e336.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
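<p>The script above discards any signature that contains the substring <code>emoji</code>, which also throws away the ordinary text in that signature. A gentler alternative is to strip only the emoji markup. The sketch below assumes emoji arrive as empty <code>&lt;span class="emoji ..."&gt;&lt;/span&gt;</code> tags, which is how they appeared in the signatures I have seen but is not guaranteed. The helper name <code>clean_signature</code> and the sample strings are hypothetical.</p>

```python
import re

# Assumed emoji markup: an empty span whose class starts with "emoji".
EMOJI_SPAN = re.compile(r'<span class="emoji[^"]*"></span>')


def clean_signature(signature):
    """Remove emoji span tags and surrounding whitespace from a signature."""
    return EMOJI_SPAN.sub("", signature).strip()


samples = [
    '天天向上<span class="emoji emoji1f4aa"></span>',
    '<span class="emoji emoji2728"></span>',
    '好好学习',
]
# Keep only signatures that still contain text after cleaning.
cleaned = [text for text in map(clean_signature, samples) if text]
print(cleaned)
```

<p>The resulting list could replace <code>tList</code> before the signatures are joined and segmented.</p>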
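<h2>Sentiment Analysis and Text Mining</h2>
<h3>A First Try with snownlp, a Third-Party Python Library for Chinese Language Processing</h3>
<p>Install the Chinese text analysis library snownlp by running <code>pip install snownlp</code> on the command line.</p>
<p>Open Python's <code>interactive shell</code>, the prompt that shows three greater-than signs <code>>>></code>, and enter the following commands one by one.</p>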
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre>>>> <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> snownlp
</span>>>> word = snownlp.SnowNLP(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">中华民族伟大复兴</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span>>>> feeling =<span style="color: rgba(0, 0, 0, 1)"> word.sentiments
</span>>>><span style="color: rgba(0, 0, 0, 1)"> feeling
</span>0.9935086411278989
>>> word = snownlp.SnowNLP(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">快递慢到死,客服态度不好,退款!</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span>>>> feeling =<span style="color: rgba(0, 0, 0, 1)"> word.sentiments
</span>>>><span style="color: rgba(0, 0, 0, 1)"> feeling
</span>0.00012171645785852281</pre>
</div>
</div>
<blockquote>
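<p>snownlp's corpus consists of reviews from e-commerce sites such as Taobao, so its sentiment scores tend to be most accurate on shopping-related text.</p>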
</blockquote>
<p>A complete, ready-to-run script is available as <code>test2-snownlp.py</code> in the zihaowordcloud repository on GitHub.</p>
<h3>Word Cloud 12: Sentiment Word Clouds for The Three-Body Problem II: The Dark Forest</h3>
<div class="highlight highlight-source-python">
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Word Cloud 12: sentiment word clouds for The Three-Body Problem II: The Dark Forest</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Bilibili column: 同济子豪兄 2019-5-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the word cloud library wordcloud and the Chinese word segmentation library jieba</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> wordcloud
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import imageio and use its imread function to read a local image that defines the word cloud's shape</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">chinamap.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build and configure two word cloud objects, w1 for positive words and w2 for negative words</span>
w1 = wordcloud.WordCloud(width=1000<span style="color: rgba(0, 0, 0, 1)">,
height</span>=700<span style="color: rgba(0, 0, 0, 1)">,
background_color</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
font_path</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">msyh.ttc</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
scale</span>=15<span style="color: rgba(0, 0, 0, 1)">)
w2 </span>= wordcloud.WordCloud(width=1000<span style="color: rgba(0, 0, 0, 1)">,
height</span>=700<span style="color: rgba(0, 0, 0, 1)">,
background_color</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
font_path</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">msyh.ttc</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
scale</span>=15<span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Read the text from an external file, segment it with jieba, and prepare two lists for positive and negative words</span>
f = open(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">三体黑暗森林.txt</span><span style="color: rgba(128, 0, 0, 1)">'</span>,encoding=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">utf-8</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
txt </span>=<span style="color: rgba(0, 0, 0, 1)"> f.read()
txtlist </span>=<span style="color: rgba(0, 0, 0, 1)"> jieba.lcut(txt)
positivelist </span>=<span style="color: rgba(0, 0, 0, 1)"> []
negativelist </span>=<span style="color: rgba(0, 0, 0, 1)"> []
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Score the sentiment of every word: above 0.96 counts as positive, below 0.06 counts as negative</span>
<span style="color: rgba(0, 0, 255, 1)">print</span>(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">开始进行情感分析,请稍等,三国演义全文那么长的文本需要三分钟左右</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the third-party natural language processing library snownlp</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> snownlp
</span><span style="color: rgba(0, 0, 255, 1)">for</span> each <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> txtlist:
    each_word </span>=<span style="color: rgba(0, 0, 0, 1)"> snownlp.SnowNLP(each)
    feeling </span>=<span style="color: rgba(0, 0, 0, 1)"> each_word.sentiments
    </span><span style="color: rgba(0, 0, 255, 1)">if</span> feeling > 0.96<span style="color: rgba(0, 0, 0, 1)">:
        positivelist.append(each)
    </span><span style="color: rgba(0, 0, 255, 1)">elif</span> feeling < 0.06<span style="color: rgba(0, 0, 0, 1)">:
        negativelist.append(each)</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Join each list into a single string of positive or negative words separated by spaces</span>
positive_string = <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(positivelist)
negative_string </span>= <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(negativelist)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pass each string to the generate() method of its word cloud object</span>
<span style="color: rgba(0, 0, 0, 1)">w1.generate(positive_string)
w2.generate(negative_string)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Export the positive and negative word cloud images to the current folder</span>
w1.to_file(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output12-positive.png</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
w2.to_file(</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output12-negative.png</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 0, 255, 1)">print</span>(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">Word clouds generated</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
<p> </p>
</div>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/47ec86cae03fdc2bfd6d0b5e0fd0d369c949d721/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d656164613562633430316532316239362e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="《三体Ⅱ黑暗森林 积极词汇词云和消极词汇词云》" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-eada5bc401e21b96.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p>
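<p>The loop above sorts words into a positive and a negative list by thresholding a sentiment score between 0 and 1. The following is a minimal sketch of that bucketing step on its own. The function name <code>split_by_sentiment</code>, its parameters, and the scores in <code>fake_scores</code> are assumptions made up for illustration; in the tutorial the score comes from <code>snownlp.SnowNLP(each).sentiments</code>.</p>

```python
def split_by_sentiment(words, score, high=0.96, low=0.06):
    """Split words into (positive, negative) lists by sentiment score.

    `score` is any callable mapping a word to a float in [0, 1].
    Words whose score falls between `low` and `high` are dropped.
    """
    positive, negative = [], []
    for word in words:
        s = score(word)
        if s > high:
            positive.append(word)
        elif s < low:
            negative.append(word)
    return positive, negative


# Hypothetical scores standing in for real SnowNLP output.
fake_scores = {"胜利": 0.99, "希望": 0.97, "今天": 0.50, "毁灭": 0.02}

positive, negative = split_by_sentiment(fake_scores, fake_scores.get)
positive_string = " ".join(positive)
negative_string = " ".join(negative)
print(positive_string)
print(negative_string)
```

<p>Passing the scorer in as an argument keeps the thresholds easy to adjust: with cutoffs as strict as 0.96 and 0.06, only strongly polarized words reach either word cloud, and loosening them should let more words through.</p>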
<h2>Word cloud No. 13: characters of "Romance of the Three Kingdoms" colored by faction</h2>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Word cloud No. 13: Three Kingdoms characters colored by faction</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Bilibili column: 同济子豪兄, 2019-5-23</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import the wordcloud library and define two color-function classes</span>
<span style="color: rgba(0, 0, 255, 1)">from</span> wordcloud <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> (WordCloud, get_single_color_func)
</span><span style="color: rgba(0, 0, 255, 1)">class</span><span style="color: rgba(0, 0, 0, 1)"> SimpleGroupedColorFunc(object):
    </span><span style="color: rgba(128, 0, 0, 1)">"""</span><span style="color: rgba(128, 0, 0, 1)">Create a color function object which assigns EXACT colors
    to certain words based on the color to words mapping
    Parameters
    ----------
    color_to_words : dict(str -> list(str))
      A dictionary that maps a color to the list of words.
    default_color : str
      Color that will be assigned to a word that's not a member
      of any value from color_to_words.
    </span><span style="color: rgba(128, 0, 0, 1)">"""</span>
    <span style="color: rgba(0, 0, 255, 1)">def</span> <span style="color: rgba(128, 0, 128, 1)">__init__</span><span style="color: rgba(0, 0, 0, 1)">(self, color_to_words, default_color):
        self.word_to_color </span>=<span style="color: rgba(0, 0, 0, 1)"> {word: color
                              </span><span style="color: rgba(0, 0, 255, 1)">for</span> (color, words) <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> color_to_words.items()
                              </span><span style="color: rgba(0, 0, 255, 1)">for</span> word <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> words}
        self.default_color </span>=<span style="color: rgba(0, 0, 0, 1)"> default_color
    </span><span style="color: rgba(0, 0, 255, 1)">def</span> <span style="color: rgba(128, 0, 128, 1)">__call__</span>(self, word, **<span style="color: rgba(0, 0, 0, 1)">kwargs):
        </span><span style="color: rgba(0, 0, 255, 1)">return</span><span style="color: rgba(0, 0, 0, 1)"> self.word_to_color.get(word, self.default_color)
</span><span style="color: rgba(0, 0, 255, 1)">class</span><span style="color: rgba(0, 0, 0, 1)"> GroupedColorFunc(object):
    </span><span style="color: rgba(128, 0, 0, 1)">"""</span><span style="color: rgba(128, 0, 0, 1)">Create a color function object which assigns DIFFERENT SHADES of
    specified colors to certain words based on the color to words mapping.
    Uses wordcloud.get_single_color_func
    Parameters
    ----------
    color_to_words : dict(str -> list(str))
      A dictionary that maps a color to the list of words.
    default_color : str
      Color that will be assigned to a word that's not a member
      of any value from color_to_words.
    </span><span style="color: rgba(128, 0, 0, 1)">"""</span>
    <span style="color: rgba(0, 0, 255, 1)">def</span> <span style="color: rgba(128, 0, 128, 1)">__init__</span><span style="color: rgba(0, 0, 0, 1)">(self, color_to_words, default_color):
        self.color_func_to_words </span>=<span style="color: rgba(0, 0, 0, 1)"> [
            (get_single_color_func(color), set(words))
            </span><span style="color: rgba(0, 0, 255, 1)">for</span> (color, words) <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> color_to_words.items()]
        self.default_color_func </span>=<span style="color: rgba(0, 0, 0, 1)"> get_single_color_func(default_color)
    </span><span style="color: rgba(0, 0, 255, 1)">def</span><span style="color: rgba(0, 0, 0, 1)"> get_color_func(self, word):
        </span><span style="color: rgba(128, 0, 0, 1)">"""</span><span style="color: rgba(128, 0, 0, 1)">Returns a single_color_func associated with the word</span><span style="color: rgba(128, 0, 0, 1)">"""</span>
        <span style="color: rgba(0, 0, 255, 1)">try</span><span style="color: rgba(0, 0, 0, 1)">:
            color_func </span>=<span style="color: rgba(0, 0, 0, 1)"> next(
                color_func </span><span style="color: rgba(0, 0, 255, 1)">for</span> (color_func, words) <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> self.color_func_to_words
                </span><span style="color: rgba(0, 0, 255, 1)">if</span> word <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> words)
        </span><span style="color: rgba(0, 0, 255, 1)">except</span><span style="color: rgba(0, 0, 0, 1)"> StopIteration:
            color_func </span>=<span style="color: rgba(0, 0, 0, 1)"> self.default_color_func
        </span><span style="color: rgba(0, 0, 255, 1)">return</span><span style="color: rgba(0, 0, 0, 1)"> color_func
    </span><span style="color: rgba(0, 0, 255, 1)">def</span> <span style="color: rgba(128, 0, 128, 1)">__call__</span>(self, word, **<span style="color: rgba(0, 0, 0, 1)">kwargs):
        </span><span style="color: rgba(0, 0, 255, 1)">return</span> self.get_color_func(word)(word, **<span style="color: rgba(0, 0, 0, 1)">kwargs)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Import imageio and use its imread function to read a local image that serves as the shape mask of the word cloud</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> imageio
mk </span>= imageio.imread(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">chinamap.png</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
w </span>= WordCloud(width=1000<span style="color: rgba(0, 0, 0, 1)">,
height</span>=700<span style="color: rgba(0, 0, 0, 1)">,
background_color</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">white</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
font_path</span>=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">msyh.ttc</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
mask</span>=<span style="color: rgba(0, 0, 0, 1)">mk,
scale</span>=15<span style="color: rgba(0, 0, 0, 1)">,
max_font_size</span>=60<span style="color: rgba(0, 0, 0, 1)">,
max_words</span>=20000<span style="color: rgba(0, 0, 0, 1)">,
font_step</span>=1<span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> jieba
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Segment the Chinese text read from an external file into words and join them into string</span>
<span style="color: rgba(0, 0, 255, 1)">with</span> open(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">三国演义.txt</span><span style="color: rgba(128, 0, 0, 1)">'</span>, encoding=<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">utf-8</span><span style="color: rgba(128, 0, 0, 1)">'</span>) <span style="color: rgba(0, 0, 255, 1)">as</span><span style="color: rgba(0, 0, 0, 1)"> f:
    txt </span>=<span style="color: rgba(0, 0, 0, 1)"> f.read()
txtlist </span>=<span style="color: rgba(0, 0, 0, 1)"> jieba.lcut(txt)
string </span>= <span style="color: rgba(128, 0, 0, 1)">"</span> <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">.join(txtlist)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pass string to the generate() method of w to feed text into the word cloud</span>
<span style="color: rgba(0, 0, 0, 1)">w.generate(string)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build a dictionary that assigns a color to each faction: green for Shu, red for Wei, purple for Eastern Wu, pink for the other warlords</span>
color_to_words =<span style="color: rgba(0, 0, 0, 1)"> {
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">green</span><span style="color: rgba(128, 0, 0, 1)">'</span>: [<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">刘备</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">刘玄德</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">孔明</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">诸葛孔明</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">玄德</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">关公</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">玄德曰</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">孔明曰</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">张飞</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">赵云</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">后主</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">黄忠</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">马超</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">姜维</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">魏延</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">孟获</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">关兴</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">诸葛亮</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">云长</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">孟达</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">庞统</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">廖化</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">马岱</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">],
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">red</span><span style="color: rgba(128, 0, 0, 1)">'</span>: [<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">曹操</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">司马懿</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">夏侯</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">荀彧</span><span style="color: rgba(128, 0, 0, 1)">'</span>, <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">郭嘉</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">邓艾</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">许褚</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">,
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">徐晃</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">许诸</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">曹仁</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">司马昭</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">庞德</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">于禁</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">夏侯渊</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">曹真</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">钟会</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">],
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">purple</span><span style="color: rgba(128, 0, 0, 1)">'</span>:[<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">孙权</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">周瑜</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">东吴</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">孙策</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">吕蒙</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">陆逊</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">鲁肃</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">黄盖</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">太史慈</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">],
</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">pink</span><span style="color: rgba(128, 0, 0, 1)">'</span>:[<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">董卓</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">袁术</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">袁绍</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">吕布</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">刘璋</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">刘表</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">貂蝉</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">]
}
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Color for all other words</span>
default_color = <span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">gray</span><span style="color: rgba(128, 0, 0, 1)">'</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build the new color rule</span>
grouped_color_func =<span style="color: rgba(0, 0, 0, 1)"> GroupedColorFunc(color_to_words, default_color)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Recolor the word cloud according to the new color rule</span>
w.recolor(color_func=<span style="color: rgba(0, 0, 0, 1)">grouped_color_func)
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Export the word cloud image to the current folder</span>
w.to_file(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">output13-threekingdoms.png</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">)
</span></pre>
</div>
<p> </p>
<p><img style="max-width: 100%" src="https://camo.githubusercontent.com/297376338faefc1c191263a575564d88da9c68d5/68747470733a2f2f75706c6f61642d696d616765732e6a69616e7368752e696f2f75706c6f61645f696d616765732f31333731343434382d316632643561363035346165666165642e706e673f696d6167654d6f6772322f6175746f2d6f7269656e742f7374726970253743696d61676556696577322f322f772f31323430" alt="13号词云:《三国演义》人物阵营分色词云" data-canonical-src="https://upload-images.jianshu.io/upload_images/13714448-1f2d5a6054aefaed.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240"></p><br><br>
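<p>To make the coloring rule more concrete, here is a small standalone sketch of the idea behind <code>GroupedColorFunc</code>: look up which faction a word belongs to, then return a random shade of that faction's base color. It runs without the wordcloud library. The helper names <code>make_shade_func</code> and <code>build_color_func</code>, the shortened <code>FACTIONS</code> table, and the use of RGB tuples instead of color names are all assumptions for illustration; <code>make_shade_func</code> is only a stand-in for <code>get_single_color_func</code>, not the library's own code.</p>

```python
import colorsys
import random


def make_shade_func(rgb):
    """Return a function producing random-brightness variants of one RGB color."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, _ = colorsys.rgb_to_hsv(r, g, b)

    def shade(word=None, random_state=None, **kwargs):
        rng = random_state if random_state is not None else random.Random()
        # Keep hue and saturation fixed; vary only the brightness.
        nr, ng, nb = colorsys.hsv_to_rgb(h, s, rng.uniform(0.2, 1.0))
        return "rgb({:.0f}, {:.0f}, {:.0f})".format(nr * 255, ng * 255, nb * 255)

    return shade


# Hypothetical, shortened faction table (RGB tuples instead of color names).
FACTIONS = {
    (0, 128, 0): {"刘备", "孔明", "张飞"},   # Shu
    (255, 0, 0): {"曹操", "司马懿"},         # Wei
    (128, 0, 128): {"孙权", "周瑜"},         # Eastern Wu
}
DEFAULT_RGB = (128, 128, 128)  # gray for every other word


def build_color_func(factions, default_rgb):
    """Build a callable that maps a word to a shade of its faction's color."""
    groups = [(make_shade_func(rgb), words) for rgb, words in factions.items()]
    default = make_shade_func(default_rgb)

    def color_func(word, **kwargs):
        for func, words in groups:
            if word in words:
                return func(word, **kwargs)
        return default(word, **kwargs)

    return color_func


color_func = build_color_func(FACTIONS, DEFAULT_RGB)
print(color_func("曹操", random_state=random.Random(0)))
print(color_func("天下", random_state=random.Random(0)))
```

<p>The returned callable accepts the word plus arbitrary keyword arguments, which is the same calling shape the classes above rely on in <code>__call__(self, word, **kwargs)</code>. Words outside every faction fall back to a gray shade, so the named characters stand out against the rest of the text.</p>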
Source: https://www.cnblogs.com/wkfvawl/p/11585986.html