萧又鸣 posted on 2019-7-17 21:39:00

Computing a file's MD5 value in Python

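<p>MD5 is a common one-way hash algorithm (often loosely described as "irreversible encryption"). It is simple to use and fast to compute, and it shows up in many scenarios: naming files uploaded by users, storing user passwords in a database, checking that a downloaded file is intact, and so on. The rest of this post shows how to use MD5 in Python.</p>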
<p>&nbsp;</p>
<p><strong>1. Computing the MD5 of a string</strong></p>
<div class="cnblogs_code">
<pre>#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import hashlib

if __name__ == '__main__':
    content = "hello"
    # hashlib hashes bytes, so encode the string first (required on Python 3)
    md5hash = hashlib.md5(content.encode('utf-8'))
    md5 = md5hash.hexdigest()
    print(md5)</pre>
</div>
<p>Running the code above prints: 5d41402abc4b2a76b9719d911017c592</p>
<p>To check that this MD5 of hello is correct, compute the same string with PHP's built-in md5 function.</p>
<div class="cnblogs_code">
<pre>&lt;?php

    $content = "hello";
    $md5 = md5($content);
    var_dump($md5);    // outputs 5d41402abc4b2a76b9719d911017c592</pre>
</div>
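<p>As you can see, computing a string's MD5 in Python is just as convenient: the hashlib library is all you need. Some articles online mention that the md5 library can be used under Python 2.x; that library is not available in Python 3.x, so it is not recommended.</p>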
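<p>One detail worth spelling out: on Python 3, hashlib hashes bytes, not str, so the encoding you choose becomes part of the input. The following is a minimal sketch of that behaviour; the sample strings and variable names are only illustrative.</p>

```python
import hashlib

text = "hello"

# Python 3: a str must be encoded to bytes before hashing.
utf8_digest = hashlib.md5(text.encode("utf-8")).hexdigest()
print(utf8_digest)  # 5d41402abc4b2a76b9719d911017c592

# For non-ASCII text, the encoding changes the bytes and therefore the digest.
word = "你好"
digest_utf8 = hashlib.md5(word.encode("utf-8")).hexdigest()
digest_gbk = hashlib.md5(word.encode("gbk")).hexdigest()
print(digest_utf8 != digest_gbk)  # True

# Passing a str directly raises TypeError on Python 3.
try:
    hashlib.md5(text)
except TypeError as exc:
    print("TypeError:", exc)
```

<p>If the digest has to match one computed elsewhere (for example by PHP), both sides need to hash the same byte sequence, which in practice means agreeing on the encoding.</p>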
<p>&nbsp;</p>
<p>Computing the MD5 of a string is fairly simple. Next, let's look at how to compute the MD5 of a file.</p>
<p><strong>2. Computing the MD5 of a file</strong></p>
<div class="cnblogs_code">
<pre>#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import hashlib

if __name__ == '__main__':
    file_name = "3383430480_51_01.jpg"
    # open in binary mode so the raw bytes are hashed unchanged
    with open(file_name, 'rb') as fp:
        data = fp.read()   # reads the whole file into memory
    file_md5 = hashlib.md5(data).hexdigest()
    print(file_md5)   # ac3ee699961c58ef80a78c2434efe0d0</pre>
</div>
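<p>Computing a file's MD5 works the same way as for a string: call hashlib's md5 function on the file contents, then call hexdigest(). Again, let's verify the result with PHP code.</p>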
<div class="cnblogs_code">
<pre>&lt;?php

    $file_name = "3383430480_51_01.jpg";
    $file_md5 = md5_file($file_name);
    var_dump($file_md5);    // outputs ac3ee699961c58ef80a78c2434efe0d0</pre>
</div>
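<p>The results show that the two MD5 values match. So computing a file's MD5 is no big deal after all, I thought, quietly pleased...</p>
<p>But what about a large file, say several GB? The code above reads the entire file into memory at once, so it can run out of memory. The fix is to read the file in chunks and update the hash as you go.</p>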
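<p>The same comparison can be done inside Python, which covers the "check a downloaded file" use case. This is only a sketch: md5_matches is a hypothetical helper name, and the temporary file stands in for a real download.</p>

```python
import hashlib
import os
import tempfile


def md5_matches(path, expected_hex):
    """Hypothetical helper: True if the file's MD5 equals expected_hex."""
    with open(path, "rb") as fp:
        actual = hashlib.md5(fp.read()).hexdigest()
    # hexdigest() is lowercase, so normalise the published value first.
    return actual == expected_hex.strip().lower()


if __name__ == "__main__":
    # Stand-in for a downloaded file: a temporary file containing b"hello".
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as fp:
        fp.write(b"hello")
    try:
        print(md5_matches(path, "5D41402ABC4B2A76B9719D911017C592"))  # True
        print(md5_matches(path, "0" * 32))                            # False
    finally:
        os.remove(path)
```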
<p>&nbsp;</p>
<p><strong>3. Computing the MD5 of a large file</strong></p>
<div class="cnblogs_code">
<pre>#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import hashlib


def get_file_md5(fname):
    m = hashlib.md5()   # create the md5 object
    with open(fname, 'rb') as fobj:
        while True:
            data = fobj.read(4096)   # read up to 4096 bytes at a time
            if not data:             # empty bytes means end of file
                break
            m.update(data)   # feed the chunk into the md5 object

    return m.hexdigest()    # return the digest as a hex string


if __name__ == '__main__':
    file_name = "mongodb_us.zip"
    file_md5 = get_file_md5(file_name)
    print(file_md5)   # 0f45cdbf14de54001e82a17c3d199a4b</pre>
</div>
<p>Read the file in chunks, call the md5 object's update() method to feed each chunk into it, and finally call hexdigest() to get the MD5 value.</p>
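<p>Feeding the data through update() piece by piece gives the same digest as hashing everything in one call, whatever the chunk size. The sketch below checks this on a small temporary file; md5_in_chunks and the 64 KiB default are illustrative choices, and the loop uses iter() with a sentinel as an alternative spelling of the while loop.</p>

```python
import hashlib
import os
import tempfile
from functools import partial


def md5_in_chunks(path, chunk_size=64 * 1024):
    """Hash a file chunk by chunk; chunk_size is an illustrative default."""
    m = hashlib.md5()
    with open(path, "rb") as fobj:
        # iter(callable, sentinel) keeps calling read() until it returns b"".
        for chunk in iter(partial(fobj.read, chunk_size), b""):
            m.update(chunk)
    return m.hexdigest()


if __name__ == "__main__":
    payload = b"0123456789" * 100000  # about 1 MB of sample data
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as fp:
        fp.write(payload)
    try:
        one_shot = hashlib.md5(payload).hexdigest()
        print(md5_in_chunks(path) == one_shot)        # True
        print(md5_in_chunks(path, 4096) == one_shot)  # True
    finally:
        os.remove(path)
```

<p>The chunk size only affects how much memory is held at a time and how many read calls are made, not the result.</p>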
<p>&nbsp;</p>
<p><strong>4. Wrapping it up as a reusable library, md5.py</strong></p>
<div class="cnblogs_code">
<pre>#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import hashlib


def get_file_md5(file_name):
    """
    Compute the MD5 of a file.
    :param file_name: path of the file to hash
    :return: the MD5 digest as a hex string
    """
    m = hashlib.md5()   # create the md5 object
    with open(file_name, 'rb') as fobj:
        while True:
            data = fobj.read(4096)
            if not data:
                break
            m.update(data)   # feed the chunk into the md5 object

    return m.hexdigest()    # return the digest as a hex string


def get_str_md5(content):
    """
    Compute the MD5 of a string.
    :param content: the string to hash (encoded as UTF-8)
    :return: the MD5 digest as a hex string
    """
    m = hashlib.md5(content.encode('utf-8'))   # create the md5 object
    return m.hexdigest()</pre>
</div>
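<p>As one possible use of such a helper, here is a sketch of the "naming uploaded files" case mentioned at the start. md5_file_name is a hypothetical function, and the naming scheme (hex digest plus the lower-cased original extension) is an assumption rather than anything prescribed above.</p>

```python
import hashlib
import os
import tempfile


def md5_file_name(path, chunk_size=4096):
    """Hypothetical helper: hex MD5 of the file plus its lower-cased extension."""
    m = hashlib.md5()
    with open(path, "rb") as fobj:
        # Read in chunks so large uploads are not loaded into memory at once.
        for chunk in iter(lambda: fobj.read(chunk_size), b""):
            m.update(chunk)
    ext = os.path.splitext(path)[1].lower()
    return m.hexdigest() + ext


if __name__ == "__main__":
    # Stand-in for an uploaded file: a temporary ".JPG" containing b"hello".
    fd, path = tempfile.mkstemp(suffix=".JPG")
    with os.fdopen(fd, "wb") as fp:
        fp.write(b"hello")
    try:
        print(md5_file_name(path))  # 5d41402abc4b2a76b9719d911017c592.jpg
    finally:
        os.remove(path)
```

<p>Because the name depends only on the content, two uploads with identical bytes map to the same name.</p>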
<p>&nbsp;</p>
<p>That's it for computing MD5. If you see things differently, critiques are welcome; let's discuss. Thanks.</p>
<p>&nbsp;</p><br><br>
Source: https://www.cnblogs.com/xiaodekaixin/p/11203857.html