金勇胜 posted on 2019-8-25 22:52:00

Reading and Writing Excel with Pandas in Python

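<p>Pandas is a data analysis package for Python. It builds on a number of libraries and standard data models, and provides the tools needed to work efficiently with large datasets.<br>It offers a large set of functions and methods for processing data quickly and conveniently.</p>
<p>Official Pandas documentation: https://pandas.pydata.org/pandas-docs/stable/</p>
<p><strong>I. Installing the packages</strong></p>
<p>To handle Excel files, pandas depends on xlrd (for reading .xls files) and openpyxl (for reading and writing .xlsx files).</p>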
<div class="cnblogs_code">
<pre>pip3 install pandas
pip3 install xlrd
pip3 install openpyxl</pre>
</div>
<p><strong>II. Creating an Excel file and writing data</strong></p>
<div class="cnblogs_code">
<pre>import pandas as pd

# A DataFrame can be built from a dict like the one below, or from a NumPy ndarray
dic1 = {'标题列1': ['张三', '李四'],
        # Placeholder values: the original list for this column is missing from the source
        '标题列2': [80, 90]
       }
df = pd.DataFrame(dic1)
# index=False keeps the row index out of the sheet
df.to_excel('1.xlsx', index=False)</pre>
</div>
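<p>The comment above mentions that an ndarray works as well as a dict. The following is a minimal sketch of that alternative; the sample values and the variable names <code>arr</code>, <code>df_from_array</code> and <code>df_from_dict</code> are assumptions made for illustration, not part of the original post.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two rows, two columns (values are assumptions)
arr = np.array([["张三", 80], ["李四", 90]], dtype=object)

# With an ndarray the column titles are passed separately
df_from_array = pd.DataFrame(arr, columns=["标题列1", "标题列2"])

# The same table built from a dict, where the keys become the column titles
df_from_dict = pd.DataFrame({"标题列1": ["张三", "李四"], "标题列2": [80, 90]})

# Both routes hold the same cell values
same_values = bool((df_from_array.values == df_from_dict.values).all())
print(df_from_array)
print(same_values)
```

<p>Either DataFrame can then be passed through <code>to_excel</code> in the same way as <code>df</code> above.</p>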
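<p>The read_excel method (the signature below is the one current when this post was written; later releases changed it, for example pandas 2.0 removed squeeze and convert_float)</p>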
<div class="cnblogs_code">
<pre>pd.read_excel(io, sheet_name=0, header=0, names=None, index_col=None,
              usecols=None, squeeze=False, dtype=None, engine=None,
              converters=None, true_values=None, false_values=None,
              skiprows=None, nrows=None, na_values=None, parse_dates=False,
              date_parser=None, thousands=None, comment=None, skipfooter=0,
              convert_float=True, **kwds)</pre>
</div>
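<p>io: the Excel file (a path or a file-like object)</p>
<p>sheet_name: the sheet to return. The default index 0 returns the first sheet; a sheet name also works; a list returns several sheets; None returns all sheets. With a list or None, the result is a dict of DataFrames.</p>
<p>header: the row used as the header; a list selects several header rows</p>
<p>names: custom column names; the length must match the number of columns in the Excel sheet</p>
<p>index_col: the column to use as the index</p>
<p>usecols: the columns to read, given as a list; for example, [0, 1] selects the 1st and 2nd columns</p>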
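<p>The parameters described above can be tried on a small throwaway workbook. This is only a sketch: the file name <code>demo.xlsx</code>, the sheet names, the column names and the data are all assumptions, and <code>pd.ExcelWriter</code> is used here just to set up a two-sheet file. It assumes openpyxl is installed.</p>

```python
import os
import tempfile
import pandas as pd

# Hypothetical two-sheet workbook written to a temporary folder
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"name": ["a", "b"], "score": [1, 2], "note": ["x", "y"]}).to_excel(
        writer, sheet_name="first", index=False)
    pd.DataFrame({"name": ["c"], "score": [3], "note": ["z"]}).to_excel(
        writer, sheet_name="second", index=False)

# Defaults: first sheet, first row used as the header
first = pd.read_excel(path)

# Select a sheet by name and keep only the 1st and 2nd columns
second = pd.read_excel(path, sheet_name="second", usecols=[0, 1])

# Use the first column as the index
indexed = pd.read_excel(path, sheet_name="first", index_col=0)

# Replace the header row with custom names (one name per column)
renamed = pd.read_excel(path, header=0, names=["n", "s", "t"])

# sheet_name=None returns a dict that maps each sheet name to a DataFrame
all_sheets = pd.read_excel(path, sheet_name=None)
print(list(all_sheets))
```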
<p><strong>III. Reading an Excel file</strong></p>
<div class="cnblogs_code">
<pre>import pandas as pd

# Read the first sheet
data = pd.read_excel('1.xlsx')

# View all values
print(data.values)

# View the values of the first row
print(data.values[0])

# View all values of one column
print(data['标题列1'].values)

# Add a column
data['标题列3'] = None

# Add a row: with the default integer index, len(data) is the next free label
data.loc[len(data)] = ['王五', 100, '男']

# Delete rows: axis=0 (label 0 is only an example; the original label is missing from the source)
data = data.drop([0], axis=0)

# Delete a column: axis=1 (drop returns a new DataFrame, so the result is assigned back)
data = data.drop('标题列3', axis=1)

# Save
data.to_excel('1.xlsx', sheet_name='Sheet1', index=False, header=True)</pre>
</div>
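<p>The add and delete steps above can also be checked without touching a file. The sketch below works on an in-memory DataFrame whose sample values are assumptions; it illustrates that <code>drop</code> returns a new DataFrame and leaves the original unchanged unless the result is assigned back.</p>

```python
import pandas as pd

# Hypothetical starting table (values are assumptions)
data = pd.DataFrame({"标题列1": ["张三", "李四"], "标题列2": [80, 90]})

data["标题列3"] = None                       # add a column
data.loc[len(data)] = ["王五", 100, "男"]    # add a row at the next integer label
after_add = data.copy()

# drop returns a new DataFrame; data itself still has three columns here
dropped = data.drop("标题列3", axis=1)

# Assign the result back to make a row deletion stick, then renumber the index
data = data.drop([0], axis=0).reset_index(drop=True)

print(after_add.shape, list(dropped.columns), len(data))
```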
<p><strong>IV. Other examples</strong></p>
<p><strong>1. Batch-converting .xls files to .xlsx files</strong></p>
<div class="cnblogs_code">
<pre>import os
import pandas as pd

def xlsToXlsx(srcPath, destPath):
    """Save every .xls file found in srcPath as an .xlsx file in destPath"""
    for filename in os.listdir(srcPath):
        # Skip anything that is not an .xls file
        if not filename.endswith('.xls'):
            continue
        filePath = os.path.join(srcPath, filename)
        # "name.xls" becomes "name.xlsx"; os.path.join picks the right separator
        xlsxPath = os.path.join(destPath, filename + 'x')
        # read_excel loads only the first sheet by default; reading .xls requires xlrd
        data = pd.read_excel(filePath)
        data.to_excel(xlsxPath, index=False)

srcPath = r'D:\python\xls'
destPath = r'D:\python\xlsx'
xlsToXlsx(srcPath, destPath)</pre>
</div>
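<p>The target file name in the conversion above is built by appending "x" to the source name. A slightly more defensive variant is sketched below; the helper <code>xlsx_target</code> is hypothetical and not part of the original post. It splits off the extension with <code>os.path.splitext</code> and compares it case-insensitively, so a file such as REPORT.XLS would be picked up as well.</p>

```python
import os

def xlsx_target(src_file, dest_dir):
    """Return the .xlsx path in dest_dir for an .xls file, or None for other files."""
    stem, ext = os.path.splitext(os.path.basename(src_file))
    if ext.lower() != ".xls":
        return None
    return os.path.join(dest_dir, stem + ".xlsx")

# Example calls with made-up file names
print(xlsx_target("report.xls", "out"))
print(xlsx_target("notes.txt", "out"))
```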
<p><strong>2. Merging several Excel files of the same format into one Excel file</strong></p>
<p>For example, merging the two Excel files on the left into the single Excel file on the right:</p>
<p><strong><img src="https://img2022.cnblogs.com/blog/201408/202202/201408-20220222231054684-1108233820.png" alt="" loading="lazy"></strong></p>
<div class="cnblogs_code">
<pre>import os
import pandas as pd
import numpy as np

def excelMerge(srcPath, destPath):
    """Merge Excel files that share the same format"""
    fileList = []  # paths of all Excel files
    # Sort the names so the merge order is predictable
    for filename in sorted(os.listdir(srcPath)):
        if not filename.endswith('.xlsx'):
            continue
        fileList.append(os.path.join(srcPath, filename))

    resultData = pd.read_excel(fileList[0])  # data of the first Excel file
    colName = resultData.columns.values  # column titles

    fileList.pop(0)  # remove the first file from the list

    for filePath in fileList:
        fileData = pd.read_excel(filePath)
        resultData = np.concatenate((resultData, fileData), axis=0)  # append the rows

    df = pd.DataFrame(resultData, columns=colName)
    df.to_excel(destPath, index=False)  # save

srcPath = r'D:\python\xlsx'
destPath = r'D:\python\list.xlsx'
excelMerge(srcPath, destPath)</pre>
</div>
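<p>The merge above goes through <code>np.concatenate</code>, which turns the tables into a plain array before the column titles are reattached. As an alternative, and not the approach used in the original post, <code>pd.concat</code> can stack DataFrames directly and keeps the column titles and dtypes. The sketch below uses two small in-memory frames as stand-ins for the results of <code>pd.read_excel</code>; their contents are assumptions.</p>

```python
import pandas as pd

# Hypothetical frames standing in for pd.read_excel(...) on each file
frames = [
    pd.DataFrame({"name": ["a", "b"], "score": [1, 2]}),
    pd.DataFrame({"name": ["c"], "score": [3]}),
]

# ignore_index=True renumbers the rows 0..n-1 in the merged table
merged = pd.concat(frames, ignore_index=True)
print(merged)
```

<p>In the merge function, the list of frames could be built with one <code>pd.read_excel</code> call per file path, and <code>merged.to_excel(destPath, index=False)</code> would save the result.</p>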
Source: https://www.cnblogs.com/gdjlc/p/11409804.html