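Extracting Table Data from Word Documents with C#
<p><span style="box-sizing: border-box; margin: 0">In everyday office work and system development, table data in Word documents often has to be extracted for data import, statistical analysis, or report generation. Copying and pasting by hand is slow, while Office COM automation tends to run into version-compatibility and deployment problems. This article shows how to use C# with the Free Spire.Doc library to extract Word table content, without installing Microsoft Word, and export it as structured text files.</span></p><hr style="box-sizing: border-box; margin: 1.5em 0; border-top: 2px solid rgba(230, 232, 235, 1); border-right: none; border-bottom: none; border-left: none; border-radius: 2px">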
<h2 style="box-sizing: border-box; overflow-wrap: break-word; line-height: 1.35; margin: 0.3em 0; font-weight: 600; font-size: 1.35em; color: rgba(252, 252, 252, 1); background-color: rgba(242, 151, 24, 1); padding: 2px 12px; border-radius: 4px; display: inline-block"><span style="box-sizing: border-box; margin: 0">Tools and Environment Setup</span></h2>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">To extract Word tables, we need the following tools and components:</span></p>
<ul style="box-sizing: border-box; margin: 0 0 1em; padding-left: 2em; list-style-type: disc">
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">Development environment</strong><span style="box-sizing: border-box; margin: 0">: Visual Studio (2022, 2019, etc.) or any other C# development tool</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">Third-party library</strong><span style="box-sizing: border-box; margin: 0">: Free Spire.Doc (parses the Word document structure and handles table data)</span></li>
</ul>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">Free Spire.Doc is a free Word document processing library that can read, edit, and generate Word documents, and it is particularly convenient for working with elements such as tables and paragraphs. It can be installed through the </span><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">NuGet Package Manager</strong><span style="box-sizing: border-box; margin: 0">:</span></p>
<pre><code class="language-bash" style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5">Install-Package FreeSpire.Doc
</code></pre>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">Alternatively, right-click the project, choose “Manage NuGet Packages”, and search for </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">FreeSpire.Doc</code><span style="box-sizing: border-box; margin: 0"> to install it.</span></p>
<blockquote style="box-sizing: border-box; display: block; font-size: 0.95em; overflow: auto; border-left: 3px solid rgba(242, 151, 24, 1); padding: 12px 20px; margin: 1em 0; background: rgba(248, 249, 250, 1)">
<p style="box-sizing: border-box; margin: 0; overflow-wrap: break-word; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">⚠️ </span><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">Note</strong><span style="box-sizing: border-box; margin: 0">: the free edition supports at most 25 tables per document, which suits learning, testing, and small-scale business scenarios.</span></p>
</blockquote>
<hr style="box-sizing: border-box; margin: 1.5em 0; border-top: 2px solid rgba(230, 232, 235, 1); border-right: none; border-bottom: none; border-left: none; border-radius: 2px">
<h2 style="box-sizing: border-box; overflow-wrap: break-word; line-height: 1.35; margin: 0.3em 0; font-weight: 600; font-size: 1.35em; color: rgba(252, 252, 252, 1); background-color: rgba(242, 151, 24, 1); padding: 2px 12px; border-radius: 4px; display: inline-block"><span style="box-sizing: border-box; margin: 0">Approach to Extracting Word Tables</span></h2>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">The core idea behind extracting tables from Word is to </span><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">parse the document structure layer by layer</strong><span style="box-sizing: border-box; margin: 0">:</span></p>
<ol style="box-sizing: border-box; margin: 0 0 1em; padding-left: 2em; list-style-type: decimal">
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Load the Word document and obtain the document object</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Iterate over the document's sections (a Section is the basic structural unit of a Word document)</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Get the table collection of each section and iterate over all of its tables</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">For each table, extract the text row by row and cell by cell</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Save the extracted table data to a text file in tab-separated form</span></li>
</ol><hr style="box-sizing: border-box; margin: 1.5em 0; border-top: 2px solid rgba(230, 232, 235, 1); border-right: none; border-bottom: none; border-left: none; border-radius: 2px">
<h2 style="box-sizing: border-box; overflow-wrap: break-word; line-height: 1.35; margin: 0.3em 0; font-weight: 600; font-size: 1.35em; color: rgba(252, 252, 252, 1); background-color: rgba(242, 151, 24, 1); padding: 2px 12px; border-radius: 4px; display: inline-block"><span style="box-sizing: border-box; margin: 0">Complete Code</span></h2>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5"><span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">using</span> Spire.Doc;
<span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">using</span> Spire.Doc.Collections;
<span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">using</span> Spire.Doc.Interface;
<span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">using</span> System.IO;
<span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">using</span> System.Text;

<span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">namespace</span> <span class="hljs-title" style="color: rgba(240, 100, 49, 1); box-sizing: border-box; margin: 0">ExtractWordTable</span>
{
    <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">internal</span> <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">class</span> <span class="hljs-title" style="color: rgba(240, 100, 49, 1); box-sizing: border-box; margin: 0">Program</span>
    {
        <span class="hljs-function" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0"><span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">static</span> <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">void</span> <span class="hljs-title" style="color: rgba(240, 100, 49, 1); box-sizing: border-box; margin: 0">Main</span>(<span class="hljs-params" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0"><span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">string</span>[] <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">args</span></span>)</span>
        {
            <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Create the document object</span>
            Document doc = <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">new</span> Document();
            <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Load the Word document</span>
            doc.LoadFromFile(<span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">"表格.docx"</span>);
            <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Make sure the output folder exists; File.WriteAllText does not create it</span>
            Directory.CreateDirectory(<span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">"Tables"</span>);
            <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Iterate over all sections in the document</span>
            <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">for</span> (<span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">int</span> sectionIndex = <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">0</span>; sectionIndex &lt; doc.Sections.Count; sectionIndex++)
            {
                Section section = doc.Sections[sectionIndex];
                <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Get all tables in the current section</span>
                TableCollection tables = section.Tables;
                <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Iterate over all tables in the current section</span>
                <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">for</span> (<span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">int</span> tableIndex = <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">0</span>; tableIndex &lt; tables.Count; tableIndex++)
                {
                    ITable table = tables[tableIndex];
                    <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Holds all the data of the current table</span>
                    <span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">string</span> tableData = <span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">""</span>;
                    <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Iterate over all rows in the table</span>
                    <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">for</span> (<span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">int</span> rowIndex = <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">0</span>; rowIndex &lt; table.Rows.Count; rowIndex++)
                    {
                        TableRow row = table.Rows[rowIndex];
                        <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Iterate over all cells in the row</span>
                        <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">for</span> (<span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">int</span> cellIndex = <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">0</span>; cellIndex &lt; row.Cells.Count; cellIndex++)
                        {
                            TableCell cell = row.Cells[cellIndex];
                            <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Extract the cell text (a cell may contain several paragraphs)</span>
                            <span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">string</span> cellText = <span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">""</span>;
                            <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">for</span> (<span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">int</span> paraIndex = <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">0</span>; paraIndex &lt; cell.Paragraphs.Count; paraIndex++)
                            {
                                cellText += (cell.Paragraphs[paraIndex].Text.Trim() + <span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">" "</span>);
                            }
                            <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Append the cell text, separating cells with a tab character</span>
                            tableData += cellText.Trim();
                            <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">if</span> (cellIndex &lt; row.Cells.Count - <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">1</span>)
                            {
                                tableData += <span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">"\t"</span>;
                            }
                        }
                        <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Start a new line at the end of each row</span>
                        tableData += <span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">"\n"</span>;
                    }
                    <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Save the table data to a text file</span>
                    <span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">string</span> filePath = Path.Combine(<span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">"Tables"</span>, <span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">$"Section<span class="hljs-subst" style="box-sizing: border-box; margin: 0">{sectionIndex + <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">1</span>}</span>_Table<span class="hljs-subst" style="box-sizing: border-box; margin: 0">{tableIndex + <span class="hljs-number" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">1</span>}</span>.txt"</span>);
                    File.WriteAllText(filePath, tableData, Encoding.UTF8);
                }
            }
            doc.Close();
        }
    }
}
</code></pre>
<hr style="box-sizing: border-box; margin: 1.5em 0; border-top: 2px solid rgba(230, 232, 235, 1); border-right: none; border-bottom: none; border-left: none; border-radius: 2px">
<h2 style="box-sizing: border-box; overflow-wrap: break-word; line-height: 1.35; margin: 0.3em 0; font-weight: 600; font-size: 1.35em; color: rgba(252, 252, 252, 1); background-color: rgba(242, 151, 24, 1); padding: 2px 12px; border-radius: 4px; display: inline-block"><span style="box-sizing: border-box; margin: 0">How the Core Logic Works</span></h2>
<h3 style="box-sizing: border-box; margin: 1.2em 0 0.5em; overflow-wrap: break-word; line-height: 1.35; font-weight: 600; font-size: 1.2em; color: rgba(242, 151, 24, 1)"><span style="box-sizing: border-box; margin: 0">① Traversing the Document Structure</span></h3>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">The logical structure of a Word document is: </span><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">Document → Section → Table → Row → Cell</strong><span style="box-sizing: border-box; margin: 0">.</span></p>
<ul style="box-sizing: border-box; margin: 0 0 1em; padding-left: 2em; list-style-type: disc">
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">Section</strong><span style="box-sizing: border-box; margin: 0">: a document can have multiple sections (for example, regions with different page-number formats or headers and footers). They are available through </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">doc.Sections</code><span style="box-sizing: border-box; margin: 0">.</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">Table</strong><span style="box-sizing: border-box; margin: 0">: each section can contain multiple tables, available through </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">section.Tables</code><span style="box-sizing: border-box; margin: 0">.</span></li>
</ul>
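<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">As a minimal sketch of this hierarchy, the program below only walks the sections and tables and reports how many rows each table has. It uses the same indexer-based access and sample file name as the complete code; the class name StructureWalk is made up for illustration.</span></p>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: 'SF Mono', Consolas, 'Liberation Mono', Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5">using System;
using Spire.Doc;
using Spire.Doc.Collections;
using Spire.Doc.Interface;

// Hypothetical helper class, not part of the original article.
internal static class StructureWalk
{
    private static void Main()
    {
        Document doc = new Document();
        doc.LoadFromFile("表格.docx"); // same sample file as the complete code

        // Document -> Section
        for (int s = 0; s &lt; doc.Sections.Count; s++)
        {
            Section section = doc.Sections[s];
            TableCollection tables = section.Tables;

            // Section -> Table
            for (int t = 0; t &lt; tables.Count; t++)
            {
                ITable table = tables[t];
                // Report where the table sits and how many rows it has.
                Console.WriteLine($"Section {s + 1}, table {t + 1}: {table.Rows.Count} row(s)");
            }
        }

        doc.Close();
    }
}
</code></pre>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">Running a dry pass like this first is a cheap way to check how many tables a document really contains before any files are written.</span></p>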
<h3 style="box-sizing: border-box; margin: 1.2em 0 0.5em; overflow-wrap: break-word; line-height: 1.35; font-weight: 600; font-size: 1.2em; color: rgba(242, 151, 24, 1)"><span style="box-sizing: border-box; margin: 0">② Extracting Cell Text</span></h3>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">A </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">TableCell</code><span style="box-sizing: border-box; margin: 0"> may contain several paragraphs (</span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">Paragraph</code><span style="box-sizing: border-box; margin: 0">), each with its own formatting (bold, color, and so on). We only need the plain text:</span></p>
<ul style="box-sizing: border-box; margin: 0 0 1em; padding-left: 2em; list-style-type: disc">
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Iterate over </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">cell.Paragraphs</code><span style="box-sizing: border-box; margin: 0"> and read the Text of each paragraph.</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Use </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">Trim()</code><span style="box-sizing: border-box; margin: 0"> to strip leading and trailing whitespace from each paragraph, which avoids stray line breaks.</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Join multiple paragraphs with a space to keep the result readable.</span></li>
</ul>
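<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">The trim-and-join rule can be isolated into a small helper that works on plain strings, which makes it easy to test without a Word file. This is only a sketch: the names CellText and JoinParagraphs are hypothetical, and unlike the loop in the complete code it also skips empty paragraphs so that no double spaces appear.</span></p>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: 'SF Mono', Consolas, 'Liberation Mono', Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5">using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of the original article.
internal static class CellText
{
    // Joins the paragraph texts of one cell into a single line:
    // trims each paragraph, skips empty ones, and separates the rest with a space.
    public static string JoinParagraphs(IEnumerable&lt;string&gt; paragraphTexts)
    {
        return string.Join(" ", paragraphTexts
            .Select(text =&gt; (text ?? string.Empty).Trim())
            .Where(text =&gt; text.Length &gt; 0));
    }

    private static void Main()
    {
        // Stand-ins for the Text values of three paragraphs in one cell.
        var paragraphs = new[] { "  Unit price ", "", "(USD)\r" };
        Console.WriteLine(JoinParagraphs(paragraphs)); // prints "Unit price (USD)"
    }
}
</code></pre>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">In the extraction loop, the paragraph texts collected from cell.Paragraphs would be passed to this helper in place of the manual string concatenation.</span></p>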
<h3 style="box-sizing: border-box; margin: 1.2em 0 0.5em; overflow-wrap: break-word; line-height: 1.35; font-weight: 600; font-size: 1.2em; color: rgba(242, 151, 24, 1)"><span style="box-sizing: border-box; margin: 0">③ Saving as Text Files</span></h3>
<ul style="box-sizing: border-box; margin: 0 0 1em; padding-left: 2em; list-style-type: disc">
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Each table is saved to its own </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">.txt</code><span style="box-sizing: border-box; margin: 0"> file, and the file name contains the section index and the table index so that the files are easy to tell apart.</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Cells are separated by a </span><strong style="box-sizing: border-box; margin: 0; font-weight: 600; color: rgba(242, 151, 24, 1)">tab character <code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">\t</code></strong><span style="box-sizing: border-box; margin: 0">, and a line break is appended at the end of each row. This format can be pasted straight into Excel or read by other data analysis tools.</span></li>
</ul>
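<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">One caveat of tab-separated output is that a tab or line break inside a cell would be read back as a column or row boundary. The sketch below builds the same format with a StringBuilder and replaces those characters first. TsvWriter, ToTsv, and Sanitize are hypothetical names, and the rows are assumed to be already extracted as strings.</span></p>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: 'SF Mono', Consolas, 'Liberation Mono', Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5">using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper class, not part of the original article.
internal static class TsvWriter
{
    // Tabs and line breaks inside a cell would be read as separators,
    // so each of them is replaced with a space before joining.
    private static string Sanitize(string cell)
    {
        return (cell ?? string.Empty)
            .Replace('\t', ' ')
            .Replace('\r', ' ')
            .Replace('\n', ' ');
    }

    // One line per row, cells separated by a tab, each row ended with "\n".
    public static string ToTsv(IEnumerable&lt;IEnumerable&lt;string&gt;&gt; rows)
    {
        var builder = new StringBuilder();
        foreach (IEnumerable&lt;string&gt; row in rows)
        {
            builder.Append(string.Join("\t", row.Select(cell =&gt; Sanitize(cell))));
            builder.Append('\n');
        }
        return builder.ToString();
    }

    private static void Main()
    {
        var rows = new List&lt;string[]&gt;
        {
            new[] { "Name", "Qty" },
            new[] { "Pen\tblue", "3" }, // the embedded tab becomes a space
        };
        Console.Write(ToTsv(rows));
    }
}
</code></pre>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">Using a StringBuilder instead of repeated string concatenation also avoids reallocating the whole text for every cell, which starts to matter for large tables.</span></p>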
<hr style="box-sizing: border-box; margin: 1.5em 0; border-top: 2px solid rgba(230, 232, 235, 1); border-right: none; border-bottom: none; border-left: none; border-radius: 2px">
<h2 style="box-sizing: border-box; overflow-wrap: break-word; line-height: 1.35; margin: 0.3em 0; font-weight: 600; font-size: 1.35em; color: rgba(252, 252, 252, 1); background-color: rgba(242, 151, 24, 1); padding: 2px 12px; border-radius: 4px; display: inline-block"><span style="box-sizing: border-box; margin: 0">Practical Extensions</span></h2>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">The code in this article can be extended in the following directions:</span></p>
<h3 style="box-sizing: border-box; margin: 1.2em 0 0.5em; overflow-wrap: break-word; line-height: 1.35; font-weight: 600; font-size: 1.2em; color: rgba(242, 151, 24, 1)"><span style="box-sizing: border-box; margin: 0">1. Export to an Excel File (Using Free Spire.XLS)</span></h3>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5"><span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">using</span> Spire.Xls;
<span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Collect the table data as a two-dimensional array, then write it into a Workbook</span>
</code></pre>
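<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">A possible shape for this step is sketched below. It assumes the Free Spire.XLS package is installed and that each table has been collected into a jagged string array rather than the single tableData string used in the complete code. ExcelExport, SaveAsXlsx, and the output file name are made up for illustration, and the Spire.XLS calls should be checked against the version you install.</span></p>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: 'SF Mono', Consolas, 'Liberation Mono', Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5">using Spire.Xls;

// Hypothetical helper class, not part of the original article.
internal static class ExcelExport
{
    // rows[r][c] holds the text of row r, column c of one extracted table.
    public static void SaveAsXlsx(string[][] rows, string outputPath)
    {
        Workbook workbook = new Workbook();
        Worksheet sheet = workbook.Worksheets[0];

        for (int r = 0; r &lt; rows.Length; r++)
        {
            for (int c = 0; c &lt; rows[r].Length; c++)
            {
                // Worksheet cell coordinates are 1-based.
                sheet.Range[r + 1, c + 1].Text = rows[r][c];
            }
        }

        workbook.SaveToFile(outputPath, ExcelVersion.Version2016);
    }

    private static void Main()
    {
        // Sample data standing in for one extracted Word table.
        string[][] rows =
        {
            new[] { "Name", "Qty" },
            new[] { "Pen", "3" },
        };
        SaveAsXlsx(rows, "Section1_Table1.xlsx");
    }
}
</code></pre>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">Every value is written as text here. Numeric columns would need to be parsed and assigned as numbers if Excel formulas are meant to work on them.</span></p>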
<h3 style="box-sizing: border-box; margin: 1.2em 0 0.5em; overflow-wrap: break-word; line-height: 1.35; font-weight: 600; font-size: 1.2em; color: rgba(242, 151, 24, 1)"><span style="box-sizing: border-box; margin: 0">2. Batch-Process Multiple Word Documents</span></h3>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5"><span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">string</span>[] files = Directory.GetFiles(<span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">@"C:\Docs"</span>, <span class="hljs-string" style="color: rgba(136, 155, 74, 1); box-sizing: border-box; margin: 0">"*.docx"</span>);
<span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">foreach</span> (<span class="hljs-built_in" style="color: rgba(247, 154, 50, 1); box-sizing: border-box; margin: 0">string</span> <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">file</span> <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">in</span> files)
{
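    <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// Use a fresh Document for each file</span>
    Document doc = <span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">new</span> Document();
    doc.LoadFromFile(<span class="hljs-keyword" style="color: rgba(152, 103, 106, 1); box-sizing: border-box; margin: 0">file</span>);
    <span class="hljs-comment" style="color: rgba(165, 122, 76, 1); box-sizing: border-box; margin: 0">// ... extraction logic (include the source file name in the output path so results are not overwritten)</span>
    doc.Close();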
}
</code></pre>
<h3 style="box-sizing: border-box; margin: 1.2em 0 0.5em; overflow-wrap: break-word; line-height: 1.35; font-weight: 600; font-size: 1.2em; color: rgba(242, 151, 24, 1)"><span style="box-sizing: border-box; margin: 0">3. Text Cleaning and Formatting</span></h3>
<ul style="box-sizing: border-box; margin: 0 0 1em; padding-left: 2em; list-style-type: disc">
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Remove special characters (for example, replace </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">\r</code><span style="box-sizing: border-box; margin: 0"> and </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">\n</code><span style="box-sizing: border-box; margin: 0"> with spaces)</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Normalize number formats (such as currency symbols and percentages)</span></li>
<li style="box-sizing: border-box; margin: 0 0 0.4em; line-height: 1.7"><span style="box-sizing: border-box; margin: 0">Filter out irrelevant characters with regular expressions</span></li>
</ul>
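<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">The three ideas above could look roughly like this with the standard Regex class. CellCleaner, Clean, and TryParseNumber are hypothetical names, and the number parsing assumes a dot as the decimal separator and a comma as the thousands separator, which will not hold for every locale.</span></p>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: 'SF Mono', Consolas, 'Liberation Mono', Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5">using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original article.
internal static class CellCleaner
{
    // Replaces control characters (\r, \n, \t, ...) with spaces,
    // collapses runs of whitespace, and trims the result.
    public static string Clean(string raw)
    {
        if (string.IsNullOrEmpty(raw)) return string.Empty;
        string text = Regex.Replace(raw, @"\p{Cc}+", " ");
        return Regex.Replace(text, @"\s+", " ").Trim();
    }

    // Parses values such as "¥1,234.50" or "12.5%" into a decimal.
    // Percentages are returned as fractions (12.5% becomes 0.125).
    public static bool TryParseNumber(string cell, out decimal value)
    {
        string text = Clean(cell);
        bool isPercent = text.EndsWith("%", StringComparison.Ordinal);
        // Drop currency symbols (Unicode category Sc), commas, percent signs, and spaces.
        text = Regex.Replace(text, @"[\p{Sc},%\s]", "");
        bool ok = decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out value);
        if (ok &amp;&amp; isPercent) value /= 100m;
        return ok;
    }

    private static void Main()
    {
        Console.WriteLine(Clean("  Total\r\namount\t ")); // prints "Total amount"

        if (TryParseNumber("¥1,234.50", out decimal amount))
        {
            Console.WriteLine(amount.ToString(CultureInfo.InvariantCulture));
        }
    }
}
</code></pre>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">Cleaning is best applied per cell, before the cells are joined with tabs, so that the separators themselves are never touched.</span></p>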
<h3 style="box-sizing: border-box; margin: 1.2em 0 0.5em; overflow-wrap: break-word; line-height: 1.35; font-weight: 600; font-size: 1.2em; color: rgba(242, 151, 24, 1)"><span style="box-sizing: border-box; margin: 0">4. Import Directly into a Database</span></h3>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">Convert the extracted table data into a </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">DataTable</code><span style="box-sizing: border-box; margin: 0">, then use </span><code style="box-sizing: border-box; margin: 0; font-family: "SF Mono", Consolas, "Liberation Mono", Menlo, monospace; font-size: 0.9em; background-color: rgba(248, 249, 250, 1); color: rgba(242, 151, 24, 1); padding: 2px 4px; border-radius: 3px; word-break: break-all">SqlBulkCopy</code><span style="box-sizing: border-box; margin: 0"> to bulk-write it to SQL Server.</span></p>
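<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">A rough sketch of that pipeline follows. It assumes the first row of the tab-separated text holds unique, non-empty column names, that the Microsoft.Data.SqlClient NuGet package is referenced, and that the destination table already exists with columns in the same order. TableImport, FromTsv, and BulkInsert are hypothetical names, and the connection string and table name are left to the caller.</span></p>
<pre><code class="language-csharp" style="box-sizing: border-box; margin: 0; font-family: 'SF Mono', Consolas, 'Liberation Mono', Menlo, monospace; word-break: break-all; background-color: rgba(0, 0, 0, 0); color: inherit; padding: 0; border: none; border-radius: 0; font-size: 0.875em; line-height: 1.5">using System;
using System.Data;
using Microsoft.Data.SqlClient; // assumption: NuGet package Microsoft.Data.SqlClient

// Hypothetical helper class, not part of the original article.
internal static class TableImport
{
    // Builds a DataTable from tab-separated text whose first line is the header row.
    public static DataTable FromTsv(string tsv)
    {
        var table = new DataTable();
        string[] lines = tsv.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length == 0) return table;

        foreach (string header in lines[0].TrimEnd('\r').Split('\t'))
        {
            table.Columns.Add(header, typeof(string));
        }

        for (int i = 1; i &lt; lines.Length; i++)
        {
            string[] cells = lines[i].TrimEnd('\r').Split('\t');
            DataRow row = table.NewRow();
            // Copy only as many cells as there are columns; missing cells stay DBNull.
            for (int c = 0; c &lt; Math.Min(cells.Length, table.Columns.Count); c++)
            {
                row[c] = cells[c];
            }
            table.Rows.Add(row);
        }
        return table;
    }

    // Without explicit column mappings, SqlBulkCopy maps columns by position.
    public static void BulkInsert(DataTable table, string connectionString, string destinationTable)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = destinationTable;
            bulkCopy.WriteToServer(table);
        }
    }

    private static void Main()
    {
        DataTable table = FromTsv("Name\tQty\nPen\t3\nInk\t12\n");
        Console.WriteLine($"{table.Columns.Count} columns, {table.Rows.Count} rows");
        // BulkInsert is not called here because it needs a reachable SQL Server instance.
    }
}
</code></pre>
<p style="box-sizing: border-box; margin: 0 0 1em; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">All columns are loaded as strings in this sketch. If the destination columns are numeric or dates, convert the values before the bulk copy rather than relying on implicit conversion.</span></p>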
<hr style="box-sizing: border-box; margin: 1.5em 0; border-top: 2px solid rgba(230, 232, 235, 1); border-right: none; border-bottom: none; border-left: none; border-radius: 2px">
<h2 style="box-sizing: border-box; overflow-wrap: break-word; line-height: 1.35; margin: 0.3em 0; font-weight: 600; font-size: 1.35em; color: rgba(252, 252, 252, 1); background-color: rgba(242, 151, 24, 1); padding: 2px 12px; border-radius: 4px; display: inline-block"><span style="box-sizing: border-box; margin: 0">Summary</span></h2>
<p style="box-sizing: border-box; margin: 0; overflow-wrap: break-word; line-height: 1.8"><span style="box-sizing: border-box; margin: 0">With the approach described in this article, you can efficiently extract table data from Word documents and hand it to downstream processing. Compared with native Office Interop, which requires Word to be installed and tends to be unstable, Free Spire.Doc does not depend on an Office client, so it runs more lightly and reliably. The code logic is also clear, which makes it easy to reuse and build on.</span></p><br><br>
Source: https://www.cnblogs.com/jazz-z/p/19895710