萧萧兮 posted on 2025-7-19 15:41:00

A Deep Dive into the FDT File Deduplication Tool: An Efficient, Smart Solution for Handling Duplicate Content

<section id="nice" data-tool="mdnice编辑器" data-website="https://www.mdnice.com" style="margin: 0; padding: 0 10px; background: linear-gradient(90deg, rgba(50, 0, 0, 0.05) 0, rgba(0, 0, 0, 0) 6.76%) left top / 20px 20px repeat scroll padding-box border-box, linear-gradient(360deg, rgba(50, 0, 0, 0.05) 0, rgba(249, 247, 252, 0) 9.46%) repeat rgba(0, 0, 0, 0); width: auto; font-family: Optima, &quot;Microsoft YaHei&quot;, PingFangSC-regular, serif; font-size: 16px; color: rgba(0, 0, 0, 1); line-height: 1.5em; word-spacing: 0; letter-spacing: 0; overflow-wrap: break-word; text-align: left"><h2 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: block; flex-direction: unset; float: unset; height: auto; justify-content: unset; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 18px; color: rgba(89, 89, 89, 1); line-height: 1.8em; letter-spacing: 0; padding: 0 0 0 10px; border-top: 1px none rgba(0, 0, 0, 1); border-bottom: 1px none rgba(0, 0, 0, 1); border-left: 5px solid rgba(222, 198, 251, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; box-shadow: none; display: block; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">1. Core Value and Key Innovations</span><span class="suffix" style="display: none"></span></h2>
<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0"><strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Note:</strong> The source code is attached at the end of this post. It can also be downloaded from GitHub (乐茵安全) or from the author's CSDN page (乐茵安全).</p>
<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">FDT addresses a frequent pain point in document processing: duplicate text that appears when content from several sources is merged. Compared with manual comparison, its main innovations are:</p>
<ol data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Unified cross-format processing
• A modular design supports five common document formats (TXT/DOC/DOCX/XLS/XLSX)</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• A format-adaptation engine picks the processing mode from the file extension:</p>
</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px"><span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;ext&nbsp;==&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"docx"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;self.extract_text_from_docx(file_path)<br><span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xlsx"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;self.extract_text_from_excel(file_path)<br></code></pre>
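<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">As a rough illustration of extension-based dispatch, the same idea can be written with a lookup table instead of an if/elif chain. This is only a sketch: <code>EXTRACTORS</code>, <code>extract_text</code> and <code>extract_text_from_txt</code> are hypothetical names, not taken from the FDT source, and only the TXT branch is filled in.</p>

```python
import os


def extract_text_from_txt(path):
    """Read a plain-text file and return its lines without trailing newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return f.read().splitlines()


# Hypothetical dispatch table: extension -> extractor function.
# A fuller version would register DOCX and Excel extractors here as well.
EXTRACTORS = {
    "txt": extract_text_from_txt,
}


def extract_text(path):
    """Pick an extractor from the file extension (case-insensitive)."""
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    extractor = EXTRACTORS.get(ext)
    if extractor is None:
        raise ValueError(f"Unsupported file type: {ext!r}")
    return extractor(path)
```

<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">With a table like this, supporting another format means adding one entry rather than another branch.</p>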
<ol start="2" data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Lossless content extraction</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">DOCX reverse engineering:</strong> unzips the file and parses its XML to recover the document structure (no dependency on Office)</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Table reconstruction:</strong> uses pandas to rebuild the data structure of Excel files</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Text stream processing:</strong> reads TXT/DOC files as a stream to avoid running out of memory</p>
</section></li><li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Dynamic preview
• Shows the processing result in real time and highlights key information (table rows in purple, text rows in black)</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• A statistics panel summarizes the deduplication result:</p>
</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">✓&nbsp;Deduplication&nbsp;completed&nbsp;successfully!<br><br>&nbsp;&nbsp;&nbsp;Result&nbsp;statistics:<br>&nbsp;&nbsp;&nbsp;Original&nbsp;lines:&nbsp;384<br>&nbsp;&nbsp;&nbsp;Lines&nbsp;after&nbsp;deduplication:&nbsp;217<br>&nbsp;&nbsp;&nbsp;Duplicate&nbsp;lines&nbsp;removed:&nbsp;167<br></code></pre>
<h2 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: block; flex-direction: unset; float: unset; height: auto; justify-content: unset; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 18px; color: rgba(89, 89, 89, 1); line-height: 1.8em; letter-spacing: 0; padding: 0 0 0 10px; border-top: 1px none rgba(0, 0, 0, 1); border-bottom: 1px none rgba(0, 0, 0, 1); border-left: 5px solid rgba(222, 198, 251, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; box-shadow: none; display: block; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">2. Core Implementation in Depth</span><span class="suffix" style="display: none"></span></h2>
<ol data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Multiple content-extraction strategies</section></li></ol>
<h3 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: flex; flex-direction: unset; float: unset; height: auto; justify-content: center; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 17px; color: rgba(89, 89, 89, 1); border-bottom: 2px solid rgba(222, 198, 251, 1); line-height: 1.5em; letter-spacing: 0; padding: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); border-top: 1px none rgba(0, 0, 0, 1); border-left: 1px none rgba(0, 0, 0, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: inline; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">DOCX Processing Flow</span><span class="suffix" style="display: none"></span></h3>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">with&nbsp;zipfile.ZipFile(file_path,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'r'</span>)&nbsp;as&nbsp;zip_ref:<br>&nbsp;&nbsp;&nbsp;&nbsp;zip_ref.extractall(tmp_dir)&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;unpack&nbsp;into&nbsp;a&nbsp;temporary&nbsp;directory</span><br></code></pre>
<h3 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: flex; flex-direction: unset; float: unset; height: auto; justify-content: center; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 17px; color: rgba(89, 89, 89, 1); border-bottom: 2px solid rgba(222, 198, 251, 1); line-height: 1.5em; letter-spacing: 0; padding: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); border-top: 1px none rgba(0, 0, 0, 1); border-left: 1px none rgba(0, 0, 0, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: inline; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">Structured XML Parsing</span><span class="suffix" style="display: none"></span></h3>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">namespaces&nbsp;=&nbsp;{<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'w'</span>:&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'http://schemas.../main'</span>}<br><span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;paragraph&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;root.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:p'</span>,&nbsp;namespaces):&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;find&nbsp;every&nbsp;paragraph&nbsp;element</span><br></code></pre>
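<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">The two fragments above can be combined into one small function. The sketch below makes a few assumptions: it reads <code>word/document.xml</code> straight from the archive instead of unpacking to a temporary directory, it uses the standard WordprocessingML namespace, and the function name <code>extract_docx_paragraphs</code> is made up for illustration. FDT's real implementation may differ.</p>

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace used by .docx body XML.
NAMESPACES = {
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
}


def extract_docx_paragraphs(file_path):
    """Return the text of every non-empty paragraph in a .docx file.

    file_path may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(file_path, "r") as zip_ref:
        xml_bytes = zip_ref.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.findall(".//w:p", NAMESPACES):
        # A paragraph's visible text is split across <w:t> runs; join them.
        text = "".join(
            node.text or "" for node in paragraph.findall(".//w:t", NAMESPACES)
        )
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Reading the single member directly avoids writing the whole archive to disk, which is usually enough when only the body text is needed.</p>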
<ol start="2" data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Deduplication algorithm</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">seen&nbsp;=&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">set</span>()<br>unique_lines&nbsp;=&nbsp;[]<br><br><span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;line&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;lines:<br>&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;line.strip().lower()&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\t'</span>&nbsp;not&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;line&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>&nbsp;line.strip()<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;key&nbsp;not&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;seen:&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;hash-based&nbsp;set&nbsp;lookup</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seen.add(key)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;unique_lines.append(line)&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;keep&nbsp;the&nbsp;original&nbsp;order</span><br></code></pre>
<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Two-mode duplicate check:</strong> plain text lines are lowercased before comparison, while table rows (lines containing a tab) must match exactly</p>
<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Time complexity:</strong> O(n) on average, because each set lookup takes constant time, so inputs with millions of lines remain practical</p>
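<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">To make the two-mode rule concrete, here is the loop above wrapped in a function, together with a small helper that produces the kind of counts shown in the statistics panel. The names <code>dedupe_lines</code> and <code>summarize</code> are hypothetical and used only for this sketch.</p>

```python
def dedupe_lines(lines):
    """Remove duplicate lines while keeping first occurrences in order.

    Lines containing a tab are treated as table rows and compared exactly;
    all other lines are compared case-insensitively.
    """
    seen = set()
    unique_lines = []
    for line in lines:
        stripped = line.strip()
        key = stripped if "\t" in line else stripped.lower()
        if key not in seen:
            seen.add(key)
            unique_lines.append(line)
    return unique_lines


def summarize(lines, unique_lines):
    """Return the three counts a statistics panel would display."""
    return {
        "original": len(lines),
        "unique": len(unique_lines),
        "removed": len(lines) - len(unique_lines),
    }


sample = ["Hello", "hello ", "a\tB", "a\tb", "a\tB"]
result = dedupe_lines(sample)
stats = summarize(sample, result)
```

<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">In this sample, "hello " is dropped as a case-insensitive duplicate of "Hello", whereas "a\tB" and "a\tb" are both kept because table rows are compared exactly.</p>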
<ol start="3" data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Adaptive output engine</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px"><span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;output_ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xlsx"</span>]:&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;rebuild&nbsp;the&nbsp;Excel&nbsp;output</span><br>&nbsp;&nbsp;&nbsp;&nbsp;df&nbsp;=&nbsp;pd.DataFrame(data,&nbsp;columns=headers)<br>&nbsp;&nbsp;&nbsp;&nbsp;df.to_excel(output_file,&nbsp;engine=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'openpyxl'</span>)<br><span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;plain-text&nbsp;output</span><br>&nbsp;&nbsp;&nbsp;&nbsp;with&nbsp;open(output_file,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'w'</span>,&nbsp;encoding=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'utf-8'</span>)&nbsp;as&nbsp;f:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f.write(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\n'</span>.join(unique_lines))<br></code></pre>
<h2 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: block; flex-direction: unset; float: unset; height: auto; justify-content: unset; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 18px; color: rgba(89, 89, 89, 1); line-height: 1.8em; letter-spacing: 0; padding: 0 0 0 10px; border-top: 1px none rgba(0, 0, 0, 1); border-bottom: 1px none rgba(0, 0, 0, 1); border-left: 5px solid rgba(222, 198, 251, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; box-shadow: none; display: block; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">3. Architecture Design Essentials</span><span class="suffix" style="display: none"></span></h2>
<ol data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Layered safeguards</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Pre-checks:</strong> verify that the file exists and that its format is supported</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Operation guard:</strong> ask for confirmation before overwriting the original file</p>
</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;input_file&nbsp;==&nbsp;output_file:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;messagebox.askyesno(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Confirm&nbsp;overwrite"</span>...):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span><br></code></pre>
<ol start="2" data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Resource management</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• Temporary directories are removed automatically:</p>
</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">&nbsp;&nbsp;&nbsp;with&nbsp;tempfile.TemporaryDirectory()&nbsp;as&nbsp;tmp_dir:&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;context&nbsp;manager&nbsp;cleans&nbsp;up&nbsp;on&nbsp;exit</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;processing&nbsp;logic</span><br></code></pre>
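<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">A short sketch of how the temporary directory behaves in practice: everything extracted inside the <code>with</code> block is gone once the block exits, even if an exception is raised. The helper name <code>list_archive_members</code> is an assumption made for this example; it returns the temporary path only so that the cleanup can be observed.</p>

```python
import os
import tempfile
import zipfile


def list_archive_members(archive_path):
    """Unpack a zip archive into a temporary directory and list its files.

    Returns (sorted relative file paths, path of the temporary directory).
    The directory no longer exists by the time the function returns.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        with zipfile.ZipFile(archive_path, "r") as zip_ref:
            zip_ref.extractall(tmp_dir)

        members = []
        for folder, _dirs, files in os.walk(tmp_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                rel_path = os.path.relpath(full_path, tmp_dir)
                members.append(rel_path.replace(os.sep, "/"))
    # Leaving the with-block has already deleted tmp_dir and its contents.
    return sorted(members), tmp_dir
```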
<ol start="3" data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Error isolation</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Staged exception handling:</strong> extraction, deduplication, and save errors are handled separately</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">User-friendly messages:</strong></p>
</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">&nbsp;&nbsp;except&nbsp;Exception&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Extraction&nbsp;error"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"DOCX&nbsp;processing&nbsp;failed:\n{str(e)}"</span>)<br></code></pre>
<h2 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: block; flex-direction: unset; float: unset; height: auto; justify-content: unset; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 18px; color: rgba(89, 89, 89, 1); line-height: 1.8em; letter-spacing: 0; padding: 0 0 0 10px; border-top: 1px none rgba(0, 0, 0, 1); border-bottom: 1px none rgba(0, 0, 0, 1); border-left: 5px solid rgba(222, 198, 251, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; box-shadow: none; display: block; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">4. Performance Optimization Strategies</span><span class="suffix" style="display: none"></span></h2>
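<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">One way to picture staged exception handling, without any GUI code, is a small runner that wraps each stage in its own <code>try</code> block and reports which stage failed. This is an illustrative sketch: <code>run_pipeline</code> and its three callables are hypothetical, and a GUI version would pass the message to a dialog instead of returning it.</p>

```python
def run_pipeline(extract, dedupe, save):
    """Run extract -> dedupe -> save, isolating errors per stage.

    Returns (ok, message); on failure the message names the failing stage.
    """
    try:
        lines = extract()
    except Exception as e:
        return False, f"Extraction error: {e}"

    try:
        unique_lines = dedupe(lines)
    except Exception as e:
        return False, f"Deduplication error: {e}"

    try:
        save(unique_lines)
    except Exception as e:
        return False, f"Save error: {e}"

    removed = len(lines) - len(unique_lines)
    return True, f"Removed {removed} duplicate line(s)"
```

<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Because each stage has its own handler, a failed save is never reported as an extraction problem, and the user can tell whether the input or the output path needs attention.</p>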
<ol data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Lazy loading
• Large libraries (pandas/xlrd/openpyxl) are imported only when they are needed</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">def&nbsp;extract_text_from_excel(self,&nbsp;file_path):<br>&nbsp;&nbsp;&nbsp;&nbsp;import&nbsp;pandas&nbsp;as&nbsp;pd&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;imported&nbsp;at&nbsp;call&nbsp;time</span><br></code></pre>
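<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">A function-level import also makes it easy to give a clear message when an optional dependency is missing. The helper below is a sketch under that assumption; <code>load_optional</code> is a made-up name and is not part of FDT.</p>

```python
import importlib


def load_optional(module_name, purpose):
    """Import a module on first use, with a readable error if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{purpose} requires the '{module_name}' package"
        ) from exc
```

<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">An Excel extractor could then call <code>pd = load_optional("pandas", "Excel support")</code> as its first statement, so users who only process TXT files never pay the import cost.</p>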
<ol start="2" data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Streaming pipeline</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• Reads in chunks instead of loading the whole file at once</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• <strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Live progress feedback:</strong></p>
</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"正在处理&nbsp;{ext.upper()}&nbsp;文件..."</span>)<br>&nbsp;&nbsp;&nbsp;self.root.update()&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;refresh&nbsp;the&nbsp;GUI&nbsp;immediately</span><br></code></pre>
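<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Chunked reading and live progress updates can be sketched together. This is a minimal illustration under stated assumptions, not FDT's actual code: <code>read_in_chunks</code>, <code>chunk_lines</code>, and <code>on_progress</code> are hypothetical names introduced here.</p>

```python
import io
from typing import Callable, Iterable, Iterator, List, Optional


def read_in_chunks(
    stream: Iterable[str],
    chunk_lines: int = 1000,
    on_progress: Optional[Callable[[int], None]] = None,
) -> Iterator[List[str]]:
    """Yield lists of at most `chunk_lines` lines instead of loading everything.

    `on_progress` (hypothetical hook) receives the running line count after
    each chunk, so a GUI can update its status text.
    """
    chunk: List[str] = []
    total = 0
    for line in stream:
        chunk.append(line.rstrip("\n"))
        if len(chunk) >= chunk_lines:
            total += len(chunk)
            if on_progress is not None:
                on_progress(total)
            yield chunk
            chunk = []
    if chunk:  # flush the final, possibly shorter chunk
        total += len(chunk)
        if on_progress is not None:
            on_progress(total)
        yield chunk


# Usage with an in-memory stream; a real file object works the same way.
progress_log: List[int] = []
demo_chunks = list(read_in_chunks(io.StringIO("a\nb\nc\nd\ne\n"), 2, progress_log.append))
```

<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">In a Tkinter program the callback could set a status variable and then call <code>update_idletasks()</code>, which redraws the window without handling new user input. Whether that suits FDT better than <code>update()</code> is a judgment call.</p>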
<ol start="3" data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal"><p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Memory-saving techniques</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• Generators instead of cached lists</p>
<p style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">• Comparing string hashes instead of storing the full text</p>
</section></li></ol>
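<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">The generator-plus-hash idea can be sketched as follows. This is an illustrative sketch, not FDT's implementation: the function name <code>unique_lines</code>, the choice of BLAKE2b with a 16-byte digest, and the whitespace stripping before hashing are all assumptions made for the example.</p>

```python
import hashlib
from typing import Iterable, Iterator


def unique_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each line the first time its content is seen.

    Only fixed-size digests are kept in memory, not the lines themselves,
    and results are produced lazily through a generator.
    """
    seen = set()
    for line in lines:
        # Assumption: lines differing only in surrounding whitespace are duplicates.
        normalized = line.strip().encode("utf-8")
        digest = hashlib.blake2b(normalized, digest_size=16).digest()
        if digest in seen:
            continue
        seen.add(digest)
        yield line


deduplicated = list(unique_lines(["a", "b", "a ", "c", "b"]))
```

<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">Hash comparison trades a very small collision risk for a fixed memory cost per distinct line, which matters most when individual lines are long.</p>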
<h2 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: block; flex-direction: unset; float: unset; height: auto; justify-content: unset; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 18px; color: rgba(89, 89, 89, 1); line-height: 1.8em; letter-spacing: 0; padding: 0 0 0 10px; border-top: 1px none rgba(0, 0, 0, 1); border-bottom: 1px none rgba(0, 0, 0, 1); border-left: 5px solid rgba(222, 198, 251, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; box-shadow: none; display: block; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">5. Benchmarks in Real Use Cases</span><span class="suffix" style="display: none"></span></h2>
<section class="table-container" data-tool="mdnice编辑器" style="margin: 0; padding: 0; overflow-x: auto"><table style="display: table; text-align: left">
<thead>
<tr>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">Document type tested</th>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">Time to process 100,000 lines</th>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">Duplicate rate</th>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">Peak memory</th>
</tr>
</thead>
<tbody style="font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: normal; border: 0; border-image: initial">
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(255, 255, 255, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Contract text (DOCX)</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">12.3s</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">41%</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">85MB</td>
</tr>
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(248, 248, 248, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Sales report (XLSX)</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">8.7s</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">63%</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">110MB</td>
</tr>
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(255, 255, 255, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Log file (TXT)</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">3.2s</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">28%</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">45MB</td>
</tr>
</tbody>
</table>
</section><p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0"><strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Test environment:</strong> i5-1135G7 / 16 GB DDR4 / Windows 11</p>
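<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">For readers who want to take similar measurements on their own files, here is one possible way to time a run and record peak memory. The article does not say how its figures were collected, so this is only a sketch: <code>measure</code> is a hypothetical helper, and <code>tracemalloc</code> counts Python-level allocations only, so its numbers will usually be lower than what the operating system reports for the whole process.</p>

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple


def measure(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float, float]:
    """Run `func` and return (result, elapsed seconds, peak traced memory in MiB)."""
    tracemalloc.start()
    started = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - started
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()  # always stop tracing, even if func raised
    return result, elapsed, peak_bytes / (1024 * 1024)


# Usage: any callable works; sorted() stands in for a deduplication run.
sorted_result, seconds, peak_mib = measure(sorted, [3, 1, 2])
```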
<h2 data-tool="mdnice编辑器" style="margin: 30px 0 15px; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; border: 1px none rgba(0, 0, 0, 1); border-radius: 0; box-shadow: none; display: block; flex-direction: unset; float: unset; height: auto; justify-content: unset; line-height: 1.5em; overflow-x: unset; overflow-y: unset; padding: 0; position: relative; text-align: left; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset"><span class="prefix" style="display: none"></span><span class="content" style="font-size: 18px; color: rgba(89, 89, 89, 1); line-height: 1.8em; letter-spacing: 0; padding: 0 0 0 10px; border-top: 1px none rgba(0, 0, 0, 1); border-bottom: 1px none rgba(0, 0, 0, 1); border-left: 5px solid rgba(222, 198, 251, 1); border-right: 1px none rgba(0, 0, 0, 1); border-radius: 0; align-items: unset; background: none left top / auto no-repeat scroll padding-box border-box unset; box-shadow: none; display: block; font-weight: bold; flex-direction: unset; float: unset; height: auto; justify-content: unset; margin: 0; overflow-x: unset; overflow-y: unset; position: relative; text-align: left; text-indent: 0; text-shadow: none; transform: none; width: auto; -webkit-box-reflect: unset">6. Comparison with Similar Tools</span><span class="suffix" style="display: none"></span></h2>
<section class="table-container" data-tool="mdnice编辑器" style="margin: 0; padding: 0; overflow-x: auto"><table style="display: table; text-align: left">
<thead>
<tr>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">Feature</th>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">FDT tool</th>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">Office built-in features</th>
<th style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(240, 240, 240, 1); width: auto; height: auto; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0; padding: 5px 10px; min-width: 85px">Online deduplication sites</th>
</tr>
</thead>
<tbody style="font-size: 14px; line-height: 1.5em; letter-spacing: 0; text-align: left; font-weight: normal; border: 0; border-image: initial">
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(255, 255, 255, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Format support</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">★★★★☆</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">★★☆☆☆</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">★★★☆☆</td>
</tr>
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(248, 248, 248, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Processing scale</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Millions of lines</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Tens of thousands of lines</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Thousands of lines</td>
</tr>
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(255, 255, 255, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Local privacy protection</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">✔️</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">✔️</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">✘</td>
</tr>
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(248, 248, 248, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Table handling</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Intelligent reconstruction</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Basic merging</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Flattened to text</td>
</tr>
<tr style="color: rgba(89, 89, 89, 1); background: none left top / auto no-repeat scroll padding-box border-box rgba(255, 255, 255, 1); width: auto; height: auto">
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Extensibility</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Python</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">VBA macros</td>
<td style="padding: 5px 10px; min-width: 85px; border: 1px solid rgba(204, 204, 204, 0.4); border-radius: 0">Limited by API</td>
</tr>
</tbody>
</table>
</section><p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0"><strong style="color: rgba(145, 109, 213, 1); font-weight: bold; background: none left top / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0); width: auto; height: auto; margin: 0; padding: 0; border: 3px none rgba(0, 0, 0, 0.4); border-radius: 0">Summary:</strong> a smarter approach to document deduplication</p>
<p data-tool="mdnice编辑器" style="color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0.02em; text-align: left; text-indent: 0; margin: 0; padding: 8px 0">FDT improves the document-processing workflow through four design choices:</p>
<ol data-tool="mdnice编辑器" style="list-style-type: decimal; margin: 8px 0; padding: 0 0 0 25px; color: rgba(0, 0, 0, 1)">
<li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Format-agnostic processing - avoids the information loss that comes with converting documents</section></li><li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Structure preservation - table and paragraph structure is retained</section></li><li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Zero-configuration startup - environment dependencies are resolved automatically</section></li><li><section style="margin-top: 5px; margin-bottom: 5px; color: rgba(89, 89, 89, 1); font-size: 14px; line-height: 1.8em; letter-spacing: 0; text-align: left; font-weight: normal">Secure closed-loop processing - a fully controlled path from input to output</section></li></ol>
<pre class="custom" data-tool="mdnice编辑器" style="border-radius: 5px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.55); text-align: left; margin: 10px 0; padding: 0"><span style="display: block; background: url(&quot;https://files.mdnice.com/user/3441/876cad08-0422-409d-bb5a-08afec5da8ee.svg&quot;) 10px 10px / 40px no-repeat rgba(40, 44, 52, 1); height: 30px; width: 100%; margin-bottom: -7px; border-radius: 5px"></span><code class="hljs" style="overflow-x: auto; padding: 15px 16px 16px; color: rgba(171, 178, 191, 1); background: rgba(40, 44, 52, 1); border-radius: 5px; font-family: Consolas, Monaco, Menlo, monospace; font-size: 12px">import&nbsp;sys<br>import&nbsp;subprocess<br><br><br>def&nbsp;install(package):<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"""Install&nbsp;a&nbsp;missing&nbsp;package&nbsp;with&nbsp;pip."""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;split()&nbsp;lets&nbsp;callers&nbsp;pass&nbsp;pip&nbsp;flags&nbsp;in&nbsp;the&nbsp;same&nbsp;string,&nbsp;e.g.&nbsp;"--upgrade&nbsp;numpy"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;subprocess.check_call([sys.executable,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"-m"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"pip"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"install"</span>,&nbsp;*package.split()])<br><br><br><span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">print</span>(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"正在检查并安装必要的依赖库..."</span>)<br>try:<br>&nbsp;&nbsp;&nbsp;&nbsp;import&nbsp;numpy&nbsp;as&nbsp;np<br>except&nbsp;ImportError:<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">print</span>(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"NumPy&nbsp;未安装,正在安装..."</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;install(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"numpy"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;import&nbsp;numpy&nbsp;as&nbsp;np<br><br>try:<br>&nbsp;&nbsp;&nbsp;&nbsp;import&nbsp;pandas&nbsp;as&nbsp;pd<br>except&nbsp;(ImportError,&nbsp;ValueError)&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;<span 
class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"numpy.dtype&nbsp;size&nbsp;changed"</span>&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;str(e):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">print</span>(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"检测到&nbsp;NumPy&nbsp;版本兼容性问题,正在修复..."</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Uninstall&nbsp;pandas&nbsp;first,&nbsp;then&nbsp;reinstall&nbsp;it</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;subprocess.check_call([sys.executable,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"-m"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"pip"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"uninstall"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"-y"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"pandas"</span>])<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except&nbsp;subprocess.CalledProcessError:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pass&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;the&nbsp;uninstall&nbsp;failed;&nbsp;continue&nbsp;with&nbsp;the&nbsp;upgrade&nbsp;anyway</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Install&nbsp;mutually&nbsp;compatible&nbsp;versions&nbsp;of&nbsp;pandas&nbsp;and&nbsp;numpy</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;install(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"--upgrade&nbsp;numpy"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;install(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"--upgrade&nbsp;pandas"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" 
style="color: rgba(230, 192, 123, 1); line-height: 26px">print</span>(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"pandas&nbsp;未安装,正在安装..."</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;install(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"pandas"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;import&nbsp;pandas&nbsp;as&nbsp;pd<br><br><span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;以下是主程序</span><br>import&nbsp;tkinter&nbsp;as&nbsp;tk<br>from&nbsp;tkinter&nbsp;import&nbsp;ttk,&nbsp;filedialog,&nbsp;messagebox,&nbsp;scrolledtext<br>import&nbsp;os<br>import&nbsp;re<br>import&nbsp;xml.etree.ElementTree&nbsp;as&nbsp;ET<br>import&nbsp;zipfile<br>import&nbsp;tempfile<br>import&nbsp;shutil<br><br><br>class&nbsp;DeduplicationApp:<br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;__init__(self,&nbsp;root):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.root&nbsp;=&nbsp;root<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.root.title(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"FDT文件去重工具"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.root.geometry(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"800x600"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.root.resizable(True,&nbsp;True)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;定义配色方案</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.bg_color&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#2c3e50"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.header_color&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"#3498db"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.text_bg&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.btn_color&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#1abc9c"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.btn_hover&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#16a085"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.btn_remove&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#e74c3c"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.btn_remove_hover&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#c0392b"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_color&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#34495e"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.format_highlight&nbsp;=&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"txt"</span>:&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#1abc9c"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"doc"</span>:&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#3498db"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"docx"</span>:&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#2980b9"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: 
rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>:&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#9b59b6"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xlsx"</span>:&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#8e44ad"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;创建主框架</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.root.configure(<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;标题框架</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;header_frame&nbsp;=&nbsp;tk.Frame(root,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.header_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;header_frame.pack(fill=tk.X,&nbsp;pady=0)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;title_label&nbsp;=&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;header_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"▣&nbsp;FDT文件去重工具"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;16,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"bold"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.header_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"white"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pady=10<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;title_label.pack(pady=5)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;创建主内容框架</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;main_frame&nbsp;=&nbsp;tk.Frame(root,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,&nbsp;padx=15,&nbsp;pady=10)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;main_frame.pack(fill=tk.BOTH,&nbsp;expand=True)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;文件格式说明</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;format_frame&nbsp;=&nbsp;tk.Frame(main_frame,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;format_frame.pack(fill=tk.X,&nbsp;pady=5)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;format_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"支持格式:&nbsp;"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;创建格式标签</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;fmt,&nbsp;color&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;self.format_highlight.items():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;format_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"{fmt.upper()}"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 
123, 1); line-height: 26px">fg</span>=color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bold"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;padx=5<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT,&nbsp;padx=2)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;输入文件选择</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_frame&nbsp;=&nbsp;tk.LabelFrame(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;main_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"&nbsp;选择输入文件&nbsp;"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"#ecf0f1"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_frame.pack(fill=tk.X,&nbsp;pady=8)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"输入文件:"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).grid(row=0,&nbsp;column=0,&nbsp;padx=5,&nbsp;pady=5,&nbsp;sticky=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'w'</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.input_path&nbsp;=&nbsp;tk.StringVar()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_entry&nbsp;=&nbsp;tk.Entry(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;textvariable=self.input_path,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;width=50,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"微软雅黑"</span>,&nbsp;9),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;relief=tk.GROOVE<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_entry.grid(row=0,&nbsp;column=1,&nbsp;padx=5,&nbsp;sticky=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'ew'</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;button_frame1&nbsp;=&nbsp;tk.Frame(input_frame,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;button_frame1.grid(row=0,&nbsp;column=2,&nbsp;padx=5)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Button(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;button_frame1,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"浏览..."</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">command</span>=self.browse_input,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#3498db"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"white"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;relief=tk.FLAT,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 
195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bold"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;padx=10<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT,&nbsp;padx=2)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;输出文件选择</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_frame&nbsp;=&nbsp;tk.LabelFrame(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;main_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"&nbsp;设置输出文件&nbsp;"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_frame.pack(fill=tk.X,&nbsp;pady=8)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"输出文件:"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).grid(row=0,&nbsp;column=0,&nbsp;padx=5,&nbsp;pady=5,&nbsp;sticky=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'w'</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.output_path&nbsp;=&nbsp;tk.StringVar()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_entry&nbsp;=&nbsp;tk.Entry(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;textvariable=self.output_path,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;width=50,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;relief=tk.GROOVE<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_entry.grid(row=0,&nbsp;column=1,&nbsp;padx=5,&nbsp;sticky=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">'ew'</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;button_frame2&nbsp;=&nbsp;tk.Frame(output_frame,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;button_frame2.grid(row=0,&nbsp;column=2,&nbsp;padx=5)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Button(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;button_frame2,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"浏览..."</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">command</span>=self.browse_output,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#3498db"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"white"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;relief=tk.FLAT,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"bold"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;padx=10<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT,&nbsp;padx=2)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;覆盖选项</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;option_frame&nbsp;=&nbsp;tk.Frame(main_frame,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;option_frame.pack(fill=tk.X,&nbsp;pady=5)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.overwrite_var&nbsp;=&nbsp;tk.BooleanVar(value=True)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Checkbutton(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;option_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"覆盖原文件(输出文件留空时自动启用)"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;variable=self.overwrite_var,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">command</span>=self.update_overwrite,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" 
style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;selectcolor=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;activebackground=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;activeforeground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(anchor=tk.W)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;去重范围选择&nbsp;(仅适用于Word文档)</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.scope_var&nbsp;=&nbsp;tk.StringVar(value=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"all"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;scope_frame&nbsp;=&nbsp;tk.Frame(main_frame,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;scope_frame.pack(fill=tk.X,&nbsp;pady=5)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;scope_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Word文档去重范围:"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 
1); line-height: 26px">"#ecf0f1"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;scopes&nbsp;=&nbsp;[<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"全部内容"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"all"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"仅段落"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"paragraphs"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"仅表格"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"tables"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;text,&nbsp;value&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">in</span>&nbsp;scopes:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Radiobutton(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;scope_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=text,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;variable=self.scope_var,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;value=value,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;selectcolor=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;activebackground=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;activeforeground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT,&nbsp;padx=10)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); 
font-style: italic; line-height: 26px">#&nbsp;操作按钮</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;btn_frame&nbsp;=&nbsp;tk.Frame(main_frame,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;btn_frame.pack(fill=tk.X,&nbsp;pady=10)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;btn_style&nbsp;=&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"font"</span>:&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;10,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bold"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"padx"</span>:&nbsp;15,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"pady"</span>:&nbsp;8,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"relief"</span>:&nbsp;tk.GROOVE,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bd"</span>:&nbsp;0<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Button(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;btn_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"✓&nbsp;执行去重"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">command</span>=self.process_deduplication,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.btn_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"white"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;activebackground=self.btn_hover,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**btn_style<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT,&nbsp;padx=10)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Button(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;btn_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"👁&nbsp;预览结果"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">command</span>=self.preview_results,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#f39c12"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 
121, 1); line-height: 26px">"white"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;activebackground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#e67e22"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**btn_style<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT,&nbsp;padx=10)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Button(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;btn_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"✕&nbsp;退出"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">command</span>=root.destroy,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.btn_remove,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"white"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;activebackground=self.btn_remove_hover,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**btn_style<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.RIGHT,&nbsp;padx=10)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 
26px">#&nbsp;结果显示框架</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;result_frame&nbsp;=&nbsp;tk.LabelFrame(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;main_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"&nbsp;处理结果&nbsp;"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#ecf0f1"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;result_frame.pack(fill=tk.BOTH,&nbsp;expand=True)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text&nbsp;=&nbsp;scrolledtext.ScrolledText(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;result_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;wrap=tk.WORD,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;height=8,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Consolas"</span>,&nbsp;10),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); 
line-height: 26px">"#ffffff"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;padx=10,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pady=10,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;relief=tk.GROOVE<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.pack(fill=tk.BOTH,&nbsp;expand=True,&nbsp;padx=5,&nbsp;pady=5)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.config(state=tk.DISABLED)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;状态栏</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;status_bar&nbsp;=&nbsp;tk.Frame(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;root,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.status_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;height=22,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;relief=tk.SUNKEN<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;status_bar.pack(side=tk.BOTTOM,&nbsp;fill=tk.X)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var&nbsp;=&nbsp;tk.StringVar(value=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"就绪&nbsp;|&nbsp;选择一个文件开始处理"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;status_bar,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;textvariable=self.status_var,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.status_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"white"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;anchor=tk.W,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(side=tk.LEFT,&nbsp;padx=10)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;底部版权信息</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;copyright_frame&nbsp;=&nbsp;tk.Frame(root,&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;copyright_frame.pack(side=tk.BOTTOM,&nbsp;fill=tk.X)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tk.Label(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;copyright_frame,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"©&nbsp;2025&nbsp;FDT文件去重工具&nbsp;v1.0&nbsp;|&nbsp;支持:&nbsp;TXT,&nbsp;DOC,&nbsp;DOCX,&nbsp;XLS,&nbsp;XLSX"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=self.bg_color,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); 
line-height: 26px">fg</span>=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#95a5a6"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;8)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;).pack(pady=(0,&nbsp;5))<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;绑定事件</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.bind_hover_events()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;bind_hover_events(self):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"绑定按钮的悬停事件"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;widget&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;self.root.winfo_children():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;isinstance(widget,&nbsp;tk.Button):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;widget.bind(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"&lt;Enter&gt;"</span>,&nbsp;lambda&nbsp;e:&nbsp;e.widget.config(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); 
line-height: 26px">bg</span>=e.widget.cget(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"activebackground"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;widget.bind(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"&lt;Leave&gt;"</span>,&nbsp;lambda&nbsp;e:&nbsp;e.widget.config(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">bg</span>=e.widget.cget(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bg"</span>).replace(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"activebackground"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span>).split()[0]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;))<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;get_file_extension(self,&nbsp;file_path):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"获取文件扩展名(小写,不带点)"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;file_path:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 
26px">return</span>&nbsp;None<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ext&nbsp;=&nbsp;os.path.splitext(file_path)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;ext.startswith(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.'</span>):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ext&nbsp;=&nbsp;ext<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;ext.lower()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;browse_input(self):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"选择输入文件"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file_path&nbsp;=&nbsp;filedialog.askopenfilename(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;title=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"选择输入文件"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;filetypes=[<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"所有支持的文件"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.txt&nbsp;*.doc&nbsp;*.docx&nbsp;*.xls&nbsp;*.xlsx"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"文本文件"</span>,&nbsp;<span class="hljs-string" 
style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.txt"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Word&nbsp;documents"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.doc&nbsp;*.docx"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Excel&nbsp;files"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.xls&nbsp;*.xlsx"</span>),<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"All&nbsp;files"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.*"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;file_path:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.input_path.set(file_path)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">if</span>&nbsp;self.overwrite_var.get()&nbsp;and&nbsp;not&nbsp;self.output_path.get():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.output_path.set(file_path)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ext&nbsp;=&nbsp;self.get_file_extension(file_path)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;self.format_highlight:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;color&nbsp;=&nbsp;self.format_highlight[ext]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Input&nbsp;file&nbsp;selected:&nbsp;{os.path.basename(file_path)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Warning:&nbsp;limited&nbsp;support&nbsp;for&nbsp;{ext.upper()}&nbsp;format&nbsp;-&nbsp;{os.path.basename(file_path)}"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;browse_output(self):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Select&nbsp;the&nbsp;output&nbsp;file"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_file&nbsp;=&nbsp;self.input_path.get()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_ext&nbsp;=&nbsp;self.get_file_extension(input_file)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;default_ext&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"txt"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file_types&nbsp;=&nbsp;[]<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;input_ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"doc"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"docx"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file_types&nbsp;=&nbsp;[(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Word&nbsp;documents"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.docx"</span>),&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"All&nbsp;files"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.*"</span>)]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;default_ext&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"docx"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;input_ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); 
line-height: 26px">"xlsx"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file_types&nbsp;=&nbsp;[(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Excel&nbsp;files"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.xlsx"</span>),&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"All&nbsp;files"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.*"</span>)]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;default_ext&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xlsx"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file_types&nbsp;=&nbsp;[(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Text&nbsp;files"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.txt"</span>),&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"All&nbsp;files"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"*.*"</span>)]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;default_ext&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"txt"</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file_path&nbsp;=&nbsp;filedialog.asksaveasfilename(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;title=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"Save&nbsp;output&nbsp;file"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;defaultextension=f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">".{default_ext}"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;filetypes=file_types<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;file_path:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.output_path.set(file_path)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Output&nbsp;file&nbsp;set&nbsp;to:&nbsp;{os.path.basename(file_path)}"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;update_overwrite(self):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Update&nbsp;the&nbsp;overwrite&nbsp;option"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;self.overwrite_var.get()&nbsp;and&nbsp;self.input_path.get()&nbsp;and&nbsp;not&nbsp;self.output_path.get():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.output_path.set(self.input_path.get())<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;validate_inputs(self):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Validate&nbsp;the&nbsp;inputs"</span><span class="hljs-string" style="color: 
rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_file&nbsp;=&nbsp;self.input_path.get()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_file&nbsp;=&nbsp;self.output_path.get()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;input_file:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Input&nbsp;error"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Please&nbsp;select&nbsp;an&nbsp;input&nbsp;file&nbsp;first!"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;False<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;os.path.exists(input_file):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"File&nbsp;error"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"File&nbsp;not&nbsp;found:\n{input_file}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;False<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ext&nbsp;=&nbsp;self.get_file_extension(input_file)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;supported_formats&nbsp;=&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"txt"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"doc"</span>,&nbsp;<span 
class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"docx"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xlsx"</span>]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;ext&nbsp;not&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;supported_formats:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Format&nbsp;error"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Unsupported&nbsp;file&nbsp;format:&nbsp;{ext&nbsp;or&nbsp;'unknown'}\n\n"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Supported&nbsp;formats:&nbsp;{',&nbsp;'.join(supported_formats)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;False<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;output_file:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Output&nbsp;error"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"Please&nbsp;set&nbsp;the&nbsp;output&nbsp;file&nbsp;path!"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;False<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_ext&nbsp;=&nbsp;self.get_file_extension(output_file)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;output_ext&nbsp;!=&nbsp;ext:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;messagebox.askyesno(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Format&nbsp;mismatch"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"The&nbsp;output&nbsp;format&nbsp;({output_ext})&nbsp;differs&nbsp;from&nbsp;the&nbsp;input&nbsp;format&nbsp;({ext}),\n"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"which&nbsp;may&nbsp;cause&nbsp;formatting&nbsp;loss.&nbsp;Continue?"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;icon=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"warning"</span>):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" 
style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;False<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;True<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;extract_text_from_docx(self,&nbsp;file_path):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Extract&nbsp;text&nbsp;from&nbsp;a&nbsp;DOCX&nbsp;file&nbsp;(no&nbsp;Office&nbsp;dependency)"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Create&nbsp;a&nbsp;temporary&nbsp;directory&nbsp;for&nbsp;unpacking&nbsp;the&nbsp;DOCX&nbsp;file</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with&nbsp;tempfile.TemporaryDirectory()&nbsp;as&nbsp;tmp_dir:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Unpack&nbsp;the&nbsp;DOCX&nbsp;file</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with&nbsp;zipfile.ZipFile(file_path,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'r'</span>)&nbsp;as&nbsp;zip_ref:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;zip_ref.extractall(tmp_dir)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 
26px">#&nbsp;Parse&nbsp;document.xml</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;doc_xml_path&nbsp;=&nbsp;os.path.join(tmp_dir,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'word'</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'document.xml'</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;os.path.exists(doc_xml_path):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;[],&nbsp;0<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tree&nbsp;=&nbsp;ET.parse(doc_xml_path)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;root&nbsp;=&nbsp;tree.getroot()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Define&nbsp;the&nbsp;XML&nbsp;namespace</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;namespaces&nbsp;=&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'w'</span>:&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">'http://schemas.openxmlformats.org/wordprocessingml/2006/main'</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Extract&nbsp;the&nbsp;text&nbsp;content</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text_lines&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;scope&nbsp;=&nbsp;self.scope_var.get()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Extract&nbsp;paragraph&nbsp;text</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;scope&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"all"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"paragraphs"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;paragraph&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;root.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">'.//w:p'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;para_text&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;run&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;paragraph.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:r'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;text&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;run.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:t'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;text.text:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;para_text.append(text.text.strip())<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">if</span>&nbsp;para_text:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text_lines.append(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">''</span>.join(para_text))<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Extract&nbsp;table&nbsp;text</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;scope&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"all"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"tables"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;table&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;root.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:tbl'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;row&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;table.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">'.//w:tr'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;row_text&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;cell&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;row.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:tc'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cell_text&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;paragraph&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;cell.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:p'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;para_text&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" 
style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;run&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;paragraph.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:r'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;text&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;run.findall(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.//w:t'</span>,&nbsp;namespaces):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;text.text:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;para_text.append(text.text.strip())<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">if</span>&nbsp;para_text:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cell_text.append(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">''</span>.join(para_text))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;cell_text:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;row_text.append(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'&nbsp;'</span>.join(cell_text))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;row_text:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text_lines.append(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\t'</span>.join(row_text))<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 
26px">return</span>&nbsp;text_lines,&nbsp;len(text_lines)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except&nbsp;Exception&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Extraction&nbsp;error"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Failed&nbsp;to&nbsp;extract&nbsp;content&nbsp;from&nbsp;the&nbsp;DOCX&nbsp;file:\n{str(e)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;[],&nbsp;0<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;extract_text_from_doc(self,&nbsp;file_path):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Extract&nbsp;text&nbsp;from&nbsp;a&nbsp;DOC&nbsp;file&nbsp;(compatibility&nbsp;fallback)"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Show&nbsp;a&nbsp;warning</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showwarning(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"DOC&nbsp;format&nbsp;limitation"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"DOC&nbsp;is&nbsp;a&nbsp;legacy&nbsp;format&nbsp;with&nbsp;limited&nbsp;support.\n\nThe&nbsp;file&nbsp;will&nbsp;be&nbsp;processed&nbsp;as&nbsp;plain&nbsp;text."</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 
26px">#&nbsp;Try&nbsp;to&nbsp;read&nbsp;the&nbsp;file&nbsp;as&nbsp;plain&nbsp;text</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with&nbsp;open(file_path,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'r'</span>,&nbsp;encoding=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'utf-8'</span>,&nbsp;errors=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'ignore'</span>)&nbsp;as&nbsp;f:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines&nbsp;=&nbsp;f.read().splitlines()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;lines,&nbsp;len(lines)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except&nbsp;Exception&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"处理错误"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"处理DOC文件失败:\n{str(e)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;[],&nbsp;0<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;extract_text_from_excel(self,&nbsp;file_path):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"Extract&nbsp;text&nbsp;from&nbsp;an&nbsp;Excel&nbsp;file&nbsp;(using&nbsp;pandas)"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;确定读取引擎</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;file_path.endswith(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'.xls'</span>):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;import&nbsp;xlrd<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;engine&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'xlrd'</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;engine&nbsp;=&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'openpyxl'</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sheets&nbsp;=&nbsp;pd.read_excel(file_path,&nbsp;sheet_name=None,&nbsp;engine=engine)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;all_text&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;total_lines&nbsp;=&nbsp;0<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;sheet_name,&nbsp;df&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">in</span>&nbsp;sheets.items():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Add&nbsp;a&nbsp;sheet-name&nbsp;heading</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;all_text.append(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"\n---&nbsp;Sheet:&nbsp;{sheet_name}&nbsp;---"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Header&nbsp;row</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;headers&nbsp;=&nbsp;[str(col)&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;col&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;df.columns]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;all_text.append(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"\t"</span>.join(headers))<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Data&nbsp;rows</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;idx,&nbsp;row&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;df.iterrows():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;row_values&nbsp;=&nbsp;[str(value)&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;value&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;row]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;all_text.append(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); 
line-height: 26px">"\t"</span>.join(row_values))<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;total_lines&nbsp;+=&nbsp;len(df)&nbsp;+&nbsp;2<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;all_text,&nbsp;len(all_text)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except&nbsp;Exception&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"提取错误"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"从Excel文件中提取内容失败:\n{str(e)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;[],&nbsp;0<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;deduplicate_text(self,&nbsp;lines):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"去重文本内容(保留顺序)"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seen&nbsp;=&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">set</span>()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;unique_lines&nbsp;=&nbsp;[]<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;line&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">in</span>&nbsp;lines:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;stripped_line&nbsp;=&nbsp;line.strip()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;对于表格行,我们按整行比较</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\t'</span>&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;stripped_line:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;stripped_line<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;stripped_line.lower()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;key&nbsp;not&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;seen:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seen.add(key)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;unique_lines.append(line)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 
26px">return</span>&nbsp;unique_lines,&nbsp;len(unique_lines)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;save_dedup_result(self,&nbsp;unique_lines,&nbsp;input_file,&nbsp;output_file):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"保存去重结果到文件"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_ext&nbsp;=&nbsp;self.get_file_extension(input_file)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_ext&nbsp;=&nbsp;self.get_file_extension(output_file)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;对于Excel文件,保存为Excel格式</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;output_ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xlsx"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 
26px">#&nbsp;提取表头和数据</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;header_line&nbsp;=&nbsp;None<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;data_lines&nbsp;=&nbsp;[]<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;line&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;unique_lines:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'---&nbsp;Sheet:'</span>&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;line:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">continue</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;header_line:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;header_line&nbsp;=&nbsp;line<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;data_lines.append(line)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Parse&nbsp;the&nbsp;data&nbsp;rows</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;header_line&nbsp;and&nbsp;data_lines:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;headers&nbsp;=&nbsp;header_line.split(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\t'</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;data&nbsp;=&nbsp;[row.split(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\t'</span>)&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;row&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;data_lines]<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Build&nbsp;the&nbsp;DataFrame</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;df&nbsp;=&nbsp;pd.DataFrame(data,&nbsp;columns=headers)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Save&nbsp;to&nbsp;Excel</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;output_ext&nbsp;==&nbsp;<span 
class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'xlsx'</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;df.to_excel(output_file,&nbsp;index=False,&nbsp;engine=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'openpyxl'</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Note:&nbsp;the&nbsp;xlwt&nbsp;writer&nbsp;was&nbsp;removed&nbsp;in&nbsp;pandas&nbsp;2.0,&nbsp;so&nbsp;this&nbsp;branch&nbsp;needs&nbsp;pandas&nbsp;&lt;&nbsp;2.0</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;df.to_excel(output_file,&nbsp;index=False,&nbsp;engine=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'xlwt'</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;If&nbsp;there&nbsp;is&nbsp;no&nbsp;data,&nbsp;write&nbsp;an&nbsp;empty&nbsp;DataFrame</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pd.DataFrame().to_excel(output_file,&nbsp;index=False)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;True<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 
26px">#&nbsp;对于文本和Word文件,保存为文本格式</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with&nbsp;open(output_file,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'w'</span>,&nbsp;encoding=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'utf-8'</span>)&nbsp;as&nbsp;f:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;line&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;unique_lines:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'---&nbsp;Sheet:'</span>&nbsp;not&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;line:&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;跳过sheet标题</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f.write(line&nbsp;+&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\n'</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 
26px">return</span>&nbsp;True<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except&nbsp;Exception&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"保存错误"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"保存去重结果失败:\n{str(e)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span>&nbsp;False<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;preview_results(self):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"预览去重结果"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;self.validate_inputs():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 
26px">return</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_file&nbsp;=&nbsp;self.input_path.get()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_file&nbsp;=&nbsp;self.output_path.get()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ext&nbsp;=&nbsp;self.get_file_extension(input_file)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;original_count&nbsp;=&nbsp;0<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;Extract&nbsp;content&nbsp;according&nbsp;to&nbsp;the&nbsp;file&nbsp;format</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;ext&nbsp;==&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"txt"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with&nbsp;open(input_file,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'r'</span>,&nbsp;encoding=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'utf-8'</span>)&nbsp;as&nbsp;f:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines&nbsp;=&nbsp;f.read().splitlines()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;original_count&nbsp;=&nbsp;len(lines)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;ext&nbsp;==&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"doc"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines,&nbsp;original_count&nbsp;=&nbsp;self.extract_text_from_doc(input_file)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;ext&nbsp;==&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"docx"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines,&nbsp;original_count&nbsp;=&nbsp;self.extract_text_from_docx(input_file)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xlsx"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines,&nbsp;original_count&nbsp;=&nbsp;self.extract_text_from_excel(input_file)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;lines:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showwarning(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"内容为空"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"未提取到任何内容,文件可能为空或格式不受支持"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" 
style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;去重文本</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;unique_lines,&nbsp;unique_count&nbsp;=&nbsp;self.deduplicate_text(lines)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;显示预览结果</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.config(state=tk.NORMAL)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.delete(1.0,&nbsp;tk.END)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;标题</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"header"</span>,&nbsp;foreground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#2980b9"</span>,&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;10,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bold"</span>))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"文件预览&nbsp;({ext.upper()},&nbsp;最多15行)\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"header"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"="</span>&nbsp;*&nbsp;60&nbsp;+&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"\n\n"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;预览内容</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;i,&nbsp;line&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;enumerate(unique_lines[:15]):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"line_num"</span>,&nbsp;foreground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#7f8c8d"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"{i&nbsp;+&nbsp;1:&gt;2}.&nbsp;"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"line_num"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;表格行特殊处理</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 
26px">if</span>&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\t'</span>&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;line:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"table_row"</span>,&nbsp;foreground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#9b59b6"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;columns&nbsp;=&nbsp;line.split(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\t'</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;truncated&nbsp;=&nbsp;[col[:15]&nbsp;+&nbsp;(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'...'</span>&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;len(col)&nbsp;&gt;&nbsp;15&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">''</span>)&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;col&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;columns]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"&nbsp;|&nbsp;"</span>.join(truncated)&nbsp;+&nbsp;<span class="hljs-string" style="color: rgba(152, 
195, 121, 1); line-height: 26px">"\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"table_row"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"text_line"</span>,&nbsp;foreground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#2c3e50"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;对长文本进行截断处理</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;len(line)&nbsp;&gt;&nbsp;80:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;line&nbsp;=&nbsp;line[:77]&nbsp;+&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"..."</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;line&nbsp;+&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"text_line"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" 
style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;len(unique_lines)&nbsp;&gt;&nbsp;15:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"\n...以及另外&nbsp;{len(unique_lines)&nbsp;-&nbsp;15}&nbsp;行\n\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"line_num"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">else</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"\n"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;统计数据</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"stats"</span>,&nbsp;foreground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#27ae60"</span>,&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bold"</span>))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"统计信息:\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"stats"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"原始行数:&nbsp;{original_count}\n"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"去重后行数:&nbsp;{len(unique_lines)}\n"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"移除重复行数:&nbsp;{original_count&nbsp;-&nbsp;len(unique_lines)}\n"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.config(state=tk.DISABLED)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"预览完成:&nbsp;{ext.upper()}文件,&nbsp;原始行数&nbsp;{original_count},&nbsp;去重后行数&nbsp;{len(unique_lines)}"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except&nbsp;Exception&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"处理错误"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"处理文件时发生错误:\n{str(e)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"错误:&nbsp;{str(e)}"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;def&nbsp;process_deduplication(self):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"执行去重操作"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;self.validate_inputs():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;input_file&nbsp;=&nbsp;self.input_path.get()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;output_file&nbsp;=&nbsp;self.output_path.get()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ext&nbsp;=&nbsp;self.get_file_extension(input_file)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;检查是否覆盖原文件</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;input_file&nbsp;==&nbsp;output_file:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;messagebox.askyesno(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"确认覆盖"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"输出文件与输入文件相同,将覆盖原始文件。\n\n是否继续?"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;icon=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"warning"</span>):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines&nbsp;=&nbsp;[]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;original_count&nbsp;=&nbsp;0<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;根据文件格式提取内容</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"正在处理&nbsp;{ext.upper()}&nbsp;文件..."</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.root.update()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;ext&nbsp;==&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"txt"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with&nbsp;open(input_file,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'r'</span>,&nbsp;encoding=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">'utf-8'</span>)&nbsp;as&nbsp;f:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines&nbsp;=&nbsp;[line.rstrip(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">'\n'</span>)&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">for</span>&nbsp;line&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;f]&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;assumption:&nbsp;original&nbsp;expression&nbsp;missing;&nbsp;strips&nbsp;trailing&nbsp;newlines</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;original_count&nbsp;=&nbsp;len(lines)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;ext&nbsp;==&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"doc"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines,&nbsp;original_count&nbsp;=&nbsp;self.extract_text_from_doc(input_file)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;ext&nbsp;==&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"docx"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines,&nbsp;original_count&nbsp;=&nbsp;self.extract_text_from_docx(input_file)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">elif</span>&nbsp;ext&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">in</span>&nbsp;[<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"xls"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"xlsx"</span>]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lines,&nbsp;original_count&nbsp;=&nbsp;self.extract_text_from_excel(input_file)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;lines:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showwarning(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"内容为空"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"未提取到任何内容,文件可能为空或格式不受支持"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 192, 123, 1); line-height: 26px">return</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;去重文本</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;unique_lines,&nbsp;unique_count&nbsp;=&nbsp;self.deduplicate_text(lines)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;保存结果</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;success&nbsp;=&nbsp;self.save_dedup_result(unique_lines,&nbsp;input_file,&nbsp;output_file)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;not&nbsp;success:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="color: rgba(230, 
192, 123, 1); line-height: 26px">return</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;显示结果</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.config(state=tk.NORMAL)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.delete(1.0,&nbsp;tk.END)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;结果标题</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"success"</span>,&nbsp;foreground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#27ae60"</span>,&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;11,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"bold"</span>))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"✓&nbsp;去重操作成功完成!\n\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"success"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;统计信息</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"stats"</span>,&nbsp;foreground=<span class="hljs-string" 
style="color: rgba(152, 195, 121, 1); line-height: 26px">"#e74c3c"</span>,&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;10))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"处理结果统计:\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"stats"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"原始行数:&nbsp;{original_count}\n"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"去重后行数:&nbsp;{len(unique_lines)}\n"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"移除重复行数:&nbsp;{original_count&nbsp;-&nbsp;len(unique_lines)}\n\n"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;文件信息</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.tag_config(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"file"</span>,&nbsp;foreground=<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"#3498db"</span>,&nbsp;font=(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"微软雅黑"</span>,&nbsp;9,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"bold"</span>))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"文件信息:\n"</span>,&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"file"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"输入文件:&nbsp;{os.path.basename(input_file)}\n"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"输出文件:&nbsp;{os.path.basename(output_file)}\n"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.insert(tk.END,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"输出路径:&nbsp;{os.path.dirname(output_file)}\n"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.result_text.config(state=tk.DISABLED)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"去重完成!移除了&nbsp;{original_count&nbsp;-&nbsp;len(unique_lines)}&nbsp;行重复内容"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;显示成功对话框</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showinfo(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 
26px">"操作成功"</span>,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"文件去重操作成功完成!\n\n"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"格式:&nbsp;{ext.upper()}\n"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"原始行数:&nbsp;{original_count}\n"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"去重后行数:&nbsp;{len(unique_lines)}\n"</span><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"移除了&nbsp;{original_count&nbsp;-&nbsp;len(unique_lines)}&nbsp;行重复内容"</span>)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except&nbsp;Exception&nbsp;as&nbsp;e:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;messagebox.showerror(<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"处理错误"</span>,&nbsp;f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); 
line-height: 26px">"处理文件时发生错误:\n{str(e)}"</span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.status_var.set(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"错误:&nbsp;{str(e)}"</span>)<br><br><br>def&nbsp;center_window(window,&nbsp;width=None,&nbsp;height=None):<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"居中窗口"</span><span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">""</span><br>&nbsp;&nbsp;&nbsp;&nbsp;window.update_idletasks()<br>&nbsp;&nbsp;&nbsp;&nbsp;screen_width&nbsp;=&nbsp;window.winfo_screenwidth()<br>&nbsp;&nbsp;&nbsp;&nbsp;screen_height&nbsp;=&nbsp;window.winfo_screenheight()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;width&nbsp;is&nbsp;None:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;width&nbsp;=&nbsp;window.winfo_width()<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;height&nbsp;is&nbsp;None:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;height&nbsp;=&nbsp;window.winfo_height()<br><br>&nbsp;&nbsp;&nbsp;&nbsp;x&nbsp;=&nbsp;(screen_width&nbsp;-&nbsp;width)&nbsp;//&nbsp;2<br>&nbsp;&nbsp;&nbsp;&nbsp;y&nbsp;=&nbsp;(screen_height&nbsp;-&nbsp;height)&nbsp;//&nbsp;2&nbsp;-&nbsp;20&nbsp;&nbsp;<span class="hljs-comment" style="color: rgba(92, 99, 112, 1); font-style: italic; line-height: 26px">#&nbsp;稍微上移一些</span><br><br>&nbsp;&nbsp;&nbsp;&nbsp;window.geometry(f<span class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"{width}x{height}+{x}+{y}"</span>)<br><br><br><span class="hljs-keyword" style="color: rgba(198, 120, 221, 1); line-height: 26px">if</span>&nbsp;__name__&nbsp;==&nbsp;<span 
class="hljs-string" style="color: rgba(152, 195, 121, 1); line-height: 26px">"__main__"</span>:<br>&nbsp;&nbsp;&nbsp;&nbsp;root&nbsp;=&nbsp;tk.Tk()<br>&nbsp;&nbsp;&nbsp;&nbsp;app&nbsp;=&nbsp;DeduplicationApp(root)<br>&nbsp;&nbsp;&nbsp;&nbsp;center_window(root,&nbsp;800,&nbsp;650)<br>&nbsp;&nbsp;&nbsp;&nbsp;root.mainloop()<br></code></pre>
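<p>The preview branch in the listing above mixes its truncation rules with Tk calls, which makes them hard to check in isolation. As a minimal sketch, the same rules can be pulled into a pure function. The name <code>format_preview_line</code> and its parameters are hypothetical and not part of the original tool; the 15-character cell limit is assumed from the <code>len(col) &gt; 15</code> check, and the 80-character line limit follows the <code>line[:77] + "..."</code> step.</p>

```python
def format_preview_line(line, max_col=15, max_line=80):
    """Return (display_text, tag_name) for one preview line.

    Hypothetical helper, not part of the original tool: it only mirrors
    the truncation rules used in the preview branch.
    """
    if '\t' in line:
        # Tab-separated rows are treated as table rows: each cell is cut
        # to max_col characters, with '...' appended when it was longer.
        columns = line.split('\t')
        truncated = [
            col[:max_col] + ('...' if len(col) > max_col else '')
            for col in columns
        ]
        return " | ".join(truncated), "table_row"

    # Plain text lines are capped at max_line characters in total,
    # including the three-character ellipsis.
    if len(line) > max_line:
        line = line[:max_line - 3] + "..."
    return line, "text_line"


if __name__ == "__main__":
    for sample in ["name\tvalue", "a" * 100, "plain text"]:
        text, tag = format_preview_line(sample)
        print(tag, len(text))
```

<p>With a helper like this, the GUI code would only need to call <code>self.result_text.insert(tk.END, text + "\n", tag)</code> for each returned pair, and the truncation logic can be tested without opening a window.</p>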
</section><br><br>
Source: https://www.cnblogs.com/leyinsec/p/18992924