Fast and Efficient PDF Manipulation in .NET with the DocNET Library
<div id="navCategory"><h5 class="catalogue">Contents</h5><ul class="first_class_ul"><li><a href="#_label0">Introduction</a></li><li><a href="#_label1">Project Overview</a></li><li><a href="#_label2">Features</a></li><li><a href="#_label3">Project Source Code</a></li><li><a href="#_label4">Creating the DocNETExercises Console App</a></li><li><a href="#_label5">Installing the Docnet.Core NuGet Package</a></li><li><a href="#_label6">Getting a PDF's Page Count and Version</a></li><li><a href="#_label7">Extracting Text from a PDF</a></li><li><a href="#_label8">Converting a JPEG Image to a PDF</a></li><li><a href="#_label9">Converting a PDF Page to an Image</a></li><li><a href="#_label10">Source Code Links</a></li></ul></div><p class="maodian"><a name="_label0"></a></p><h2>Introduction</h2><p><span>PDF is an indispensable document format in day-to-day work and is used in a wide range of scenarios. This article shows how to use the DocNET library in .NET to work with PDF documents quickly and efficiently.</span></p>
<p class="maodian"><a name="_label1"></a></p><h2>Project Overview</h2>
<p><span>DocNET is an open-source (MIT license), cross-platform (Windows, Linux, and macOS) .NET library for fast PDF editing and data extraction. It is a .NET Standard 2.0 wrapper around PDFium, the C++ PDF library used by Chromium.</span></p>
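<p class="maodian"><a name="_label2"></a></p><h2>Features</h2>
<p><span>PDF extraction: reads the PDF version, page count, page width, page height, page text, character font size, and related information.</span></p>
<p><span>PDF editing: splits, merges, and unlocks PDF documents.</span></p>
<p><span>It also renders pages to images, converts JPEG images to PDF files, and more.</span></p>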
<p class="maodian"><a name="_label3"></a></p><h2>Project Source Code</h2>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/202572184053241.png" /></p>
<p class="maodian"><a name="_label4"></a></p><h2>Creating the DocNETExercises Console App</h2>
<p><span>Create a .NET 9 console app named <code><span>DocNETExercises</span></code><span>:</span></span></p>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/2025072108380674.png" /></p>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/2025072108380640.png" /></p>
<p class="maodian"><a name="_label5"></a></p><h2>Installing the Docnet.Core NuGet Package</h2>
<p><span>In the NuGet Package Manager, search for <code><span>Docnet.Core</span></code><span> and install it:</span></span></p>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/2025072108380617.png" /></p>
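<p><span>The snippets in the following sections reference two class-level members, <code><span>_docNetInstance</span></code> and <code><span>FilePath</span></code>, that the article does not show. A minimal sketch of how they might be declared follows. It assumes the library's shared <code><span>DocLib.Instance</span></code> is used, and the file path is a hypothetical placeholder rather than the one used in the sample project.</span></p>
<div class="jb51code"><pre class="brush:csharp;">using System;
using System.IO;
using Docnet.Core;

internal static class Program
{
    // Assumption: all snippets share the library's singleton instance.
    private static readonly IDocLib _docNetInstance = DocLib.Instance;

    // Hypothetical path; point it at any PDF copied to the output directory.
    private const string FilePath = "Assets/sample.pdf";

    private static void Main()
    {
        if (!File.Exists(FilePath))
        {
            Console.WriteLine($"PDF not found: {FilePath}");
            return;
        }

        // Call the sample methods from the sections below here, for example:
        // GetPDFPageCountAndVersion();
    }
}</pre></div>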
<p class="maodian"><a name="_label6"></a></p><h2>Getting a PDF's Page Count and Version</h2>
<div class="jb51code"><pre class="brush:csharp;">/// <summary>
/// Gets the page count and PDF version of the file.
/// </summary>
public static void GetPDFPageCountAndVersion()
{
    // _docNetInstance and FilePath are class-level members of the sample project (not shown in this snippet).
    using var docReader = _docNetInstance.GetDocReader(FilePath, new PageDimensions(1080, 1920));
    var pageCount = docReader.GetPageCount();
    var pdfVersion = docReader.GetPdfVersion();
    Console.WriteLine($"PageCount:{pageCount},PdfVersion:{pdfVersion}");
}</pre></div>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/202572184254507.png" /></p>
<p class="maodian"><a name="_label7"></a></p><h2>Extracting Text from a PDF</h2>
<div class="jb51code"><pre class="brush:csharp;">/// <summary>
/// Gets the text content of the first page of the PDF file.
/// </summary>
public static void GetPDFText()
{
    using var docReader = _docNetInstance.GetDocReader(FilePath, new PageDimensions(1080, 1920));
    // Note: pageIndex is zero-based, so 0 is the first page.
    using var pageReader = docReader.GetPageReader(0);
    // Get the text of the selected page (encoding is handled automatically).
    string pageText = pageReader.GetText();
    Console.WriteLine(pageText);
}</pre></div>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/2025072108380654.png" /></p>
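<p><span>The method above reads only the first page. To collect the text of a whole document, the same calls can be placed in a loop over the page count. The following is a sketch under stated assumptions: the class name <code><span>PdfTextExtractor</span></code>, the method name <code><span>GetAllText</span></code>, and the page marker line are illustrative and not part of the sample project, and <code><span>DocLib.Instance</span></code> is assumed as the library entry point.</span></p>
<div class="jb51code"><pre class="brush:csharp;">using System.Text;
using Docnet.Core;
using Docnet.Core.Models;

public static class PdfTextExtractor
{
    /// <summary>
    /// Concatenates the text of every page, each preceded by a marker line.
    /// </summary>
    public static string GetAllText(string pdfPath)
    {
        using var docReader = DocLib.Instance.GetDocReader(pdfPath, new PageDimensions(1080, 1920));
        var builder = new StringBuilder();
        int pageCount = docReader.GetPageCount();

        for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
        {
            // The page reader is disposed at the end of each iteration.
            using var pageReader = docReader.GetPageReader(pageIndex);
            builder.AppendLine($"--- page {pageIndex + 1} ---");
            builder.AppendLine(pageReader.GetText());
        }

        return builder.ToString();
    }
}</pre></div>
<p><span>Disposing each page reader before opening the next keeps only one page open at a time, which should matter for long documents.</span></p>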
<p class="maodian"><a name="_label8"></a></p><h2>Converting a JPEG Image to a PDF</h2>
<div class="jb51code"><pre class="brush:csharp;">/// <summary>
/// Converts a JPEG image to a PDF file.
/// </summary>
public static void JPEGImageConvertToPDF()
{
    var file = new JpegImage
    {
        Bytes = File.ReadAllBytes("Assets/image1.jpeg"),
        // Width and height supplied for the image (hard-coded for this sample).
        Width = 580,
        Height = 387
    };
    // JpegToPdf accepts a list of images and returns the bytes of the resulting PDF.
    var bytes = _docNetInstance.JpegToPdf(new[] { file });
    File.WriteAllBytes("Assets/output_file.pdf", bytes);
}</pre></div>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/2025072108380675.png" /></p>
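<p><span>The example above hard-codes <code><span>Width = 580</span></code> and <code><span>Height = 387</span></code>, which only suits that one image. As a sketch of one way to avoid this, the dimensions can be read from the JPEG's start-of-frame (SOF) segment without any imaging library. The class <code><span>JpegHeader</span></code> and its method <code><span>ReadSize</span></code> are hypothetical helpers written for this illustration; they are not part of Docnet.Core, and the parser handles only the common segment layout.</span></p>
<div class="jb51code"><pre class="brush:csharp;">using System;
using System.IO;

public static class JpegHeader
{
    /// <summary>
    /// Reads (Width, Height) from the first start-of-frame segment of a JPEG.
    /// </summary>
    public static (int Width, int Height) ReadSize(byte[] data)
    {
        if (data.Length < 4 || data[0] != 0xFF || data[1] != 0xD8)
            throw new InvalidDataException("Not a JPEG: missing SOI marker.");

        int pos = 2;
        while (pos + 4 <= data.Length)
        {
            if (data[pos] != 0xFF)
                throw new InvalidDataException($"Expected a marker at offset {pos}.");

            byte marker = data[pos + 1];
            if (marker == 0xFF) { pos++; continue; }            // fill byte before a marker
            if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7))
            {
                pos += 2;                                       // standalone marker, no length field
                continue;
            }
            if (marker == 0xD9 || marker == 0xDA) break;        // EOI or SOS: no frame header found

            // Segment length is big-endian and includes its own two bytes.
            int length = (data[pos + 2] << 8) | data[pos + 3];
            bool isSof = marker >= 0xC0 && marker <= 0xCF
                         && marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
            if (isSof)
            {
                if (pos + 9 > data.Length) break;               // truncated frame header
                // Layout after the length: precision (1), height (2), width (2).
                int height = (data[pos + 5] << 8) | data[pos + 6];
                int width = (data[pos + 7] << 8) | data[pos + 8];
                return (width, height);
            }

            if (length < 2)
                throw new InvalidDataException("Invalid segment length.");
            pos += 2 + length;
        }

        throw new InvalidDataException("No start-of-frame segment found.");
    }
}</pre></div>
<p><span>With such a helper, the values could be filled in as <code><span>var (width, height) = JpegHeader.ReadSize(bytes);</span></code> before constructing the <code><span>JpegImage</span></code>.</span></p>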
<p class="maodian"><a name="_label9"></a></p><h2>Converting a PDF Page to an Image</h2>
<div class="jb51code"><pre class="brush:csharp;">// Required namespaces: System.Drawing, System.Drawing.Imaging,
// System.Runtime.InteropServices, Docnet.Core.Models.
// Note: System.Drawing.Common is supported only on Windows in .NET 6 and later.

/// <summary>
/// Renders the first page of the PDF to a PNG image and outlines each character.
/// </summary>
public static void PDFConvertToImage()
{
    using var docReader = _docNetInstance.GetDocReader(FilePath, new PageDimensions(1080, 1920));
    // Select the first page (the index is zero-based).
    using var pageReader = docReader.GetPageReader(0);
    var rawBytes = pageReader.GetImage();
    var width = pageReader.GetPageWidth();
    var height = pageReader.GetPageHeight();
    var characters = pageReader.GetCharacters();
    using var bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb);
    AddBytes(bmp, rawBytes);
    DrawRectangles(bmp, characters);
    using var stream = new MemoryStream();
    bmp.Save(stream, ImageFormat.Png);
    File.WriteAllBytes("Assets/output_image.png", stream.ToArray());
}

// Copies the raw rendered bytes directly into the bitmap's pixel buffer.
private static void AddBytes(Bitmap bmp, byte[] rawBytes)
{
    var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
    var bmpData = bmp.LockBits(rect, ImageLockMode.WriteOnly, bmp.PixelFormat);
    var pNative = bmpData.Scan0;
    Marshal.Copy(rawBytes, 0, pNative, rawBytes.Length);
    bmp.UnlockBits(bmpData);
}

// Draws a red rectangle around the bounding box of every character on the page.
private static void DrawRectangles(Bitmap bmp, IEnumerable<Character> characters)
{
    // Pen wraps a native GDI+ handle, so dispose it along with the Graphics object.
    using var pen = new Pen(Color.Red);
    using var graphics = Graphics.FromImage(bmp);
    foreach (var c in characters)
    {
        var rect = new Rectangle(c.Box.Left, c.Box.Top, c.Box.Right - c.Box.Left, c.Box.Bottom - c.Box.Top);
        graphics.DrawRectangle(pen, rect);
    }
}</pre></div>
<p style="text-align:center"><img alt="" src="https://img.jbzj.com/file_images/article/202507/2025072108380662.png" /></p>
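<p><span>The example above relies on <code><span>System.Drawing</span></code>, which is supported only on Windows in .NET 6 and later, so it will not run on Linux or macOS. As a rough sketch of a dependency-free alternative, the raw page bytes can be wrapped in an uncompressed 32-bit BMP file. This assumes that <code><span>GetImage()</span></code> returns a tightly packed, top-down buffer with four bytes per pixel in B-G-R-A order, which is what the bitmap code above also relies on. The class <code><span>BgraBmpWriter</span></code> is a hypothetical helper, not part of Docnet.Core. Most viewers ignore the alpha byte in this BMP variant, so transparent regions may look different from the PNG output.</span></p>
<div class="jb51code"><pre class="brush:csharp;">using System;
using System.IO;

public static class BgraBmpWriter
{
    /// <summary>
    /// Wraps a top-down BGRA pixel buffer in an uncompressed 32-bit BMP file.
    /// </summary>
    public static byte[] Encode(byte[] bgra, int width, int height)
    {
        if (width <= 0 || height <= 0)
            throw new ArgumentOutOfRangeException(nameof(width), "Width and height must be positive.");

        int pixelBytes = checked(width * height * 4);
        if (bgra.Length != pixelBytes)
            throw new ArgumentException("Buffer length must equal width * height * 4.", nameof(bgra));

        const int headerBytes = 14 + 40;   // file header + BITMAPINFOHEADER
        using var stream = new MemoryStream(headerBytes + pixelBytes);
        using var writer = new BinaryWriter(stream);   // always writes little-endian

        // File header (14 bytes).
        writer.Write((byte)'B');
        writer.Write((byte)'M');
        writer.Write(headerBytes + pixelBytes);   // total file size
        writer.Write(0);                          // reserved
        writer.Write(headerBytes);                // offset of the pixel data

        // BITMAPINFOHEADER (40 bytes).
        writer.Write(40);                         // header size
        writer.Write(width);
        writer.Write(-height);                    // negative height marks top-down rows
        writer.Write((short)1);                   // color planes
        writer.Write((short)32);                  // bits per pixel
        writer.Write(0);                          // BI_RGB, no compression
        writer.Write(pixelBytes);                 // size of the pixel data
        writer.Write(0);                          // horizontal resolution (unspecified)
        writer.Write(0);                          // vertical resolution (unspecified)
        writer.Write(0);                          // palette colors used
        writer.Write(0);                          // important colors

        writer.Write(bgra);
        writer.Flush();
        return stream.ToArray();
    }
}</pre></div>
<p><span>A possible call site, reusing the reader from the example above: <code><span>File.WriteAllBytes("Assets/output_image.bmp", BgraBmpWriter.Encode(pageReader.GetImage(), pageReader.GetPageWidth(), pageReader.GetPageHeight()));</span></code></span></p>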
<p class="maodian"><a name="_label10"></a></p><h2>Source Code Links</h2>
<p><span>For more features and usage details, see the project's open-source repository.</span></p>
<ul><li><strong>GitHub repository: </strong><span><a href="https://github.com/GowenGit/docnet" rel="external nofollow" target="_blank"><span>https://github.com/GowenGit/docnet</span></a></span></li><li><strong>Sample code for this article: </strong><span><a href="https://github.com/YSGStudyHards/DotNetExercises/tree/master/DocNETExercises" rel="external nofollow" target="_blank"><span>https://github.com/YSGStudyHards/DotNetExercises/tree/master/DocNETExercises</span></a></span></li></ul>