<h1 id="我做的ffmpeg开源c封装库sdcbffmpeg-">Sdcb.FFmpeg: My Open-Source C# Wrapper for FFmpeg <img src="https://github.com/sdcb/FFmpeg.AutoGen/actions/workflows/main.yml/badge.svg" alt="main" loading="lazy"></h1><blockquote>
<p>Foreword:</p>
<p>This was my talk at .NET Conf China 2022 in December 2022. Project page: https://github.com/sdcb/Sdcb.FFmpeg</p>
<p>The slides can be downloaded here: https://io.starworks.cc:88/cv-public/2022/.NET玩转音视频操作FFmpeg.pptx</p>
<p>The recording can be watched here (starting at 3:19:00): https://bbs.csdn.net/topics/609897502</p>
</blockquote>
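<p><code>FFmpeg</code> is a well-known audio and video processing tool that I use often at work and at home. I am also a <code>.NET</code> developer, and when I tried calling <code>FFmpeg</code> from <code>C#</code> I found these options:</p>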
<ul>
<li>Out-of-process invocation, for example:
<ul>
<li>FFmpeg.NET</li>
<li>MediaToolkit</li>
<li>Xabe.Ffmpeg</li>
</ul>
</li>
<li>Platform invoke (P/Invoke) on the C API, for example:
<ul>
<li>FFmpeg.AutoGen</li>
<li>EmguFFmpeg</li>
<li><strong>Sdcb.FFmpeg</strong></li>
</ul>
</li>
</ul>
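<p>The command-line (out-of-process) approach has these pros and cons:</p>
<ul>
<li>Pro: easy to learn, quick to get started, and no conflict with the GPL license</li>
<li>Con: it relies on inter-process interop, with state managed through redirected standard streams</li>
<li>Con: input and output go through files, so fine-grained control is hard</li>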
</ul>
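<p>To make the out-of-process approach concrete, here is a minimal sketch of how such a wrapper might describe an <code>ffmpeg</code> command line. <code>BuildFfmpegStartInfo</code> and the file names are hypothetical; the flags (<code>-y</code>, <code>-i</code>, <code>-c:v</code>, <code>-b:v</code>) are ordinary ffmpeg CLI options. The sketch only builds the process description and does not start anything, so it runs even without ffmpeg installed.</p>

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: builds (but does not start) a process description for the ffmpeg CLI.
ProcessStartInfo BuildFfmpegStartInfo(string inputPath, string outputPath)
{
    ProcessStartInfo psi = new("ffmpeg")
    {
        RedirectStandardError = true, // ffmpeg writes its log and progress to stderr
        UseShellExecute = false,      // required for stream redirection
        CreateNoWindow = true,
    };
    foreach (string arg in new[] { "-y", "-i", inputPath, "-c:v", "libx264", "-b:v", "600k", outputPath })
    {
        psi.ArgumentList.Add(arg);
    }
    return psi;
}

ProcessStartInfo info = BuildFfmpegStartInfo("input.mp4", "output.mp4");
Console.WriteLine($"{info.FileName} {string.Join(' ', info.ArgumentList)}");

// To actually run it (requires ffmpeg on PATH):
// using Process p = Process.Start(info)!;
// string log = p.StandardError.ReadToEnd();
// p.WaitForExit();
```

<p>Everything the caller learns about progress or errors has to be parsed out of that redirected text stream, which is the state-management drawback listed above.</p>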
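<p>Platform invoke on the C API solves several of these problems. Its pros and cons are:</p>
<ul>
<li>Pro: input and output can live in memory, and every frame can be controlled precisely</li>
<li>Pro: there is no cross-process overhead, so performance is more predictable</li>
<li>Con: C API code is fairly complex</li>
<li>Con: the industry mostly uses FFmpeg.AutoGen, which mixes C pointers into C#, and the result can be even harder to write than the C API itself</li>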
</ul>
<h2 id="我做了什么">What did I build?</h2>
<p>Given these difficulties, I took the widely used open-source project <code>FFmpeg.AutoGen</code> as a base and built my own <code>Sdcb.FFmpeg</code>. It has the following advantages:</p>
<ul>
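<li>It keeps the full ability to call the C API directly, and it stays cross-platform</li>
<li>The <code>ClangMacroParser</code> dependency was removed and completely rewritten, so more macros are parsed than in the original</li>
<li>Dynamic library loading changed from manual LoadLibrary calls to automatic <code>[DllImport]</code>, so on .NET Core the dlls are loaded automatically from NuGet packages, which is closer to .NET community conventions</li>
<li>All large binary dependencies and large binary history were removed from the repository and are now downloaded automatically, which shrinks the repository</li>
<li>Enum names were simplified, e.g. <code>AVCodecID.AV_CODEC_ID_H264</code> -> <code>AVCodecID.H264</code></li>
<li>Many C macros were turned into C# enums, e.g. <code>ffmpeg.AV_DICT_MATCH_CASE</code> -> <code>AV_DICT_READ.MatchCase</code></li>
<li>Besides the low-level bindings there is a mid-level (class) layer and a high-level (helper class) layer, for example <code>CodecContext</code> and <code>MediaDictionary</code></li>
<li>I publish the native dynamic libraries as <code>NuGet</code> packages, so a program runs without installing any external dependency</li>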
</ul>
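<p>The name simplification can be pictured with a small sketch. This is not the project's actual generator: <code>SimplifyName</code> is a hypothetical helper that assumes the rule is "strip the C prefix, then PascalCase the remaining parts", which happens to reproduce the names quoted in this article.</p>

```csharp
using System;
using System.Linq;

// Hypothetical helper, not the project's actual generator:
// strips a known C prefix and converts the remaining parts to PascalCase.
string SimplifyName(string cName, string prefix)
{
    string rest = cName.StartsWith(prefix, StringComparison.Ordinal) ? cName[prefix.Length..] : cName;
    return string.Concat(rest
        .Split('_', StringSplitOptions.RemoveEmptyEntries)
        .Select(part => char.ToUpperInvariant(part[0]) + part[1..].ToLowerInvariant()));
}

Console.WriteLine(SimplifyName("AV_CODEC_ID_H264", "AV_CODEC_ID_"));  // H264
Console.WriteLine(SimplifyName("AV_DICT_MATCH_CASE", "AV_DICT_"));    // MatchCase
Console.WriteLine(SimplifyName("AV_PIX_FMT_YUV420P", "AV_PIX_FMT_")); // Yuv420p
```

<p>The benefit for callers is that the enum type already carries the context, so the member name no longer needs to repeat it.</p>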
<h2 id="nuget包列表">NuGet packages</h2>
<ul>
<li>
<p>FFmpeg 5.x:</p>
<table>
<thead>
<tr>
<th>Package</th>
<th>Link</th>
</tr>
</thead>
<tbody>
<tr>
<td>Sdcb.FFmpeg</td>
<td><img src="https://img.shields.io/nuget/vpre/Sdcb.FFmpeg.svg" alt="NuGet" loading="lazy"></td>
</tr>
<tr>
<td>Sdcb.FFmpeg.runtime.windows-x64</td>
<td><img src="https://img.shields.io/nuget/v/Sdcb.FFmpeg.runtime.windows-x64.svg" alt="NuGet" loading="lazy"></td>
</tr>
</tbody>
</table>
</li>
<li>
<p>FFmpeg 4.4.x:</p>
<table>
<thead>
<tr>
<th>Package</th>
<th>Link</th>
</tr>
</thead>
<tbody>
<tr>
<td>Sdcb.FFmpeg</td>
<td><img src="https://img.shields.io/badge/nuget-4.4.2-blue" alt="NuGet" loading="lazy"></td>
</tr>
<tr>
<td>Sdcb.FFmpeg.runtime.windows-x64</td>
<td><img src="https://img.shields.io/badge/nuget-4.4.3-blue" alt="NuGet" loading="lazy"></td>
</tr>
</tbody>
</table>
</li>
</ul>
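<h3 id="linuxmacos下如何使用">How do I use it on Linux/macOS?</h3>
<p>On <code>Linux</code> you do not need these <code>NuGet</code> packages. There are many <code>Linux</code> distributions, and most of them package very common libraries such as <code>FFmpeg</code>. On <code>Ubuntu 22.04</code>, for example, the following commands install the <code>FFmpeg 5.x</code> shared libraries:</p>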
<pre><code class="language-bash">apt update
apt install software-properties-common
add-apt-repository ppa:savoury1/ffmpeg4 -y
add-apt-repository ppa:savoury1/ffmpeg5 -y
apt update
apt install ffmpeg -y
</code></pre>
<p>For <code>FFmpeg 4.x</code>, these commands install the shared libraries:</p>
<pre><code class="language-bash">apt update
apt install software-properties-common
add-apt-repository ppa:savoury1/ffmpeg4 -y
apt update
apt install ffmpeg -y
</code></pre>
<p>On <code>macOS</code>, this command installs the shared libraries:</p>
<pre><code class="language-bash">brew install ffmpeg
</code></pre>
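<p>A native <code>NuGet</code> package is usually tied to a specific <code>libc</code> and related libraries, so it does not generalize well across distributions, and <code>Linux</code> generally has better solutions. That is why I did not build a runtime <code>NuGet</code> package for <code>Linux</code>.</p>
<p>Make no mistake, though: <code>Sdcb.FFmpeg</code> is tested on <code>Linux</code> and runs well there. <code>GitHub Actions</code> test runs: https://github.com/sdcb/Sdcb.FFmpeg/actions</p>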
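<h2 id="为什么我要另起炉灶">Why did I start from scratch?</h2>
<p>Actually, I did not set out to start from scratch. At first I was inspired by EmguFFmpeg, a project by <em>于宏伟</em>, a well-known developer in Beijing. I agreed that <code>FFmpeg.AutoGen</code> was hard to use, but figured that depending on <code>FFmpeg.AutoGen</code> and adding a thin wrapper would save a lot of maintenance work. So through 2020 and 2021 I kept developing and maintaining an open-source project called Sdcb.FFmpegAPIWrapper, built entirely on <code>FFmpeg.AutoGen</code>. That project was basically complete at the time (it just lacked publicity, samples and tutorials).</p>
<p>As the project progressed, though, I increasingly felt that depending directly on <code>FFmpeg.AutoGen</code> made the code too "heavy". The same thing ended up with two spellings, a raw one and a "high-level" one (for example, <code>AVCodecID.AV_CODEC_ID_H264</code> and <code>AVCodecID.H264</code> existing side by side), which would very likely confuse users. After a long period of hesitation I finally decided to rework <code>FFmpeg.AutoGen</code> itself. The rework took about a year and led to the state you see today.</p>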
<h1 id="6个示例演示sdcbffmpeg">Sdcb.FFmpeg in 6 examples</h1>
<h2 id="示例1-纯代码生成视频">Example 1: Generating a video purely from code</h2>
<p>Think of this example as the "Hello World" of <code>FFmpeg</code>. It needs the following NuGet packages:</p>
<ul>
<li>Sdcb.FFmpeg 5.1.2</li>
<li>Sdcb.FFmpeg.runtime.windows-x64</li>
</ul>
<p>It needs the following namespaces:</p>
<ul>
<li>Sdcb.FFmpeg.Codecs</li>
<li>Sdcb.FFmpeg.Formats</li>
<li>Sdcb.FFmpeg.Raw</li>
<li>Sdcb.FFmpeg.Toolboxs.Extensions</li>
<li>Sdcb.FFmpeg.Toolboxs.Generators</li>
<li>Sdcb.FFmpeg.Utils</li>
</ul>
<p>Full code (<strong>click to expand</strong>):</p>
<details>
<pre><code class="language-csharp">// this example is based on Sdcb.FFmpeg 5.1.2
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);
using FormatContext fc = FormatContext.AllocOutput(formatName: "mp4");
fc.VideoCodec = Codec.CommonEncoders.Libx264;
MediaStream vstream = fc.NewStream(fc.VideoCodec);
using CodecContext vcodec = new CodecContext(fc.VideoCodec)
{
Width = 800,
Height = 600,
TimeBase = new AVRational(1, 30),
PixelFormat = AVPixelFormat.Yuv420p,
Flags = AV_CODEC_FLAG.GlobalHeader,
};
vcodec.Open(fc.VideoCodec);
vstream.Codecpar!.CopyFrom(vcodec);
vstream.TimeBase = vcodec.TimeBase;
string outputPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "muxing.mp4");
fc.DumpFormat(streamIndex: 0, outputPath, isOutput: true);
using IOContext io = IOContext.OpenWrite(outputPath);
fc.Pb = io;
fc.WriteHeader();
VideoFrameGenerator.Yuv420pSequence(vcodec.Width, vcodec.Height, 600)
.ConvertFrames(vcodec)
.EncodeAllFrames(fc, null, vcodec)
.WriteAll(fc);
fc.WriteTrailer();
</code></pre>
</details>
<p>After running it you should find a <code>muxing.mp4</code> file on the desktop, generated by the code above. The video looks like this:</p>
<p><img src="https://img2023.cnblogs.com/blog/233608/202302/233608-20230227093942090-499608836.webp" alt="" loading="lazy"></p>
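<p>Worth noting: I wrote <code>VideoFrameGenerator.Yuv420pSequence</code>, which takes a few parameters and returns an <code>IEnumerable&lt;Frame&gt;</code> (or, in other examples, an <code>IEnumerable&lt;Packet&gt;</code>). This pattern is very common in my project: it shows off how concise and expressive <code>C#</code> is while still taking care of resource management and memory release.</p>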
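<p>Why a lazy sequence helps with resource management can be shown without any FFmpeg types. The sketch below is an illustration under assumptions, not the library's code: <code>GenerateFrames</code> is a hypothetical stand-in for a frame generator, an <code>int[]</code> plays the role of a native frame, and the <code>finally</code> block plays the role of disposing it.</p>

```csharp
using System;
using System.Collections.Generic;

bool released = false;

// Stand-in for a frame generator: one buffer is reused for every item,
// and the finally block plays the role of `using Frame frame = new();`.
IEnumerable<int[]> GenerateFrames(int count)
{
    int[] buffer = new int[4];
    try
    {
        for (int i = 0; i < count; i++)
        {
            Array.Fill(buffer, i);
            yield return buffer;
        }
    }
    finally
    {
        released = true; // runs when enumeration completes or the consumer stops early
    }
}

int seen = 0;
foreach (int[] frame in GenerateFrames(600))
{
    seen++;
    if (frame[0] == 2) break; // stopping early still triggers the finally block
}
Console.WriteLine($"frames seen: {seen}, released: {released}"); // frames seen: 3, released: True
```

<p>Because <code>foreach</code> disposes the enumerator, cleanup happens exactly once whether the pipeline finishes, breaks early, or throws, and only one buffer is alive at a time.</p>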
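<h2 id="示例2-压制视频">Example 2: Compressing a video</h2>
<p>This example compresses a video to the parameters below, which are also the parameters that the WeChat Windows desktop client accepts without re-encoding the video:</p>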
<ul>
<li>Video codec: H264</li>
<li>Video bitrate: below 600 kbps</li>
<li>Video resolution: no limit, but a long edge of 960 is recommended</li>
<li>Audio codec: AAC</li>
<li>Audio bitrate: 48 kbps</li>
</ul>
<p>It needs the following NuGet packages:</p>
<ul>
<li>Sdcb.FFmpeg 5.1.2</li>
<li>Sdcb.FFmpeg.runtime.windows-x64</li>
</ul>
<p>It needs the following namespaces:</p>
<ul>
<li>Sdcb.FFmpeg.Codecs</li>
<li>Sdcb.FFmpeg.Common</li>
<li>Sdcb.FFmpeg.Filters</li>
<li>Sdcb.FFmpeg.Formats</li>
<li>Sdcb.FFmpeg.Raw</li>
<li>Sdcb.FFmpeg.Toolboxs</li>
<li>Sdcb.FFmpeg.Toolboxs.Extensions</li>
<li>Sdcb.FFmpeg.Toolboxs.FilterTools</li>
<li>Sdcb.FFmpeg.Toolboxs.Generators</li>
<li>Sdcb.FFmpeg.Utils</li>
<li>static Sdcb.FFmpeg.Raw.ffmpeg</li>
<li>System.Collections.Concurrent</li>
<li>System.Runtime.CompilerServices</li>
<li>System.Threading.Tasks</li>
</ul>
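<p>Full code (<strong>click to expand</strong>); <code>QueryCancelToken</code> is LINQPad's built-in cancellation token, so the script is meant to run in LINQPad:</p>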
<details>
<pre><code class="language-csharp">void Main()
{
FFmpegLogger.LogLevel = LogLevel.Error;
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);
Task.Run(() => A7r3VideoToWechat(@"Y:\a7r3\2022-12-12\C0060.MP4")).Wait();
}
void A7r3VideoToWechat(string mp4Path)
{
using FormatContext inFc = FormatContext.OpenInputUrl(mp4Path);
inFc.LoadStreamInfo();
// prepare input stream/codec
MediaStream inAudioStream = inFc.GetAudioStream();
using CodecContext audioDecoder = new(Codec.FindDecoderById(inAudioStream.Codecpar!.CodecId));
audioDecoder.FillParameters(inAudioStream.Codecpar);
audioDecoder.Open();
audioDecoder.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(audioDecoder.Channels);
MediaStream inVideoStream = inFc.GetVideoStream();
using CodecContext videoDecoder = new(Codec.FindDecoderByName("h264_cuvid"));
videoDecoder.FillParameters(inVideoStream.Codecpar!);
videoDecoder.Open();
// dest file
string destFile = Path.Combine(Path.GetDirectoryName(mp4Path)!, Path.GetFileNameWithoutExtension(mp4Path) + "_wechat.mp4");
using FormatContext outFc = FormatContext.AllocOutput(fileName: destFile);
// dest encoder and streams
outFc.AudioCodec = Codec.CommonEncoders.AAC;
MediaStream outAudioStream = outFc.NewStream(outFc.AudioCodec);
using CodecContext audioEncoder = new(outFc.AudioCodec)
{
Channels = 1,
SampleFormat = outFc.AudioCodec.Value.NegociateSampleFormat(AVSampleFormat.Fltp),
SampleRate = outFc.AudioCodec.Value.NegociateSampleRates(48000),
BitRate = 48000
};
audioEncoder.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(audioEncoder.Channels);
audioEncoder.TimeBase = new AVRational(1, audioEncoder.SampleRate);
audioEncoder.Open(outFc.AudioCodec);
outAudioStream.Codecpar!.CopyFrom(audioEncoder);
outFc.VideoCodec = Codec.FindEncoderByName("libx264");
MediaStream outVideoStream = outFc.NewStream(outFc.VideoCodec);
using VideoFilterContext vfilter = VideoFilterContext.Create(inVideoStream, "scale=1920:-1");
using CodecContext videoEncoder = new(outFc.VideoCodec)
{
Flags = AV_CODEC_FLAG.GlobalHeader,
ThreadCount = Environment.ProcessorCount,
ThreadType = ffmpeg.FF_THREAD_FRAME,
BitRate = 595_000
};
vfilter.ConfigureEncoder(videoEncoder);
var dict = new MediaDictionary
{
//["qp"] = "30",
["tune"] = "zerolatency",
["preset"] = "veryfast"
};
videoEncoder.Open(outFc.VideoCodec, dict);
//dict.Dump();
outVideoStream.Codecpar!.CopyFrom(videoEncoder);
outVideoStream.TimeBase = videoEncoder.TimeBase;
// begin write
using IOContext io = IOContext.OpenWrite(destFile);
outFc.Pb = io;
outFc.WriteHeader();
MediaThreadQueue<Frame> decodingQueue = inFc
.ReadPackets(inVideoStream.Index, inAudioStream.Index)
.DecodeAllPackets(inFc, audioDecoder, videoDecoder)
.ToThreadQueue(cancellationToken: QueryCancelToken, boundedCapacity: 64);
MediaThreadQueue<Packet> encodingQueue = decodingQueue.GetConsumingEnumerable()
.ApplyVideoFilters(vfilter)
.ConvertAllFrames(audioEncoder, videoEncoder)
.AudioFifo(audioEncoder)
.EncodeAllFrames(outFc, audioEncoder, videoEncoder)
.ToThreadQueue(cancellationToken: QueryCancelToken);
CancellationTokenSource end = new();
QueryCancelToken.Register(() => end.Cancel());
Dictionary<int, PtsDts> ptsDts = new();
Task.Run(async () =>
{
double totalDuration = Math.Max(inVideoStream.GetDurationInSeconds(), inAudioStream.GetDurationInSeconds());
try
{
while (!end.IsCancellationRequested)
{
Log();
await Task.Delay(1000, end.Token);
}
}
finally
{
Log();
}
void Log() => Console.WriteLine($"{GetStatusText()}, dec/enc queue: {decodingQueue.Count}/{encodingQueue.Count}");
string GetStatusText() => $"{(outVideoStream.TimeBase * ptsDts.GetValueOrDefault(outVideoStream.Index, PtsDts.Default).Dts).ToDouble():F2} of {totalDuration:F2}";
});
encodingQueue.GetConsumingEnumerable()
.RecordPtsDts(ptsDts)
.WriteAll(outFc);
end.Cancel();
outFc.WriteTrailer();
}
</code></pre>
</details>
<p>Result (a file of more than 500 MB compressed to 5 MB):</p>
<p><img src="https://img2023.cnblogs.com/blog/233608/202302/233608-20230227094031386-1349111015.png" alt="" loading="lazy"></p>
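<p>The output size follows almost directly from the bitrates. As a rough sanity check, the sketch below multiplies the two bitrates from the listing by a duration. The 60-second duration is an assumed example value (the article does not state the clip length), and container overhead is ignored.</p>

```csharp
using System;
using System.Globalization;

// Estimated payload in bytes for roughly constant bitrates (muxing overhead ignored).
double EstimateBytes(double videoBitsPerSecond, double audioBitsPerSecond, double seconds)
    => (videoBitsPerSecond + audioBitsPerSecond) * seconds / 8;

// 595_000 and 48_000 come from the listing; the 60-second duration is an assumed example value.
double bytes = EstimateBytes(595_000, 48_000, 60);
Console.WriteLine((bytes / 1_000_000).ToString("F2", CultureInfo.InvariantCulture) + " MB"); // 4.82 MB
```

<p>Under that assumption, a clip of about one minute lands near the 5 MB figure regardless of how large the source file was.</p>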
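<p>Worth noting: <code>MediaThreadQueue&lt;Frame&gt;</code> and <code>MediaThreadQueue&lt;Packet&gt;</code> are both built on <code>C#</code>'s <code>BlockingCollection</code> plus multiple threads, so decoding, encoding and writing can overlap, which can improve throughput.</p>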
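<p>The idea behind such a queue can be sketched with the standard library alone. This is not the implementation of <code>MediaThreadQueue</code>: <code>ToQueue</code> is a hypothetical stand-in that pumps a lazy sequence into a bounded <code>BlockingCollection</code> on a worker thread, and integers and strings stand in for frames and packets.</p>

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for ToThreadQueue: pumps a lazy sequence on a worker thread
// into a bounded queue, so the producer blocks when the consumer falls behind.
BlockingCollection<T> ToQueue<T>(IEnumerable<T> source, int boundedCapacity)
{
    BlockingCollection<T> queue = new(boundedCapacity);
    Task.Run(() =>
    {
        try
        {
            foreach (T item in source) queue.Add(item); // blocks while the queue is full
        }
        finally
        {
            queue.CompleteAdding(); // lets the consumer's enumeration end
        }
    });
    return queue;
}

// Two chained stages, mirroring the decode -> encode stages of the example.
BlockingCollection<int> decoded = ToQueue(Enumerable.Range(0, 1000), boundedCapacity: 64);
BlockingCollection<string> encoded = ToQueue(
    decoded.GetConsumingEnumerable().Select(i => $"pkt{i}"), boundedCapacity: 64);

List<string> written = encoded.GetConsumingEnumerable().ToList();
Console.WriteLine($"{written.Count} items, last = {written[^1]}"); // 1000 items, last = pkt999
```

<p>The bounded capacity is the important design choice: it caps how many decoded frames sit in memory at once, which matters when each frame is several megabytes.</p>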
<h2 id="示例3-创建gif表情包">Example 3: Creating a GIF (a meme?)</h2>
<p>I set up a demo website for this feature. Click the "Generate" button there and you get a meme like this:</p>
<p><img src="https://ffmpeg-sorry-demo.starworks.cc:88/sorry/generate?type=wjz&subtitle=%E8%BF%98%E6%84%A3%E7%9D%80%E5%B9%B2%E5%98%9B%7C%E4%B8%8A%E9%A1%B5%E9%9D%A2%E6%98%BE%E7%A4%BA%7C%E4%B8%8A%E6%8A%A5%E9%94%99%E6%97%A5%E5%BF%97%7C%E4%BD%A0%E6%89%BE%E5%88%AB%E4%BA%BA%E5%90%A7%EF%BC%8C%E6%88%91%E4%B8%8D%E4%BC%9A" alt="" loading="lazy"></p>
<p>I uploaded the complete <code>Visual Studio</code> sample to GitHub. Download it here: https://github.com/sdcb/ffmpeg-wjz-sorry-generator</p>
<p>It involves these steps and key points:</p>
<ol>
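<li>Decode the video</li>
<li>Convert each frame to the BGRA pixel format</li>
<li>Read the subtitles and draw them with Direct2D</li>
<li>Feed each frame into a video filter that converts it to PAL8</li>
<li>Encode the PAL8 frames as a GIF</li>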
</ol>
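<p>Note that this demo uses <code>Direct2D</code> through the open-source project Vortice.Windows.</p>
<h2 id="示例4-实际桌面投屏远程桌面">Example 4: Real-time desktop casting (remote desktop?)</h2>
<p>This example sends the screen content of one computer to another computer over the network in real time, with fairly low network overhead. Use cases include live video calls, remote screen casting, and remote desktop control.</p>
<p>The code has two parts: the <strong>capture-encode-send side</strong> and the <strong>receive-decode-display side</strong>.</p>
<h3 id="桌面录制-编码-发送端完整源代码">Full source of the capture-encode-send side</h3>
<p>Required <code>NuGet</code> packages:</p>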
<ul>
<li>Sdcb.FFmpeg 4.4.3</li>
<li>Sdcb.FFmpeg.runtime.windows-x64 4.4.3</li>
<li>Sdcb.ScreenCapture</li>
</ul>
<p>Full source (<strong>click to expand</strong>):</p>
<details>
<pre><code class="language-csharp">// This example was initially written based on Sdcb.FFmpeg 4.4.3 & Sdcb.ScreenCapture
void Main()
{
StartService(QueryCancelToken);
}
void StartService(CancellationToken cancellationToken = default)
{
var tcpListener = new TcpListener(IPAddress.Any, 5555);
cancellationToken.Register(() => tcpListener.Stop());
tcpListener.Start();
while (!cancellationToken.IsCancellationRequested)
{
TcpClient client = tcpListener.AcceptTcpClient();
Task.Run(() => ServeClient(client, cancellationToken));
}
}
void ServeClient(TcpClient tcpClient, CancellationToken cancellationToken = default)
{
try
{
using var _ = tcpClient;
using NetworkStream stream = tcpClient.GetStream();
using BinaryWriter writer = new(stream);
RectI screenSize = ScreenCapture.GetScreenSize(screenId: 0);
RdpCodecParameter rcp = new(AVCodecID.H264, screenSize.Width, screenSize.Height, AVPixelFormat.Bgr0);
using CodecContext cc = new(Codec.CommonEncoders.Libx264RGB)
{
Width = rcp.Width,
Height = rcp.Height,
PixelFormat = rcp.PixelFormat,
TimeBase = new AVRational(1, 20),
};
cc.Open(null, new MediaDictionary
{
["crf"] = "30",
["tune"] = "zerolatency",
["preset"] = "veryfast"
});
writer.Write(rcp.ToArray());
using Frame source = new();
foreach (Packet packet in ScreenCapture
.CaptureScreenFrames(screenId: 0)
.ToBgraFrame()
.ConvertFrames(cc)
.EncodeFrames(cc))
{
if (cancellationToken.IsCancellationRequested)
{
break;
}
writer.Write(packet.Data.Length);
writer.Write(packet.Data.AsSpan());
}
}
catch (IOException ex)
{
// Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host.
// Unable to write data to the transport connection: An established connection was aborted by the software in your host machine.
ex.Dump();
}
}
public class Filo<T> : IDisposable
{
private T? Item { get; set; }
private ManualResetEventSlim Notify { get; } = new ManualResetEventSlim();
public void Update(T item)
{
Item = item;
Notify.Set();
}
public IEnumerable<T> Consume(CancellationToken cancellationToken = default)
{
while (!cancellationToken.IsCancellationRequested)
{
Notify.Wait(cancellationToken);
yield return Item!;
}
}
public void Dispose() => Notify.Dispose();
}
public static class BgraFrameExtensions
{
public static IEnumerable<Frame> ToBgraFrame(this IEnumerable<LockedBgraFrame> bgras)
{
using Frame frame = new Frame();
foreach (LockedBgraFrame bgra in bgras)
{
frame.Width = bgra.Width;
frame.Height = bgra.Height;
frame.Format = (int)AVPixelFormat.Bgra;
frame.Data[0] = bgra.DataPointer; // plane 0 points at the captured BGRA buffer (no copy)
frame.Linesize[0] = bgra.RowPitch; // bytes per row of plane 0
yield return frame;
}
}
}
record RdpCodecParameter(AVCodecID CodecId, int Width, int Height, AVPixelFormat PixelFormat)
{
public byte[] ToArray()
{
byte[] data = new byte[16]; // four little-endian int32 fields
Span<byte> span = data.AsSpan();
BinaryPrimitives.WriteInt32LittleEndian(span, (int)CodecId);
BinaryPrimitives.WriteInt32LittleEndian(span[4..], Width);
BinaryPrimitives.WriteInt32LittleEndian(span[8..], Height);
BinaryPrimitives.WriteInt32LittleEndian(span[12..], (int)PixelFormat);
return data;
}
}
</code></pre>
</details>
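<p>The wire format used between the two sides is simple enough to test in isolation: a 16-byte header of four little-endian <code>int32</code> values, followed by packets that are each prefixed with a 4-byte length. The sketch below replays that framing against a <code>MemoryStream</code> instead of a socket. <code>WriteHeader</code> and <code>ReadHeader</code> are hypothetical helpers, and the header values are placeholders rather than real enum mappings.</p>

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.IO;

// Header: four little-endian int32 values at offsets 0, 4, 8 and 12.
byte[] WriteHeader(int codecId, int width, int height, int pixelFormat)
{
    byte[] data = new byte[16];
    BinaryPrimitives.WriteInt32LittleEndian(data.AsSpan(0), codecId);
    BinaryPrimitives.WriteInt32LittleEndian(data.AsSpan(4), width);
    BinaryPrimitives.WriteInt32LittleEndian(data.AsSpan(8), height);
    BinaryPrimitives.WriteInt32LittleEndian(data.AsSpan(12), pixelFormat);
    return data;
}

(int codecId, int width, int height, int pixelFormat) ReadHeader(ReadOnlySpan<byte> data) => (
    BinaryPrimitives.ReadInt32LittleEndian(data),
    BinaryPrimitives.ReadInt32LittleEndian(data[4..]),
    BinaryPrimitives.ReadInt32LittleEndian(data[8..]),
    BinaryPrimitives.ReadInt32LittleEndian(data[12..]));

// Simulate the connection with a MemoryStream: header first, then length-prefixed payloads.
using MemoryStream wire = new();
using (BinaryWriter writer = new(wire, System.Text.Encoding.UTF8, leaveOpen: true))
{
    writer.Write(WriteHeader(27, 3840, 2160, 0)); // placeholder values, not real enum mappings
    foreach (byte[] payload in new[] { new byte[] { 1, 2, 3 }, new byte[] { 4, 5 } })
    {
        writer.Write(payload.Length); // 4-byte little-endian length prefix
        writer.Write(payload);
    }
}

wire.Position = 0;
using BinaryReader reader = new(wire);
var header = ReadHeader(reader.ReadBytes(16));
List<int> sizes = new();
while (wire.Position < wire.Length)
{
    int size = reader.ReadInt32();
    sizes.Add(reader.ReadBytes(size).Length);
}
Console.WriteLine($"{header.width}x{header.height}, packets: {string.Join(",", sizes)}");
// 3840x2160, packets: 3,2
```

<p>Each field has to be written and read at its own offset; writing all four values at offset 0 would leave only the last one in the header.</p>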
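<p>Worth noting: I also made the <code>Sdcb.ScreenCapture</code> <code>NuGet</code> package. It is based on <code>DXGI</code>, captures the screen with zero memory copies, and can record at 60 frames per second with very low CPU usage. I will leave a promise here to introduce this open-source project when I get the chance. GitHub: https://github.com/sdcb/Sdcb.ScreenCapture</p>
<h3 id="远程接收-解码-显示端完整源代码">Full source of the receive-decode-display side</h3>
<p>Required NuGet packages:</p>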
<ul>
<li>Sdcb.FFmpeg 4.4.3</li>
<li>Sdcb.FFmpeg.runtime.windows-x64 4.4.3</li>
<li>FlysEngine.Desktop</li>
</ul>
<p><strong>Click to expand</strong>:</p>
<details>
<pre><code class="language-csharp">// This example was initially written based on Sdcb.FFmpeg 4.4.3 & FlysEngine.Desktop
#nullable enable
ManagedBgraFrame? managedFrame = null;
bool cancel = false;
unsafe void Main()
{
using RenderWindow w = new();
w.FormClosed += delegate { cancel = true; };
Task decodingTask = Task.Run(() => DecodeThread(() => (3840, 2160)));
w.Draw += (_, ctx) =>
{
ctx.Clear(Colors.CornflowerBlue);
if (managedFrame == null) return;
ManagedBgraFrame frame = managedFrame.Value;
fixed (byte* ptr = frame.Data)
{
//new System.Drawing.Bitmap(frame.Width, frame.Height, frame.RowPitch, System.Drawing.Imaging.PixelFormat.Format32bppPArgb, (IntPtr)ptr).DumpUnscaled();
BitmapProperties1 props = new(new PixelFormat(Format.B8G8R8A8_UNorm, Vortice.DCommon.AlphaMode.Premultiplied));
using ID2D1Bitmap bmp = ctx.CreateBitmap(new SizeI(frame.Width, frame.Height), (IntPtr)ptr, frame.RowPitch, props);
ctx.UnitMode = UnitMode.Dips;
ctx.DrawBitmap(bmp, 1.0f, InterpolationMode.NearestNeighbor);
}
};
RenderLoop.Run(w, () => w.Render(1, Vortice.DXGI.PresentFlags.None));
}
async Task DecodeThread(Func<(int width, int height)> sizeAccessor)
{
using TcpClient client = new TcpClient();
await client.ConnectAsync(IPAddress.Loopback, 5555);
using NetworkStream stream = client.GetStream();
using BinaryReader reader = new(stream);
RdpCodecParameter rcp = RdpCodecParameter.FromSpan(reader.ReadBytes(16));
using CodecContext cc = new(Codec.FindDecoderById(rcp.CodecId))
{
Width = rcp.Width,
Height = rcp.Height,
PixelFormat = rcp.PixelFormat,
};
cc.Open(null);
foreach (var frame in reader
.ReadPackets()
.DecodePackets(cc)
.ConvertVideoFrames(sizeAccessor, AVPixelFormat.Bgra)
.ToManaged()
)
{
if (cancel) break;
managedFrame = frame;
}
}
public static class FramesExtensions
{
public static IEnumerable<ManagedBgraFrame> ToManaged(this IEnumerable<Frame> bgraFrames, bool unref = true)
{
foreach (Frame frame in bgraFrames)
{
int rowPitch = frame.Linesize[0]; // bytes per row of plane 0
int length = rowPitch * frame.Height;
byte[] buffer = new byte[length];
Marshal.Copy(frame.Data._0, buffer, 0, length);
ManagedBgraFrame managed = new(buffer, length, length / frame.Height);
if (unref) frame.Unref();
yield return managed;
}
}
}
public record struct ManagedBgraFrame(byte[] Data, int Length, int RowPitch)
{
public int Width => RowPitch / BytePerPixel;
public int Height => Length / RowPitch;
public const int BytePerPixel = 4;
}
public static class ReadPacketExtensions
{
public static IEnumerable<Packet> ReadPackets(this BinaryReader reader)
{
using Packet packet = new();
while (true)
{
int packetSize = reader.ReadInt32();
if (packetSize == 0) yield break;
byte[] data = reader.ReadBytes(packetSize);
GCHandle dataHandle = GCHandle.Alloc(data, GCHandleType.Pinned);
try
{
packet.Data = new DataPointer(dataHandle.AddrOfPinnedObject(), packetSize);
yield return packet;
}
finally
{
dataHandle.Free();
}
}
}
}
record RdpCodecParameter(AVCodecID CodecId, int Width, int Height, AVPixelFormat PixelFormat)
{
public static RdpCodecParameter FromSpan(ReadOnlySpan<byte> data)
{
return new RdpCodecParameter(
CodecId: (AVCodecID)BinaryPrimitives.ReadInt32LittleEndian(data),
Width: BinaryPrimitives.ReadInt32LittleEndian(data[4..]),
Height: BinaryPrimitives.ReadInt32LittleEndian(data[8..]),
PixelFormat: (AVPixelFormat)BinaryPrimitives.ReadInt32LittleEndian(data[12..]));
}
}
</code></pre>
</details>
<p>The two sides running together look like this:</p>
<p><img src="https://img2023.cnblogs.com/blog/233608/202302/233608-20230227094136212-1033980532.png" alt="" loading="lazy"></p>
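<p>The transport latency is around <code>0.28</code> seconds. This is my <code>4K</code> monitor, encoded with <code>libx264</code> and transmitted as <code>yuv420p</code>, so it should be good enough for online meeting demos, screen-cast streaming and remote control (at 1080p the latency should be lower).</p>
<p>Note that this source uses FlysEngine, an open-source <code>Direct2D</code> wrapper engine I wrote myself. You do not need to care about its details (installing the NuGet package is enough). If you happen to be curious, I will leave another promise here to introduce it some day; until then, all you need to know is that it is only a thin wrapper over D3D11, DXGI, Direct2D, WIC and DirectWrite.</p>
<h2 id="示例5-接收显示rtsp摄像头视频">Example 5: Receiving and displaying an RTSP camera stream</h2>
<p>This program depends on the following NuGet packages:</p>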
<ul>
<li>FlysEngine.Desktop</li>
<li>Sdcb.FFmpeg 4.4.3</li>
<li>Sdcb.FFmpeg.runtime.windows-x64 4.4.3</li>
</ul>
<p>Full code (<strong>click to expand</strong>):</p>
<details>
<pre><code class="language-csharp">#nullable enable
FFmpegBmp? ffBmp = null;
FFmpegBmp? lastFFbmp = null;
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);
CancellationTokenSource cts = new();
using RenderWindow w = new();
Task.Run(() => DecodeRTSP(Util.GetPassword("home-rtsp-ipc"), cts.Token));
w.Draw += (_, ctx) =>
{
if (ffBmp == null) return;
if (lastFFbmp == ffBmp) return;
GCHandle handle = GCHandle.Alloc(ffBmp.Data, GCHandleType.Pinned);
try
{
using ID2D1Bitmap bmp = ctx.CreateBitmap(new SizeI(ffBmp.Width, ffBmp.Height), handle.AddrOfPinnedObject(), ffBmp.RowPitch, new BitmapProperties(new Vortice.DCommon.PixelFormat(Format.B8G8R8A8_UNorm, Vortice.DCommon.AlphaMode.Premultiplied)));
lastFFbmp = ffBmp;
Size clientSize = ctx.Size;
float top = (clientSize.Height - ffBmp.Height) / 2;
ctx.Transform = Matrix3x2.CreateTranslation(0, top);
ctx.DrawBitmap(bmp, 1.0f, InterpolationMode.Linear);
}
finally
{
handle.Free();
}
};
w.FormClosing += delegate { cts.Cancel(); };
RenderLoop.Run(w, () => w.Render(1, Vortice.DXGI.PresentFlags.None));
void DecodeRTSP(string url, CancellationToken cancellationToken = default)
{
using FormatContext fc = FormatContext.OpenInputUrl(url);
fc.LoadStreamInfo();
MediaStream videoStream = fc.GetVideoStream();
using CodecContext videoDecoder = new CodecContext(Codec.FindDecoderByName("hevc_qsv"));
videoDecoder.FillParameters(videoStream.Codecpar!);
videoDecoder.Open();
foreach (Frame frame in fc
.ReadPackets(videoStream.Index)
.DecodePackets(videoDecoder)
.ConvertVideoFrames(() => new(w.ClientSize.Width, w.ClientSize.Width * videoDecoder.Height / videoDecoder.Width), AVPixelFormat.Bgr0))
{
if (cancellationToken.IsCancellationRequested) break;
try
{
byte[] data = new byte[frame.Linesize[0] * frame.Height];
Marshal.Copy(frame.Data._0, data, 0, data.Length);
ffBmp = new FFmpegBmp(frame.Width, frame.Height, frame.Linesize[0], data);
}
finally
{
frame.Unref();
}
}
}
public record FFmpegBmp(int Width, int Height, int RowPitch, byte[] Data);
</code></pre>
</details>
<p>The camera at my family home in the countryside is an RTSP camera. This is the code above running against it:</p>
<p><img src="https://img2023.cnblogs.com/blog/233608/202302/233608-20230227094100664-1586363563.png" alt="" loading="lazy"></p>
<h2 id="示例6-读rtsp流并保存为mp4mov文件">Example 6: Reading an RTSP stream and saving it as mp4/mov files</h2>
<p>This example depends on the following <code>NuGet</code> packages:</p>
<ul>
<li>Sdcb.FFmpeg 4.4.3</li>
<li>Sdcb.FFmpeg.runtime.windows-x64 4.4.3</li>
</ul>
<p>Full code (<strong>click to expand</strong>):</p>
<details>
<pre><code class="language-csharp">// The example was initially written using Sdcb.FFmpeg 4.4.3
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);
using FormatContext inFc = FormatContext.OpenInputUrl(Util.GetPassword("home-rtsp-ipc"));
inFc.LoadStreamInfo();
MediaStream inAudioStream = inFc.GetAudioStream();
MediaStream inVideoStream = inFc.GetVideoStream();
long gpts_v = 0, gpts_a = 0, gdts_v = 0, gdts_a = 0;
while (!QueryCancelToken.IsCancellationRequested)
{
using FormatContext outFc = FormatContext.AllocOutput(formatName: "mov");
string dir = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "rtsp", DateTime.Now.ToString("yyyy-MM-dd"));
Directory.CreateDirectory(dir);
using IOContext io = IOContext.OpenWrite(Path.Combine(dir, $"{DateTime.Now:HHmmss}.mov"));
outFc.Pb = io;
MediaStream videoStream = outFc.NewStream(Codec.FindEncoderById(inVideoStream.Codecpar!.CodecId));
videoStream.Codecpar!.CopyFrom(inVideoStream.Codecpar);
videoStream.TimeBase = inVideoStream.RFrameRate.Inverse();
videoStream.SampleAspectRatio = inVideoStream.SampleAspectRatio;
MediaStream audioStream = outFc.NewStream(Codec.FindEncoderById(inAudioStream.Codecpar!.CodecId));
audioStream.Codecpar!.CopyFrom(inAudioStream.Codecpar);
audioStream.TimeBase = inAudioStream.TimeBase;
audioStream.Codecpar.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(inAudioStream.Codecpar.Channels);
outFc.WriteHeader();
FilterPackets(inFc.ReadPackets(inAudioStream.Index, inVideoStream.Index), videoFrameCount: 60 * 20)
.WriteAll(outFc);
outFc.WriteTrailer();
IEnumerable<Packet> FilterPackets(IEnumerable<Packet> packets, int videoFrameCount)
{
long pts_v = gpts_v, pts_a = gpts_a, dts_v = gdts_v, dts_a = gdts_a;
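long[] buffer = new long[15]; // packet-size samples; the array length was lost in extraction, 15 is an assumed value
long ithreshold = -1;
int videoFrame = 0;
foreach (Packet pkt in packets)
{
pkt.StreamIndex = pkt.StreamIndex == inAudioStream.Index ?
audioStream.Index :
videoStream.Index;
if (pkt.StreamIndex == audioStream.Index) // compare against the output index assigned just above
{
// audio
(gpts_a, gdts_a, pkt.Pts, pkt.Dts) = (pkt.Pts, pkt.Dts, pkt.Pts - pts_a, pkt.Dts - dts_a);
pkt.RescaleTimestamp(inAudioStream.TimeBase, audioStream.TimeBase);
}
else
{
// video
if (videoFrame < buffer.Length)
{
buffer[videoFrame] = pkt.Data.Length; // collect the sizes of the first packets
ithreshold = -1;
}
else if (videoFrame == buffer.Length)
{
// the index was lost in extraction; taking the median (middle element) is an assumption
ithreshold = buffer.Order().ToArray()[buffer.Length / 2] * 4;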
}
if (videoFrame >= videoFrameCount && pkt.Data.Length > ithreshold)
{
break;
}
(gpts_v, gdts_v, pkt.Pts, pkt.Dts) = (pkt.Pts, pkt.Dts, pkt.Pts - pts_v, pkt.Dts - dts_v);
pkt.RescaleTimestamp(inVideoStream.TimeBase, videoStream.TimeBase);
videoFrame++;
}
yield return pkt;
}
}
}
</code></pre>
</details>
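<p>This program can run around the clock. While it runs, the complete video and audio from the RTSP camera is saved into a folder on the desktop, one file for roughly every 1.5 minutes (see the screenshot):</p>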
<p><img src="https://img2023.cnblogs.com/blog/233608/202302/233608-20230227094216413-1640009844.png" alt="" loading="lazy"></p>
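<p>The listing decides where to end each file by looking at packet sizes: once enough video frames are written, it stops at the first packet that is much larger than a threshold derived from the first few packets, presumably so that the next file starts on a keyframe. The sketch below isolates that heuristic on synthetic numbers. <code>FindCutIndex</code> is a hypothetical helper, and using the median of the samples is an assumption, since the exact statistic is not visible in the listing.</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Returns how many packets go into the current segment: keep at least `minFrames`,
// then stop at the first packet much larger than the median of the first `sampleCount`.
int FindCutIndex(IReadOnlyList<long> packetSizes, int minFrames, int sampleCount, int factor)
{
    long threshold = -1;
    for (int i = 0; i < packetSizes.Count; i++)
    {
        if (i == sampleCount)
        {
            long median = packetSizes.Take(sampleCount).OrderBy(x => x).ElementAt(sampleCount / 2);
            threshold = median * factor;
        }
        if (i >= minFrames && packetSizes[i] > threshold) return i; // likely a keyframe
    }
    return packetSizes.Count;
}

// Synthetic sizes: a large "keyframe" every 10 packets, small packets otherwise.
long[] sizes = Enumerable.Range(0, 40).Select(i => i % 10 == 0 ? 50_000L : 2_000L).ToArray();
int cut = FindCutIndex(sizes, minFrames: 25, sampleCount: 15, factor: 4);
Console.WriteLine(cut); // 30
```

<p>With <code>videoFrameCount: 60 * 20</code> in the listing, a file is closed at the first large packet after 1200 video frames, which is where the roughly 1.5-minute file length comes from.</p>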
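<p>With this, it might even be possible to replace a dedicated video recorder.</p>
<h1 id="总结与展望">Summary and outlook</h1>
<p>I believe there is a difference between getting something built and building it well. In the past, things in <code>C#</code> were merely in a "usable" state, which is fundamentally different from what the many geeks working with <code>node.js</code> or <code>python</code> enjoy. I hope this open-source project is a step toward ".NET as a first-class citizen".</p>
<p>Maintaining open source is not easy. If you like the project, please give it a like and a star: https://github.com/sdcb/Sdcb.FFmpeg</p>
<p>I also want to set myself a goal: in the future I hope to wrap <code>FlyCV</code>, <code>libyuv</code>, <code>x264</code> and <code>libaom-av1</code>, and maybe one day even get the chance to build a <code>.NET</code> version of <code>FFmpeg</code>.</p>
<p>If you like this, please follow my WeChat official account: 【DotNet骚操作】</p>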
<p><img src="https://img2018.cnblogs.com/blog/233608/201908/233608-20190825165420518-990227633.jpg" alt="DotNet骚操作" loading="lazy"></p><br><br>
Source: https://www.cnblogs.com/sdcb/p/dotnet-conf-china-2022-ffmpeg.html