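Audio Noise Reduction with RNNoise
<p><span><span>Operating system: <span>Debian 12.5_x64 &amp; Windows10_x64</span></span></span></p><p><span>rnnoise version: 0.2</span></p>
<p><span>gcc version: 12.2.0</span></p>
<p><span>Python version: 3.9.13</span></p>
<p><span>RNNoise is an open-source real-time audio noise suppression library that combines traditional digital signal processing with deep learning<span>. It denoises audio with millisecond-level latency while using very little compute. This post collects my notes on the topic; I hope you find them useful.</span></span></p>
<p><span>The algorithm behind the library is described in the paper "A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement":</span></p>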
<p>https://arxiv.org/pdf/1709.08243</p>
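<p><span>If the link does not open, the paper is also available through the channel listed at the end of this post.</span></p>
<p><span>I previously wrote up how to denoise audio files with noisereduce, FFT, and Audacity; if needed, <span>see:</span></span></p>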
<div>https://www.cnblogs.com/MikeZhang/p/18313792/pynr20240720</div>
<h1><span><span>I. Building RNNoise and a C Usage Example</span></span></h1>
<div>GitHub repository:</div>
<div>https://github.com/xiph/rnnoise</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031204538619-1096569242.png" alt="image" width="1392" height="799" loading="lazy"> </p>
</div>
<div>
<h2 id="BN9o-1761914742513">1. Building and File Overview</h2>
<div>Build steps:</div>
<div data-theme="default" data-language="">
<div class="cnblogs_code">
<pre>./autogen.<span style="color: rgba(0, 0, 255, 1)">sh</span><span style="color: rgba(0, 0, 0, 1)">
.</span>/<span style="color: rgba(0, 0, 0, 1)">configure
</span><span style="color: rgba(0, 0, 255, 1)">make</span></pre>
</div>
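<p>Running ./autogen.sh downloads the model files (the pretrained model data for the RNNoise project; if the download is too slow, they are available through the channel listed at the end of this post):</p>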
</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031204648578-1486937037.png" alt="image" loading="lazy"></p>
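<p><span>rnnoise_data mainly contains the project's pretrained model weights, <span>so that after building RNNoise you can use its noise suppression directly without training a model from scratch.</span></span></p>
<p><span>The rnnoise_data archive contains C source files and .pth files:</span></p>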
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031204712244-931314406.png" alt="image" loading="lazy"></p>
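<p><span>Of the .c and .pth files shown above:</span></p>
<p><span>The .c files are generated from the .pth files. They store the pretrained weights, embedding the neural network weights as C arrays that functions such as rnnoise_process_frame use directly at denoising time.</span></p>
<p><span>The .pth files store the trained model for research, fine-tuning, or retraining; they are not needed at RNNoise runtime.</span></p>
<p><span>Usage notes:</span></p>
<p><span>1) If you only need RNNoise's noise suppression, the compiled library and its API are all you need to care about.</span></p>
<p><span>2) Only if you need to tune the model or adapt it to a special scenario do you need to look at the .pth files and the project's training scripts.</span></p>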
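<h2><span><span>2. Verifying the Denoising Result</span></span></h2>
<p><span>The examples directory contains a ready-to-run demo program; it needs an input audio file in s16le, 48 kHz format.</span></p>
<p><span>The output is a PCM file.</span></p>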
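<p>Because the demo reads and writes headerless PCM, it can help to convert between WAV and raw s16le. The following is a minimal sketch using only Python's standard <code>wave</code> module. The helper names <code>pcm_to_wav</code> and <code>wav_to_pcm</code> are illustrative, and the sketch assumes mono audio, which the original text does not state explicitly.</p>

```python
import wave

SAMPLE_RATE = 48000  # the demo expects 48 kHz input
SAMPLE_WIDTH = 2     # s16le: 2 bytes per sample


def pcm_to_wav(pcm_path, wav_path, channels=1):
    """Wrap raw s16le PCM in a WAV header so ordinary players can open it."""
    with open(pcm_path, "rb") as f:
        data = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)  # assumption: mono unless told otherwise
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(data)
    return len(data) // (SAMPLE_WIDTH * channels)  # samples per channel


def wav_to_pcm(wav_path, pcm_path):
    """Strip the WAV header; refuse input that is not 16-bit 48 kHz mono."""
    with wave.open(wav_path, "rb") as w:
        fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        if fmt != (1, SAMPLE_WIDTH, SAMPLE_RATE):
            raise ValueError("expected 16-bit 48 kHz mono WAV, got %r" % (fmt,))
        data = w.readframes(w.getnframes())
    with open(pcm_path, "wb") as f:
        f.write(data)
    return len(data) // SAMPLE_WIDTH
```

<p>WAV stores 16-bit samples in little-endian order, so the payload can be copied unchanged in both directions. Audio at another sample rate would need resampling first, which this sketch does not attempt.</p>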
<p><span>The imported audio looks like this:</span></p>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031204737434-36851334.png" alt="image" loading="lazy"></p>
<p>The denoised result:</p>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031204755767-28050951.png" alt="image" loading="lazy"></p>
<p>Audacity was used here to inspect the denoising result. For how to use Audacity, see this post:</p>
<p>https://www.cnblogs.com/MikeZhang/p/audacity2022022.html</p>
<div>For playing PCM audio, see this post:</div>
<div>https://www.cnblogs.com/MikeZhang/p/pcm20232330.html</div>
<div>
<div>The accompanying audio files are available as follows:</div>
<div>Follow the WeChat official account (聊聊博文; QR code at the end of this post) and reply 20251031.</div>
<div>
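<h2><span><span>3. Building on the Static Library</span></span></h2>
<p><span>In practice you will often build your own program on top of the rnnoise library; here is a simple example.</span></p>
<p><span>Sample code adapted from rnnoise_demo.c (test1.c):</span></p>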
</div>
</div>
<div class="cnblogs_code">
<pre>#include &lt;stdio.h&gt;
#include "rnnoise.h"

/* Samples per frame: 10 ms at 48 kHz */
#define FRAME_SIZE 480

int main(int argc, char **argv) {
  int i;
  int first = 1;
  float x[FRAME_SIZE];
  FILE *f1, *fout;
  DenoiseState *st;
  if (argc != 3) {
    fprintf(stderr, "usage: %s &lt;noisy speech&gt; &lt;output denoised&gt;\n", argv[0]);
    return 1;
  }
  f1 = fopen(argv[1], "rb");
  fout = fopen(argv[2], "wb");
  if (!f1 || !fout) {
    fprintf(stderr, "cannot open input or output file\n");
    return 1;
  }
  /* NULL selects the model compiled into the library */
  st = rnnoise_create(NULL);
  while (1) {
    short tmp[FRAME_SIZE];
    /* Read one frame of 16-bit samples; stop at end of input */
    fread(tmp, sizeof(short), FRAME_SIZE, f1);
    if (feof(f1)) break;
    /* rnnoise_process_frame works on float samples, in place here */
    for (i = 0; i &lt; FRAME_SIZE; i++) x[i] = tmp[i];
    rnnoise_process_frame(st, x, x);
    for (i = 0; i &lt; FRAME_SIZE; i++) tmp[i] = x[i];
    /* The first output frame is not written */
    if (!first) fwrite(tmp, sizeof(short), FRAME_SIZE, fout);
    first = 0;
  }
  rnnoise_destroy(st);
  fclose(f1);
  fclose(fout);
  return 0;
}</pre>
</div>
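<p>The framing logic of test1.c can be mirrored in Python as a rough sketch: read 480 samples at a time, convert to float, process, convert back to 16-bit, and write. Here <code>process_frame</code> is a hypothetical placeholder that returns its input unchanged; it stands in for <code>rnnoise_process_frame</code> and does no denoising. The clamping helper <code>to_int16</code> is also an addition that the C example does not have.</p>

```python
import array
import sys

FRAME_SIZE = 480      # same frame length as the C example
BYTES_PER_SAMPLE = 2  # s16le


def process_frame(frame):
    """Hypothetical placeholder for rnnoise_process_frame: identity."""
    return frame


def to_int16(value):
    """Round and clamp a float sample to the signed 16-bit range."""
    return max(-32768, min(32767, int(round(value))))


def run(in_path, out_path):
    """Process s16le PCM frame by frame; return the number of frames written."""
    frames = 0
    frame_bytes = FRAME_SIZE * BYTES_PER_SAMPLE
    with open(in_path, "rb") as fin, open(out_path, "wb") as fout:
        while True:
            chunk = fin.read(frame_bytes)
            if len(chunk) < frame_bytes:
                break  # a trailing partial frame is dropped
            samples = array.array("h")
            samples.frombytes(chunk)
            if sys.byteorder == "big":
                samples.byteswap()  # file data is little-endian
            x = [float(s) for s in samples]
            y = process_frame(x)
            out = array.array("h", (to_int16(v) for v in y))
            if sys.byteorder == "big":
                out.byteswap()
            fout.write(out.tobytes())
            frames += 1
    return frames
```

<p>Unlike the C example, this sketch writes every processed frame, including the first. Clamping before the conversion back to 16-bit avoids wrap-around if a processed sample falls outside the int16 range.</p>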
<p>Compile command:</p>
<div class="cnblogs_code">
<pre>g++ test1.c -o test1 -Iinclude -static libs/librnnoise.a</pre>
</div>
<div>Alternatively, use a Makefile:</div>
<div>
<div class="cnblogs_code">
<pre>CC = g++
CFLAGS = -g -O2 -Wall
HDRS = -Iinclude
LIBS = -static libs/librnnoise.a
# g++ test1.c -o test1 -Iinclude -static libs/librnnoise.a
all:
	make test1
test1:test1.o
	$(CC) -o test1 test1.o $(LIBS)
clean:
	rm -f test1
	rm -f *.o
.c.o:
	$(CC) $(CFLAGS) $(HDRS) -c -o $*.o $&lt;</pre>
</div>
<div>Build and run output:</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205121665-980890308.png" alt="image" loading="lazy"></p>
<p>The denoised result:</p>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205145485-113099316.png" alt="image" loading="lazy"></p>
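<div>The accompanying code and files are available as follows:</div>
<div>Follow the WeChat official account (聊聊博文; QR code at the end of this post) and reply 20251031.</div>
<h1 id="zTET-1761915122722">II. Using rnnoise from Python</h1>
<div>The Python rnnoise package ships with a built-in denoising model, so no separate model download is needed.</div>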
<div> </div>
<div>PyPI page:</div>
<div>https://pypi.org/project/pyrnnoise/</div>
</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205224400-276009452.png" alt="image" loading="lazy"></p>
<p>Install the package:</p>
<div>
<div class="cnblogs_code">
<pre>pip install pyrnnoise</pre>
</div>
<div>All major platforms are supported:</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205255313-2129101136.png" alt="image" loading="lazy"></p>
<p> </p>
<div>Installation pulls in quite a few dependencies:</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205331768-1882721034.png" alt="image" loading="lazy"></p>
<p>DLL path after installation:</p>
</div>
</div>
</div>
</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205339513-1124376899.png" alt="image" loading="lazy"></p>
<p>Sample code (rnnoiseTest1.py):</p>
<div class="cnblogs_code">
<pre>from pyrnnoise import RNNoise

# Create denoiser instance
denoiser = RNNoise(sample_rate=16000)

# Process audio file
for speech_prob in denoiser.denoise_wav("mix1.wav", "output.wav"):
    print(f"Processing frame with speech probability: {speech_prob}")</pre>
</div>
<p>Run output:</p>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205439777-1570147164.png" alt="image" loading="lazy"></p>
<p>The denoised result:</p>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205459620-617421163.png" alt="image" loading="lazy"></p>
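<p>Besides comparing waveforms by eye, a rough numeric check is to compare the overall RMS level of the input and output files. The sketch below uses only the standard library; <code>rms_dbfs</code> is an illustrative helper name, and it assumes 16-bit WAV files.</p>

```python
import array
import math
import sys
import wave


def rms_dbfs(wav_path):
    """Return the RMS level of a 16-bit WAV file in dB relative to full scale."""
    with wave.open(wav_path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        raw = w.readframes(w.getnframes())
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square / (32768.0 ** 2))
```

<p>For the files used above, <code>rms_dbfs("mix1.wav") - rms_dbfs("output.wav")</code> gives the overall level drop in dB. This is only a coarse indicator, since the level also falls if speech is attenuated; measuring a noise-only stretch of the recording is more telling.</p>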
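<p>The accompanying code and files are available as follows:</p>
<div>Follow the WeChat official account (聊聊博文; QR code at the end of this post) and reply 20251031.</div>
<h1 id="cu0W-1761915322934">III. Model Training</h1>
<div>Only a brief outline is given here; for details, see the README on GitHub:</div>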
<div>https://github.com/xiph/rnnoise</div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205541168-1378156865.png" alt="image" loading="lazy"></p>
<h2 id="F8EC-1761915346648">1. Getting the Dataset</h2>
<div>Data and model download page:</div>
<div>https://media.xiph.org/rnnoise/</div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205559885-1610024551.png" alt="image" loading="lazy"></p>
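<p>1) rnnoise_contributions.tar.gz is a dataset archive provided by the RNNoise project, mainly used for training RNNoise models;</p>
<div>2) the data directory contains speech data, noise data, and other auxiliary data, expanded below;</div>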
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205616603-1842775224.png" alt="image" loading="lazy"></p>
<p>3) the misc directory contains a single WAV file;</p>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205632908-1226950215.png" alt="image" loading="lazy"></p>
<p>4) the models directory holds trained models that can be used directly;</p>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205648340-1449034260.png" alt="image" loading="lazy"></p>
</div>
</div>
</div>
<h2 id="YltJ-1761915413597">2. Training Procedure</h2>
<div>The overall procedure is:</div>
<div><strong>1) Extract the feature file with dump_features.</strong></div>
<div>Example:</div>
<div class="cnblogs_code">
<pre>./dump_features -rir_list rir_list.txt speech.pcm background_noise.pcm foreground_noise.pcm features.f32 &lt;count&gt;</pre>
</div>
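<div>Here &lt;count&gt; is the number of sequences to process. At least 10000 is recommended; more is better, ideally 200000 or more.</div>
<div>After the build, dump_features is in the rnnoise root directory:</div>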
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205733600-1435276822.png" alt="image" loading="lazy"></p>
<p><strong>2) Optionally, speed up feature generation with the script script/dump_features_parallel.sh.</strong></p>
<p><span>Usage:</span></p>
<div class="cnblogs_code">
<pre>script/dump_features_parallel.sh ./dump_features speech.pcm background_noise.pcm foreground_noise.pcm features.f32 &lt;count&gt; rir_list.txt</pre>
</div>
<div>The script launches several processes, each handling a share of the sequences, and merges the results into a single file.</div>
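<p>The split-and-merge idea can be illustrated with a small sketch. This is not the script's actual logic, which is not shown here; <code>split_count</code> and <code>concat_files</code> are hypothetical helpers, and the sketch assumes the per-process outputs are headerless binary files that can be joined by plain concatenation.</p>

```python
import shutil


def split_count(total, workers):
    """Divide `total` sequences across `workers` as evenly as possible."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    base, extra = divmod(total, workers)
    # The first `extra` workers take one additional sequence each.
    return [base + (1 if i < extra else 0) for i in range(workers)]


def concat_files(part_paths, out_path):
    """Join partial output files, in order, into one file."""
    with open(out_path, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, out)
```

<p>For example, <code>split_count(200000, 8)</code> assigns 25000 sequences to each of eight workers.</p>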
<div> </div>
<div><strong>3) Run training to produce the model file.</strong></div>
<div>Training script directory: torch/rnnoise</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205820992-1714214888.png" alt="image" loading="lazy"></p>
<p>Training command:</p>
<div class="cnblogs_code">
<pre>python3 train_rnnoise.py features.f32 output_directory</pre>
</div>
<p><span>Choose a number of epochs (set with the --epochs option) that results in roughly 75000 weight updates. Training produces .pth files (for example, rnnoise_50.pth).</span></p>
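<p>To turn the target of roughly 75000 weight updates into an epoch count, a back-of-the-envelope calculation helps. The sketch below assumes one weight update per batch and that the feature file yields <code>num_sequences</code> training sequences. The batch size is a hypothetical value, so check the training script's real setting before relying on the result.</p>

```python
import math


def epochs_for_updates(num_sequences, batch_size, target_updates=75000):
    """Smallest epoch count that reaches at least `target_updates` updates."""
    if num_sequences <= 0 or batch_size <= 0:
        raise ValueError("num_sequences and batch_size must be positive")
    # Assumption: one weight update per batch, last partial batch included.
    updates_per_epoch = math.ceil(num_sequences / batch_size)
    return math.ceil(target_updates / updates_per_epoch)


# Hypothetical batch size of 128 with 200000 sequences:
print(epochs_for_updates(200000, 128))  # prints 48
```

<p>With 200000 sequences and a batch size of 128, each epoch gives 1563 updates, so 48 epochs are enough to pass 75000. A smaller dataset needs proportionally more epochs to reach the same number of updates.</p>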
<p><strong>4) Convert the model file to C code.</strong></p>
<p><span>Script: dump_rnnoise_weights.py</span></p>
<p><span>Conversion example:</span></p>
<div class="cnblogs_code">
<pre>python3 dump_rnnoise_weights.py --quantize rnnoise_50.pth rnnoise_c</pre>
</div>
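<p><span>This creates the rnnoise_c directory automatically and generates rnnoise_data.c and rnnoise_data.h inside it.</span></p>
<p><strong>5) Use the model from C code.</strong></p>
<p><span>Copy rnnoise_data.c and rnnoise_data.h into the src/ directory, then build RNNoise as described earlier; the ready-to-run demo program (rnnoise_demo) will be in the examples directory.</span></p>
<h1><span><span>IV. Getting the Resources</span></span></h1>
<p><span>The resources and runtime environment for this post are available as follows:</span></p>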
</div>
</div>
<div>Follow the WeChat official account (聊聊博文; QR code at the end of this post) and reply 20251031.</div>
<div>
<p><img src="https://img2024.cnblogs.com/blog/300959/202510/300959-20251031205946645-1769507086.png" alt="image" loading="lazy"></p>
<p> </p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div id="MySignature" role="contentinfo">
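If you have questions about this article, you can ask me via the WeChat official account (聊聊博文):<br>
<img src="https://files.cnblogs.com/files/MikeZhang/201804weixingongzhong1.gif" width="170"><br>
Please credit the source when reposting. Thank you!<br><br>
Source: https://www.cnblogs.com/MikeZhang/p/19181243/rnnoise20251031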