Crash analysis of a .NET RFID label printing client
<h2 id="一背景">I: Background</h2><h3 id="1-讲故事">1. The story</h3>
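<p>Last year a friend reached out to me on WeChat. Their RFID label printing client was crashing intermittently, they had never found the cause, and he asked me to take a look. I asked him to capture a crash dump with procdump and send it over.</p>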
<h2 id="二崩溃分析">II: Crash analysis</h2>
<h3 id="1-为什么会崩溃">1. Why did it crash</h3>
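<p>Double-click the dump to open it and WinDbg switches to the crash context automatically. I like this, because it often spares me the tedious wait for <code>!analyze -v</code>. The output looks like this:</p>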
<pre><code class="language-C#">
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(4120.43a0): Access violation - code c0000005 (first/second chance not available)
For analysis of this file, run !analyze -v
clr!WKS::gc_heap::find_first_object+0xea:
00007ffd`9eaa7ecb 833800 cmp dword ptr [rax],0 ds:30302c30`2c302c30=????????
</code></pre>
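<p>Judging by <code>find_first_object</code> in the output, the GC was in its mark phase: while scanning stack roots it walked the heap to locate the object a root points into, hit an invalid object, and crashed. Use <code>k 8</code> to look at the call stack.</p>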
<pre><code class="language-C#">
0:006> k 8
# Child-SP RetAddr Call Site
00 0000000e`c103c4e8 00007ffd`9eaa8955 clr!WKS::gc_heap::find_first_object+0xea
01 0000000e`c103c500 00007ffd`9ea298aa clr!WKS::GCHeap::Promote+0xc7
02 0000000e`c103c570 00007ffd`9eaf2822 clr!GcEnumObject+0x97
03 0000000e`c103c5c0 00007ffd`9ea27f68 clr!GcInfoDecoder::EnumerateLiveSlots+0x1856
04 0000000e`c103ca20 00007ffd`9ea2887f clr!GcStackCrawlCallBack+0x2bd
05 0000000e`c103ce40 00007ffd`9eaa25d8 clr!GCToEEInterface::GcScanRoots+0x4b6
06 0000000e`c103e300 00007ffd`9eaa0e55 clr!WKS::gc_heap::mark_phase+0x1d9
07 0000000e`c103e3b0 00007ffd`9eaa0d6b clr!WKS::gc_heap::gc1+0xef
</code></pre>
<h3 id="2-崩溃原因是什么">2. What caused the crash</h3>
<p>Since there is a bad object on the managed heap, how do we find it? <code>!verifyheap</code> identifies it. The output looks like this:</p>
<pre><code class="language-C#">
0:006> !verifyheap
Could not request method table data for object 0000015A9D59B0D0 (MethodTable: 30302C302C302C30).
Last good object: 0000015A9D59B048.
</code></pre>
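<p>The output shows clearly that the MethodTable of object <code>0000015A9D59B0D0</code> is <code>30302C302C302C30</code>, which is not a valid pointer. Its shape looks a lot like the ASCII codes of a run of characters (<code>0x30</code> is <code>0</code>, <code>0x2C</code> is a comma), which is interesting. Next, look at the memory around the object with <code>dp 0000015A9D59B0D0-0xa0 L30</code>.</p>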
<pre><code class="language-C#">
0:006> dp 0000015A9D59B0D0-0xa0 L30
0000015a`9d59b030  00000000`00000002 00000000`003a002f
0000015a`9d59b040  00000000`00000000 00007ffd`9cd985e0
0000015a`9d59b050  00000000`0000001c 00000002`00000001
0000015a`9d59b060  00000000`00000008 00000000`00000000
0000015a`9d59b070  00000000`00000000 00000000`00000000
0000015a`9d59b080  00000000`00000000 00000000`00000000
0000015a`9d59b090  00000000`00000000 00000000`00000000
0000015a`9d59b0a0  00000000`00000000 65636976`6564227b
0000015a`9d59b0b0  74735f74`736f682e 30223a22`73757461
0000015a`9d59b0c0  312c302c`302c3033 2c383330`2c383132
0000015a`9d59b0d0  30302c30`2c302c30 5c302c30`2c302c30
0000015a`9d59b0e0  302c3130`306e5c72 322c312c`302c302c
0000015a`9d59b0f0  3030302c`302c362c 2c312c31`30303030
0000015a`9d59b100  306e5c72`5c313030 007d2230`2c303030
0000015a`9d59b110  0000002e`00000001 00000000`00000000
0000015a`9d59b120  00000000`00000000 00007ffd`9cd95a68
0000015a`9d59b130  0073006d`00000005 00000073`006e0074
0000015a`9d59b140  00000000`00000000 00000000`00000000
0000015a`9d59b150  0000015a`9b4ddbb0 00000000`00000000
0000015a`9d59b160  00000000`00000000 00007ffd`9cd95a68
0000015a`9d59b170  00720050`00000013 00650074`006e0069
0000015a`9d59b180  00700061`00430072 006c0069`00620061
0000015a`9d59b190  00650069`00740069 00000000`00000073
0000015a`9d59b1a0  00000000`00000000 00007ffd`9cd96878
</code></pre>
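<p>The hunch that the MethodTable value is really text can be checked by hand. <code>dp</code> prints each 8-byte value as a number, so on x64 the bytes appear in reverse order compared with how they sit in memory. Below is a minimal Python sketch of that check; the helper name <code>qword_to_ascii</code> is made up for illustration, and the constants are copied from the dump.</p>
<pre><code class="language-python">
def qword_to_ascii(value):
    """Decode a 64-bit value as 8 little-endian bytes.

    Returns the text when every byte is printable ASCII, otherwise None.
    """
    raw = value.to_bytes(8, "little")  # lowest byte sits at the lowest address
    if all(b in range(0x20, 0x7F) for b in raw):
        return raw.decode("ascii")
    return None

# The bogus MethodTable reported by !verifyheap.
print(qword_to_ascii(0x30302C302C302C30))  # 0,0,0,00

# A genuine MethodTable from the same dump (System.String) is not text.
print(qword_to_ascii(0x00007FFD9CD95A68))  # None

# The two qwords dp shows at ...b0c0 plus the first one at ...b0d0.
rows = [0x312C302C302C3033, 0x2C3833302C383132, 0x30302C302C302C30]
print("".join(qword_to_ascii(q) for q in rows))  # 30,0,0,1218,038,0,0,0,00
</code></pre>
<p>A real pointer into <code>clr</code> or <code>mscorlib</code> contains bytes such as <code>0x00</code> and <code>0xFD</code>, so it fails the printable test, while the corrupted slot decodes cleanly into digits and commas.</p>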
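<p>The output shows that the memory around <code>0000015A9D59B0D0</code> has been overwritten by a string. It looks like code outside the managed runtime overflowed a string buffer. Next, use <code>da</code> to dig that content out.</p>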
<pre><code class="language-C#">
0:006> da /c100 0000015a`9d59b0a0+0x8
0000015a`9d59b0a8  "{"device.host_status":"030,0,0,1218,038,0,0,0,000,0,0,0\r\n001,0,0,0,1,2,6,0,00000001,1,001\r\n0000,0"}"
</code></pre>
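<p>With the text in hand, a little arithmetic shows how much memory it covers and where the bad object falls inside it. The sketch below uses only values from the dump. The variable names are illustrative. The backslashes are doubled because the dump holds a literal backslash followed by <code>r</code> or <code>n</code> (byte <code>0x5c</code> in the <code>dp</code> output), not real CR/LF bytes.</p>
<pre><code class="language-python">
# The text printed by da, one byte per character.
text = ('{"device.host_status":"030,0,0,1218,038,0,0,0,000,0,0,0\\r\\n'
        '001,0,0,0,1,2,6,0,00000001,1,001\\r\\n0000,0"}')

STRING_START = 0x0000015A9D59B0A8  # address passed to da
BAD_OBJECT = 0x0000015A9D59B0D0    # object rejected by !verifyheap

# Characters plus the terminating NUL that dp shows at ...b10f.
footprint = len(text) + 1
string_end = STRING_START + footprint
print(len(text), hex(footprint), hex(string_end))  # 103 0x68 0x15a9d59b110

# Offset of the bad object's MethodTable slot inside the text.
offset = BAD_OBJECT - STRING_START
chunk = text[offset:offset + 8]
print(hex(offset), chunk)  # 0x28 0,0,0,00

# Packing those 8 characters into a little-endian qword gives back the
# MethodTable value that !verifyheap complained about.
method_table = int.from_bytes(chunk.encode("ascii"), "little")
print(hex(method_table))  # 0x30302c302c302c30
</code></pre>
<p>So the write covers 0x68 bytes, from <code>...b0a8</code> up to <code>...b110</code>, and the slot at <code>...b0d0</code> where the GC expected a MethodTable pointer sits 0x28 bytes into the JSON text.</p>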
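<p>The output is a JSON string, so most likely unmanaged code wrote into a string buffer and ran past its end. But was the damaged object a string before it was overwritten? The only place to look is the wreckage itself, so use <code>!lno</code> to inspect the intact objects nearby.</p>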
<pre><code class="language-C#">
0:006> !lno 0000015a9d59b128
Before: 0000015a9d59b048 136 (0x88) System.Int32[]
Current: 0000015a9d59b128 40 (0x28) System.String
After: 0000015a9d59b150 24 (0x18) Free
Heap local consistency not confirmed.
0:006> !lno 0000015a9d59b150
Before: 0000015a9d59b048 136 (0x88) System.Int32[]
Current: 0000015a9d59b150 24 (0x18) Free
After: 0000015a9d59b168 64 (0x40) System.String
Heap local consistency not confirmed.
0:006> !mdt -e:2 0000015a9d59b048
0000015a9d59b048 (System.Int32[], Elements: 28)
0x1
....
0x0
0x0
0x0
0x0
0x6564227b
0x65636976
0x736f682e
0x74735f74
0x73757461
0x30223a22
0x302c3033
0x312c302c
0:006> !do 0000015a9d59b168
Name: System.String
MethodTable: 00007ffd9cd95a68
EEClass: 00007ffd9cd72ec0
Size: 64(0x40) bytes
File: C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String: PrinterCapabilities
Fields:
MT Field Offset Type VT Attr Value Name
00007ffd9cd98648 4000283 8 System.Int32 1 instance 19 m_stringLength
00007ffd9cd968e0 4000284 c System.Char 1 instance 50 m_firstChar
00007ffd9cd95a68 4000288 e0 System.String 0 shared static Empty
>> Domain:Value 0000015a9b506c70:NotInit <<
0:006> !do 0000015a9d59b128
Name: System.String
MethodTable: 00007ffd9cd95a68
EEClass: 00007ffd9cd72ec0
Size: 36(0x24) bytes
File: C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String: mstns
Fields:
MT Field Offset Type VT Attr Value Name
00007ffd9cd98648 4000283 8 System.Int32 1 instance 5 m_stringLength
00007ffd9cd968e0 4000284 c System.Char 1 instance 6d m_firstChar
00007ffd9cd95a68 4000288 e0 System.String 0 shared static Empty
>> Domain:Value 0000015a9b506c70:NotInit <<
</code></pre>
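<p>The odd-looking values at the tail of the <code>System.Int32[]</code> can be tied back to the JSON text with simple address arithmetic. The sketch below is a rough check, not tooling. It assumes the x64 array layout visible in the <code>dp</code> output: an 8-byte MethodTable pointer at <code>...b048</code>, the length at <code>...b050</code>, and the first element at <code>...b058</code>. The names are illustrative.</p>
<pre><code class="language-python">
ARRAY_ADDR = 0x0000015A9D59B048    # System.Int32[] reported by !lno
STRING_START = 0x0000015A9D59B0A8  # first byte of the JSON text
ELEMENT_COUNT = 28                 # element count reported by !mdt
DATA_OFFSET = 0x10                 # assumed: MethodTable (8) + length slot (8)

# Index of the first array element that the text lands on.
first_element = ARRAY_ADDR + DATA_OFFSET
first_hit = (STRING_START - first_element) // 4
print(first_hit, ELEMENT_COUNT - first_hit)  # 20 8

# Reinterpret the first 32 bytes of the text as little-endian Int32 values.
head = b'{"device.host_status":"030,0,0,1'
as_ints = [int.from_bytes(head[i:i + 4], "little")
           for i in range(0, len(head), 4)]
print([hex(v) for v in as_ints])
</code></pre>
<p>The eight values printed are <code>0x6564227b</code>, <code>0x65636976</code>, <code>0x736f682e</code>, <code>0x74735f74</code>, <code>0x73757461</code>, <code>0x30223a22</code>, <code>0x302c3033</code> and <code>0x312c302c</code>. These are exactly the last eight elements that <code>!mdt</code> lists, so the text begins at element 20 of the array and runs on past its end.</p>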
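<p>The output shows that the JSON also overwrote the tail of the preceding <code>int[]</code>, and that the intact objects after it are the strings <code>mstns</code> and <code>PrinterCapabilities</code>. This region of the heap appears to be related to printing, so I passed the findings on to my friend and asked him to focus on that code.</p>
<p>What the dump captures is the secondary scene, where the damage was discovered, so it cannot tell us who caused the primary scene, where the damage was done. Finding that out takes heavier techniques, many of which I have covered in earlier articles.</p>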
<h2 id="三总结">III: Summary</h2>
<p>This production incident was an interesting one. It tests your feel for the managed heap and for raw memory.</p>
Source: https://www.cnblogs.com/huangxincheng/p/19446182