高原蓝 posted on 2026-2-4 16:34:00

How to quickly deduplicate a List collection in .NET?

<h2 class="md-end-block md-heading"><span class="md-plain md-expand" style="font-size: 16px">Preface</span></h2>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Removing duplicate elements from a collection is a common requirement in data processing. .NET 6 introduced the <span class="md-pair-s"><code>DistinctBy</code><span class="md-plain"> method (it remains available in .NET 7 and later), a handy LINQ operator that deduplicates a collection by a specified key.</span></span></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">This article explains how to use the <span class="md-pair-s"><code>DistinctBy</code><span class="md-plain"> method and walks through concrete examples of how it applies in everyday development.</span></span></span></p>
<h2 class="md-end-block md-heading"><span class="md-plain" style="font-size: 16px">Main text</span></h2>
<h3><strong>1. The <code>DistinctBy</code> method</strong></h3>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><code>DistinctBy</code><span class="md-plain"> deduplicates the elements of a sequence in a LINQ query according to a key that you select.</span></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">It returns a lazily evaluated sequence that contains only the first element encountered for each distinct key.</span></p>
<p class="md-end-block md-p"><span class="md-pair-s md-expand" style="font-size: 16px"><strong>Method signature</strong></span></p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">public</span> <span style="color: rgba(0, 0, 255, 1)">static</span> IEnumerable&lt;TSource&gt; DistinctBy&lt;TSource, TKey&gt;<span style="color: rgba(0, 0, 0, 1)">(
    </span><span style="color: rgba(0, 0, 255, 1)">this</span> IEnumerable&lt;TSource&gt;<span style="color: rgba(0, 0, 0, 1)"> source,
    Func</span>&lt;TSource, TKey&gt;<span style="color: rgba(0, 0, 0, 1)"> keySelector
);</span></pre>
</div>
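<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">A second overload additionally accepts an <code>IEqualityComparer&lt;TKey&gt;</code> that decides when two keys count as equal. The following is a minimal sketch of that overload; the e-mail addresses are invented sample data, not something from the original article.</span></p>
<div class="cnblogs_code">
<pre>using System;
using System.Linq;

// Hypothetical sample data: the same address written with different casing.
string[] emails = { "alice@example.com", "ALICE@example.com", "bob@example.com" };

// Default comparer: string keys are compared case-sensitively, so all three survive.
var caseSensitive = emails.DistinctBy(e =&gt; e).ToList();

// Explicit comparer: keys that differ only by case are treated as duplicates.
var caseInsensitive = emails.DistinctBy(e =&gt; e, StringComparer.OrdinalIgnoreCase).ToList();

Console.WriteLine(caseSensitive.Count);                // 3
Console.WriteLine(caseInsensitive.Count);              // 2
Console.WriteLine(string.Join(", ", caseInsensitive)); // alice@example.com, bob@example.com</pre>
</div>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Passing a comparer is usually preferable to normalizing the key inside the selector (for example with <code>ToLowerInvariant()</code>), because it avoids allocating a new string per element.</span></p>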
<h3 class="md-end-block md-heading"><strong><span class="md-plain" style="font-size: 16px">2. Basic usage</span></strong></h3>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">The simplest usage is to call <span class="md-pair-s"><code>DistinctBy</code><span class="md-plain"> directly in a LINQ query and then work with the deduplicated sequence.</span></span></span></p>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><strong>Explanation</strong></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Suppose we have a list of users and want to remove users whose name has already appeared.</span></p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">using</span> System;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Collections.Generic;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Linq;

<span style="color: rgba(0, 0, 255, 1)">var</span> users = <span style="color: rgba(0, 0, 255, 1)">new</span> List&lt;User&gt;
{
    <span style="color: rgba(0, 0, 255, 1)">new</span> User { Name = <span style="color: rgba(128, 0, 0, 1)">"Alice"</span>, Age = <span style="color: rgba(128, 0, 128, 1)">25</span> },
    <span style="color: rgba(0, 0, 255, 1)">new</span> User { Name = <span style="color: rgba(128, 0, 0, 1)">"Bob"</span>, Age = <span style="color: rgba(128, 0, 128, 1)">32</span> },
    <span style="color: rgba(0, 0, 255, 1)">new</span> User { Name = <span style="color: rgba(128, 0, 0, 1)">"Alice"</span>, Age = <span style="color: rgba(128, 0, 128, 1)">28</span> },
    <span style="color: rgba(0, 0, 255, 1)">new</span> User { Name = <span style="color: rgba(128, 0, 0, 1)">"David"</span>, Age = <span style="color: rgba(128, 0, 128, 1)">35</span> }
};

<span style="color: rgba(0, 128, 0, 1)">// Keep the first user seen for each distinct Name.</span>
<span style="color: rgba(0, 0, 255, 1)">var</span> distinctUsers = users.DistinctBy(user =&gt; user.Name);

<span style="color: rgba(0, 0, 255, 1)">foreach</span> (<span style="color: rgba(0, 0, 255, 1)">var</span> user <span style="color: rgba(0, 0, 255, 1)">in</span> distinctUsers)
{
    Console.WriteLine($<span style="color: rgba(128, 0, 0, 1)">"Name: {user.Name}, Age: {user.Age}"</span>);
}

<span style="color: rgba(0, 128, 0, 1)">// With top-level statements, type declarations must come after the statements.</span>
<span style="color: rgba(0, 0, 255, 1)">class</span> User
{
    <span style="color: rgba(0, 0, 255, 1)">public</span> <span style="color: rgba(0, 0, 255, 1)">string</span> Name { <span style="color: rgba(0, 0, 255, 1)">get</span>; <span style="color: rgba(0, 0, 255, 1)">set</span>; }
    <span style="color: rgba(0, 0, 255, 1)">public</span> <span style="color: rgba(0, 0, 255, 1)">int</span> Age { <span style="color: rgba(0, 0, 255, 1)">get</span>; <span style="color: rgba(0, 0, 255, 1)">set</span>; }
}</pre>
</div>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Output:</span></p>
<div class="cnblogs_code">
<pre>Name: Alice, Age: <span style="color: rgba(128, 0, 128, 1)">25</span><span style="color: rgba(0, 0, 0, 1)">
Name: Bob, Age: </span><span style="color: rgba(128, 0, 128, 1)">32</span><span style="color: rgba(0, 0, 0, 1)">
Name: David, Age: </span><span style="color: rgba(128, 0, 128, 1)">35</span></pre>
</div>
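<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">The elements that survive keep their original relative order, and for each key it is the first occurrence that is kept. The source code shows why.</span></p>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><strong>Source code</strong></span></p>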
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">private</span> <span style="color: rgba(0, 0, 255, 1)">static</span> IEnumerable&lt;TSource&gt; DistinctByIterator&lt;TSource, TKey&gt;(IEnumerable&lt;TSource&gt; source, Func&lt;TSource, TKey&gt; keySelector, IEqualityComparer&lt;TKey&gt;?<span style="color: rgba(0, 0, 0, 1)"> comparer)
{
    </span><span style="color: rgba(0, 0, 255, 1)">using</span> IEnumerator&lt;TSource&gt; enumerator =<span style="color: rgba(0, 0, 0, 1)"> source.GetEnumerator();

    </span><span style="color: rgba(0, 0, 255, 1)">if</span><span style="color: rgba(0, 0, 0, 1)"> (enumerator.MoveNext())
    {
        </span><span style="color: rgba(0, 0, 255, 1)">var</span> <span style="color: rgba(0, 0, 255, 1)">set</span> = <span style="color: rgba(0, 0, 255, 1)">new</span> HashSet&lt;TKey&gt;<span style="color: rgba(0, 0, 0, 1)">(DefaultInternalSetCapacity, comparer);
        </span><span style="color: rgba(0, 0, 255, 1)">do</span><span style="color: rgba(0, 0, 0, 1)">
        {
            TSource element </span>=<span style="color: rgba(0, 0, 0, 1)"> enumerator.Current;
            </span><span style="color: rgba(0, 0, 255, 1)">if</span> (<span style="color: rgba(0, 0, 255, 1)">set</span><span style="color: rgba(0, 0, 0, 1)">.Add(keySelector(element)))
            {
                </span><span style="color: rgba(0, 0, 255, 1)">yield</span> <span style="color: rgba(0, 0, 255, 1)">return</span><span style="color: rgba(0, 0, 0, 1)"> element;
            }
        }
        </span><span style="color: rgba(0, 0, 255, 1)">while</span><span style="color: rgba(0, 0, 0, 1)"> (enumerator.MoveNext());
    }
}</span></pre>
</div>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">The source shows that a <span class="md-pair-s"><code>HashSet</code><span class="md-plain"> does the deduplication and that the order of the elements is never disturbed.</span></span></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">When processing a collection we often need to remove duplicates while preserving the original order.</span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">A <span class="md-pair-s"><code>HashSet</code><span class="md-plain"> achieves this efficiently.</span></span></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">The key of each element is first added to the <span class="md-pair-s"><code>HashSet</code><span class="md-plain">; if the add succeeds, the key has not been seen before and the element is yielded;</span></span></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">if the add fails, an equal key already exists and the element is filtered out.</span></p>
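<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">The same idea can be written by hand, which may be useful on frameworks older than .NET 6. The sketch below is only an illustration: <code>DistinctByKey</code> is a hypothetical helper name, not a framework API, and the word list is invented sample data.</span></p>
<div class="cnblogs_code">
<pre>using System;
using System.Collections.Generic;

var words = new[] { "pear", "plum", "apple", "peach", "avocado" };

// Deduplicate by first letter: only the first word for each letter is kept.
foreach (var word in DistinctByKey(words, w =&gt; w[0]))
{
    Console.WriteLine(word); // prints "pear", then "apple"
}

// Hypothetical helper (not a framework API) that follows the HashSet-based idea.
static IEnumerable&lt;T&gt; DistinctByKey&lt;T, TKey&gt;(IEnumerable&lt;T&gt; source, Func&lt;T, TKey&gt; keySelector)
{
    var seenKeys = new HashSet&lt;TKey&gt;();
    foreach (var item in source)
    {
        // HashSet&lt;T&gt;.Add returns false when the key is already present.
        if (seenKeys.Add(keySelector(item)))
        {
            yield return item;
        }
    }
}</pre>
</div>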
<h3 class="md-end-block md-heading"><strong><span class="md-plain" style="font-size: 16px">3. Advanced usage</span></strong></h3>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><code>DistinctBy</code><span class="md-plain"> can also express more complex deduplication logic, such as deduplicating by several properties at once.</span></span></p>
<p class="md-end-block md-p"><span class="md-pair-s " style="font-size: 16px"><strong>Explanation</strong></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Suppose we have a list of orders and want to remove orders that repeat both the customer name and the order amount of an earlier order.</span></p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">using</span> System;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Collections.Generic;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Linq;

<span style="color: rgba(0, 0, 255, 1)">var</span> orders = <span style="color: rgba(0, 0, 255, 1)">new</span> List&lt;Order&gt;
{
    <span style="color: rgba(0, 0, 255, 1)">new</span> Order { OrderId = <span style="color: rgba(128, 0, 128, 1)">1</span>, CustomerName = <span style="color: rgba(128, 0, 0, 1)">"Alice"</span>, Amount = <span style="color: rgba(128, 0, 128, 1)">100.0m</span> },
    <span style="color: rgba(0, 0, 255, 1)">new</span> Order { OrderId = <span style="color: rgba(128, 0, 128, 1)">2</span>, CustomerName = <span style="color: rgba(128, 0, 0, 1)">"Bob"</span>, Amount = <span style="color: rgba(128, 0, 128, 1)">150.0m</span> },
    <span style="color: rgba(0, 0, 255, 1)">new</span> Order { OrderId = <span style="color: rgba(128, 0, 128, 1)">3</span>, CustomerName = <span style="color: rgba(128, 0, 0, 1)">"Alice"</span>, Amount = <span style="color: rgba(128, 0, 128, 1)">100.0m</span> },
    <span style="color: rgba(0, 0, 255, 1)">new</span> Order { OrderId = <span style="color: rgba(128, 0, 128, 1)">4</span>, CustomerName = <span style="color: rgba(128, 0, 0, 1)">"Charlie"</span>, Amount = <span style="color: rgba(128, 0, 128, 1)">120.0m</span> },
    <span style="color: rgba(0, 0, 255, 1)">new</span> Order { OrderId = <span style="color: rgba(128, 0, 128, 1)">5</span>, CustomerName = <span style="color: rgba(128, 0, 0, 1)">"Bob"</span>, Amount = <span style="color: rgba(128, 0, 128, 1)">150.0m</span> }
};

<span style="color: rgba(0, 128, 0, 1)">// A value tuple serves as a composite key: two keys are equal when both fields are equal.</span>
<span style="color: rgba(0, 0, 255, 1)">var</span> distinctOrders = orders.DistinctBy(order =&gt; (order.CustomerName, order.Amount));

<span style="color: rgba(0, 0, 255, 1)">foreach</span> (<span style="color: rgba(0, 0, 255, 1)">var</span> order <span style="color: rgba(0, 0, 255, 1)">in</span> distinctOrders)
{
    Console.WriteLine($<span style="color: rgba(128, 0, 0, 1)">"Order ID: {order.OrderId}, Customer: {order.CustomerName}, Amount: {order.Amount}"</span>);
}

<span style="color: rgba(0, 128, 0, 1)">// With top-level statements, type declarations must come after the statements.</span>
<span style="color: rgba(0, 0, 255, 1)">class</span> Order
{
    <span style="color: rgba(0, 0, 255, 1)">public</span> <span style="color: rgba(0, 0, 255, 1)">int</span> OrderId { <span style="color: rgba(0, 0, 255, 1)">get</span>; <span style="color: rgba(0, 0, 255, 1)">set</span>; }
    <span style="color: rgba(0, 0, 255, 1)">public</span> <span style="color: rgba(0, 0, 255, 1)">string</span> CustomerName { <span style="color: rgba(0, 0, 255, 1)">get</span>; <span style="color: rgba(0, 0, 255, 1)">set</span>; }
    <span style="color: rgba(0, 0, 255, 1)">public</span> <span style="color: rgba(0, 0, 255, 1)">decimal</span> Amount { <span style="color: rgba(0, 0, 255, 1)">get</span>; <span style="color: rgba(0, 0, 255, 1)">set</span>; }
}</pre>
</div>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Output:</span></p>
<div class="cnblogs_code">
<pre>Order ID: <span style="color: rgba(128, 0, 128, 1)">1</span>, Customer: Alice, Amount: <span style="color: rgba(128, 0, 128, 1)">100.0</span><span style="color: rgba(0, 0, 0, 1)">
Order ID: </span><span style="color: rgba(128, 0, 128, 1)">2</span>, Customer: Bob, Amount: <span style="color: rgba(128, 0, 128, 1)">150.0</span><span style="color: rgba(0, 0, 0, 1)">
Order ID: </span><span style="color: rgba(128, 0, 128, 1)">4</span>, Customer: Charlie, Amount: <span style="color: rgba(128, 0, 128, 1)">120.0</span></pre>
</div>
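<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">A composite key only works when the key type compares by value. The following sketch, using invented sample rows, contrasts value tuples and anonymous types (both value-based) with an array key, which compares by reference and therefore removes nothing.</span></p>
<div class="cnblogs_code">
<pre>using System;
using System.Linq;

// Hypothetical sample rows; the first two are identical.
var rows = new[]
{
    (Customer: "Alice", Amount: 100.0m),
    (Customer: "Alice", Amount: 100.0m),
    (Customer: "Bob", Amount: 150.0m),
};

// Value tuples compare field by field, so equal (Customer, Amount) pairs collapse.
int byTuple = rows.DistinctBy(r =&gt; (r.Customer, r.Amount)).Count();

// Anonymous types also implement value-based Equals and GetHashCode.
int byAnonymous = rows.DistinctBy(r =&gt; new { r.Customer, r.Amount }).Count();

// Arrays compare by reference, so every freshly created key is unique.
int byArray = rows.DistinctBy(r =&gt; new object[] { r.Customer, r.Amount }).Count();

Console.WriteLine($"{byTuple} {byAnonymous} {byArray}"); // 2 2 3</pre>
</div>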
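<h3 class="md-end-block md-heading"><strong><span class="md-plain" style="font-size: 16px">4. Performance considerations</span></strong></h3>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><code>DistinctBy</code><span class="md-plain"> tracks the keys it has already seen in a hash set, so each element costs roughly constant time on average and performance is good in most cases. The set grows with the number of distinct keys, however, so memory usage still deserves attention on very large data sets.</span></span></p>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><strong>Explanation</strong></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Suppose we have a large collection with millions of records that must be deduplicated by some key.</span></p>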
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">var</span> largeCollection = Enumerable.Range(<span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">10000000</span>).Select(i =&gt; <span style="color: rgba(0, 0, 255, 1)">new</span> { Id = i, Value = i % <span style="color: rgba(128, 0, 128, 1)">1000</span><span style="color: rgba(0, 0, 0, 1)"> });
</span><span style="color: rgba(0, 0, 255, 1)">var</span> distinctLargeCollection = largeCollection.DistinctBy(item =&gt;<span style="color: rgba(0, 0, 0, 1)"> item.Value);
Console.WriteLine($</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">Distinct count: {distinctLargeCollection.Count()}</span><span style="color: rgba(128, 0, 0, 1)">"</span>);</pre>
</div>
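<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Because <code>DistinctBy</code> is lazily evaluated, every enumeration of the result walks the whole source again. The sketch below makes that visible with a counter inside the key selector (the counter and the smaller range are assumptions made for illustration) and shows materializing the result once with <code>ToList()</code>.</span></p>
<div class="cnblogs_code">
<pre>using System;
using System.Linq;

int selectorCalls = 0;

var source = Enumerable.Range(1, 100_000);
var distinct = source.DistinctBy(i =&gt;
{
    selectorCalls++; // count how often the key selector runs
    return i % 1000;
});

Console.WriteLine(selectorCalls); // 0: nothing has been enumerated yet

int firstCount = distinct.Count();  // enumerates the whole source once
int secondCount = distinct.Count(); // enumerates it again from scratch
Console.WriteLine($"{firstCount} {secondCount} {selectorCalls}"); // 1000 1000 200000

// Materialize once when the result is needed several times.
var cached = distinct.ToList();
Console.WriteLine(cached.Count); // 1000</pre>
</div>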
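<h3 class="md-end-block md-heading"><strong><span class="md-plain" style="font-size: 16px">5. Use in asynchronous LINQ queries</span></strong></h3>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><code>DistinctBy</code><span class="md-plain"> in <code>System.Linq.Enumerable</code> only applies to <code>IEnumerable&lt;T&gt;</code>. The same deduplication can be applied to an <span class="md-pair-s"><code>IAsyncEnumerable&lt;T&gt;</code><span class="md-plain"> stream when an asynchronous LINQ implementation is available, for example the <code>System.Linq.AsyncEnumerable</code> operators that ship with .NET 10; elements are then filtered as they arrive instead of being collected first.</span></span></span></span></p>
<p class="md-end-block md-p"><span class="md-pair-s " style="font-size: 16px"><strong>Explanation</strong></span></p>
<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">Suppose an asynchronous method returns a list of users and we want to remove users whose name has already appeared.</span></p>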
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 255, 1)">using</span> System;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Collections.Generic;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Linq;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Net.Http;
<span style="color: rgba(0, 0, 255, 1)">using</span> System.Text.Json;

<span style="color: rgba(0, 128, 0, 1)">// User is the class from section 2; the URL is a placeholder.</span>
<span style="color: rgba(0, 0, 255, 1)">var</span> httpClient = <span style="color: rgba(0, 0, 255, 1)">new</span> HttpClient();

<span style="color: rgba(0, 0, 255, 1)">async</span> IAsyncEnumerable&lt;User&gt; GetUsersAsync()
{
    <span style="color: rgba(0, 0, 255, 1)">var</span> response = <span style="color: rgba(0, 0, 255, 1)">await</span> httpClient.GetAsync(<span style="color: rgba(128, 0, 0, 1)">"https://api.example.com/users"</span>);
    response.EnsureSuccessStatusCode();
    <span style="color: rgba(0, 0, 255, 1)">var</span> usersJson = <span style="color: rgba(0, 0, 255, 1)">await</span> response.Content.ReadAsStringAsync();

    <span style="color: rgba(0, 128, 0, 1)">// Parse the user list with System.Text.Json; treat a null result as an empty list.</span>
    <span style="color: rgba(0, 0, 255, 1)">var</span> users = JsonSerializer.Deserialize&lt;List&lt;User&gt;&gt;(usersJson) ?? <span style="color: rgba(0, 0, 255, 1)">new</span> List&lt;User&gt;();

    <span style="color: rgba(0, 0, 255, 1)">foreach</span> (<span style="color: rgba(0, 0, 255, 1)">var</span> user <span style="color: rgba(0, 0, 255, 1)">in</span> users)
    {
        <span style="color: rgba(0, 0, 255, 1)">yield</span> <span style="color: rgba(0, 0, 255, 1)">return</span> user;
    }
}

<span style="color: rgba(0, 128, 0, 1)">// Asynchronous LINQ query. Assumes DistinctBy and ToListAsync for IAsyncEnumerable&lt;T&gt;</span>
<span style="color: rgba(0, 128, 0, 1)">// are available (System.Linq.AsyncEnumerable in .NET 10); names may differ in other libraries.</span>
<span style="color: rgba(0, 0, 255, 1)">var</span> distinctUsers = <span style="color: rgba(0, 0, 255, 1)">await</span> GetUsersAsync().DistinctBy(user =&gt; user.Name).ToListAsync();

<span style="color: rgba(0, 0, 255, 1)">foreach</span> (<span style="color: rgba(0, 0, 255, 1)">var</span> user <span style="color: rgba(0, 0, 255, 1)">in</span> distinctUsers)
{
    Console.WriteLine($<span style="color: rgba(128, 0, 0, 1)">"Name: {user.Name}, Age: {user.Age}"</span>);
}</pre>
</div>
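<p class="md-end-block md-p"><span class="md-plain" style="font-size: 16px">When no asynchronous LINQ library is at hand, the same filtering can be sketched by hand with <code>await foreach</code>. In the example below, <code>DistinctByKeyAsync</code> is a hypothetical helper name rather than a framework API, and <code>GetUsersAsync</code> is replaced by an in-memory stand-in so that the snippet runs without a network call.</span></p>
<div class="cnblogs_code">
<pre>using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var names = new List&lt;string&gt;();
await foreach (var user in DistinctByKeyAsync(GetUsersAsync(), u =&gt; u.Name))
{
    names.Add($"{user.Name}:{user.Age}");
}
Console.WriteLine(string.Join(", ", names)); // Alice:25, Bob:32

// Hypothetical in-memory source standing in for a real asynchronous data source.
static async IAsyncEnumerable&lt;(string Name, int Age)&gt; GetUsersAsync()
{
    var users = new[] { ("Alice", 25), ("Bob", 32), ("Alice", 28) };
    foreach (var user in users)
    {
        await Task.Yield(); // simulate asynchronous arrival
        yield return user;
    }
}

// Hypothetical helper (not a framework API): yields the first element seen for each key.
static async IAsyncEnumerable&lt;T&gt; DistinctByKeyAsync&lt;T, TKey&gt;(
    IAsyncEnumerable&lt;T&gt; source, Func&lt;T, TKey&gt; keySelector)
{
    var seenKeys = new HashSet&lt;TKey&gt;();
    await foreach (var item in source)
    {
        // Add returns false for a key that was already seen.
        if (seenKeys.Add(keySelector(item)))
        {
            yield return item;
        }
    }
}</pre>
</div>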
<h2 class="md-end-block md-heading"><span class="md-plain" style="font-size: 16px">Summary</span></h2>
<p class="md-end-block md-p"><span class="md-pair-s" style="font-size: 16px"><code>DistinctBy</code><span class="md-plain">, available since .NET 6, is a very practical addition to LINQ. It deduplicates a collection by a specified key directly in a LINQ query, which simplifies code and improves development efficiency.</span></span></p>
<p class="md-end-block md-p md-focus"><span class="md-plain" style="font-size: 16px">Hopefully this article helps you understand and use LINQ's <span class="md-pair-s"><code>DistinctBy</code><span class="md-plain md-expand"> method in .NET 6 and later, and put it to good use in your projects.</span></span></span></p>
Source: https://www.cnblogs.com/1312mn/p/18552496