C#.Net筑基-集合知识全解

健康哈利 發表於 2024-6-20 08:00:00

<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619549-244952827.png" alt="image.png" loading="lazy"></p>
<h1 id="01集合基础知识">01、集合基础知识</h1>
<p>.Net 中提供了一系列的管理对象集合的类型，数组、可变列表、字典等。从类型安全上集合分为两类，<strong>泛型集合</strong> 和 <strong>非泛型集合</strong>，传统的非泛型集合存储为Object，需要类型转。而泛型集合提供了更好的性能、编译时类型安全，推荐使用。</p>
<p>.Net中集合主要集中在下面几个命名空间中：</p>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619669-346612117.jpg" alt="" loading="lazy"></p>
<h2 id="11集合的起源接口关系">1.1、集合的起源：接口关系</h2>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619684-668374224.jpg" alt="" loading="lazy"></p>
<table>
<thead>
<tr>
<th><strong>集合接口</strong></th>
<th><strong>特点/说明</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>IEnumerator、IEnumerator<code><T></code></td>
<td>枚举器（还不是集合），提供<code>foreach</code>枚举项的能力</td>
</tr>
<tr>
<td>IEnumerable、IEnumerable<code><T></code></td>
<td>可枚举集合，几乎所有集合都实现了该接口，属于集合最基础的接口。就一个<code>IEnumerator GetEnumerator()</code> 方法，返回一个枚举器。</td>
</tr>
<tr>
<td>ICollection、ICollection<code><T></code></td>
<td>提供了基础集合操作：Count、Add()、Remove()、Clear()、Contains()、CopyTo()</td>
</tr>
<tr>
<td>IList、IList<code><T></code></td>
<td>索引器、IndexOf()、Insert()、RemoveAt()</td>
</tr>
<tr>
<td>IDictionary、IDictionary<code><TKey, TValue></code></td>
<td>键值集合操作：Keys、Values、索引器、Add()、Remove()</td>
</tr>
<tr>
<td>IReadOnly***</td>
<td>只读的集合，包括IReadOnlyCollection、IReadOnlyList、IReadOnlyDictionary等</td>
</tr>
</tbody>
</table>
<br>
<ul>
<li><strong>天赋技能 —— foreach</strong>：几乎所有集合都可以用<code>foreach</code>循环操作，是因为他们都继承自<code>IEnumerable</code>接口，由枚举器（IEnumerator）提供枚举操作。</li>
<li>几乎所有集合都提供添加、删除、计数，来自基础接口 <code>ICollection</code>、<code>ICollection<T></code>。</li>
<li><code>IList</code>、<code>IList<T></code> 提供了数组的索引器、查找、插入等操作，几乎所有具体的集合类型都实现了该接口。</li>
<li>Array 是一个抽象类，是所有数组<code>T[]</code>的基类，她是类型安全的。</li>
<li>推荐尽量使用数组T[]、泛型版的集合，提供了更好的类型安全和性能。</li>
</ul>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619544-663183630.png" alt="image.png" loading="lazy"></p>
<h2 id="12非泛型集合-还有什么存在的价值">1.2、非泛型集合—— 还有什么存在的价值？</h2>
<ul>
<li>非泛型的<code>Hashtable</code>，Key、Value都是Object类型的，Dictionary 是泛型版本的 Hashtable。</li>
<li>ArrayList 是非泛型版本的 <code>List<T></code>，基本很少使用，也尽量不用。</li>
</ul>
<p>❓既然非泛型版本类型不安全，性能还差，为什么还存在呢？</p>
<p>主要是历史原因，泛型是<code>.Net2.0</code> 引入的，因此为了向后兼容，依然保留的非泛型版本集合。在接口实现时，非泛型接口一般都是显示实现的，因此基本不会用到。不过在有些场景下，非泛型接口、集合还是有点用的，如类型不固定的集合，或者用接口作为约束条件或类型判断。</p>
<pre><code class="language-csharp">ArrayList arr = new ArrayList();
arr.Add(1);
arr.Add("sam");
arr.Add(new Point());
if (arr is IList) {}

class User<T> where T :IList {}
</code></pre>
<h2 id="13collectiontlistt有何不同">1.3、<code>Collection<T></code>、<code>List<T></code>有何不同？</h2>
<p>❓两者比较相似，他们到底有什么区别呢？该如何选择？</p>
<ul>
<li><code>Collection<T></code> 作为<strong>自定义集合基类</strong>，内部提供了一些<code>virtual</code>的实现，便于继承实现自己的集合类型。其内部集合用的就是<code>List<T></code>，如下部分源码 Collection.cs。</li>
<li><code>List<T></code> 作为<strong>集合</strong>使用，是最常用的可变长集合类型了，他优化了性能，但是丢失了可扩展性，没有提供任何可以<code>override</code>的成员。</li>
</ul>
<pre><code class="language-csharp">public class Collection<T>
{
public Collection()
{
   items = new List<T>();
}
protected virtual void InsertItem(int index, T item)
{
   items.Insert(index, item);
}
}
</code></pre>
<hr>
<h1 id="02枚举器foreach的秘密">02、枚举器——foreach的秘密！</h1>
<p><code>foreach</code> 用来循环迭代可枚举对象，用一种非常简洁、优雅的姿势访问可枚举元素。常用于数组、集合，当然不仅限于集合，只要符合要求枚举要求的都可以。</p>
<table>
<thead>
<tr>
<th><strong>foreach 可枚举类型</strong></th>
<th><strong>说明</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>数组</td>
<td>包括Array数组、List、字典等，他们都实现了<code>IEnumerable</code>接口的。</td>
</tr>
<tr>
<td>IEnumerable</td>
<td>可枚举接口</td>
</tr>
<tr>
<td>IEnumerable<code><T></code></td>
<td>同上，泛型版本</td>
</tr>
<tr>
<td>GetEnumerator()方法</td>
<td>包含公共方法“IEnumerator GetEnumerator();”的任意类型</td>
</tr>
<tr>
<td>yield迭代器</td>
<td><code>yield</code>语句实现的迭代器，实际返回的也是IEnumerable、IEnumerator</td>
</tr>
</tbody>
</table>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603204820454-194503965.png" alt="image" loading="lazy"></p>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619549-995001861.png" alt="image.png" loading="lazy"></p>
<h2 id="21ienumerator枚举器">2.1、IEnumerator枚举器</h2>
<p>枚举可以<code>foreach</code> 枚举的密码是他们都继承自<code>IEnumerable</code>接口，而更重要的是其内部的枚举器 —— IEnumerator。枚举器<code>IEnumerator</code>定义了<strong>向前遍历</strong>集合元素的基本协议，其申明如下：</p>
<pre><code class="language-csharp">public interface IEnumerator
{
object Current { get; }

bool MoveNext();

void Reset();//这个方法是非必须的，用于重置游标，可不实现
}
public interface IEnumerator<out T> : IDisposable, IEnumerator
{
new T Current { get; }
}
</code></pre>
<ul>
<li><code>MoveNext()</code> 移动当前元素到下一个位置，<code>Current</code>获取当前元素，如果没有元素了，则<code>MoveNext()</code>返回<code>false</code>。注意<code>MoveNext()</code>会先调用，因此首次<code>MoveNext()</code>是把位置移动到第一个位置。</li>
<li><code>Reset()</code>用于重置到起点，主要用于COM互操作，使用很少，可不用实现（直接抛出 NotSupportedException）。</li>
</ul>
<blockquote>
<p>📢 该接口不是必须的，只要实现了公共的<code>Current</code>、无参<code>MoveNext()</code>成员就可进行枚举操作。</p>
</blockquote>
<p>实现一个获取偶数的枚举器：</p>
<pre><code class="language-csharp">void Main()
{
var evenor = new EvenNumbersEnumerator(1, 10);
while (evenor.MoveNext())
{
Console.WriteLine(evenor.Current); //2 4 6 8 10
}
}
//获取偶数的枚举器
public struct EvenNumbersEnumerator : IEnumerator<int> //不继承IEnumerator接口，效果也是一样的
{
private int _start;
private int _end;
private int _position = int.MinValue;

public EvenNumbersEnumerator(int start, int end)
{
_start = start;
_end = end;
}
public int Current => _position;
object IEnumerator.Current => Current;//显示实现非泛型接口，然后隐藏起来

public bool MoveNext()
{
if (_position == int.MinValue)
_position = (int.IsEvenInteger(_start) ? _start : _start + 1) - 2;
_position += 2;
return (_position <= _end);
}
public void Reset() => throw new NotSupportedException();
public void Dispose() { } //IEnumerator 是实现了 IDisposable接口的
}
</code></pre>
<h2 id="22ienumerable可枚举集合">2.2、IEnumerable可枚举集合</h2>
<p><code>IEnumerable</code>、<code>IEnumerable<T></code>是所有集合的基础接口，其核心方法就是 <code>GetEnumerator()</code> 获取一个枚举器。</p>
<pre><code class="language-csharp">public interface IEnumerable
{
IEnumerator GetEnumerator();
}
public interface IEnumerable<out T> : IEnumerable
{
new IEnumerator<T> GetEnumerator();
}
</code></pre>
<blockquote>
<p>📢 该接口也不是必须的，只要包含<code>public</code>的“GetEnumerator()”方法也是一样的。</p>
</blockquote>
<p>有了 <code>GetEnumerator()</code>，就可以使用<code>foreach</code>来枚举元素了，这里<code>foreach</code>会被编译为 <code>while (evenor.MoveNext()){}</code> 形式的代码。在上面偶数枚举器的基础上实现一个偶数类型。</p>
<pre><code class="language-csharp">void Main()
{
var evenNumber = new EvenNumbers();
foreach (var n in evenNumber)
{
Console.WriteLine(n); //2 4 6 8 10
}
}
public class EvenNumbers : IEnumerable<int> //不用必须继承接口，只要有GetEnumerator()即可
{
public IEnumerator<int> GetEnumerator()
{
return new EvenNumbersEnumerator(1, 10);
}

IEnumerator IEnumerable.GetEnumerator() //显示实现非泛型接口，然后隐藏起来
{
return GetEnumerator();
}
}
</code></pre>
<p><strong>foreach</strong> 迭代其实就是调用其<code>GetEnumerator()</code>、<code>Current</code>、<code>MoveNext()</code>实现的，因此接口并不是必须的，只要有对应的成员即可。</p>
<pre><code class="language-csharp">foreach (var n in evenNumber)
{
Console.WriteLine(n); //2 4 6 8 10
}
/************** 上面代码编译后的效果如下：*****************/
IEnumerator<int> enumerator = evenNumber.GetEnumerator();
try
{
while (enumerator.MoveNext ())
{
      int i = enumerator.Current;
      Console.WriteLine (i);
}
}
finally
{
if (enumerator != null)
{
      enumerator.Dispose ();
}
}
</code></pre>
<h2 id="23yield-迭代器">2.3、yield 迭代器</h2>
<p><code>yield return</code> 是一个用于实现迭代器的专用语句，它允许你一次返回一个元素，而不是一次性返回整个集合。常来用来实现自定义的简单迭代器，非常方便，无需实现<code>IEnumerator</code>接口。<br>
<strong>🔸惰性执行</strong>：元素是按需生成的，这可以提高性能并减少内存占用（当然这个要看具体情况），特别是在处理大型集合或复杂的计算时。迭代器方法在被调用时，不会立即执行，而是在<code>MoveNext()</code>时，才会执行对应<code>yield return</code>的语句，并返回该语句的结果。📢Linq里的很多操作也是惰性的。<br>
<strong>🔸简化代码</strong>：使用<code>yield return</code>可以避免手动编写迭代器的繁琐过程。<br>
<strong>🔸状态保持</strong>：<code>yield return</code>自动处理状态保持，使得在每次迭代中保存当前状态变得非常简单。每一条<code>yield return</code>语句执行完后，代码的控制权会交还给调用者，由调用者控制继续。</p>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619820-1473858922.jpg" alt="" loading="lazy"></p>
<p><code>yield</code>迭代器方法会被会被编译为一个实现了<code>IEnumerator</code> 接口的私有类，可以看做是一个高级的语法糖，有一些限制（要求）：</p>
<ul>
<li>迭代器的返回类型可以是<code>IEnumerable</code>、<code>IEnumerator</code>或他们的泛型版本。还可以用 IAsyncEnumerable<code><T></code> 来实现异步的迭代器。</li>
<li><code>yield break</code> 语句提前退出迭代器，不可直接用<code>return</code>，是非法的。</li>
<li><code>yield</code>语句不能和<code>try...catch</code>一起使用。</li>
</ul>
<pre><code class="language-csharp">void Main()
{
var us = new User();
foreach (string name in us)
{
   Console.WriteLine(name); //sam kwong
}
foreach (string name in us.GetEnumerator1())
{
   Console.WriteLine(name); //1sam2
}
foreach (string name in us.GetEnumerator2())
{
   Console.WriteLine(name);//KWONG
}
}
public class User
{
private string firstName = "sam";
private string lastName = "Kwong";
public IEnumerator GetEnumerator()
{
   yield return firstName;
   yield return lastName;
}
public IEnumerable GetEnumerator1() //返回IEnumerable
{
   Console.WriteLine("1");
yield return firstName;//第一次执行到这里
Console.WriteLine("2");
yield break;          //第二次执行到这里，也是最后一次了
yield return lastName;
}
public IEnumerable<string> GetEnumerator2() //返回IEnumerable<string>
{
   yield return lastName.ToUpper();
}
}
</code></pre>
<hr>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619827-875709364.png" alt="image.png" loading="lazy"></p>
<h1 id="03集合装逼了">03、集合！装逼了！</h1>
<h2 id="31常用集合类型">3.1、⭐常用集合类型</h2>
<table>
<thead>
<tr>
<th><strong>集合类型</strong></th>
<th><strong>特点/说明</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>Array（数组之父）</td>
<td>是一个抽象类，是有所有数组的父类，提供了很多用于数组操作的静态方法，详见下一章节</td>
</tr>
<tr>
<td>数组：<code>T[]</code>⭐</td>
<td><strong>定长</strong>（内存连续）的数组集合，所有数组都继承自 <strong>Array</strong>，<code>int[] arr = {1,2,3};</code></td>
</tr>
<tr>
<td><strong>ArrayList</strong></td>
<td>可变长数组，存放Object对象，内部会自动扩容，很少使用。</td>
</tr>
<tr>
<td><strong>List</strong><code><T></code>⭐</td>
<td>泛型版的ArrayList，可变长集合，很常用。</td>
</tr>
<tr>
<td><strong>HashTable</strong></td>
<td>存储Key、Value结构的哈希表，可根据Key快速获取Value值。Key不可重复，都是Object类型。</td>
</tr>
<tr>
<td><strong>Dictionary</strong><code><TK, TV></code>⭐</td>
<td>泛型版本的哈希表，代替HashTable</td>
</tr>
<tr>
<td><code>HashSet<T></code></td>
<td>只有Key的<code>Dictionary</code>，Key不可重复，适用于与不可重复的集合</td>
</tr>
<tr>
<td><code>SortedSet<T></code></td>
<td>支持排序的HashSet，内部用一个红黑树存储，添加删除慢，因为要维护元素的状态</td>
</tr>
<tr>
<td>Queue、<code>Queue<T></code></td>
<td>队列，先进先出（FIFO），Enqueue（后入）、Dequeue（前出）</td>
</tr>
<tr>
<td><code>PriorityQueue<T, TP></code></td>
<td>支持优先级顺序的队列，只能保证优先级顺序，相同优先级不保证先进先出</td>
</tr>
<tr>
<td>Stack、<code>Stack<T></code></td>
<td>栈表，后进先出（LIFO），Push（前入）、Pop（前出）</td>
</tr>
<tr>
<td>SortedList<code><TKey,TValue></code></td>
<td>按照Key排序的列表，内部为数组，支持索引器，在插入时按照顺序存储（性能较差）</td>
</tr>
<tr>
<td><strong>SortedDictionary</strong><code><TK,TV></code></td>
<td>同SortedList，内部为红黑树存储键值对，大数据量时性能更好，不支持索引器。</td>
</tr>
<tr>
<td>LinkedList<code><T></code></td>
<td>双向链表，每个节点包含一个值、前节点指针、后节点指针，不支持索引器。插入删除很快O(1)，不会移动元素，查找O(n)较慢，会遍历整个集合。</td>
</tr>
<tr>
<td>ListDictionary</td>
<td>单向链表字典，轻量级的字典，它适用于小型集合（小于10个）。</td>
</tr>
<tr>
<td>HybridDictionary</td>
<td>Hashtable（高查询效率）和ListDictionary（内存少）的杂交集合，根据数量内部切换容器。</td>
</tr>
<tr>
<td>ReadOnlyCollection<code><T>‌</code></td>
<td>只读集合，接收一个<code>IList</code>集合，可以看做是对普通集合的只读包装，修改会抛出异常。</td>
</tr>
<tr>
<td>Immutable***</td>
<td>不可变集合，通过静态方法<code>Create()</code>创建并初始化，任何修改都会创建新的集合。</td>
</tr>
<tr>
<td><code>Collection<TItem></code></td>
<td>专为自定义扩展（继承）用的集合基类</td>
</tr>
<tr>
<td><code>KeyedCollection<TK, TV></code></td>
<td>同上，专为自定义扩展（继承）用的字典集合基类，继承自<code>Collection<TItem></code></td>
</tr>
<tr>
<td>BitArray</td>
<td>位数组（不是字节，只有一位），用来存储<code>bool</code>值，支持位操作。</td>
</tr>
<tr>
<td>Concurrent***</td>
<td>线程安全的集合，各种类型的集合都有线程安全的版本，可多线程访问</td>
</tr>
<tr>
<td>ConcurrentQueue<code><T></code></td>
<td>线程安全的队列 Queue</td>
</tr>
<tr>
<td>ConcurrentStack<code><T></code></td>
<td>线程安全的栈表 Stack</td>
</tr>
<tr>
<td>ConcurrentDictionary<code><T,K></code></td>
<td>线程安全的字典 Dictionary</td>
</tr>
<tr>
<td>BlockingCollection<code><T></code></td>
<td>提供<strong>阻塞</strong>功能的线程安全集合，适合用于生产者-消费者场景，消费者会自动阻塞等待生产者消息</td>
</tr>
<tr>
<td>ConcurrentBag<code><T></code></td>
<td>线程安全的集合<code>List<T></code></td>
</tr>
<tr>
<td>Channel<code><T></code></td>
<td>专用于生产、消费场景的<strong>现代异步</strong>消息队列，比 BlockingCollection更强大、灵活。</td>
</tr>
</tbody>
</table>
<pre><code class="language-csharp">ArrayList arr2 = new ArrayList();
arr2.Add(null);
arr2.Add("sam");
arr2.Add(1);
Console.WriteLine(arr2);
</code></pre>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619832-310840246.webp" alt="" loading="lazy"></p>
<h2 id="32数组array">3.2、⭐数组Array[]</h2>
<p><strong>Array</strong> 数组是一种有序的集合，通过唯一索引编号进行访问。数组<code>T[]</code>是最常用的数据集合了，几乎支持创建任意类型的数组。<strong><code>Array</code></strong>是所有的数组<code>T[]</code>的（隐式）基类，包括一维、多维数组。CLR会将数组隐式转换为 Array 的子类，生成一个伪类型。</p>
<ul>
<li>索引从0开始。</li>
<li><strong>定长</strong>：数组在申明时必须指定长度，超出长度访问会抛出<code>IndexOutOfRangeException</code>异常。</li>
<li><strong>内存连续</strong>：为了高效访问，数组元素在内存中总是连续存储的。如果是值类型数组，值和数组是存储在一起的；如果是引用类型数组，则数组值存储其引用对象的（堆内存）地址。因此数组的访问是非常高效的！</li>
<li>多维数组：矩阵数组用逗号隔开，<code>int[,] arr = {{1,2},{3,4}};</code></li>
<li>多维数组：锯齿形数组（数组的数组），<code>int[][] arr =new int[];</code></li>
</ul>
<pre><code class="language-csharp">int[] arr = new int;    //申请长度100的int数组
int[] arr2 = new int[]{1,2,3}; //申请并赋值，长度为3
int[] arr3 = {1,2,3};       //同上，前面已制定类型，后面可省略
arr = 1;
Console.WriteLine(arr); //未赋值，默认为0
</code></pre>
<blockquote>
<p>📢 几乎大部分编程语言的数组索引都是从0开始的，如C、Java、Python、JavaScript等。当然也有从1开始的，如MATLAB、R、Lua。</p>
</blockquote>
<table>
<thead>
<tr>
<th><strong>属性</strong></th>
<th><strong>特点/说明</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>Length</td>
<td>数组的长度、元素的数量</td>
</tr>
<tr>
<td>Rank</td>
<td>获取数组的维度，一维数组就是1</td>
</tr>
<tr>
<td></td>
<td>索引器，这是方法数组元素的最常用方式，没有之一！</td>
</tr>
<tr>
<td><strong>🔸方法</strong></td>
<td><strong>特点/说明</strong></td>
</tr>
<tr>
<td>AsReadOnly<code><T>(T[])</code></td>
<td>获取一个只读的数组ReadOnlyCollection<code><T></code></td>
</tr>
<tr>
<td>CopyTo(array, index)</td>
<td>复制数组元素到目标数组<code>array</code>，参数<code>index</code>为目标<code>array</code>的索引位置</td>
</tr>
<tr>
<td>object? GetValue(Int32)</td>
<td>获取制定索引位置的值，对应的还有SetValue(obj, index)，这两个方法都会装箱，慎用！</td>
</tr>
<tr>
<td><strong>🔸静态方法</strong></td>
<td><strong>特点/说明</strong></td>
</tr>
<tr>
<td>BinarySearch(Array)</td>
<td>二分查找，返回找到的元素的索引，负数表示没找到。前提是数组必须是有序的。</td>
</tr>
<tr>
<td>Clear(Array)</td>
<td>清除数组的内容</td>
</tr>
<tr>
<td>Clone()</td>
<td>创建 Array 的浅表副本</td>
</tr>
<tr>
<td>Copy(array1, array2) </td>
<td>将一个<code>array1</code>中的元素复制到数组<code>array2</code>中</td>
</tr>
<tr>
<td>CreateInstance(Type, len)</td>
<td>创建数组，指定类型和长度，<code>Array.CreateInstance(typeof(int),10)</code></td>
</tr>
<tr>
<td>Exists<code><T>(arr, Predicate)</code></td>
<td>根据谓词条件判断是否存在的元素，返回bool，<code>Array.Exists(arr,s=>s>5)</code></td>
</tr>
<tr>
<td>Array.Fill(arr, value)</td>
<td>填充数组值为<code>value</code>，<code>Array.Fill(arr,1);</code></td>
</tr>
<tr>
<td>Find<code><T>(arr, Predicate)</code></td>
<td>根据条件查找元素，返回第一个匹配的元素，<code>Array.Find(arr,s=>s>5)</code></td>
</tr>
<tr>
<td>FindLast(arr, Predicate)</td>
<td>同上，返回最后一个匹配的元素</td>
</tr>
<tr>
<td>FindAll(arr, Predicate)</td>
<td>查找所有匹配的元素，返回数组。</td>
</tr>
<tr>
<td>FindIndex(T[], Predicate)</td>
<td>查找第一个匹配元素的索引，对应的还有 FindLastIndex</td>
</tr>
<tr>
<td>ForEach<code>(T[], Action<T>)</code></td>
<td>循环遍历元素执行<code>action</code></td>
</tr>
<tr>
<td>IndexOf、LastIndexOf</td>
<td>根据元素查找索引位置，-1表示没找到</td>
</tr>
<tr>
<td>Reverse<code><T>(T[])</code></td>
<td>反转元素顺序，<code>Array.Reverse(arr)</code>，不会创建新数组。‼️Linq的<code>Reverse</code>会创建新对象</td>
</tr>
<tr>
<td>Resize<code><T>(ref T[], Int)</code></td>
<td>更改数组长度，会创建一个新数组，所以用了<code>ref</code></td>
</tr>
<tr>
<td>Sort(Array)</td>
<td>对数组元素排序，<code>Array.Sort(arr)</code>，不会创建新数组。</td>
</tr>
<tr>
<td>TrueForAll(T[], Predicate)</td>
<td>判断是否所有元素都符合维词条件，返回bool，<code>Array.TrueForAll(arr,s=>s>1)</code></td>
</tr>
<tr>
<td><strong>🔸扩展方法</strong></td>
<td><strong>特点/说明</strong></td>
</tr>
<tr>
<td>IEnumerable Cast<code><T>()</code></td>
<td>强制转换为指定类型的数组，延迟实现+会装箱，<code>var arr2 = arr.Cast<uint>()</code></td>
</tr>
<tr>
<td>AsSpan()</td>
<td>创建数组的<code>Span<T></code>对象</td>
</tr>
<tr>
<td>AsMemory()</td>
<td>创建数组的<code>Memory<T></code>对象</td>
</tr>
</tbody>
</table>
<blockquote>
<p>📢 通过上表发现，Array 的很多方法都是静态方法，而不是实例方法，这一点有点困惑，造成了使用不便。而且大部分方法都可以用Linq的扩展来代替。</p>
</blockquote>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619820-159700194.png" alt="image.png" loading="lazy"></p>
<h2 id="33linq扩展">3.3、Linq扩展</h2>
<p>LINQ to Objects (C#) 提供了大量的对集合操作的扩展，可以使用 LINQ 来查询任何可枚举的集合（IEnumerable）。扩展实现主要集中在代码 Enumerable 类（源码Enumerable.cs），涵盖了查询、排序、分组、统计等各种功能，非常强大。</p>
<ul>
<li>简洁、易读，可以链式操作，简单的代码即可实现丰富的筛选、排序和分组功能。</li>
<li>延迟执行，只有在ToList、ToArray时才会正式执行，和<code>yeild</code>一样的效果。</li>
</ul>
<pre><code class="language-csharp">var arr = Enumerable.Range(1, 100).ToArray(); //生成一个数组
var evens = arr.Where(n => int.IsEvenInteger(n)); //并没有执行
var arr2 = arr.GroupBy(n => n % 10).ToArray();
</code></pre>
<hr>
<h1 id="04集合的一些小技巧">04、集合的一些小技巧</h1>
<table>
<thead>
<tr>
<th><strong>技巧</strong></th>
<th><strong>说明</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>集合初始化器{}</td>
<td>省略<code>new</code>和类型，用<code>{}</code>初始化值，<code>int[] arr = {1,2,3};</code></td>
</tr>
<tr>
<td>集合表达式[]</td>
<td>C#12，简化集合赋值，比上面更简化，<code>int[] arr = </code></td>
</tr>
<tr>
<td>范围运算符 <code>a..b</code></td>
<td>C#8，表示a到b的范围（不含b），可获取集合中指定范围的子集<code>var sub =arr</code></td>
</tr>
<tr>
<td><code>^n</code>倒数</td>
<td>C#8，索引倒数，<code>arr[^1]</code>//倒数第一个</td>
</tr>
<tr>
<td>..展开运算符</td>
<td>支持集合、可枚举表达式，展开每个枚举元素到数组，配合集合表达式使用比较方便。</td>
</tr>
</tbody>
</table>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619819-805669446.png" alt="image.png" loading="lazy"></p>
<h2 id="41集合初始化器">4.1、集合初始化器{}</h2>
<p>同类的初始化器类似，用<code>{}</code>来初始化设置集合值，支持数组、字典。</p>
<pre><code class="language-csharp">//数组
int[] arr1 = new int { 1, 2, 3 };
int[] arr2 = new int[] { 1, 2, 3 };
int[] arr4 = { 1, 2, 3 };
//字典
Dictionary<int, string> dict1 = new() { { 1, "sam" }, { 2, "william" } };
Dictionary<int, string> dict2 = new() { = "sam", = "zhangsan" }; //索引器写法
var dict3 = new Dictionary<int, string> { { 1, "sam" }, { 2, "william" } };
</code></pre>
<h2 id="42集合表达式">4.2、集合表达式[]</h2>
<p>集合表达式简化了集合的申明和赋值，直接用<code>[]</code>赋值，比初始化器更简洁，语法形式和<code>JavaScript</code>差不多了。可用于数组、Sapn、List，还可以自定义集合生成器。</p>
<pre><code class="language-csharp">int[] iarr1 = new int[] { 1, 2, 3, 4 }; //完整的申明方式
int[] iarr2 = { 1, 2, 3, 4 }; //前面声明有类型int[]，可省略new
int[] iarr3 = ; //简化版的集合表达式

List<string> list = ["a1", "b1", "c1"];
Span<char> sc = ['a', 'b', 'c'];
HashSet<string> set = ["a2", "b2", "c2"];

//..展开运算符，把集合中的元素展开
List<string> list2 = [.. list,..set, "ccc"]; //a1 b1 c1 a2 b2 c2 ccc
</code></pre>
<h2 id="43范围运算符">4.3、范围运算符..</h2>
<p><code>a..b</code>表示a到b的范围（不含b），其本质是 System.Range 类型数据，表示一个索引范围，常用与集合操作。</p>
<ul>
<li>可省略<code>a</code>或<code>b</code>，缺省则表示到边界。</li>
<li>可结合倒数<code>^</code>使用。</li>
</ul>
<pre><code class="language-csharp">int[] arr =new[] { 0, 1, 2, 3, 4, 5 };
Console.WriteLine(arr); //1 2//索引1、2
Console.WriteLine(arr); //3 4 5 //索引3到结尾
Console.WriteLine(arr[..]);//全部
Console.WriteLine(arr[^2..]);//4 5 //倒数到2到结尾

var r = 1..3;
Console.WriteLine(r.GetType()); //System.Range
</code></pre>
<blockquote>
<p>自定义的索引器也可用用范围<code>Range</code>作为范围参数。</p>
</blockquote>
<h1 id="05提高集合性能的一些实践">05、提高集合性能的一些实践</h1>
<table>
<thead>
<tr>
<th><strong>技巧</strong></th>
<th><strong>说明</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>使用泛型版本集合</td>
<td>尽量不使用非泛型版本的集合（如ArrayList、Hashtable），避免装箱。</td>
</tr>
<tr>
<td>初始化合适的容量</td>
<td>创建集合时，根据实际情况尽量给定一个合适的初始容量，避免频繁的扩容</td>
</tr>
<tr>
<td>使用Span</td>
<td>使用<code>Span<T></code>和<code>Memory<T></code>进行高效内存访问，更多参考《高性能的Span、Memory》</td>
</tr>
<tr>
<td>使用<code>ArrayPool<T></code></td>
<td>对于频繁获取集合数据的场景，采用池化技术复用数组对象，减少数组对象的创建</td>
</tr>
</tbody>
</table>
<p><strong>🚩尽量给集合一个合适的“容量”（ capacity）</strong>，几乎所有可变长集合的“动态变长”其实都是有代价的。他们内部会有一个定长的“数组”，当添加元素较多（大于容量）时，就会自动扩容（如倍增），然后把原有“数组”数据拷贝（搬运）到新“数组“中。</p>
<ul>
<li>因此在使用可变长集合时，尽量给一个合适的大小，可减少频繁扩容带来的性能影响。当然也不可盲目设置一个比较大的容量，这就很浪费内存空间了。<code>stringBuilder</code>也是一样的道理。</li>
<li>可变长集合的插入、删除效率都不高，因为会移动其后续元素。</li>
</ul>
<p>下面测试一下<code>List<T></code>，当创建一个长度为1000的<code>List</code>时，设置容量（1000）和不设置容量（默认4）的对比。</p>
<pre><code class="language-csharp">int max = 10000;
public void List_AutoLength(){
List<int> arr = new List<int>();
for (int i = 0; i < max; i++)
{
   arr.Add(i);
}
}
public void List_FixedLength()
{
List<int> arr = new List<int>(max);
for (int i = 0; i < max; i++)
{
   arr.Add(i);
}
}
</code></pre>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619823-1846856710.png" alt="image.png" loading="lazy"></p>
<p>很明显，自动长度的<code>List</code>速度更慢，也消耗了更多的内存。</p>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619829-1688762282.png" alt="image.png" loading="lazy"></p>
<p><strong>🚩尽量不创建新数组</strong>，使用一些数组方法时需要注意尽量不要创建新的数组，如下面示例代码：</p>
<pre><code class="language-csharp">var arr = Enumerable.Range(1, 100).ToArray();
// 需求：对arr进行反序操作
var arr2 = arr.Reverse().ToArray(); //用Linq，创建了新数组
Array.Reverse(arr);             //使用Array的静态方法，原地反序，没有创建新对象
</code></pre>
<p>比较一下上面两种反序的性能：</p>
<p><img src="https://img2024.cnblogs.com/blog/151257/202406/151257-20240603202619828-546294567.png" alt="image.png" loading="lazy"></p>
<hr>
<h1 id="参考资料">参考资料</h1>
<ul>
<li>集合和数据结构</li>
<li>《C#8.0 In a Nutshell》</li>
</ul>
<hr>
<blockquote>
<p><strong>©️版权申明</strong>：版权所有@安木夕，本文内容仅供学习，欢迎指正、交流，转载请注明出处！<em>原文编辑地址-语雀</em></p>
</blockquote><br><br>
来源：https://www.cnblogs.com/anding/p/18229596

頁: [1]

圆梦公社's Archiver

C#.Net筑基-集合知识全解