A Guide to Efficient Programming in Modern C#

向日葵滴約仃 posted on 2020-9-24 16:47:00

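<h2 id="前言">Preface</h2>
<p>From version 7 through the current version 9, C# has gained a large number of features, many of which improve performance, robustness, conciseness, or readability. Here I collect the idioms I personally recommend when writing modern C#. They may not suit everyone, but I hope you find them helpful.</p>
<p>Note: this guide applies to .NET 5 and later.</p>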
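<h2 id="使用-ref-struct-做到-0-gc">Use ref struct for zero GC</h2>
<p>C# 7.2 introduced the <code>ref struct</code>. It is still a <code>struct</code> and lives on the stack, but unlike an ordinary <code>struct</code> it cannot implement any interface (as of C# 9), and the compiler guarantees that it is never boxed, so it puts no pressure on the GC. The trade-off is a hard restriction: an instance can never escape the stack, which means it cannot be a field of a class, be captured by a lambda, or be used across an <code>await</code>.</p>
<p><code>Span<T></code> is built on <code>ref struct</code>. It wraps memory access in a safe, high-performance API and can replace pointers in most situations with no loss of performance.</p>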
<pre><code class="language-csharp">ref struct MyStruct
{
public int Value { get; set; }
}

class RefStructGuide
{
static void Test()
{
   MyStruct x = new MyStruct();
   x.Value = 100;
   Foo(x); // ok
   Bar(x); // error, x cannot be boxed
}

static void Foo(MyStruct x) { }

static void Bar(object x) { }
}
</code></pre>
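<p>As a slightly more practical illustration, here is a minimal sketch of a <code>ref struct</code> that walks over the segments of a separated string without allocating substrings. The names <code>SegmentReader</code>, <code>TryReadNext</code>, and <code>SegmentDemo</code> are hypothetical and not part of the original article; the sketch assumes well-formed, non-empty input.</p>
<pre><code class="language-csharp">using System;

// Hypothetical helper: iterates over separated segments without allocating substrings.
ref struct SegmentReader
{
    private ReadOnlySpan<char> remaining;
    private readonly char separator;
    private bool finished;

    public SegmentReader(ReadOnlySpan<char> text, char separator)
    {
        remaining = text;
        this.separator = separator;
        finished = false;
    }

    // Returns false once every segment has been consumed.
    public bool TryReadNext(out ReadOnlySpan<char> segment)
    {
        if (finished)
        {
            segment = default;
            return false;
        }

        int index = remaining.IndexOf(separator);
        if (index < 0)
        {
            // Last segment: hand out everything that is left.
            segment = remaining;
            remaining = default;
            finished = true;
        }
        else
        {
            segment = remaining.Slice(0, index);
            remaining = remaining.Slice(index + 1);
        }
        return true;
    }
}

static class SegmentDemo
{
    // Sums the integers in a string such as "1,2,3" with no intermediate strings.
    public static int SumCsv(string text)
    {
        var reader = new SegmentReader(text.AsSpan(), ',');
        int sum = 0;
        while (reader.TryReadNext(out ReadOnlySpan<char> segment))
        {
            sum += int.Parse(segment);
        }
        return sum;
    }
}
</code></pre>
<p>Because <code>SegmentReader</code> holds a <code>ReadOnlySpan<char></code> field, it has to be a <code>ref struct</code> itself; an ordinary struct or class could not store the span.</p>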
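<h2 id="使用-in-关键字传递不可修改的引用">Use the in keyword to pass read-only references</h2>
<p>When an argument is passed by <code>ref</code>, it is passed by reference, but nothing stops the callee from modifying the value. Changing <code>ref</code> to <code>in</code> guarantees that the callee cannot:</p>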
<pre><code class="language-csharp">SomeBigReadonlyStruct x = ...;
Foo(x);

void Foo(in SomeBigReadonlyStruct v)
{
v = ...; // error
}
</code></pre>
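<p>The benefit is most visible with large <code>readonly struct</code> types, because no copy is made on the call. For a struct that is not <code>readonly</code>, the compiler may insert defensive copies when its members are accessed, which can cancel out the gain.</p>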
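<p>A minimal, self-contained sketch of the idea follows. The type <code>Matrix2x2</code> and the method <code>Determinant</code> are made up for illustration; a real candidate for <code>in</code> would typically be larger than this.</p>
<pre><code class="language-csharp">using System;

// Hypothetical 32-byte struct; 'readonly' lets the compiler skip defensive copies.
readonly struct Matrix2x2
{
    public readonly double M11, M12, M21, M22;

    public Matrix2x2(double m11, double m12, double m21, double m22)
    {
        M11 = m11; M12 = m12; M21 = m21; M22 = m22;
    }
}

static class InDemo
{
    // The argument is passed by reference, but it cannot be reassigned or mutated here.
    public static double Determinant(in Matrix2x2 m) => m.M11 * m.M22 - m.M12 * m.M21;
}
</code></pre>
<p>A call such as <code>InDemo.Determinant(matrix)</code> needs no keyword at the call site; writing <code>in matrix</code> explicitly is optional.</p>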
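<h2 id="使用-stackalloc-在栈上分配连续内存">Use stackalloc to allocate contiguous memory on the stack</h2>
<p>In performance-sensitive code that needs a small amount of contiguous memory, you do not have to use an array. You can allocate directly on the stack with <code>stackalloc</code> and access the memory safely through <code>Span<T></code>, again with zero GC pressure. Keep such buffers small, because stack space is limited.</p>
<p><code>stackalloc</code> accepts any unmanaged value type. Note, however, that <code>Span<T></code> does not currently support a <code>ref struct</code> as its type argument, so for a <code>ref struct</code> you have to use a pointer directly.</p>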
<pre><code class="language-csharp">ref struct MyStruct
{
public int Value { get; set; }
}

class AllocGuide
{
static unsafe void RefStructAlloc()
{
   MyStruct* x = stackalloc MyStruct[10];
   for (int i = 0; i < 10; i++)
   {
         *(x + i) = new MyStruct { Value = i };
   }
}

static void StructAlloc()
{
   Span<int> x = stackalloc int[10];
   for (int i = 0; i < x.Length; i++)
   {
         x[i] = i;
   }
}
}
</code></pre>
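<p>Since stack space is limited, a common pattern is to use <code>stackalloc</code> only below a size threshold and fall back to a pooled array above it. The following is a sketch under that assumption; the threshold of 256 and the names <code>StackallocDemo</code> and <code>Reverse</code> are illustrative choices, not something the original article prescribes.</p>
<pre><code class="language-csharp">using System;
using System.Buffers;

static class StackallocDemo
{
    private const int StackLimit = 256; // assumed threshold; tune it for your workload

    // Reverses the characters of the input, using the stack for small inputs
    // and a pooled array for large ones.
    public static string Reverse(string text)
    {
        char[] rented = null;
        Span<char> buffer = text.Length <= StackLimit
            ? stackalloc char[StackLimit]
            : (rented = ArrayPool<char>.Shared.Rent(text.Length));

        // Both the stack buffer and a rented array may be longer than needed.
        buffer = buffer.Slice(0, text.Length);
        try
        {
            text.AsSpan().CopyTo(buffer);
            buffer.Reverse();
            return new string(buffer);
        }
        finally
        {
            // Only pooled arrays need to be returned; stack memory is released on return.
            if (rented != null) ArrayPool<char>.Shared.Return(rented);
        }
    }
}
</code></pre>
<p>The only heap allocation on the small-input path is the resulting string itself.</p>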
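<h2 id="使用-span-操作连续内存">Use Span<T> to work with contiguous memory</h2>
<p>C# 7.2 introduced <code>Span<T></code>, which wraps memory access in a safe, high-performance API and can replace pointer manipulation in most situations.</p>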
<pre><code class="language-csharp">static void SpanTest()
{
Span<int> x = stackalloc int[10];
for (int i = 0; i < x.Length; i++)
{
   x[i] = i;
}

ReadOnlySpan<char> str = "12345".AsSpan();
for (int i = 0; i < str.Length; i++)
{
   Console.WriteLine(str[i]);
}
}
</code></pre>
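<p>One practical consequence is that a single method taking a span can serve arrays, stack memory, and slices of either, without copying. A small sketch, with the hypothetical names <code>SpanDemo</code>, <code>Sum</code>, and <code>SumMiddle</code>:</p>
<pre><code class="language-csharp">using System;

static class SpanDemo
{
    // Sums any contiguous run of ints: arrays, stack memory, or slices of either.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        foreach (int value in values) total += value;
        return total;
    }

    // Sums everything except the first and last element (expects at least two elements).
    public static int SumMiddle(int[] data)
    {
        // AsSpan(start, length) creates a view over the same memory; nothing is copied.
        ReadOnlySpan<int> middle = data.AsSpan(1, data.Length - 2);
        return Sum(middle);
    }
}
</code></pre>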
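<h2 id="性能敏感时对于频繁调用的函数使用-skiplocalsinit">Use SkipLocalsInit on frequently called functions in performance-sensitive code</h2>
<p>To keep code safe, C# zero-initializes every local variable, whether or not that is necessary. Usually this has little effect on performance, but if a function works with a lot of stack-allocated memory and is called frequently, the cost adds up and can no longer be ignored.</p>
<p>In that case you can apply the <code>SkipLocalsInit</code> attribute to turn off automatic initialization of locals.</p>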
<pre><code class="language-csharp">[SkipLocalsInit]
unsafe static void Main()
{
Guid g;
Console.WriteLine(*&g);
}
</code></pre>
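<p>The code above prints an unpredictable value, because <code>g</code> is not initialized to zero. Note also that reading an uninitialized variable requires pointer access inside an <code>unsafe</code> context.</p>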
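<p>In practice the attribute is most useful together with <code>stackalloc</code>, as long as every element is written before it is read. The sketch below assumes the project is compiled with <code>AllowUnsafeBlocks</code> enabled, which the attribute requires; the names <code>SkipInitDemo</code> and <code>SumOfSquares</code> and the buffer size of 64 are illustrative.</p>
<pre><code class="language-csharp">using System;
using System.Runtime.CompilerServices;

static class SkipInitDemo
{
    // Requires AllowUnsafeBlocks to be enabled for the project.
    [SkipLocalsInit]
    public static int SumOfSquares(int count)
    {
        // The buffer is not zeroed, so every element is written before it is read.
        Span<int> buffer = stackalloc int[64];
        if (count > buffer.Length) throw new ArgumentOutOfRangeException(nameof(count));

        for (int i = 0; i < count; i++) buffer[i] = i * i;

        int sum = 0;
        for (int i = 0; i < count; i++) sum += buffer[i];
        return sum;
    }
}
</code></pre>
<p>Elements at index <code>count</code> and beyond hold leftover stack contents and are never read here.</p>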
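<h2 id="使用函数指针代替-marshal-进行互操作">Use function pointers instead of Marshal for interop</h2>
<p>C# 9 brings function pointers. The feature supports both managed and unmanaged functions, and using function pointers for native interop can improve performance significantly.</p>
<p>For example, suppose you have the following C++ code:</p>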
<pre><code class="language-cpp">#define UNICODE
#define WIN32
#include <cstring>

extern "C" __declspec(dllexport) char* __cdecl InvokeFun(char* (*foo)(int)) {
return foo(5);
}
</code></pre>
<p>and you write the following C# code to interoperate with it:</p>
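<pre><code class="language-csharp">// The library name "NativeLib" is a placeholder; use the name of your native DLL.
[DllImport("NativeLib")]
static unsafe extern string InvokeFun(delegate* unmanaged[Cdecl]<int, IntPtr> fun);

[UnmanagedCallersOnly(CallConvs = new[] { typeof(CallConvCdecl) })]
public static IntPtr Foo(int x)
{
// Build a string made of x copies of "x".
var str = Enumerable.Repeat("x", x).Aggregate((a, b) => $"{a}{b}");
// The marshaller frees a returned string with CoTaskMemFree, so allocate with the matching API.
return Marshal.StringToCoTaskMemAnsi(str);
}

static unsafe void Main(string[] args)
{
// &Foo converts directly to an unmanaged function pointer because Foo is UnmanagedCallersOnly.
delegate* unmanaged[Cdecl]<int, IntPtr> callback = &Foo;
Console.WriteLine(InvokeFun(callback));
}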
</code></pre>
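<p>In the code above, C# passes its own <code>Foo</code> method to the C++ <code>InvokeFun</code> function as a function pointer; C++ then calls it with the argument 5 and returns its result to the C# caller.</p>
<p>Note that the code also uses the <code>UnmanagedCallersOnly</code> attribute. It declares that the method is only ever called from unmanaged code, which allows additional optimizations; in exchange, the method can no longer be called directly from managed code.</p>
<p>The IL generated for a function pointer is very efficient:</p>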
<pre><code class="language-msil">ldftn native int Test.Program::Foo(int32)
stloc.0
ldloc.0
call string Test.Program::InvokeFun(method native int *(int32))
</code></pre>
<p>Function pointers are not limited to unmanaged code; managed functions can use them too:</p>
<pre><code class="language-csharp">static void Foo(int v) { }
unsafe static void Main(string[] args)
{
delegate* managed<int, void> fun = &Foo;
fun(4);
}
</code></pre>
<p>The generated code is more efficient than the equivalent <code>Delegate</code> call:</p>
<pre><code class="language-msil">ldftn void Test.Program::Foo(int32)
stloc.0
ldc.i4.4
ldloc.0
calli void(int32)
</code></pre>
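<h2 id="使用模式匹配">Use pattern matching</h2>
<p>With <code>if-else</code>, <code>as</code>, and casts already available, why use pattern matching? There are three reasons: performance, robustness, and readability.</p>
<p>Why performance? Because the C# compiler works out an efficient decision path from your patterns.</p>
<p>Consider the following code (Code 1):</p>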
<pre><code class="language-csharp">int Match(int v)
{
if (v > 3)
{
   return 5;
}
if (v < 3)
{
   if (v > 1)
   {
         return 6;
   }
   if (v > -5)
   {
         return 7;
   }
   else
   {
         return 8;
   }
}
return 9;
}
</code></pre>
<p>Rewritten with pattern matching and a <code>switch</code> expression, it becomes (Code 2):</p>
<pre><code class="language-csharp">int Match(int v)
{
return v switch
{
   > 3 => 5,
   < 3 and > 1 => 6,
   < 3 and > -5 => 7,
   < 3 => 8,
   _ => 9
};
}
</code></pre>
<p>The compiler turns the code above into the equivalent of:</p>
<pre><code class="language-csharp">int Match(int v)
{
if (v > 1)
{
   if (v <= 3)
   {
         if (v < 3)
         {
            return 6;
         }
         return 9;
   }
   return 5;
}
if (v > -5)
{
   return 7;
}
return 8;
}
</code></pre>
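<p>Let's count the comparisons needed to reach each return value and average them, weighting the five outcomes equally:</p>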
<table>
<thead>
<tr>
<th>Code</th>
<th>Returns 5</th>
<th>Returns 6</th>
<th>Returns 7</th>
<th>Returns 8</th>
<th>Returns 9</th>
<th>Total</th>
<th>Average</th>
</tr>
</thead>
<tbody>
<tr>
<td>Code 1</td>
<td>1</td>
<td>3</td>
<td>4</td>
<td>4</td>
<td>2</td>
<td>14</td>
<td>2.8</td>
</tr>
<tr>
<td>Code 2</td>
<td>2</td>
<td>3</td>
<td>2</td>
<td>2</td>
<td>3</td>
<td>12</td>
<td>2.4</td>
</tr>
</tbody>
</table>
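<p>As you can see, with pattern matching the compiler chose a better comparison strategy. You no longer have to think about how to order the conditions, which lowers the mental load, and Code 2 is clearly more readable and concise than Code 1: every branch is visible at a glance.</p>
<p>It goes even further in a case like this:</p>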
<pre><code class="language-csharp">int Match(int v)
{
return v switch
{
   1 => 5,
   2 => 6,
   3 => 7,
   4 => 8,
   _ => 9
};
}
</code></pre>
<p>Here the compiler emits a <code>switch</code> statement instead of a chain of conditionals:</p>
<pre><code class="language-csharp">int Match(int v)
{
switch (v)
{
   case 1:
         return 5;
   case 2:
         return 6;
   case 3:
         return 7;
   case 4:
         return 8;
   default:
         return 9;
}
}
</code></pre>
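<p>With that, no chain of comparisons is needed at all: a <code>switch</code> over dense integer values compiles to a jump table, so execution jumps straight to the matching case (for integers this does not involve hash codes).</p>
<p>The compiler picks an efficient strategy for you.</p>
<p>What about robustness? Suppose you miss a branch:</p>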
<pre><code class="language-csharp">int v = 5;
var x = v switch
{
> 3 => 1,
< 3 => 2
};
</code></pre>
<p>When you compile this, the compiler warns (CS8509) that the case where <code>v</code> is 3 is not handled, which helps catch mistakes early.</p>
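<p>If the warning is ignored, a value that matches no arm causes a <code>SwitchExpressionException</code> at run time. A minimal sketch of the fixed version, using the hypothetical names <code>ExhaustiveDemo</code> and <code>Classify</code>, might look like this:</p>
<pre><code class="language-csharp">static class ExhaustiveDemo
{
    public static int Classify(int v) => v switch
    {
        > 3 => 1,
        < 3 => 2,
        3 => 0 // covers the value the compiler reported as unhandled
    };
}
</code></pre>
<p>With the third arm in place, the three patterns cover every <code>int</code>, so the switch expression is exhaustive and the warning disappears.</p>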
<p>Finally, readability.</p>
<p>Suppose you have types like these:</p>
<pre><code class="language-csharp">abstract class Entry { }

class UserEntry : Entry
{
public int UserId { get; set; }
}

class DataEntry : Entry
{
public int DataId { get; set; }
}

class EventEntry : Entry
{
public int EventId { get; set; }
// If CanRead is false, a query returns an empty string immediately
public bool CanRead { get; set; }
}
</code></pre>
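<p>Now suppose a function takes an <code>Entry</code> parameter and queries the database for the matching <code>Content</code> depending on the concrete type of the <code>Entry</code>. All you need to write is:</p>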
<pre><code class="language-csharp">string QueryMessage(Entry entry)
{
return entry switch
{
   UserEntry u => dbContext1.User.FirstOrDefault(i => i.Id == u.UserId).Content,
   DataEntry d => dbContext1.Data.FirstOrDefault(i => i.Id == d.DataId).Content,
   EventEntry { EventId: var eventId, CanRead: true } => dbContext1.Event.FirstOrDefault(i => i.Id == eventId).Content,
   EventEntry { CanRead: false } => "",
   _ => throw new ArgumentException("Invalid argument")
};
}
</code></pre>
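<p>Going a step further, suppose the entries are spread across database 1 and database 2: if an entry is not found in database 1, database 2 must be queried, and only if it is missing there too should an empty string be returned. Because C# pattern matching supports recursive (nested) patterns, this is all you need:</p>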
<pre><code class="language-csharp">string QueryMessage(Entry entry)
{
return entry switch
{
   UserEntry u => dbContext1.User.FirstOrDefault(i => i.Id == u.UserId) switch
   {
         null => dbContext2.User.FirstOrDefault(i => i.Id == u.UserId)?.Content ?? "",
         var found => found.Content
   },
   DataEntry d => dbContext1.Data.FirstOrDefault(i => i.Id == d.DataId) switch
   {
         null => dbContext2.Data.FirstOrDefault(i => i.Id == d.DataId)?.Content ?? "",
         var found => found.Content
   },
   EventEntry { EventId: var eventId, CanRead: true } => dbContext1.Event.FirstOrDefault(i => i.Id == eventId) switch
   {
         null => dbContext2.Event.FirstOrDefault(i => i.Id == eventId)?.Content ?? "",
         var found => found.Content
   },
   EventEntry { CanRead: false } => "",
   _ => throw new ArgumentException("Invalid argument")
};
}
</code></pre>
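<p>And that is everything. The code is concise and the data flow is easy to follow. Even someone who has never seen this code can read through the patterns and understand each branch right away, without untangling a pile of <code>if-else</code> statements to work out what the code does.</p>
<h2 id="使用记录类型和不可变数据">Use record types and immutable data</h2>
<p><code>record</code> is a new tool in C# 9. Together with <code>init</code>-only properties, it gives us an efficient way to exchange data along with immutability.</p>
<p>Removing mutability removes side effects, and a function without side effects does not have to worry about synchronization or mutual exclusion, which makes it very useful in lock-free parallel programming.</p>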
<pre><code class="language-csharp">record Point(int X, int Y);
</code></pre>
<p>This single line is equivalent to writing the code below. It takes care of formatted <code>ToString()</code> output, value-based <code>GetHashCode()</code>, equality comparison, and more:</p>
<pre><code class="language-csharp">internal class Point : IEquatable<Point>
{
private readonly int x;
private readonly int y;

protected virtual Type EqualityContract => typeof(Point);

public int X
{
   get => x;
   init => x = value;
}

public int Y
{
   get => y;
   init => y = value;
}

public Point(int X, int Y)
{
   x = X;
   y = Y;
}

public override string ToString()
{
   StringBuilder stringBuilder = new StringBuilder();
   stringBuilder.Append("Point");
   stringBuilder.Append(" { ");
   if (PrintMembers(stringBuilder))
   {
         stringBuilder.Append(" ");
   }
   stringBuilder.Append("}");
   return stringBuilder.ToString();
}

protected virtual bool PrintMembers(StringBuilder builder)
{
   builder.Append("X");
   builder.Append(" = ");
   builder.Append(X.ToString());
   builder.Append(", ");
   builder.Append("Y");
   builder.Append(" = ");
   builder.Append(Y.ToString());
   return true;
}

public static bool operator !=(Point r1, Point r2)
{
   return !(r1 == r2);
}

public static bool operator ==(Point r1, Point r2)
{
   if ((object)r1 != r2)
   {
         if ((object)r1 != null)
         {
            return r1.Equals(r2);
         }
         return false;
   }
   return true;
}

public override int GetHashCode()
{
   return (EqualityComparer<Type>.Default.GetHashCode(EqualityContract) * -1521134295 + EqualityComparer<int>.Default.GetHashCode(x)) * -1521134295 + EqualityComparer<int>.Default.GetHashCode(y);
}

public override bool Equals(object obj)
{
   return Equals(obj as Point);
}

public virtual bool Equals(Point other)
{
   if ((object)other != null && EqualityContract == other.EqualityContract && EqualityComparer<int>.Default.Equals(x, other.x))
   {
         return EqualityComparer<int>.Default.Equals(y, other.y);
   }
   return false;
}

public virtual Point Clone()
{
   return new Point(this);
}

protected Point(Point original)
{
   x = original.x;
   y = original.y;
}

public void Deconstruct(out int X, out int Y)
{
   X = this.X;
   Y = this.Y;
}
}
</code></pre>
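<p>Note that <code>x</code> and <code>y</code> are both <code>readonly</code>, so an instance cannot change once it has been created. To get a modified value, create a copy with <code>with</code>. The original is never mutated, which removes this source of side effects.</p>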
<pre><code class="language-csharp">var p1 = new Point(1, 2);
var p2 = p1 with { Y = 3 }; // (1, 3)
</code></pre>
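<p>To make the value semantics concrete, here is a small sketch showing that <code>with</code> leaves the original untouched and that two records with the same data compare equal even though they are different instances. The class <code>RecordDemo</code> and its method are hypothetical helpers for illustration.</p>
<pre><code class="language-csharp">using System;

record Point(int X, int Y);

static class RecordDemo
{
    public static bool ValueEquality()
    {
        var a = new Point(1, 2);
        var b = a with { Y = 3 }; // copy with Y changed; a keeps Y == 2
        var c = b with { Y = 2 }; // another copy that now holds the same data as a

        // a and c are distinct instances that are equal by value; a and b differ.
        return a.Y == 2 && a == c && !ReferenceEquals(a, c) && a != b;
    }
}
</code></pre>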
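<p>You can also declare <code>init</code> properties yourself to state that a property may only be assigned during initialization. Note that in C# 9 the <code>with</code> expression only works on records, so the type below is declared as a <code>record</code>:</p>
<pre><code class="language-csharp">record Point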
{
public int X { get; init; }
public int Y { get; init; }
}
</code></pre>
<p>This way, once a <code>Point</code> has been created, the values of <code>X</code> and <code>Y</code> can no longer be modified, so it can be shared safely in parallel code without locks.</p>
<pre><code class="language-csharp">var p1 = new Point { X = 1, Y = 2 };
p1.Y = 3; // error
var p2 = p1 with { Y = 3 }; //ok
</code></pre>
<h2 id="使用-readonly-类型">Use readonly types</h2>
<p>We discussed the importance of immutability above. A <code>struct</code> can be read-only as well:</p>
<pre><code class="language-csharp">readonly struct Foo
{
public int X { get; set; } // error
}
</code></pre>
<p>The code above fails to compile because the setter violates the read-only constraint on <code>X</code>.</p>
<p>If you change it to:</p>
<pre><code class="language-csharp">readonly struct Foo
{
public int X { get; }
}
</code></pre>
<p>or</p>
<pre><code class="language-csharp">readonly struct Foo
{
public int X { get; init; }
}
</code></pre>
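<p>then there is no problem.</p>
<p><code>Span<T></code> itself is a <code>readonly ref struct</code>: its own state (the reference and the length) cannot change after construction, which prevents accidental modification of the span and keeps it safe to pass around. The elements it points to remain writable; use <code>ReadOnlySpan<T></code> when you need a read-only view.</p>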
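<h2 id="使用局部函数而不是-lambda-创建临时委托">Use local functions instead of lambdas to create temporary delegates</h2>
<p>With an API that takes an <code>Expression<Func<>></code> parameter, a lambda expression is exactly right, because the compiler turns the lambda into an expression tree rather than the delegate you might expect.</p>
<p>When the parameter is a plain <code>Func<></code> or <code>Action<></code>, however, a lambda may not be the best choice: a lambda that captures local variables makes the compiler allocate a closure object, which costs time and adds GC pressure. Since C# 7, a local function is a good replacement:</p>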
<pre><code class="language-csharp">int SomeMethod(Func<int, int> fun)
{
if (fun(3) > 3) return 3;
else return fun(5);
}

void Caller()
{
int Foo(int v) => v + 1;

var result = SomeMethod(Foo);
Console.WriteLine(result);
}
</code></pre>
<p>Because <code>Foo</code> captures nothing, the code above creates no closure object.</p>
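<p>To make the no-capture guarantee explicit, C# 8 lets you mark a local function <code>static</code>, so that any accidental capture becomes a compile error. The sketch below uses the hypothetical names <code>LocalFunctionDemo</code>, <code>Apply</code>, and <code>AddOne</code>, and also shows the difference between calling a local function directly and converting it to a delegate.</p>
<pre><code class="language-csharp">using System;

static class LocalFunctionDemo
{
    static int Apply(Func<int, int> fun, int value) => fun(value);

    public static int Run(int offset)
    {
        // 'static' makes the compiler reject any accidental capture of locals such as offset.
        static int AddOne(int v) => v + 1;

        // Called directly: an ordinary method call, with no delegate and no closure.
        int direct = AddOne(offset);

        // Converted to a delegate: a delegate object is created, but still no closure.
        int viaDelegate = Apply(AddOne, offset);

        return direct + viaDelegate;
    }
}
</code></pre>
<p>The direct call is where a local function saves the most; when a delegate is required anyway, the main benefit is that nothing is captured by accident.</p>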
<h2 id="使用-valuetask-代替-task">Use ValueTask instead of Task</h2>
<p>When we get a <code>Task<T></code>, in most cases we simply <code>await</code> it right away rather than storing it to <code>await</code> later. Much of what <code>Task<T></code> offers then goes unused, while under high concurrency the repeated allocation of <code>Task</code> objects increases GC pressure.</p>
<p>In that situation we can use <code>ValueTask<T></code> in place of <code>Task<T></code>:</p>
<pre><code class="language-csharp">ValueTask<int> Foo()
{
return ValueTask.FromResult(1);
}

async ValueTask Caller()
{
await Foo();
}
</code></pre>
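<p>Because <code>ValueTask<T></code> is a value type, the object itself is not allocated on the heap, which reduces GC pressure when the operation completes synchronously.</p>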
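<p>The saving shows up mainly when a method often completes synchronously, for example on a cache hit. The following is a sketch under that assumption; <code>PriceCache</code>, <code>GetAsync</code>, and the simulated <code>LoadAsync</code> lookup are hypothetical names standing in for real application code.</p>
<pre><code class="language-csharp">using System.Collections.Concurrent;
using System.Threading.Tasks;

class PriceCache
{
    private readonly ConcurrentDictionary<string, int> cache = new();

    // Hypothetical slow lookup standing in for a database or network call.
    private static async Task<int> LoadAsync(string key)
    {
        await Task.Delay(10);
        return key.Length;
    }

    public ValueTask<int> GetAsync(string key)
    {
        // Fast path: completes synchronously without allocating a Task<int>.
        if (cache.TryGetValue(key, out int value))
        {
            return new ValueTask<int>(value);
        }

        // Slow path: wraps the Task<int> produced by the real asynchronous work.
        return new ValueTask<int>(LoadAndStoreAsync(key));
    }

    private async Task<int> LoadAndStoreAsync(string key)
    {
        int value = await LoadAsync(key);
        cache[key] = value;
        return value;
    }
}
</code></pre>
<p>One caveat worth keeping in mind: a <code>ValueTask<T></code> should be awaited only once; if the result has to be awaited several times or stored, convert it with <code>AsTask()</code> first.</p>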
<h2 id="实现解构函数代替创建元组">Implement a Deconstruct method instead of creating tuples</h2>
<p>To extract the data from a type, one option is to return a tuple containing the values we need:</p>
<pre><code class="language-csharp">class Foo
{
private int x;
private int y;

public Foo(int x, int y)
{
   this.x = x;
   this.y = y;
}

public (int, int) Deconstruct()
{
   return (x, y);
}
}

class Program
{
static void Bar(Foo v)
{
   var (x, y) = v.Deconstruct();
   Console.WriteLine($"X = {x}, Y = {y}");
}
}
</code></pre>
<p>The code above incurs the cost of a <code>ValueTuple<int, int></code>. If we instead implement a deconstruction method:</p>
<pre><code class="language-csharp">class Foo
{
private int x;
private int y;

public Foo(int x, int y)
{
   this.x = x;
   this.y = y;
}

public void Deconstruct(out int x, out int y)
{
   x = this.x;
   y = this.y;
}
}

class Program
{
static void Bar(Foo v)
{
   var (x, y) = v;
   Console.WriteLine($"X = {x}, Y = {y}");
}
}
</code></pre>
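<p>we not only drop the explicit call to <code>Deconstruct()</code> but also avoid the extra cost. Notice that implementing a Deconstruct method does not require your type to implement any interface, which rules out boxing entirely; it is a zero-cost abstraction. A Deconstruct method also works in pattern matching, so you can use it just as you would use a tuple (the code below takes <code>y</code> when <code>x</code> is 3, and <code>x + y</code> otherwise):</p>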
<pre><code class="language-csharp">void Bar(Foo v)
{
var result = v switch
{
   Foo (3, var y) => y,
   Foo (var x, var y) => x + y,
   _ => 0
};

Console.WriteLine(result);
}
</code></pre>
<h2 id="null-安全">Null safety</h2>
<p>Enabling null safety in the csproj project file turns on static null analysis for all the code in the project:</p>
<pre><code class="language-xml"><PropertyGroup>
<Nullable>enable</Nullable>
</PropertyGroup>
</code></pre>
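<p>The compiler can then flag many potential causes of a <code>NullReferenceException</code> (NRE) at compile time. Take the following code:</p>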
<pre><code class="language-csharp">var list = new List<Entry>();
var value = list.FirstOrDefault(i => i.Id == 3).Value;
Console.WriteLine(value);
</code></pre>
<p><code>list.FirstOrDefault()</code> may return <code>null</code>, so with null safety enabled the compiler issues a warning, which helps prevent avoidable NREs.</p>
<p>With null safety enabled, you can also mark a reference type as nullable by appending <code>?</code> to the type:</p>
<pre><code class="language-csharp">string? x = null;
</code></pre>
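<p>A minimal sketch of how the warning can be resolved follows. It uses the <code>#nullable enable</code> directive so that it works even without the project-level setting; the <code>Item</code> type, the <code>FindValue</code> method, and the <code>"(not found)"</code> fallback are illustrative assumptions rather than part of the original example.</p>
<pre><code class="language-csharp">#nullable enable
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type used only for this example.
class Item
{
    public int Id { get; init; }
    public string Value { get; init; } = "";
}

static class NullSafetyDemo
{
    public static string FindValue(List<Item> list, int id)
    {
        // FirstOrDefault may return null, so the result is declared as Item?.
        Item? found = list.FirstOrDefault(i => i.Id == id);

        // ?. and ?? handle the null case explicitly, which satisfies the analysis.
        return found?.Value ?? "(not found)";
    }
}
</code></pre>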
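<h2 id="总结">Summary</h2>
<p>Using the new C# features where they fit improves development efficiency while also raising code quality and runtime performance.</p>
<p>But do not overuse them. New features undoubtedly help us write high-quality code, yet applying them indiscriminately can backfire.</p>
<p>I hope this article helps developers who are working with modern C#. Thanks for reading.</p><br><br>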
Source: https://www.cnblogs.com/hez2010/p/13724904.html
