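In day-to-day development most of us program against interfaces, which makes dependency injection, polymorphism, and similar techniques easy to apply. That flexibility is paid for with performance, though. Everything has two sides, so weigh the trade-off against your own scenario.
I: Background
1. Motivation
While tuning the performance of a project, I found that many method signatures declare the IEnumerable<T> interface as their return type, as in the code below: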
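using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    public static void Main(string[] args)
    {
        var list = GetHasEmailCustomerIDList();
        foreach (var item in list) { }
        Console.ReadLine();
    }

    // Declared as IEnumerable<int>, although the object returned is an int[].
    public static IEnumerable<int> GetHasEmailCustomerIDList()
    {
        return Enumerable.Range(1, 5000000).ToArray();
    }
}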
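2. What is the problem?
At first glance this code has no performance problem at all: iterating with foreach is the most natural thing in the world. What could there possibly be left to optimize?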
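To make the question concrete, here is a minimal timing sketch that iterates the same array once through IEnumerable<int> and once through int[]. The names ForeachTiming, SumViaInterface, and SumViaArray are hypothetical and not part of the original project. Stopwatch timings depend heavily on the runtime version, JIT optimizations, and build configuration, so treat the numbers as a rough indication rather than a benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper type, used only for this illustration.
public static class ForeachTiming
{
    // The static type is the interface, so foreach goes through
    // GetEnumerator / MoveNext / Current interface calls.
    public static long SumViaInterface(IEnumerable<int> source)
    {
        long sum = 0;
        foreach (var item in source) sum += item;
        return sum;
    }

    // The static type is int[], so the C# compiler emits an index-based loop.
    public static long SumViaArray(int[] source)
    {
        long sum = 0;
        foreach (var item in source) sum += item;
        return sum;
    }

    public static void Main()
    {
        int[] data = Enumerable.Range(1, 5000000).ToArray();

        var sw = Stopwatch.StartNew();
        long viaInterface = SumViaInterface(data);
        sw.Stop();
        Console.WriteLine($"IEnumerable<int>: sum={viaInterface}, {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        long viaArray = SumViaArray(data);
        sw.Stop();
        Console.WriteLine($"int[]:            sum={viaArray}, {sw.ElapsedMilliseconds} ms");
    }
}
```

Both methods contain the same foreach statement. Only the static type of the parameter differs, and that is what decides which code the compiler generates.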
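Looking for the problem in MSIL
First, let's get as close as possible to what the code really looks like. The simplified MSIL is shown below.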
.method public hidebysig static
    void Main (
        string[] args
    ) cil managed
{
    IL_0009: callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
    IL_000e: stloc.1
    .try
    {
        IL_000f: br.s IL_001a
        // loop start (head: IL_001a)
            IL_0011: ldloc.1
            IL_0012: callvirt instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
            IL_0017: stloc.2
            IL_0018: nop
            IL_0019: nop
            IL_001a: ldloc.1
            IL_001b: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
            IL_0020: brtrue.s IL_0011
        // end loop
        IL_0022: leave.s IL_002f
    } // end .try
    finally
    {
        IL_0024: ldloc.1
        IL_0025: brfalse.s IL_002e
        IL_0027: ldloc.1
        IL_0028: callvirt instance void [mscorlib]System.IDisposable::Dispose()
        IL_002d: nop
        IL_002e: endfinally
    } // end handler
    IL_002f: ret
} // end of method Program::Main
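The IL shows the standard get_Current, MoveNext, and Dispose calls, plus a try/finally block. That is a lot of extra methods and keywords for what is supposed to be a simple foreach over an array. Does it really need to be this complicated? Every element costs two callvirt interface calls (MoveNext and get_Current), so how is this supposed to be fast on large data sets?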
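Translated back into C#, the IL above corresponds roughly to the first method in the sketch below. The second method shows approximately what the compiler produces when the static type is int[] instead. The names ForeachLowering, SumViaEnumerator, and SumViaIndex are hypothetical, and the code is a hand-written approximation of the compiler's expansion, not decompiler output.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type, used only for this illustration.
public static class ForeachLowering
{
    // Approximately what "foreach (var item in list)" becomes
    // when list is typed as IEnumerable<int>.
    public static long SumViaEnumerator(IEnumerable<int> list)
    {
        long sum = 0;
        IEnumerator<int> enumerator = list.GetEnumerator(); // callvirt GetEnumerator
        try
        {
            while (enumerator.MoveNext())       // callvirt IEnumerator::MoveNext
            {
                int item = enumerator.Current;  // callvirt get_Current
                sum += item;
            }
        }
        finally
        {
            // Matches the null check and Dispose call in the finally block of the IL.
            if (enumerator != null) enumerator.Dispose();
        }
        return sum;
    }

    // Approximately what the same foreach becomes when the static type is int[]:
    // a plain index loop with no enumerator object and no try/finally.
    public static long SumViaIndex(int[] array)
    {
        long sum = 0;
        for (int i = 0; i < array.Length; i++)
        {
            int item = array[i];
            sum += item;
        }
        return sum;
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        Console.WriteLine(SumViaEnumerator(data) == SumViaIndex(data));
    }
}
```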
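There is another oddity. Look closely at the IL, for example this instruction: [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator(). GetEnumerator is invoked on the IEnumerable interface, where you might expect to see a concrete enumerable class. Since the object behind the interface is an array, the call should in theory end up in Array's GetEnumerator method, shown below.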
[Serializable]
[ComVisible(true)]
[__DynamicallyInvokable]
public abstract class Array : ICloneable, IList, ICollection, IEnumerable, IStructuralComparable, IStructuralEquatable
{
    [__DynamicallyInvokable]
    public IEnumerator GetEnumerator()
    {
        int lowerBound = GetLowerBound(0);
        if (Rank == 1 && lowerBound == 0)
        {
            return new SZArrayEnumerator(this);
        }
        return new ArrayEnumerator(this, lowerBound, Length);
    }
}
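That expectation can also be checked from code. The sketch below asks an int[] for an enumerator twice, once through IEnumerable<int> and once through the non-generic Array.GetEnumerator() shown above, and prints the runtime type of each result. The names EnumeratorTypes, GenericEnumeratorType, and NonGenericEnumeratorType are hypothetical. The exact type names printed differ between .NET Framework and newer .NET versions, so no expected output is given here.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper type, used only for this illustration.
public static class EnumeratorTypes
{
    // Enumerator obtained through the generic interface,
    // the same path the foreach in Main takes.
    public static Type GenericEnumeratorType(int[] array)
    {
        using (IEnumerator<int> e = ((IEnumerable<int>)array).GetEnumerator())
        {
            return e.GetType();
        }
    }

    // Enumerator obtained from the non-generic Array.GetEnumerator() method.
    public static Type NonGenericEnumeratorType(int[] array)
    {
        IEnumerator e = array.GetEnumerator();
        return e.GetType();
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3 };
        Console.WriteLine("via IEnumerable<int>:      " + GenericEnumeratorType(data).FullName);
        Console.WriteLine("via Array.GetEnumerator(): " + NonGenericEnumeratorType(data).FullName);
    }
}
```

If the two lines print different type names, the generic interface call is not served by the Array.GetEnumerator() method listed above.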
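Looking for the problem in WinDbg
The second issue found in the IL made me especially curious 😄😄, so let's go to the managed heap and see which concrete class GetEnumerator() was actually called on.
Run !clrstack -l and then !do xx (xx being an object address) to grab the list variable and the other locals from the thread stack.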
0:000> !clrstack -l
000000229e3feda0 00007ff889e40951 *** WARNING: Unable to verify checksum for ConsoleApp2.exe
ConsoleApp2.Program.Main(System.String[]) [C:\dream\Csharp\ConsoleApp1\ConsoleApp2\Program.cs @ 32]
LOCALS:
0x000000229e3fede8 = 0x0000019bf33b9a88
0x000000229e3fede0 = 0x0000019be33b2d90
0x000000229e3fedfc = 0x00000000004c4b40
0:000> !do 0x0000019be33b2d90
Name: System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]]
MethodTable: 00007ff8e8d36d18
EEClass: 00007ff8e7cf5640
Size: 32(0x20) bytes
File: C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e7a98538  4002ffe        8       System.Int32[]  0 instance 0000019bf33b9a88 _array
00007ff8e7a985a0  4002fff       10         System.Int32  1 instance          5000000 _index
00007ff8e7a985a0  4003000       14         System.Int32  1 instance          5000000 _endIndex
00007ff8e8d36d18  4003001        0 ...Int32, mscorlib]]  0   shared           static Empty
                                 >> Domain:Value dynamic statics NYI 0000019be1893a80:NotInit