1. Description
Iteration is the foundation of working with collections such as arrays, and it is one of the most common operations in C# development. How well you understand it affects both the flexibility of your code and how quickly you can develop it. This article therefore walks through the common ways of iterating over an array, so you can refer to it when choosing one.
2. Start with a simple example
Implementing the sum of items in an array is very simple. I think most developers implement it this way:
static int Sum(int[] array)
{
    var sum = 0;
    for (var index = 0; index < array.Length; index++)
        sum += array[index];
    return sum;
}
There's actually a simpler alternative in C#:
static int Sum(int[] array)
{
    var sum = 0;
    foreach (var item in array)
        sum += item;
    return sum;
}
Another alternative is to use the Sum() operation provided by LINQ. It can be applied to any enumerable, including arrays.
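For completeness, here is a minimal sketch of the LINQ variant. The sample values are my own and only there for illustration.

```csharp
using System;
using System.Linq;

int[] array = { 0, 1, 2, 3, 4, 5 };

// Sum() is an extension method on IEnumerable<int>, so it can be called directly on an array.
int total = array.Sum();
Console.WriteLine(total); // 15

// The same call works on any other enumerable, for example a filtered sequence.
int evenTotal = array.Where(item => item % 2 == 0).Sum();
Console.WriteLine(evenTotal); // 6
```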
So, how do these three compare in terms of performance?
This benchmark compares the performance for arrays of int with sizes 10 and 1,000 on .NET 6, 7, and 8.
You can see that using a foreach loop is about 30% faster than using a for loop.
The LINQ implementation has been greatly improved in the latest .NET versions. It was much slower in .NET 6, better in .NET 7, and much faster for large arrays in .NET 8.
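If you want to get a rough feeling for these differences on your own machine, a sketch like the following may help. It is not the benchmark behind the numbers above: it uses a plain Stopwatch, the iteration count is an arbitrary value I picked, and the delegate calls add overhead of their own, so treat the output as indicative only. A dedicated benchmarking harness gives far more reliable results.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Array sizes from the text; the iteration count is a hypothetical value.
int[] sizes = { 10, 1_000 };
const int iterations = 100_000;

foreach (var size in sizes)
{
    var data = Enumerable.Range(0, size).ToArray();

    Console.WriteLine($"size {size}:");
    Measure("for", () => SumFor(data));
    Measure("foreach", () => SumForEach(data));
    Measure("LINQ", () => data.Sum());
}

void Measure(string name, Func<int> action)
{
    action(); // warm-up call, so the first JIT compilation is not timed
    var stopwatch = Stopwatch.StartNew();
    var result = 0;
    for (var i = 0; i < iterations; i++)
        result = action();
    stopwatch.Stop();
    Console.WriteLine($"  {name,-8} {stopwatch.Elapsed.TotalMilliseconds:F2} ms (sum = {result})");
}

static int SumFor(int[] values)
{
    var sum = 0;
    for (var index = 0; index < values.Length; index++)
        sum += values[index];
    return sum;
}

static int SumForEach(int[] values)
{
    var sum = 0;
    foreach (var item in values)
        sum += item;
    return sum;
}
```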
3. foreach vs. for
How can foreach be faster than a for loop?
The for and foreach loops are both syntactic sugar for a while loop. The compiler actually generates very similar code when they are used on arrays.
You can check this by pasting the following code into SharpLab:
var array = new[] { 0, 1, 2, 3, 4, 5 };
Console.WriteLine(Sum_For());
Console.WriteLine(Sum_ForEach());

int Sum_For()
{
    var sum = 0;
    for (var index = 0; index < array.Length; index++)
        sum += array[index];
    return sum;
}

int Sum_ForEach()
{
    var sum = 0;
    foreach (var item in array)
        sum += item;
    return sum;
}
The compiler generates the following:
[CompilerGenerated]
private static int <<Main>$>g__Sum_For|0_0(ref <>c__DisplayClass0_0 P_0)
{
    int num = 0;
    int num2 = 0;
    while (num2 < P_0.array.Length)
    {
        num += P_0.array[num2];
        num2++;
    }
    return num;
}

[CompilerGenerated]
private static int <<Main>$>g__Sum_ForEach|0_1(ref <>c__DisplayClass0_0 P_0)
{
    int num = 0;
    int[] array = P_0.array; // copy array reference
    int num2 = 0;
    while (num2 < array.Length)
    {
        int num3 = array[num2];
        num += num3;
        num2++;
    }
    return num;
}
The code is very similar, but notice that foreach copies the reference to the array into a local variable. This allows the JIT compiler to remove the bounds checks, making the iteration faster. Compare the generated assembly:
Program.<<Main>$>g__Sum_For|0_0(<>c__DisplayClass0_0 ByRef)
L0000: sub rsp, 0x28
L0004: xor eax, eax
L0006: xor edx, edx
L0008: mov rcx, [rcx]
L000b: cmp dword ptr [rcx+8], 0
L000f: jle short L0038
L0011: nop [rax]
L0018: nop [rax+rax]
L0020: mov r8, rcx
L0023: cmp edx, [r8+8]
L0027: jae short L003d
L0029: mov r9d, edx
L002c: add eax, [r8+r9*4+0x10]
L0031: inc edx
L0033: cmp [rcx+8], edx
L0036: jg short L0020
L0038: add rsp, 0x28
L003c: ret
L003d: call 0x000002e975d100fc
L0042: int3
Program.<<Main>$>g__Sum_ForEach|0_1(<>c__DisplayClass0_0 ByRef)
L0000: xor eax, eax
L0002: mov rdx, [rcx]
L0005: xor ecx, ecx
L0007: mov r8d, [rdx+8]
L000b: test r8d, r8d
L000e: jle short L001f
L0010: mov r9d, ecx
L0013: add eax, [rdx+r9*4+0x10]
L0018: inc ecx
L001a: cmp r8d, ecx
L001d: jg short L0010
L001f: ret
This leads to improved performance in benchmarks.
Note: if the array is already a local variable, no copy is made, which you can also check in SharpLab. In that case, the performance is equivalent.
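The same trick can be applied by hand to a for loop. The following is a sketch based on the SharpLab example above; Sum_ForLocalCopy is a name I made up. I would expect the JIT to treat it like the foreach version, but that is an assumption worth verifying against the generated assembly for your runtime.

```csharp
using System;

var array = new[] { 0, 1, 2, 3, 4, 5 };
Console.WriteLine(Sum_ForLocalCopy()); // 15

int Sum_ForLocalCopy()
{
    // Copy the captured reference once, the way the compiler does for foreach.
    // The loop condition and the indexer now refer to the same local.
    var local = array;
    var sum = 0;
    for (var index = 0; index < local.Length; index++)
        sum += local[index];
    return sum;
}
```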
4. Slice the array
Sometimes we may only want to iterate over a portion of an array. Again, I think most developers will implement the following:
static int Sum(int[] source, int start, int length)
{
    var sum = 0;
    for (var index = start; index < start + length; index++)
        sum += source[index];
    return sum;
}
This can easily be converted to a foreach by using the Span<T>.Slice() method:
static int Sum(int[] source, int start, int length)
    => Sum(source.AsSpan().Slice(start, length));

static int Sum(ReadOnlySpan<int> source)
{
    var sum = 0;
    foreach (var item in source)
        sum += item;
    return sum;
}
So, how do these two perform?
Using foreach over a portion of the array is also about 20% better than using a for loop.
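There are a couple of other ways to spell the same slice. The sketch below uses the AsSpan(start, length) overload and the range syntax on a span. The method names SumSlice and SumSpan are mine; I used different names because local functions cannot be overloaded.

```csharp
using System;

var source = new[] { 0, 1, 2, 3, 4, 5 };

// Items at indices 2, 3 and 4: 2 + 3 + 4 = 9
Console.WriteLine(SumSlice(source, 2, 3));          // 9
Console.WriteLine(SumSpan(source.AsSpan()[2..5]));  // 9, same slice written as a range

static int SumSlice(int[] source, int start, int length)
    => SumSpan(source.AsSpan(start, length)); // slices without copying any items

static int SumSpan(ReadOnlySpan<int> source)
{
    var sum = 0;
    foreach (var item in source)
        sum += item;
    return sum;
}
```

Either way, no items are copied: the span is only a view over the original array.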
5. LINQ
If you check the source code for Sum() in System.Linq, for versions of .NET prior to .NET 8, you'll find that it uses a foreach loop. So, if using a foreach is faster than a for, why is it so slow in this case?
This Sum() implementation is an extension method for the type IEnumerable<int>. Unlike the Count() and Where() operations, Sum() has no special case for when the source is an array. The compiler translates this implementation into something like this:
static int Sum(this IEnumerable<int> source)
{
    var sum = 0;
    IEnumerator<int> enumerator = source.GetEnumerator();
    try
    {
        while (enumerator.MoveNext())
            sum += enumerator.Current;
    }
    finally
    {
        enumerator?.Dispose();
    }
    return sum;
}
There are several performance issues with this code:
- GetEnumerator() returns IEnumerator<T>. This means that the enumerator is a reference type, which means it must be allocated on the heap, increasing the pressure on the garbage collector.
- IEnumerator<T> derives from IDisposable. The enumerator then has to be disposed of inside a try/finally, so this method cannot be inlined.
- Iterating over an IEnumerable<T> requires calling the MoveNext() method and the Current property. Since the enumerator is a reference type, these calls are virtual.
All of this makes enumeration of arrays much slower.
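A quick way to see the difference between the two kinds of enumerator is to ask each one for its runtime type. This is only a small illustration I put together; List<int> is used as the counterexample because its GetEnumerator() returns a struct.

```csharp
using System;
using System.Collections.Generic;

var array = new[] { 0, 1, 2, 3, 4, 5 };
var list = new List<int>(array);

// Through the interface, the array hands back an IEnumerator<int> object (a class instance).
IEnumerator<int> arrayEnumerator = ((IEnumerable<int>)array).GetEnumerator();
bool arrayEnumeratorIsStruct = arrayEnumerator.GetType().IsValueType;
Console.WriteLine(arrayEnumeratorIsStruct); // False

// List<int>.GetEnumerator() returns the struct List<int>.Enumerator instead.
List<int>.Enumerator listEnumerator = list.GetEnumerator();
bool listEnumeratorIsStruct = listEnumerator.GetType().IsValueType;
Console.WriteLine(listEnumeratorIsStruct); // True
```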
Note: Check out my other post "Performance of Value Type Enumerators vs. Reference Type Enumerators" to learn about the performance differences between these two types of enumerators.
The performance in .NET 8 is much better because Sum() takes a dedicated fast path when the source is an array or a List<T>. If it is an array or list of int or long, it optimizes further by using SIMD, allowing multiple items to be summed at the same time.
Note: Check out my other article "Single Instruction Multiple Data (SIMD) in .NET" to learn how SIMD works and how to use it in your code.
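To give an idea of what summing multiple items at once looks like, here is a minimal sketch using Vector<T> from System.Numerics. It is not the code used by LINQ. SumSimd is a hypothetical name, and unlike the LINQ Sum(), this sketch does not check for overflow.

```csharp
using System;
using System.Numerics;

var array = new int[1_000];
for (var i = 0; i < array.Length; i++)
    array[i] = i;

Console.WriteLine(SumSimd(array)); // 499500

static int SumSimd(ReadOnlySpan<int> source)
{
    var sum = 0;
    var index = 0;

    if (Vector.IsHardwareAccelerated && source.Length >= Vector<int>.Count)
    {
        var accumulator = Vector<int>.Zero;

        // Each iteration adds Vector<int>.Count items to the accumulator lanes.
        for (; index <= source.Length - Vector<int>.Count; index += Vector<int>.Count)
            accumulator += new Vector<int>(source.Slice(index, Vector<int>.Count));

        // Collapse the lanes into a single value.
        sum = Vector.Sum(accumulator);
    }

    // Scalar loop for the remaining items, or for everything when SIMD is not available.
    for (; index < source.Length; index++)
        sum += source[index];

    return sum;
}
```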
6. Conclusion
Iteration of arrays is a special case for which the compiler can perform code optimizations. Using foreach guarantees the best conditions for these optimizations.
Converting the array to IEnumerable<T> makes iterating over it much slower.
Not all LINQ methods are optimized for the case of arrays. Prior to .NET 8, it was better to use a custom implementation of the Sum() method.
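As an illustration of what such a custom implementation might look like, here is a sketch that special-cases arrays and List<int> and falls back to the enumerator for everything else. The names SumEnumerable and SumSpan are hypothetical, and this is not the LINQ source code. Note that the span obtained through CollectionsMarshal.AsSpan() is only valid as long as the list is not modified.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

Console.WriteLine(SumEnumerable(new[] { 1, 2, 3 }));            // array path, prints 6
Console.WriteLine(SumEnumerable(new List<int> { 1, 2, 3 }));    // List<int> path, prints 6
Console.WriteLine(SumEnumerable(new HashSet<int> { 1, 2, 3 })); // enumerator fallback, prints 6

static int SumEnumerable(IEnumerable<int> source)
{
    // Special cases: iterate over a span instead of going through IEnumerator<int>.
    if (source is int[] array)
        return SumSpan(array);
    if (source is List<int> list)
        return SumSpan(CollectionsMarshal.AsSpan(list));

    // General case: any other enumerable.
    var sum = 0;
    foreach (var item in source)
        sum += item;
    return sum;
}

static int SumSpan(ReadOnlySpan<int> source)
{
    var sum = 0;
    foreach (var item in source)
        sum += item;
    return sum;
}
```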