[C# Performance] Array iteration in C# language

1. Description

        Iteration is fundamental to working with collections such as arrays, and it is one of the most common operations in C# development. How well you understand it affects both the flexibility of your code and the pace of development. This article therefore walks through the common ways of iterating over an array and compares their performance, for reference when using them.

2. Start with a simple example

        Implementing the sum of items in an array is very simple. I think most developers implement it this way:

static int Sum(int[] array)
{
    var sum = 0;
    for (var index = 0; index < array.Length; index++)
        sum += array[index];
    return sum;
}

        There's actually a simpler alternative in C#:

static int Sum(int[] array)
{
    var sum = 0;
    foreach (var item in array)
        sum += item;
    return sum;
}

        Another alternative is to use the Sum() operation provided by LINQ. It can be applied to any enumerable, including arrays.

        So, how do these three compare in terms of performance?

        This benchmark compares performance for int arrays of sizes 10 and 1,000 on .NET 6, 7, and 8.
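
        As a minimal sketch of the LINQ variant (the helper name SumLinq and the sample values are illustrative only), the call looks like this:

```csharp
using System;
using System.Linq;

var numbers = new[] { 0, 1, 2, 3, 4, 5 };

// Enumerable.Sum() is an extension method, so it can be called directly on the array.
Console.WriteLine(SumLinq(numbers)); // prints 15

static int SumLinq(int[] array)
    => array.Sum();
```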

        You can see that using a foreach loop is about 30% faster than using a for loop.

        The LINQ implementation has been greatly improved in the latest .NET versions. It was much slower in .NET 6, less slow in .NET 7, and much faster for large arrays in .NET 8.

3. foreach vs. for

        How can foreach be faster than a for loop? Both for and foreach loops are syntactic sugar for while loops. The compiler actually generates very similar code when they are used on arrays.

        You can see the following code in SharpLab:

var array = new[] {0, 1, 2, 3, 4, 5 };

Console.WriteLine(Sum_For());
Console.WriteLine(Sum_ForEach());

int Sum_For()
{
    var sum = 0;
    for (var index = 0; index < array.Length; index++)
        sum += array[index];
    return sum;                     
}

int Sum_ForEach()
{
    var sum = 0;
    foreach (var item in array)
        sum += item;
    return sum;                     
}

        The compiler generates the following:

[CompilerGenerated]
private static int <<Main>$>g__Sum_For|0_0(ref <>c__DisplayClass0_0 P_0)
{
    int num = 0;
    int num2 = 0;
    while (num2 < P_0.array.Length)
    {
        num += P_0.array[num2];
        num2++;
    }
    return num;
}

[CompilerGenerated]
private static int <<Main>$>g__Sum_ForEach|0_1(ref <>c__DisplayClass0_0 P_0)
{
    int num = 0;
    int[] array = P_0.array; // copy array reference
    int num2 = 0;
    while (num2 < array.Length)
    {
        int num3 = array[num2];
        num += num3;
        num2++;
    }
    return num;
}

        The code is very similar, but notice that foreach adds a reference to the array as a local variable. This allows the JIT compiler to remove bounds checks, making iteration faster. Compare the generated assembly:

Program.<<Main>$>g__Sum_For|0_0(<>c__DisplayClass0_0 ByRef)
    L0000: sub rsp, 0x28
    L0004: xor eax, eax
    L0006: xor edx, edx
    L0008: mov rcx, [rcx]
    L000b: cmp dword ptr [rcx+8], 0
    L000f: jle short L0038
    L0011: nop [rax]
    L0018: nop [rax+rax]
    L0020: mov r8, rcx
    L0023: cmp edx, [r8+8]
    L0027: jae short L003d
    L0029: mov r9d, edx
    L002c: add eax, [r8+r9*4+0x10]
    L0031: inc edx
    L0033: cmp [rcx+8], edx
    L0036: jg short L0020
    L0038: add rsp, 0x28
    L003c: ret
    L003d: call 0x000002e975d100fc
    L0042: int3

Program.<<Main>$>g__Sum_ForEach|0_1(<>c__DisplayClass0_0 ByRef)
    L0000: xor eax, eax
    L0002: mov rdx, [rcx]
    L0005: xor ecx, ecx
    L0007: mov r8d, [rdx+8]
    L000b: test r8d, r8d
    L000e: jle short L001f
    L0010: mov r9d, ecx
    L0013: add eax, [rdx+r9*4+0x10]
    L0018: inc ecx
    L001a: cmp r8d, ecx
    L001d: jg short L0010
    L001f: ret

        This leads to improved performance in benchmarks.

        Note: as you can check in SharpLab, if the array is already a local variable, no copy is made. In this case performance is equivalent.

4. Slicing the array
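
        The same idea can be applied by hand to a for loop. The following is a sketch only, reusing the captured-array setup from the SharpLab example; the name Sum_ForLocalCopy is made up for illustration. Copying the captured array reference into a local should give the JIT the same opportunity that foreach gets, although whether the bounds checks are actually removed depends on the runtime version and is worth confirming with a benchmark:

```csharp
using System;

var array = new[] { 0, 1, 2, 3, 4, 5 };

Console.WriteLine(Sum_ForLocalCopy()); // prints 15

int Sum_ForLocalCopy()
{
    // Copy the captured array reference into a local once, as foreach does.
    var local = array;
    var sum = 0;
    for (var index = 0; index < local.Length; index++)
        sum += local[index];
    return sum;
}
```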

        Sometimes we may only want to iterate over a portion of an array. Again, I think most developers will implement the following:

static int Sum(int[] source, int start, int length)
{
    var sum = 0;
    for (var index = start; index < start + length; index++)
        sum += source[index];
    return sum;        
}

        This can be easily converted to foreach by using the Span.Slice() method:

static int Sum(int[] source, int start, int length)  
    => Sum(source.AsSpan().Slice(start, length));

static int Sum(ReadOnlySpan<int> source)
{
    var sum = 0;
    foreach (var item in source)
        sum += item;
    return sum;
}

        So, how do these perform?

        Using foreach over a portion of the array is also about 20% faster than using a for loop.
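
        As a small usage sketch (the name SumSlice and the sample values are illustrative), the slice can also be created in one call with the AsSpan(start, length) overload, which is equivalent to AsSpan().Slice(start, length) and does not copy the items:

```csharp
using System;

var numbers = new[] { 0, 1, 2, 3, 4, 5 };

// Sum the three items starting at index 2: 2 + 3 + 4
Console.WriteLine(SumSlice(numbers, 2, 3)); // prints 9

static int SumSlice(int[] source, int start, int length)
{
    var sum = 0;
    // The span is a view over the original array; no items are copied.
    foreach (var item in source.AsSpan(start, length))
        sum += item;
    return sum;
}
```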

5. LINQ

        If you check the source code for Sum() in System.Linq for versions of .NET prior to .NET 8, you'll find that it uses a foreach loop. So, if using foreach is faster than for, why is it so slow in this case?

        This implementation is an extension method for the type IEnumerable<int>. Unlike the Count() and Where() operations, Sum() has no special case for when the source is an array. The compiler translates this implementation of Sum() into something like this:

static int Sum(this IEnumerable<int> source)
{
    var sum = 0;
    IEnumerator<int> enumerator = source.GetEnumerator();
    try
    {
        while(enumerator.MoveNext())
            sum += enumerator.Current;
    }
    finally
    {
        enumerator?.Dispose();
    }
    return sum;
}

        There are several performance issues with this code:

  • GetEnumerator() returns IEnumerator<T>. This means that the enumerator is a reference type, which means it must be allocated on the heap, increasing the pressure on the garbage collector.
  • IEnumerator<T> derives from IDisposable. The enumerator therefore has to be disposed, which requires a try/finally, so this method cannot be inlined.
  • Iteration over an IEnumerable<T> requires calling the MoveNext() method and the Current property. Since enumerators are reference types, these calls are virtual.

        All of this makes enumeration of arrays much slower.
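
        One way to sidestep these costs in your own code is to special-case arrays before falling back to the general enumerator path. The following is a hedged sketch, not LINQ's actual implementation; the name SumSpecialCased is hypothetical:

```csharp
using System;
using System.Collections.Generic;

IEnumerable<int> numbers = new[] { 0, 1, 2, 3, 4, 5 };
Console.WriteLine(SumSpecialCased(numbers)); // prints 15

static int SumSpecialCased(IEnumerable<int> source)
{
    var sum = 0;
    if (source is int[] array)
    {
        // Fast path: iterate the array directly, without going through IEnumerator<int>.
        foreach (var item in array)
            sum += item;
        return sum;
    }

    // General path: uses the IEnumerator<int> pattern shown above.
    foreach (var item in source)
        sum += item;
    return sum;
}
```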

Note: Check out my other post "Performance of Value Type Enumerators vs. Reference Type Enumerators" to learn about the performance differences between these two types of enumerators.

        .NET 8's performance is much better because Sum() special-cases sources that are an array or a List<T>. If the source is an array or list of int or long, it optimizes further by using SIMD, allowing multiple items to be summed at the same time.

Note: Check out my other article "Single Instruction Multiple Data (SIMD) in .NET" to learn how SIMD works and how to use it in your code.
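
        To give a rough idea of what summing multiple items at the same time means, here is a minimal sketch using System.Numerics.Vector<T>. It is not the code used by LINQ: the name SumSimd is hypothetical, the sketch assumes .NET 6 or later (for Vector.Sum), and it ignores arithmetic overflow.

```csharp
using System;
using System.Numerics;

var array = new int[1000];
for (var i = 0; i < array.Length; i++)
    array[i] = i;

Console.WriteLine(SumSimd(array)); // prints 499500

static int SumSimd(ReadOnlySpan<int> source)
{
    var sum = 0;
    var index = 0;

    if (Vector.IsHardwareAccelerated && source.Length >= Vector<int>.Count)
    {
        // Each lane of 'sums' accumulates a partial sum of the items.
        var sums = Vector<int>.Zero;
        var lastBlock = source.Length - Vector<int>.Count;
        for (; index <= lastBlock; index += Vector<int>.Count)
            sums += new Vector<int>(source.Slice(index, Vector<int>.Count));

        // Add the lanes together to get the sum of all full blocks.
        sum = Vector.Sum(sums);
    }

    // Sum the remaining items that did not fill a whole vector.
    for (; index < source.Length; index++)
        sum += source[index];

    return sum;
}
```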

6. Conclusion

        Iteration of arrays is a special case for which the compiler can perform code optimizations. Using foreach guarantees the best conditions for these optimizations.

        Converting the array to IEnumerable<T> will make it iterate much slower.

        Not all LINQ methods are optimized for the case of arrays. Prior to .NET 8, it was better to use a custom implementation of the Sum() method.

Origin blog.csdn.net/gongdiwudu/article/details/131902025