Task and its scheduling mechanism, explained in one article

The importance of Task to .NET is beyond doubt. Yet recent interviews have shown that many developers do not have a clear picture of the relationship between Task, its scheduling mechanism, and threads and the thread pool. This article simulates Task in the simplest possible way, aiming to answer two questions: what is a Task, and how is it scheduled for execution?

1. Task (Job)

Task represents an operation with state. We use the following Job type to simulate Task. The operation encapsulated by a Job is an Action delegate, and its status is represented by the JobStatus enumeration (corresponding to the TaskStatus enumeration). For simplicity, we only define four states (Created, Scheduled, Running and Completed). The Invoke method executes the encapsulated Action delegate and updates the status accordingly.

public class Job
{
    private readonly Action _work;
    public Job(Action work) => _work = work;
    public JobStatus Status { get; internal set; }

    // Executes the encapsulated delegate and updates the status accordingly.
    protected internal virtual void Invoke()
    {
        Status = JobStatus.Running;
        _work();
        Status = JobStatus.Completed;
    }
}

public enum JobStatus
{
    Created,
    Scheduled,
    Running,
    Completed
}
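Although the article does not demonstrate it, a quick way to see these status transitions is to invoke a Job directly. This snippet is only an illustration and, like the schedulers shown later, assumes it lives in the same assembly as Job (Invoke and the Status setter are internal):

var job = new Job(() => Console.WriteLine("Hello"));
Console.WriteLine(job.Status);   // Created
job.Invoke();                    // runs the delegate synchronously
Console.WriteLine(job.Status);   // Completed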

2. TaskScheduler (JobScheduler)

The operation carried by a Task is executed through scheduling, and the concrete scheduling strategy depends on the chosen scheduler. The Task scheduler is represented by TaskScheduler, which we simulate with the following JobScheduler type. As shown in the code snippet below, the abstract JobScheduler class defines a single QueueJob method that accepts the Job to schedule as a parameter. The static Current property represents the current default scheduler implementation.

public abstract class JobScheduler
{
    public abstract void QueueJob(Job job);
    public static JobScheduler Current { get; set; } = new ThreadPoolJobScheduler();
}
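To see how a scheduling strategy plugs into this abstraction, here is a trivial scheduler that simply runs every queued job on the calling thread. It is not part of the original sample (the name SynchronousJobScheduler is made up) and, like the article's own schedulers, it assumes it lives in the same assembly as Job:

public class SynchronousJobScheduler : JobScheduler
{
    public override void QueueJob(Job job)
    {
        job.Status = JobStatus.Scheduled;
        // Run the job immediately on the calling thread instead of handing it off.
        job.Invoke();
    }
}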

For developers, executing a Task means submitting it to a scheduler, which is reflected in the Start method we define on the Job type. This method accepts a specific scheduler as a parameter; if none is specified, the default scheduler held by JobScheduler's static Current property is used. For convenience in later demonstrations, we also define a static Run method, which wraps the specified Action in a Job and calls Start to schedule it with the default scheduler.

public class Job
{
    private readonly Action _work;
    public Job(Action work) => _work = work;
    public JobStatus Status { get; internal set; }

    protected internal virtual void Invoke()
    {
        Status = JobStatus.Running;
        _work();
        Status = JobStatus.Completed;
    }

    // Submits this job to the specified scheduler, or to the default one if none is given.
    public void Start(JobScheduler? scheduler = null) => (scheduler ?? JobScheduler.Current).QueueJob(this);

    // Wraps the action in a Job and schedules it with the default scheduler.
    public static Job Run(Action work)
    {
        var job = new Job(work);
        job.Start();
        return job;
    }
}

3. Scheduling based on the thread pool

How a Task executes depends on the selected scheduler. By default, .NET adopts a thread pool-based scheduling strategy, implemented by the ThreadPoolTaskScheduler type, which we simulate with the following ThreadPoolJobScheduler. As shown in the code snippet below, the overridden QueueJob method executes the Action delegate encapsulated by the specified Job by calling ThreadPool.QueueUserWorkItem (capturing the current ExecutionContext and restoring it on the worker thread). The default scheduler assigned to JobScheduler's Current property is exactly such a ThreadPoolJobScheduler object.

public class ThreadPoolJobScheduler : JobScheduler
{
    public override void QueueJob(Job job)
    {
        job.Status = JobStatus.Scheduled;
        var executionContext = ExecutionContext.Capture();
        ThreadPool.QueueUserWorkItem(_ => ExecutionContext.Run(executionContext!, _ => job.Invoke(), null));
    }
}

We call Job's static Run method to create and execute three Jobs as follows; the Action delegate encapsulated by each Job prints the ID of the thread it runs on.

_ = Job.Run(() => Console.WriteLine($"Job1 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));
_ = Job.Run(() => Console.WriteLine($"Job2 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));
_ = Job.Run(() => Console.WriteLine($"Job3 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));

Console.ReadLine();

Because the default scheduling strategy is based on the thread pool, the three jobs are typically executed on three different worker threads.
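A run might print something like the following (the thread IDs are only illustrative and vary from run to run, and the pool may occasionally reuse a thread):

Job1 is executed in thread 4
Job2 is executed in thread 5
Job3 is executed in thread 6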

4. Scheduling with dedicated threads

We know that a .NET process has only one global thread pool. For operations that need to run for a long time or have higher priority, thread pool-based scheduling may not be a good choice. For example, in a web application the pool's worker threads are used to process requests, so a job that needs to run continuously may be starved when available worker threads run short. .NET handles this situation differently (by passing the TaskCreationOptions.LongRunning option when starting the Task); here we solve it with a custom scheduler instead. The following DedicatedThreadJobScheduler uses dedicated threads created up front so that queued Jobs are picked up promptly.

internal class DedicatedThreadJobScheduler : JobScheduler
{
    private readonly BlockingCollection<Job> _queues = new();
    private readonly Thread[] _threads;

    public DedicatedThreadJobScheduler(int threadCount)
    {
        // Create the dedicated threads up front and start them immediately.
        _threads = Enumerable.Range(1, threadCount).Select(_ => new Thread(Invoke)).ToArray();
        Array.ForEach(_threads, it => it.Start());

        // Each dedicated thread blocks on the queue and executes jobs as they arrive.
        void Invoke(object? state)
        {
            while (true)
            {
                _queues.Take().Invoke();
            }
        }
    }

    public override void QueueJob(Job job)
    {
        job.Status = JobStatus.Scheduled;
        _queues.Add(job);
    }
}

Using the same program demonstrated above, this time we set the current scheduler to the DedicatedThreadJobScheduler and give it two threads.

JobScheduler.Current = new DedicatedThreadJobScheduler(2);
_ = Job.Run(() => Console.WriteLine($"Job1 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));
_ = Job.Run(() => Console.WriteLine($"Job2 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));
_ = Job.Run(() => Console.WriteLine($"Job3 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));
_ = Job.Run(() => Console.WriteLine($"Job4 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));
_ = Job.Run(() => Console.WriteLine($"Job5 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));
_ = Job.Run(() => Console.WriteLine($"Job6 is executed in thread {Thread.CurrentThread.ManagedThreadId}"));

Console.ReadLine();

We will find that all six jobs are executed on the same two fixed threads.
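For comparison, the built-in way to keep a long-running operation off the pool's worker threads, mentioned above, is to pass TaskCreationOptions.LongRunning, which typically causes the runtime to run the task on a dedicated (non-pool) thread. A minimal illustration, not part of the original sample:

var longRunning = Task.Factory.StartNew(() =>
{
    Console.WriteLine($"Long-running work on thread {Thread.CurrentThread.ManagedThreadId}, " +
                      $"IsThreadPoolThread = {Thread.CurrentThread.IsThreadPoolThread}");
}, TaskCreationOptions.LongRunning);

longRunning.Wait();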

5. Asynchronous waiting

If we need to perform follow-up operations after a Task finishes, we can call its ContinueWith method to specify them. We now define such a method on the Job type. It differs somewhat from Task's ContinueWith: here a continuation is itself a Job, so multiple Jobs can form a linked list in the order they should run, and after the current Job finishes it simply hands the next Job in the list to the scheduler. As shown in the following code snippet, the _continue field holds the Job waiting to run next and is used to maintain the linked list; the ContinueWith method wraps the specified delegate in a Job and appends it to the end of the list, and Invoke starts the next Job once the current one completes.

public class Job
{
    private readonly Action _work;
    private Job? _continue;
    public Job(Action work) => _work = work;
    public JobStatus Status { get; internal set; }
    public void Start(JobScheduler? scheduler = null) => (scheduler ?? JobScheduler.Current).QueueJob(this);

    protected internal virtual void Invoke()
    {
        Status = JobStatus.Running;
        _work();
        Status = JobStatus.Completed;
        // Hand the next job in the chain (if any) to the scheduler.
        _continue?.Start();
    }

    public static Job Run(Action work)
    {
        var job = new Job(work);
        job.Start();
        return job;
    }

    // Appends the continuation to the end of the linked list of jobs.
    public Job ContinueWith(Action<Job> continuation)
    {
        if (_continue == null)
        {
            _continue = new Job(() => continuation(this));
        }
        else
        {
            _continue.ContinueWith(continuation);
        }
        return this;
    }
}

The following program uses the ContinueWith method to execute two chains of asynchronous operations, each in its prescribed order.

Job.Run(() =>
{
    Thread.Sleep(1000);
    Console.WriteLine("Foo1");
}).ContinueWith(_ =>
{
    Thread.Sleep(100);
    Console.WriteLine("Bar1");
}).ContinueWith(_ =>
{
    Thread.Sleep(100);
    Console.WriteLine("Baz1");
});

Job.Run(() =>
{
    Thread.Sleep(100);
    Console.WriteLine("Foo2");
}).ContinueWith(_ =>
{
    Thread.Sleep(10);
    Console.WriteLine("Bar2");
}).ContinueWith(_ =>
{
    Thread.Sleep(10);
    Console.WriteLine("Baz2");
});

Console.ReadLine();

The output shows that each chain prints in its prescribed order; the second chain finishes first because its delays are shorter:

Foo2
Bar2
Baz2
Foo1
Bar1
Baz1
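For reference, the same kind of chain written against the real Task API looks like the sketch below. Note one difference from our Job: Task.ContinueWith returns the newly created continuation Task rather than the original one, while our Job.ContinueWith returns the original Job and appends to the end of its list; either way the operations run sequentially.

Task.Run(() =>
{
    Thread.Sleep(100);
    Console.WriteLine("Foo");
}).ContinueWith(_ =>
{
    Thread.Sleep(10);
    Console.WriteLine("Bar");
}).ContinueWith(_ =>
{
    Thread.Sleep(10);
    Console.WriteLine("Baz");
});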

6. Using the await keyword

Although the ContinueWith method solves the "asynchronous wait" problem, we would rather use the await keyword, so next we give Job that ability. To this end, we define the following JobAwaiter structure, which implements the ICriticalNotifyCompletion interface. As the name suggests, this interface is used to deliver a notification when the operation completes. A JobAwaiter is constructed from a Job object, starting it if it has not been started yet. When the awaited Job has not finished, the compiler-generated code calls OnCompleted to register the continuation; we implement it with ContinueWith so that the continuation runs once the Job completes.

public struct JobAwaiter : ICriticalNotifyCompletion
{
    private readonly Job _job;
    public bool IsCompleted => _job.Status == JobStatus.Completed;

    public JobAwaiter(Job job)
    {
        _job = job;
        // Start the job if it has not been scheduled yet.
        if (job.Status == JobStatus.Created)
        {
            job.Start();
        }
    }

    // Registers the continuation to run after the awaited job completes.
    public void OnCompleted(Action continuation) => _job.ContinueWith(_ => continuation());
    public void UnsafeOnCompleted(Action continuation) => OnCompleted(continuation);
    public void GetResult() { }
}

We then add a GetAwaiter method to the Job type that returns a JobAwaiter wrapping the Job itself.

public class Job
{
    private readonly Action _work;
    private Job? _continue;
    public Job(Action work) => _work = work;
    public JobStatus Status { get; internal set; }
    public void Start(JobScheduler? scheduler = null) => (scheduler ?? JobScheduler.Current).QueueJob(this);

    protected internal virtual void Invoke()
    {
        Status = JobStatus.Running;
        _work();
        Status = JobStatus.Completed;
        _continue?.Start();
    }

    public static Job Run(Action work)
    {
        var job = new Job(work);
        job.Start();
        return job;
    }

    public Job ContinueWith(Action<Job> continuation)
    {
        if (_continue == null)
        {
            _continue = new Job(() => continuation(this));
        }
        else
        {
            _continue.ContinueWith(continuation);
        }
        return this;
    }

    // Returns the awaiter that makes Job directly awaitable.
    public JobAwaiter GetAwaiter() => new(this);
}

Once a type exposes such a GetAwaiter method (returning an awaiter that provides IsCompleted and GetResult and implements INotifyCompletion), the await keyword can be applied to objects of that type.

await Foo();
await Bar();
await Baz();


Console.ReadLine();

static Job Foo() =>  new Job(() =>
{
    Thread.Sleep(1000);
    Console.WriteLine("Foo");
});

static Job Bar() => new Job(() =>
{
    Thread.Sleep(100);
    Console.WriteLine("Bar");
});

static Job Baz() => new Job(() =>
{
    Thread.Sleep(10);
    Console.WriteLine("Baz");
});

The output shows Foo, Bar and Baz printed in that order, confirming that the three operations are awaited one after another:

Foo
Bar
Baz

7. The state machine

As you probably know, the await keyword is just syntactic sugar provided by the compiler; the compiled code implements the "asynchronous wait" with a state machine. The code above is ultimately compiled into the following form. It is worth mentioning that the code generated in Debug and Release mode differs; what follows is the Release-mode result. The state machine is embodied by the generated <<Main>$>d__0 structure, and its implementation is actually very simple: if a method contains N await keywords, its execution is effectively cut into N+1 segments. The state of the state machine records which segment should run next, and the execution itself happens in the MoveNext method. The awaiter returned by GetAwaiter is used to determine whether the awaited operation has already finished: if it has, execution falls through to the next segment directly; otherwise AwaitUnsafeOnCompleted is called to register the rest of the method as a continuation.

// Program
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading.Tasks;
using Jobs;

[CompilerGenerated]
internal class Program
{
	[StructLayout(LayoutKind.Auto)]
	[CompilerGenerated]
	private struct <<Main>$>d__0 : IAsyncStateMachine
	{
		public int <>1__state;

		public AsyncTaskMethodBuilder <>t__builder;

		private JobAwaiter <>u__1;

		private void MoveNext()
		{
			int num = <>1__state;
			try
			{
				JobAwaiter awaiter;
				switch (num)
				{
				default:
					awaiter = <<Main>$>g__Foo|0_0().GetAwaiter();
					if (!awaiter.IsCompleted)
					{
						num = (<>1__state = 0);
						<>u__1 = awaiter;
						<>t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
						return;
					}
					goto IL_006c;
				case 0:
					awaiter = <>u__1;
					<>u__1 = default(JobAwaiter);
					num = (<>1__state = -1);
					goto IL_006c;
				case 1:
					awaiter = <>u__1;
					<>u__1 = default(JobAwaiter);
					num = (<>1__state = -1);
					goto IL_00c6;
				case 2:
					{
						awaiter = <>u__1;
						<>u__1 = default(JobAwaiter);
						num = (<>1__state = -1);
						break;
					}
					IL_00c6:
					awaiter.GetResult();
					awaiter = <<Main>$>g__Baz|0_2().GetAwaiter();
					if (!awaiter.IsCompleted)
					{
						num = (<>1__state = 2);
						<>u__1 = awaiter;
						<>t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
						return;
					}
					break;
					IL_006c:
					awaiter.GetResult();
					awaiter = <<Main>$>g__Bar|0_1().GetAwaiter();
					if (!awaiter.IsCompleted)
					{
						num = (<>1__state = 1);
						<>u__1 = awaiter;
						<>t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
						return;
					}
					goto IL_00c6;
				}
				awaiter.GetResult();
				Console.ReadLine();
			}
			catch (Exception exception)
			{
				<>1__state = -2;
				<>t__builder.SetException(exception);
				return;
			}
			<>1__state = -2;
			<>t__builder.SetResult();
		}

		void IAsyncStateMachine.MoveNext()
		{
			//ILSpy generated this explicit interface implementation from .override directive in MoveNext
			this.MoveNext();
		}

		[DebuggerHidden]
		private void SetStateMachine([System.Runtime.CompilerServices.Nullable(1)] IAsyncStateMachine stateMachine)
		{
			<>t__builder.SetStateMachine(stateMachine);
		}

		void IAsyncStateMachine.SetStateMachine([System.Runtime.CompilerServices.Nullable(1)] IAsyncStateMachine stateMachine)
		{
			//ILSpy generated this explicit interface implementation from .override directive in SetStateMachine
			this.SetStateMachine(stateMachine);
		}
	}

	[AsyncStateMachine(typeof(<<Main>$>d__0))]
	private static Task <Main>$(string[] args)
	{
		<<Main>$>d__0 stateMachine = default(<<Main>$>d__0);
		stateMachine.<>t__builder = AsyncTaskMethodBuilder.Create();
		stateMachine.<>1__state = -1;
		stateMachine.<>t__builder.Start(ref stateMachine);
		return stateMachine.<>t__builder.Task;
	}

	[SpecialName]
	private static void <Main>(string[] args)
	{
		<Main>$(args).GetAwaiter().GetResult();
	}
}

As mentioned above, the state machine code generated by the compiler differs between Debug and Release mode. In Release mode the state machine is a structure; although it is consumed through the IAsyncStateMachine interface, the ref keyword avoids boxing on the synchronous path, so that path adds no extra pressure on the GC. The state machine generated in Debug mode, however, is a class (shown below), which involves heap allocation and garbage collection. For an application that uses the await keyword heavily, the performance difference between the two is significant. In fact, many Task optimizations, such as ValueTask, the reuse of certain Task<T> objects (for example, cached completed Task<bool> instances), and IValueTaskSource, exist precisely to reduce this kind of memory allocation (a small illustration of the idea follows the Debug-mode listing below).

// Program
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
using Jobs;

[CompilerGenerated]
internal class Program
{
	[CompilerGenerated]
	private sealed class <<Main>$>d__0 : IAsyncStateMachine
	{
		public int <>1__state;

		public AsyncTaskMethodBuilder <>t__builder;

		public string[] args;

		private JobAwaiter <>u__1;

		private void MoveNext()
		{
			int num = <>1__state;
			try
			{
				JobAwaiter awaiter3;
				JobAwaiter awaiter2;
				JobAwaiter awaiter;
				switch (num)
				{
				default:
					awaiter3 = <<Main>$>g__Foo|0_0().GetAwaiter();
					if (!awaiter3.IsCompleted)
					{
						num = (<>1__state = 0);
						<>u__1 = awaiter3;
						<<Main>$>d__0 stateMachine = this;
						<>t__builder.AwaitUnsafeOnCompleted(ref awaiter3, ref stateMachine);
						return;
					}
					goto IL_007e;
				case 0:
					awaiter3 = <>u__1;
					<>u__1 = default(JobAwaiter);
					num = (<>1__state = -1);
					goto IL_007e;
				case 1:
					awaiter2 = <>u__1;
					<>u__1 = default(JobAwaiter);
					num = (<>1__state = -1);
					goto IL_00dd;
				case 2:
					{
						awaiter = <>u__1;
						<>u__1 = default(JobAwaiter);
						num = (<>1__state = -1);
						break;
					}
					IL_00dd:
					awaiter2.GetResult();
					awaiter = <<Main>$>g__Baz|0_2().GetAwaiter();
					if (!awaiter.IsCompleted)
					{
						num = (<>1__state = 2);
						<>u__1 = awaiter;
						<<Main>$>d__0 stateMachine = this;
						<>t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref stateMachine);
						return;
					}
					break;
					IL_007e:
					awaiter3.GetResult();
					awaiter2 = <<Main>$>g__Bar|0_1().GetAwaiter();
					if (!awaiter2.IsCompleted)
					{
						num = (<>1__state = 1);
						<>u__1 = awaiter2;
						<<Main>$>d__0 stateMachine = this;
						<>t__builder.AwaitUnsafeOnCompleted(ref awaiter2, ref stateMachine);
						return;
					}
					goto IL_00dd;
				}
				awaiter.GetResult();
				Console.ReadLine();
			}
			catch (Exception exception)
			{
				<>1__state = -2;
				<>t__builder.SetException(exception);
				return;
			}
			<>1__state = -2;
			<>t__builder.SetResult();
		}

		void IAsyncStateMachine.MoveNext()
		{
			//ILSpy generated this explicit interface implementation from .override directive in MoveNext
			this.MoveNext();
		}

		[DebuggerHidden]
		private void SetStateMachine([System.Runtime.CompilerServices.Nullable(1)] IAsyncStateMachine stateMachine)
		{
		}

		void IAsyncStateMachine.SetStateMachine([System.Runtime.CompilerServices.Nullable(1)] IAsyncStateMachine stateMachine)
		{
			//ILSpy generated this explicit interface implementation from .override directive in SetStateMachine
			this.SetStateMachine(stateMachine);
		}
	}

	[AsyncStateMachine(typeof(<<Main>$>d__0))]
	[DebuggerStepThrough]
	private static Task <Main>$(string[] args)
	{
		<<Main>$>d__0 stateMachine = new <<Main>$>d__0();
		stateMachine.<>t__builder = AsyncTaskMethodBuilder.Create();
		stateMachine.args = args;
		stateMachine.<>1__state = -1;
		stateMachine.<>t__builder.Start(ref stateMachine);
		return stateMachine.<>t__builder.Task;
	}

	[SpecialName]
	[DebuggerStepThrough]
	private static void <Main>(string[] args)
	{
		<Main>$(args).GetAwaiter().GetResult();
	}
}
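To make the allocation point concrete, here is a small sketch, not from the original article, contrasting reused completed Task<bool> instances with a ValueTask<bool>-returning method that only incurs the async machinery when it actually has to wait (the names CacheExample, IsPositiveAsync and ExistsAsync are made up for illustration):

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class CacheExample
{
    // Create completed Task<bool> instances once and hand out the same objects on every call.
    private static readonly Task<bool> True = Task.FromResult(true);
    private static readonly Task<bool> False = Task.FromResult(false);

    public static Task<bool> IsPositiveAsync(int value) => value > 0 ? True : False;

    // ValueTask<bool> avoids allocating a Task when the result is available synchronously.
    public static async ValueTask<bool> ExistsAsync(string key, IDictionary<string, bool> cache)
    {
        if (cache.TryGetValue(key, out var cached))
        {
            return cached;            // synchronous path: no Task allocated
        }
        await Task.Delay(10);         // simulate an asynchronous lookup
        return false;
    }
}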

Origin: blog.csdn.net/weixin_55305220/article/details/131059574