February 3, 2019

Tasks and Async (C#.NET)

In this article, I'm going to help you get more out of Tasks by outlining some simple patterns that are applicable to many situations.

A brief note on Async

Good asynchronous code enables responsive interfaces and more efficient use of resources. Instead of holding on to a thread while it waits, asynchronous code releases that thread so it can be used elsewhere. In a .NET web server environment, this allows for better scalability when handling sudden traffic spikes, because a thread can be shared by more than one in-progress request.


Did you know...

...the default thread pool in .NET only creates roughly two new threads a second once its minimum thread count is in use, even if traffic suddenly increases?

Async code allows threads to be shared between simultaneous requests. This means two new threads a second can translate to MORE than two additional simultaneous requests.
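
If you want to see the limits the thread pool is working with on your own machine, a minimal sketch like the following prints them. ThreadPool.GetMinThreads and ThreadPool.GetMaxThreads are real .NET APIs; the ThreadPoolInfo class and its method names are made up for this illustration. Threads up to the minimum are created on demand, and it is only beyond the minimum that the pool throttles how quickly it adds more.

```csharp
using System;
using System.Threading;

public static class ThreadPoolInfo
{
    // Returns the minimum number of worker threads; the pool creates
    // threads up to this count on demand before it starts throttling.
    public static int MinWorkerThreads()
    {
        ThreadPool.GetMinThreads(out int workerThreads, out int ioThreads);
        return workerThreads;
    }

    // Returns the upper limit on worker threads in the pool.
    public static int MaxWorkerThreads()
    {
        ThreadPool.GetMaxThreads(out int workerThreads, out int ioThreads);
        return workerThreads;
    }

    public static void Main()
    {
        Console.WriteLine($"Min worker threads: {MinWorkerThreads()}");
        Console.WriteLine($"Max worker threads: {MaxWorkerThreads()}");
    }
}
```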


What is a Task?

Here are some ways to describe a Task.

  1. Tasks are an abstraction of a thread - by awaiting and creating Tasks we are indicating where the program could possibly suspend a thread or introduce a new one for better efficiency.
  2. Tasks represent work that may or may not be done yet.
  3. Similarly to how functions represent instructions to do work, a Task represents a worker that executes those instructions.
  4. It's Mickey's magic broom created to independently carry out a job.

Unlike actual Threads where we create, run and suspend them manually, .NET will manage Tasks for us in a way that attempts to be fair and avoid starvation. This removes much of the complexity of async programming.

A Simple Await

A Task represents work that may or may not be done yet. A Task can have "await" called on it, causing the calling code to pause, wait for the child Task to complete (and its result to be returned), and then resume. This is conceptually similar to how a normal function call would be handled, except that the function call might be run on a separate thread.

    public async Task<string> processProject(int id){
       //fetch project from DB asynchronously
       Project project = await ProjectDAL.GetProjectAsync(id);
       return project.title;
    }
A possible timeline diagram for a simple await. This describes the optimum scenario. In reality, there might be a delay starting GetProjectAsync, or a delay resuming the caller task if many Tasks are competing for CPU time.

Awaiting a Task that is already complete will return the result instantly without pausing the calling code. This applies even if the Task has already been awaited.

Awaiting a Task that has not completed will cause the calling/parent function to be suspended so that any other ready-to-run Tasks in the system can take over its unused resources.
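
To make the "already complete" case concrete, here is a small sketch. It uses Task.FromResult, which returns a Task that is complete from the moment it is created. The class name and the "Apollo" value are invented for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class CompletedAwaitDemo
{
    public static async Task<int> AwaitTwiceAsync()
    {
        // Task.FromResult gives us a Task that is already complete.
        Task<string> titleTask = Task.FromResult("Apollo");
        Console.WriteLine(titleTask.IsCompleted);   // True

        // Neither await pauses, because the result is already available.
        string first = await titleTask;
        string second = await titleTask;            // awaiting again is fine

        return first.Length + second.Length;
    }

    public static async Task Main()
    {
        Console.WriteLine(await AwaitTwiceAsync()); // 12
    }
}
```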

Running a Task in the Background

The first pattern we looked at is great for enabling a web server to run more unrelated requests at once, increasing overall throughput. The following patterns describe how a single request can be executed faster by providing opportunities for parallelism, so they are also relevant to stand-alone applications.

Whether or not a Task is actually run in parallel with other processing is up to the Task Scheduler. However, Task.Factory.StartNew does allow us to supply hints to the scheduler (via TaskCreationOptions) to encourage this. This is not discussed here.

Not immediately calling await on a task allows the calling code/Task to continue execution without delay, and the child Task to be executed as the Task Scheduler sees fit (usually pretty quickly).

    Task<Project> projectAsync = ProjectDAL.GetProjectAsync(pid);    //no await here
    //below code continues at same time as above is running
    Bitmap rendering = Renderer.GaussianMap(newdata);

    //wait for project task to complete if it hasn't already,
    //then store result to new variable
    var project = await projectAsync;

    project.rendering = rendering;

Diagram showing a potential callback pattern. In this situation, because GaussianMap happened to complete after GetProjectAsync, the await resolved immediately.
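
The same pattern as a self-contained sketch you can run. FetchTitleAsync and RenderChecksum are hypothetical stand-ins for the database call and the rendering work (Task.Delay simulates the slow I/O). Note that an async method runs synchronously until it reaches its first await on something incomplete, so the overlap starts from that point.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundDemo
{
    // Hypothetical stand-in for a slow database call.
    static async Task<string> FetchTitleAsync(int id)
    {
        await Task.Delay(200);              // simulated I/O wait
        return $"Project {id}";
    }

    // Hypothetical stand-in for synchronous CPU-bound rendering work.
    static int RenderChecksum(int size)
    {
        int sum = 0;
        for (int i = 0; i < size; i++) sum += i % 7;
        return sum;
    }

    public static async Task<string> RunAsync()
    {
        Task<string> titleTask = FetchTitleAsync(42);   // started, no await yet
        int checksum = RenderChecksum(1000);            // runs while the fetch is waiting
        string title = await titleTask;                 // collect the result when we need it
        return $"{title}: {checksum}";
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync());
    }
}
```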

Fire and Forget

If a very slow background Task's result is unimportant, and it is acceptable for it to fail silently, we can use a "Fire and Forget" pattern by omitting the await completely.

In the below example the sendEmailAsync Task is started and never has "await" called on it. Assuming no errors, the email will get sent. It may be sent well after we have exited the UpdateOld function. If there are errors, they are ignored.

    public static async Task UpdateOld(Project p){
        _ = sendEmailAsync(p.ManagerEmail, subject: "Long time no see!");
        await ProjectDAL.SaveProjectAsync(p);
    }

NOTE: The underscore character is a discard; it will chuck out the returned task. This will stop warnings coming up about whether you intended to not wait on a task, which will appear in Visual Studio if you do not assign the task anywhere.
FURTHER NOTE: In a .NET MVC context, if you're doing authorisation checks in your fire-and-forget Task, you need to ensure that the HttpContext is simulated inside the Task as well. This is because the real context will be destroyed once the web request returns, and the fire-and-forget task might still try to use it. This mostly applies if you're using "Claims".

FURTHER GRIPEY NOTE: Some may claim this fire-and-forget approach is an anti-pattern because any Exceptions that happen during the task that is discarded are lost. In my experience, the need for the fire-and-forget pattern often indicates a non-critical task anyway. In situations where the task is critical but slow-running, then the task should assume responsibility for handling errors. I would argue not handling errors in a complex function (the task) is the anti-pattern, not the usage of fire-and-forget itself, which is a perfectly useful pattern.
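
One way a task could "assume responsibility for handling errors" is sketched below. This is only an illustration under assumptions: SendEmailAsync is a hypothetical email call that always fails here, and the log callback stands in for whatever logging you actually use.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForgetDemo
{
    // Hypothetical email call that fails, to show what happens to the error.
    static async Task SendEmailAsync(string to)
    {
        await Task.Delay(50);
        throw new InvalidOperationException("SMTP server unavailable");
    }

    // The task handles its own errors, so discarding it loses nothing.
    public static async Task SendEmailSafelyAsync(string to, Action<string> log)
    {
        try
        {
            await SendEmailAsync(to);
        }
        catch (Exception ex)
        {
            log($"Email to {to} failed: {ex.Message}");
        }
    }

    public static async Task Main()
    {
        _ = SendEmailSafelyAsync("manager@example.com", Console.WriteLine);  // fire and forget
        await Task.Delay(200);  // demo only: keep the process alive long enough to see the log
    }
}
```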

Multiple Tasks

In the previous example, the SaveProjectAsync and sendEmailAsync Tasks were created at roughly the same time, and we didn't need to know if the email succeeded or not.

If we need to make sure both our tasks complete before we can continue, we simply await both, one after the other.

    Task<Project> projectAsync = ProjectDAL.GetProjectAsync(pid);   //no await here
    //below code continues at same time as above is running
    Task<Bitmap> renderingAsync = Renderer.GaussianMapAsync(newdata);

    var project = await projectAsync;
    project.rendering = await renderingAsync;

Diagram showing a possible timeline of events. Please note, the two Tasks will not necessarily be simultaneous, but they certainly could be.

In the above example, the order in which we await the tasks is unimportant to execution time because neither of the Tasks depends on the other, even though the result of one will use the result of the other.
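
A rough way to check this for yourself is the sketch below, where SlowAsync is a made-up helper that uses Task.Delay to simulate work. Both Tasks are started before either is awaited, so the total time should be close to the longer of the two delays rather than their sum, whichever one is awaited first.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class MultipleTasksDemo
{
    // Hypothetical helper: pretends to work for the given time, then returns its name.
    static async Task<string> SlowAsync(string name, int milliseconds)
    {
        await Task.Delay(milliseconds);
        return name;
    }

    public static async Task<string> RunAsync()
    {
        Task<string> projectTask = SlowAsync("project", 300);      // both tasks are started...
        Task<string> renderingTask = SlowAsync("rendering", 100);  // ...before any await

        string rendering = await renderingTask;  // awaited in the "wrong" order on purpose
        string project = await projectTask;
        return project + "+" + rendering;
    }

    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        string result = await RunAsync();
        Console.WriteLine($"{result} after {stopwatch.ElapsedMilliseconds} ms");
    }
}
```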

A side note: Lambdas

In the next section, I'm going to start using lambdas - if you're not familiar with them, they're pretty straightforward, and super cool.
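
As a quick taste, a lambda is just an inline function that can be stored in a variable or passed as an argument. The names below are invented for the example.

```csharp
using System;

public static class LambdaDemo
{
    // Expression lambda: takes x, returns x * x.
    public static readonly Func<int, int> Square = x => x * x;

    // Statement lambda: a body in braces with an explicit return.
    public static readonly Func<int, int, int> Add = (a, b) => { return a + b; };

    public static void Main()
    {
        // A lambda that returns nothing is stored as an Action.
        Action<string> greet = name => Console.WriteLine($"Hello, {name}");

        Console.WriteLine(Square(4));   // 16
        Console.WriteLine(Add(2, 3));   // 5
        greet("Tasks");                 // Hello, Tasks
    }
}
```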

Chaining Tasks

In the next example, the rendering task requires the project before it can start, so we can't start both at once like before. We also want to calculate pi for some reason.

    Task<Project> projectAsync = ProjectDAL.GetProjectAsync(pid);
    Task<Bitmap> renderingAsync = Renderer.GaussianMapAsync((await projectAsync).newdata);

    //below code will run simultaneously to rendering
    //once "await projectAsync" is done.
    var pi = slowPiCalculator(40000);
    Bitmap rendering = await renderingAsync;    //wait for result of renderingAsync

The pi calculation will start at the same time as the rendering. It will not start with projectAsync because we have an await on projectAsync before the pi calculation.

Diagram showing timeline events for using a combination of simple await and background processing methods. In this scenario, slowPiCalculator is started later than it needs to be because of the await on GetProjectAsync.

If we move the pi calculation to begin when ProjectDAL does, the pi calculation will return earlier, but now the rendering will not start until pi is calculated:

    Task<Project> projectAsync = ProjectDAL.GetProjectAsync(pid);   //no await here
    //below code will run simultaneously to GetProjectAsync
    var pi = slowPiCalculator(40000);
    //Gaussian map will not start until the pi calculation is done
    Task<Bitmap> renderingAsync = Renderer.GaussianMapAsync((await projectAsync).newdata);
    Bitmap rendering = await renderingAsync;    //wait for result of renderingAsync

Diagram of a possible callback timeline for the provided example shows there is a significant gap between when GetProjectAsync completes and when GaussianMapAsync begins, because GaussianMapAsync is not started until slowPiCalculator finishes.

Ideally, we want the pi calculation to be independent of the project and bitmap so that the rendering is not held up by the pi calculation, while still preserving the dependency between project and bitmap. This can be done with ContinueWith:

    Task<Project> projectAsync = ProjectDAL.GetProjectAsync(pid);    //no await

    //the inner await makes the lambda return Task<Bitmap>,
    //so ContinueWith returns Task<Task<Bitmap>>
    Task<Task<Bitmap>> bitmapAsyncAsync = projectAsync.ContinueWith(
        async (Task<Project> p)=>
            await Renderer.GaussianMapAsync((await p).newdata));

    //below code will run simultaneously to rendering
    var pi = slowPiCalculator(40000);
    Bitmap rendering = await await bitmapAsyncAsync;

NOTE: Because we are guaranteed the task is completed when the ContinueWith callback lambda runs, and that the previous task/thread has already surrendered the context, it is safe to use "p.Result" instead of "await p" in the above example. In many other situations, .Result can cause deadlock.

Diagram of a possible callback timeline for provided example shows how ContinueWith removes the gap between GetProjectAsync and GaussianMapAsync, and allows slowPiCalculator to start immediately on the calling Task.
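
If the nested Task<Task<...>> and the double await feel awkward, the built-in Unwrap() extension method flattens the nested Task into a single one. The sketch below uses p.Result inside the continuation, as discussed in the note above. GetProjectIdAsync and RenderAsync are hypothetical stand-ins for the project fetch and the rendering.

```csharp
using System;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    // Hypothetical stand-ins for the project fetch and the rendering.
    static async Task<int> GetProjectIdAsync() { await Task.Delay(50); return 7; }
    static async Task<string> RenderAsync(int id) { await Task.Delay(50); return $"bitmap-{id}"; }

    public static Task<string> StartPipeline()
    {
        Task<int> projectTask = GetProjectIdAsync();   // no await

        // The continuation returns a Task, so ContinueWith gives us a Task of a Task.
        Task<Task<string>> nested = projectTask.ContinueWith(p => RenderAsync(p.Result));

        // Unwrap() turns Task<Task<string>> into a plain Task<string>.
        return nested.Unwrap();
    }

    public static async Task Main()
    {
        Task<string> renderingTask = StartPipeline();
        // Other work could run here while the pipeline progresses.
        Console.WriteLine(await renderingTask);        // bitmap-7
    }
}
```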

Task.Run

A better option than using ContinueWith might be to move the pi calculation to its own Task. This allows us to use a much simpler, easy-to-follow pattern. You can run non-async code inside a Task with the help of Task.Run().

    var piTask = Task.Run(()=>slowPiCalculator(40000));   //pi calculation starts in the background
    Project project = await ProjectDAL.GetProjectAsync(pid);
    Bitmap rendering = await Renderer.GaussianMapAsync(project.newdata);

    //pi has been calculating alongside the fetch and the rendering
    var pi = await piTask;

Diagram of a possible (optimal) callback timeline, showing the three functions/Tasks executing in a similar schedule to the previous example using ContinueWith.
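
Here is the same shape as a runnable sketch. SlowSum is a made-up CPU-bound function standing in for slowPiCalculator, and FetchAsync is a made-up stand-in for the async calls. Task.Run hands the synchronous work to the thread pool and returns a Task<long> that we await once the result is needed.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskRunDemo
{
    // Hypothetical CPU-bound, non-async work.
    static long SlowSum(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    // Hypothetical async I/O call.
    static async Task<string> FetchAsync()
    {
        await Task.Delay(100);
        return "project";
    }

    public static async Task<string> RunAsync()
    {
        Task<long> sumTask = Task.Run(() => SlowSum(1000000)); // runs on a thread pool thread
        string project = await FetchAsync();                    // overlaps with the sum
        long sum = await sumTask;                               // likely finished by now
        return $"{project}:{sum}";
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync());   // project:500000500000
    }
}
```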

There's more to talk about with Tasks. Check out my follow-up article on handling more tasks, handling exceptions and cancelling tasks.