Asynchronous Synchronization without Locks

Foreword

This post assumes you have a basic overview of Threads and Thread Management in C#, and have at least heard of lock and other Thread Synchronization Primitives. If you haven’t, take a few minutes to skim through the MSDN articles on Managed Threading Basics. This article discusses Synchronizing data for multithreading, specifically in the context of writing older EAP-style synchronized code and then converting it to newer TAP-style synchronized code. The purpose of this exercise is to expose you to the lock statement blocks which are abundant in multithreaded C# applications, and then demonstrate how to adapt that code to perform asynchronously under the new async-await pattern introduced in C# 5 (.NET Framework 4.5). Many developers are still unfamiliar with this pattern, or with how to adapt EAP code to TAP code, because many APIs and engines are limited to older .NET Frameworks; they too will benefit from the information presented in this post, particularly from the midsection onward.

Back in the dawn of the computer age, computers were only able to handle one operation at a time. As time went on, computers became increasingly better at hiding this through clever task scheduling, but at its core (pardon the pun), a processor was still only able to do a single task at a time. Spin a thread up to do some work, pause it, spin another thread up to do some work, pause it and return to Thread A for a bit, pause it and return to Thread B for a bit, rinse and repeat until all of the work is done. As task management and computer programs became more complex, developers increasingly ran into issues: multiple sets of instructions would come through and overwrite changes to the same data, creating dangerous conditions where a program’s state becomes indeterminate, bugs pop up, errors loom, and systems come crashing to a halt.

Consider two threads, Thread A and Thread B. Both perform operations on variables X and Y – Thread A writes to X and then to Y, and Thread B reads X and then Y. This works fine for several loops, but eventually the task scheduler pauses Thread A at just the wrong time, causing it to write to Y a few milliseconds too late. Thread B ends up reading an old value for Y, and the application comes crashing to a halt. Thread synchronization became vital to computing, frameworks were invented to keep data sets in sync, terms like “Race Conditions” were coined, and computers became many magnitudes more powerful.

Note: when Thread A and Thread B lack synchronization, the order in which they access variables X and Y cannot be predicted, and Thread B will sometimes read data that is out of sync.
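
To make that concrete, below is a minimal sketch (not from the original post – the class name and loop counts are made up for illustration) of the scenario above: a writer thread keeps X and Y in matching pairs, while a reader thread checks them with no synchronization at all. Run it a few times and it will likely report mismatched pairs, because the reader occasionally slips in between the two writes.

using System;
using System.Threading;

public static class RaceDemo
{
    private static int _x;
    private static int _y;

    public static void Main()
    {
        // Writer: keeps X and Y equal, but has to update them one at a time
        var writer = new Thread(() =>
        {
            for (var i = 1; i <= 1_000_000; i++)
            {
                _x = i;
                // The scheduler may pause this thread right here,
                // leaving Y one update behind X
                _y = i;
            }
        });

        // Reader: expects to always see a matching pair
        var reader = new Thread(() =>
        {
            for (var i = 1; i <= 1_000_000; i++)
            {
                var x = _x;
                var y = _y;
                if (x != y)
                    Console.WriteLine($"Mismatched pair: X = {x}, Y = {y}");
            }
        });

        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
    }
}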

With a little bit of synchronization, you could simply instruct Thread B to wait until Thread A had finished writing, and then instruct Thread A to wait until Thread B had finished reading. By synchronizing access to variables X and Y, we can ensure both threads have access without stepping on each other’s toes.

If we synchronize access to X and Y, we can ensure Thread B always receives data sets in valid pairs.

As time went on, computer task management and scheduling became more and more robust, but that still couldn’t change the fact that the processor could only perform a single operation at a time. The need for faster and faster computers was ever increasing, and of course, if you can’t make the processor faster, the only logical solution is to simply stick two of them in there – they would need to be specially adapted and very carefully used to function in tandem, but you could, in theory, double the data throughput at relatively little cost. Enter the Dual Core.

With the advent of the multi-core processor, and its adoption by consumer, prosumer, and commercial users alike, computers were finally able to – simultaneously, as opposed to in tandem – process multiple sets of commands. Thread A and Thread B could not only perform operations on two different data sets, they could actually process those data sets at the same time, rather than taking turns. The benefits of this were massive, but they came at a cost: the ever-increasing demand for computers which could process more data simultaneously led to task schedulers becoming more and more complex.

Thankfully, we stand on the shoulders of giants, and languages such as C# typically have very robust and clean task schedulers. In the case of C# specifically, the most common and basic method of thread synchronization is a simple lock. The documentation explains it best:

The lock statement acquires the mutual-exclusion lock for a given object, executes a statement block, and then releases the lock. While a lock is held, the thread that holds the lock can again acquire and release the lock. Any other thread is blocked from acquiring the lock and waits until the lock is released.

That’s exactly what we need! If we have a set of resources that need to be updated all in one go, like updating X and Y at the same time, we can “lock” onto that resource set to ensure nothing else messes with it until we have finished updating the entire set.

Let’s try that! To create a lock statement, we’ll first need an object reference which we intend to gain a lock on. Thankfully, it can be any object. In some cases it may be appropriate to create a separate object for the sole purpose of locking onto its reference. In other cases, it may be appropriate to lock onto another object – for example, you might lock onto a Dictionary if you are only synchronizing access to that Dictionary. In our test case, it will be sufficient to create a new object, so let’s just go ahead and define it:

private readonly object _lockHandle = new();

Obtaining a mutual-exclusion lock is then as simple as:

// "lock" the resource handle so only this thread may access it
lock (_lockHandle)
{    
    // Do sensitive operation
    // Only one thread may run this code at a time
}

And that’s it! With one statement block, we’ve guaranteed only a single thread can access our resource at a time (provided we ensure all access to our sensitive code runs through lock blocks)! Under the hood, the compiler unwraps the lock statement as follows:

// "lock" the resource handle so only this thread may access it
object __lockObj = _lockHandle;
bool __lockWasTaken = false;
try
{
    System.Threading.Monitor.Enter(__lockObj, ref __lockWasTaken);
    // Do sensitive operation
    // Only one thread may run this code at a time
}
finally
{
    if (__lockWasTaken) System.Threading.Monitor.Exit(__lockObj);
}

With that in mind, we now have everything we need to implement our own thread-safe resource handlers:

private int _x;
private int _y;

...

/// <summary> Gets the Resources stored in a single operation. </summary>
public Tuple<int, int> GetResource()
{
    // Wait until the resource handle is available for us to block
    lock (_lockHandle)
    {
        // Do sensitive operation
        return new Tuple<int, int>(_x, _y);
    }
}
/// <summary> Sets the Resources to <paramref name="x"/> and <paramref name="y"/> in a single operation. </summary>
public void SetResource(int x, int y)
{
    // Wait until the resource handle is available for us to block
    lock (_lockHandle)
    {
        // Do sensitive operation
        _x = x;
        _y = y;
    }
}

And then we’re ready to call it from anywhere in our code:

// Store the values X = 10, Y = 20 in a single operation
SetResource(10, 20);

// Retrieve the resource values in a single operation
var resource = GetResource();

Console.WriteLine($"X: {resource.Item1}, Y: {resource.Item2}"); // X: 10, Y: 20

It doesn’t look like much, but there it is! No matter how many threads try to access our variables, only one thread will be able to access them at a time, so they will always remain in sync. That’s not very useful in this particular case, but it can be very useful when you’re trying to synchronize applications which are doing dozens of tasks at the same time. A few simple lock blocks, and we’re well on our way to a respectable multi-threaded application. Everything is good, and everyone is happy, right? Wrong. Time keeps moving and task scheduling continues to improve worldwide, but .NET lags behind, and managing multi-threaded applications remains difficult and tiresome. Commercial software is reaching levels of complexity so high that you could spend weeks tracking down a race condition or trying to unravel a code base just enough to wrap your head around it. Developers around the world cry out, and Microsoft answers: all you need is TAP.

The Task asynchronous programming model (TAP) provides an abstraction over asynchronous code. You write code as a sequence of statements, just like always. You can read that code as though each statement completes before the next begins. The compiler performs many transformations because some of those statements may start work and return a Task that represents the ongoing work.
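
If the async-await pattern is new to you, here is a tiny illustrative snippet (not part of the resource example we have been building – the method names and the Task.Delay stand-in are made up): the code reads top to bottom like synchronous code, yet the calling thread is free to do other work while each awaited Task is still pending.

// Illustration only: Task.Delay stands in for a real asynchronous call,
// such as a database query or web request
public async Task<int> LoadValueAsync()
{
    await Task.Delay(100); // the thread is released while the delay is pending
    return 42;
}

public async Task PrintValueAsync()
{
    var value = await LoadValueAsync(); // execution resumes once the Task completes
    Console.WriteLine(value);           // 42
}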

With the introduction of their new task scheduling model, as well as the async-await pattern, Microsoft had introduced an easy way to manage and process threads and their data, and the world was saved… sort of. Unfortunately, the new programming model does have some quirks, and while porting over existing code shouldn’t be too difficult, it does require special consideration. One such quirk is that it is illegal to await inside of a lock statement block. The reason isn’t that it is impossible, or even difficult – Eric Lippert (former C# language design team member at Microsoft) notes that the compiler team left it out to protect developers from making mistakes, because awaiting inside a lock is a recipe for producing deadlocks. To get around this, the .NET team was kind enough to provide a type well suited to synchronizing async contexts: SemaphoreSlim. A SemaphoreSlim offers an easy way to limit the number of threads which are able to enter a region of code at a time – if we configure it to only allow a single thread, then we can use it just like a lock statement, so let’s go ahead and do that:

private readonly SemaphoreSlim _semaphoreSlim = new(1, 1);
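
Before we put it to use, it is worth seeing the kind of code the compiler refuses to build – the snippet below reuses the _lockHandle field from earlier purely for illustration:

// This does NOT compile: awaiting inside a lock block produces
// compiler error CS1996 ("Cannot await in the body of a lock statement")
lock (_lockHandle)
{
    await Task.Delay(100); // error CS1996
}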

Now that we have our SemaphoreSlim configured, we can use it just like a lock statement. Unfortunately, the compiler doesn’t offer a handy shorthand like the lock statement, but take a peek at the snippet below, and see if you get flashbacks:

// Wait until the resource handle is available for us to block
await _semaphoreSlim.WaitAsync();
try
{
    // Do sensitive operation
    // Only one thread may run this code at a time
    // This section of code operates just like a `lock` block
}
finally
{
    // Release the resource handle for the next thread
    _semaphoreSlim.Release();
}

Nothing? Take a look back at how the compiler unrolls a lock statement:

// "lock" the resource handle so only this thread may access it
object __lockObj = _lockHandle;
bool __lockWasTaken = false;
try
{
    System.Threading.Monitor.Enter(__lockObj, ref __lockWasTaken);
    // Do sensitive operation
    // Only one thread may run this code at a time
}
finally
{
    if (__lockWasTaken) System.Threading.Monitor.Exit(__lockObj);
}

They’re a little different, but they serve the same purpose. We lock onto a unique handle, we manipulate a resource, and then we release the handle. Pay careful attention to the structure of these two snippets – it’s very important to use the try/finally blocks here, since we need to be sure we always release the locked handle when we are finished with it, no matter what. If one of our threads throws an exception while it holds the lock and never releases it, it may cause a deadlock, completely freezing our program.
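
To see why, consider this sketch of what happens when the release is not protected by a finally block (a hypothetical, deliberately broken variant of SetResourceAsync – it is not part of the project files):

// Anti-pattern: Release() is not guaranteed to run
public async Task SetResourceUnsafeAsync(int x, int y)
{
    await _semaphoreSlim.WaitAsync();

    // If this throws, the Release() below is never reached...
    if (x < 0) throw new ArgumentOutOfRangeException(nameof(x));

    _x = x;
    _y = y;

    // ...and the semaphore stays taken forever, so every later
    // WaitAsync() call waits for a release that never comes
    _semaphoreSlim.Release();
}

One forgotten release is all it takes to freeze every caller, which is why the try/finally shape above is worth internalizing.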

Now that we know how to adapt our code, let’s go ahead and adapt our GetResource() and SetResource() so they support the new async-await pattern. The TAP pattern adds the following naming convention (among others):

Asynchronous methods in TAP include the Async suffix after the operation name for methods that return awaitable types, such as Task, Task<TResult>, ValueTask, and ValueTask<TResult>. For example, an asynchronous Get operation that returns a Task<String> can be named GetAsync.

So let’s go ahead and update our method names to match the new naming convention as well:

/// <summary> Gets the Resources stored in a single operation. </summary>
public async Task<Tuple<int, int>> GetResourceAsync()
{
    // Wait until the resource handle is available for us to block
    await _semaphoreSlim.WaitAsync();
    try
    {
        // Do sensitive operation
        return new Tuple<int, int>(_x, _y);
    }
    finally
    {
        // Release the resource handle for the next thread
        _semaphoreSlim.Release();
    }
}
/// <summary> Sets the Resources to <paramref name="x"/> and <paramref name="y"/> in a single operation. </summary>
public async Task SetResourceAsync(int x, int y)
{
    // Wait until the resource handle is available for us to block
    await _semaphoreSlim.WaitAsync();
    try
    {
        // Do sensitive operation
        _x = x;
        _y = y;
    }
    finally
    {
        // Release the resource handle for the next thread
        _semaphoreSlim.Release();
    }
}

And that’s it! Just like that, we’ve converted ancient Event-based Asynchronous Pattern code into glorious Task-based Asynchronous Pattern code!

Now let’s see it in action:

// Store the values X = 10, Y = 20 in a single operation
await SetResourceAsync(10, 20);

// Retrieve the resource values in a single operation
var resourceAsync = await GetResourceAsync();
Console.WriteLine($"X: {resourceAsync.Item1}, Y: {resourceAsync.Item2}"); // X: 10, Y: 20

Voilà! No matter how many threads try to access our resource, only one will be able to access it at a time!

Final Thoughts

Understanding how and when to synchronize your asynchronous code will make or break your multi-threaded applications. If you don’t synchronize the right things, you’ll run into Race Conditions where bits of your code are running out of sync, but if you synchronize too much, you’ll lose the benefit of multi-threading your application in the first place, since all your threads will be waiting on each other constantly.

The code given here is a very simplified example.

In the real world, it usually isn’t useful to synchronize access to just a single object – more often you’ll synchronize access to several objects at once, such as a group of Dictionaries or a range of Properties on some object. In future posts, I’ll detail use cases for asynchronous synchronization more heavily, with more real-world examples that better demonstrate their usage.
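
As a small taste of that, here is a minimal sketch under assumed names (the dictionaries and the AddUserAsync method below are hypothetical, not part of the project files) showing a single SemaphoreSlim guarding two related collections so they always change together:

// Hypothetical example: two related lookups guarded by one semaphore,
// so no caller ever observes one dictionary updated without the other
private readonly SemaphoreSlim _gate = new(1, 1);
private readonly Dictionary<int, string> _namesById = new();
private readonly Dictionary<string, int> _idsByName = new();

public async Task AddUserAsync(int id, string name)
{
    await _gate.WaitAsync();
    try
    {
        // Both entries are added while we hold the gate
        _namesById[id] = name;
        _idsByName[name] = id;
    }
    finally
    {
        _gate.Release();
    }
}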

Project Files

You may view the project files here on GitHub. The final version of the demo code is attached below.

The final version has been altered slightly:

  • The tutorial code has been integrated into a new project, such that the code may be run as-is.
  • GetResource and SetResource have had their bodies removed, and they are now expression-bodied wrappers around their Async siblings. This was done to demonstrate the need to update EAP methods so they synchronize with their new TAP counterparts – EAP code which synchronizes with lock statements will not automatically synchronize with a SemaphoreSlim, meaning the developer must handle that synchronization manually.
  • The tutorial code has been genericized to enhance its usefulness.

using System;
using System.Threading;
using System.Threading.Tasks;

namespace Hotrian.com_Tutorials
{
    public static class Program
    {
        public static async Task Main()
        {
        {
            var obj = new SensitiveResource();

            // Store the values X = 10, Y = 20 in a single operation
            // obj.SetResource(10, 20);
            // Retrieve the resource values in a single operation
            // var resource = obj.GetResource();
            // Console.WriteLine($"X: {resource.Item1}, Y: {resource.Item2}"); // X: 10, Y: 20

            // Store the values X = 10, Y = 20 in a single operation
            await obj.SetResourceAsync(10, 20);
            // Retrieve the resource values in a single operation
            var resourceAsync = await obj.GetResourceAsync();
            Console.WriteLine($"X: {resourceAsync.Item1}, Y: {resourceAsync.Item2}"); // X: 10, Y: 20
        }
    }

    public class SensitiveResource
    {
        private readonly SemaphoreSlim _semaphoreSlim = new(1, 1);

        private int _x;
        private int _y;

        // Remap the old EAP stubs to our new TAP methods, so older code can synchronize properly with newer code
        /// <inheritdoc cref="GetResourceAsync"/>
        public Tuple<int, int> GetResource() => GetResourceAsync().GetAwaiter().GetResult();
        /// <inheritdoc cref="SetResourceAsync"/>
        public void SetResource(int x, int y) => SetResourceAsync(x, y).GetAwaiter().GetResult();

        /// <summary> Gets the Resources stored in a single operation. </summary>
        public async Task<Tuple<int, int>> GetResourceAsync()
        {
            // Wait until the resource handle is available for us to block
            await _semaphoreSlim.WaitAsync();
            try
            {
                // Do sensitive operation
                return new Tuple<int, int>(_x, _y);
            }
            finally
            {
                // Release the resource handle for the next thread
                _semaphoreSlim.Release();
            }
        }
        /// <summary> Sets the Resources to <paramref name="x"/> and <paramref name="y"/> in a single operation. </summary>
        public async Task SetResourceAsync(int x, int y)
        {
            // Wait until the resource handle is available for us to block
            await _semaphoreSlim.WaitAsync();
            try
            {
                // Do sensitive operation
                _x = x;
                _y = y;
            }
            finally
            {
                // Release the resource handle for the next thread
                _semaphoreSlim.Release();
            }
        }
    }
}