When working on several Ada concurrent counter implementations, I was interested to point out the concurrent issue that exists in multi-processor environment. This article explains why you really have to take this issue seriously in multi-tasks applications, specially because multi-core processors are now quite common.
What's the issue
Let's say we have a simple integer shared by several tasks:
Counter : Integer;
And several tasks will use the following statement to increment the counter:
Counter := Counter + 1;
We will see that this implementation is wrong (even if a single instruction is used).
Multi task increment sample
To show up the issue, let's define two counters. One not protected and another protected from concurrent accesses by using a specific data structure provided by the Ada Util library.
with Util.Concurrent.Counters; .. Unsafe : Integer := 0; Counter : Util.Concurrent.Counters.Counter;
In our testing procedure, let's declare a task type that will increment both versions of our counters. Several tasks will run concurrently so that the shared counter variables will experience a lot of concurrent accesses. The task type is declared in a declare block inside our procedure so that we will benefit from task synchronisation at the end of the block (See RM 7.6, and RM 9.3).
Each task will increment both counters in a loop. We should expect the two counters to get the same value at the end. We will see this is not the case in multi-processor environments.
declare task type Worker is entry Start (Count : in Natural); end Worker; task body Worker is Cnt : Natural; begin accept Start (Count : in Natural) do Cnt := Count; end; for I in 1 .. Cnt loop Util.Concurrent.Counters.Increment (Counter); Unsafe := Unsafe + 1; end loop; end Worker;
Now, in the same declaration block, we will define an array of tasks to show up the concurrency.
type Worker_Array is array (1 .. Task_Count) of Worker; Tasks : Worker_Array;
Our tasks are activated and they are waiting to get the counter. Let's make our tasks count 10 million times.
begin for I in Tasks'Range loop Tasks (I).Start (10_000_000); end loop; end;
Before leaving the declare scope, Ada will wait until the tasks have finished. (yes, there is no need to write any pthread_join code). After this block, we can just print out the value stored in the two counters and compare them:
Log.Info ("Counter value at the end : " & Integer'Image (Value (Counter))); Log.Info ("Unprotected counter at the end : " & Integer'Image (Unsafe));
With one task, everything is Ok (Indeed!):
Starting 1 tasks Expected value at the end : 10000000 Counter value at the end : 10000000 Unprotected counter at the end : 10000000
With two tasks, the problem appears:
Starting 2 tasks Expected value at the end : 10000000 Counter value at the end : 10000000 Unprotected counter at the end : 8033821
And it aggravates as the number of tasks increases.
Starting 16 tasks Expected value at the end : 10000000 Counter value at the end : 10000000 Unprotected counter at the end : 2496811
(The above results have been produced on an Intel Core Quad; Similar problems show up on Atom processors as well)
On x86 processors, the compiler can use an incl instruction for the unsafe counter increment. So, one instruction for our increment. You thought it was thread safe. Big mistake!
This instruction is atomic in a mono-processor environment meaning that it cannot be interrupted. However, in a multi-processor environment, each processor has its own memory cache (L1 cache) and will read and increment the value into its own cache. Caches are synchronized but this is almost always too late. Indeed, two processors can read their L1 cache, increment the value and save it at the same time (thus, loosing one increment). This is what is happening with the unprotected counter.
Let's see how to do the protection.
Protection with specific assembly instruction
To avoid this, it is necessary to use special instructions that will force the memory location to be synchronized and locked until the instruction completes. On x86, this is achieved by the lock instruction prefix. The following is guaranteed to be atomic on multi-processors:
lock incl %(eax)
The lock instruction prefix introduces a delay to the execution of the instruction it protects. This delay increases slightly when concurrency occurs but it remains acceptable (up to 10 times slower).
For Sparc, Mips and other processors, the implementation requires to loop until either a lock is get (Spinlock) or it is guaranteed that no other processor has modified the counter at the same time.
Protection with an Ada protected type
A safe and portable counter implementation can be made by using Ada protected types. The protected type allows to define a protected procedure Increment which provides an exclusive read-write access to the data (RM 9.5.1). The protected function Value will offer a concurrent read-only access to the data.
package Util.Concurrent.Counters is type Counter is limited private; procedure Increment (C : in out Counter); function Value (C : in Counter) return Integer; private protected type Cnt is procedure Increment; function Get return Integer; private N : Integer := 0; end Cnt; type Counter is limited record Value : Cnt; end record; end Util.Concurrent.Counters;