Problem Description
The problem is to update a cache that is almost never modified and is only read in a multi-threaded context. Read performance is critical, and the goal is to reduce thread contention as much as possible to obtain a fast, non-blocking path when reading the cache.
Cache Declaration
Java Implementation
Let's define the cache using the HashMap class:
public class Cache {
private HashMap<String,String> map = new HashMap<String, String>();
}
Ada Implementation
In Ada, let's instantiate the Indefinite_Hashed_Maps package for the cache:
with Ada.Strings.Hash;
with Ada.Containers.Indefinite_Hashed_Maps;
...
package Hash_Map is
   new Ada.Containers.Indefinite_Hashed_Maps (Key_Type        => String,
                                              Element_Type    => String,
                                              Hash            => Ada.Strings.Hash,
                                              Equivalent_Keys => "=");
Map : Hash_Map.Map;
Solution 1: safe and concurrent implementation
This is a straightforward solution using the language's thread-safe constructs. In Java, it does not allow several threads to look at the cache at the same time: cache accesses are serialized. This is not a problem in Ada, where multiple concurrent readers are allowed; only writing locks the cache object.
Java Implementation
The thread-safe implementation is protected by the synchronized keyword. It guarantees mutual exclusion of threads invoking the getCache and addCache methods.
public synchronized String getCache(String key) {
return map.get(key);
}
public synchronized void addCache(String key, String value) {
map.put(key, value);
}
Ada Implementation
In Ada, we can use a protected type. The cache can be declared as follows:
protected type Cache is
function Get(Key : in String) return String;
procedure Put(Key, Value: in String);
private
Map : Hash_Map.Map;
end Cache;
and the implementation is straightforward:
protected body Cache is
function Get(Key : in String) return String is
begin
return Map.Element (Key);
end Get;
procedure Put(Key, Value: in String) is
begin
Map.Insert (Key, Value);
end Put;
end Cache;
Pros and Cons
+: This implementation is thread safe.
-: In Java, thread contention is high as only one thread can look in the cache at a time.
-: In Ada, thread contention occurs only while another thread updates the cache (far better than Java, but it could hurt realtime performance if the Put operation takes time).
Solution 2: weak but efficient implementation
Solution 1 does not allow multiple threads to access the cache at the same time, thus creating a contention point. The second solution removes this contention point by relaxing some thread-safety conditions at the expense of the cache behavior.
In this second solution, several threads can read the cache at the same time. The cache can be updated by one or several threads, but an update does not guarantee that all added entries will be present in the cache. In other words, if two threads update the cache at the same time, the updated cache will contain only one of the new entries. This behavior can be acceptable in some cases, but it does not fit all uses. It must be used with great care.
Java Implementation
A cache entry can be added in a thread-safe manner using the following code:
private volatile HashMap<String, String> map = new HashMap<String, String>();
public String getCache(String key) {
return map.get(key);
}
public void addCache(String key, String value) {
HashMap<String, String> newMap = new HashMap<String, String>(map);
newMap.put(key, value);
map = newMap;
}
This implementation is thread safe because the installed hash map is never modified. A modification is made on a separate hash map object, which is then installed by the map = newMap assignment, an atomic operation on a volatile field. Again, this code does not guarantee that every entry added will end up in the cache.
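The snapshot semantics and the lost-update hazard can both be shown in a short, self-contained sketch (the CopyOnWriteCache class name is illustrative, not from the article):

```java
import java.util.HashMap;

// Hypothetical class name; the copy-on-write cache described above.
class CopyOnWriteCache {
    private volatile HashMap<String, String> map = new HashMap<String, String>();

    public String getCache(String key) {
        return map.get(key);  // lock-free read on the current snapshot
    }

    public void addCache(String key, String value) {
        // Copy, modify the copy, then publish it with one volatile write.
        HashMap<String, String> newMap = new HashMap<String, String>(map);
        newMap.put(key, value);
        map = newMap;
    }
}

class CopyOnWriteDemo {
    public static void main(String[] args) {
        CopyOnWriteCache cache = new CopyOnWriteCache();
        cache.addCache("a", "1");
        cache.addCache("b", "2");
        System.out.println(cache.getCache("a")); // 1
        System.out.println(cache.getCache("b")); // 2

        // The lost-update race, simulated sequentially: two writers each
        // copy the SAME old map, insert their own entry, then publish.
        // Whichever assignment runs last wins; the other entry vanishes.
        HashMap<String, String> old = new HashMap<String, String>();
        HashMap<String, String> copy1 = new HashMap<String, String>(old);
        HashMap<String, String> copy2 = new HashMap<String, String>(old);
        copy1.put("x", "from writer 1");
        copy2.put("y", "from writer 2");
        // If writer 1 publishes first and writer 2 last, "x" is lost:
        System.out.println(copy2.containsKey("x")); // false
    }
}
```

This is why the article warns that the technique must be used with great care: it trades write atomicity across concurrent writers for a contention-free read path.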
Ada Implementation
The Ada implementation is slightly more complex, mainly because there is no garbage collector. If we allocate a new hash map and update the access pointer, we still have to free the old hash map once no other thread is accessing it.
The first step is to use a reference counter to automatically release the hash table when the last thread finishes its work. The reference counter handles the memory management issues for us. An implementation of thread-safe reference counters is provided by Ada Util. In this implementation, counters are updated using specific atomic instructions (see Showing multiprocessor issue when updating a shared counter).
with Util.Refs;
...
type Cache is new Util.Refs.Ref_Entity with record
Map : Hash_Map.Map;
end record;
type Cache_Access is access all Cache;
package Cache_Ref is new Util.Refs.References (Element_Type => Cache,
Element_Access => Cache_Access);
C : Cache_Ref.Atomic_Ref;
Source: Util.Refs.ads, Util.Refs.adb
The References package defines a Ref type representing a reference to a Cache instance. To replace one reference by another atomically, it is necessary to use the Atomic_Ref type, because the Ada assignment of a Ref is not atomic: the assignment copy and the call to the Adjust operation that updates the reference counter are two separate steps. The Atomic_Ref type is a protected type that provides a getter and a setter; their use guarantees the atomicity.
function Get(Key : in String) return String is
R : constant Cache_Ref.Ref := C.Get;
begin
return R.Value.Map.Element (Key); -- concurrent access
end Get;
procedure Put(Key, Value: in String) is
R : constant Cache_Ref.Ref := C.Get;
N : constant Cache_Ref.Ref := Cache_Ref.Create;
begin
N.Value.all.Map := R.Value.Map;
N.Value.all.Map.Insert (Key, Value);
C.Set (N); -- install the new map atomically
end Put;
Pros and Cons
+: high performance in SMP environments
+: no thread contention in Java
-: a cache update can lose some entries
-: still some thread contention in Ada but limited to copying a reference (C.Set)
Comments
Hello Stephane,
I don't think your "Pros and Cons" to solution 1 are correct for the Ada solution. If you read the ARM 9.5.1 Protected Subprograms and Protected Actions(1), you'll notice this on the very first line:
"protected functions provide concurrent read-only access to the data."
Multiple readers can call your Get function without any kind of contention. So you don't need the second solution for Ada, as the problem has already been solved in the first. Only when Put is called is the object locked down, as the ARM states:
"Protected procedures provide exclusive read-write access to the data of a protected object;"
Regards,
Thomas Løcke
(1) http://www.adaic.org/resources/add_...