Compare-And-Swap; lockless paradigm with CouchDB/Cloudant

I’m currently working on a matchmaking implementation using database back-end provided by Cloudant (which is built on CouchDB).

The algorithm requires me to be able to take ownership of records in a database in a fast, first-come-first-serve, way which locks these records out of subsequent requests until they are released.

In lockless, concurrent, programming there is the concept of “CAS” (compare-and-swap) which is an atomic operation with signature;

compareAndSwap(target, testValue,newValue) : boolean

The function tests the value of “target” and sets it to “newValue” if it is equal to “testValue”. If the swap succeeds the function returns true, if not it returns false. All of this is done atomically and on most modern hardware (and JVMs) this is implemented as a single instruction.

The idea behind this function is that a thread which wants to obtain or change a resource tries to do so with minimal overhead and no global locking required. If the swap fails the thread is responsible for either trying again or giving up; the key here is that the caller (the thread) is the only part which might block or wait, all other threads accessing the resource can continue uninterrupted. Contrast this to a global lock or synchronisation approach where the thread effectively locks out everybody else when accessing the resource. Using a CAS approach turns the locking problem on its head, so to speak. You can implement very effective concurrent collections, such as queues and linked lists, using a CAS approach where link pointers and indexes are the only things changed.

For the matchmaking problem an algorithm that locks out records using a CAS-function would check if the current record is “available” and then try to set it to “unavailable”. If it fails it continues to the next available record (in my case; it could also wait.)

With CouchDB and Cloudant you can implement this behaviour at a database record level;

Whenever a record (or “document”) is changed in these database implementations they get a new unique revision number (see for example: http://wiki.apache.org/couchdb/HTTP_Document_API) and whenever you want to update to a document you also need to provide the revision number for the document you want to change. The key here is that if the document currently in the database is at a different revision number the update fails (returning a 409 error.)

This is all we need to implement an atomic document CAS for these databases;

 docCAS(targetDocumentId, revisionNumber, newDocumentContents)

The “testValue” is now the expected revision number of the document which we want to change to “newDocumentContents”. The PUT call to the database will return a 409 error if the revision number is different from what we expected it to be. Otherwise the document is changed to the new contents (and the revision number is updated.)

There is nothing particularly clever or strange about all this but I thought it was quite nice that the CAS idea from lockless programming – usually confined to the domain of atomic single-instruction environments – was so easily and completely transferrable to a big, comparatively non-atomic, database.

And the matchmaking algorithm became a breeze to implement after this realisation had sunk in…

UPDATE

Cloudant pointed out that there is a problem with this implementation (as I put in my comment earlier) since a distributed database solution like Cloudant (or big couch) has a “propagation speed limit” (my term, can’t help drawing on physics parallels) such that two docCAS requests might apparently succeed at the same time and quietly cause a conflict. Internally Cloudant will resolve it and pick a “winner” but the writers won’t know that until they explicitly ask for conflicts from the database.

In my case the matchmaking can handle this because matches are played out asynchronously (i.e. while one player is online and the other is offline) so we have a little bit of leeway to manage the conflict. Even if these situations might arise infrequently they can arise and the matchmaker would be incomplete (and fragile) if it didn’t handle them.

For my case there are two situations where conflicts can arise;

  1. An offline player picked for a match is picked by more than one online player at the same time (because all of their docCAS requests overlap and cause a quiet conflict)
  2. An offline player picked for a match comes online the exact same moment as they are matched against and the docCAS succeeds for both

To resolve this I’m introducing a “begin/end” -match semantics (i.e. consider a match like a transaction); when a match between an attacker and a defender is over (and this might happen on a number of different clients since we are assuming a conflict happened) my code issues an GET on the defender’s document used for matchmaking with the ?conflicts=true flag in the query (more about this here: http://docs.cloudant.com/guides/mvcc.html). If this results in a list of conflicts I check if the current match was the winning one (i.e. if the revision of the matchmaking document for the defender was the one Cloudant picked as the winner); if it is I can go ahead and update the defender’s stats (and the attacker). If not I don’t (and I can decide to update the attacker depending on game design.) This way the defender will only ever lose once (and not have their resources stripped by several attackers at once.) Conversely, and depending on the game design, multiple attackers might benefit from the same attack but this is less of an issue.

For the second case the situation is a wee bit more involved (but not much); I have to check each revision in the conflict set (including the winning one) to find which one was the one corresponding to the defender coming online (knowing this requires a little bit more information in the matchmaking table) and determine if I update the defender and/or attacker based on this. I.e. it requires that a bit more information persists and that some more information is processed (the entire conflict set as opposed to just the winning one.)

So far, at least, it looks good on paper and in a test framework. The proof will be in the proverbial eating of the pud…

Update 2

It works but it can be simplified a lot; conflict resolution as I outlined it above is not needed because it only really affects the “match document” which are used to hold the lock. Once a match is over the defender’s state documents will be updated regardless – there is even no need to check if the revision number is the same as it was when the match started; one document will be the final word regardless what happens. However (!) this depends on the state of the defender not being updated accumulatively; i.e. if the defender’s state is cached at the beginning of a match and then updated from that cache at the end, as opposed to reloading the defender’s state fresh from the DB and then doing the update, the update will never be accumulative (i.e. it won’t be added to an update done by another attack at the same time.)

The end result is that only one document (state) is left as the final one, regardless of how many attacks happened. If a conflict happens (i.e. the replication issue outlined above) then Cloudant will pick a winner and we’re back to a single state update again. The defender is only penalised once.

That simplifies things….

Update 3

We’ve had this running for a while now without simultaneous requests generating any conflicts or issues. That doesn’t mean that they can’t happen (and we’re watching it carefully as the peak numbers grow) but as far as both reliability and performance is concerned the approach is working well and Cloudant serves this type of match making very well indeed.

Advertisements