Some finite field arithmetic; a simple Swift implementation

The title of this post started out as “A Swift implementation of binary arithmetic in GF(2)^n” but I decided against it because, even though it’s more accurate, it defeats the purpose of the post which is to (hopefully) explain the principles in-code and in human-readable form…

For your perusal; here is some background, with references and more formal details;

http://xcore.github.io/doc_tips_and_tricks/crc.html

https://sites.google.com/site/ctxtree/crc/crc-binarydivision

https://en.wikipedia.org/wiki/Modular_arithmetic

http://jameskbeard.com/Temple/Data/Binary_Polynomial_Division.pdf

Now, to divide two binary numbers, modulo 2, you use the same technique as “long division”, but I haven’t done long division by hand for a very long time and quite frankly can’t remember much of it, so rather than lean on that as an explanation I’ll present you directly with an algorithm, firstly in pseudo-code, then in Swift;

divideBinaryModulo2(N, D) where order(N) >= order(D) and D != 0
 let k = (msb position of N) - (msb position of D)
 let divisor = D shifted up k bits (so that its msb is
               aligned with the msb of N)
 let dividend = N
 let quotient = 0
 let remainder = 0
 for i = k..0 {
   if high bit of dividend (at the msb position of N) is set
     set i’th bit of quotient to 1
     dividend = dividend XOR divisor

   shift dividend up 1 bit
 }
 remainder = dividend shifted back down k+1 bits

The result is the quotient and remainder such that

quotient * D + remainder = N

Addition and subtraction are the same, modulo 2, and both are implemented with the XOR operation so another way of writing this is

quotient * D XOR remainder = N

Where multiplication of A and B, modulo 2, is done by summing A multiplied by each term of B, i.e;

multiplyBinaryModulo2(A,B)
 result = 0 
 for each set bit in B {   
   let i = bit position
   result = result XOR (A shifted up by i)
}

Note; For the multiplication to be correct we also need to ensure that the result doesn’t go beyond 32 bits and the way to do this is to do the multiplication modulo an irreducible (“prime”) polynomial of order 32 (i.e. one that uses 33 bits.)

Imagine we wanted to work within only 8 of our 32 bits. Since addition is an XOR operation, assuming the terms are of order less than 8, it can never bring a result to overflow. Multiplication, however, could easily do that and the question then is; given the result of a polynomial multiply which overflows our 8 bits, what do we do…?

One approach might be to just AND it with 0xff (0b11111111) to mask out the upper bits. This certainly keeps the result within 8 bits but it’s not going to give us the right results because, firstly, it’s a logical operation (not an arithmetic one, i.e. a +, a - or a *, which is required for this to remain inside the confines of a “ring”, which is a bit like a mathematical “group” with a little more structure, namely a multiplication alongside the addition); secondly, it’s not guaranteed to give a non-0 result when we multiply two non-0 polynomials, and as long as a product like that can come out as 0 we can’t have the concept of an “inverse” for every non-0 polynomial. Fixing the overflow also needs to give us a result which can be any of the possible polynomials we can fit in the 8 bits; i.e. it shouldn’t rule any out and thereby reduce our set of polynomials.

To the rescue; prime numbers. Or, to be more specific, prime polynomials. Something is prime if it’s not a product of other parts so a prime polynomial (the formal term is an irreducible polynomial; a primitive polynomial is an irreducible one with an additional property) can’t be written out as the product of other, lower order, polynomials. Another important property of prime polynomials (or numbers) is that they guarantee that a “multiplicative inverse” exists for every non-0 element in a collection where we define multiplication as including a modulo on the result; i.e. a*b = a*b mod p, where p is the prime (number or polynomial).

To be more specific; since a prime number p can’t be written as the product of two or more, smaller, numbers, the product a*b of two non-0 numbers smaller than p can never be a multiple of p. This means that taking the modulo will never result in 0, since you only get 0 if the remainder is 0, i.e. if a*b is a multiple of p… The same logic applies to p being a prime (irreducible) polynomial.
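The claim is easy to check by brute force for small numbers. This is only a sketch (the helper name hasZeroProduct is my own): it looks for a pair of non-0 values below p whose product reduces to 0, which should never happen for a prime such as 7 but does happen for a composite such as 8 (2*4 = 8).

```swift
// Returns true if some pair of non-zero values a, b < p has (a * b) % p == 0.
func hasZeroProduct(modulo p: Int) -> Bool {
    for a in 1..<p {
        for b in 1..<p {
            if (a * b) % p == 0 {
                return true
            }
        }
    }
    return false
}

let primeHasZero = hasZeroProduct(modulo: 7)      // 7 is prime
let compositeHasZero = hasZeroProduct(modulo: 8)  // 8 = 2 * 4 is not
print(primeHasZero, compositeHasZero)             // prints "false true"
```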

So; to make our multiplications correct and mathematically consistent for polynomials, and to fit inside our example of 8 bits, we need to follow every multiplication with a modulo (i.e. remainder) operation on the result by a prime polynomial. This prime polynomial can be picked relatively arbitrarily and there are tables for these (they can also be found using the same techniques used to find prime numbers. I’ve tried this with the Sieve of Eratosthenes and I’ll publish this addition to the code below as soon as I have time.) So, therefore; the code below, which leaves out this modulo step, is strictly not correct for multiplications.
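To make the “multiply, then take the remainder” step concrete, here is a minimal sketch for the 8-bit example. It is not part of the class in this post; the function name multiplyModulo is my own, and I’m assuming 0x11B (x^8 + x^4 + x^3 + x + 1, the polynomial AES uses) as the prime polynomial. Any other irreducible polynomial of order 8 should do equally well.

```swift
// Carry-less multiply of two 8-bit polynomials, reduced modulo an
// irreducible polynomial of order 8 (written with its x^8 term, so 9 bits).
// ASSUMPTION: 0x11B is used as the modulus purely as an example.
func multiplyModulo(_ a: UInt8, _ b: UInt8, modulus: UInt16 = 0x11B) -> UInt8 {
    // 1. plain polynomial multiplication; the result needs up to 15 bits
    var product: UInt16 = 0
    for i in 0..<8 where (b >> i) & 1 == 1 {
        product ^= UInt16(a) << i
    }
    // 2. remainder: cancel every bit from position 14 down to 8 by XOR-ing
    //    in a copy of the modulus shifted so its top bit lines up
    for bit in stride(from: 14, through: 8, by: -1) where (product >> bit) & 1 == 1 {
        product ^= modulus << (bit - 8)
    }
    // all bits above position 7 are now clear, so this conversion cannot trap
    return UInt8(product)
}

let p = multiplyModulo(0x57, 0x83)   // 0xC1
let one = multiplyModulo(0x53, 0xCA) // 0x01, i.e. 0xCA is the inverse of 0x53
```

The second call shows the point of the whole exercise; with a prime polynomial as the modulus a non-0 element has a multiplicative inverse, which simple masking would not give us.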

Now for the Swift part;

I’ve created a small class Gf2Polynomial32 which contains the code for performing modular arithmetic on bit strings up to 32 bits in length. The reason for the “Polynomial” part in the name, btw, is because another way of representing these bit strings is as polynomials in a variable x which have either a “1” or a “0” coefficient. I.e., given the bit string 10110 we can interpret this as;

 1*x^4+0*x^3+1*x^2+1*x+0

I’ve added some extension methods to create string representations of the bits either as conventional 1’s and 0’s or in the polynomial representation.

Here, without further ado, is the code (just remember that it has not been written for speed and performance, but rather for understandability)

//
//  Gf2Polynomial32.swift
//
//  Created by Jarl Ostensen on 07/01/02015.
//  Copyright (c) 2015 SonarJetLens. All rights reserved.
//

import Foundation

// Gf2 representation of a polynomial of maximum order 31
public class Gf2Polynomial32 {

    private let _value: UInt32
    private let _valueMsbPos: UInt32

    // return the position of the highest set bit in value
    // (0 is returned for a value of 0, so the subtraction can't underflow)
    private static func msbPosOf(_ value: UInt32) -> UInt32 {
        var pos: UInt32 = 0
        var c = value >> 1
        while c != 0 {
            c >>= 1
            pos += 1
        }
        return pos
    }

    public init(val: UInt32) {
        _value = val
        _valueMsbPos = Gf2Polynomial32.msbPosOf(val)
    }

    // order of the polynomial (equals position of msb)
    public var order: UInt32 {
        return _valueMsbPos
    }

    // raw value
    public var value: UInt32 {
        return _value
    }
}

// ====================================== various operators on Gf2Polynomial32's;

public func == (left: Gf2Polynomial32, right: Gf2Polynomial32) -> Bool {
    return left.value == right.value
}

public func != (left: Gf2Polynomial32, right: Gf2Polynomial32) -> Bool {
    return left.value != right.value
}

public func + (left: Gf2Polynomial32, right: Gf2Polynomial32) -> Gf2Polynomial32 {
    return Gf2Polynomial32(val: left.value ^ right.value)
}

public func - (left: Gf2Polynomial32, right: Gf2Polynomial32) -> Gf2Polynomial32 {
    // NOTE: same as for +
    return Gf2Polynomial32(val: left.value ^ right.value)
}

// multiplication
// NOTE: there is no modulo step here, so terms above x^31 are simply lost
public func * (left: Gf2Polynomial32, right: Gf2Polynomial32) -> Gf2Polynomial32 {
    var b: UInt32 = right.value
    let a: UInt32 = left.value
    var result: UInt32 = 0
    var shift: UInt32 = 0
    while b != 0 {
        // for every set bit in b; add (XOR) a shifted up to that bit position
        if b & 1 == 1 {
            result ^= (a << shift)
        }
        shift += 1
        b >>= 1
    }
    return Gf2Polynomial32(val: result)
}

// division (returns a pair; quotient and remainder)
public func / (left: Gf2Polynomial32, right: Gf2Polynomial32) -> (Gf2Polynomial32, Gf2Polynomial32) {
    precondition(right.value != 0, "division by the zero polynomial")
    // a dividend of lower order than the divisor is its own remainder
    if left.order < right.order {
        return (Gf2Polynomial32(val: 0), left)
    }
    let shiftAlign = left.order - right.order
    // align the msb of the divisor with the msb of the dividend
    let divisor = right.value << shiftAlign
    let highBitMask: UInt32 = 1 << left.order
    var q: UInt32 = 0
    var dividend = left.value
    var qShift = shiftAlign + 1
    repeat {
        if (dividend & highBitMask) != 0 {
            dividend ^= divisor
            q ^= (1 << (qShift - 1))
        }
        qShift -= 1
        dividend = (dividend << 1)
    } while qShift != 0
    // the dividend was shifted up shiftAlign+1 times; shift the remainder back down
    return (Gf2Polynomial32(val: q), Gf2Polynomial32(val: dividend >> (shiftAlign + 1)))
}

And some helpful extensions;

//
//  Gf2Polynomial32+Extension.swift
//
//  Created by Jarl Ostensen on 07/01/02015.
//  Copyright (c) 2015 SonarJetLens. All rights reserved.
//

import Foundation

extension Gf2Polynomial32 {

    // return a 1's and 0's representation
    func asBinaryString() -> String {
        if value == 0 {
            return "0"
        }
        var result = ""
        var c = value
        while c != 0 {
            // prepend the lowest bit so that the msb ends up first
            result = ((c & 1) != 0 ? "1" : "0") + result
            c >>= 1
        }
        return "0b" + result
    }

    // return a polynomial representation
    func asPolynomial() -> String {
        if value == 0 {
            return "0"
        }
        var result = ""
        var c = value
        var pos = 0
        while c != 0 {
            if (c & 1) != 0 {
                if pos > 0 {
                    // prepend the term; lower order terms are already in result
                    result = "x^\(pos)" + (result.isEmpty ? "" : " + \(result)")
                } else {
                    result = "1"
                }
            }
            c >>= 1
            pos += 1
        }
        return result
    }
}

And finally; here is some example output from using this class;

N = 0b1100110111; x^9 + x^8 + x^5 + x^4 + x^2 + x^1 + 1

D = 0b10011; x^4 + x^1 + 1

q = 0b110110; x^5 + x^4 + x^2 + x^1

r = 0b1101; x^3 + x^2 + 1

q*D     = x^9 + x^8 + x^5 + x^4 + x^3 + x^1

q*D + r = x^9 + x^8 + x^5 + x^4 + x^2 + x^1 + 1

Compare-And-Swap; lockless paradigm with CouchDB/Cloudant

I’m currently working on a matchmaking implementation using a database back-end provided by Cloudant (which is built on CouchDB).

The algorithm requires me to be able to take ownership of records in a database in a fast, first-come-first-serve, way which locks these records out of subsequent requests until they are released.

In lockless, concurrent, programming there is the concept of “CAS” (compare-and-swap) which is an atomic operation with signature;

compareAndSwap(target, testValue, newValue) : boolean

The function tests the value of “target” and sets it to “newValue” if it is equal to “testValue”. If the swap succeeds the function returns true, if not it returns false. All of this is done atomically and on most modern hardware (and JVMs) this is implemented as a single instruction.

The idea behind this function is that a thread which wants to obtain or change a resource tries to do so with minimal overhead and no global locking required. If the swap fails the thread is responsible for either trying again or giving up; the key here is that the caller (the thread) is the only part which might block or wait, all other threads accessing the resource can continue uninterrupted. Contrast this to a global lock or synchronisation approach where the thread effectively locks out everybody else when accessing the resource. Using a CAS approach turns the locking problem on its head, so to speak. You can implement very effective concurrent collections, such as queues and linked lists, using a CAS approach where link pointers and indexes are the only things changed.

For the matchmaking problem an algorithm that locks out records using a CAS-function would check if the current record is “available” and then try to set it to “unavailable”. If it fails it continues to the next available record (in my case; it could also wait.)
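As a sketch of that loop (not my production code): the class and function names below are made up for illustration, the states are plain strings, and an NSLock stands in for the atomic instruction purely so that the example is runnable; real lockless code would use a proper atomic type.

```swift
import Foundation

// A record state guarded by compare-and-swap.
// ASSUMPTION: NSLock emulates the atomicity the hardware would normally give us.
final class CasCell {
    private var current: String
    private let guardLock = NSLock()

    init(_ initial: String) { current = initial }

    // Sets the cell to newValue only if it currently equals testValue.
    func compareAndSwap(testValue: String, newValue: String) -> Bool {
        guardLock.lock()
        defer { guardLock.unlock() }
        guard current == testValue else { return false }
        current = newValue
        return true
    }
}

// Walk the records and claim the first one that is still available.
func claimFirstAvailable(_ records: [CasCell]) -> Int? {
    for (index, record) in records.enumerated() {
        if record.compareAndSwap(testValue: "available", newValue: "unavailable") {
            return index
        }
        // swap failed; somebody else owns this record, move on to the next
    }
    return nil
}

let records = [CasCell("unavailable"), CasCell("available"), CasCell("available")]
let first = claimFirstAvailable(records)   // claims index 1
let second = claimFirstAvailable(records)  // index 1 is taken by now, claims index 2
let third = claimFirstAvailable(records)   // nothing left, nil
```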

With CouchDB and Cloudant you can implement this behaviour at a database record level;

Whenever a record (or “document”) is changed in these database implementations it gets a new unique revision number (see for example: http://wiki.apache.org/couchdb/HTTP_Document_API) and whenever you want to update a document you also need to provide the revision number of the document you want to change. The key here is that if the document currently in the database is at a different revision number the update fails (returning a 409 error.)

This is all we need to implement an atomic document CAS for these databases;

 docCAS(targetDocumentId, revisionNumber, newDocumentContents)

The “testValue” is now the expected revision number of the document which we want to change to “newDocumentContents”. The PUT call to the database will return a 409 error if the revision number is different from what we expected it to be. Otherwise the document is changed to the new contents (and the revision number is updated.)
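A rough sketch of what the request side of docCAS could look like in Swift. The type and function names are hypothetical, I’m assuming the base URL already points at the database, and I pass the expected revision as the rev query parameter of the document PUT. Sending the request and handling authentication are left out; only building the request and interpreting the status code are shown.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

enum DocCasResult: Equatable {
    case swapped        // document updated, it now has a new revision
    case conflict       // 409; the document was at a different revision
    case failed(Int)    // anything else
}

// Builds the PUT for docCAS.
// ASSUMPTION: baseURL is the database URL, e.g. https://host/dbname
func docCasRequest(baseURL: URL, documentId: String, revision: String, newContents: Data) -> URLRequest {
    // force unwraps are acceptable for a sketch; real code should handle nil
    var components = URLComponents(url: baseURL.appendingPathComponent(documentId),
                                   resolvingAgainstBaseURL: false)!
    components.queryItems = [URLQueryItem(name: "rev", value: revision)]
    var request = URLRequest(url: components.url!)
    request.httpMethod = "PUT"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = newContents
    return request
}

// Maps the HTTP status of the PUT onto the CAS outcome.
func docCasResult(forStatus status: Int) -> DocCasResult {
    switch status {
    case 201, 202: return .swapped
    case 409: return .conflict
    default: return .failed(status)
    }
}

let request = docCasRequest(baseURL: URL(string: "https://example.com/matchmaking")!,
                            documentId: "player42",
                            revision: "3-abc",
                            newContents: Data("{\"state\":\"unavailable\"}".utf8))
```

A caller would send the request and treat .conflict exactly like a failed compareAndSwap; move on to the next record or try again.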

There is nothing particularly clever or strange about all this but I thought it was quite nice that the CAS idea from lockless programming, usually confined to the domain of atomic single-instruction environments, was so easily and completely transferable to a big, comparatively non-atomic, database.

And the matchmaking algorithm became a breeze to implement after this realisation had sunk in…

UPDATE

Cloudant pointed out that there is a problem with this implementation (as I put in my comment earlier) since a distributed database solution like Cloudant (or BigCouch) has a “propagation speed limit” (my term, can’t help drawing on physics parallels) such that two docCAS requests might apparently succeed at the same time and quietly cause a conflict. Internally Cloudant will resolve it and pick a “winner” but the writers won’t know that until they explicitly ask for conflicts from the database.

In my case the matchmaking can handle this because matches are played out asynchronously (i.e. while one player is online and the other is offline) so we have a little bit of leeway to manage the conflict. Even if these situations might arise infrequently they can arise and the matchmaker would be incomplete (and fragile) if it didn’t handle them.

For my case there are two situations where conflicts can arise;

  1. An offline player picked for a match is picked by more than one online player at the same time (because all of their docCAS requests overlap and cause a quiet conflict)
  2. An offline player picked for a match comes online the exact same moment as they are matched against and the docCAS succeeds for both

To resolve this I’m introducing “begin/end” match semantics (i.e. consider a match like a transaction); when a match between an attacker and a defender is over (and this might happen on a number of different clients since we are assuming a conflict happened) my code issues a GET on the defender’s document used for matchmaking with the ?conflicts=true flag in the query (more about this here: http://docs.cloudant.com/guides/mvcc.html). If this results in a list of conflicts I check if the current match was the winning one (i.e. if the revision of the matchmaking document for the defender was the one Cloudant picked as the winner); if it is I can go ahead and update the defender’s stats (and the attacker’s). If not I don’t (and I can decide whether to update the attacker depending on game design.) This way the defender will only ever lose once (and not have their resources stripped by several attackers at once.) Conversely, and depending on the game design, multiple attackers might benefit from the same attack but this is less of an issue.
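The check at the end of a match could be sketched like this. The names and the sample document are made up; the only thing I’m relying on is that a document fetched with ?conflicts=true carries the winning revision in "_rev" and lists the losing revisions in "_conflicts" (which is absent when there are none).

```swift
import Foundation

// Outcome of inspecting a document fetched with ?conflicts=true.
struct ConflictCheck: Equatable {
    let hasConflicts: Bool
    let ourRevisionWon: Bool
}

// ASSUMPTION: body is the JSON returned by the GET on the matchmaking document,
// and ourRevision is the revision our own docCAS produced at the start of the match.
func checkConflicts(body: Data, ourRevision: String) -> ConflictCheck? {
    guard let object = try? JSONSerialization.jsonObject(with: body),
          let doc = object as? [String: Any],
          let winner = doc["_rev"] as? String else {
        return nil  // not a document we can interpret
    }
    let losers = doc["_conflicts"] as? [String] ?? []
    return ConflictCheck(hasConflicts: !losers.isEmpty,
                         ourRevisionWon: winner == ourRevision)
}

// Hypothetical response; two attackers both "succeeded" and Cloudant picked 2-aaa.
let sample = Data("""
{"_id": "defender-1", "_rev": "2-aaa", "_conflicts": ["2-bbb"], "state": "unavailable"}
""".utf8)
let winnerView = checkConflicts(body: sample, ourRevision: "2-aaa")
let loserView = checkConflicts(body: sample, ourRevision: "2-bbb")
```

Only the client for which ourRevisionWon is true goes on to update the defender’s stats.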

For the second case the situation is a wee bit more involved (but not much); I have to check each revision in the conflict set (including the winning one) to find which one was the one corresponding to the defender coming online (knowing this requires a little bit more information in the matchmaking table) and determine if I update the defender and/or attacker based on this. I.e. it requires that a bit more information persists and that some more information is processed (the entire conflict set as opposed to just the winning one.)

So far, at least, it looks good on paper and in a test framework. The proof will be in the proverbial eating of the pud…

Update 2

It works but it can be simplified a lot; conflict resolution as I outlined it above is not needed because it only really affects the “match document” which is used to hold the lock. Once a match is over the defender’s state document will be updated regardless; there isn’t even a need to check if the revision number is the same as it was when the match started, one document will be the final word regardless of what happens. However (!) this depends on the state of the defender not being updated cumulatively; i.e. if the defender’s state is cached at the beginning of a match and then updated from that cache at the end, as opposed to reloading the defender’s state fresh from the DB and then doing the update, the update will never be cumulative (i.e. it won’t be added to an update done by another attack at the same time.)

The end result is that only one document (state) is left as the final one, regardless of how many attacks happened. If a conflict happens (i.e. the replication issue outlined above) then Cloudant will pick a winner and we’re back to a single state update again. The defender is only penalised once.

That simplifies things….

Update 3

We’ve had this running for a while now without simultaneous requests generating any conflicts or issues. That doesn’t mean that they can’t happen (and we’re watching it carefully as the peak numbers grow) but as far as both reliability and performance are concerned the approach is working well and Cloudant serves this type of matchmaking very well indeed.