Exploration into primitive CRDTs
What even is a CRDT?
Formally defined in 2011 by a group of French researchers, A CRDT, or conflict-free replicated data type, is a data structure which allows for distributed systems to achieve eventual data consistency in a deterministic manner.
Any replica of the data structure is able to be modified/operated on and via some associated process it will eventually become consistent with all other replica’s
How to achieve consistency?
Achieving eventual consistency across multiple replicas of a data structure relies on a few key requirements of the algorithm used to ‘merge’ the differing states. Following the below rules any two merges between replicas would result in the same result.
- Commutative. M(A, B) = M(B, A)
- Associative. M(M(A, B), C) = M(A, M(B, C))
- Idempotent. M(A, A) = A
- Deterministic Logic
Set based example
A set is a simple data structure which allows for modelling a CRDT quite well, I will be using typescript to create a generic implementation for use in my collaborative text editor project
Operations
- Addition of Elements
- Deletion of Elements
- Lookup of Elements
A set that can only grow
class GrowSet<T> {
items: set<T>();
constructor(items){
this.items = items ?? new set();
}
addItem(item: T){
this.items.add(item);
}
contains(item: T): Boolean {
return this.items.contains(item);
}
}
function merge(A: GrowSet<T>, B: GrowSet<T>): GrowSet<T> {
return new GrowSet(Union(A.items, B.items));
}
In this example you can see that to combine any two GrowSets, A and B, we could simply perform the union of the items in A and B, every set of possible states would reach the same consistent point as the union of two sets is commutative, associative and idempotent.
class Set<T> {
addedItems: set<T>();
removedItems: set<T>();
constructor(added, removed){
this.addedItems = added ?? new set();
this.removedItems = removed ?? new set();
}
addItem(item: T){
this.addedItems.add(item);
}
removeItem(item: T){
this.addedItems.remove(item);
this.removedItems.add(item);
}
contains(item: T): Boolean {
return this.addedItems.contains(item) && !this.removedItems.contains(item);
}
}
function merge(A: Set<T>, B: Set<T>): Set<T> {
return new Set(
Union(A.addedItems, B.addedItems),
Union(A.removedItems, B.removedItems)
);
}
There is a limitation to this approach, if an element is removed from the set then it can never be placed back into the addedItems/main working set. Removal of an item takes precedent over addition. In the case of a document editor each file could be stored as an object containing a unique uuid with a payload, this would allow for the file to be restored by creating a copy with a new uuid.
There is the issue of this not allowing for any sort of ordering to be given to the elements. To get around this limitation there are a few workarounds..
Sequence Based Approach
To represent a textual document we can create