A library for replicating your Python class across multiple servers, based on the Raft protocol


Keywords
network, replication, raft, synchronization, distributed-systems, fault-tolerance, python, raft-protocol
License
MIT
Install
pip install pysyncobj==0.3.12

Documentation

PySyncObj

PySyncObj is a Python library for building fault-tolerant distributed systems. It provides the ability to replicate your application data between multiple servers. It has the following features:

  • Raft protocol for leader election and log replication
  • Log compaction - it uses fork for copy-on-write while serializing data to disk
  • Dynamic membership changes - you can make them with the syncobj_admin utility or directly from your code
  • Zero-downtime deploy - no need to stop the cluster to update nodes
  • In-memory and on-disk serialization - you can use in-memory mode for small data and on-disk mode for large data
  • Encryption - you can set a password and use it over an external network
  • Python 2 and Python 3 on Linux, macOS and Windows - no dependencies required (only optional ones, e.g. cryptography)
  • Configurable event loop - it can work in a separate thread with its own event loop, or you can call the onTick function inside your own one
  • Convenient interface - you can easily transform an arbitrary class into a replicated one (see the example below).

Install

PySyncObj itself:

pip install pysyncobj

Cryptography for encryption (optional):

pip install cryptography

Usage

Suppose you have a class that implements a counter:

class MyCounter(object):
	def __init__(self):
		self.__counter = 0

	def incCounter(self):
		self.__counter += 1

	def getCounter(self):
		return self.__counter

So, to transform your class into a replicated one:

  • Inherit it from SyncObj
  • Initialize SyncObj with a self address and a list of partner addresses. E.g. if you have serverA, serverB and serverC and want to use port 4321, you should use self address serverA:4321 with partners [serverB:4321, serverC:4321] for your application running at serverA; self address serverB:4321 with partners [serverA:4321, serverC:4321] for your application at serverB; self address serverC:4321 with partners [serverA:4321, serverB:4321] for your application at serverC.
  • Mark all methods that modify your class fields with the @replicated decorator.

Your final class will look like this:

from pysyncobj import SyncObj, replicated

class MyCounter(SyncObj):
	def __init__(self):
		super(MyCounter, self).__init__('serverA:4321', ['serverB:4321', 'serverC:4321'])
		self.__counter = 0

	@replicated
	def incCounter(self):
		self.__counter += 1

	def getCounter(self):
		return self.__counter

And that's all! Now you can call incCounter on serverA and check the counter value on serverB - they will be synchronized.
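
The self/partner address scheme described in the steps above can be derived from a single host list, so that every server runs the same code. The following is only a minimal sketch: `cluster_addresses` is a hypothetical helper and not part of PySyncObj, and the host names and port are the ones from the example.

```python
def cluster_addresses(hosts, port, self_host):
    """Return (self_address, partner_addresses) for the node on self_host.

    Hypothetical helper for illustration; not part of PySyncObj.
    """
    if self_host not in hosts:
        raise ValueError('%s is not in the cluster host list' % self_host)
    self_address = '%s:%d' % (self_host, port)
    # Partners are all other cluster members, in their original order.
    partners = ['%s:%d' % (host, port) for host in hosts if host != self_host]
    return self_address, partners


hosts = ['serverA', 'serverB', 'serverC']
for host in hosts:
    self_address, partners = cluster_addresses(hosts, 4321, host)
    print(self_address, partners)
```

The resulting pair could then be passed to the SyncObj constructor in place of the hard-coded strings, for example `super(MyCounter, self).__init__(self_address, partners)`.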

Batteries

If you just need some distributed data structures, try the built-in "batteries". A few examples:

Counter & Dict

from pysyncobj import SyncObj
from pysyncobj.batteries import ReplCounter, ReplDict

counter1 = ReplCounter()
counter2 = ReplCounter()
dict1 = ReplDict()
syncObj = SyncObj('serverA:4321', ['serverB:4321', 'serverC:4321'], consumers=[counter1, counter2, dict1])

counter1.set(42, sync=True) # set initial value to 42, 'sync' means that operation is blocking
counter1.add(10, sync=True) # add 10 to counter value
counter2.inc(sync=True) # increment counter value by one
dict1.set('testKey1', 'testValue1', sync=True)
dict1['testKey2'] = 'testValue2' # basically the same as the previous line, but asynchronous (non-blocking)
print(counter1, counter2, dict1['testKey1'], dict1.get('testKey2'))

Lock

from pysyncobj import SyncObj
from pysyncobj.batteries import ReplLockManager

lockManager = ReplLockManager(autoUnlockTime=75) # The lock will be released if the connection is dropped for more than 75 seconds
syncObj = SyncObj('serverA:4321', ['serverB:4321', 'serverC:4321'], consumers=[lockManager])
if lockManager.tryAcquire('testLockName', sync=True):
  # do some actions
  lockManager.release('testLockName')

You can look at the batteries implementation, examples and unit tests for more use cases. There is also API documentation. Feel free to create proposals and/or pull requests with new batteries, features, etc. Join our gitter chat if you have any questions.

Performance

  • 15K rps on 3 nodes; 14K rps on 7 nodes
  • 22K rps on 10-byte requests; 5K rps on 20Kb requests

Publications