SSync
A Java and Haskell implementation of the rsync algorithm.
Note that this is not an implementation of rsync itself! The data it produces is not compatible with either rsync or librsync. It is merely an implementation of the signature-generation, delta-analysis, and patch-application as described in the paper linked above.
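For orientation, the weak rolling checksum that makes the delta-analysis step cheap can be sketched as below. It follows the usual description of the rsync algorithm (two 16-bit sums taken modulo 2^16). The class name RollingChecksumSketch is hypothetical, and nothing here is taken from SSync's source or claims to match the checksum SSync actually uses.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: not part of SSync, and not guaranteed to match its checksum.
final class RollingChecksumSketch {
    private static final int M = 1 << 16; // modulus for both running sums

    private final int len; // window length in bytes
    private int a;         // sum of the bytes in the window, mod M
    private int b;         // position-weighted sum of the bytes in the window, mod M

    /** Computes the checksum of data[offset .. offset + len) from scratch. */
    RollingChecksumSketch(byte[] data, int offset, int len) {
        this.len = len;
        for (int i = 0; i < len; i++) {
            int x = data[offset + i] & 0xff;
            a = (a + x) % M;
            b = (b + (len - i) * x) % M; // first byte has weight len, last has weight 1
        }
    }

    /** Slides the window one byte to the right in constant time. */
    void roll(byte outgoing, byte incoming) {
        int out = outgoing & 0xff;
        int in = incoming & 0xff;
        a = Math.floorMod(a - out + in, M);
        b = Math.floorMod(b - len * out + a, M);
    }

    /** Packs the two 16-bit sums into a single 32-bit value. */
    int value() {
        return a | (b << 16);
    }

    public static void main(String[] args) {
        byte[] data = "the quick brown fox".getBytes(StandardCharsets.US_ASCII);
        int window = 8;
        RollingChecksumSketch rolling = new RollingChecksumSketch(data, 0, window);
        rolling.roll(data[0], data[window]); // window now covers bytes 1..8
        int recomputed = new RollingChecksumSketch(data, 1, window).value();
        System.out.println(rolling.value() == recomputed); // prints true
    }
}
```

Because roll updates both sums without revisiting the window, a checksum can be obtained for every byte offset of the new file at constant cost per offset, which is what lets the delta step look for matching blocks at arbitrary positions.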
Java
To compute the signature of a file, use SignatureComputer.compute or a SignatureComputer.SignatureFileInputStream. To create a patch, read the generated signature data into a SignatureTable and pass it together with an input stream to PatchComputer.compute or a PatchComputer.PatchComputerInputStream. Finally, send the patch together with a BlockFinder to PatchApplier.apply or a PatchApplier.PatchInputStream to generate the new file.
The use of these classes is demonstrated in the class com.socrata.ssync.SSync.
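The three steps (signature, patch, apply) can be illustrated with a self-contained toy that deliberately does not use the SSync classes, since their exact signatures are not reproduced here. Everything in it is an assumption made for illustration: the class DeltaSketch, its Op type, the 4-byte block size, and the in-memory patch representation are all hypothetical and unrelated to SSync's data format. For brevity it also hashes every window with MD5 rather than filtering candidates with a weak rolling checksum first, as a real implementation would.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not the SSync API and not its on-disk format.
final class DeltaSketch {
    static final int BLOCK = 4; // tiny block size, chosen so the example is easy to trace

    /** One patch instruction: copy a block of the old file, or emit literal bytes. */
    static final class Op {
        final int block;      // index of the old-file block, or -1 for a literal
        final byte[] literal; // literal bytes, or null for a block reference
        Op(int block, byte[] literal) { this.block = block; this.literal = literal; }
    }

    static String md5(byte[] data, int off, int len) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(data, off, len);
            return new BigInteger(1, md.digest()).toString(16);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
    }

    /** Step 1: the signature maps the hash of each full block of the old file to its index. */
    static Map<String, Integer> signature(byte[] old) {
        Map<String, Integer> sig = new HashMap<>();
        for (int i = 0; i + BLOCK <= old.length; i += BLOCK) {
            sig.putIfAbsent(md5(old, i, BLOCK), i / BLOCK);
        }
        return sig;
    }

    /** Step 2: scan the new file, emitting block references on a match and literals otherwise. */
    static List<Op> delta(Map<String, Integer> sig, byte[] target) {
        List<Op> ops = new ArrayList<>();
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        int i = 0;
        while (i < target.length) {
            Integer block = i + BLOCK <= target.length ? sig.get(md5(target, i, BLOCK)) : null;
            if (block != null) {
                if (pending.size() > 0) { // flush literal bytes collected before the match
                    ops.add(new Op(-1, pending.toByteArray()));
                    pending.reset();
                }
                ops.add(new Op(block, null));
                i += BLOCK;
            } else {
                pending.write(target[i++]);
            }
        }
        if (pending.size() > 0) ops.add(new Op(-1, pending.toByteArray()));
        return ops;
    }

    /** Step 3: rebuild the new file from the old file plus the patch. */
    static byte[] apply(byte[] old, List<Op> ops) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : ops) {
            if (op.block >= 0) out.write(old, op.block * BLOCK, BLOCK);
            else out.write(op.literal, 0, op.literal.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] old = "abcdefghijklmnop".getBytes(StandardCharsets.US_ASCII);
        byte[] target = "abcdXYefghmnop!".getBytes(StandardCharsets.US_ASCII);
        List<Op> patch = delta(signature(old), target);
        System.out.println(Arrays.equals(apply(old, patch), target)); // prints true
    }
}
```

In this run the patch holds three block references and two literals ("XY" and "!"), so only three literal bytes need to travel alongside the references. The holder of the new file only ever sees the signature of the old file, never the old file itself, which mirrors the division of labour between the signature, patch, and apply steps described above.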
Haskell
The SSync library uses conduit for streaming data.
The produceSignatureTable conduit will digest a byte-stream into a signature file, which can itself be read into a SignatureTable value via consumeSignatureTable. If the signature table is malformed, consumeSignatureTable will throw a SignatureTableException. The patchComputer conduit can combine the signature table with a stream of bytes to produce a patch file. Finally, the patchApplier conduit can combine the patch file with the data from the file being patched to produce the target.
The use of these functions is demonstrated in the code for the ssync executable.
The ssync library (but not the executable) is compatible with GHCJS (note: GHCJS is currently a moving target; ssync has been built with the version at commit 100fa6d67). When using GHCJS, the only HashAlgorithm available is MD5.
The binary-equivalence-test.sh file contains tests that ensure the Java and Haskell versions produce exactly the same output for the same input.