com.socrata:ssync


License
Apache-2.0

Documentation

SSync

A Java and Haskell implementation of the rsync algorithm.

Note that this is not an implementation of rsync itself! The data it produces is compatible with neither rsync nor librsync. It implements only the signature-generation, delta-analysis, and patch-application steps described in the paper on the rsync algorithm.
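To make the three steps concrete, here is a minimal, self-contained sketch of signature generation, delta analysis, and patch application in plain Java. It does not use the SSync classes and does not produce SSync's file format. The class RsyncSketch, its methods, the Adler-style weak checksum, the use of MD5 as the strong hash, and the choice to leave a trailing partial block unsigned are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RsyncSketch {
    /** One patch instruction: copy a block of the old data, or emit literal bytes. */
    public static final class Op {
        public final int blockIndex;  // -1 for a literal
        public final byte[] literal;  // null for a block copy
        Op(int blockIndex, byte[] literal) { this.blockIndex = blockIndex; this.literal = literal; }
    }

    /** Signature entry for one block of the old data. */
    public static final class BlockSig {
        public final int index;
        public final byte[] strong;
        BlockSig(int index, byte[] strong) { this.index = index; this.strong = strong; }
    }

    /** Weak checksum of d[off, off+len): two 16-bit sums packed into one int. */
    public static int weak(byte[] d, int off, int len) {
        int a = 0, b = 0;
        for (int i = 0; i < len; i++) {
            int x = d[off + i] & 0xff;
            a += x;
            b += (len - i) * x;
        }
        return (a & 0xffff) | (b << 16);
    }

    /** Slides the window one byte to the right: drops `out`, appends `in`. */
    public static int roll(int w, int len, int out, int in) {
        int a = ((w & 0xffff) - out + in) & 0xffff;
        int b = ((w >>> 16) - len * out + a) & 0xffff;
        return a | (b << 16);
    }

    static byte[] strong(byte[] d, int off, int len) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(d, off, len);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Step 1: sign every full block of the old data (a trailing partial block is skipped). */
    public static Map<Integer, List<BlockSig>> signature(byte[] old, int bs) {
        Map<Integer, List<BlockSig>> table = new HashMap<>();
        for (int i = 0; (i + 1) * bs <= old.length; i++) {
            table.computeIfAbsent(weak(old, i * bs, bs), k -> new ArrayList<>())
                 .add(new BlockSig(i, strong(old, i * bs, bs)));
        }
        return table;
    }

    /** Step 2: scan the new data with a rolling window, emitting copies and literals. */
    public static List<Op> delta(Map<Integer, List<BlockSig>> table, byte[] nu, int bs) {
        List<Op> ops = new ArrayList<>();
        ByteArrayOutputStream lit = new ByteArrayOutputStream();
        int pos = 0, w = 0;
        boolean fresh = true;  // true when the checksum must be recomputed from scratch
        while (pos + bs <= nu.length) {
            if (fresh) { w = weak(nu, pos, bs); fresh = false; }
            int match = find(table.get(w), nu, pos, bs);
            if (match >= 0) {
                flush(lit, ops);
                ops.add(new Op(match, null));
                pos += bs;
                fresh = true;
            } else {
                lit.write(nu[pos]);
                if (pos + bs < nu.length) w = roll(w, bs, nu[pos] & 0xff, nu[pos + bs] & 0xff);
                pos++;
            }
        }
        lit.write(nu, pos, nu.length - pos);  // whatever is left is sent as a literal
        flush(lit, ops);
        return ops;
    }

    /** Confirms a weak-checksum hit with the strong hash; returns the block index or -1. */
    static int find(List<BlockSig> candidates, byte[] d, int off, int bs) {
        if (candidates == null) return -1;
        byte[] s = strong(d, off, bs);
        for (BlockSig c : candidates) if (Arrays.equals(c.strong, s)) return c.index;
        return -1;
    }

    static void flush(ByteArrayOutputStream lit, List<Op> ops) {
        if (lit.size() > 0) { ops.add(new Op(-1, lit.toByteArray())); lit.reset(); }
    }

    /** Step 3: rebuild the new data from the old data plus the patch instructions. */
    public static byte[] patch(byte[] old, List<Op> ops, int bs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : ops) {
            if (op.literal != null) out.write(op.literal, 0, op.literal.length);
            else out.write(old, op.blockIndex * bs, bs);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] oldData = "the quick brown fox jumps over the lazy dog".getBytes(StandardCharsets.US_ASCII);
        byte[] newData = "the quick red fox jumps over the lazy dog!".getBytes(StandardCharsets.US_ASCII);
        List<Op> ops = delta(signature(oldData, 4), newData, 4);
        System.out.println(Arrays.equals(patch(oldData, ops, 4), newData));
    }
}
```

The weak checksum is cheap to slide one byte at a time, so the strong hash is computed only when the weak checksum already matches a known block. Only the signature table has to travel to the side that holds the new data, and only the patch has to travel back.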

Java

To compute the signature of a file, use SignatureComputer.compute or a SignatureComputer.SignatureFileInputStream. To create a patch, read the generated signature data into a SignatureTable and pass it together with an input stream to PatchComputer.compute or a PatchComputer.PatchComputerInputStream. Finally, pass the patch together with a BlockFinder to PatchApplier.apply or a PatchApplier.PatchInputStream to generate the new file.

The use of these classes is demonstrated in the class com.socrata.ssync.SSync.

Haskell

The SSync library uses conduit for streaming data.

The produceSignatureTable conduit will digest a byte-stream into a signature file, which can itself be read into a SignatureTable value via consumeSignatureTable. If the signature table is malformed, consumeSignatureTable will throw a SignatureTableException. The patchComputer conduit can combine the signature table with a stream of bytes to produce a patch file. Finally, the patchApplier conduit can combine the patch file with the data from the file being patched to produce the target.

The use of these functions is demonstrated in the code for the ssync executable.

The ssync library (but not the executable) is compatible with GHCJS. Note that GHCJS is currently a moving target; ssync has been built with the version at commit 100fa6d67. When using GHCJS, the only HashAlgorithm available is MD5.

The binary-equivalence-test.sh file contains tests that ensure the Java and Haskell versions produce exactly the same output for the same input.