github.com/willglynn/hatt

Hash all the things! Checksumming tool on crack.


License
BSD-2-Clause
Install
go get github.com/willglynn/hatt

Documentation

hatt

Example

cd /path/to/files
hatt hash -m ../files.hatt
cd ..
hatt export -m files.hatt -o files.md5sum

hatt hash is used to build a manifest, which includes information about all the files hatt has processed.

Manifests list files along with various fingerprints – MD5, SHA1, SHA2-256, CRC32, etc. – for the purpose of later comparison. By default, hatt hash computes all these hashes, on the theory that CPU time is much cheaper than disk I/O. This way, if you ever need one specific type of hash, odds are you already have it.

hatt is designed for use on large, semi-static datasets. Therefore, in addition to hashes for each file, the manifest includes the file's modification time and size in bytes. By default, hatt hash skips files with matching mtime and size under the assumption that they haven't changed, greatly reducing overall I/O requirements.

hatt hash can be interrupted with SIGINT (^C), in which case it writes the in-progress manifest and terminates.

hatt hash supports multithreading via -t/--threads, e.g. -t 4 or --threads=8. Keep this at 1-2 if you're using rotational storage (i.e. a hard drive), since reading more files at once will just cause your drive to spend more time seeking. On the other hand, if you're on a disk array or an SSD, consider setting it to the number of CPUs in your machine for maximum performance.
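
To make the effect of the thread count concrete, here is a minimal worker-pool sketch in which the number of goroutines bounds how many files are read at once. The names hashConcurrently and hashOne are assumptions for this example and say nothing about how hatt is structured internally.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// hashConcurrently runs hashOne over paths using a fixed number of workers.
// threads bounds how many files are in flight at any moment.
func hashConcurrently(paths []string, threads int, hashOne func(string) string) map[string]string {
	if threads < 1 {
		threads = 1
	}
	jobs := make(chan string)
	results := make(map[string]string, len(paths))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < threads; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				sum := hashOne(p)
				mu.Lock() // the map is shared, so guard writes
				results[p] = sum
				mu.Unlock()
			}
		}()
	}
	for _, p := range paths {
		jobs <- p
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	// On an SSD or array, one worker per CPU is a reasonable starting point.
	threads := runtime.NumCPU()
	out := hashConcurrently([]string{"a", "b", "c"}, threads, func(p string) string {
		return "sum-of-" + p // placeholder for a real digest
	})
	fmt.Println(len(out), "files hashed with", threads, "workers")
}
```

With threads set to 1 the same code degrades to sequential reads, which matches the advice for hard drives.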

Interoperability

hatt doesn't exist in a vacuum; there are lots of other tools that do similar things. hatt manifests aren't really useful outside of hatt, but hatt can talk to other formats.

Check hatt help export or hatt formats for more.