Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash


Keywords
assembly, avx2, hash-functions, highway-hash, neon, plan9
License
Apache-2.0
Install
go get github.com/minio/highwayhash

Documentation


HighwayHash

HighwayHash is a pseudo-random function (PRF) developed by Jyrki Alakuijala, Bill Cox and Jan Wassenberg (Google Research). HighwayHash takes a 256-bit key and computes 64-, 128- or 256-bit hash values of given messages.

It can be used to prevent hash-flooding attacks or to authenticate short-lived messages. Additionally, it can be used as a fingerprinting function. HighwayHash is not a general-purpose cryptographic hash function (such as Blake2b, SHA-3 or SHA-2) and should not be used if strong collision resistance is required.
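To illustrate the hash-flooding use case, here is a minimal sketch of keyed bucket selection. It uses the standard library's hash/maphash as a stand-in keyed hash so that the example has no external dependencies; it does not call this package. The `table` type and its methods are hypothetical names chosen for illustration. The idea is the same when the 64-bit HighwayHash output with a secret 256-bit key is used instead: without the key, an attacker cannot predict which inputs land in the same bucket.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// table is a hypothetical hash table skeleton that only tracks how keys
// map to buckets. The seed plays the role of the secret hash key.
type table struct {
	seed    maphash.Seed
	buckets uint64
}

// newTable picks a fresh random seed, so bucket assignments differ
// between tables and between process runs.
func newTable(buckets uint64) *table {
	return &table{seed: maphash.MakeSeed(), buckets: buckets}
}

// bucket returns the bucket index for key under this table's secret seed.
func (t *table) bucket(key string) uint64 {
	var h maphash.Hash
	h.SetSeed(t.seed)
	h.WriteString(key)
	return h.Sum64() % t.buckets
}

func main() {
	tbl := newTable(16)
	for _, k := range []string{"alice", "bob", "carol"} {
		fmt.Printf("%-6s -> bucket %d\n", k, tbl.bucket(k))
	}
}
```

The bucket numbers printed will vary from run to run, which is the point: the mapping depends on a secret that is not visible to whoever supplies the keys.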

This repository contains a native Go version and optimized assembly implementations for Intel, ARM and ppc64le architectures.

High performance

HighwayHash is a SIMD hash function that is approximately 5x faster than SipHash, which is itself a fast and 'cryptographically strong' pseudo-random function designed by Aumasson and Bernstein.

HighwayHash uses a new way of mixing inputs with AVX2 multiply and permute instructions. The multiplications are 32x32 bit, giving 64-bit-wide results, and are therefore infeasible to reverse. Additionally, permuting equalizes the distribution of the resulting bytes. The algorithm outputs digests ranging from 64 bits up to 256 bits at no extra cost.
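The two building blocks described above can be sketched in scalar Go. This is a toy illustration only, not the HighwayHash round function: `mul32x32`, `permuteBytes` and `mixStep` are hypothetical helpers, and the byte reversal is a stand-in for whatever permutation the real algorithm applies. It shows how a 32x32-bit multiply fills a 64-bit word, and that different operand pairs can yield the same product, so the product alone does not identify its inputs.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mul32x32 multiplies the low 32 bits of a by the high 32 bits of b.
// Both operands are below 2^32, so the product always fits in 64 bits.
func mul32x32(a, b uint64) uint64 {
	return (a & 0xffffffff) * (b >> 32)
}

// permuteBytes is a stand-in permutation (a plain byte reversal).
// It is NOT the permutation used by HighwayHash.
func permuteBytes(x uint64) uint64 {
	return bits.ReverseBytes64(x)
}

// mixStep is a hypothetical toy round: absorb the input, fold in a
// 32x32-bit product of the state's own halves, then add a permuted copy
// so that every byte position is influenced by the others.
func mixStep(state, input uint64) uint64 {
	state += input
	state ^= mul32x32(state, state)
	return state + permuteBytes(state)
}

func main() {
	// The largest 32-bit operands produce a product that uses the full word.
	fmt.Printf("%#x\n", mul32x32(0xffffffff, 0xffffffff<<32))

	// Different operand pairs, same product.
	fmt.Println(mul32x32(2, 3<<32), mul32x32(3, 2<<32), mul32x32(1, 6<<32))

	fmt.Printf("%#x\n", mixStep(0x0123456789abcdef, 42))
}
```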

Stable

All three output sizes of HighwayHash have been declared stable as of January 2018. This means that the hash results for any given input message are guaranteed not to change.

Installation

Install: go get -u github.com/minio/highwayhash

Intel Performance

Below are the single core results on an Intel Core i7 (3.1 GHz) for 256 bit outputs:

BenchmarkSum256_16                204.17 MB/s
BenchmarkSum256_64               1040.63 MB/s
BenchmarkSum256_1K               8653.30 MB/s
BenchmarkSum256_8K              13476.07 MB/s
BenchmarkSum256_1M              14928.71 MB/s
BenchmarkSum256_5M              14180.04 MB/s
BenchmarkSum256_10M             12458.65 MB/s
BenchmarkSum256_25M             11927.25 MB/s

For moderately sized messages (around 1 MB), throughput tops out at about 15 GB/sec. Even for small messages (1K), performance already reaches approximately 60% of the maximum throughput.
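Throughput figures of this kind are typically obtained with Go's testing package by declaring the number of bytes processed per iteration. The sketch below is an assumption about the general approach, not the benchmark code behind the table above: it uses crypto/sha256 from the standard library as a stand-in hash, and `throughputMBps` is a hypothetical helper. Swapping in another hash function only requires changing the call inside the loop.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"testing"
)

// throughputMBps hashes a zero-filled buffer of the given size in a
// benchmark loop and returns the measured rate in MB/s (1 MB = 1e6 bytes,
// the same convention "go test -bench" uses when reporting MB/s).
func throughputMBps(size int) float64 {
	data := make([]byte, size)
	res := testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(size)) // bytes processed per iteration
		for i := 0; i < b.N; i++ {
			sha256.Sum256(data) // stand-in hash; result intentionally discarded
		}
	})
	return float64(res.Bytes) * float64(res.N) / 1e6 / res.T.Seconds()
}

func main() {
	for _, size := range []int{1 << 10, 1 << 20} {
		fmt.Printf("%8d bytes: %10.2f MB/s\n", size, throughputMBps(size))
	}
}
```

The numbers printed depend entirely on the machine and the hash being measured, so they are not comparable to the table above.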

ARM Performance

Below are the single core results on an EC2 c7g.4xlarge (Graviton3) instance for 256 bit outputs:

BenchmarkSum256_16                143.66 MB/s
BenchmarkSum256_64                628.75 MB/s
BenchmarkSum256_1K               3621.71 MB/s
BenchmarkSum256_8K               5039.64 MB/s
BenchmarkSum256_1M               5279.79 MB/s
BenchmarkSum256_5M               5474.60 MB/s
BenchmarkSum256_10M              5621.73 MB/s
BenchmarkSum256_25M              5250.47 MB/s

ppc64le Performance

The ppc64le accelerated version is roughly 10x faster than the non-optimized version:

benchmark              old MB/s     new MB/s     speedup
BenchmarkWrite_8K      531.19       5566.41      10.48x
BenchmarkSum64_8K      518.86       4971.88      9.58x
BenchmarkSum256_8K     502.45       4474.20      8.90x
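
The speedup column is simply the new throughput divided by the old one. A short sketch that recomputes it from the figures in the table (the `result` type and `speedup` function are illustrative names, not part of this package):

```go
package main

import "fmt"

// result holds one row of the ppc64le comparison table.
type result struct {
	name             string
	oldMBps, newMBps float64
}

// speedup returns how many times faster the new measurement is.
func speedup(r result) float64 {
	return r.newMBps / r.oldMBps
}

// ppc64le contains the figures copied from the table above.
var ppc64le = []result{
	{"BenchmarkWrite_8K", 531.19, 5566.41},
	{"BenchmarkSum64_8K", 518.86, 4971.88},
	{"BenchmarkSum256_8K", 502.45, 4474.20},
}

func main() {
	for _, r := range ppc64le {
		fmt.Printf("%-20s %.2fx\n", r.name, speedup(r))
	}
}
```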

Performance compared to other hashing techniques

On a Skylake CPU (3.0 GHz Xeon Platinum 8124M) the table below shows how HighwayHash compares to other hashing techniques for 5 MB messages (single core performance, all Golang implementations, see benchmark).

BenchmarkHighwayHash            11986.98 MB/s
BenchmarkSHA256_AVX512           3552.74 MB/s
BenchmarkBlake2b                  972.38 MB/s
BenchmarkSHA1                     950.64 MB/s
BenchmarkMD5                      684.18 MB/s
BenchmarkSHA512                   562.04 MB/s
BenchmarkSHA256                   383.07 MB/s

Note: the AVX512 version of SHA256 uses the multi-buffer crypto library technique developed by Intel; more details can be found in sha256-simd.

Qualitative assessment

We have performed a 'qualitative' assessment of how HighwayHash compares to Blake2b in terms of the distribution of the checksums for varying numbers of messages. It shows that HighwayHash behaves similarly to Blake2b, as the following graph illustrates:

[Figure: Hash Comparison Overview]

More information can be found in HashCompare.

Requirements

Go 1.11 or later is required, since earlier versions lack the assembly support needed for the different platforms.

Contributing

Contributions are welcome; please send PRs for any enhancements.