BLAKE and 256-bit advanced vector extensions



Intel recently documented its AVX2 instruction set extension, which introduces
support for 256-bit wide single-instruction multiple-data (SIMD) integer arithmetic over
double (32-bit) and quad (64-bit) words. This will enable Intel's future processors (starting
with the Haswell architecture, to be released in 2013) to fully support 4-way SIMD
computation of 64-bit ARX algorithms (32-bit has been supported since SSE2). AVX2 also
introduces instructions with the potential to speed up cryptographic functions, such as any-to-any
permute and vectorized table lookup. In this paper we show how the AVX2 instructions will
benefit the SHA-3 finalist hash function BLAKE, an algorithm that naturally lends itself
to 4-way 32- or 64-bit SIMD implementations thanks to its inherent parallelism. We also
wrote BLAKE-256 assembly code for AVX and AVX2, and measured for the former a speed
of 7.62 cycles per byte, setting a new speed record.
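
The 4-way parallelism mentioned in the abstract can be illustrated with a minimal sketch in portable C. It is not the paper's assembly code and uses no AVX or AVX2 instructions: the type `vec4` and the helpers `vadd`, `vxor`, `vrotr`, `g_scalar`, and `g_vec4` are hypothetical names introduced here, with `vec4` standing in for a 128-bit register holding four 32-bit words. The sketch assumes the message words have already been XORed with the round constants (inputs `mx` and `my`), and shows that one lane-wise pass computes the same result as four independent scalar evaluations of the BLAKE-256 G function.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits (0 < n < 32). */
static inline uint32_t rotr32(uint32_t x, unsigned n) {
    return (x >> n) | (x << (32 - n));
}

/* Hypothetical 4-lane type standing in for a 128-bit SIMD register. */
typedef struct { uint32_t lane[4]; } vec4;

/* Lane-wise addition modulo 2^32. */
static vec4 vadd(vec4 x, vec4 y) {
    for (int i = 0; i < 4; i++) x.lane[i] += y.lane[i];
    return x;
}

/* Lane-wise XOR. */
static vec4 vxor(vec4 x, vec4 y) {
    for (int i = 0; i < 4; i++) x.lane[i] ^= y.lane[i];
    return x;
}

/* Lane-wise right rotation by the same count in every lane. */
static vec4 vrotr(vec4 x, unsigned n) {
    for (int i = 0; i < 4; i++) x.lane[i] = rotr32(x.lane[i], n);
    return x;
}

/* Scalar BLAKE-256 G function on one column (a, b, c, d).
   mx and my are assumed to be message words already XORed with constants. */
static void g_scalar(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d,
                     uint32_t mx, uint32_t my) {
    *a += *b + mx;  *d = rotr32(*d ^ *a, 16);
    *c += *d;       *b = rotr32(*b ^ *c, 12);
    *a += *b + my;  *d = rotr32(*d ^ *a, 8);
    *c += *d;       *b = rotr32(*b ^ *c, 7);
}

/* The same add-rotate-xor sequence applied to four columns at once:
   each vec4 holds one row of the 4x4 state, so lane i is column i. */
static void g_vec4(vec4 *a, vec4 *b, vec4 *c, vec4 *d, vec4 mx, vec4 my) {
    *a = vadd(vadd(*a, *b), mx);  *d = vrotr(vxor(*d, *a), 16);
    *c = vadd(*c, *d);            *b = vrotr(vxor(*b, *c), 12);
    *a = vadd(vadd(*a, *b), my);  *d = vrotr(vxor(*d, *a), 8);
    *c = vadd(*c, *d);            *b = vrotr(vxor(*b, *c), 7);
}
```

In a real SIMD implementation each helper would map to one or a few vector instructions; the loops here only model that behavior so the sketch compiles with any C99 compiler.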


hash functions, SHA-3, implementation, SIMD


Third SHA-3 Candidate Conference 2012

Cited by

Year 2015 : 1 citation

 Aumasson, JP, W Meier, RCW Phan, and L Henzen. "The Hash Function BLAKE." (2015).

Year 2013 : 2 citations

 Kivilinna, Jussi. "Block Ciphers: Fast Implementations on x86-64 Architecture." (2013).

 Aumasson, Jean-Philippe, Samuel Neves, Zooko Wilcox-O’Hearn, and Christian Winnerlein. "BLAKE2: simpler, smaller, fast as MD5." (2013).