BLAKE and 256-bit advanced vector extensions



Intel recently documented its AVX2 instruction set extension, which introduces
support for 256-bit wide single-instruction multiple-data (SIMD) integer arithmetic over
doublewords (32-bit) and quadwords (64-bit). This will enable Intel's future processors
(starting with the Haswell architecture, to be released in 2013) to fully support 4-way
SIMD computation of 64-bit ARX algorithms (32-bit has been supported since SSE2). AVX2
also introduces instructions with the potential to speed up cryptographic functions, such
as any-to-any permute and vectorized table lookup. In this paper we show how the AVX2
instructions will benefit the SHA-3 finalist hash function BLAKE, an algorithm that
naturally lends itself to 4-way 32- or 64-bit SIMD implementations thanks to its inherent
parallelism. We also wrote BLAKE-256 assembly code for AVX and AVX2, and measured for the
former a speed of 7.62 cycles per byte, setting a new speed record.
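
To illustrate the 4-way parallelism referred to above, the following is a minimal Python sketch of one BLAKE-256 column step, computed once with four scalar calls to the G function and once with row-wise operations on 4-lane vectors. It is not the assembly code measured in this paper. The helpers vadd, vxor and vrotr are hypothetical stand-ins for 128-bit SIMD instructions, and xs and ys are assumed to hold the message words already XORed with the round constants, so the message permutation is left out.

```python
MASK = 0xFFFFFFFF  # BLAKE-256 works on 32-bit words

def rotr(x, n):
    """Rotate a 32-bit word right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK

def g(a, b, c, d, x, y):
    """Scalar G function; x and y are pre-combined message/constant words."""
    a = (a + b + x) & MASK
    d = rotr(d ^ a, 16)
    c = (c + d) & MASK
    b = rotr(b ^ c, 12)
    a = (a + b + y) & MASK
    d = rotr(d ^ a, 8)
    c = (c + d) & MASK
    b = rotr(b ^ c, 7)
    return a, b, c, d

def column_step_scalar(v, xs, ys):
    """Apply G to each of the four columns of the 4x4 state, one at a time."""
    v = list(v)
    for i in range(4):
        v[i], v[4 + i], v[8 + i], v[12 + i] = g(
            v[i], v[4 + i], v[8 + i], v[12 + i], xs[i], ys[i])
    return v

# Hypothetical 4-lane "SIMD" helpers: each acts on all lanes independently.
def vadd(p, q):
    return [(s + t) & MASK for s, t in zip(p, q)]

def vxor(p, q):
    return [s ^ t for s, t in zip(p, q)]

def vrotr(p, n):
    return [rotr(s, n) for s in p]

def column_step_simd(v, xs, ys):
    """Same column step, with each state row treated as one 4-lane vector."""
    a, b, c, d = v[0:4], v[4:8], v[8:12], v[12:16]
    a = vadd(vadd(a, b), xs)
    d = vrotr(vxor(d, a), 16)
    c = vadd(c, d)
    b = vrotr(vxor(b, c), 12)
    a = vadd(vadd(a, b), ys)
    d = vrotr(vxor(d, a), 8)
    c = vadd(c, d)
    b = vrotr(vxor(b, c), 7)
    return a + b + c + d  # list concatenation: rows back into a flat state
```

Because the four columns never share a word, both functions return the same state for any input. With 32-bit words a row fills a 128-bit register; the same layout with the 64-bit words of BLAKE-512 needs 256-bit integer operations, which is what AVX2 provides.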


hash functions, SHA-3, implementation, SIMD


Third SHA-3 Candidate Conference 2012

