summaryrefslogtreecommitdiff
path: root/arch/x86/crypto/sha1_avx2_x86_64_asm.S
AgeCommit message (Collapse)Author
2014-03-25crypto: x86/sha1 - reduce size of the AVX2 asm implementationMathias Krause
There is really no need to page align sha1_transform_avx2. The default alignment is just fine. This is not the hot code but only the entry point, after all. Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Reviewed-by: H. Peter Anvin <hpa@linux.intel.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-03-25crypto: x86/sha1 - fix stack alignment of AVX2 variantMathias Krause
The AVX2 implementation might waste up to a page of stack memory because of a wrong alignment calculation. This will, in the worst case, increase the stack usage of sha1_transform_avx2() alone to 5.4 kB -- way to big for a kernel function. Even worse, it might also allocate *less* bytes than needed if the stack pointer is already aligned bacause in that case the 'sub %rbx, %rsp' is effectively moving the stack pointer upwards, not downwards. Fix those issues by changing and simplifying the alignment calculation to use a 32 byte alignment, the alignment really needed. Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Reviewed-by: H. Peter Anvin <hpa@linux.intel.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-03-21crypto: sha - SHA1 transform x86_64 AVX2chandramouli narayanan
This git patch adds x86_64 AVX2 optimization of SHA1 transform to crypto support. The patch has been tested with 3.14.0-rc1 kernel. On a Haswell desktop, with turbo disabled and all cpus running at maximum frequency, tcrypt shows AVX2 performance improvement from 3% for 256 bytes update to 16% for 1024 bytes update over AVX implementation. This patch adds sha1_avx2_transform(), the glue, build and configuration changes needed for AVX2 optimization of SHA1 transform to crypto support. sha1-ssse3 is one module which adds the necessary optimization support (SSSE3/AVX/AVX2) for the low-level SHA1 transform function. With better optimization support, transform function is overridden as the case may be. In the case of AVX2, due to performance reasons across datablock sizes, the AVX or AVX2 transform function is used at run-time as it suits best. The Makefile change therefore appends the necessary objects to the linkage. Due to this, the patch merely appends AVX2 transform to the existing build mix and Kconfig support and leaves the configuration build support as is. Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com> Reviewed-by: Marek Vasut <marex@denx.de> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>