Manuel Bluhm via RT
2013-08-28 05:06:14 UTC
Hello all,
This patch is a contribution to OpenSSL.
It offers an efficient and constant-time implementation of the elliptic
curve point multiplication, for the following standard NIST/SECG binary
elliptic curves:
sect163k1, sect163r1, sect163r2, sect193r1, sect193r2, sect233k1,
sect233r1, sect239k1, sect283k1, sect283r1, sect409k1, sect409r1,
sect571k1, and sect571r1.
The patch implements several improvements at the algorithmic and the
coding levels (using SSE/AVX and PCLMULQDQ instructions).
Depending on the curve and architecture, this patch offers a speedup of
between 4x to 10x for ECDH and ECDSA, compared to the current
implementation of OpenSSL 1.0.1e.
Additionally, it adds side channel protection to avoid (cache) timing
attacks using a number of mechanisms.
The code is written in C and uses compiler intrinsics, for simplicity
and portability. The following results were obtained with gcc 4.8.1.
For detailed explanations of the rationale and algorithms of this code
refer to [1].
ECDH performance
--------------------------------------------------------------------------
The performance was measured by using openssl speed utility as follows:
$ openssl speed ecdh
The results for a Core i7-4770 CPU @ 3.40GHz (Haswell) in ECDH op/s:
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||-------------||----------||
|| || || ||
(nistk163) || 6586.9 || 67029.6 || 10.18 ||
(nistk233) || 5121.9 || 39441.3 || 7.70 ||
(nistk283) || 2825.7 || 27718.5 || 9.81 ||
(nistk409) || 1745.8 || 11634.2 || 6.66 ||
(nistk571) || 763.2 || 5930.9 || 7.77 ||
(nistb163) || 6382.5 || 60729.6 || 9.52 ||
(nistb233) || 4881.9 || 35230.4 || 7.22 ||
(nistb283) || 2651.6 || 24456.4 || 9.22 ||
(nistb409) || 1640.3 || 10228.6 || 6.24 ||
(nistb571) || 693.8 || 5172.1 || 7.45 ||
|| || || ||
------------||----------------||-------------||----------||
The results for a Core i5-3210M @ 2.50 GHz (Ivy Bridge) in ECDH op/s:
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||-------------||----------||
|| || || ||
(nistk163) || 3271.5 || 28087.3 || 8.59 ||
(nistk233) || 2504.9 || 15106.0 || 6.03 ||
(nistk283) || 1317.0 || 9030.5 || 6.86 ||
(nistk409) || 772.1 || 3880.8 || 5.03 ||
(nistk571) || 327.3 || 1821.1 || 5.56 ||
(nistb163) || 3067.9 || 24357.1 || 7.94 ||
(nistb233) || 2424.9 || 3147.3 || 5.42 ||
(nistb283) || 1227.0 || 7765.1 || 6.33 ||
(nistb409) || 709.7 || 3319.9 || 4.68 ||
(nistb571) || 296.2 || 1563.9 || 5.28 ||
|| || || ||
------------||----------------||-------------||----------||
ECDSA performance
--------------------------------------------------------------------------
The performance was measured by using openssl speed utility as follows:
$ openssl speed ecdsa
The results for a Core i7-4770 CPU @ 3.40GHz (Haswell):
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
-----------||-----------------||-------------------||-----------------||
|| sign/s verify/s || sign/s verify/s || sign/s verify/s ||
||-----------------||-------------------||-----------------||
(nistk163) || 6,465.3 3,159.5 || 36,872.6 26,508.4 || 5.70 8.39 ||
(nistk233) || 3,259.2 2,419.8 || 22,998.4 15,557.1 || 7.06 6.43 ||
(nistk283) || 2,204.7 1,355.7 || 16,884.9 11,003.2 || 7.66 8.12 ||
(nistk409) || 977.0 839.1 || 8,150.0 4,845.0 || 8.34 5.77 ||
(nistk571) || 466.4 368.3 || 4,424.1 2,533.6 || 9.49 6.88 ||
(nistb163) || 6,487.3 3,043.9 || 35,110.0 24,904.8 || 5.41 8.18 ||
(nistb233) || 3,279.2 2,348.0 || 21,468.8 14,095.6 || 6.55 6.00 ||
(nistb283) || 2,196.4 1,283.5 || 15,602.7 9,888.5 || 7.10 7.70 ||
(nistb409) || 976.3 786.9 || 7,423.1 4,361.9 || 7.60 5.54 ||
(nistb571) || 466.6 341.0 || 3,977.0 2,251.6 || 8.52 6.60 ||
|| || || ||
-----------||-----------------||-------------------||-----------------||
The results for a Core i5-3210M CPU @ 2.50 GHz (Ivy Bridge):
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
-----------||-----------------||-------------------||-----------------||
|| sign/s verify/s || sign/s verify/s || sign/s verify/s ||
||-----------------||-------------------||-----------------||
(nistk163) || 3,749.9 1,578.6 || 17,721.8 11,688.1 || 4.73 7.40 ||
(nistk233) || 1,881.7 1,211.6 || 10,359.0 6,439.4 || 5.51 5.31 ||
(nistk283) || 1,267.5 639.3 || 6,688.9 3,951.1 || 5.28 6.18 ||
(nistk409) || 542.2 361.9 || 3,140.9 1,757.1 || 5.79 4.86 ||
(nistk571) || 257.6 159.9 || 1,556.0 834.6 || 6.04 5.22 ||
(nistb163) || 3,766.5 1,514.5 || 16,203.5 10,453.8 || 4.30 6.90 ||
(nistb233) || 1,893.1 1,150.4 || 9,386.5 5,711.9 || 4.96 4.97 ||
(nistb283) || 1,265.7 594.2 || 5,962.3 3,445.5 || 4.71 5.80 ||
(nistb409) || 539.3 344.2 || 2,763.4 1,522.4 || 5.12 4.42 ||
(nistb571) || 257.2 145.7 || 1,354.8 724.9 || 5.27 4.98 ||
|| || || ||
-----------||-----------------||-------------------||-----------------||
Changes to OpenSSL-1.0.1e
--------------------------------------------------------------------------
crypto/bn:
bn_gf2m_xmm.c : New file, contains XMM GF2m implementation
bn.h : Added new function declarations
bn_gf2m.c : Added constant time bn operations
Makefile : Added bn_gf2m_xmm.c to makefile
crypto/ec:
ec2_nist_mult.c: New file, implements Montgomery point multiplication
ec2_nist.c : New file, implements EC methods
ec2_nist_prec.c: New file, implements method to get precomputated values
ec.h : Added function declarations (ec_methods)
ec_lcl.h : Added function declarations (all functions in the ec_method)
ec_curve.c: Added new EC methods to builtin curves
Makefile : Added new files to makefile
Configuration flags
--------------------------------------------------------------------------
-DOPENSSL_FAST_EC2M : Enable the fast implementation of binary curves
-DFAST_PCLMUL : Enable the pclmul reduction for pentanomial curves
-mpclmul : Enable pclmulqdq
-msse4 : Enable SSE4
-mavx : Enable AVX
-mavx2 : Enable AVX2
-march=native : Enable all instruction subsets
The results above have been created with the following configurations:
(1) Core i7-4770 @ 3.40GHz (Haswell):
./config -mavx2 -mpclmul -DFAST_PCLMUL -DOPENSSL_FAST_EC2M
(2) Core i5-3210M @ 2.50 GHz (Ivy Bridge):
./config -mavx -mpclmul -DOPENSSL_FAST_EC2M
[1] M. Bluhm, S. Gueron, Fast Software Implementation of Binary Elliptic
Curve Cryptography (2013; to be published)
Developers and authors:
***************************************************************************
Manuel Bluhm (1) and Shay Gueron (2, 3)
(1) Ruhr University Bochum, Germany
(2) Intel Corporation, Israel Development Center, Haifa, Israel
(3) University of Haifa, Israel
***************************************************************************
This patch is a contribution to OpenSSL.
It offers an efficient and constant-time implementation of the elliptic
curve point multiplication, for the following standard NIST/SECG binary
elliptic curves:
sect163k1, sect163r1, sect163r2, sect193r1, sect193r2, sect233k1,
sect233r1, sect239k1, sect283k1, sect283r1, sect409k1, sect409r1,
sect571k1, and sect571r1.
The patch implements several improvements at the algorithmic and the
coding levels (using SSE/AVX and PCLMULQDQ instructions).
Depending on the curve and architecture, this patch offers a speedup of
between 4x to 10x for ECDH and ECDSA, compared to the current
implementation of OpenSSL 1.0.1e.
Additionally, it adds side channel protection to avoid (cache) timing
attacks using a number of mechanisms.
The code is written in C and uses compiler intrinsics, for simplicity
and portability. The following results were obtained with gcc 4.8.1.
For detailed explanations of the rationale and algorithms of this code
refer to [1].
ECDH performance
--------------------------------------------------------------------------
The performance was measured by using openssl speed utility as follows:
$ openssl speed ecdh
The results for a Core i7-4770 CPU @ 3.40GHz (Haswell) in ECDH op/s:
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||-------------||----------||
|| || || ||
(nistk163) || 6586.9 || 67029.6 || 10.18 ||
(nistk233) || 5121.9 || 39441.3 || 7.70 ||
(nistk283) || 2825.7 || 27718.5 || 9.81 ||
(nistk409) || 1745.8 || 11634.2 || 6.66 ||
(nistk571) || 763.2 || 5930.9 || 7.77 ||
(nistb163) || 6382.5 || 60729.6 || 9.52 ||
(nistb233) || 4881.9 || 35230.4 || 7.22 ||
(nistb283) || 2651.6 || 24456.4 || 9.22 ||
(nistb409) || 1640.3 || 10228.6 || 6.24 ||
(nistb571) || 693.8 || 5172.1 || 7.45 ||
|| || || ||
------------||----------------||-------------||----------||
The results for a Core i5-3210M @ 2.50 GHz (Ivy Bridge) in ECDH op/s:
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||-------------||----------||
|| || || ||
(nistk163) || 3271.5 || 28087.3 || 8.59 ||
(nistk233) || 2504.9 || 15106.0 || 6.03 ||
(nistk283) || 1317.0 || 9030.5 || 6.86 ||
(nistk409) || 772.1 || 3880.8 || 5.03 ||
(nistk571) || 327.3 || 1821.1 || 5.56 ||
(nistb163) || 3067.9 || 24357.1 || 7.94 ||
(nistb233) || 2424.9 || 3147.3 || 5.42 ||
(nistb283) || 1227.0 || 7765.1 || 6.33 ||
(nistb409) || 709.7 || 3319.9 || 4.68 ||
(nistb571) || 296.2 || 1563.9 || 5.28 ||
|| || || ||
------------||----------------||-------------||----------||
ECDSA performance
--------------------------------------------------------------------------
The performance was measured by using openssl speed utility as follows:
$ openssl speed ecdsa
The results for a Core i7-4770 CPU @ 3.40GHz (Haswell):
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
-----------||-----------------||-------------------||-----------------||
|| sign/s verify/s || sign/s verify/s || sign/s verify/s ||
||-----------------||-------------------||-----------------||
(nistk163) || 6,465.3 3,159.5 || 36,872.6 26,508.4 || 5.70 8.39 ||
(nistk233) || 3,259.2 2,419.8 || 22,998.4 15,557.1 || 7.06 6.43 ||
(nistk283) || 2,204.7 1,355.7 || 16,884.9 11,003.2 || 7.66 8.12 ||
(nistk409) || 977.0 839.1 || 8,150.0 4,845.0 || 8.34 5.77 ||
(nistk571) || 466.4 368.3 || 4,424.1 2,533.6 || 9.49 6.88 ||
(nistb163) || 6,487.3 3,043.9 || 35,110.0 24,904.8 || 5.41 8.18 ||
(nistb233) || 3,279.2 2,348.0 || 21,468.8 14,095.6 || 6.55 6.00 ||
(nistb283) || 2,196.4 1,283.5 || 15,602.7 9,888.5 || 7.10 7.70 ||
(nistb409) || 976.3 786.9 || 7,423.1 4,361.9 || 7.60 5.54 ||
(nistb571) || 466.6 341.0 || 3,977.0 2,251.6 || 8.52 6.60 ||
|| || || ||
-----------||-----------------||-------------------||-----------------||
The results for a Core i5-3210M CPU @ 2.50 GHz (Ivy Bridge):
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
-----------||-----------------||-------------------||-----------------||
|| sign/s verify/s || sign/s verify/s || sign/s verify/s ||
||-----------------||-------------------||-----------------||
(nistk163) || 3,749.9 1,578.6 || 17,721.8 11,688.1 || 4.73 7.40 ||
(nistk233) || 1,881.7 1,211.6 || 10,359.0 6,439.4 || 5.51 5.31 ||
(nistk283) || 1,267.5 639.3 || 6,688.9 3,951.1 || 5.28 6.18 ||
(nistk409) || 542.2 361.9 || 3,140.9 1,757.1 || 5.79 4.86 ||
(nistk571) || 257.6 159.9 || 1,556.0 834.6 || 6.04 5.22 ||
(nistb163) || 3,766.5 1,514.5 || 16,203.5 10,453.8 || 4.30 6.90 ||
(nistb233) || 1,893.1 1,150.4 || 9,386.5 5,711.9 || 4.96 4.97 ||
(nistb283) || 1,265.7 594.2 || 5,962.3 3,445.5 || 4.71 5.80 ||
(nistb409) || 539.3 344.2 || 2,763.4 1,522.4 || 5.12 4.42 ||
(nistb571) || 257.2 145.7 || 1,354.8 724.9 || 5.27 4.98 ||
|| || || ||
-----------||-----------------||-------------------||-----------------||
Changes to OpenSSL-1.0.1e
--------------------------------------------------------------------------
crypto/bn:
bn_gf2m_xmm.c : New file, contains XMM GF2m implementation
bn.h : Added new function declarations
bn_gf2m.c : Added constant time bn operations
Makefile : Added bn_gf2m_xmm.c to makefile
crypto/ec:
ec2_nist_mult.c: New file, implements Montgomery point multiplication
ec2_nist.c : New file, implements EC methods
ec2_nist_prec.c: New file, implements method to get precomputated values
ec.h : Added function declarations (ec_methods)
ec_lcl.h : Added function declarations (all functions in the ec_method)
ec_curve.c: Added new EC methods to builtin curves
Makefile : Added new files to makefile
Configuration flags
--------------------------------------------------------------------------
-DOPENSSL_FAST_EC2M : Enable the fast implementation of binary curves
-DFAST_PCLMUL : Enable the pclmul reduction for pentanomial curves
-mpclmul : Enable pclmulqdq
-msse4 : Enable SSE4
-mavx : Enable AVX
-mavx2 : Enable AVX2
-march=native : Enable all instruction subsets
The results above have been created with the following configurations:
(1) Core i7-4770 @ 3.40GHz (Haswell):
./config -mavx2 -mpclmul -DFAST_PCLMUL -DOPENSSL_FAST_EC2M
(2) Core i5-3210M @ 2.50 GHz (Ivy Bridge):
./config -mavx -mpclmul -DOPENSSL_FAST_EC2M
[1] M. Bluhm, S. Gueron, Fast Software Implementation of Binary Elliptic
Curve Cryptography (2013; to be published)
Developers and authors:
***************************************************************************
Manuel Bluhm (1) and Shay Gueron (2, 3)
(1) Ruhr University Bochum, Germany
(2) Intel Corporation, Israel Development Center, Haifa, Israel
(3) University of Haifa, Israel
***************************************************************************