Discussion:
[openssl.org #3117] [PATCH] A fast vectorized implementation of binary elliptic curves on x86-64 processors
Manuel Bluhm via RT
2013-08-28 05:06:14 UTC
Hello all,

This patch is a contribution to OpenSSL.

It offers an efficient and constant-time implementation of the elliptic
curve point multiplication, for the following standard NIST/SECG binary
elliptic curves:
sect163k1, sect163r1, sect163r2, sect193r1, sect193r2, sect233k1,
sect233r1, sect239k1, sect283k1, sect283r1, sect409k1, sect409r1,
sect571k1, and sect571r1.

The patch implements several improvements at the algorithmic and the
coding levels (using SSE/AVX and PCLMULQDQ instructions).
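
To illustrate what the PCLMULQDQ instruction computes, here is a portable
reference sketch in plain C. It is not code from the patch, and the names
clmul64 and clmul128 are invented for this example. A carry-less
multiplication treats each 64-bit word as a polynomial over GF(2) and
combines the partial products with XOR instead of addition; the hardware
instruction delivers the same 128-bit result in a single operation.

```c
#include <stdint.h>

/* 128-bit result of a carry-less 64x64-bit multiplication
 * (hypothetical helper type for this sketch). */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} clmul128;

/* Multiply two polynomials over GF(2), each packed into a 64-bit word
 * (bit i is the coefficient of x^i). Partial products are combined
 * with XOR, so no carries propagate between bit positions. */
static clmul128 clmul64(uint64_t a, uint64_t b)
{
    clmul128 r = {0, 0};
    for (int i = 0; i < 64; i++) {
        /* mask is all-ones when bit i of b is set, otherwise zero */
        uint64_t mask = (uint64_t)0 - ((b >> i) & 1);
        r.lo ^= (a << i) & mask;
        /* bits of a shifted past position 63 land in the high word;
         * i is a public loop counter, so this branch leaks nothing */
        if (i != 0)
            r.hi ^= (a >> (64 - i)) & mask;
    }
    return r;
}
```

For example, clmul64(3, 3) gives 5 rather than 9, because
(x + 1)^2 = x^2 + 1 over GF(2).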

Depending on the curve and architecture, this patch offers a speedup of
between 4x and 10x for ECDH and ECDSA, compared to the current
implementation in OpenSSL 1.0.1e.
Additionally, it adds side channel protection to avoid (cache) timing
attacks using a number of mechanisms.
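
One common mechanism against cache-timing attacks is to read every entry
of a precomputed table and keep the wanted one with a mask, so that the
memory access pattern does not depend on the secret index. The sketch
below shows that general idea only; it is an assumption about the kind of
technique meant here, not code from the patch, and ct_eq_mask and
ct_table_select are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Return all-ones if a == b and zero otherwise, without branching. */
static uint64_t ct_eq_mask(uint64_t a, uint64_t b)
{
    uint64_t x = a ^ b;                                /* zero iff a == b */
    uint64_t nonzero = (x | ((uint64_t)0 - x)) >> 63;  /* 1 iff x != 0 */
    return nonzero - 1;              /* 0 -> all-ones, 1 -> zero */
}

/* Copy table[secret_idx] (an entry of `words` 64-bit words) into out.
 * Every entry is read on every call, so the sequence of addresses
 * touched is the same for all values of secret_idx. */
static void ct_table_select(uint64_t *out, const uint64_t *table,
                            size_t entries, size_t words,
                            size_t secret_idx)
{
    for (size_t w = 0; w < words; w++)
        out[w] = 0;
    for (size_t e = 0; e < entries; e++) {
        uint64_t mask = ct_eq_mask((uint64_t)e, (uint64_t)secret_idx);
        for (size_t w = 0; w < words; w++)
            out[w] |= table[e * words + w] & mask;
    }
}
```

Note that an optimizing compiler is free to transform such code, so a
real implementation would need to inspect the generated assembly.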

The code is written in C and uses compiler intrinsics, for simplicity
and portability. The following results were obtained with gcc 4.8.1.

For detailed explanations of the rationale and algorithms of this code
refer to [1].


ECDH performance
--------------------------------------------------------------------------

Performance was measured with the openssl speed utility as follows:

$ openssl speed ecdh


The results for a Core i7-4770 CPU @ 3.40GHz (Haswell) in ECDH op/s:

Curve       || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||------------||---------||
(nistk163)  ||         6586.9 ||    67029.6 ||   10.18 ||
(nistk233)  ||         5121.9 ||    39441.3 ||    7.70 ||
(nistk283)  ||         2825.7 ||    27718.5 ||    9.81 ||
(nistk409)  ||         1745.8 ||    11634.2 ||    6.66 ||
(nistk571)  ||          763.2 ||     5930.9 ||    7.77 ||
(nistb163)  ||         6382.5 ||    60729.6 ||    9.52 ||
(nistb233)  ||         4881.9 ||    35230.4 ||    7.22 ||
(nistb283)  ||         2651.6 ||    24456.4 ||    9.22 ||
(nistb409)  ||         1640.3 ||    10228.6 ||    6.24 ||
(nistb571)  ||          693.8 ||     5172.1 ||    7.45 ||
------------||----------------||------------||---------||


The results for a Core i5-3210M @ 2.50 GHz (Ivy Bridge) in ECDH op/s:

Curve       || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||------------||---------||
(nistk163)  ||         3271.5 ||    28087.3 ||    8.59 ||
(nistk233)  ||         2504.9 ||    15106.0 ||    6.03 ||
(nistk283)  ||         1317.0 ||     9030.5 ||    6.86 ||
(nistk409)  ||          772.1 ||     3880.8 ||    5.03 ||
(nistk571)  ||          327.3 ||     1821.1 ||    5.56 ||
(nistb163)  ||         3067.9 ||    24357.1 ||    7.94 ||
(nistb233)  ||         2424.9 ||    13147.3 ||    5.42 ||
(nistb283)  ||         1227.0 ||     7765.1 ||    6.33 ||
(nistb409)  ||          709.7 ||     3319.9 ||    4.68 ||
(nistb571)  ||          296.2 ||     1563.9 ||    5.28 ||
------------||----------------||------------||---------||



ECDSA performance
--------------------------------------------------------------------------

Performance was measured with the openssl speed utility as follows:

$ openssl speed ecdsa


The results for a Core i7-4770 CPU @ 3.40GHz (Haswell):

Curve      ||  OpenSSL 1.0.1e   ||    This patch     ||     Speedup     ||
           ||   sign/s verify/s ||   sign/s verify/s || sign/s verify/s ||
-----------||-------------------||-------------------||-----------------||
(nistk163) ||  6,465.3  3,159.5 || 36,872.6 26,508.4 ||   5.70     8.39 ||
(nistk233) ||  3,259.2  2,419.8 || 22,998.4 15,557.1 ||   7.06     6.43 ||
(nistk283) ||  2,204.7  1,355.7 || 16,884.9 11,003.2 ||   7.66     8.12 ||
(nistk409) ||    977.0    839.1 ||  8,150.0  4,845.0 ||   8.34     5.77 ||
(nistk571) ||    466.4    368.3 ||  4,424.1  2,533.6 ||   9.49     6.88 ||
(nistb163) ||  6,487.3  3,043.9 || 35,110.0 24,904.8 ||   5.41     8.18 ||
(nistb233) ||  3,279.2  2,348.0 || 21,468.8 14,095.6 ||   6.55     6.00 ||
(nistb283) ||  2,196.4  1,283.5 || 15,602.7  9,888.5 ||   7.10     7.70 ||
(nistb409) ||    976.3    786.9 ||  7,423.1  4,361.9 ||   7.60     5.54 ||
(nistb571) ||    466.6    341.0 ||  3,977.0  2,251.6 ||   8.52     6.60 ||
-----------||-------------------||-------------------||-----------------||


The results for a Core i5-3210M CPU @ 2.50 GHz (Ivy Bridge):

Curve      ||  OpenSSL 1.0.1e   ||    This patch     ||     Speedup     ||
           ||   sign/s verify/s ||   sign/s verify/s || sign/s verify/s ||
-----------||-------------------||-------------------||-----------------||
(nistk163) ||  3,749.9  1,578.6 || 17,721.8 11,688.1 ||   4.73     7.40 ||
(nistk233) ||  1,881.7  1,211.6 || 10,359.0  6,439.4 ||   5.51     5.31 ||
(nistk283) ||  1,267.5    639.3 ||  6,688.9  3,951.1 ||   5.28     6.18 ||
(nistk409) ||    542.2    361.9 ||  3,140.9  1,757.1 ||   5.79     4.86 ||
(nistk571) ||    257.6    159.9 ||  1,556.0    834.6 ||   6.04     5.22 ||
(nistb163) ||  3,766.5  1,514.5 || 16,203.5 10,453.8 ||   4.30     6.90 ||
(nistb233) ||  1,893.1  1,150.4 ||  9,386.5  5,711.9 ||   4.96     4.97 ||
(nistb283) ||  1,265.7    594.2 ||  5,962.3  3,445.5 ||   4.71     5.80 ||
(nistb409) ||    539.3    344.2 ||  2,763.4  1,522.4 ||   5.12     4.42 ||
(nistb571) ||    257.2    145.7 ||  1,354.8    724.9 ||   5.27     4.98 ||
-----------||-------------------||-------------------||-----------------||



Changes to OpenSSL-1.0.1e
--------------------------------------------------------------------------

crypto/bn:

bn_gf2m_xmm.c : New file, contains the XMM GF(2^m) implementation
bn.h          : Added new function declarations
bn_gf2m.c     : Added constant-time bn operations
Makefile      : Added bn_gf2m_xmm.c to the makefile

crypto/ec:

ec2_nist_mult.c : New file, implements Montgomery point multiplication
ec2_nist.c      : New file, implements the EC methods
ec2_nist_prec.c : New file, implements the method to get precomputed values

ec.h            : Added function declarations (ec_methods)
ec_lcl.h        : Added function declarations (all functions in the ec_method)
ec_curve.c      : Added new EC methods to the built-in curves

Makefile        : Added new files to the makefile
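
For readers unfamiliar with Montgomery point multiplication, the control
flow of a Montgomery ladder can be sketched as follows. This is a toy
under stated assumptions, not the contents of ec2_nist_mult.c: the group
is multiplication modulo a small prime standing in for the curve group,
and ct_cswap, group_op and ladder are hypothetical names. The point is
only that every scalar bit triggers the same two group operations, with
the secret bit steering a branch-free swap.

```c
#include <stdint.h>

/* Swap *a and *b when swap == 1, leave them alone when swap == 0.
 * No branch depends on the value of swap. */
static void ct_cswap(uint64_t *a, uint64_t *b, uint64_t swap)
{
    uint64_t mask = (uint64_t)0 - swap;
    uint64_t t = (*a ^ *b) & mask;
    *a ^= t;
    *b ^= t;
}

/* Toy group operation: multiplication modulo p, with 0 < p < 2^32 so
 * the product fits in 64 bits. On a curve this would be a point
 * addition or doubling. The % operator is not guaranteed to run in
 * constant time; it is used here for brevity only. */
static uint64_t group_op(uint64_t x, uint64_t y, uint64_t p)
{
    return (x * y) % p;
}

/* Montgomery ladder: returns g^k mod p. The invariant r1 == r0 * g
 * holds after every iteration, and all 64 iterations always run. */
static uint64_t ladder(uint64_t g, uint64_t k, uint64_t p)
{
    uint64_t r0 = 1 % p;   /* identity element */
    uint64_t r1 = g % p;
    for (int i = 63; i >= 0; i--) {
        uint64_t bit = (k >> i) & 1;
        ct_cswap(&r0, &r1, bit);
        r1 = group_op(r0, r1, p);   /* "add" step    */
        r0 = group_op(r0, r0, p);   /* "double" step */
        ct_cswap(&r0, &r1, bit);
    }
    return r0;
}
```

As a quick check, ladder(2, 10, 1000003) returns 1024.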




Configuration flags
--------------------------------------------------------------------------

-DOPENSSL_FAST_EC2M : Enable the fast implementation of binary curves
-DFAST_PCLMUL       : Enable the pclmul reduction for pentanomial curves

-mpclmul            : Enable pclmulqdq
-msse4              : Enable SSE4
-mavx               : Enable AVX
-mavx2              : Enable AVX2
-march=native       : Enable all instruction set extensions of the build machine
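
To show what reduction modulo a pentanomial means, here is a small sketch
in the toy field GF(2^8) with the well-known pentanomial
x^8 + x^4 + x^3 + x + 1. The field is chosen only because it fits in one
byte; it is not one of the curve fields, and gf256_mul is a hypothetical
name, not a function from the patch. The product is formed without
carries, then every bit of degree 8 or higher is folded back down using
x^8 = x^4 + x^3 + x + 1.

```c
#include <stdint.h>

/* Multiply a and b in GF(2^8) = GF(2)[x] / (x^8 + x^4 + x^3 + x + 1). */
static uint8_t gf256_mul(uint8_t a, uint8_t b)
{
    /* Step 1: carry-less product of degree at most 14. */
    uint16_t prod = 0;
    for (int i = 0; i < 8; i++) {
        uint16_t mask = (uint16_t)(0 - ((b >> i) & 1));
        prod ^= (uint16_t)((uint16_t)a << i) & mask;
    }
    /* Step 2: reduction. 0x11B encodes the pentanomial; XORing it in,
     * shifted so its top bit lines up with bit i, clears bit i and
     * adds the four lower-degree terms. Working from the top bit down
     * handles terms that land at degree 8 or above again. */
    for (int i = 14; i >= 8; i--) {
        uint16_t mask = (uint16_t)(0 - ((prod >> i) & 1));
        prod ^= (uint16_t)(0x11B << (i - 8)) & mask;
    }
    return (uint8_t)prod;
}
```

For example, gf256_mul(0x57, 0x83) returns 0xC1.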


The results above were produced with the following configurations:

(1) Core i7-4770 @ 3.40GHz (Haswell):

./config -mavx2 -mpclmul -DFAST_PCLMUL -DOPENSSL_FAST_EC2M

(2) Core i5-3210M @ 2.50 GHz (Ivy Bridge):

./config -mavx -mpclmul -DOPENSSL_FAST_EC2M




[1] M. Bluhm, S. Gueron, Fast Software Implementation of Binary Elliptic
Curve Cryptography (2013; to be published)

Developers and authors:
***************************************************************************
Manuel Bluhm (1) and Shay Gueron (2, 3)
(1) Ruhr University Bochum, Germany
(2) Intel Corporation, Israel Development Center, Haifa, Israel
(3) University of Haifa, Israel
***************************************************************************
Andrey Kulikov
2013-09-02 18:47:19 UTC
Dear Manuel,

Exciting news!
While your paper is still unpublished, could you please advise whether
anything even nearly similar is possible for curves over prime fields
(e.g. the secp* curves)?

Best regards,
Andrey
Manuel Bluhm
2013-09-03 08:05:32 UTC
Dear Andrey,

The scope of this work is limited to binary curves only. There might be
ways to speed up prime curves with vector instructions, but not in the
same way as we did for binary curves. Most of the performance gain comes
from the fast binary field arithmetic implementation, which is different
from prime field arithmetic.

Best regards,
Manuel
Manuel Bluhm
2013-09-02 20:56:58 UTC
Dear Andrey,

The scope of this work is limited to binary curves only. However, there
might be ways to speed up prime curves with vector instructions.

Best regards,
Manuel
David Jacobson
2013-09-03 02:57:00 UTC
Let me chime in with an amendment to Andrey's message. It would be nice
if the tables included performance numbers for prime modulus curves,
even if the techniques of Manuel's patch are not applicable there.
Many people would like to know whether there are significant performance
gains to be had by switching from GF(p) to GF(2^k) curves.

--David Jacobson
Post by Andrey Kulikov
Dear Manuel,
Exciting news!
While your paper still unpublished, could you please advice, it there
anything even nearly similar possible for curves over primary fields?
(e.g. curves secp* )
Best regards,
Andrey
Hello all,
This patch is a contribution to OpenSSL.
It offers an efficient and constant-time implementation of the
elliptic
curve point multiplication, for the following standard NIST/SECG
binary
sect163k1, sect163r1, sect163r2, sect193r1, sect193r2, sect233k1,
sect233r1, sect239k1, sect283k1, sect283r1, sect409k1, sect409r1,
sect571k1, and sect571r1.
The patch implements several improvements at the algorithmic and the
coding levels (using SSE/AVX and PCLMULQDQ instructions).
Depending on the curve and architecture, this patch offers a
speedup of
between 4x to 10x for ECDH and ECDSA, compared to the current
implementation of OpenSSL 1.0.1e.
Additionally, it adds side channel protection to avoid (cache) timing
attacks using a number of mechanisms.
The code is written in C and uses compiler intrinsics, for simplicity
and portability. The following results were obtained with gcc 4.8.1.
For detailed explanations of the rationale and algorithms of this code
refer to [1].
ECDH performance
--------------------------------------------------------------------------
The performance was measured by using openssl speed utility as
$ openssl speed ecdh
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||-------------||----------||
|| || || ||
(nistk163) || 6586.9 || 67029.6 || 10.18 ||
(nistk233) || 5121.9 || 39441.3 || 7.70 ||
(nistk283) || 2825.7 || 27718.5 || 9.81 ||
(nistk409) || 1745.8 || 11634.2 || 6.66 ||
(nistk571) || 763.2 || 5930.9 || 7.77 ||
(nistb163) || 6382.5 || 60729.6 || 9.52 ||
(nistb233) || 4881.9 || 35230.4 || 7.22 ||
(nistb283) || 2651.6 || 24456.4 || 9.22 ||
(nistb409) || 1640.3 || 10228.6 || 6.24 ||
(nistb571) || 693.8 || 5172.1 || 7.45 ||
|| || || ||
------------||----------------||-------------||----------||
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
------------||----------------||-------------||----------||
|| || || ||
(nistk163) || 3271.5 || 28087.3 || 8.59 ||
(nistk233) || 2504.9 || 15106.0 || 6.03 ||
(nistk283) || 1317.0 || 9030.5 || 6.86 ||
(nistk409) || 772.1 || 3880.8 || 5.03 ||
(nistk571) || 327.3 || 1821.1 || 5.56 ||
(nistb163) || 3067.9 || 24357.1 || 7.94 ||
(nistb233) || 2424.9 || 3147.3 || 5.42 ||
(nistb283) || 1227.0 || 7765.1 || 6.33 ||
(nistb409) || 709.7 || 3319.9 || 4.68 ||
(nistb571) || 296.2 || 1563.9 || 5.28 ||
|| || || ||
------------||----------------||-------------||----------||
ECDSA performance
--------------------------------------------------------------------------
The performance was measured by using openssl speed utility as
$ openssl speed ecdsa
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
-----------||-----------------||-------------------||-----------------||
|| sign/s verify/s || sign/s verify/s || sign/s
verify/s ||
||-----------------||-------------------||-----------------||
(nistk163) || 6,465.3 3,159.5 || 36,872.6 26,508.4 || 5.70
8.39 ||
(nistk233) || 3,259.2 2,419.8 || 22,998.4 15,557.1 || 7.06
6.43 ||
(nistk283) || 2,204.7 1,355.7 || 16,884.9 11,003.2 || 7.66
8.12 ||
(nistk409) || 977.0 839.1 || 8,150.0 4,845.0 || 8.34
5.77 ||
(nistk571) || 466.4 368.3 || 4,424.1 2,533.6 || 9.49
6.88 ||
(nistb163) || 6,487.3 3,043.9 || 35,110.0 24,904.8 || 5.41
8.18 ||
(nistb233) || 3,279.2 2,348.0 || 21,468.8 14,095.6 || 6.55
6.00 ||
(nistb283) || 2,196.4 1,283.5 || 15,602.7 9,888.5 || 7.10
7.70 ||
(nistb409) || 976.3 786.9 || 7,423.1 4,361.9 || 7.60
5.54 ||
(nistb571) || 466.6 341.0 || 3,977.0 2,251.6 || 8.52
6.60 ||
|| || || ||
-----------||-----------------||-------------------||-----------------||
Curve || OpenSSL 1.0.1e || This patch || Speedup ||
-----------||-----------------||-------------------||-----------------||
|| sign/s verify/s || sign/s verify/s || sign/s
verify/s ||
||-----------------||-------------------||-----------------||
(nistk163) || 3,749.9 1,578.6 || 17,721.8 11,688.1 || 4.73
7.40 ||
(nistk233) || 1,881.7 1,211.6 || 10,359.0 6,439.4 || 5.51
5.31 ||
(nistk283) || 1,267.5 639.3 || 6,688.9 3,951.1 || 5.28
6.18 ||
(nistk409) || 542.2 361.9 || 3,140.9 1,757.1 || 5.79
4.86 ||
(nistk571) || 257.6 159.9 || 1,556.0 834.6 || 6.04
5.22 ||
(nistb163) || 3,766.5 1,514.5 || 16,203.5 10,453.8 || 4.30
6.90 ||
(nistb233) || 1,893.1 1,150.4 || 9,386.5 5,711.9 || 4.96
4.97 ||
(nistb283) || 1,265.7 594.2 || 5,962.3 3,445.5 || 4.71
5.80 ||
(nistb409) || 539.3 344.2 || 2,763.4 1,522.4 || 5.12
4.42 ||
(nistb571) || 257.2 145.7 || 1,354.8 724.9 || 5.27
4.98 ||
|| || || ||
-----------||-----------------||-------------------||-----------------||

Changes to OpenSSL-1.0.1e
--------------------------------------------------------------------------
bn_gf2m_xmm.c   : New file, contains XMM GF2m implementation
bn.h            : Added new function declarations
bn_gf2m.c       : Added constant time bn operations
Makefile        : Added bn_gf2m_xmm.c to makefile

ec2_nist_mult.c : New file, implements Montgomery point multiplication
ec2_nist.c      : New file, implements EC methods
ec2_nist_prec.c : New file, implements method to get precomputed values
ec.h            : Added function declarations (ec_methods)
ec_lcl.h        : Added function declarations (all functions in the ec_method)
ec_curve.c      : Added new EC methods to builtin curves
Makefile        : Added new files to makefile

Configuration flags
--------------------------------------------------------------------------
-DOPENSSL_FAST_EC2M : Enable the fast implementation of binary curves
-DFAST_PCLMUL       : Enable the pclmul reduction for pentanomial curves
-mpclmul : Enable pclmulqdq
-msse4 : Enable SSE4
-mavx : Enable AVX
-mavx2 : Enable AVX2
-march=native : Enable all instruction subsets
./config -mavx2 -mpclmul -DFAST_PCLMUL -DOPENSSL_FAST_EC2M
./config -mavx -mpclmul -DOPENSSL_FAST_EC2M
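
For context on what a pclmul reduction computes: PCLMULQDQ performs a carry-less (XOR) multiplication of two 64-bit polynomials over GF(2), and the product is then reduced modulo the field polynomial. The pure-Python sketch below illustrates that arithmetic for the 163-bit field, whose NIST reduction pentanomial is x^163 + x^7 + x^6 + x^3 + 1. It is a bit-by-bit illustration only and not the patch's vectorized code; the function names are invented for this note.

```python
# Illustration only: polynomials over GF(2) are stored as Python ints,
# with bit i holding the coefficient of x^i.

M = 163
# NIST reduction pentanomial for the 163-bit binary field:
# f(x) = x^163 + x^7 + x^6 + x^3 + 1
F = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def clmul(a: int, b: int) -> int:
    """Carry-less product: shift-and-XOR instead of shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce_mod(c: int, f: int = F, m: int = M) -> int:
    """Reduce c modulo f by cancelling set bits from the top down."""
    for i in range(c.bit_length() - 1, m - 1, -1):
        if (c >> i) & 1:
            c ^= f << (i - m)
    return c


def gf2m_mul(a: int, b: int) -> int:
    """Multiply two field elements of GF(2^163)."""
    return reduce_mod(clmul(a, b))


if __name__ == "__main__":
    # (x + 1)^2 = x^2 + 1 over GF(2): the cross terms cancel.
    print(bin(clmul(0b11, 0b11)))
    # x^162 * x = x^163, which reduces to x^7 + x^6 + x^3 + 1.
    print(hex(gf2m_mul(1 << 162, 0b10)))
```

A hardware implementation replaces the loop in `clmul` with a handful of 64x64-bit PCLMULQDQ products; the reduction step exploits the fact that the pentanomial has only a few low-order terms.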

[1] M. Bluhm, S. Gueron, Fast Software Implementation of Binary Elliptic
Curve Cryptography (2013; to be published)
***************************************************************************
Manuel Bluhm (1) and Shay Gueron (2, 3)
(1) Ruhr University Bochum, Germany
(2) Intel Corporation, Israel Development Center, Haifa, Israel
(3) University of Haifa, Israel
***************************************************************************
handshak3
2017-04-20 21:36:09 UTC
Permalink
Reviving this old thread, as I am trying to explore ways to improve the performance of the 163k curve.

Was this patch ever accepted into OpenSSL?

-Rohit
Post by David Jacobson
Let me chime in with an amendment to
Audrey's message.  It would be nice if the tables included
performance numbers for prime modulus curves, even if the
technique's of Manuel's patch are not applicable there.  Many
people would like to know whether there is significant performance
gains to be had by switching from GF(p) to GF(2^k) curves.
    --David Jacobson
Manuel Bluhm
2013-09-04 02:33:36 UTC
Permalink
Dear David,

In response to your comment, the numbers below compare the patch with
OpenSSL-1.0.1e on Haswell and Ivy Bridge. The speedup is the ratio between
the performance of a binary curve and that of the prime curve of similar
bit length.

With this patch, both architectures perform many more ECDH operations
per second with binary curves than with prime curves. Additionally,
Haswell achieves more ECDSA sign and verify operations with binary curves,
while Ivy Bridge achieves more verifications but, for most curves, fewer
signatures.


Curves for speed comparison:

GF(p)           GF(2^m)
secp160r1  <->  nist(b,k)163
nistp224   <->  nist(b,k)233
nistp256   <->  nist(b,k)283
nistp384   <->  nist(b,k)409
nistp521   <->  nist(b,k)571


The results for a Core i7-4770 CPU @ 3.40GHz (Haswell) [1]:

./openssl speed ecdh

             ECDH op/s
(secp160r1)     7391.5
(nistp224)     11993.8
(nistp256)      6489.0
(nistp384)      1848.5
(nistp521)      1682.8

             ECDH op/s   Speedup
(nistk163)     67212.4      9.09
(nistk233)     39102.2      3.26
(nistk283)     27586.5      4.25
(nistk409)     11611.2      6.28
(nistk571)      5941.8      3.53

             ECDH op/s   Speedup
(nistb163)     61667.8      8.34
(nistb233)     35246.4      2.94
(nistb283)     24320.7      3.75
(nistb409)     10238.1      5.54
(nistb571)      5158.8      3.07
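
As a quick check on how the Speedup column is formed, the following sketch divides each Koblitz-curve ECDH rate by the rate of the paired prime curve, using the Haswell figures listed above. It was written for this note and is not part of the patch; the dictionary and function names are illustrative.

```python
# Haswell ECDH rates (op/s), copied from the listing above.
prime_rates = {
    "secp160r1": 7391.5,
    "nistp224": 11993.8,
    "nistp256": 6489.0,
    "nistp384": 1848.5,
    "nistp521": 1682.8,
}
koblitz_rates = {
    "nistk163": 67212.4,
    "nistk233": 39102.2,
    "nistk283": 27586.5,
    "nistk409": 11611.2,
    "nistk571": 5941.8,
}
# Pairing of prime and binary curves of similar bit length, as given in the post.
pairs = [
    ("secp160r1", "nistk163"),
    ("nistp224", "nistk233"),
    ("nistp256", "nistk283"),
    ("nistp384", "nistk409"),
    ("nistp521", "nistk571"),
]


def speedup(binary_rate: float, prime_rate: float) -> float:
    """Ratio of binary-curve throughput to prime-curve throughput."""
    return round(binary_rate / prime_rate, 2)


speedups = {b: speedup(koblitz_rates[b], prime_rates[p]) for p, b in pairs}

if __name__ == "__main__":
    for name, value in speedups.items():
        print(f"({name}) {value:.2f}")
```

The same division applies to the nistb rows and to the ECDSA sign and verify columns.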


./openssl speed ecdsa

                SIGN/s  VERIFY/s
(secp160r1)    21750.8    6029.0
(nistp224)     18393.5    8345.4
(nistp256)     11391.7    4744.9
(nistp384)      6447.4    1566.0
(nistp521)      2949.5    1249.7

                SIGN/s  VERIFY/s      Speedups
(nistk163)     36660.3   26646.7   1.69   4.42
(nistk233)     23142.7   15842.9   1.26   1.90
(nistk283)     16941.7   11059.7   1.49   2.33
(nistk409)      8198.4    4861.4   1.27   3.10
(nistk571)      4446.7    2547.6   1.51   2.04

                SIGN/s  VERIFY/s      Speedups
(nistb163)     34738.8   25113.6   1.60   4.17
(nistb233)     21531.7   14341.8   1.17   1.72
(nistb283)     15635.1   10061.6   1.37   2.12
(nistb409)      7479.5    4390.6   1.16   2.80
(nistb571)      4029.7    2269.6   1.37   1.82


The results for a Core i5-3210M @ 2.50 GHz (Ivy Bridge) [2]:

./openssl speed ecdh

             ECDH op/s
(secp160r1)     4444.3
(nistp224)      7573.1
(nistp256)      3891.5
(nistp384)      1051.9
(nistp521)       971.0

             ECDH op/s   Speedup
(nistk163)     27837.0      6.26
(nistk233)     14946.4      1.97
(nistk283)      9026.5      2.32
(nistk409)      3879.5      3.69
(nistk571)      1822.3      1.88

             ECDH op/s   Speedup
(nistb163)     24043.5      5.41
(nistb233)     13057.0      1.72
(nistb283)      7754.4      1.99
(nistb409)      3319.6      3.16
(nistb571)      1565.0      1.61


./openssl speed ecdsa

                SIGN/s  VERIFY/s
(secp160r1)    12978.6    3671.5
(nistp224)     11196.0    5130.2
(nistp256)      6819.4    2829.3
(nistp384)      3727.5     849.5
(nistp521)      1712.6     723.5

                SIGN/s  VERIFY/s      Speedups
(nistk163)     17794.1   11730.1   1.37   3.19
(nistk233)     10396.8    6450.8   0.93   1.26
(nistk283)      6671.7    3955.1   0.98   1.40
(nistk409)      3148.8    1754.4   0.84   2.07
(nistk571)      1560.1     836.1   0.91   1.16

                SIGN/s  VERIFY/s      Speedups
(nistb163)     16278.4   10452.2   1.25   2.85
(nistb233)      9423.7    5715.4   0.84   1.11
(nistb283)      5987.4    3458.5   0.88   1.22
(nistb409)      2769.2    1523.1   0.74   1.79
(nistb571)      1358.0     724.0   0.79   1.00


The code has been compiled with gcc 4.8.1 and the following
configurations:

[1] Core i7-4770 @ 3.40GHz (Haswell):

./Configure linux-x86_64 enable-ec_nistp_64_gcc_128 -march=native -DFAST_PCLMUL -DOPENSSL_FAST_EC2M

[2] Core i5-3210M @ 2.50 GHz (Ivy Bridge):

./Configure linux-x86_64 enable-ec_nistp_64_gcc_128 -march=native -DOPENSSL_FAST_EC2M


Best regards,
Manuel
Post by David Jacobson
Let me chime in with an amendment to Audrey's message. It would be
nice if the tables included performance numbers for prime modulus
curves, even if the technique's of Manuel's patch are not applicable
there. Many people would like to know whether there is significant
performance gains to be had by switching from GF(p) to GF(2^k) curves.
--David Jacobson
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List openssl-***@openssl.org
Automated List Manager ***@openssl.org
David Jacobson
2013-09-04 15:26:41 UTC
Permalink
Manuel,

Thank you very much for this data. It is amazing how different the
results for ECDH and ECDSA sign/verify are. I would have thought that
the ECDSA sign/verify times would be dominated by the point
multiplication and hence would be very similar to ECDH, which is
basically nothing but point multiplication, but apparently not.
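
To put numbers on that observation, the sketch below divides the ECDSA sign rate by the ECDH rate for one prime curve and one binary curve, using the Haswell figures from Manuel's post. It only restates the posted data and makes no claim about the cause of the difference; the names are illustrative.

```python
# Haswell rates from Manuel's post: (ECDH op/s, ECDSA sign/s).
rates = {
    "secp160r1": (7391.5, 21750.8),   # prime curve, OpenSSL 1.0.1e
    "nistk163": (67212.4, 36660.3),   # binary curve, with the patch
}


def sign_to_ecdh_ratio(curve: str) -> float:
    """ECDSA sign throughput divided by ECDH throughput for one curve."""
    ecdh, sign = rates[curve]
    return sign / ecdh


if __name__ == "__main__":
    for curve in rates:
        print(f"{curve}: sign/ECDH = {sign_to_ecdh_ratio(curve):.2f}")
```

For secp160r1, signing runs at close to three times the ECDH rate, whereas for nistk163 it runs at roughly half the ECDH rate, so the two operations clearly do not track each other.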

--David