mirror of https://github.com/lukechilds/node.git
This just replaces all sources in deps/openssl/openssl to originals in
https://www.openssl.org/source/openssl-1.0.2a.tar.gz

Fixes: https://github.com/iojs/io.js/issues/589
PR-URL: https://github.com/iojs/io.js/pull/1389
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>

v1.8.0-commit
Shigeki Ohtsu
10 years ago
678 changed files with 97139 additions and 14133 deletions
@@ -0,0 +1,8 @@
#!/bin/sh

BRANCH=`git rev-parse --abbrev-ref HEAD`

./Configure $@ no-symlinks
make files
util/mk1mf.pl OUT=out.$BRANCH TMP=tmp.$BRANCH INC=inc.$BRANCH copy > makefile.$BRANCH
make -f makefile.$BRANCH init
@@ -0,0 +1,5 @@
#!/bin/sh

BRANCH=`git rev-parse --abbrev-ref HEAD`

make -f makefile.$BRANCH $@
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,919 @@
#!/usr/bin/env perl

# ====================================================================
# Written by David S. Miller <davem@devemloft.net> and Andy Polyakov
# <appro@openssl.org>. The module is licensed under 2-clause BSD
# license. October 2012. All rights reserved.
# ====================================================================

######################################################################
# AES for SPARC T4.
#
# AES round instructions complete in 3 cycles and can be issued every
# cycle. It means that round calculations should take 4*rounds cycles,
# because any given round instruction depends on result of *both*
# previous instructions:
#
#	|0 |1 |2 |3 |4
#	|01|01|01|
#	   |23|23|23|
#	            |01|01|...
#	               |23|...
#
# Provided that fxor [with IV] takes 3 cycles to complete, critical
# path length for CBC encrypt would be 3+4*rounds, or in other words
# it should process one byte in at least (3+4*rounds)/16 cycles. This
# estimate doesn't account for "collateral" instructions, such as
# fetching input from memory, xor-ing it with zero-round key and
# storing the result. Yet, *measured* performance [for data aligned
# at 64-bit boundary!] deviates from this equation by less than 0.5%:
#
#		128-bit key	192-		256-
# CBC encrypt	2.70/2.90(*)	3.20/3.40	3.70/3.90
#			(*) numbers after slash are for
#			    misaligned data.
#
# Out-of-order execution logic managed to fully overlap "collateral"
# instructions with those on critical path. Amazing!
#
# As with Intel AES-NI, question is if it's possible to improve
# performance of parallelizable modes by interleaving round
# instructions. Provided round instruction latency and throughput
# optimal interleave factor is 2. But can we expect 2x performance
# improvement? Well, as round instructions can be issued one per
# cycle, they don't saturate the 2-way issue pipeline and therefore
# there is room for "collateral" calculations... Yet, 2x speed-up
# over CBC encrypt remains unattainable:
#
#		128-bit key	192-		256-
# CBC decrypt	1.64/2.11	1.89/2.37	2.23/2.61
# CTR		1.64/2.08(*)	1.89/2.33	2.23/2.61
#			(*) numbers after slash are for
#			    misaligned data.
#
# Estimates based on amount of instructions under assumption that
# round instructions are not pairable with any other instruction
# suggest that latter is the actual case and pipeline runs
# underutilized. It should be noted that T4 out-of-order execution
# logic is so capable that performance gain from 2x interleave is
# not even impressive, ~7-13% over non-interleaved code, largest
# for 256-bit keys.

# To anchor to something else, software implementation processes
# one byte in 29 cycles with 128-bit key on same processor. Intel
# Sandy Bridge encrypts byte in 5.07 cycles in CBC mode and decrypts
# in 0.93, naturally with AES-NI.

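A minimal standalone Perl sketch (not part of the generated module) that reproduces
the theoretical cycles-per-byte figures above from the (3+4*rounds)/16 formula:

#!/usr/bin/env perl
# Sketch only: theoretical CBC-encrypt cost per byte for AES-128/192/256
# (10/12/14 rounds), to compare against the measured 2.70/3.20/3.70
# cycles per byte quoted in the table above.
my %rounds = (128 => 10, 192 => 12, 256 => 14);
for my $bits (sort { $a <=> $b } keys %rounds) {
    my $cpb = (3 + 4 * $rounds{$bits}) / 16;
    printf "AES-%d CBC encrypt: %.2f cycles/byte (theoretical)\n", $bits, $cpb;
}
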
$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; |
|||
push(@INC,"${dir}","${dir}../../perlasm"); |
|||
require "sparcv9_modes.pl"; |
|||
|
|||
&asm_init(@ARGV); |
|||
|
|||
$::evp=1; # if $evp is set to 0, script generates module with |
|||
# AES_[en|de]crypt, AES_set_[en|de]crypt_key and AES_cbc_encrypt entry |
|||
# points. These however are not fully compatible with openssl/aes.h, |
|||
# because they expect AES_KEY to be aligned at 64-bit boundary. When |
|||
# used through EVP, alignment is arranged at EVP layer. Second thing |
|||
# that is arranged by EVP is at least 32-bit alignment of IV. |
|||
|
|||
###################################################################### |
|||
# single-round subroutines |
|||
# |
|||
{ |
|||
my ($inp,$out,$key,$rounds,$tmp,$mask)=map("%o$_",(0..5)); |
|||
|
|||
$code.=<<___ if ($::abibits==64); |
|||
.register %g2,#scratch |
|||
.register %g3,#scratch |
|||
|
|||
___ |
|||
$code.=<<___; |
|||
.text |
|||
|
|||
.globl aes_t4_encrypt |
|||
.align 32 |
|||
aes_t4_encrypt: |
|||
andcc $inp, 7, %g1 ! is input aligned? |
|||
andn $inp, 7, $inp |
|||
|
|||
ldx [$key + 0], %g4 |
|||
ldx [$key + 8], %g5 |
|||
|
|||
ldx [$inp + 0], %o4 |
|||
bz,pt %icc, 1f |
|||
ldx [$inp + 8], %o5 |
|||
ldx [$inp + 16], $inp |
|||
sll %g1, 3, %g1 |
|||
sub %g0, %g1, %o3 |
|||
sllx %o4, %g1, %o4 |
|||
sllx %o5, %g1, %g1 |
|||
srlx %o5, %o3, %o5 |
|||
srlx $inp, %o3, %o3 |
|||
or %o5, %o4, %o4 |
|||
or %o3, %g1, %o5 |
|||
1: |
|||
ld [$key + 240], $rounds |
|||
ldd [$key + 16], %f12 |
|||
ldd [$key + 24], %f14 |
|||
xor %g4, %o4, %o4 |
|||
xor %g5, %o5, %o5 |
|||
movxtod %o4, %f0 |
|||
movxtod %o5, %f2 |
|||
srl $rounds, 1, $rounds |
|||
ldd [$key + 32], %f16 |
|||
sub $rounds, 1, $rounds |
|||
ldd [$key + 40], %f18 |
|||
add $key, 48, $key |
|||
|
|||
.Lenc: |
|||
aes_eround01 %f12, %f0, %f2, %f4 |
|||
aes_eround23 %f14, %f0, %f2, %f2 |
|||
ldd [$key + 0], %f12 |
|||
ldd [$key + 8], %f14 |
|||
sub $rounds,1,$rounds |
|||
aes_eround01 %f16, %f4, %f2, %f0 |
|||
aes_eround23 %f18, %f4, %f2, %f2 |
|||
ldd [$key + 16], %f16 |
|||
ldd [$key + 24], %f18 |
|||
brnz,pt $rounds, .Lenc |
|||
add $key, 32, $key |
|||
|
|||
andcc $out, 7, $tmp ! is output aligned? |
|||
aes_eround01 %f12, %f0, %f2, %f4 |
|||
aes_eround23 %f14, %f0, %f2, %f2 |
|||
aes_eround01_l %f16, %f4, %f2, %f0 |
|||
aes_eround23_l %f18, %f4, %f2, %f2 |
|||
|
|||
bnz,pn %icc, 2f |
|||
nop |
|||
|
|||
std %f0, [$out + 0] |
|||
retl |
|||
std %f2, [$out + 8] |
|||
|
|||
2: alignaddrl $out, %g0, $out |
|||
mov 0xff, $mask |
|||
srl $mask, $tmp, $mask |
|||
|
|||
faligndata %f0, %f0, %f4 |
|||
faligndata %f0, %f2, %f6 |
|||
faligndata %f2, %f2, %f8 |
|||
|
|||
stda %f4, [$out + $mask]0xc0 ! partial store |
|||
std %f6, [$out + 8] |
|||
add $out, 16, $out |
|||
orn %g0, $mask, $mask |
|||
retl |
|||
stda %f8, [$out + $mask]0xc0 ! partial store |
|||
.type aes_t4_encrypt,#function |
|||
.size aes_t4_encrypt,.-aes_t4_encrypt |
|||
|
|||
.globl aes_t4_decrypt |
|||
.align 32 |
|||
aes_t4_decrypt: |
|||
andcc $inp, 7, %g1 ! is input aligned? |
|||
andn $inp, 7, $inp |
|||
|
|||
ldx [$key + 0], %g4 |
|||
ldx [$key + 8], %g5 |
|||
|
|||
ldx [$inp + 0], %o4 |
|||
bz,pt %icc, 1f |
|||
ldx [$inp + 8], %o5 |
|||
ldx [$inp + 16], $inp |
|||
sll %g1, 3, %g1 |
|||
sub %g0, %g1, %o3 |
|||
sllx %o4, %g1, %o4 |
|||
sllx %o5, %g1, %g1 |
|||
srlx %o5, %o3, %o5 |
|||
srlx $inp, %o3, %o3 |
|||
or %o5, %o4, %o4 |
|||
or %o3, %g1, %o5 |
|||
1: |
|||
ld [$key + 240], $rounds |
|||
ldd [$key + 16], %f12 |
|||
ldd [$key + 24], %f14 |
|||
xor %g4, %o4, %o4 |
|||
xor %g5, %o5, %o5 |
|||
movxtod %o4, %f0 |
|||
movxtod %o5, %f2 |
|||
srl $rounds, 1, $rounds |
|||
ldd [$key + 32], %f16 |
|||
sub $rounds, 1, $rounds |
|||
ldd [$key + 40], %f18 |
|||
add $key, 48, $key |
|||
|
|||
.Ldec: |
|||
aes_dround01 %f12, %f0, %f2, %f4 |
|||
aes_dround23 %f14, %f0, %f2, %f2 |
|||
ldd [$key + 0], %f12 |
|||
ldd [$key + 8], %f14 |
|||
sub $rounds,1,$rounds |
|||
aes_dround01 %f16, %f4, %f2, %f0 |
|||
aes_dround23 %f18, %f4, %f2, %f2 |
|||
ldd [$key + 16], %f16 |
|||
ldd [$key + 24], %f18 |
|||
brnz,pt $rounds, .Ldec |
|||
add $key, 32, $key |
|||
|
|||
andcc $out, 7, $tmp ! is output aligned? |
|||
aes_dround01 %f12, %f0, %f2, %f4 |
|||
aes_dround23 %f14, %f0, %f2, %f2 |
|||
aes_dround01_l %f16, %f4, %f2, %f0 |
|||
aes_dround23_l %f18, %f4, %f2, %f2 |
|||
|
|||
bnz,pn %icc, 2f |
|||
nop |
|||
|
|||
std %f0, [$out + 0] |
|||
retl |
|||
std %f2, [$out + 8] |
|||
|
|||
2: alignaddrl $out, %g0, $out |
|||
mov 0xff, $mask |
|||
srl $mask, $tmp, $mask |
|||
|
|||
faligndata %f0, %f0, %f4 |
|||
faligndata %f0, %f2, %f6 |
|||
faligndata %f2, %f2, %f8 |
|||
|
|||
stda %f4, [$out + $mask]0xc0 ! partial store |
|||
std %f6, [$out + 8] |
|||
add $out, 16, $out |
|||
orn %g0, $mask, $mask |
|||
retl |
|||
stda %f8, [$out + $mask]0xc0 ! partial store |
|||
.type aes_t4_decrypt,#function |
|||
.size aes_t4_decrypt,.-aes_t4_decrypt |
|||
___ |
|||
} |
|||
|
|||
###################################################################### |
|||
# key setup subroutines |
|||
# |
|||
{ |
|||
my ($inp,$bits,$out,$tmp)=map("%o$_",(0..5)); |
|||
$code.=<<___; |
|||
.globl aes_t4_set_encrypt_key |
|||
.align 32 |
|||
aes_t4_set_encrypt_key: |
|||
.Lset_encrypt_key: |
|||
and $inp, 7, $tmp |
|||
alignaddr $inp, %g0, $inp |
|||
cmp $bits, 192 |
|||
ldd [$inp + 0], %f0 |
|||
bl,pt %icc,.L128 |
|||
ldd [$inp + 8], %f2 |
|||
|
|||
be,pt %icc,.L192 |
|||
ldd [$inp + 16], %f4 |
|||
brz,pt $tmp, .L256aligned |
|||
ldd [$inp + 24], %f6 |
|||
|
|||
ldd [$inp + 32], %f8 |
|||
faligndata %f0, %f2, %f0 |
|||
faligndata %f2, %f4, %f2 |
|||
faligndata %f4, %f6, %f4 |
|||
faligndata %f6, %f8, %f6 |
|||
.L256aligned: |
|||
___ |
|||
for ($i=0; $i<6; $i++) { |
|||
$code.=<<___; |
|||
std %f0, [$out + `32*$i+0`] |
|||
aes_kexpand1 %f0, %f6, $i, %f0 |
|||
std %f2, [$out + `32*$i+8`] |
|||
aes_kexpand2 %f2, %f0, %f2 |
|||
std %f4, [$out + `32*$i+16`] |
|||
aes_kexpand0 %f4, %f2, %f4 |
|||
std %f6, [$out + `32*$i+24`] |
|||
aes_kexpand2 %f6, %f4, %f6 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
std %f0, [$out + `32*$i+0`] |
|||
aes_kexpand1 %f0, %f6, $i, %f0 |
|||
std %f2, [$out + `32*$i+8`] |
|||
aes_kexpand2 %f2, %f0, %f2 |
|||
std %f4, [$out + `32*$i+16`] |
|||
std %f6, [$out + `32*$i+24`] |
|||
std %f0, [$out + `32*$i+32`] |
|||
std %f2, [$out + `32*$i+40`] |
|||
|
|||
mov 14, $tmp |
|||
st $tmp, [$out + 240] |
|||
retl |
|||
xor %o0, %o0, %o0 |
|||
|
|||
.align 16 |
|||
.L192: |
|||
brz,pt $tmp, .L192aligned |
|||
nop |
|||
|
|||
ldd [$inp + 24], %f6 |
|||
faligndata %f0, %f2, %f0 |
|||
faligndata %f2, %f4, %f2 |
|||
faligndata %f4, %f6, %f4 |
|||
.L192aligned: |
|||
___ |
|||
for ($i=0; $i<7; $i++) { |
|||
$code.=<<___; |
|||
std %f0, [$out + `24*$i+0`] |
|||
aes_kexpand1 %f0, %f4, $i, %f0 |
|||
std %f2, [$out + `24*$i+8`] |
|||
aes_kexpand2 %f2, %f0, %f2 |
|||
std %f4, [$out + `24*$i+16`] |
|||
aes_kexpand2 %f4, %f2, %f4 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
std %f0, [$out + `24*$i+0`] |
|||
aes_kexpand1 %f0, %f4, $i, %f0 |
|||
std %f2, [$out + `24*$i+8`] |
|||
aes_kexpand2 %f2, %f0, %f2 |
|||
std %f4, [$out + `24*$i+16`] |
|||
std %f0, [$out + `24*$i+24`] |
|||
std %f2, [$out + `24*$i+32`] |
|||
|
|||
mov 12, $tmp |
|||
st $tmp, [$out + 240] |
|||
retl |
|||
xor %o0, %o0, %o0 |
|||
|
|||
.align 16 |
|||
.L128: |
|||
brz,pt $tmp, .L128aligned |
|||
nop |
|||
|
|||
ldd [$inp + 16], %f4 |
|||
faligndata %f0, %f2, %f0 |
|||
faligndata %f2, %f4, %f2 |
|||
.L128aligned: |
|||
___ |
|||
for ($i=0; $i<10; $i++) { |
|||
$code.=<<___; |
|||
std %f0, [$out + `16*$i+0`] |
|||
aes_kexpand1 %f0, %f2, $i, %f0 |
|||
std %f2, [$out + `16*$i+8`] |
|||
aes_kexpand2 %f2, %f0, %f2 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
std %f0, [$out + `16*$i+0`] |
|||
std %f2, [$out + `16*$i+8`] |
|||
|
|||
mov 10, $tmp |
|||
st $tmp, [$out + 240] |
|||
retl |
|||
xor %o0, %o0, %o0 |
|||
.type aes_t4_set_encrypt_key,#function |
|||
.size aes_t4_set_encrypt_key,.-aes_t4_set_encrypt_key |
|||
|
|||
.globl aes_t4_set_decrypt_key |
|||
.align 32 |
|||
aes_t4_set_decrypt_key: |
|||
mov %o7, %o5 |
|||
call .Lset_encrypt_key |
|||
nop |
|||
|
|||
mov %o5, %o7 |
|||
sll $tmp, 4, $inp ! $tmp is number of rounds |
|||
add $tmp, 2, $tmp |
|||
add $out, $inp, $inp ! $inp=$out+16*rounds |
|||
srl $tmp, 2, $tmp ! $tmp=(rounds+2)/4 |
|||
|
|||
.Lkey_flip: |
|||
ldd [$out + 0], %f0 |
|||
ldd [$out + 8], %f2 |
|||
ldd [$out + 16], %f4 |
|||
ldd [$out + 24], %f6 |
|||
ldd [$inp + 0], %f8 |
|||
ldd [$inp + 8], %f10 |
|||
ldd [$inp - 16], %f12 |
|||
ldd [$inp - 8], %f14 |
|||
sub $tmp, 1, $tmp |
|||
std %f0, [$inp + 0] |
|||
std %f2, [$inp + 8] |
|||
std %f4, [$inp - 16] |
|||
std %f6, [$inp - 8] |
|||
std %f8, [$out + 0] |
|||
std %f10, [$out + 8] |
|||
std %f12, [$out + 16] |
|||
std %f14, [$out + 24] |
|||
add $out, 32, $out |
|||
brnz $tmp, .Lkey_flip |
|||
sub $inp, 32, $inp |
|||
|
|||
retl |
|||
xor %o0, %o0, %o0 |
|||
.type aes_t4_set_decrypt_key,#function |
|||
.size aes_t4_set_decrypt_key,.-aes_t4_set_decrypt_key |
|||
___ |
|||
} |
|||
|
|||
{{{ |
|||
my ($inp,$out,$len,$key,$ivec,$enc)=map("%i$_",(0..5)); |
|||
my ($ileft,$iright,$ooff,$omask,$ivoff)=map("%l$_",(1..7)); |
|||
|
|||
$code.=<<___; |
|||
.align 32 |
|||
_aes128_encrypt_1x: |
|||
___ |
|||
for ($i=0; $i<4; $i++) { |
|||
$code.=<<___; |
|||
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f4 |
|||
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+4`, %f4, %f2, %f0 |
|||
aes_eround23 %f`16+8*$i+6`, %f4, %f2, %f2 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_eround01 %f48, %f0, %f2, %f4 |
|||
aes_eround23 %f50, %f0, %f2, %f2 |
|||
aes_eround01_l %f52, %f4, %f2, %f0 |
|||
retl |
|||
aes_eround23_l %f54, %f4, %f2, %f2 |
|||
.type _aes128_encrypt_1x,#function |
|||
.size _aes128_encrypt_1x,.-_aes128_encrypt_1x |
|||
|
|||
.align 32 |
|||
_aes128_encrypt_2x: |
|||
___ |
|||
for ($i=0; $i<4; $i++) { |
|||
$code.=<<___; |
|||
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f8 |
|||
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+0`, %f4, %f6, %f10 |
|||
aes_eround23 %f`16+8*$i+2`, %f4, %f6, %f6 |
|||
aes_eround01 %f`16+8*$i+4`, %f8, %f2, %f0 |
|||
aes_eround23 %f`16+8*$i+6`, %f8, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+4`, %f10, %f6, %f4 |
|||
aes_eround23 %f`16+8*$i+6`, %f10, %f6, %f6 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_eround01 %f48, %f0, %f2, %f8 |
|||
aes_eround23 %f50, %f0, %f2, %f2 |
|||
aes_eround01 %f48, %f4, %f6, %f10 |
|||
aes_eround23 %f50, %f4, %f6, %f6 |
|||
aes_eround01_l %f52, %f8, %f2, %f0 |
|||
aes_eround23_l %f54, %f8, %f2, %f2 |
|||
aes_eround01_l %f52, %f10, %f6, %f4 |
|||
retl |
|||
aes_eround23_l %f54, %f10, %f6, %f6 |
|||
.type _aes128_encrypt_2x,#function |
|||
.size _aes128_encrypt_2x,.-_aes128_encrypt_2x |
|||
|
|||
.align 32 |
|||
_aes128_loadkey: |
|||
ldx [$key + 0], %g4 |
|||
ldx [$key + 8], %g5 |
|||
___ |
|||
for ($i=2; $i<22;$i++) { # load key schedule |
|||
$code.=<<___; |
|||
ldd [$key + `8*$i`], %f`12+2*$i` |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
retl |
|||
nop |
|||
.type _aes128_loadkey,#function |
|||
.size _aes128_loadkey,.-_aes128_loadkey |
|||
_aes128_load_enckey=_aes128_loadkey |
|||
_aes128_load_deckey=_aes128_loadkey |
|||
|
|||
___ |
|||
|
|||
&alg_cbc_encrypt_implement("aes",128); |
|||
if ($::evp) { |
|||
&alg_ctr32_implement("aes",128); |
|||
&alg_xts_implement("aes",128,"en"); |
|||
&alg_xts_implement("aes",128,"de"); |
|||
} |
|||
&alg_cbc_decrypt_implement("aes",128); |
|||
|
|||
$code.=<<___; |
|||
.align 32 |
|||
_aes128_decrypt_1x: |
|||
___ |
|||
for ($i=0; $i<4; $i++) { |
|||
$code.=<<___; |
|||
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f4 |
|||
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+4`, %f4, %f2, %f0 |
|||
aes_dround23 %f`16+8*$i+6`, %f4, %f2, %f2 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_dround01 %f48, %f0, %f2, %f4 |
|||
aes_dround23 %f50, %f0, %f2, %f2 |
|||
aes_dround01_l %f52, %f4, %f2, %f0 |
|||
retl |
|||
aes_dround23_l %f54, %f4, %f2, %f2 |
|||
.type _aes128_decrypt_1x,#function |
|||
.size _aes128_decrypt_1x,.-_aes128_decrypt_1x |
|||
|
|||
.align 32 |
|||
_aes128_decrypt_2x: |
|||
___ |
|||
for ($i=0; $i<4; $i++) { |
|||
$code.=<<___; |
|||
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f8 |
|||
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+0`, %f4, %f6, %f10 |
|||
aes_dround23 %f`16+8*$i+2`, %f4, %f6, %f6 |
|||
aes_dround01 %f`16+8*$i+4`, %f8, %f2, %f0 |
|||
aes_dround23 %f`16+8*$i+6`, %f8, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+4`, %f10, %f6, %f4 |
|||
aes_dround23 %f`16+8*$i+6`, %f10, %f6, %f6 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_dround01 %f48, %f0, %f2, %f8 |
|||
aes_dround23 %f50, %f0, %f2, %f2 |
|||
aes_dround01 %f48, %f4, %f6, %f10 |
|||
aes_dround23 %f50, %f4, %f6, %f6 |
|||
aes_dround01_l %f52, %f8, %f2, %f0 |
|||
aes_dround23_l %f54, %f8, %f2, %f2 |
|||
aes_dround01_l %f52, %f10, %f6, %f4 |
|||
retl |
|||
aes_dround23_l %f54, %f10, %f6, %f6 |
|||
.type _aes128_decrypt_2x,#function |
|||
.size _aes128_decrypt_2x,.-_aes128_decrypt_2x |
|||
___ |
|||
|
|||
$code.=<<___; |
|||
.align 32 |
|||
_aes192_encrypt_1x: |
|||
___ |
|||
for ($i=0; $i<5; $i++) { |
|||
$code.=<<___; |
|||
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f4 |
|||
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+4`, %f4, %f2, %f0 |
|||
aes_eround23 %f`16+8*$i+6`, %f4, %f2, %f2 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_eround01 %f56, %f0, %f2, %f4 |
|||
aes_eround23 %f58, %f0, %f2, %f2 |
|||
aes_eround01_l %f60, %f4, %f2, %f0 |
|||
retl |
|||
aes_eround23_l %f62, %f4, %f2, %f2 |
|||
.type _aes192_encrypt_1x,#function |
|||
.size _aes192_encrypt_1x,.-_aes192_encrypt_1x |
|||
|
|||
.align 32 |
|||
_aes192_encrypt_2x: |
|||
___ |
|||
for ($i=0; $i<5; $i++) { |
|||
$code.=<<___; |
|||
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f8 |
|||
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+0`, %f4, %f6, %f10 |
|||
aes_eround23 %f`16+8*$i+2`, %f4, %f6, %f6 |
|||
aes_eround01 %f`16+8*$i+4`, %f8, %f2, %f0 |
|||
aes_eround23 %f`16+8*$i+6`, %f8, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+4`, %f10, %f6, %f4 |
|||
aes_eround23 %f`16+8*$i+6`, %f10, %f6, %f6 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_eround01 %f56, %f0, %f2, %f8 |
|||
aes_eround23 %f58, %f0, %f2, %f2 |
|||
aes_eround01 %f56, %f4, %f6, %f10 |
|||
aes_eround23 %f58, %f4, %f6, %f6 |
|||
aes_eround01_l %f60, %f8, %f2, %f0 |
|||
aes_eround23_l %f62, %f8, %f2, %f2 |
|||
aes_eround01_l %f60, %f10, %f6, %f4 |
|||
retl |
|||
aes_eround23_l %f62, %f10, %f6, %f6 |
|||
.type _aes192_encrypt_2x,#function |
|||
.size _aes192_encrypt_2x,.-_aes192_encrypt_2x |
|||
|
|||
.align 32 |
|||
_aes256_encrypt_1x: |
|||
aes_eround01 %f16, %f0, %f2, %f4 |
|||
aes_eround23 %f18, %f0, %f2, %f2 |
|||
ldd [$key + 208], %f16 |
|||
ldd [$key + 216], %f18 |
|||
aes_eround01 %f20, %f4, %f2, %f0 |
|||
aes_eround23 %f22, %f4, %f2, %f2 |
|||
ldd [$key + 224], %f20 |
|||
ldd [$key + 232], %f22 |
|||
___ |
|||
for ($i=1; $i<6; $i++) { |
|||
$code.=<<___; |
|||
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f4 |
|||
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+4`, %f4, %f2, %f0 |
|||
aes_eround23 %f`16+8*$i+6`, %f4, %f2, %f2 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_eround01 %f16, %f0, %f2, %f4 |
|||
aes_eround23 %f18, %f0, %f2, %f2 |
|||
ldd [$key + 16], %f16 |
|||
ldd [$key + 24], %f18 |
|||
aes_eround01_l %f20, %f4, %f2, %f0 |
|||
aes_eround23_l %f22, %f4, %f2, %f2 |
|||
ldd [$key + 32], %f20 |
|||
retl |
|||
ldd [$key + 40], %f22 |
|||
.type _aes256_encrypt_1x,#function |
|||
.size _aes256_encrypt_1x,.-_aes256_encrypt_1x |
|||
|
|||
.align 32 |
|||
_aes256_encrypt_2x: |
|||
aes_eround01 %f16, %f0, %f2, %f8 |
|||
aes_eround23 %f18, %f0, %f2, %f2 |
|||
aes_eround01 %f16, %f4, %f6, %f10 |
|||
aes_eround23 %f18, %f4, %f6, %f6 |
|||
ldd [$key + 208], %f16 |
|||
ldd [$key + 216], %f18 |
|||
aes_eround01 %f20, %f8, %f2, %f0 |
|||
aes_eround23 %f22, %f8, %f2, %f2 |
|||
aes_eround01 %f20, %f10, %f6, %f4 |
|||
aes_eround23 %f22, %f10, %f6, %f6 |
|||
ldd [$key + 224], %f20 |
|||
ldd [$key + 232], %f22 |
|||
___ |
|||
for ($i=1; $i<6; $i++) { |
|||
$code.=<<___; |
|||
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f8 |
|||
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+0`, %f4, %f6, %f10 |
|||
aes_eround23 %f`16+8*$i+2`, %f4, %f6, %f6 |
|||
aes_eround01 %f`16+8*$i+4`, %f8, %f2, %f0 |
|||
aes_eround23 %f`16+8*$i+6`, %f8, %f2, %f2 |
|||
aes_eround01 %f`16+8*$i+4`, %f10, %f6, %f4 |
|||
aes_eround23 %f`16+8*$i+6`, %f10, %f6, %f6 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_eround01 %f16, %f0, %f2, %f8 |
|||
aes_eround23 %f18, %f0, %f2, %f2 |
|||
aes_eround01 %f16, %f4, %f6, %f10 |
|||
aes_eround23 %f18, %f4, %f6, %f6 |
|||
ldd [$key + 16], %f16 |
|||
ldd [$key + 24], %f18 |
|||
aes_eround01_l %f20, %f8, %f2, %f0 |
|||
aes_eround23_l %f22, %f8, %f2, %f2 |
|||
aes_eround01_l %f20, %f10, %f6, %f4 |
|||
aes_eround23_l %f22, %f10, %f6, %f6 |
|||
ldd [$key + 32], %f20 |
|||
retl |
|||
ldd [$key + 40], %f22 |
|||
.type _aes256_encrypt_2x,#function |
|||
.size _aes256_encrypt_2x,.-_aes256_encrypt_2x |
|||
|
|||
.align 32 |
|||
_aes192_loadkey: |
|||
ldx [$key + 0], %g4 |
|||
ldx [$key + 8], %g5 |
|||
___ |
|||
for ($i=2; $i<26;$i++) { # load key schedule |
|||
$code.=<<___; |
|||
ldd [$key + `8*$i`], %f`12+2*$i` |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
retl |
|||
nop |
|||
.type _aes192_loadkey,#function |
|||
.size _aes192_loadkey,.-_aes192_loadkey |
|||
_aes256_loadkey=_aes192_loadkey |
|||
_aes192_load_enckey=_aes192_loadkey |
|||
_aes192_load_deckey=_aes192_loadkey |
|||
_aes256_load_enckey=_aes192_loadkey |
|||
_aes256_load_deckey=_aes192_loadkey |
|||
___ |
|||
|
|||
&alg_cbc_encrypt_implement("aes",256); |
|||
&alg_cbc_encrypt_implement("aes",192); |
|||
if ($::evp) { |
|||
&alg_ctr32_implement("aes",256); |
|||
&alg_xts_implement("aes",256,"en"); |
|||
&alg_xts_implement("aes",256,"de"); |
|||
&alg_ctr32_implement("aes",192); |
|||
} |
|||
&alg_cbc_decrypt_implement("aes",192); |
|||
&alg_cbc_decrypt_implement("aes",256); |
|||
|
|||
$code.=<<___; |
|||
.align 32 |
|||
_aes256_decrypt_1x: |
|||
aes_dround01 %f16, %f0, %f2, %f4 |
|||
aes_dround23 %f18, %f0, %f2, %f2 |
|||
ldd [$key + 208], %f16 |
|||
ldd [$key + 216], %f18 |
|||
aes_dround01 %f20, %f4, %f2, %f0 |
|||
aes_dround23 %f22, %f4, %f2, %f2 |
|||
ldd [$key + 224], %f20 |
|||
ldd [$key + 232], %f22 |
|||
___ |
|||
for ($i=1; $i<6; $i++) { |
|||
$code.=<<___; |
|||
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f4 |
|||
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+4`, %f4, %f2, %f0 |
|||
aes_dround23 %f`16+8*$i+6`, %f4, %f2, %f2 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_dround01 %f16, %f0, %f2, %f4 |
|||
aes_dround23 %f18, %f0, %f2, %f2 |
|||
ldd [$key + 16], %f16 |
|||
ldd [$key + 24], %f18 |
|||
aes_dround01_l %f20, %f4, %f2, %f0 |
|||
aes_dround23_l %f22, %f4, %f2, %f2 |
|||
ldd [$key + 32], %f20 |
|||
retl |
|||
ldd [$key + 40], %f22 |
|||
.type _aes256_decrypt_1x,#function |
|||
.size _aes256_decrypt_1x,.-_aes256_decrypt_1x |
|||
|
|||
.align 32 |
|||
_aes256_decrypt_2x: |
|||
aes_dround01 %f16, %f0, %f2, %f8 |
|||
aes_dround23 %f18, %f0, %f2, %f2 |
|||
aes_dround01 %f16, %f4, %f6, %f10 |
|||
aes_dround23 %f18, %f4, %f6, %f6 |
|||
ldd [$key + 208], %f16 |
|||
ldd [$key + 216], %f18 |
|||
aes_dround01 %f20, %f8, %f2, %f0 |
|||
aes_dround23 %f22, %f8, %f2, %f2 |
|||
aes_dround01 %f20, %f10, %f6, %f4 |
|||
aes_dround23 %f22, %f10, %f6, %f6 |
|||
ldd [$key + 224], %f20 |
|||
ldd [$key + 232], %f22 |
|||
___ |
|||
for ($i=1; $i<6; $i++) { |
|||
$code.=<<___; |
|||
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f8 |
|||
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+0`, %f4, %f6, %f10 |
|||
aes_dround23 %f`16+8*$i+2`, %f4, %f6, %f6 |
|||
aes_dround01 %f`16+8*$i+4`, %f8, %f2, %f0 |
|||
aes_dround23 %f`16+8*$i+6`, %f8, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+4`, %f10, %f6, %f4 |
|||
aes_dround23 %f`16+8*$i+6`, %f10, %f6, %f6 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_dround01 %f16, %f0, %f2, %f8 |
|||
aes_dround23 %f18, %f0, %f2, %f2 |
|||
aes_dround01 %f16, %f4, %f6, %f10 |
|||
aes_dround23 %f18, %f4, %f6, %f6 |
|||
ldd [$key + 16], %f16 |
|||
ldd [$key + 24], %f18 |
|||
aes_dround01_l %f20, %f8, %f2, %f0 |
|||
aes_dround23_l %f22, %f8, %f2, %f2 |
|||
aes_dround01_l %f20, %f10, %f6, %f4 |
|||
aes_dround23_l %f22, %f10, %f6, %f6 |
|||
ldd [$key + 32], %f20 |
|||
retl |
|||
ldd [$key + 40], %f22 |
|||
.type _aes256_decrypt_2x,#function |
|||
.size _aes256_decrypt_2x,.-_aes256_decrypt_2x |
|||
|
|||
.align 32 |
|||
_aes192_decrypt_1x: |
|||
___ |
|||
for ($i=0; $i<5; $i++) { |
|||
$code.=<<___; |
|||
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f4 |
|||
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+4`, %f4, %f2, %f0 |
|||
aes_dround23 %f`16+8*$i+6`, %f4, %f2, %f2 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_dround01 %f56, %f0, %f2, %f4 |
|||
aes_dround23 %f58, %f0, %f2, %f2 |
|||
aes_dround01_l %f60, %f4, %f2, %f0 |
|||
retl |
|||
aes_dround23_l %f62, %f4, %f2, %f2 |
|||
.type _aes192_decrypt_1x,#function |
|||
.size _aes192_decrypt_1x,.-_aes192_decrypt_1x |
|||
|
|||
.align 32 |
|||
_aes192_decrypt_2x: |
|||
___ |
|||
for ($i=0; $i<5; $i++) { |
|||
$code.=<<___; |
|||
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f8 |
|||
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+0`, %f4, %f6, %f10 |
|||
aes_dround23 %f`16+8*$i+2`, %f4, %f6, %f6 |
|||
aes_dround01 %f`16+8*$i+4`, %f8, %f2, %f0 |
|||
aes_dround23 %f`16+8*$i+6`, %f8, %f2, %f2 |
|||
aes_dround01 %f`16+8*$i+4`, %f10, %f6, %f4 |
|||
aes_dround23 %f`16+8*$i+6`, %f10, %f6, %f6 |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
aes_dround01 %f56, %f0, %f2, %f8 |
|||
aes_dround23 %f58, %f0, %f2, %f2 |
|||
aes_dround01 %f56, %f4, %f6, %f10 |
|||
aes_dround23 %f58, %f4, %f6, %f6 |
|||
aes_dround01_l %f60, %f8, %f2, %f0 |
|||
aes_dround23_l %f62, %f8, %f2, %f2 |
|||
aes_dround01_l %f60, %f10, %f6, %f4 |
|||
retl |
|||
aes_dround23_l %f62, %f10, %f6, %f6 |
|||
.type _aes192_decrypt_2x,#function |
|||
.size _aes192_decrypt_2x,.-_aes192_decrypt_2x |
|||
___ |
|||
}}} |
|||
|
|||
if (!$::evp) { |
|||
$code.=<<___; |
|||
.global AES_encrypt |
|||
AES_encrypt=aes_t4_encrypt |
|||
.global AES_decrypt |
|||
AES_decrypt=aes_t4_decrypt |
|||
.global AES_set_encrypt_key |
|||
.align 32 |
|||
AES_set_encrypt_key: |
|||
andcc %o2, 7, %g0 ! check alignment |
|||
bnz,a,pn %icc, 1f |
|||
mov -1, %o0 |
|||
brz,a,pn %o0, 1f |
|||
mov -1, %o0 |
|||
brz,a,pn %o2, 1f |
|||
mov -1, %o0 |
|||
andncc %o1, 0x1c0, %g0 |
|||
bnz,a,pn %icc, 1f |
|||
mov -2, %o0 |
|||
cmp %o1, 128 |
|||
bl,a,pn %icc, 1f |
|||
mov -2, %o0 |
|||
b aes_t4_set_encrypt_key |
|||
nop |
|||
1: retl |
|||
nop |
|||
.type AES_set_encrypt_key,#function |
|||
.size AES_set_encrypt_key,.-AES_set_encrypt_key |
|||
|
|||
.global AES_set_decrypt_key |
|||
.align 32 |
|||
AES_set_decrypt_key: |
|||
andcc %o2, 7, %g0 ! check alignment |
|||
bnz,a,pn %icc, 1f |
|||
mov -1, %o0 |
|||
brz,a,pn %o0, 1f |
|||
mov -1, %o0 |
|||
brz,a,pn %o2, 1f |
|||
mov -1, %o0 |
|||
andncc %o1, 0x1c0, %g0 |
|||
bnz,a,pn %icc, 1f |
|||
mov -2, %o0 |
|||
cmp %o1, 128 |
|||
bl,a,pn %icc, 1f |
|||
mov -2, %o0 |
|||
b aes_t4_set_decrypt_key |
|||
nop |
|||
1: retl |
|||
nop |
|||
.type AES_set_decrypt_key,#function |
|||
.size AES_set_decrypt_key,.-AES_set_decrypt_key |
|||
___ |
|||
|
|||
my ($inp,$out,$len,$key,$ivec,$enc)=map("%o$_",(0..5)); |
|||
|
|||
$code.=<<___; |
|||
.globl AES_cbc_encrypt |
|||
.align 32 |
|||
AES_cbc_encrypt: |
|||
ld [$key + 240], %g1 |
|||
nop |
|||
brz $enc, .Lcbc_decrypt |
|||
cmp %g1, 12 |
|||
|
|||
bl,pt %icc, aes128_t4_cbc_encrypt |
|||
nop |
|||
be,pn %icc, aes192_t4_cbc_encrypt |
|||
nop |
|||
ba aes256_t4_cbc_encrypt |
|||
nop |
|||
|
|||
.Lcbc_decrypt: |
|||
bl,pt %icc, aes128_t4_cbc_decrypt |
|||
nop |
|||
be,pn %icc, aes192_t4_cbc_decrypt |
|||
nop |
|||
ba aes256_t4_cbc_decrypt |
|||
nop |
|||
.type AES_cbc_encrypt,#function |
|||
.size AES_cbc_encrypt,.-AES_cbc_encrypt |
|||
___ |
|||
} |
|||
$code.=<<___; |
|||
.asciz "AES for SPARC T4, David S. Miller, Andy Polyakov" |
|||
.align 4 |
|||
___ |
|||
|
|||
&emit_assembler(); |
|||
|
|||
close STDOUT; |
@@ -0,0 +1,962 @@
#!/usr/bin/env perl
#
# ====================================================================
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
# ====================================================================
#
# This module implements support for ARMv8 AES instructions. The
# module is endian-agnostic in sense that it supports both big- and
# little-endian cases. As does it support both 32- and 64-bit modes
# of operation. Latter is achieved by limiting amount of utilized
# registers to 16, which implies additional NEON load and integer
# instructions. This has no effect on mighty Apple A7, where results
# are literally equal to the theoretical estimates based on AES
# instruction latencies and issue rates. On Cortex-A53, an in-order
# execution core, this costs up to 10-15%, which is partially
# compensated by implementing dedicated code path for 128-bit
# CBC encrypt case. On Cortex-A57 parallelizable mode performance
# seems to be limited by sheer amount of NEON instructions...
#
# Performance in cycles per byte processed with 128-bit key:
#
#		CBC enc		CBC dec		CTR
# Apple A7	2.39		1.20		1.20
# Cortex-A53	2.45		1.87		1.94
# Cortex-A57	3.64		1.34		1.32

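For a rough sense of scale, a small Perl sketch converting the CBC-decrypt column of
the table above into approximate throughput; the clock rates are illustrative
assumptions, not measurements from this module:

#!/usr/bin/env perl
# Sketch only: MB/s implied by cycles-per-byte at an assumed clock rate.
my %cbc_dec_cpb = ('Apple A7' => 1.20, 'Cortex-A53' => 1.87, 'Cortex-A57' => 1.34);
my %assumed_ghz = ('Apple A7' => 1.3,  'Cortex-A53' => 1.5,  'Cortex-A57' => 1.9);
for my $cpu (sort keys %cbc_dec_cpb) {
    my $mbs = $assumed_ghz{$cpu} * 1e9 / $cbc_dec_cpb{$cpu} / 1e6;
    printf "%-10s: ~%.0f MB/s CBC decrypt at %.1f GHz (assumed)\n",
           $cpu, $mbs, $assumed_ghz{$cpu};
}
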
$flavour = shift; |
|||
open STDOUT,">".shift; |
|||
|
|||
$prefix="aes_v8"; |
|||
|
|||
$code=<<___; |
|||
#include "arm_arch.h" |
|||
|
|||
#if __ARM_MAX_ARCH__>=7 |
|||
.text |
|||
___ |
|||
$code.=".arch armv8-a+crypto\n" if ($flavour =~ /64/); |
|||
$code.=".arch armv7-a\n.fpu neon\n.code 32\n" if ($flavour !~ /64/); |
|||
#^^^^^^ this is done to simplify adoption by not depending |
|||
# on latest binutils. |
|||
|
|||
# Assembler mnemonics are an eclectic mix of 32- and 64-bit syntax, |
|||
# NEON is mostly 32-bit mnemonics, integer - mostly 64. Goal is to |
|||
# maintain both 32- and 64-bit codes within single module and |
|||
# transliterate common code to either flavour with regex voodoo.
|||
# |
|||
{{{ |
|||
my ($inp,$bits,$out,$ptr,$rounds)=("x0","w1","x2","x3","w12"); |
|||
my ($zero,$rcon,$mask,$in0,$in1,$tmp,$key)= |
|||
$flavour=~/64/? map("q$_",(0..6)) : map("q$_",(0..3,8..10)); |
|||
|
|||
|
|||
$code.=<<___; |
|||
.align 5 |
|||
rcon: |
|||
.long 0x01,0x01,0x01,0x01 |
|||
.long 0x0c0f0e0d,0x0c0f0e0d,0x0c0f0e0d,0x0c0f0e0d // rotate-n-splat |
|||
.long 0x1b,0x1b,0x1b,0x1b |
|||
|
|||
.globl ${prefix}_set_encrypt_key |
|||
.type ${prefix}_set_encrypt_key,%function |
|||
.align 5 |
|||
${prefix}_set_encrypt_key: |
|||
.Lenc_key: |
|||
___ |
|||
$code.=<<___ if ($flavour =~ /64/); |
|||
stp x29,x30,[sp,#-16]! |
|||
add x29,sp,#0 |
|||
___ |
|||
$code.=<<___; |
|||
mov $ptr,#-1 |
|||
cmp $inp,#0 |
|||
b.eq .Lenc_key_abort |
|||
cmp $out,#0 |
|||
b.eq .Lenc_key_abort |
|||
mov $ptr,#-2 |
|||
cmp $bits,#128 |
|||
b.lt .Lenc_key_abort |
|||
cmp $bits,#256 |
|||
b.gt .Lenc_key_abort |
|||
tst $bits,#0x3f |
|||
b.ne .Lenc_key_abort |
|||
|
|||
adr $ptr,rcon |
|||
cmp $bits,#192 |
|||
|
|||
veor $zero,$zero,$zero |
|||
vld1.8 {$in0},[$inp],#16 |
|||
mov $bits,#8 // reuse $bits |
|||
vld1.32 {$rcon,$mask},[$ptr],#32 |
|||
|
|||
b.lt .Loop128 |
|||
b.eq .L192 |
|||
b .L256 |
|||
|
|||
.align 4 |
|||
.Loop128: |
|||
vtbl.8 $key,{$in0},$mask |
|||
vext.8 $tmp,$zero,$in0,#12 |
|||
vst1.32 {$in0},[$out],#16 |
|||
aese $key,$zero |
|||
subs $bits,$bits,#1 |
|||
|
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $key,$key,$rcon |
|||
veor $in0,$in0,$tmp |
|||
vshl.u8 $rcon,$rcon,#1 |
|||
veor $in0,$in0,$key |
|||
b.ne .Loop128 |
|||
|
|||
vld1.32 {$rcon},[$ptr] |
|||
|
|||
vtbl.8 $key,{$in0},$mask |
|||
vext.8 $tmp,$zero,$in0,#12 |
|||
vst1.32 {$in0},[$out],#16 |
|||
aese $key,$zero |
|||
|
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $key,$key,$rcon |
|||
veor $in0,$in0,$tmp |
|||
vshl.u8 $rcon,$rcon,#1 |
|||
veor $in0,$in0,$key |
|||
|
|||
vtbl.8 $key,{$in0},$mask |
|||
vext.8 $tmp,$zero,$in0,#12 |
|||
vst1.32 {$in0},[$out],#16 |
|||
aese $key,$zero |
|||
|
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $key,$key,$rcon |
|||
veor $in0,$in0,$tmp |
|||
veor $in0,$in0,$key |
|||
vst1.32 {$in0},[$out] |
|||
add $out,$out,#0x50 |
|||
|
|||
mov $rounds,#10 |
|||
b .Ldone |
|||
|
|||
.align 4 |
|||
.L192: |
|||
vld1.8 {$in1},[$inp],#8 |
|||
vmov.i8 $key,#8 // borrow $key |
|||
vst1.32 {$in0},[$out],#16 |
|||
vsub.i8 $mask,$mask,$key // adjust the mask |
|||
|
|||
.Loop192: |
|||
vtbl.8 $key,{$in1},$mask |
|||
vext.8 $tmp,$zero,$in0,#12 |
|||
vst1.32 {$in1},[$out],#8 |
|||
aese $key,$zero |
|||
subs $bits,$bits,#1 |
|||
|
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in0,$in0,$tmp |
|||
|
|||
vdup.32 $tmp,${in0}[3] |
|||
veor $tmp,$tmp,$in1 |
|||
veor $key,$key,$rcon |
|||
vext.8 $in1,$zero,$in1,#12 |
|||
vshl.u8 $rcon,$rcon,#1 |
|||
veor $in1,$in1,$tmp |
|||
veor $in0,$in0,$key |
|||
veor $in1,$in1,$key |
|||
vst1.32 {$in0},[$out],#16 |
|||
b.ne .Loop192 |
|||
|
|||
mov $rounds,#12 |
|||
add $out,$out,#0x20 |
|||
b .Ldone |
|||
|
|||
.align 4 |
|||
.L256: |
|||
vld1.8 {$in1},[$inp] |
|||
mov $bits,#7 |
|||
mov $rounds,#14 |
|||
vst1.32 {$in0},[$out],#16 |
|||
|
|||
.Loop256: |
|||
vtbl.8 $key,{$in1},$mask |
|||
vext.8 $tmp,$zero,$in0,#12 |
|||
vst1.32 {$in1},[$out],#16 |
|||
aese $key,$zero |
|||
subs $bits,$bits,#1 |
|||
|
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in0,$in0,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $key,$key,$rcon |
|||
veor $in0,$in0,$tmp |
|||
vshl.u8 $rcon,$rcon,#1 |
|||
veor $in0,$in0,$key |
|||
vst1.32 {$in0},[$out],#16 |
|||
b.eq .Ldone |
|||
|
|||
vdup.32 $key,${in0}[3] // just splat |
|||
vext.8 $tmp,$zero,$in1,#12 |
|||
aese $key,$zero |
|||
|
|||
veor $in1,$in1,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in1,$in1,$tmp |
|||
vext.8 $tmp,$zero,$tmp,#12 |
|||
veor $in1,$in1,$tmp |
|||
|
|||
veor $in1,$in1,$key |
|||
b .Loop256 |
|||
|
|||
.Ldone: |
|||
str $rounds,[$out] |
|||
mov $ptr,#0 |
|||
|
|||
.Lenc_key_abort: |
|||
mov x0,$ptr // return value |
|||
`"ldr x29,[sp],#16" if ($flavour =~ /64/)` |
|||
ret |
|||
.size ${prefix}_set_encrypt_key,.-${prefix}_set_encrypt_key |
|||
|
|||
.globl ${prefix}_set_decrypt_key |
|||
.type ${prefix}_set_decrypt_key,%function |
|||
.align 5 |
|||
${prefix}_set_decrypt_key: |
|||
___ |
|||
$code.=<<___ if ($flavour =~ /64/); |
|||
stp x29,x30,[sp,#-16]! |
|||
add x29,sp,#0 |
|||
___ |
|||
$code.=<<___ if ($flavour !~ /64/); |
|||
stmdb sp!,{r4,lr} |
|||
___ |
|||
$code.=<<___; |
|||
bl .Lenc_key |
|||
|
|||
cmp x0,#0 |
|||
b.ne .Ldec_key_abort |
|||
|
|||
sub $out,$out,#240 // restore original $out |
|||
mov x4,#-16 |
|||
add $inp,$out,x12,lsl#4 // end of key schedule |
|||
|
|||
vld1.32 {v0.16b},[$out] |
|||
vld1.32 {v1.16b},[$inp] |
|||
vst1.32 {v0.16b},[$inp],x4 |
|||
vst1.32 {v1.16b},[$out],#16 |
|||
|
|||
.Loop_imc: |
|||
vld1.32 {v0.16b},[$out] |
|||
vld1.32 {v1.16b},[$inp] |
|||
aesimc v0.16b,v0.16b |
|||
aesimc v1.16b,v1.16b |
|||
vst1.32 {v0.16b},[$inp],x4 |
|||
vst1.32 {v1.16b},[$out],#16 |
|||
cmp $inp,$out |
|||
b.hi .Loop_imc |
|||
|
|||
vld1.32 {v0.16b},[$out] |
|||
aesimc v0.16b,v0.16b |
|||
vst1.32 {v0.16b},[$inp] |
|||
|
|||
eor x0,x0,x0 // return value |
|||
.Ldec_key_abort: |
|||
___ |
|||
$code.=<<___ if ($flavour !~ /64/); |
|||
ldmia sp!,{r4,pc} |
|||
___ |
|||
$code.=<<___ if ($flavour =~ /64/); |
|||
ldp x29,x30,[sp],#16 |
|||
ret |
|||
___ |
|||
$code.=<<___; |
|||
.size ${prefix}_set_decrypt_key,.-${prefix}_set_decrypt_key |
|||
___ |
|||
}}} |
|||
{{{ |
|||
sub gen_block () { |
|||
my $dir = shift; |
|||
my ($e,$mc) = $dir eq "en" ? ("e","mc") : ("d","imc"); |
|||
my ($inp,$out,$key)=map("x$_",(0..2)); |
|||
my $rounds="w3"; |
|||
my ($rndkey0,$rndkey1,$inout)=map("q$_",(0..3)); |
|||
|
|||
$code.=<<___; |
|||
.globl ${prefix}_${dir}crypt |
|||
.type ${prefix}_${dir}crypt,%function |
|||
.align 5 |
|||
${prefix}_${dir}crypt: |
|||
ldr $rounds,[$key,#240] |
|||
vld1.32 {$rndkey0},[$key],#16 |
|||
vld1.8 {$inout},[$inp] |
|||
sub $rounds,$rounds,#2 |
|||
vld1.32 {$rndkey1},[$key],#16 |
|||
|
|||
.Loop_${dir}c: |
|||
aes$e $inout,$rndkey0 |
|||
vld1.32 {$rndkey0},[$key],#16 |
|||
aes$mc $inout,$inout |
|||
subs $rounds,$rounds,#2 |
|||
aes$e $inout,$rndkey1 |
|||
vld1.32 {$rndkey1},[$key],#16 |
|||
aes$mc $inout,$inout |
|||
b.gt .Loop_${dir}c |
|||
|
|||
aes$e $inout,$rndkey0 |
|||
vld1.32 {$rndkey0},[$key] |
|||
aes$mc $inout,$inout |
|||
aes$e $inout,$rndkey1 |
|||
veor $inout,$inout,$rndkey0 |
|||
|
|||
vst1.8 {$inout},[$out] |
|||
ret |
|||
.size ${prefix}_${dir}crypt,.-${prefix}_${dir}crypt |
|||
___ |
|||
} |
|||
&gen_block("en"); |
|||
&gen_block("de"); |
|||
}}} |
|||
{{{ |
|||
my ($inp,$out,$len,$key,$ivp)=map("x$_",(0..4)); my $enc="w5"; |
|||
my ($rounds,$cnt,$key_,$step,$step1)=($enc,"w6","x7","x8","x12"); |
|||
my ($dat0,$dat1,$in0,$in1,$tmp0,$tmp1,$ivec,$rndlast)=map("q$_",(0..7)); |
|||
|
|||
my ($dat,$tmp,$rndzero_n_last)=($dat0,$tmp0,$tmp1); |
|||
|
|||
### q8-q15 preloaded key schedule |
|||
|
|||
$code.=<<___; |
|||
.globl ${prefix}_cbc_encrypt |
|||
.type ${prefix}_cbc_encrypt,%function |
|||
.align 5 |
|||
${prefix}_cbc_encrypt: |
|||
___ |
|||
$code.=<<___ if ($flavour =~ /64/); |
|||
stp x29,x30,[sp,#-16]! |
|||
add x29,sp,#0 |
|||
___ |
|||
$code.=<<___ if ($flavour !~ /64/); |
|||
mov ip,sp |
|||
stmdb sp!,{r4-r8,lr} |
|||
vstmdb sp!,{d8-d15} @ ABI specification says so |
|||
ldmia ip,{r4-r5} @ load remaining args |
|||
___ |
|||
$code.=<<___; |
|||
subs $len,$len,#16 |
|||
mov $step,#16 |
|||
b.lo .Lcbc_abort |
|||
cclr $step,eq |
|||
|
|||
cmp $enc,#0 // en- or decrypting? |
|||
ldr $rounds,[$key,#240] |
|||
and $len,$len,#-16 |
|||
vld1.8 {$ivec},[$ivp] |
|||
vld1.8 {$dat},[$inp],$step |
|||
|
|||
vld1.32 {q8-q9},[$key] // load key schedule... |
|||
sub $rounds,$rounds,#6 |
|||
add $key_,$key,x5,lsl#4 // pointer to last 7 round keys |
|||
sub $rounds,$rounds,#2 |
|||
vld1.32 {q10-q11},[$key_],#32 |
|||
vld1.32 {q12-q13},[$key_],#32 |
|||
vld1.32 {q14-q15},[$key_],#32 |
|||
vld1.32 {$rndlast},[$key_] |
|||
|
|||
add $key_,$key,#32 |
|||
mov $cnt,$rounds |
|||
b.eq .Lcbc_dec |
|||
|
|||
cmp $rounds,#2 |
|||
veor $dat,$dat,$ivec |
|||
veor $rndzero_n_last,q8,$rndlast |
|||
b.eq .Lcbc_enc128 |
|||
|
|||
.Loop_cbc_enc: |
|||
aese $dat,q8 |
|||
vld1.32 {q8},[$key_],#16 |
|||
aesmc $dat,$dat |
|||
subs $cnt,$cnt,#2 |
|||
aese $dat,q9 |
|||
vld1.32 {q9},[$key_],#16 |
|||
aesmc $dat,$dat |
|||
b.gt .Loop_cbc_enc |
|||
|
|||
aese $dat,q8 |
|||
aesmc $dat,$dat |
|||
subs $len,$len,#16 |
|||
aese $dat,q9 |
|||
aesmc $dat,$dat |
|||
cclr $step,eq |
|||
aese $dat,q10 |
|||
aesmc $dat,$dat |
|||
add $key_,$key,#16 |
|||
aese $dat,q11 |
|||
aesmc $dat,$dat |
|||
vld1.8 {q8},[$inp],$step |
|||
aese $dat,q12 |
|||
aesmc $dat,$dat |
|||
veor q8,q8,$rndzero_n_last |
|||
aese $dat,q13 |
|||
aesmc $dat,$dat |
|||
vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1] |
|||
aese $dat,q14 |
|||
aesmc $dat,$dat |
|||
aese $dat,q15 |
|||
|
|||
mov $cnt,$rounds |
|||
veor $ivec,$dat,$rndlast |
|||
vst1.8 {$ivec},[$out],#16 |
|||
b.hs .Loop_cbc_enc |
|||
|
|||
b .Lcbc_done |
|||
|
|||
.align 5 |
|||
.Lcbc_enc128: |
|||
vld1.32 {$in0-$in1},[$key_] |
|||
aese $dat,q8 |
|||
aesmc $dat,$dat |
|||
b .Lenter_cbc_enc128 |
|||
.Loop_cbc_enc128: |
|||
aese $dat,q8 |
|||
aesmc $dat,$dat |
|||
vst1.8 {$ivec},[$out],#16 |
|||
.Lenter_cbc_enc128: |
|||
aese $dat,q9 |
|||
aesmc $dat,$dat |
|||
subs $len,$len,#16 |
|||
aese $dat,$in0 |
|||
aesmc $dat,$dat |
|||
cclr $step,eq |
|||
aese $dat,$in1 |
|||
aesmc $dat,$dat |
|||
aese $dat,q10 |
|||
aesmc $dat,$dat |
|||
aese $dat,q11 |
|||
aesmc $dat,$dat |
|||
vld1.8 {q8},[$inp],$step |
|||
aese $dat,q12 |
|||
aesmc $dat,$dat |
|||
aese $dat,q13 |
|||
aesmc $dat,$dat |
|||
aese $dat,q14 |
|||
aesmc $dat,$dat |
|||
veor q8,q8,$rndzero_n_last |
|||
aese $dat,q15 |
|||
veor $ivec,$dat,$rndlast |
|||
b.hs .Loop_cbc_enc128 |
|||
|
|||
vst1.8 {$ivec},[$out],#16 |
|||
b .Lcbc_done |
|||
___ |
|||
{ |
|||
my ($dat2,$in2,$tmp2)=map("q$_",(10,11,9)); |
|||
$code.=<<___; |
|||
.align 5 |
|||
.Lcbc_dec: |
|||
vld1.8 {$dat2},[$inp],#16 |
|||
subs $len,$len,#32 // bias |
|||
add $cnt,$rounds,#2 |
|||
vorr $in1,$dat,$dat |
|||
vorr $dat1,$dat,$dat |
|||
vorr $in2,$dat2,$dat2 |
|||
b.lo .Lcbc_dec_tail |
|||
|
|||
vorr $dat1,$dat2,$dat2 |
|||
vld1.8 {$dat2},[$inp],#16 |
|||
vorr $in0,$dat,$dat |
|||
vorr $in1,$dat1,$dat1 |
|||
vorr $in2,$dat2,$dat2 |
|||
|
|||
.Loop3x_cbc_dec: |
|||
aesd $dat0,q8 |
|||
aesd $dat1,q8 |
|||
aesd $dat2,q8 |
|||
vld1.32 {q8},[$key_],#16 |
|||
aesimc $dat0,$dat0 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
subs $cnt,$cnt,#2 |
|||
aesd $dat0,q9 |
|||
aesd $dat1,q9 |
|||
aesd $dat2,q9 |
|||
vld1.32 {q9},[$key_],#16 |
|||
aesimc $dat0,$dat0 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
b.gt .Loop3x_cbc_dec |
|||
|
|||
aesd $dat0,q8 |
|||
aesd $dat1,q8 |
|||
aesd $dat2,q8 |
|||
veor $tmp0,$ivec,$rndlast |
|||
aesimc $dat0,$dat0 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
veor $tmp1,$in0,$rndlast |
|||
aesd $dat0,q9 |
|||
aesd $dat1,q9 |
|||
aesd $dat2,q9 |
|||
veor $tmp2,$in1,$rndlast |
|||
subs $len,$len,#0x30 |
|||
aesimc $dat0,$dat0 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
vorr $ivec,$in2,$in2 |
|||
mov.lo x6,$len // x6, $cnt, is zero at this point |
|||
aesd $dat0,q12 |
|||
aesd $dat1,q12 |
|||
aesd $dat2,q12 |
|||
add $inp,$inp,x6 // $inp is adjusted in such way that |
|||
// at exit from the loop $dat1-$dat2 |
|||
// are loaded with last "words" |
|||
aesimc $dat0,$dat0 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
mov $key_,$key |
|||
aesd $dat0,q13 |
|||
aesd $dat1,q13 |
|||
aesd $dat2,q13 |
|||
vld1.8 {$in0},[$inp],#16 |
|||
aesimc $dat0,$dat0 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
vld1.8 {$in1},[$inp],#16 |
|||
aesd $dat0,q14 |
|||
aesd $dat1,q14 |
|||
aesd $dat2,q14 |
|||
vld1.8 {$in2},[$inp],#16 |
|||
aesimc $dat0,$dat0 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
vld1.32 {q8},[$key_],#16 // re-pre-load rndkey[0] |
|||
aesd $dat0,q15 |
|||
aesd $dat1,q15 |
|||
aesd $dat2,q15 |
|||
|
|||
add $cnt,$rounds,#2 |
|||
veor $tmp0,$tmp0,$dat0 |
|||
veor $tmp1,$tmp1,$dat1 |
|||
veor $dat2,$dat2,$tmp2 |
|||
vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1] |
|||
vorr $dat0,$in0,$in0 |
|||
vst1.8 {$tmp0},[$out],#16 |
|||
vorr $dat1,$in1,$in1 |
|||
vst1.8 {$tmp1},[$out],#16 |
|||
vst1.8 {$dat2},[$out],#16 |
|||
vorr $dat2,$in2,$in2 |
|||
b.hs .Loop3x_cbc_dec |
|||
|
|||
cmn $len,#0x30 |
|||
b.eq .Lcbc_done |
|||
nop |
|||
|
|||
.Lcbc_dec_tail: |
|||
aesd $dat1,q8 |
|||
aesd $dat2,q8 |
|||
vld1.32 {q8},[$key_],#16 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
subs $cnt,$cnt,#2 |
|||
aesd $dat1,q9 |
|||
aesd $dat2,q9 |
|||
vld1.32 {q9},[$key_],#16 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
b.gt .Lcbc_dec_tail |
|||
|
|||
aesd $dat1,q8 |
|||
aesd $dat2,q8 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
aesd $dat1,q9 |
|||
aesd $dat2,q9 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
aesd $dat1,q12 |
|||
aesd $dat2,q12 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
cmn $len,#0x20 |
|||
aesd $dat1,q13 |
|||
aesd $dat2,q13 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
veor $tmp1,$ivec,$rndlast |
|||
aesd $dat1,q14 |
|||
aesd $dat2,q14 |
|||
aesimc $dat1,$dat1 |
|||
aesimc $dat2,$dat2 |
|||
veor $tmp2,$in1,$rndlast |
|||
aesd $dat1,q15 |
|||
aesd $dat2,q15 |
|||
b.eq .Lcbc_dec_one |
|||
veor $tmp1,$tmp1,$dat1 |
|||
veor $tmp2,$tmp2,$dat2 |
|||
vorr $ivec,$in2,$in2 |
|||
vst1.8 {$tmp1},[$out],#16 |
|||
vst1.8 {$tmp2},[$out],#16 |
|||
b .Lcbc_done |
|||
|
|||
.Lcbc_dec_one: |
|||
veor $tmp1,$tmp1,$dat2 |
|||
vorr $ivec,$in2,$in2 |
|||
vst1.8 {$tmp1},[$out],#16 |
|||
|
|||
.Lcbc_done: |
|||
vst1.8 {$ivec},[$ivp] |
|||
.Lcbc_abort: |
|||
___ |
|||
} |
|||
$code.=<<___ if ($flavour !~ /64/); |
|||
vldmia sp!,{d8-d15} |
|||
ldmia sp!,{r4-r8,pc} |
|||
___ |
|||
$code.=<<___ if ($flavour =~ /64/); |
|||
ldr x29,[sp],#16 |
|||
ret |
|||
___ |
|||
$code.=<<___; |
|||
.size ${prefix}_cbc_encrypt,.-${prefix}_cbc_encrypt |
|||
___ |
|||
}}} |
|||
{{{ |
|||
my ($inp,$out,$len,$key,$ivp)=map("x$_",(0..4)); |
|||
my ($rounds,$cnt,$key_)=("w5","w6","x7"); |
|||
my ($ctr,$tctr0,$tctr1,$tctr2)=map("w$_",(8..10,12)); |
|||
my $step="x12"; # aliases with $tctr2 |
|||
|
|||
my ($dat0,$dat1,$in0,$in1,$tmp0,$tmp1,$ivec,$rndlast)=map("q$_",(0..7)); |
|||
my ($dat2,$in2,$tmp2)=map("q$_",(10,11,9)); |
|||
|
|||
my ($dat,$tmp)=($dat0,$tmp0); |
|||
|
|||
### q8-q15 preloaded key schedule |
|||
|
|||
$code.=<<___; |
|||
.globl ${prefix}_ctr32_encrypt_blocks |
|||
.type ${prefix}_ctr32_encrypt_blocks,%function |
|||
.align 5 |
|||
${prefix}_ctr32_encrypt_blocks: |
|||
___ |
|||
$code.=<<___ if ($flavour =~ /64/); |
|||
stp x29,x30,[sp,#-16]! |
|||
add x29,sp,#0 |
|||
___ |
|||
$code.=<<___ if ($flavour !~ /64/); |
|||
mov ip,sp |
|||
stmdb sp!,{r4-r10,lr} |
|||
vstmdb sp!,{d8-d15} @ ABI specification says so |
|||
ldr r4, [ip] @ load remaining arg |
|||
___ |
|||
$code.=<<___; |
|||
ldr $rounds,[$key,#240] |
|||
|
|||
ldr $ctr, [$ivp, #12] |
|||
vld1.32 {$dat0},[$ivp] |
|||
|
|||
vld1.32 {q8-q9},[$key] // load key schedule... |
|||
sub $rounds,$rounds,#4 |
|||
mov $step,#16 |
|||
cmp $len,#2 |
|||
add $key_,$key,x5,lsl#4 // pointer to last 5 round keys |
|||
sub $rounds,$rounds,#2 |
|||
vld1.32 {q12-q13},[$key_],#32 |
|||
vld1.32 {q14-q15},[$key_],#32 |
|||
vld1.32 {$rndlast},[$key_] |
|||
add $key_,$key,#32 |
|||
mov $cnt,$rounds |
|||
cclr $step,lo |
|||
#ifndef __ARMEB__ |
|||
rev $ctr, $ctr |
|||
#endif |
|||
vorr $dat1,$dat0,$dat0 |
|||
add $tctr1, $ctr, #1 |
|||
vorr $dat2,$dat0,$dat0 |
|||
add $ctr, $ctr, #2 |
|||
vorr $ivec,$dat0,$dat0 |
|||
rev $tctr1, $tctr1 |
|||
vmov.32 ${dat1}[3],$tctr1 |
|||
b.ls .Lctr32_tail |
|||
rev $tctr2, $ctr |
|||
sub $len,$len,#3 // bias |
|||
vmov.32 ${dat2}[3],$tctr2 |
|||
b .Loop3x_ctr32 |
|||
|
|||
.align 4 |
|||
.Loop3x_ctr32: |
|||
aese $dat0,q8 |
|||
aese $dat1,q8 |
|||
aese $dat2,q8 |
|||
vld1.32 {q8},[$key_],#16 |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
aesmc $dat2,$dat2 |
|||
subs $cnt,$cnt,#2 |
|||
aese $dat0,q9 |
|||
aese $dat1,q9 |
|||
aese $dat2,q9 |
|||
vld1.32 {q9},[$key_],#16 |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
aesmc $dat2,$dat2 |
|||
b.gt .Loop3x_ctr32 |
|||
|
|||
aese $dat0,q8 |
|||
aese $dat1,q8 |
|||
aese $dat2,q8 |
|||
mov $key_,$key |
|||
aesmc $tmp0,$dat0 |
|||
vld1.8 {$in0},[$inp],#16 |
|||
aesmc $tmp1,$dat1 |
|||
aesmc $dat2,$dat2 |
|||
vorr $dat0,$ivec,$ivec |
|||
aese $tmp0,q9 |
|||
vld1.8 {$in1},[$inp],#16 |
|||
aese $tmp1,q9 |
|||
aese $dat2,q9 |
|||
vorr $dat1,$ivec,$ivec |
|||
aesmc $tmp0,$tmp0 |
|||
vld1.8 {$in2},[$inp],#16 |
|||
aesmc $tmp1,$tmp1 |
|||
aesmc $tmp2,$dat2 |
|||
vorr $dat2,$ivec,$ivec |
|||
add $tctr0,$ctr,#1 |
|||
aese $tmp0,q12 |
|||
aese $tmp1,q12 |
|||
aese $tmp2,q12 |
|||
veor $in0,$in0,$rndlast |
|||
add $tctr1,$ctr,#2 |
|||
aesmc $tmp0,$tmp0 |
|||
aesmc $tmp1,$tmp1 |
|||
aesmc $tmp2,$tmp2 |
|||
veor $in1,$in1,$rndlast |
|||
add $ctr,$ctr,#3 |
|||
aese $tmp0,q13 |
|||
aese $tmp1,q13 |
|||
aese $tmp2,q13 |
|||
veor $in2,$in2,$rndlast |
|||
rev $tctr0,$tctr0 |
|||
aesmc $tmp0,$tmp0 |
|||
vld1.32 {q8},[$key_],#16 // re-pre-load rndkey[0] |
|||
aesmc $tmp1,$tmp1 |
|||
aesmc $tmp2,$tmp2 |
|||
vmov.32 ${dat0}[3], $tctr0 |
|||
rev $tctr1,$tctr1 |
|||
aese $tmp0,q14 |
|||
aese $tmp1,q14 |
|||
aese $tmp2,q14 |
|||
vmov.32 ${dat1}[3], $tctr1 |
|||
rev $tctr2,$ctr |
|||
aesmc $tmp0,$tmp0 |
|||
aesmc $tmp1,$tmp1 |
|||
aesmc $tmp2,$tmp2 |
|||
vmov.32 ${dat2}[3], $tctr2 |
|||
subs $len,$len,#3 |
|||
aese $tmp0,q15 |
|||
aese $tmp1,q15 |
|||
aese $tmp2,q15 |
|||
|
|||
mov $cnt,$rounds |
|||
veor $in0,$in0,$tmp0 |
|||
veor $in1,$in1,$tmp1 |
|||
veor $in2,$in2,$tmp2 |
|||
vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1] |
|||
vst1.8 {$in0},[$out],#16 |
|||
vst1.8 {$in1},[$out],#16 |
|||
vst1.8 {$in2},[$out],#16 |
|||
b.hs .Loop3x_ctr32 |
|||
|
|||
adds $len,$len,#3 |
|||
b.eq .Lctr32_done |
|||
cmp $len,#1 |
|||
mov $step,#16 |
|||
cclr $step,eq |
|||
|
|||
.Lctr32_tail: |
|||
aese $dat0,q8 |
|||
aese $dat1,q8 |
|||
vld1.32 {q8},[$key_],#16 |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
subs $cnt,$cnt,#2 |
|||
aese $dat0,q9 |
|||
aese $dat1,q9 |
|||
vld1.32 {q9},[$key_],#16 |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
b.gt .Lctr32_tail |
|||
|
|||
aese $dat0,q8 |
|||
aese $dat1,q8 |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
aese $dat0,q9 |
|||
aese $dat1,q9 |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
vld1.8 {$in0},[$inp],$step |
|||
aese $dat0,q12 |
|||
aese $dat1,q12 |
|||
vld1.8 {$in1},[$inp] |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
aese $dat0,q13 |
|||
aese $dat1,q13 |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
aese $dat0,q14 |
|||
aese $dat1,q14 |
|||
veor $in0,$in0,$rndlast |
|||
aesmc $dat0,$dat0 |
|||
aesmc $dat1,$dat1 |
|||
veor $in1,$in1,$rndlast |
|||
aese $dat0,q15 |
|||
aese $dat1,q15 |
|||
|
|||
cmp $len,#1 |
|||
veor $in0,$in0,$dat0 |
|||
veor $in1,$in1,$dat1 |
|||
vst1.8 {$in0},[$out],#16 |
|||
b.eq .Lctr32_done |
|||
vst1.8 {$in1},[$out] |
|||
|
|||
.Lctr32_done: |
|||
___ |
|||
$code.=<<___ if ($flavour !~ /64/); |
|||
vldmia sp!,{d8-d15} |
|||
ldmia sp!,{r4-r10,pc} |
|||
___ |
|||
$code.=<<___ if ($flavour =~ /64/); |
|||
ldr x29,[sp],#16 |
|||
ret |
|||
___ |
|||
$code.=<<___; |
|||
.size ${prefix}_ctr32_encrypt_blocks,.-${prefix}_ctr32_encrypt_blocks |
|||
___ |
|||
}}} |
|||
$code.=<<___; |
|||
#endif |
|||
___ |
|||
######################################## |
|||
if ($flavour =~ /64/) { ######## 64-bit code |
|||
my %opcode = ( |
|||
"aesd" => 0x4e285800, "aese" => 0x4e284800, |
|||
"aesimc"=> 0x4e287800, "aesmc" => 0x4e286800 ); |
|||
|
|||
local *unaes = sub { |
|||
my ($mnemonic,$arg)=@_; |
|||
|
|||
$arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)/o && |
|||
sprintf ".inst\t0x%08x\t//%s %s", |
|||
$opcode{$mnemonic}|$1|($2<<5), |
|||
$mnemonic,$arg; |
|||
}; |
|||
|
|||
foreach(split("\n",$code)) { |
|||
s/\`([^\`]*)\`/eval($1)/geo; |
|||
|
|||
s/\bq([0-9]+)\b/"v".($1<8?$1:$1+8).".16b"/geo; # old->new registers |
|||
s/@\s/\/\//o; # old->new style commentary |
|||
|
|||
#s/[v]?(aes\w+)\s+([qv].*)/unaes($1,$2)/geo or |
|||
s/cclr\s+([wx])([^,]+),\s*([a-z]+)/csel $1$2,$1zr,$1$2,$3/o or |
|||
s/mov\.([a-z]+)\s+([wx][0-9]+),\s*([wx][0-9]+)/csel $2,$3,$2,$1/o or |
|||
s/vmov\.i8/movi/o or # fix up legacy mnemonics |
|||
s/vext\.8/ext/o or |
|||
s/vrev32\.8/rev32/o or |
|||
s/vtst\.8/cmtst/o or |
|||
s/vshr/ushr/o or |
|||
s/^(\s+)v/$1/o or # strip off v prefix |
|||
s/\bbx\s+lr\b/ret/o; |
|||
|
|||
# fix up remaining legacy suffixes
|||
s/\.[ui]?8//o; |
|||
m/\],#8/o and s/\.16b/\.8b/go; |
|||
s/\.[ui]?32//o and s/\.16b/\.4s/go; |
|||
s/\.[ui]?64//o and s/\.16b/\.2d/go; |
|||
s/\.[42]([sd])\[([0-3])\]/\.$1\[$2\]/o; |
|||
|
|||
print $_,"\n"; |
|||
} |
|||
} else { ######## 32-bit code |
|||
my %opcode = ( |
|||
"aesd" => 0xf3b00340, "aese" => 0xf3b00300, |
|||
"aesimc"=> 0xf3b003c0, "aesmc" => 0xf3b00380 ); |
|||
|
|||
local *unaes = sub { |
|||
my ($mnemonic,$arg)=@_; |
|||
|
|||
if ($arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)/o) { |
|||
my $word = $opcode{$mnemonic}|(($1&7)<<13)|(($1&8)<<19) |
|||
|(($2&7)<<1) |(($2&8)<<2); |
|||
# since ARMv7 instructions are always encoded little-endian. |
|||
# correct solution is to use .inst directive, but older |
|||
# assemblers don't implement it:-( |
|||
sprintf ".byte\t0x%02x,0x%02x,0x%02x,0x%02x\t@ %s %s", |
|||
$word&0xff,($word>>8)&0xff, |
|||
($word>>16)&0xff,($word>>24)&0xff, |
|||
$mnemonic,$arg; |
|||
} |
|||
}; |
|||
|
|||
sub unvtbl { |
|||
my $arg=shift; |
|||
|
|||
$arg =~ m/q([0-9]+),\s*\{q([0-9]+)\},\s*q([0-9]+)/o && |
|||
sprintf "vtbl.8 d%d,{q%d},d%d\n\t". |
|||
"vtbl.8 d%d,{q%d},d%d", 2*$1,$2,2*$3, 2*$1+1,$2,2*$3+1; |
|||
} |
|||
|
|||
sub unvdup32 { |
|||
my $arg=shift; |
|||
|
|||
$arg =~ m/q([0-9]+),\s*q([0-9]+)\[([0-3])\]/o && |
|||
sprintf "vdup.32 q%d,d%d[%d]",$1,2*$2+($3>>1),$3&1; |
|||
} |
|||
|
|||
sub unvmov32 { |
|||
my $arg=shift; |
|||
|
|||
$arg =~ m/q([0-9]+)\[([0-3])\],(.*)/o && |
|||
sprintf "vmov.32 d%d[%d],%s",2*$1+($2>>1),$2&1,$3; |
|||
} |
|||
|
|||
foreach(split("\n",$code)) { |
|||
s/\`([^\`]*)\`/eval($1)/geo; |
|||
|
|||
s/\b[wx]([0-9]+)\b/r$1/go; # new->old registers |
|||
s/\bv([0-9])\.[12468]+[bsd]\b/q$1/go; # new->old registers |
|||
s/\/\/\s?/@ /o; # new->old style commentary |
|||
|
|||
# fix up remaining new-style suffixes
|||
s/\{q([0-9]+)\},\s*\[(.+)\],#8/sprintf "{d%d},[$2]!",2*$1/eo or |
|||
s/\],#[0-9]+/]!/o; |
|||
|
|||
s/[v]?(aes\w+)\s+([qv].*)/unaes($1,$2)/geo or |
|||
s/cclr\s+([^,]+),\s*([a-z]+)/mov$2 $1,#0/o or |
|||
s/vtbl\.8\s+(.*)/unvtbl($1)/geo or |
|||
s/vdup\.32\s+(.*)/unvdup32($1)/geo or |
|||
s/vmov\.32\s+(.*)/unvmov32($1)/geo or |
|||
s/^(\s+)b\./$1b/o or |
|||
s/^(\s+)mov\./$1mov/o or |
|||
s/^(\s+)ret/$1bx\tlr/o; |
|||
|
|||
print $_,"\n"; |
|||
} |
|||
} |
|||
|
|||
close STDOUT; |
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,46 @@
#include "arm_arch.h"

.text
.arch	armv8-a+crypto

.align	5
.global	_armv7_neon_probe
.type	_armv7_neon_probe,%function
_armv7_neon_probe:
	orr	v15.16b, v15.16b, v15.16b
	ret
.size	_armv7_neon_probe,.-_armv7_neon_probe

.global	_armv7_tick
.type	_armv7_tick,%function
_armv7_tick:
	mrs	x0, CNTVCT_EL0
	ret
.size	_armv7_tick,.-_armv7_tick

.global	_armv8_aes_probe
.type	_armv8_aes_probe,%function
_armv8_aes_probe:
	aese	v0.16b, v0.16b
	ret
.size	_armv8_aes_probe,.-_armv8_aes_probe

.global	_armv8_sha1_probe
.type	_armv8_sha1_probe,%function
_armv8_sha1_probe:
	sha1h	s0, s0
	ret
.size	_armv8_sha1_probe,.-_armv8_sha1_probe

.global	_armv8_sha256_probe
.type	_armv8_sha256_probe,%function
_armv8_sha256_probe:
	sha256su0	v0.4s, v0.4s
	ret
.size	_armv8_sha256_probe,.-_armv8_sha256_probe
.global	_armv8_pmull_probe
.type	_armv8_pmull_probe,%function
_armv8_pmull_probe:
	pmull	v0.1q, v0.1d, v0.1d
	ret
.size	_armv8_pmull_probe,.-_armv8_pmull_probe
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,190 @@
#!/usr/bin/env perl
#
# ====================================================================
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
# ====================================================================
#
# October 2012
#
# The module implements bn_GF2m_mul_2x2 polynomial multiplication used
# in bn_gf2m.c. It's kind of low-hanging mechanical port from C for
# the time being... Except that it has two code paths: one suitable
# for all SPARCv9 processors and one for VIS3-capable ones. Former
# delivers ~25-45% more, more for longer keys, heaviest DH and DSA
# verify operations on venerable UltraSPARC II. On T4 VIS3 code is
# ~100-230% faster than gcc-generated code and ~35-90% faster than
# the pure SPARCv9 code path.

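A bit-at-a-time Perl reference (a sketch only, not part of the generated module) for
the carry-less GF(2)[x] product at the heart of bn_GF2m_mul_2x2, handy for checking
the assembly against known vectors; Math::BigInt keeps the 128-bit result exact:

#!/usr/bin/env perl
use strict;
use Math::BigInt;

# Multiply two 64-bit polynomials over GF(2): XOR a shifted copy of $y into
# the result for every set bit of $x; no carries between bit positions.
sub clmul64 {
    my ($x, $y) = map { Math::BigInt->new($_) } @_;
    my $r = Math::BigInt->bzero();
    for my $i (0 .. 63) {
        $r->bxor($y->copy->blsft($i))
            if $x->copy->brsft($i)->band(Math::BigInt->bone)->is_one;
    }
    return $r;    # up to 127-bit carry-less product
}

printf "%s\n", clmul64("0x3", "0x3")->as_hex();    # prints 0x5: (x+1)^2 = x^2+1
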
$locals=16*8; |
|||
|
|||
$tab="%l0"; |
|||
|
|||
@T=("%g2","%g3"); |
|||
@i=("%g4","%g5"); |
|||
|
|||
($a1,$a2,$a4,$a8,$a12,$a48)=map("%o$_",(0..5)); |
|||
($lo,$hi,$b)=("%g1",$a8,"%o7"); $a=$lo; |
|||
|
|||
$code.=<<___; |
|||
#include <sparc_arch.h> |
|||
|
|||
#ifdef __arch64__ |
|||
.register %g2,#scratch |
|||
.register %g3,#scratch |
|||
#endif |
|||
|
|||
#ifdef __PIC__ |
|||
SPARC_PIC_THUNK(%g1) |
|||
#endif |
|||
|
|||
.globl bn_GF2m_mul_2x2 |
|||
.align 16 |
|||
bn_GF2m_mul_2x2: |
|||
SPARC_LOAD_ADDRESS_LEAF(OPENSSL_sparcv9cap_P,%g1,%g5) |
|||
ld [%g1+0],%g1 ! OPENSSL_sparcv9cap_P[0] |
|||
|
|||
andcc %g1, SPARCV9_VIS3, %g0 |
|||
bz,pn %icc,.Lsoftware |
|||
nop |
|||
|
|||
sllx %o1, 32, %o1 |
|||
sllx %o3, 32, %o3 |
|||
or %o2, %o1, %o1 |
|||
or %o4, %o3, %o3 |
|||
.word 0x95b262ab ! xmulx %o1, %o3, %o2 |
|||
.word 0x99b262cb ! xmulxhi %o1, %o3, %o4 |
|||
srlx %o2, 32, %o1 ! 13 cycles later |
|||
st %o2, [%o0+0] |
|||
st %o1, [%o0+4] |
|||
srlx %o4, 32, %o3 |
|||
st %o4, [%o0+8] |
|||
retl |
|||
st %o3, [%o0+12] |
|||
|
|||
.align 16 |
|||
.Lsoftware: |
|||
save %sp,-STACK_FRAME-$locals,%sp |
|||
|
|||
sllx %i1,32,$a |
|||
mov -1,$a12 |
|||
sllx %i3,32,$b |
|||
or %i2,$a,$a |
|||
srlx $a12,1,$a48 ! 0x7fff... |
|||
or %i4,$b,$b |
|||
srlx $a12,2,$a12 ! 0x3fff... |
|||
add %sp,STACK_BIAS+STACK_FRAME,$tab |
|||
|
|||
sllx $a,2,$a4 |
|||
mov $a,$a1 |
|||
sllx $a,1,$a2 |
|||
|
|||
srax $a4,63,@i[1] ! broadcast 61st bit |
|||
and $a48,$a4,$a4 ! (a<<2)&0x7fff... |
|||
srlx $a48,2,$a48 |
|||
srax $a2,63,@i[0] ! broadcast 62nd bit |
|||
and $a12,$a2,$a2 ! (a<<1)&0x3fff... |
|||
srax $a1,63,$lo ! broadcast 63rd bit |
|||
and $a48,$a1,$a1 ! (a<<0)&0x1fff... |
|||
|
|||
sllx $a1,3,$a8 |
|||
and $b,$lo,$lo |
|||
and $b,@i[0],@i[0] |
|||
and $b,@i[1],@i[1] |
|||
|
|||
stx %g0,[$tab+0*8] ! tab[0]=0 |
|||
xor $a1,$a2,$a12 |
|||
stx $a1,[$tab+1*8] ! tab[1]=a1 |
|||
stx $a2,[$tab+2*8] ! tab[2]=a2 |
|||
xor $a4,$a8,$a48 |
|||
stx $a12,[$tab+3*8] ! tab[3]=a1^a2 |
|||
xor $a4,$a1,$a1 |
|||
|
|||
stx $a4,[$tab+4*8] ! tab[4]=a4 |
|||
xor $a4,$a2,$a2 |
|||
stx $a1,[$tab+5*8] ! tab[5]=a1^a4 |
|||
xor $a4,$a12,$a12 |
|||
stx $a2,[$tab+6*8] ! tab[6]=a2^a4 |
|||
xor $a48,$a1,$a1 |
|||
stx $a12,[$tab+7*8] ! tab[7]=a1^a2^a4 |
|||
xor $a48,$a2,$a2 |
|||
|
|||
stx $a8,[$tab+8*8] ! tab[8]=a8 |
|||
xor $a48,$a12,$a12 |
|||
stx $a1,[$tab+9*8] ! tab[9]=a1^a8 |
|||
xor $a4,$a1,$a1 |
|||
stx $a2,[$tab+10*8] ! tab[10]=a2^a8 |
|||
xor $a4,$a2,$a2 |
|||
stx $a12,[$tab+11*8] ! tab[11]=a1^a2^a8 |
|||
|
|||
xor $a4,$a12,$a12 |
|||
stx $a48,[$tab+12*8] ! tab[12]=a4^a8 |
|||
srlx $lo,1,$hi |
|||
stx $a1,[$tab+13*8] ! tab[13]=a1^a4^a8 |
|||
sllx $lo,63,$lo |
|||
stx $a2,[$tab+14*8] ! tab[14]=a2^a4^a8 |
|||
srlx @i[0],2,@T[0] |
|||
stx $a12,[$tab+15*8] ! tab[15]=a1^a2^a4^a8 |
|||
|
|||
sllx @i[0],62,$a1 |
|||
sllx $b,3,@i[0] |
|||
srlx @i[1],3,@T[1] |
|||
and @i[0],`0xf<<3`,@i[0] |
|||
sllx @i[1],61,$a2 |
|||
ldx [$tab+@i[0]],@i[0] |
|||
srlx $b,4-3,@i[1] |
|||
xor @T[0],$hi,$hi |
|||
and @i[1],`0xf<<3`,@i[1] |
|||
xor $a1,$lo,$lo |
|||
ldx [$tab+@i[1]],@i[1] |
|||
xor @T[1],$hi,$hi |
|||
|
|||
xor @i[0],$lo,$lo |
|||
srlx $b,8-3,@i[0] |
|||
xor $a2,$lo,$lo |
|||
and @i[0],`0xf<<3`,@i[0] |
|||
___ |
|||
for($n=1;$n<14;$n++) { |
|||
$code.=<<___; |
|||
sllx @i[1],`$n*4`,@T[0] |
|||
ldx [$tab+@i[0]],@i[0] |
|||
srlx @i[1],`64-$n*4`,@T[1] |
|||
xor @T[0],$lo,$lo |
|||
srlx $b,`($n+2)*4`-3,@i[1] |
|||
xor @T[1],$hi,$hi |
|||
and @i[1],`0xf<<3`,@i[1] |
|||
___ |
|||
push(@i,shift(@i)); push(@T,shift(@T)); |
|||
} |
|||
$code.=<<___; |
|||
sllx @i[1],`$n*4`,@T[0] |
|||
ldx [$tab+@i[0]],@i[0] |
|||
srlx @i[1],`64-$n*4`,@T[1] |
|||
xor @T[0],$lo,$lo |
|||
|
|||
sllx @i[0],`($n+1)*4`,@T[0] |
|||
xor @T[1],$hi,$hi |
|||
srlx @i[0],`64-($n+1)*4`,@T[1] |
|||
xor @T[0],$lo,$lo |
|||
xor @T[1],$hi,$hi |
|||
|
|||
srlx $lo,32,%i1 |
|||
st $lo,[%i0+0] |
|||
st %i1,[%i0+4] |
|||
srlx $hi,32,%i2 |
|||
st $hi,[%i0+8] |
|||
st %i2,[%i0+12] |
|||
|
|||
ret |
|||
restore |
|||
.type bn_GF2m_mul_2x2,#function |
|||
.size bn_GF2m_mul_2x2,.-bn_GF2m_mul_2x2 |
|||
.asciz "GF(2^m) Multiplication for SPARCv9, CRYPTOGAMS by <appro\@openssl.org>" |
|||
.align 4 |
|||
___ |
|||
|
|||
$code =~ s/\`([^\`]*)\`/eval($1)/gem; |
|||
print $code; |
|||
close STDOUT; |
@ -0,0 +1,373 @@ |
|||
#!/usr/bin/env perl |
|||
|
|||
# ==================================================================== |
|||
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL |
|||
# project. The module is, however, dual licensed under OpenSSL and |
|||
# CRYPTOGAMS licenses depending on where you obtain it. For further |
|||
# details see http://www.openssl.org/~appro/cryptogams/. |
|||
# ==================================================================== |
|||
|
|||
# October 2012. |
|||
# |
|||
# SPARCv9 VIS3 Montgomery multiplication procedure suitable for T3 and
|||
# onward. There are three new instructions used here: umulxhi, |
|||
# addxc[cc] and initializing store. On T3 RSA private key operations |
|||
# are 1.54/1.87/2.11/2.26 times faster for 512/1024/2048/4096-bit key |
|||
# lengths. This is without dedicated squaring procedure. On T4 |
|||
# corresponding coefficients are 1.47/2.10/2.80/2.90x, which is mostly |
|||
# for reference purposes, because T4 has dedicated Montgomery |
|||
# multiplication and squaring *instructions* that deliver even more. |
|||
|
|||
$bits=32; |
|||
for (@ARGV) { $bits=64 if (/\-m64/ || /\-xarch\=v9/); } |
|||
if ($bits==64) { $bias=2047; $frame=192; } |
|||
else { $bias=0; $frame=112; } |
|||
|
|||
$code.=<<___ if ($bits==64); |
|||
.register %g2,#scratch |
|||
.register %g3,#scratch |
|||
___ |
|||
$code.=<<___; |
|||
.section ".text",#alloc,#execinstr |
|||
___ |
|||
|
|||
($n0,$m0,$m1,$lo0,$hi0, $lo1,$hi1,$aj,$alo,$nj,$nlo,$tj)= |
|||
(map("%g$_",(1..5)),map("%o$_",(0..5,7))); |
|||
|
|||
# int bn_mul_mont( |
|||
$rp="%o0"; # BN_ULONG *rp, |
|||
$ap="%o1"; # const BN_ULONG *ap, |
|||
$bp="%o2"; # const BN_ULONG *bp, |
|||
$np="%o3"; # const BN_ULONG *np, |
|||
$n0p="%o4"; # const BN_ULONG *n0, |
|||
$num="%o5"; # int num); # caller ensures that num is even |
|||
# and >=6 |
|||
$code.=<<___; |
|||
.globl bn_mul_mont_vis3 |
|||
.align 32 |
|||
bn_mul_mont_vis3: |
|||
add %sp, $bias, %g4 ! real top of stack |
|||
sll $num, 2, $num ! size in bytes |
|||
add $num, 63, %g5 |
|||
andn %g5, 63, %g5 ! buffer size rounded up to 64 bytes |
|||
add %g5, %g5, %g1 |
|||
add %g5, %g1, %g1 ! 3*buffer size |
|||
sub %g4, %g1, %g1 |
|||
andn %g1, 63, %g1 ! align at 64 byte |
|||
sub %g1, $frame, %g1 ! new top of stack |
|||
sub %g1, %g4, %g1 |
|||
|
|||
save %sp, %g1, %sp |
|||
___ |
|||
|
|||
# +-------------------------------+<----- %sp |
|||
# . . |
|||
# +-------------------------------+<----- aligned at 64 bytes |
|||
# | __int64 tmp[0] | |
|||
# +-------------------------------+ |
|||
# . . |
|||
# . . |
|||
# +-------------------------------+<----- aligned at 64 bytes |
|||
# | __int64 ap[1..0] | converted ap[] |
|||
# +-------------------------------+ |
|||
# | __int64 np[1..0] | converted np[] |
|||
# +-------------------------------+ |
|||
# | __int64 ap[3..2] | |
|||
# . . |
|||
# . . |
|||
# +-------------------------------+ |
|||
($rp,$ap,$bp,$np,$n0p,$num)=map("%i$_",(0..5)); |
|||
($t0,$t1,$t2,$t3,$cnt,$tp,$bufsz,$anp)=map("%l$_",(0..7)); |
|||
($ovf,$i)=($t0,$t1); |
|||
$code.=<<___; |
|||
ld [$n0p+0], $t0 ! pull n0[0..1] value |
|||
add %sp, $bias+$frame, $tp |
|||
ld [$n0p+4], $t1 |
|||
add $tp, %g5, $anp |
|||
ld [$bp+0], $t2 ! m0=bp[0] |
|||
sllx $t1, 32, $n0 |
|||
ld [$bp+4], $t3 |
|||
or $t0, $n0, $n0 |
|||
add $bp, 8, $bp |
|||
|
|||
ld [$ap+0], $t0 ! ap[0] |
|||
sllx $t3, 32, $m0 |
|||
ld [$ap+4], $t1 |
|||
or $t2, $m0, $m0 |
|||
|
|||
ld [$ap+8], $t2 ! ap[1] |
|||
sllx $t1, 32, $aj |
|||
ld [$ap+12], $t3 |
|||
or $t0, $aj, $aj |
|||
add $ap, 16, $ap |
|||
stxa $aj, [$anp]0xe2 ! converted ap[0] |
|||
|
|||
mulx $aj, $m0, $lo0 ! ap[0]*bp[0] |
|||
umulxhi $aj, $m0, $hi0 |
|||
|
|||
ld [$np+0], $t0 ! np[0] |
|||
sllx $t3, 32, $aj |
|||
ld [$np+4], $t1 |
|||
or $t2, $aj, $aj |
|||
|
|||
ld [$np+8], $t2 ! np[1] |
|||
sllx $t1, 32, $nj |
|||
ld [$np+12], $t3 |
|||
or $t0, $nj, $nj |
|||
add $np, 16, $np |
|||
stx $nj, [$anp+8] ! converted np[0] |
|||
|
|||
mulx $lo0, $n0, $m1 ! "tp[0]"*n0 |
|||
stx $aj, [$anp+16] ! converted ap[1] |
|||
|
|||
mulx $aj, $m0, $alo ! ap[1]*bp[0] |
|||
umulxhi $aj, $m0, $aj ! ahi=aj |
|||
|
|||
mulx $nj, $m1, $lo1 ! np[0]*m1 |
|||
umulxhi $nj, $m1, $hi1 |
|||
|
|||
sllx $t3, 32, $nj |
|||
or $t2, $nj, $nj |
|||
stx $nj, [$anp+24] ! converted np[1] |
|||
add $anp, 32, $anp |
|||
|
|||
addcc $lo0, $lo1, $lo1 |
|||
addxc %g0, $hi1, $hi1 |
|||
|
|||
mulx $nj, $m1, $nlo ! np[1]*m1 |
|||
umulxhi $nj, $m1, $nj ! nhi=nj |
|||
|
|||
ba .L1st |
|||
sub $num, 24, $cnt ! cnt=num-3 |
|||
|
|||
.align 16 |
|||
.L1st: |
|||
ld [$ap+0], $t0 ! ap[j] |
|||
addcc $alo, $hi0, $lo0 |
|||
ld [$ap+4], $t1 |
|||
addxc $aj, %g0, $hi0 |
|||
|
|||
sllx $t1, 32, $aj |
|||
add $ap, 8, $ap |
|||
or $t0, $aj, $aj |
|||
stxa $aj, [$anp]0xe2 ! converted ap[j] |
|||
|
|||
ld [$np+0], $t2 ! np[j] |
|||
addcc $nlo, $hi1, $lo1 |
|||
ld [$np+4], $t3 |
|||
addxc $nj, %g0, $hi1 ! nhi=nj |
|||
|
|||
sllx $t3, 32, $nj |
|||
add $np, 8, $np |
|||
mulx $aj, $m0, $alo ! ap[j]*bp[0] |
|||
or $t2, $nj, $nj |
|||
umulxhi $aj, $m0, $aj ! ahi=aj |
|||
stx $nj, [$anp+8] ! converted np[j] |
|||
add $anp, 16, $anp ! anp++ |
|||
|
|||
mulx $nj, $m1, $nlo ! np[j]*m1 |
|||
addcc $lo0, $lo1, $lo1 ! np[j]*m1+ap[j]*bp[0] |
|||
umulxhi $nj, $m1, $nj ! nhi=nj |
|||
addxc %g0, $hi1, $hi1 |
|||
stxa $lo1, [$tp]0xe2 ! tp[j-1] |
|||
add $tp, 8, $tp ! tp++ |
|||
|
|||
brnz,pt $cnt, .L1st |
|||
sub $cnt, 8, $cnt ! j-- |
|||
!.L1st |
|||
addcc $alo, $hi0, $lo0 |
|||
addxc $aj, %g0, $hi0 ! ahi=aj |
|||
|
|||
addcc $nlo, $hi1, $lo1 |
|||
addxc $nj, %g0, $hi1 |
|||
addcc $lo0, $lo1, $lo1 ! np[j]*m1+ap[j]*bp[0] |
|||
addxc %g0, $hi1, $hi1 |
|||
stxa $lo1, [$tp]0xe2 ! tp[j-1] |
|||
add $tp, 8, $tp |
|||
|
|||
addcc $hi0, $hi1, $hi1 |
|||
addxc %g0, %g0, $ovf ! upmost overflow bit |
|||
stxa $hi1, [$tp]0xe2 |
|||
add $tp, 8, $tp |
|||
|
|||
ba .Louter |
|||
sub $num, 16, $i ! i=num-2 |
|||
|
|||
.align 16 |
|||
.Louter: |
|||
ld [$bp+0], $t2 ! m0=bp[i] |
|||
ld [$bp+4], $t3 |
|||
|
|||
sub $anp, $num, $anp ! rewind |
|||
sub $tp, $num, $tp |
|||
sub $anp, $num, $anp |
|||
|
|||
add $bp, 8, $bp |
|||
sllx $t3, 32, $m0 |
|||
ldx [$anp+0], $aj ! ap[0] |
|||
or $t2, $m0, $m0 |
|||
ldx [$anp+8], $nj ! np[0] |
|||
|
|||
mulx $aj, $m0, $lo0 ! ap[0]*bp[i] |
|||
ldx [$tp], $tj ! tp[0] |
|||
umulxhi $aj, $m0, $hi0 |
|||
ldx [$anp+16], $aj ! ap[1] |
|||
addcc $lo0, $tj, $lo0 ! ap[0]*bp[i]+tp[0] |
|||
mulx $aj, $m0, $alo ! ap[1]*bp[i] |
|||
addxc %g0, $hi0, $hi0 |
|||
mulx $lo0, $n0, $m1 ! tp[0]*n0 |
|||
umulxhi $aj, $m0, $aj ! ahi=aj |
|||
mulx $nj, $m1, $lo1 ! np[0]*m1 |
|||
umulxhi $nj, $m1, $hi1 |
|||
ldx [$anp+24], $nj ! np[1] |
|||
add $anp, 32, $anp |
|||
addcc $lo1, $lo0, $lo1 |
|||
mulx $nj, $m1, $nlo ! np[1]*m1 |
|||
addxc %g0, $hi1, $hi1 |
|||
umulxhi $nj, $m1, $nj ! nhi=nj |
|||
|
|||
ba .Linner |
|||
sub $num, 24, $cnt ! cnt=num-3 |
|||
.align 16 |
|||
.Linner: |
|||
addcc $alo, $hi0, $lo0 |
|||
ldx [$tp+8], $tj ! tp[j] |
|||
addxc $aj, %g0, $hi0 ! ahi=aj |
|||
ldx [$anp+0], $aj ! ap[j] |
|||
addcc $nlo, $hi1, $lo1 |
|||
mulx $aj, $m0, $alo ! ap[j]*bp[i] |
|||
addxc $nj, %g0, $hi1 ! nhi=nj |
|||
ldx [$anp+8], $nj ! np[j] |
|||
add $anp, 16, $anp |
|||
umulxhi $aj, $m0, $aj ! ahi=aj |
|||
addcc $lo0, $tj, $lo0 ! ap[j]*bp[i]+tp[j] |
|||
mulx $nj, $m1, $nlo ! np[j]*m1 |
|||
addxc %g0, $hi0, $hi0 |
|||
umulxhi $nj, $m1, $nj ! nhi=nj |
|||
addcc $lo1, $lo0, $lo1 ! np[j]*m1+ap[j]*bp[i]+tp[j] |
|||
addxc %g0, $hi1, $hi1 |
|||
stx $lo1, [$tp] ! tp[j-1] |
|||
add $tp, 8, $tp |
|||
brnz,pt $cnt, .Linner |
|||
sub $cnt, 8, $cnt |
|||
!.Linner |
|||
ldx [$tp+8], $tj ! tp[j] |
|||
addcc $alo, $hi0, $lo0 |
|||
addxc $aj, %g0, $hi0 ! ahi=aj |
|||
addcc $lo0, $tj, $lo0 ! ap[j]*bp[i]+tp[j] |
|||
addxc %g0, $hi0, $hi0 |
|||
|
|||
addcc $nlo, $hi1, $lo1 |
|||
addxc $nj, %g0, $hi1 ! nhi=nj |
|||
addcc $lo1, $lo0, $lo1 ! np[j]*m1+ap[j]*bp[i]+tp[j] |
|||
addxc %g0, $hi1, $hi1 |
|||
stx $lo1, [$tp] ! tp[j-1] |
|||
|
|||
subcc %g0, $ovf, %g0 ! move upmost overflow to CCR.xcc |
|||
addxccc $hi1, $hi0, $hi1 |
|||
addxc %g0, %g0, $ovf |
|||
stx $hi1, [$tp+8] |
|||
add $tp, 16, $tp |
|||
|
|||
brnz,pt $i, .Louter |
|||
sub $i, 8, $i |
|||
|
|||
sub $anp, $num, $anp ! rewind |
|||
sub $tp, $num, $tp |
|||
sub $anp, $num, $anp |
|||
ba .Lsub |
|||
subcc $num, 8, $cnt ! cnt=num-1 and clear CCR.xcc |
|||
|
|||
.align 16 |
|||
.Lsub: |
|||
ldx [$tp], $tj |
|||
add $tp, 8, $tp |
|||
ldx [$anp+8], $nj |
|||
add $anp, 16, $anp |
|||
subccc $tj, $nj, $t2 ! tp[j]-np[j] |
|||
srlx $tj, 32, $tj |
|||
srlx $nj, 32, $nj |
|||
subccc $tj, $nj, $t3 |
|||
add $rp, 8, $rp |
|||
st $t2, [$rp-4] ! reverse order |
|||
st $t3, [$rp-8] |
|||
brnz,pt $cnt, .Lsub |
|||
sub $cnt, 8, $cnt |
|||
|
|||
sub $anp, $num, $anp ! rewind |
|||
sub $tp, $num, $tp |
|||
sub $anp, $num, $anp |
|||
sub $rp, $num, $rp |
|||
|
|||
subc $ovf, %g0, $ovf ! handle upmost overflow bit |
|||
and $tp, $ovf, $ap |
|||
andn $rp, $ovf, $np |
|||
or $np, $ap, $ap ! ap=borrow?tp:rp |
|||
ba .Lcopy |
|||
sub $num, 8, $cnt |
|||
|
|||
.align 16 |
|||
.Lcopy: ! copy or in-place refresh |
|||
ld [$ap+0], $t2 |
|||
ld [$ap+4], $t3 |
|||
add $ap, 8, $ap |
|||
stx %g0, [$tp] ! zap |
|||
add $tp, 8, $tp |
|||
stx %g0, [$anp] ! zap |
|||
stx %g0, [$anp+8] |
|||
add $anp, 16, $anp |
|||
st $t3, [$rp+0] ! flip order |
|||
st $t2, [$rp+4] |
|||
add $rp, 8, $rp |
|||
brnz $cnt, .Lcopy |
|||
sub $cnt, 8, $cnt |
|||
|
|||
mov 1, %o0 |
|||
ret |
|||
restore |
|||
.type bn_mul_mont_vis3, #function |
|||
.size bn_mul_mont_vis3, .-bn_mul_mont_vis3 |
|||
.asciz "Montgomery Multiplication for SPARCv9 VIS3, CRYPTOGAMS by <appro\@openssl.org>" |
|||
.align 4 |
|||
___ |
|||
|
|||
# Purpose of these subroutines is to explicitly encode VIS instructions, |
|||
# so that one can compile the module without having to specify VIS |
|||
# extensions on the compiler command line, e.g. -xarch=v9 vs. -xarch=v9a.
|||
# The idea is to preserve the option of producing a "universal" binary and let the
|||
# programmer detect at run-time whether the current CPU is VIS-capable; a
# worked encoding example follows the subroutine below.
|||
sub unvis3 { |
|||
my ($mnemonic,$rs1,$rs2,$rd)=@_; |
|||
my %bias = ( "g" => 0, "o" => 8, "l" => 16, "i" => 24 ); |
|||
my ($ref,$opf); |
|||
my %visopf = ( "addxc" => 0x011, |
|||
"addxccc" => 0x013, |
|||
"umulxhi" => 0x016 ); |
|||
|
|||
$ref = "$mnemonic\t$rs1,$rs2,$rd"; |
|||
|
|||
if ($opf=$visopf{$mnemonic}) { |
|||
foreach ($rs1,$rs2,$rd) { |
|||
return $ref if (!/%([goli])([0-9])/); |
|||
$_=$bias{$1}+$2; |
|||
} |
|||
|
|||
return sprintf ".word\t0x%08x !%s", |
|||
0x81b00000|$rd<<25|$rs1<<14|$opf<<5|$rs2, |
|||
$ref; |
|||
} else { |
|||
return $ref; |
|||
} |
|||
} |
|||
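#
# For illustration (hypothetical input, not necessarily present in the
# generated code): given the line "umulxhi %o1,%o3,%o4", the substitution
# below maps the registers to rs1=9, rs2=11, rd=12 with opf=0x016, so the
# emitted line would be
#
#	.word	0x99b242cb !umulxhi	%o1,%o3,%o4
#
# The same encoding, as a small C helper mirroring the formula above:
#
#	/* requires <stdint.h> */
#	static uint32_t vis3_word(uint32_t opf, uint32_t rs1, uint32_t rs2,
#	                          uint32_t rd)
#	{
#		return 0x81b00000u | rd<<25 | rs1<<14 | opf<<5 | rs2;
#	}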
|
|||
foreach (split("\n",$code)) { |
|||
s/\`([^\`]*)\`/eval $1/ge; |
|||
|
|||
s/\b(umulxhi|addxc[c]{0,2})\s+(%[goli][0-7]),\s*(%[goli][0-7]),\s*(%[goli][0-7])/ |
|||
&unvis3($1,$2,$3,$4) |
|||
/ge; |
|||
|
|||
print $_,"\n"; |
|||
} |
|||
|
|||
close STDOUT; |
File diff suppressed because it is too large
File diff suppressed because it is too large
Some files were not shown because too many files changed in this diff