nmav's Blog

Saturday, February 18, 2012

The need for SSH-like authentication in TLS

After the Diginotar CA compromise it is apparent that verifying web sites using only a trusted certificate authority (CA) is not sufficient. Currently a web site's certificate is verified against the CA that issued it and checked for revocation using the OCSP server the CA set up. If the CA is trusted by the user, this process protects against man-in-the-middle attacks when visiting such a web-site, and also against leakage of the web-sites private key (e.g. via OCSP as long as the leakage is reported to the CA). This is an automatic process that does not require the user to be involved, but it comes at a cost. There is a single channel for verification, the CA.

The certificate based web-site verification tries to convert the trust we have in a merchant to the digital world. That is currently done by having "few" authorities that provide certificates to the merchants and based on those certificates we should base our decisions. However, trust in trade requires more than that. For example wouldn't it raise suspicions, a laptop from a merchant who approached you in a parking lot and provided you with a legitimate looking brand card? Would his business or credentials card be the only thing to check? Those heuristics are unfortunately not currently available in the digital world.

If it wasn't for the Chrome browser that implemented a preliminary version of key pinning, the Diginotar compromise might not have been uncovered. In the maliciously-issued certificates, the automatic verification procedure was not reporting any problems. It was the change in public key that triggered the key pinning warning and eventually to the Diginotar issue becoming public. Thus having an additional verification channel to standard PKI proved to be a success.

Key pinning, as of the 01 draft, is mostly server driven. That is, the user has very little control on trusting a particular server key if the server operator doesn't ask to. Another approach is the SSH programs' typical authentication method. The trust on first use. That describes the concept where the public key of the peer is not verified, or verified out-of-bound, but subsequent connections to the same peer require the public key to remain the same. That approach has the advantage that doesn't depend on the server operator setting up anything, but also the disadvantage that the user will be bugged every time the server changes its public key.

In any case having such methods to complement standard certificate verification and revocation checks, provides an additional layer of security. With such a layer, a CA compromise would not be enough for a large-scale man-in-the-middle attack since changes in the public keys would be detected and users would be warned. Such warnings might not be entirely successful in preventing all users from continuing but would raise suspicions for the legitimacy of the server, which might be enough.

For that, we implemented in GnuTLS a framework to be used either by applications that want to use key pinning, or a trust on first use (TOFU) authentication. That consists of three helper functions that store and verify the public keys. They can be seen in the GnuTLS manual. The included program gnutls-cli has also been modified to support a hybrid authentication that includes certificate authentication, TOFU and OCSP if the --tofu and --ocsp arguments are specified. The main idea of its operation is the idea discussed here and is shown in this example code at the manual.

Sunday, January 1, 2012

Do we need elliptic curve point compression?

GnuTLS has recently added support for elliptic curves (ECDSA and Elliptic curve Diffie-Hellman). Elliptic curves are an improvement on public key technologies, mostly in efficiency because they require operations with much smaller integers for equivalent security parameters to RSA or DH. For example a 1248-bit RSA key corresponds to an 160-bit ECDSA key, saving both in space required to store parameters and in time spent for calculations.

However elliptic curve technology is considered a patent minefield and quite some attempts have been made to clarify the status, e.g. IETF made an attempt with RFC6090 to describe only the fundamental algorithms on which all patents are known to have expired. The GnuTLS implementation of ECC is based on this document.

One of the interesting "improvements" that is not included in the fundamental algorithms, and is still patented is point compression. To describe it we need a small (but not detailed) introduction to elliptic curves. An elliptic curve (at least for the needs of TLS protocol) is a set of points (x,y) that satisfy the following relation modulo a large prime p.

y² = x³ + ax + b

This set of points has some properties, i.e., a group law allowing point addition, scalar multiplication etc. Without getting into details those points need to be exchanged during a protocol run and given that the points usually belong to a "large" prime field (e.g. 256-bits), reducing the exchanged bytes is an improvement. Point compression achieves that goal, so let's see how.

Say we have a point (x₀,y₀) satisfying y₀² = x₀³ + ax₀ + b. If x₀ is known, then the previous relation becomes a second degree polynomial, that has two solutions.

y_1,2 = ±√ x³ + ax + b

And surprisingly that's all about point compression. Instead of sending the tuple (x₀,y₀), only x₀ and a bit identifying the sign of y is sent. The receiver only needs to calculate y_1,2 and then select the appropriate using the sign bit. Of course calculating square roots modulo a prime isn't as simple as in our example, but the idea is similar.

So is this worthwhile? Are implementations like GnuTLS that don't infringe this patent in a disadvantage? While it is a clever idea, it doesn't really add much value in real-world protocols. A 256-bit curve has points of around 32 bytes. Thus the bytes saved per TLS handshake are 64 bytes overall. Such handshake typically contains data of several kilobytes, and saving few bytes isn't going to make a dramatic difference. It wouldn't be a top priority addition even if it wasn't a patented method.

Thursday, December 15, 2011

Self-modifying code using GCC

One of my research topics last year was self-modifying code mainly for obfuscation. Having seen how self-modification is implemented for a variety of programs, I could say that most existing techniques implement the self-modification in assembly, or in the most high level case, in C using inline assembly.

I'm not a good assembly programmer so I always try to move things to "high-level" C. Unfortunately I haven't managed to implement self modification with standard C, but can be done by using an interesting GCC extension. That is the label to pointer or '&&' extension. This provides the address of a C label as a pointer. Using that we can implement self modification without the use of assembler. The idea is demonstrated in the code snippet below.

One of its problems is that if the labels are not used in the code they are removed by the GCC optimizer (even with -O0) and the address to pointer just returns a dummy value. For that the labels have to be used in dummy code. The better the optimizer in GCC becomes the harder the work around. In this test we use the value of argc (any other external value would do) to ensure the labels stay put.

The code was inspired by a self-modifying code example using inline assembly, but unfortunately I can no longer find it in order to provide proper references.

/* 
 * A self-modifying code snippet that uses C
 * and the GCC label to pointer (&&) extension.
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <stdint.h>

int main(int argc, char **argv)
{
 int (*my_printf) (const char *format, ...);
 void (*my_exit) (int);
 void *page =
     (void *) ((unsigned long) (&&checkpoint) &
        ~(getpagesize() - 1));

 /* mark the code section we are going to overwrite
  * as writable.
  */
 mprotect(page, getpagesize(), PROT_READ | PROT_WRITE | PROT_EXEC);

 /* Use the labels to avoid having GCC
  * optimize them out */
 switch (argc) {
   case 33:
     goto checkpoint;
   case 44:
     goto newcode;
   case 55:
     goto newcode_end;
   default:
     break;
 }

 /* Replace code in checkpoint with code from
  * newcode.
  */
 memcpy(&&checkpoint, &&newcode, &&newcode_end - &&newcode);

checkpoint:
 printf("Good morning!\n");
 return 1;

newcode:
 my_printf = &printf;
 (*(my_printf)) ("Good evening\n");

 my_exit = &exit;
 (*(my_exit)) (0);

newcode_end:
 return 2;
}

Dedicated to the anonymous referee who insisted into adding this example to our article.

Monday, December 12, 2011

Generating Diffie-Hellman parameters

Starting with gnutls 3.0 the Diffie-Hellman parameter generation has been changed. That was mandated by the move from libgcrypt to nettle. Nettle didn't support Diffie-Hellman parameter generation, so I had to implement it within gnutls. For that I generate a prime of the form p=2wq+1, where w and q are also primes and q has the size in bits of the security parameter (could be 160,256 etc. bits, based on the size of p). Then I search for a generator of the q subgroup using an algorithm which typically gives a large generator --few bits less than the size of p.

This method has the advantage that a server when selecting a private key value x, instead of selecting 0 < x < p-1, it can select a smaller x within the multiplicative subgroup of order q, i.e., 0 < x < q-1. The security level of x is that of the security parameter, which in gnutls is calculated as in ECRYPT recommendations. However until now we never wrote the size of the security parameter in the Diffie-Hellman parameters file, so it was impossible for the server to guess the order of q, since the PKCS #3 file only contained the generator and the prime. However PKCS #3 has a privateValueLength field exactly for this purpose, but it is not used by gnutls or any other implementation I'm aware of.

DHParameter ::= SEQUENCE {
  prime INTEGER, -- p
  base INTEGER, -- g
  privateValueLength INTEGER OPTIONAL 
}

By populating and using it, the performance improvement was quite impressive. The following table demonstrates that.

Prime length	Private key length	Transactions/sec with DHE-RSA
1248	1248	122.75
1248	160	189.91

So starting from 3.0.9 gnutls' generated parameters for Diffie-Hellman should perform better. However one question that I had to answer, is what is more important, keeping x small as we do, or having a small generator? Libgcrypt and openssl generate parameters in a way that the generator is kept small, e.g. g=5 or g=2. For that I generated 1248 bit parameters using openssl which happened to have g=2.

Prime length	Generator length	Private key length	Transactions/sec with DHE-RSA
1248	1246 bits	1248	122.75
1248	1246 bits	160	189.91
1248	2 bits	1248	125.94

So it seems keeping the generator small doesn't really have an impact to performance comparing to using a smaller but still secure subgroup.

Thursday, December 8, 2011

The price to pay for perfect-forward secrecy

[EDIT: This article was written in 2011 and reflects the facts at the time; fortunately, since then the state of the Internet has improved, and PFS ciphersuites have become the norm]

Ok, the question seems to be what is perfect forward secrecy? Perfect forward secrecy (PFS) is a property of a cryptographic protocol, such as SSL (or better TLS). That property ensures that once the cryptographic keys of a participant are leaked the secrecy of past sessions remains. For example if the amazon.com private key is leaked your previous transactions with this site remain secure. That is a very interesting protocol property, but is actually only effective if the web site wouldn't store any of the sensitive transaction data. In this post we ignore any issues due to storage and focus on the protocol property. So does TLS have this property? That is a yes and a no.

TLS provides the option to use ciphersuites offering perfect forward secrecy. Those are the ciphersuites that use ephemeral Diffie-Hellman and elliptic curve Diffie-Hellman authentication. However those come at a cost very few web sites are willing to pay for. They involve expensive calculations that put some burden on the server and delay the transaction for few milliseconds. Why would a server bother to protect your data anyway? And that seems to be the current attitude. Most servers totally prohibit the usage of ciphersuites that offer PFS. Try for example to connect to amazon.com using:

$ ./gnutls-cli www.amazon.com --priority NORMAL:-KX-ALL:+DHE-RSA:+ECDHE-RSA
Resolving 'www.amazon.com'...
Connecting to '72.21.214.128:443'...
*** Fatal error: A TLS fatal alert has been received.
*** Received alert [40]: Handshake failed

So let's try to evaluate the cost of PFS versus the plain RSA ciphersuites that do not offer PFS, using a simple approach initially. For that we use Diffie-Hellman group parameters of 1024 bits, a 192-bit elliptic curve and a 1024-bit RSA key and a modified gnutls-cli for its benchmark-tls option.

Key exchange	Parameters	Transactions/sec
DHE-RSA	1024-bit RSA key, 1024-bit DH parameters	345.53
ECDHE-RSA	1024-bit RSA key, 192-bit ECDH parameters	604.92
ECDHE-ECDSA	192-bit ECDSA key, 192-bit ECDH parameters	595.84
RSA	1024-bit RSA key	994.59

In this test the ciphersuites DHE-RSA, ECDHE-RSA and ECDHE-ECDSA provide perfect forward secrecy, and use a signed ephemeral (Diffie-Hellman) key exchange. The signature in the key exchange is an RSA or ECDSA one, determined by the ciphersuite and the certificate. The plain RSA ciphersuite doesn't offer PFS and utilizes RSA's encryption capability (instead of signing). We can see that the non-PFS ciphersuite is a clear winner with 3x factor from DHE_RSA ciphersuite. The elliptic curve equivalents of DHE also fail to reach plain RSA's performance.

However, our security levels are not consistent. An elliptic curve of 192-bits provides 96-bits of security according to ECRYPT (don't confuse the bits of an algorithm such as RSA with the bits of the security level, a security level of 96-bits is a good level of security for today's standards). That means it is equivalent to roughly 1776-bits RSA and DH parameters. So let's do the same experiment with more consistent parameters.

Key exchange	Parameters	Transactions/sec
DHE-RSA	1776-bit RSA key, 1776-bit DH parameters	98.26
ECDHE-RSA	1776-bit RSA key, 192-bit ECDH parameters	352.41
ECDHE-ECDSA	192-bit ECDSA key, 192-bit ECDH parameters	595.84
RSA	1776-bit RSA key	460.08

So it is obvious that increasing the security level to 96-bits degrades performance for all ciphersuites (except ECDHE-ECDSA which already offered that level of security). The Diffie-Hellman key exchange is very slow over a 1776-bit group, and in addition it includes an RSA signature of 1776-bits. The Elliptic curve equivalent (ECDHE-RSA) is much more efficient, even though it also includes a 1776-bit RSA signature. The clear winner though is the ECDHE-ECDSA variant which uses an ECDSA public key and outperforms even plain RSA.

Nevertheless, if we restrict to RSA keys that are supported by almost every implementation, the winner is plain RSA, with a small but not insignificant difference from the second.

So although I'd suggest ciphersuites offering PFS for almost everything, in practice, when ECDSA keys are not option, they degrade performance, and this should not be underestimated. When it is desired to have PFS, then its shortcomings have to be evaluated against the benefits.

On the other hand if a decent security level is required (such as the 96-bit one used in our example), we see that the switch from plain RSA to ECDHE-ECDSA provides both efficiency and forward secrecy.

So given the fact that any web server may at some point be compromised, my suggestion would be, that if you value the exchanged data in the transactions, only PFS ciphersuites should be used. Otherwise, using the plain RSA ciphersuite may suit better.

PS. Note that protecting the server's private key on the disk using PIN or passwords is futile as the key resides unencrypted in the server's memory.

Thursday, November 24, 2011

Enhancing privacy in comments

FHEO or "For human eyes only" is a Firefox browser plugin with the goal of enhancing privacy in comments made in social networks or forums. The idea is to counter privacy issues seen mainly in social networks, by using (distorted) images instead of plain text. The two issues it mainly defends against:

Expiration of comments;
Prevention of comments being indexed by web crawlers.

Most social networks do not allow deleting old posts or comments, and if they do it is not an automatic process. Moreover those comments can be easily gathered by a crawler such as google and be used to associate persons based on written comments, political ideas etc.

This is something my colleague Andreas Pashalidis didn't like and proposed a master thesis on a firefox plugin that would counter those issues. This thesis was completed by Xavi Ferrer and Beñat Bermejo under our supervision and resulted to a firefox plugin prototype. Later the plugin and distortions have been enhanced by Andreas (and to a tiny degree me), and now a fully functional plugin is available under the Apache License 2.0.

We'd appreciate your comments and ideas.

Tuesday, September 27, 2011

Ovrimos was a greek company making an RDBMS. They tried to compete with giants like Oracle and microsoft SQL, but failed. Having the guts to try was enough for me. They are part of history right now and I'm pretty proud I was a tiny part of it. The history of the company is nicely described by one of the lead engineers at that time.