Monday, May 9, 2016

An overview of the new features in GnuTLS 3.5.0

A few minutes ago I released GnuTLS 3.5.0. This is the stable-next branch of GnuTLS, which will replace the stable GnuTLS 3.4.x branch within a year. It is fully backwards compatible and comes with several new features, the most prominent of which I'll summarize later in this post.

However, before going over the features, let me describe the current trends in the applied cryptography field, to provide an idea of the issues considered during development and to give context for the included changes. If you are really impatient to see the features, jump to the last section.

Non-NIST algorithms

After the Dual EC-DRBG fiasco, requests to stop relying blindly on NIST-approved or standardized algorithms for the Internet infrastructure became louder and louder (for those new to the crypto field, NIST is the USA's National Institute of Standards and Technology). Even though NIST's standardization work has certainly aided Internet security technologies, the Snowden revelations that the NSA, via NIST, had pushed for the backdoor-able-by-design EC-DRBG random generator (RNG), and required standard-compliant applications to include the backdoor, made a general distrust apparent in organizations like the IETF and IRTF. Furthermore, more public scrutiny of NSA's contributions followed. You can see a nice example in the reactions to NSA's attempt to standardize extensions to the TLS protocol which would expose more state of the generator; an "improvement" which would have enabled the EC-DRBG backdoor to operate more efficiently under TLS.

Given the above, several proposals were made to no longer rely on NIST's recommendations for elliptic curve cryptography or otherwise; that is, neither their elliptic curve parameters nor their standardized random generators, etc. The first to propose alternative curves to the IETF was the German BSI, which proposed the Brainpool curves. Despite their more open design, they didn't receive much attention from implementers. The most recent proposal to replace the NIST curves comes from the Crypto Forum Research Group (CFRG) and proposes two curves, curve25519 and curve448. These, in addition to not being proposed by NIST, allow very fast implementations on modern systems and can be implemented to operate in constant time, something of significant value for digital signatures generated by servers. These curves are being considered for addition in the next iteration of the TLS elliptic curve document for key exchange, and there is also a proposal to use them for PKIX/X.509 certificate digital signatures under the EdDSA signature scheme.

For non-NIST symmetric cipher replacements the story is a bit different. The NIST-standardized AES algorithm is still believed to be a very strong cipher, and will stay with us for quite a long time, especially since it is already part of the x86-64 CPU instruction set. However, for CPUs that do not have that instruction set, encryption performance is not particularly impressive. That, combined with the GCM authenticated-encryption construction common in TLS, which cannot be easily optimized without a specific instruction set (e.g., PCLMUL) being present, puts certain systems at a disadvantage. Before RC4 was known to be completely broken, it was the cipher to switch your TLS connection to on such systems. However, after RC4 became a cipher to display in a museum, a new choice was needed. I wrote about the need for it back in 2013, and it seems today we are very close to having Chacha20 with Poly1305 as authenticator standardized for use in the TLS protocol. That is an authenticated-encryption construction defined in RFC7539, a construction that can outperform both RC4 and AES on hardware without any cipher-specific acceleration.


Note that in this non-NIST trend, we in GnuTLS attempt to stay neutral and decide our next steps on a case-by-case basis. Not everything coming from or standardized by NIST is bad, and algorithms not standardized by NIST do not automatically become more secure. Things like EC-DRBG, for example, were never part of GnuTLS not because we disliked NIST, but because that design didn't make sense for a random generator at all.

No more CBC

TLS from its first incarnation used a flawed CBC construction, which led to several flaws over the years, the most prominent being the Lucky13 attack. The protocol in its subsequent updates (TLS 1.1 and 1.2) never fixed these flaws, and instead required implementers to provide constant-time TLS CBC pad decoding, a task which proved to be notoriously hard, as indicated by the latest OpenSSL issue. While there have been claims that some implementations are better than others, that is beside the point here. The simple task of reading the CBC padding bytes, which would normally be 2-3 lines of code, requires tens of lines of code (if not more), and even more extensive testing for correctness and time invariance. The more code a protocol requires, the more mistakes (remember that better engineers make fewer mistakes, but they still make mistakes). The protocol is at fault, and the path taken in its next revision, TLS 1.3, is to completely drop the CBC mode. RFC7366 proposes a way to continue using the CBC ciphersuites correctly, which would pose no future issues, but the TLS 1.3 revision team thought it better to move away completely from something that has caused so many issues historically.
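
To make that asymmetry concrete, here is a rough, hypothetical sketch in C (this is not GnuTLS code, and it is simplified to the padding check alone; a real implementation must additionally hide the MAC computation, which is where most of the extra code and testing goes). The naive version is short but leaks timing through its early exits; the constant-time flavour has to touch a fixed amount of data and accumulate its verdict in a mask:

    #include <stddef.h>
    #include <stdint.h>

    /* Naive check: short, but the early returns leak how far the padding matched. */
    static int check_pad_naive(const uint8_t *rec, size_t len)
    {
        uint8_t pad = rec[len - 1];
        if ((size_t)pad + 1 > len)
                return -1;
        for (size_t i = 0; i < pad; i++)
                if (rec[len - 2 - i] != pad)
                        return -1;
        return 0;
    }

    /* Constant-time flavour (still simplified): scan a fixed window and fold
     * all mismatches into a mask, independently of the padding value. */
    static int check_pad_ct(const uint8_t *rec, size_t len)
    {
        uint8_t pad = rec[len - 1];
        uint8_t bad = ((size_t)pad + 1 > len) ? 0xff : 0;
        for (size_t i = 1; i < 256 && i < len; i++) {
                uint8_t in_pad = (uint8_t)(0 - (uint8_t)(i <= pad)); /* 0xff inside the padding */
                bad |= in_pad & (rec[len - 1 - i] ^ pad);
        }
        return bad != 0 ? -1 : 0;
    }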

In addition to that, it seems that all the CBC ciphersuites have been banned from use under HTTP/2.0, even when used under TLS 1.2. That means that applications talking HTTP/2.0 will have to disable such ciphersuites or they may fail to interoperate. This move effectively obsoletes the RFC7366 fix for CBC (already implemented in GnuTLS).


Cryptographically speaking, the CBC mode is a perfectly valid mode when used as prescribed. Unfortunately, when TLS 1.0 was being designed, the prescription was not widely available or known, and as such it got into the protocol following the "wrong" prescription. As it is now, and with all the bad PR around it, it seems we are going to say goodbye to it quite soon.

For those emotionally tied to the CBC mode, like me (after all, it's a nice mode to implement), I'd like to note that it will still live with us under the CCM ciphersuites, but in a different role: it now serves as a MAC algorithm. For the non-crypto geeks, CCM is an authenticated encryption mode using counter mode for encryption and CBC-MAC for authentication; it is widely used in the IoT, an acronym for the 'Internet of Things' buzzword, which typically refers to small embedded devices.




Latency reduction


One of the TLS 1.3 design goals, according to its charter, is to "reduce handshake latency, ..., aiming for one roundtrip for a full handshake and one or zero roundtrip for repeated handshakes". That effort was initiated and widely tested by Google with the TLS false start protocol, which reduced the handshake latency to a single roundtrip, and its benefits were further demonstrated with the QUIC protocol. The latter is an attempt to provide multiple connections over a stream in user space, avoiding any kernel interaction for de-multiplexing or congestion avoidance. It comes with its own secure communications protocol, which integrates security into the transport layer, as opposed to running a generic secure communications protocol over a transport layer. That's certainly an interesting approach, and it remains to be seen whether we will be hearing more of it.

While I was initially skeptical of modifications to existing cryptographic protocols to achieve low latency, after all such modifications reduce the security guarantees (see this paper for a theoretical attack which can benefit from false start), the requirement for secure communications with low latency is here to stay. Even though the drive to reduce latency for HTTP communication may not be convincing for everyone, one cannot but imagine a future where high-latency scenarios like this are the norm, and low-roundtrip secure communications protocols are required.

Post-quantum cryptography

It has been known for several years that a potential quantum computer could break cryptographic algorithms like RSA and Diffie-Hellman, as well as the elliptic curve cryptosystems. It was unknown whether a quantum computer of the size required to break existing cryptosystems could ever exist, but research conducted in the last few years provides indications that this is a possibility. NIST hosted a conference on the topic last year, where the NSA expressed their plan to prepare for a post-quantum computer future. That is, they no longer believe that elliptic curve cryptography, i.e., their Suite B profile, is a solution which will be applicable long-term. The reason is that, due to their short keys, the elliptic curve algorithms require a smaller quantum computer to break than their finite field counterparts (RSA and Diffie-Hellman). Ironically, it is easier to protect the classical finite field algorithms from quantum computers by increasing the key size (e.g., to 8k or 16k keys) than their more modern counterparts based on elliptic curves.

Other than increasing key sizes, today we don't have many tools (i.e., algorithms) to protect key exchanges or digital signatures against a quantum computer future. By the time a quantum computer of a few hundred qubits (roughly 256-384) or larger is available, all of today's communication based on elliptic curves will potentially be made available to the owner of such a system. Note that this future is not expected soon; however, no one would dare make a prediction about that. Note also that the existing D-Wave systems are not known to be capable of breaking the current cryptosystems.

Neither the IETF nor any other standardization body has an out-of-the-box solution. The most recent development is a NIST competition for quantum-computer-resistant algorithms, which is certainly a good starting point. It is also a challenge for NIST, as it will have to overcome the bad publicity of the EC-DRBG issue and reclaim its position as a driver of technology standardization. Whether it will be successful in that goal, or whether we are going to have new quantum-computer-resistant algorithms at all, remains to be seen.



Finally: the GnuTLS 3.5.0 new features

In case you managed to read all of the above, only a few paragraphs are left. Let me summarize the list of prominent changes.

  • SHA3 as a certificate signature algorithm. The SHA3 algorithm in all its variations (256-512) was standardized by the FIPS 202 publication in August 2015. However, until very recently there were no code points (or more specifically, object identifiers) for using it in certificates. Now that they are finally available, we have modified GnuTLS to include support for generating and verifying certificates with SHA3. For SHA3 support in the TLS protocol, either as a signature algorithm or a MAC algorithm, we will have to wait for code points and ciphersuites to become available and standardized. Note also that, since GnuTLS 3.5.0 is the first implementation supporting SHA3 on PKIX certificates, there have not been any interoperability tests with the generated certificates.
  • X25519 (formerly curve25519) for ephemeral EC Diffie-Hellman key exchange. One long-expected feature we wanted to introduce in GnuTLS is support for alternatives to the NIST-standardized elliptic curves. We selected curve25519, originally described in draft-ietf-tls-curve25519-01 and currently in the document which revises the elliptic curve support in TLS, draft-ietf-tls-rfc4492bis-07. The latter document --which most likely means the curve will be widely implemented in TLS-- and the performance advantages of X25519 are the main reasons for selecting it. Note, however, that X25519 is a peculiar curve for implementations designed around the NIST curves. The curve cannot be used with ECDSA signatures, although it can be used with a similar algorithm called EdDSA. We don't include EdDSA support for certificates or for the TLS protocol in GnuTLS 3.5.0, as its specification has not settled down; we plan to include it in a later 3.5.x release. For curve448 we will have to wait until its specification for digital signatures is settled and it is available in the nettle crypto library.
  • TLS false start. The TLS 1.2 protocol, as well as its earlier versions, requires two full round-trips for its handshake. Several applications require reduced latency on the first packet, and so the false start modification of TLS was defined. The modification allows the client to start transmitting at the time the encryption keys are known to it, but prior to verifying those keys with the server. That reduces the handshake to a single round-trip, at the cost of putting the client's initially transmitted messages at risk. The risk is that any modification of the handshake process by an active attacker will not be detected by the client, something that can lead the client to negotiate weaker security parameters than expected, and so lead to a possible decryption of the initial messages. To prevent that, GnuTLS 3.5.0 will not enable false start, even if requested, when it detects a weak ciphersuite or weak Diffie-Hellman parameters. The false start functionality can be requested by applications using a flag to gnutls_init(); see the sketch after this list.
  • New APIs to access the Shawe-Taylor-based provable RSA and DSA parameter generation. While enhancing GnuTLS 3.3.x for Red Hat in order to pass FIPS140-2 certification, we introduced provable RSA and DSA key generation based on the Shawe-Taylor algorithm, following the FIPS 186-4 recommendations. That algorithm allows generating RSA and DSA parameters from a seed that are provably prime (i.e., no probabilistic primality tests are involved). In practice this allows an auditor to verify that the keys and any parameters (e.g., DH) present on a system were generated using a predefined and repeatable process. This code used to be enabled only when GnuTLS was compiled in FIPS140-2 mode and the system was put in FIPS140-2 compliance mode. In GnuTLS 3.5.0 this functionality is made available unconditionally from the certtool utility, and a key or DH parameters will be generated using these algorithms when the --provable parameter is specified. That required modifying the storage format for RSA and DSA keys to include the seed, and thus, for compatibility purposes, the tool outputs both the old and new formats, to allow the use of these parameters with earlier GnuTLS versions and other software.
  • Prevent the change of identity on rehandshakes by default. The TLS rehandshake protocol is typically used for three reasons: (a) rekeying on long-standing connections, (b) re-authentication and (c) connection upgrade. The rekeying use case is self-explanatory, so my focus will be on the latter two. Connection upgrade is when the client connects with no client authentication and rehandshakes to provide a client certificate, while re-authentication is when the client connects with identity A and switches to identity B mid-connection. With this change in GnuTLS, the latter use case (re-authentication) is prohibited unless the application has explicitly requested it. The reason is that the majority of applications using GnuTLS are not prepared to handle an identity change in the middle of a connection, something that, depending on the application protocol, may lead to issues. Imagine the case where a client authenticates to access a resource, but just before accessing it switches to another identity by presenting another certificate. It is unknown whether applications using GnuTLS are prepared for such changes, and thus we considered it important to protect applications by default, by requiring those that utilize re-authentication to explicitly request it via a flag to gnutls_init() (also shown in the sketch below). This change does not affect applications using rehandshakes for rekeying or connection upgrade.
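
As a rough illustration of the last two items, the sketch below sets up a client that opts in to false start. The flag names (GNUTLS_ENABLE_FALSE_START, GNUTLS_ALLOW_ID_CHANGE) and the hostname are written from memory and should be checked against the 3.5.0 headers; error handling is omitted:

    #include <gnutls/gnutls.h>

    gnutls_session_t session;
    gnutls_certificate_credentials_t xcred;

    gnutls_certificate_allocate_credentials(&xcred);
    gnutls_certificate_set_x509_system_trust(xcred);

    /* Request false start; add GNUTLS_ALLOW_ID_CHANGE only if the
     * application really copes with a mid-connection identity change. */
    gnutls_init(&session, GNUTLS_CLIENT | GNUTLS_ENABLE_FALSE_START);
    gnutls_credentials_set(session, GNUTLS_CRD_CERTIFICATE, xcred);
    gnutls_session_set_verify_cert(session, "www.example.com", 0);
    gnutls_priority_set_direct(session, "NORMAL", NULL);
    /* ... associate a transport and call gnutls_handshake() as usual ... */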

That concludes my list of the most notable changes, even though this release includes several others, such as stricter protocol adherence in corner cases and a significant enhancement of the included test suite. Even though I am the primary contributor, this is a release containing code contributed by more than 20 people, whom I'd like to personally thank for their contributions.

If you are interested in following our development or helping out, I invite you to our mailing list as well as to our gitlab pages.


Monday, February 8, 2016

Why do we need SSL VPNs today?

One question that has been bothering me for quite a while is: why do we need SSL VPNs? There is an IETF-standardized VPN type, IPSec; given that, why do SSL VPNs still get deployed? Why not just switch everything to IPSec? Moreover, another important question is: since we have had IPSec since around 1998, why hasn't it taken over the whole VPN market? Note that I'll be using the term SSL even though today it has been replaced by Transport Layer Security (TLS), because the former is still widely used to describe this type of VPN.

These are valid questions, but depending on whom you ask you are very likely to get a different answer. I'll try to answer from an SSL VPN developer's standpoint.


In the VPN world there are two main types of deployment, 'site-to-site' and 'remote access'. To put it simply, the first is about securing the lines between two offices, and the latter is about securing the connection between your remote users and the office. The former type may rely on a minimal PKI deployment or pre-shared keys, but the latter requires integration with a user database, credentials, as well as settings which may be applied individually for each user. In addition, the 'remote access' type is often associated with accounting, such as keeping track of how long a user is connected, how much data has been transferred, and so on. That may remind you of the kind of accounting used in PPP and dial-up connections, and indeed the same RADIUS-based accounting methods are used for that purpose.

Both 'site-to-site' and 'remote access' setups can be handled by either SSL or IPSec VPNs. However, some facts make certain VPNs more suitable for one purpose than the other. In particular, it is believed that SSL VPNs are more suitable for the 'remote access' type, while IPSec is unquestionably the solution one would deploy for site-to-site connections. In the next paragraphs I focus on SSL VPNs and try to list their competitive advantages for the purpose of 'remote access'.
  1. Application level. In SSL VPNs the software is at the application level, and that means it can be provided by the software distributor, or even by the administrator of the server. These VPN applications can be customized for the particular service the user connects to (e.g., include logos, adjust to the environment the user is used to, or even integrate VPN connectivity into an application). For example, the 12vpn.net VPN provider customizes the openconnect-gui application (which is free software) to provide it with a pre-loaded list of the servers they offer to their customers. Several other proprietary solutions use a similar practice, where the server provides the software for the end users.
  2. Custom interfaces for authentication. The fact that (most) SSL VPNs run over HTTPS provides them with an inherent feature: complete control over the authentication interface they can display to users. For example, in OpenConnect VPN we provide the client with XML forms that the user is presented with and must fill in to authenticate. That usually covers typical password authentication, one-time passwords, group selections, and so on. Other SSL VPN solutions use entirely free-form HTML authentication and often only require a browser to log in to the network. Others integrate certificate issuance on the first user connection using SCEP, and so on.
  3. Enforcing a security policy. Another reason (which I don't quite like or endorse - but it happens quite often) is that the VPN client applications enforce a particular company-wide security policy, e.g., ensuring that anti-virus software is running and up to date prior to connecting to the company LAN. This is often implemented with server-provided executables being run by the clients, but that is also a double-edged sword, as a VPN server compromise will allow for a compromise of all the clients. In fact, bypassing this "feature" was one of the driving reasons behind the openconnect client.
  4. Server-side user restrictions. On the server side the available freedom is comparable with the client side. Because SSL VPNs operate at the application layer, they are more flexible in how the connecting client can be restricted. For example, in openconnect VPN server individual users, or groups of them, can be placed into a specific kernel cgroup, i.e., limiting their available CPU time, or can be restricted to a fixed bandwidth much more easily than in any IPSec server.
  5. Reliability, i.e., operation over any network. In my opinion, the major reason for the existence of SSL VPN applications and servers is that they can operate in any environment. You can be restricted by firewalls or broken networks which block ESP or UDP packets and still be able to connect to your network. That is because the HTTPS protocol on which they rely cannot be blocked without taking a major part of the Internet down with it. That's not something to overlook; a VPN service which works most of the time but not always, because the user is within some misconfigured network, is unreliable. Reliability is something you need when you want to communicate with colleagues while being in the field, and that's the real problem SSL VPNs solve (and the main reason companies and IT administrators usually pay extra to have these features enabled). Furthermore, solutions like OpenConnect VPN utilize a combination of HTTPS (TCP) and UDP, when available, to provide the best possible user experience. OpenConnect uses Datagram TLS over UDP when it detects that this is allowed by network policy (thus avoiding the TCP-over-TCP tunnelling issues), and falls back to tunnelling over HTTPS when establishing the DTLS channel is not possible.

That doesn't of course mean that IPSec VPNs are obsolete or not needed for remote access; we are far from that. IPSec VPNs are very well suited for site-to-site links --which are typically on networks under the full control of the deployer-- and are cross-platform (if we ignore the IKEv1 vs IKEv2 issues), in the sense that you are very likely to find native servers and clients offered by the operating system. In addition, they possess a significant advantage: because they are integrated with the operating system's IP stack, they utilize the kernel for encryption, which removes the need for user space to kernel space switches. That allows them to serve high bandwidths and spend less CPU time. A kernel-side TLS stack would of course provide SSL VPNs with a similar advantage, but currently that is work in progress.

As a bottom line, you should choose the best tool for the job at hand based on your requirements and network limitations. I made the case for SSL VPNs, and provided the reasons why I believe they are still widely deployed and why they will continue to be. If I have already convinced you of the need for SSL VPNs, and you are an administrator working with VPN deployments, I'd like to refer you to my FOSDEM 2016 talk about the OpenConnect (SSL) VPN server, in which I describe the reasons I believe it provides a significant advantage over existing solutions on Linux systems.


Monday, November 23, 2015

An overview of GnuTLS 3.4.x

This April GnuTLS 3.4.0 was released as our stable-next branch, i.e., the branch to replace the current stable branch of GnuTLS (which as of today is the 3.3.x branch). During that time, we also moved our development infrastructure from gitorious to gitlab.com, a move that has improved the project management significantly. We now have automated builds (on servers running on an OpenStack system supplied by Red Hat), continuous integration results integrated into the development pages, as well as simple-to-use milestone and issue trackers. For a project which, as it stands, is mostly managed by me, that move was a godsend, as we get several new development and management features with considerably less effort. That move unfortunately was not without opposition, but in the end a decision had to be taken, driven by pragmatism.

Assessing the last 12 months, we had quite a productive year. Several people offered fixes and enhancements, and I still remain the main contributor - though I would really like for that to change.

Anyway, let me get to the overview of the GnuTLS 3.4.x feature set. For its development we started with a basic plan describing the desired features (it can still be seen at this wiki page), and we also allowed features which did not threaten backwards compatibility to be included in the first releases. These were tracked using the milestone tracker.

As we are very close to tagging the current release as stable, I'll attempt to summarize the new features and enhancements in the following paragraphs, and describe our vision for the future of the library.
  •  New features:
    • Added a new and simple API for authenticated encryption. The new API description can be seen in the manual, and a rough sketch of its use follows this list. The idea behind that change was to make programming errors in encryption/decryption routine usage as difficult as possible. That was done by combining encryption and tagging, as well as decryption and verification of the tag (MAC). That of course takes advantage of the AEAD cipher constructions and can currently be used with the following AEAD ciphers: AES-GCM, AES-CCM, CAMELLIA-GCM, and CHACHA20-POLY1305.
    • Simplified the verification API. One of the most common complaints about TLS library APIs is the complexity involved in performing a simple connection. In our defense, part of that complexity is due to specifications like PKIX, which forward security decisions to application code, but it is partly also because the APIs provided are simply too complicated. In this particular case we needed to get rid of the need for elaborate callbacks to perform certificate verification checks, and include the certificate verification in the handshake process. That means moving the key purpose and hostname verification into the handshake. For that we simplified the current API by introducing gnutls_session_set_verify_cert() and, when a key purpose is required, gnutls_session_set_verify_cert2(). Calling the former function instructs GnuTLS to verify the peer's certificate hostname as part of the handshake, hence eliminating the need for callbacks. An example of a session setup with the new API can be seen in the manual. If we compare the old method and the new, we can see that it reduces the minimum client size by roughly 50 lines of code.
    • Added an API to access system keys. Several operating systems provide a system key store where private keys and certificates may be present. The idea of this feature was to make these keys readily accessible to GnuTLS applications, by using a URL as a key identifier (something like system:uuid=xxx). Currently we support the Windows key store. More information can be seen in the system keys manual section.
    • Added an API for PKCS #7 structure generation and signing. Given that PKCS #7 seems to be the format of choice for signing kernel modules and firmware it was deemed important to provide such an API. It is currently described in gnutls/pkcs7.h and can be seen in action in the modsign tool.
    • Added transparent support for internationalized DNS names. Internationalized DNS names require some canonicalization prior to being compared or used. Previous versions of GnuTLS didn't support that, disallowing the comparison of such names, but this is the first version which adds support for the RFC6125 recommendations.
  •  TLS protocol updates:
    • Added new AEAD cipher suites. New ciphersuites are being defined for the TLS protocol, some for the Internet of Things profile of TLS, and some which involve stream ciphers like Chacha20. This version of GnuTLS adds support for both of these ciphersuite families, i.e., AES-CCM, AES-CCM-8, and CHACHA20-POLY1305.
    • Added support for Encrypt-then-authenticate mode for CBC. During the last decade we have seen several attacks on the CBC mode, as used by the TLS ciphersuites. Each time there were mitigations for the issues at hand, but there was never a proper fix. This version of GnuTLS implements the "proper" fix as defined in rfc7366. That fix extends the lifetime of the CBC ciphersuites.
    • Included a fix for the triple handshake attack. Another attack on the TLS protocol which required a protocol fix is the triple handshake attack. This attack took advantage of clients not verifying the validity of the peer's certificate on a renegotiation. This release of GnuTLS implements the protocol fix which ensures that applications relying on renegotiation are not vulnerable to the attack.
    • Introduced support for fallback SCSV. That is, it provides the signaling mechanism from RFC7507, so that applications which are forced to use insecure negotiation can reasonably prevent downgrade attacks.
    • Disabled SSL 3.0 and RC4 in default settings. Given the recent status on attacks on both the SSL 3.0 protocol, and the RC4 cipher, it was made apparent that these no longer are a reasonable choice for the default protocol and ciphersuite set. As such they have been disabled.
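
As a taste of the first item above, here is a rough sketch of the AEAD API (written from memory against gnutls/crypto.h; check the manual for the exact prototypes, and handle errors in real code). Encryption and tagging happen in a single call, so the tag cannot be forgotten:

    #include <gnutls/crypto.h>

    unsigned char key_data[32] = {0};        /* fill from a proper RNG */
    unsigned char nonce[12] = {0};           /* must be unique per key  */
    unsigned char msg[] = "hello", aad[] = "header";
    unsigned char ct[sizeof(msg) + 16];      /* ciphertext || 16-byte tag */
    size_t ct_len = sizeof(ct);
    gnutls_datum_t key = { key_data, sizeof(key_data) };
    gnutls_aead_cipher_hd_t h;

    gnutls_aead_cipher_init(&h, GNUTLS_CIPHER_CHACHA20_POLY1305, &key);
    gnutls_aead_cipher_encrypt(h, nonce, sizeof(nonce),
                               aad, sizeof(aad),   /* authenticated-only data */
                               16,                 /* tag length */
                               msg, sizeof(msg),
                               ct, &ct_len);
    gnutls_aead_cipher_deinit(h);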

That list summarizes the major enhancements in the 3.4.x branch. As you can see, our focus was not only to add new security-related features, but also to simplify the usage of the existing ones. The latter is very important, as we see in practice that TLS and its related technologies (like PKIX) are hard to use even by experts. That is partially due to the complexity of the protocols involved (like PKIX), but also due to very flexible APIs which strive to handle 99% of the possible options allowed by the protocols, ignoring the fact that 99% of the use cases found on the Internet utilize a single protocol option. That was our main mistake with the verification API, and I believe with the changes in 3.4.x we have finally addressed it.

Furthermore, a long-time concern of mine was that the protocol complexity of PKIX and TLS translates to tens of thousands of lines of code, and hence to a considerable probability of a software error being present. We strive to reduce that probability, for example by using fuzzing, static analyzers like coverity and clang, and audits, but it cannot be eliminated. No matter how low the probability of an error per line of code is, it becomes a certainty in large code bases. Thus my recommendation for software which can trade performance for security would be to consider designs where the TLS and PKIX protocols are confined in sandboxed environments like seccomp. We have tested such a design in the openconnect VPN server project and have noticed no considerable obstacles or significant performance reduction. The main requirements in such an environment are IPC, and being able to pack structures in an efficient way (see, e.g., protocol buffers). For such environments, we have added a short text in our manual describing the system calls used by GnuTLS, to assist with setting up filters, and we welcome comments for improving that text or the approach.

In addition, to allow applications to offer multi-layer security involving private key isolation, we continue to enhance the strong point of GnuTLS 3.3.x: transparent support for PKCS #11. Whilst sometimes overlooked as a security feature, PKCS #11 can provide an additional layer of security by allowing private keys to be separated from the applications using them. When GnuTLS is combined with a Hardware Security Module (HSM), or a software security module like caml-crush, it will prevent the leak of the server's private key even in the presence of a catastrophic attack which reveals the server's memory (e.g., an attack like heartbleed).

Hence, a summary of our vision for the future would be to operate under, and contribute to, multi-layer security designs. That is, designs where the TLS library isn't the Maginot line, after which there is no defense, but rather is considered one of many defense layers.

So, overall, with this release we add new features, simplify the library, modernize its protocol support, and enhance GnuTLS so that it is useful in security designs where multiple layers of protection are implemented (seccomp, key isolation via PKCS #11). If that is something you are interested in, we welcome you to join our development mailing list, visit our development pages at gitlab, and help us make GnuTLS better.


Monday, June 15, 2015

Software Isolation in Linux

Starting from the assumption that software will always have bugs, we need a way to isolate and neutralize the effects of those bugs. One approach is to isolate components of the software such that a breach in one component doesn't compromise another. In the software engineering field that became popular with the principle of privilege separation used by the openssh server, and is the application of an old idea to a new field. Isolation for the protection of assets is widespread in several other aspects of human activity; the most prominent example throughout history is the process of quarantine, which isolates suspected carriers of epidemic diseases to protect the rest of the population. In this post, we briefly overview the tools available in the Linux kernel to offer isolation between software components. It is based on my FOSDEM 2015 presentation on software isolation in the security devroom, slightly enhanced.

Note that this text is intended for developers and software engineers looking for methods to isolate components in their software.

An important aspect of such an analysis is to clarify the threats that it targets. In this text I describe methods which protect against code injection attacks. Code injection provides the attacker with a very strong tool to execute code, read arbitrary memory, etc.

In a typical Linux kernel based system, the available tools we can use for such protection, are the following.
  1. fork() + setuid() + exec()
  2. chroot()
  3. seccomp()
  4. prctl()
  5. SELinux
  6. Namespaces

The first allows for memory isolation by using different processes, ensuring that a forked process has no access to the parent's memory address space. The second allows a process to restrict itself to a part of the file system available in the operating system. These two are available in almost all POSIX systems, and are the oldest tools available, dating to the first UNIX releases. The focus of this post is on methods 3-6, which are fairly recent and Linux kernel specific.

Before proceeding, let's make a brief overview of what a process is in a UNIX-like operating system. It is some code in memory which is scheduled to have a time slot in the CPU. It can access the virtual memory assigned to it, make calculations using the CPU's user-mode instructions, and that's pretty much all. To do anything else, e.g., get additional memory, access files, or read/write the memory of other processes, system calls to the operating system have to be used.

Let's now proceed and describe the available isolation methods.

Seccomp

After that introduction to processes, seccomp comes as the natural method to list first, since it is essentially a filter for the system calls available to a process. For example, a process can install a filter to allow read() and write() but nothing else. After such a filter is applied, any code which is potentially injected into that process will not be able to execute any other system calls, reducing the attack impact to the allowed calls. In our particular example with read() and write(), only the data written and read by that process will be affected.

The simplest way to access seccomp is via the libseccomp library, which has a quite intuitive API. An example using that library, which creates a whitelist of three system calls, is shown below.

    
    /* Requires libseccomp; link with -lseccomp. */
    #include <assert.h>
    #include <errno.h>
    #include <sys/ioctl.h>      /* SIOCGIFMTU */
    #include <seccomp.h>

    scmp_filter_ctx ctx;

    /* Any system call not matched by a rule returns -1 with errno set to EPERM. */
    ctx = seccomp_init(SCMP_ACT_ERRNO(EPERM));
    assert(ctx != NULL);
    assert(seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 0) == 0);
    assert(seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0) == 0);
    /* ioctl() is allowed only when its second argument equals SIOCGIFMTU. */
    assert(seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(ioctl), 1,
                            SCMP_A1(SCMP_CMP_EQ, (int)SIOCGIFMTU)) == 0);
    assert(seccomp_load(ctx) == 0);

The example above installs a filter which allows the read(), write() and ioctl() system calls. The latter is only allowed if the second argument of ioctl() is SIOCGIFMTU. The first line of the filter setup instructs seccomp to return -1 and set errno to EPERM when a system call outside this filter is made.

The drawback of seccomp is the tedious process a developer has to go through in order to figure out the system calls used in his application. Manual inspection of the code is required to discover the system calls, as well as inspection of run traces using the 'strace' tool. Manual inspection will allow making a rough list which provides a starting point, but it will not be entirely accurate. The issue is that calls to libc functions may not correspond to the expected system call. For example, a call to exit() often results in an exit_group() system call.

The trace obtained using 'strace' will help clarify, restrict or extend the initial list. Note, however, that using traces alone may miss the system calls used in error-condition handling, and different versions of libc may use different system calls for the same function. For example, the libc call select() uses the select() system call on the x86-64 architecture, but _newselect() on x86.


The performance cost of seccomp is the cost of executing the filter, and in most cases it is a fixed cost per system call. In the openconnect VPN server I estimated the impact of seccomp on a worker process to be a 2% slowdown in transfer speed (from 624 to 607 Mbps). The measured server worker process executes read/send and recv/write in a tight loop.

Prctl

The PR_SET_DUMPABLE flag of the prctl() system call protects a process from other processes accessing its memory. That is, it will prevent processes with the same privileges as the protected one from reading its memory via ptrace(). A usage example is shown below.

    #include <sys/prctl.h>

    /* Mark the process non-dumpable: same-privilege processes can no
     * longer attach to it with ptrace() or read its core dumps. */
    prctl(PR_SET_DUMPABLE, 0);

While this approach doesn't protect against code injection in general, it may prove a useful tool with a low performance cost in several setups.

SELinux

SELinux is an operating system mechanism to prevent processes from accessing various "objects" of the OS. The objects can be files, other processes, pipes, network interfaces, etc. It may be used to enhance seccomp with more fine-grained access control. For example, one may set up a rule with seccomp to only allow read(), and enhance it with SELinux so that read() is only accepted on certain file descriptors.

On the other hand, a software engineer may not be able to rely too much on SELinux to provide isolation, because it is often not within the developer's control. It is typically used as an administrative tool, and the administrator may decide to turn it off, set it to permissive (non-enforcing) mode, etc.

The way for a process to transition to a different SELinux ruleset is via exec() or via the setcon() call, and its cost is perceived to be high. However, I have no performance tests with software relying on it.
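
For completeness, a hypothetical sketch of the setcon() path using libselinux follows; the target context is made up, must exist in the loaded policy, and the policy must also permit the transition:

    #include <selinux/selinux.h>

    /* Switch the calling process to a more restricted domain (assumed
     * context). This fails if SELinux is disabled or the policy forbids it. */
    if (is_selinux_enabled() > 0 &&
        setcon("system_u:system_r:myworker_t:s0") != 0) {
            /* handle the error: fall back or abort, depending on requirements */
    }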

The drawbacks of this approach are the centralized nature of the system policy, meaning that individual applications can only apply the existing system policy, not update it; the obscure language (m4) a system policy needs to be written in; and the fact that any policy written will not be portable across different Linux-based systems.

Linux Namespaces

One of the most recent additions to the Linux kernel is the namespaces feature. Namespaces allow "virtualizing" certain Linux kernel subsystems per process, and the result is often referred to as containers. The feature is available via the clone() and unshare() system calls and is documented in the unshare(2) and clone(2) man pages, but let's see some examples of the subsystems that can be isolated.

  • NEWPID: Prevents processes from seeing and accessing process IDs (PIDs) outside their namespace. That is, the first isolated process will believe it has PID 1 and will see only the processes it has forked.
  • NEWIPC: Prevents processes from accessing the main IPC subsystem (shared memory segments, message queues, etc.). The processes will have access to their own IPC subsystem.
  • NEWNS: Provides filesystem isolation, in a way a feature-rich chroot(). It allows, for example, creating isolated mount points which exist only within a process.
  • NEWNET: Isolates processes from the main network subsystem. That is, it provides them with a separate networking stack, device interfaces, routing tables, etc.
Let's see an example of an isolated process being created in its own PID namespace. The following code operates as fork() would.

    /* Requires: <sched.h> (CLONE_NEWPID), <signal.h> (SIGCHLD),
     * <unistd.h> and <sys/syscall.h> (raw system call numbers). */
    #if defined(__i386__) || defined(__arm__) || defined(__x86_64__) || defined(__mips__)
        long ret;
        int flags = SIGCHLD|CLONE_NEWPID;

        /* Raw clone(): the child starts in a new PID namespace and must
         * therefore see itself as PID 1. */
        ret = syscall(SYS_clone, flags, 0, 0, 0);
        if (ret == 0 && syscall(SYS_getpid) != 1)
                return -1;
        return ret;
    #endif

This approach, of course, has a performance penalty in the time needed to create a new process. In my experiments with the openconnect VPN server, creating a process with the NEWPID, NEWNET and NEWIPC flags took roughly 10 times longer than a call to fork().

Note, however, that the isolation subsystems available via clone() are by default reversible using the setns() system call. To ensure that these subsystems remain isolated even after code injection, seccomp must be used to eliminate calls to setns() (many thanks to the FOSDEM participant who brought that to my attention).
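
Combining the two mechanisms is straightforward; a minimal sketch with libseccomp that denies setns() is shown below (in a blacklist style for brevity, although the whitelist approach of the earlier example is preferable):

    #include <errno.h>
    #include <seccomp.h>

    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);  /* allow by default */
    /* Deny setns() so the namespace isolation cannot be reversed. */
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(setns), 0);
    seccomp_load(ctx);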

Furthermore, the namespaces approach follows the principle of blacklisting, allowing a developer to isolate from certain subsystems but not from everything available in the system. That is, one may enable all the isolation subsystems available in the clone() call, but then realize that the kernel keyring is still available to the isolated process. That is because no such isolation mechanism has been implemented for the kernel keyring so far.

Conclusions


In the following table I attempt to summarize the protection offered by each of the described methods.


                      Prevent killing    Prevent access to    Prevent access to    Prevent exploitation of an
                      other processes    other processes'     shared memory        unused system call bug
                                         memory
Seccomp               True               True                 True                 True
prctl(SET_DUMPABLE)   False              True                 False                False
SELinux               True               True                 True                 False
Namespaces            True               True                 True                 False

In my opinion, seccomp seems to be the best option to consider as an isolation mechanism when designing new software. Together with Linux namespaces it can prevent access to other processes, shared memory and the filesystem, but the main distinguisher I see is the fact that it can restrict access to unused system calls.

That is an important point, given the number of system calls available in a modern kernel. Not only does it reduce the overall attack surface by limiting them, it also denies access to functionality that was never intended to be exposed --see the setns() and kernel keyring issues above.

Wednesday, December 3, 2014

A quick overview of GnuTLS development in 2014

2014 was a very interesting year for the development of GnuTLS. On the development side, this year we have incorporated patches with fixes or enhanced functionality from more than 25 people according to openhub, and the main focus was moving GnuTLS 3.3.x from next to a stable release. That version had quite a number of new features, the most prominent being:
  • Initialization in a library constructor,
  • The introduction of verification profiles, i.e., enforcing rules not only on the session parameters, such as ciphersuites and protocol version, but also on the peer's certificate (e.g., only accept a session which provides a certificate with a 2048-bit key),
  • The addition of support for DNS name constraints in certificates, i.e., enforce rules that restrict an intermediate CA to issue certificates only for a specific DNS domain,
  • The addition of support for trust modules using PKCS #11,
  • Better integration with PKCS #11 smart cards and Hardware Security Modules (HSMs); I believe we currently have some of the best, if not the best, support and integration with smart cards and HSMs among crypto libraries. That we mostly owe to the support of PKCS #11 URLs, which allowed converting high-level APIs that accepted files into APIs that work with HSMs and smart cards transparently, without changing them,
  • Support for the algorithms required by FIPS140-2.
Most of these are described in the original announcement mail, some others were completed and added gradually.

A long-requested feature that was implemented was the removal of the need for explicit library initialization. That relocated the previously explicit call to gnutls_global_init() into a library constructor, and eliminated a cumbersome requirement that previous versions of GnuTLS had. As a result it removed the need for locks and coordination in a large application which may use multiple libraries that depend on GnuTLS. However, it had an unexpected side effect. Because initialization was moved to a constructor, which is executed prior to the application entering main(), server applications which closed all file descriptors on startup would close the file descriptor held by GnuTLS for reading /dev/urandom. That's a pretty nasty bug, and it was noticed because in CUPS the descriptor which replaced the /dev/urandom one was a non-readable file descriptor. It was solved by re-opening the descriptor on the first call of gnutls_global_init(), but the issue demonstrates the advantage and simplicity of the system call approach for the random generator, i.e., getentropy() or getrandom(). In fact, adding support for getentropy() reduced the relevant code to 8 lines from the 75 lines of the file-based approach.
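
For illustration only (this is not the actual GnuTLS code), the system call approach boils down to something like the following, with no file descriptor to open, keep around, or lose:

    #include <sys/random.h>     /* getrandom(), glibc >= 2.25 */

    unsigned char seed[32];
    /* No /dev/urandom descriptor involved; note that older kernels lack the call. */
    if (getrandom(seed, sizeof(seed), 0) != (ssize_t)sizeof(seed)) {
            /* handle the error, e.g., fall back to reading /dev/urandom */
    }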

Another significant addition, at least for the client side, is the support for trust modules, available using PKCS #11. Trust modules, like p11-kit trust, allow clients to perform server certificate authentication using a method very similar to what the NSS library uses. That is, a system-wide database of certificates, such as the p11-kit trust module, can be used to query for the trusted CAs and intermediate anchors. These anchors may have private attached extensions, e.g., restricting the name constraints for a CA, or restricting the scope of a CA to code signing, etc., in addition to, or in place of, the scope set by the CA itself. That has very important implications for system-wide CA management. CAs can be limited in scope, or per application, and can even be restricted to a few DNS top-level domains using name constraints. Most importantly, the database can be shared across libraries (in fact, in Fedora 21 the trust database is shared between GnuTLS and NSS), simplifying administration significantly, even though the tools to manage it are primitive for the moment. That was a long-time vision of Stef Walter, who develops p11-kit, and I'm pretty happy the GnuTLS part of it was completed quite well.

This was also the first year I attempted to document and publicize the goals for the next GnuTLS release, which is 3.4.0 and will be released around next March. They can be seen in our gitorious wiki pages. Hopefully all the points will be delivered, although a few of the remaining ones rely on a new release of the nettle crypto library and on an update of c-ares to support DNSSEC.

Dependency-wise, GnuTLS is moving to a tighter integration with nettle, which provides one of the fastest ECC implementations out there. I find it unfortunate, though, that there seems to be no intention of collaboration between the two GNU crypto libraries - nettle and libgcrypt - meaning that optimizations in libgcrypt stay in libgcrypt, and the same for nettle. GNU Libtasn1 is seeking a co-maintainer (or even better, someone to rethink and redesign the project, I'd add), so feel free to step up if you're up to the job. As it is now, I'm taking care of any fixes needed for that project.

On the other hand, we also had quite serious security vulnerabilities fixed this year. These varied from certificate verification issues to a heap overflow. A quick categorization of the serious issues fixed this year follows.
  • Issues discovered with manual auditing (GNUTLS-SA-2014-1, GNUTLS-SA-2014-2). These were certificate verification issues and were discovered as part of manual auditing of the code. Thanks to Suman Jana, and also to Red Hat, which provided me the time for the audit.
  • Issues discovered by Fuzzers (GNUTLS-SA-2014-3, GNUTLS-SA-2014-5):
    • Codenomicon provided a copy of their test suite, valid for a few months, and that helped discover the first bug above,
    • Sean Buford of Google reported to me that using AFL he had identified a heap corruption issue in the encoding of ECC parameters; I have not used AFL but it seems quite an impressive piece of software,
  • Other/protocol issues (GNUTLS-SA-2014-4), i.e., POODLE. Although this is not really a GnuTLS or TLS protocol issue, POODLE takes advantage of the downgrade dance used in several applications.
Even though the sample is small, and only accounts for the issues with an immediate security impact, I believe this view of the issues shows the importance of fuzzers today. In a project of 180k lines of code, it is not feasible to fully rely on manual auditing, except for some crucial parts of it. While no security vulnerabilities were discovered via static analysis tools like clang or coverity, we had quite a few other fixes because of these tools.

So overall, 2014 was quite a productive year, both in the number of new features added and in the number of bugs fixed. New techniques were used to verify the source code, and new test cases were added to our test suite. What is encouraging is that more and more people test, report bugs and provide patches for GnuTLS. A big thank you to all the people who contributed.

Wednesday, October 15, 2014

What about POODLE?

Yesterday POODLE was announced, a fancily named new attack on the SSL 3.0 protocol, which relies on applications using a non-standard fallback mechanism, typically found in browsers. The attack takes advantage of
  • a vulnerability in the CBC mode of SSL 3.0 which has been known for a decade
  • a non-standard fallback mechanism (often called the downgrade dance)
So the novel and crucial part of the attack is the exploitation of the non-standard fallback mechanism. What is that, you may ask. I'll try to explain it in the next paragraph. Note that in the following paragraphs I'll use the term SSL protocol to cover TLS as well, since TLS is simply a newer version of SSL.

The SSL protocol has a protocol negotiation mechanism that doesn't allow a fallback to SSL 3.0 between clients and servers that both support a newer variant (e.g., TLS 1.1). That mechanism detects modifications by man-in-the-middle attackers, and with it the POODLE attack would have been thwarted. However, a limited set of clients perform a custom protocol fallback, the downgrade dance, which is straightforward but insecure. That set of clients seems to be most of the browsers; in order to negotiate an acceptable TLS version, they follow something along these lines:
  1. Connect using the highest SSL version (e.g., TLS 1.2)
  2. If that fails, set the version to TLS 1.1 and reconnect
  3. ...
  4. ...until there are no options left and SSL 3.0 is used.
That's a non-standard way to negotiate TLS and, as the POODLE attack demonstrates, it is insecure. Any attacker can interrupt the first connection and make it look like a failure, forcing a fallback to a weaker protocol. The good news is that it is mostly browsers that use this construct, and few other applications should be affected.

Why do browsers use this construct then? In their defence, there have been serious bugs in the standard SSL and TLS protocol fallback implementation in widespread software. For example, when TLS 1.2 came out, we realized that our TLS 1.2-enabled client in GnuTLS couldn't connect to a large part of the Internet. A few large sites would refuse to talk to the GnuTLS client because it advertised TLS 1.2 as its highest supported protocol. The bug was on the server side, which closed the connection when it encountered a protocol newer than its own, instead of negotiating its highest supported version (in accordance with the TLS protocol). It took a few years before TLS 1.2 was enabled by default in GnuTLS, and even then we had a hard time convincing users who encountered connection failures that it was a server bug. The truth is that users don't care whose bug it is; they will simply use software that just works.

A long time has passed since then (TLS 1.2 was published in 2008), and today almost all public servers follow the TLS protocol negotiation. So this may be the time for browsers to get rid of that relic of the past. Unfortunately, that isn't the case. The IETF TLS working group is now trying to standardize counter-measures for the browser negotiation trickery. Even though I have become more of a pragmatist since 2008, I believe that forcing counter-measures into every TLS implementation, just because there used to be (or may still be) broken servers on the Internet, not only prolongs the life of an insecure out-of-protocol work-around, but also creates waste. That is, it turns TLS protocol implementations into a code dump filled with hacks and work-arounds, just because of a few broken implementations. As Florian Weimer puts it, all applications pay a tax of extra code, potentially introducing new bugs and, even more scarily, potentially introducing more compatibility issues, just because some servers on the Internet have chosen not to follow the protocol.

Are there, however, any counter-measures that one can use to avoid the attack without introducing an additional fallback signalling mechanism? As previously mentioned, if you are using the SSL protocol the recommended way, no work-around is needed; you are safe. If for any reason you want to use the insecure non-standard protocol negotiation, make sure that no insecure protocols like SSL 3.0 are in the negotiated set, or, if disabling SSL 3.0 isn't an option, ensure that it is only allowed when negotiated as a fallback (e.g., offer TLS 1.0 + SSL 3.0, and only then accept SSL 3.0).
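
For GnuTLS-based applications, removing SSL 3.0 from the negotiated set is a one-liner in the priority string; a sketch (adjust the base priority to your needs):

    gnutls_priority_set_direct(session, "NORMAL:-VERS-SSL3.0", NULL);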

In any case, this attack has provided the incentive to remove SSL 3.0 from public servers on the Internet. Given that, and its known vulnerabilities, it will no longer be enabled by default in the upcoming GnuTLS 3.4.0.

[Last update 2014-10-20]

PS. There are some recommendations to work around the issue by using RC4 instead of a block cipher in SSL 3.0. That would defeat the current attack, but it closes one door by opening another; RC4 is a broken cipher and there are known attacks which recover plaintext encrypted with it.

Thursday, April 17, 2014

software has bugs... now what?

The recent bugs uncovered in TLS/SSL implementations were received in the blogosphere with a quest for perfectly secure implementations that have no bugs. That is the modern quest for perpetual motion. Nevertheless, very few were bothered by the fact that the application's only line of security defence was a few TLS/SSL implementations. We design software in a way that openssl or gnutls become the Maginot line of security, and when they fail, they fail catastrophically.

So the question I find more interesting is: can we live with libraries that have bugs? Can we design resilient software that will operate despite serious bugs, and will provide us with the necessary time for a fix? In other words, could an application design have mitigated or neutralized the catastrophic bugs we saw? Let's look at each case separately.


Mitigating attacks that expose memory (heartbleed)

The heartbleed attack allows an attacker to obtain a random portion of a process's memory (there is a nice illustration in xkcd). In that attack, all the data held within a server process are at risk, including user data and cryptographic keys. Could an attack of this scale have been avoided?

One approach is to avoid putting all of our eggs in one basket; that's a long-time human practice, which is also used in software design. OpenSSH's privilege separation and the isolation of private keys using smart cards or software security modules are two prominent examples. The defence in that design is that the unprivileged process memory contains only data related to the current user's session, but no privileged information such as passwords or the server's private key. Could we have a similar design for an SSL server? We already have a similar design for an SSL VPN server, where revealing the worker processes' memory couldn't possibly reveal its private key. These designs come at a cost, though, and that cost is performance, as they need to rely on slow Inter-Process Communication (IPC) for basic functionality. Nevertheless, it is interesting to see whether we can re-use the good elements of these designs in existing servers.



A few years ago, in a joint effort of the Computer Security and Industrial Cryptography research group of KU Leuven and Red Hat, we produced a software security module (softhsm) for the Linux kernel, whose purpose was to prevent a server memory leak from revealing the server's private keys to an adversary. Unfortunately we failed to convince the kernel developers of its usefulness. That wasn't the first attempt at such a module. A user-space security module already existed, called LSM-PKCS11. Similarly to the one we proposed, it would provide access to the private key, but the operations would be performed in an isolated process (instead of the kernel). If such a module were in place in popular TLS/SSL servers, there would be no need to regenerate the server's certificates after a heartbleed-type attack. So what can we do to use such a mechanism in existing software?

The previous projects have been dead for quite some time, but there are newer modules, like OpenDNSSEC's softhsm, which are active. Unfortunately softhsm is not a module that enforces any type of isolation. Thus a wrapper PKCS #11 module over softhsm (or any other software HSM) that enforces process isolation between the keys and the application using them would be a good starting point. GnuTLS and NSS provide support for using PKCS #11 private keys, and adding support for such a module in apache's mod_gnutls or mod_nss would be trivial. OpenSSL would still need some support for PKCS #11 modules to use it.
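
To give an idea of the application side, with GnuTLS pointing a server at a PKCS #11-held key is mostly a matter of passing a pkcs11: URL where a file name used to go; the URL below is a placeholder:

    /* xcred is a gnutls_certificate_credentials_t; the URL is hypothetical. */
    gnutls_certificate_set_x509_key_file(xcred, "server-cert.pem",
                                         "pkcs11:token=MyHSM;object=server-key",
                                         GNUTLS_X509_FMT_PEM);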

Note, however, that such an approach would take care of the leakage of the server's private key, but it would not prevent user data (e.g., user passwords) from being leaked. That of course could be handled by a similar isolation approach in the web server, or even the same module (though not over the PKCS #11 API).

So if the solution is that simple, why isn't it already deployed? Well, there is always a catch; and that catch, as I mentioned before, is performance. A simple PKCS #11 module that enforces process isolation would introduce overhead (due to IPC), and is most certainly going to become a bottleneck. That could be unacceptable for many high-load sites, but on the other hand it could be a good driver to optimize the IPC code paths.


Mitigating an authentication issue

The question here is what could be done about the bugs found in GnuTLS and Apple's SSL implementation, which allowed certain certificates to always pass authentication. That is related to a PKI failure, and in fact the same defences required to mitigate a PKI failure can be used. A PKI failure is typically a CA compromise (e.g., the Diginotar issue).

One approach is again along the same lines as above: avoid reliance on a single authentication method, i.e., use two-factor authentication. Instead of relying only on PKI, combine password (or shared-key) authentication with PKI over TLS. Unfortunately, that's not as simple as a password exchange over TLS, since that is vulnerable to eavesdropping once the PKI is compromised. To achieve two-factor authentication with TLS, one can negotiate a session using TLS-PSK (or SRP), and renegotiate on top of it using certificates and PKI. That way both factors are accounted for, and the compromise of either factor doesn't affect the other.
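
A sketch of the first factor with GnuTLS follows (the identity, key and priority string are placeholders written from memory; the certificate-based renegotiation that adds the second factor is omitted):

    gnutls_psk_client_credentials_t pskcred;
    const gnutls_datum_t key = { (unsigned char *)"\x01\x02\x03\x04", 4 };

    gnutls_psk_allocate_client_credentials(&pskcred);
    gnutls_psk_set_client_credentials(pskcred, "alice", &key, GNUTLS_PSK_KEY_RAW);

    gnutls_init(&session, GNUTLS_CLIENT);
    gnutls_priority_set_direct(session, "NORMAL:+ECDHE-PSK:+DHE-PSK:+PSK", NULL);
    gnutls_credentials_set(session, GNUTLS_CRD_PSK, pskcred);
    /* ... handshake; a renegotiation with certificate credentials on top
     * of this session would then provide the second factor ... */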

Of course, the above is a quite over-engineered authentication scenario; it requires significant changes to today's applications, and also imposes a significant penalty in performance and network communication, as performing the authentication twice requires twice the round-trips. However, a simpler approach is to rely on trust-on-first-use, essentially SSH-style authentication on top of PKI. That is, use PKI to verify new keys, but follow the SSH approach for previously known keys. That way, if the PKI verification is compromised at some point, it would affect only new sessions to unknown hosts. Moreover, for an attack to remain undetected the adversary is forced to operate a man-in-the-middle attack indefinitely, and if we have been under attack, a discrepancy in the server key will be detected on the first connection to the real server.


In conclusion, it seems that we can have software resilient to bugs, no matter how serious, but that software will come at a performance penalty and will require more conservative designs. While there are applications that mainly target performance and could not always benefit from the above ideas, there are several applications that can sacrifice a few extra milliseconds of processing for a design more resilient to attacks.