Wednesday, April 18, 2012

A flaw in the smart card Kerberos (PKINIT) protocol

Reading security protocols is not always fun nor easy. Protocols like public key Kerberos are hard to read because they just define the packet format and expect the reader to assume a correct message sequence. I read it, nevertheless, because I was interested on the protocol's interaction with smart cards. If you are not aware of the protocol, public key Kerberos or PKINIT is the protocol used in Microsoft Active Directory and described in RFC4556.

The idea of the protocol is to extend the traditional Kerberos, that supports only symmetric ciphers, with digital signatures and public key encryption in order to support stock smart cards. A use-case is, for example, logging in a windows domain using the smart card. The protocol itself doesn't mention smart cards at all, probably because it was thought as a deployment issue. Nevertheless, it was believed to be a secure protocol and several published papers provided proofs of security for all operational modes of the protocol.

However, the protocol has an important flaw. A flaw that makes it insecure if used with smart cards. I wrote a detailed report on the flaw, but the main idea is that if one has access for few minutes to your smart card he can login using your credentials at any time in the future. You may think that an attack like that can be prevented by never lending your smart card to anyone, but how can you prove that no-one borrowed it for a while? Or if you believe the smart card PIN would protect from theft, how could you know that the reader you are inserting your card isn't tampered? And in a protocol with smart cards you'd expect the tampered reader not to be able to use the card after you retrieve it. This is not the case, making the protocol unsuitable for smart cards, its primary use-case.


What is the most interesting issue however, are the security proofs. The protocol was proven secure, but because the protocol never mentioned smart cards, researchers proved its security on a different setting than the actual use cases. So when reading a security proof, always check the assumptions, which are as important as the proof itself.

Few things went bad with the design of this protocol, none of which is actually technical. The protocol is hard to read, and in order to get an overview of it, you have to read the whole RFC. This is just bad. If you check figure 1 in the TLS RFC you get an overview of the protocol immediately. You might not know the actual contents of the messages but the sequence is apparent. This is not possible in the Kerberos protocol, and that discourages anyone who might want to understand the protocol using a high level description of it. Another flaw, is that the protocol doesn't mention smart cards, its primary use-case. Smart cards were treated as a deployment issue and readers of the RFC, would never know about it. The latter issue is occurring in many of the IETF protocols and the readers are expected to know where and how this protocol is used. As it was demonstrated by the security proofs on a different setting, this is not the case.

So, what can it be done to mitigate the flaw? Unfortunately without modifying the protocol, the only advice that can be given is something along the:
  • make sure you always possess the card;
  • make sure you never use a tampered smart card reader.
Which may seem pretty useless in a typical working environment. What makes the attack nasty, is that if an adversary tampers your reader, and you use your card with it now, the adversary can perform a transaction a year later when you'll have no clue on what happened and might be no evidence of the tampered reader.

Sunday, April 1, 2012

TLS in embedded systems

In some embedded systems space may often be a serious constraint. However, there are many such systems that contain several megabytes of flash either as an SD memory card, or as raw NAND, having no real space constraint. For those systems using a TLS implementation such as GnuTLS or OpenSSL would provide performance gains that are not possible with the smaller implementations that target small size. That is because both of the above implementations, unlike the constraint ones, support cryptodev-linux to take advantage of cryptographic accelerators, widely present in several constraint CPUs, and support elliptic curves to optimize performance when perfect forward secrecy is required.

I happened to have an old geode (x86 compatible) CPU which contained an AES accelerator, so here are some benchmarks created using the nxweb/GnuTLS and nginx/OpenSSL web servers and the httpress utility. The figure on the right shows the data transferred per second using AES in CBC mode, with the cryptographic accelerator compared to GnuTLS' and  OpenSSL's software implementations. We can clearly see that download speed almost doubles on a big file transfer when using the accelerator.


The figure on the left shows a comparison of the various key exchange methods in this platform using GnuTLS and OpenSSL. The benchmark measures HTTPS transactions per second and the keys and parameters used are the same for both implementations. The key sizes are selected of equivalent security levels (1776 bits in RSA and DH are equivalent to 192 bits in ECDH according to ECRYPT II recommendations). We can see that the elliptic curve version of Diffie Hellman (ECDHE-RSA) allows 25% more transactions in both implementations comparing to the Diffie-Hellman on a prime field (DHE-RSA). The plain RSA key exchange remains the fastest, at the cost of sacrificing perfect forward secrecy.

As a side-note it is nice to see that at the security level of 192 bits GnuTLS outperforms OpenSSL on this processor. The trend continues on higher security levels for the RSA and DHE-RSA methods but the ECDHE-RSA method is interesting since even though OpenSSL has a more efficient elliptic curve implementation. GnuTLS' usage of nettle and GMP (which provide a faster RSA implementation) compensates, and their performance is almost identical.

Overall, in the few embedded systems that I've worked on, space was not a crucial limiting factor, and it was mainly this work that drove me into updating cryptodev for linux. In those systems the space cost occurred due to the usage of a larger library was compensated by (1) the off-loading to the crypto processor of operations that would otherwise load the CPU and (2) the reduce in processing time due to elliptic curves.
However this balance is system specific and what was important for my needs would not cover everyone elses, so it is important to weigh the advantages and disadvantages of cryptographic implementations on your system alone.