ewx: (geek)
Richard Kettlewell ([personal profile] ewx) wrote2008-11-08 02:47 pm
Entry tags:

Regarding DSA 1571

I wrote this a while ago but never got round to posting it. The dust has since settled but it might be interesting to some nonetheless.

1) I try to use valgrind on something that involved OpenSSL. I can't even remember what. The exercise is significantly hampered by numerous false positives.

2) I identify OpenSSL's practice of passing uninitialized data to its RNG as part of the problem. This happens in two places.

(i) When RAND_poll (in crypto/rand/rand_unix.c) collects entropy from /dev/urandom (or an EGD) , it reads data into a buffer called tmp_buf. It might not fill this buffer, but it nonetheless passes the whole buffer into the RNG. The balance of the buffer is uninitialized and valgrind warnings result.

I spotted this one, and my fix was to zero out the buffer before filling it. This may not be entirely obvious from the patch in the bug report but if you read it in context then it is plainly harmless.

(ii)When you ask for a random number, ssleay_rand_bytes adds the output buffer you're going to use for the answer into its calculation of new random data. I didn't spot this one. This is where the PURIFY ifdef can be found.

My fix for (i) was perfectly safe. Also, it does not cost you anything: if you were analyzing the security of OpenSSL you would have to ignore any contribution to entropy from the uninitialized part of the buffer; its uninitialized state means the compiler and language standard don't make any promises, it does not mean its unpredictable on a given implementation. (Early in a program it is relatively likely to be all 0s for instance). So this change would not have removed any security guarantee worth the name.

Since I apparently didn't spot (ii) I didn't make any changes there, though if I had that would have been safe too for the same reason.

3) I send my patch for (i) to Debian. I am after all using a Debian system.

4) Debian's OpenSSL maintainer spots (ii) as well but misunderstands (i), apparently concluding that the problem is not RAND_poll passing (partially) uninitialized data but in ssleay_rand_add reading that data at all. In other words, they've assigned the blame for the valgrind warning to the code that performs the read, rather than the code that fails to do a write.

5) Kurt then mails openssl-dev about it, and gets a positive reaction a member of the development team. openssl-dev is the address the README told you to use:

Development is coordinated on the openssl-dev mailing list (see
http://www.openssl.org for information on subscribing). If you
would like to submit a patch, send it to openssl-dev@openssl.org with
the string "[PATCH]" in the subject. Please be sure to include a
textual explanation of what your patch does.

6) Kurt makes his proposed changes and uploads the new package.

Henceforth anyone who installs this tainted OpenSSL suffers two problems. Firstly there is no longer adequate diversity of keys generated with it. Secondly if they make any DSA signatures, they risk exposing the value of the private key.

(You might want to rethink how much you rely on DSA in the light of this; requiring a source of cryptographic-strength random numbers for key generation is fair enough but additionally requiring it for every signature if you don't want to expose your private key seems a little on the risky side!)

7) Luciano Bello spots the problem. The bug gets fixed and a DSA is issued. There's a great deal of informed and uninformed comment online. A number of people blame me, assuming that the only patch that appears in the bug report is the faulty one, and not bothering to check either what it does or what the actual change was. Obviously this does not make me happy.

(There's a lot of great work done by various people to clean up the mess, too.)

8) Ben Laurie complains that vendors are bad for security.

There's some truth in there: people should indeed not try to change code they don't understand, and coordinating with the people who maintain it is an obvious way to avoid pitfalls. However, Kurt did mail the advertized address, and did get a response from a member of the OpenSSL development team, so even though he screwed up, this safety net failed dismally too.


Some people have criticized the original motivation for any change as putting debugging ahead of security. Not only is this not true - the original change had no impact on security - but it is completely wrong-headed: in general software that is harder to debug is software that is harder to secure, and in particular the kind of bugs valgrind is good at spotting are just the kind that may turn out to be security holes.

[identity profile] cartesiandaemon.livejournal.com 2008-11-09 01:39 am (UTC)(link)
(Thanks, that sounds very well described.)

so even if he screwed up, this safety net failed dismally too.

This is a very nice metaphor :) Unlike many people's attempts to pin the blame, it clearly approximates what actually happened.