Beyond that, generating Diffie-Hellman primes of 2048 bits or more takes a non-trivial amount of time, sometimes close to a minute. This isn't feasible for servers to do on launch, and might not even be feasible during install, which brings us back to predistributed parameters.
Compare
openssl dhparam 2048
which generates 2048-bit Diffie-Hellman parameters, to
openssl genrsa 4096
which generates a 4096-bit RSA key (containing two 2048-bit primes), and note that the second is dramatically faster than the first. On-the-fly DH parameter generation is really slow.
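One likely reason for the gap: dhparam looks for a "safe prime" (a prime p where (p-1)/2 is also prime), and those are much rarer than ordinary primes of the same size. Here is a rough Python sketch of that difference at a toy size of 64 bits. The function names are made up for illustration, and this is not a description of how OpenSSL actually searches; it only counts how many random candidates each kind of search goes through.

```python
import secrets

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % small == 0:
            return n == small
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random witness in [2, n-2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a proves n composite
    return True

def random_odd(bits):
    # Set the top bit (exact bit length) and the low bit (odd).
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1

def find_prime(bits):
    """Ordinary prime, the kind an RSA key needs."""
    tries = 0
    while True:
        tries += 1
        n = random_odd(bits)
        if is_probable_prime(n):
            return n, tries

def find_safe_prime(bits):
    """Safe prime: p and (p - 1) / 2 must both be prime."""
    tries = 0
    while True:
        tries += 1
        p = random_odd(bits)
        if is_probable_prime(p) and is_probable_prime((p - 1) // 2):
            return p, tries

prime, prime_tries = find_prime(64)
safe, safe_tries = find_safe_prime(64)
print("candidates tried for an ordinary prime:", prime_tries)
print("candidates tried for a safe prime:", safe_tries)
```

The counts are random, so a single run proves nothing, but over several runs the safe-prime search typically needs far more candidates, and the gap widens as the bit size grows.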
However, generating 1024-bit parameters is probably fast enough to do on launch or install. Under the authors' estimates this might be fairly safe today because the adversary will have to spend many millions of dollars to attack your individual service (and you could change the parameters once a day or something if you wanted). But I think the authors agreed that large predistributed parameters make a better tradeoff for most cases.
There is an RFC about to be issued listing such parameters.
hobbes@namagiri:~$ time openssl dhparam -5 2048
Generating DH parameters, 2048 bit long safe prime, generator 5
This is going to take a long time
............[snip]......++*++*
-----BEGIN DH PARAMETERS-----
MIIBCAKCAQEAzshMWp3IBjMW5Aia2wJOvA2EBY32Mn2fMXzlyFDklnRUg8ff/19A
YWbRA4RAXrBMxoXEH1LVVpm5l89PGZ3DzjDafuNzNskgUhcAewUXXMQdkOFnPHYc
5F7+3DS8981Q0Q05qscBb26YGb2XaoJygyVj+B87NTvdAPzNU4fW5DyCuxhf5eov
ZeZwcC4KZ31Lr7enFcFBjTjxQxW88pP4YhiNYQ1fsFARGJT0X7ksOlRVWFrODu6b
nq0Ye/UWe0WB1zzmxz66ZujAwRwAgfmQZd7rILJqg68sxBeg88FlXJUeKfPL/bIT
KOl1LhiHr/HkUBfgZRahK0MGcthwiLdFZwIBBQ==
-----END DH PARAMETERS-----
real 0m37.073s
user 0m33.464s
sys 0m2.924s
hobbes@namagiri:~$
Edit: second run:
real 0m35.362s
user 0m31.992s
sys 0m2.880s
Would somebody post the steps I should use to make my own OpenSSH sshd(s) use non-standard DH parameters? Do I need to do anything to my clients?
Bit late to the party here, but this is why I wrote https://2ton.com.au/dhtool/ ... I have 48 CPU cores still dedicated to constantly generating safe primes and generators (have amassed a huge number of 2k, 3k, 4k and 8k DH parameters if anyone's interested)
The dhparam command has mostly been used with Apache, as far as I know; ssh provides its own commands as part of ssh-keygen, which are described in the man page you just mentioned.
However, the paper authors say:
"If you use SSH, you should upgrade both your server and client installations to the most recent version of OpenSSH, which prefers Elliptic-Curve Diffie-Hellman Key Exchange."
That might ultimately be a simpler and safer course unless you have to deal with old clients that you don't have the ability to upgrade.
Generating DH parameters, 2048 bit long safe prime, generator 5
This is going to take a long time
Haha! It also made my CPU sweat nicely. I hit Ctrl+C after it ran for nearly 10 minutes with no sign of stopping though :( I see what you mean when you guys say it isn't feasible on every install of everything and why these are reused. I do think software maintainers should make an effort to do this regularly when new patches are released.
This actually opens up a new question in my mind: how does Open Source Software manage to keep these keys secret? Off to google...
Here, the dhparam command is creating g and p (well, you're supplying g as a command-line argument, and the command is choosing p); those are the public parameters which can be published anywhere, can be given to anyone, and can be re-used by multiple sites and services (although the last case makes life easier for attackers if the p value is too small, as the researchers indicate it is for some popular Internet standards and software configurations).
When a Diffie-Hellman key exchange happens, the two sides agree (not necessarily in a secret way, indeed normally not in a secret way) on what g and p to use, and then one side secretly chooses a, the other side secretly chooses b, and they do the math that results in both sides knowing the same combined secret without other parties being able to determine it. Only a, b, and the resulting combined secret g^ab mod p are confidential; g and p aren't.
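If it helps to see the arithmetic, here is a minimal Python sketch of that exchange. The values p = 23 and g = 5 are toy numbers chosen only so the results are easy to inspect (a real p would be 2048 bits or more), and dh_keypair is a made-up helper name.

```python
import secrets

# Public parameters: anyone may know these. Toy values, far too small for real use.
p = 23
g = 5

def dh_keypair(p, g):
    """Pick a secret exponent and compute the public value g^secret mod p."""
    private = secrets.randbelow(p - 3) + 2  # secret in [2, p-2]
    public = pow(g, private, p)
    return private, public

a, A = dh_keypair(p, g)  # one side keeps a secret and sends A
b, B = dh_keypair(p, g)  # the other side keeps b secret and sends B

# Each side combines its own secret with the other side's public value.
shared_a = pow(B, a, p)  # (g^b)^a mod p
shared_b = pow(A, b, p)  # (g^a)^b mod p

assert shared_a == shared_b  # both sides arrive at g^(ab) mod p
print("public:", p, g, A, B)
print("shared secret:", shared_a)
```

An eavesdropper sees p, g, A and B, but would need to recover a or b from them (a discrete logarithm) to get the shared secret. With p = 23 that is trivial, which is exactly why the size of p matters.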
Diffie-Hellman can also provide forward secrecy because you can deliberately forget what values of a and b you used on a particular occasion, and then you can't reconstruct them (unless you can compute discrete logarithms, which is supposed to be difficult). That's not true for key exchange using RSA for confidentiality, where, if you still have the private key, you can still read your messages that were sent to you in the past using that private key.
Ciphersuites in TLS that use Diffie-Hellman in an ephemeral way ("DHE") will still use RSA, but for identifying the server (and optionally the client) and confirming that no man-in-the-middle attack has happened. So, they only use RSA for signatures, not for confidentiality, and they still get forward secrecy if they forget the DH private values associated with individual sessions. (Some servers use the same b value for multiple incoming connections, which reduces the degree of forward secrecy they get: if someone hacked them or seized their server while it still contained a particular b value, that person could decrypt previously recorded TLS sessions with that server that used the same b value.)
... thank you for trying to explain. I am still too thick to get it on first read. I have saved this in Evernote and will keep coming back to it after researching terms etc., until I understand in full.
I'm sure you'll want to keep reading about this, but let me just set out:
Diffie-Hellman involves four numbers, which are called g, p, a, and b (and which are used to calculate other numbers).
g and p are public, and are often used "for the long term" (including sharing g and p values between different Internet sites and services). The researchers found that the problem is that a lot of sites use the same p values, and the ones they use are too small, so that there is a single calculation that a government can do which allows them to break DH that uses those particular p values. Although this calculation is phenomenally expensive, it seems likely that NSA did it several years ago.
a and b are private and must remain secret. Commonly a different a and b are chosen each time Diffie-Hellman is used (like for each new connection or session with an Internet service).
Diffie-Hellman provides a property called forward secrecy because it allows two ends of a connection to derive a session key (which gets calculated using information derived from g, p, a, and b, and then gets used as a key for some other cryptosystem, normally some form of AES today, but in theory it could be anything) but then later "forget" what the session key was and how it was derived. Unlike other ways of doing key exchange, when the parties choose to "forget" their session key and associated information this way, they no longer have a way to recover or reconstruct it!
This is useful because, for example, if someone hacks the computer of someone who was using protocols with ephemeral DH, the computer likely won't contain information that could be used to reconstruct the session key and decrypt old communications. (That's assuming that the software scrubs key material from memory when it's no longer needed... which might not always be true.)
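To make the "forgetting" idea concrete, here is a small Python sketch of two sessions that each use fresh a and b values. Everything in it is an assumption for illustration: the modulus 2**127 - 1 is a toy prime, nowhere near a safe size and not a vetted group; g = 3 is an arbitrary choice; hashing the shared secret with SHA-256 stands in for the key derivation a real protocol would use; and run_session is a made-up name.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus; NOT a safe size or a vetted DH group
G = 3           # toy generator, assumed for illustration

def run_session():
    """Run one ephemeral exchange; return what an eavesdropper sees, plus the key."""
    a = secrets.randbelow(P - 3) + 2  # fresh secret for this session only
    b = secrets.randbelow(P - 3) + 2  # fresh secret for this session only
    A = pow(G, a, P)
    B = pow(G, b, P)
    shared_client = pow(B, a, P)
    shared_server = pow(A, b, P)
    assert shared_client == shared_server
    # Turn the shared number into a 32-byte key (stand-in for a real KDF).
    key = hashlib.sha256(shared_client.to_bytes(16, "big")).digest()
    # "Forget" the secrets. Note that del only drops the references; Python
    # does not scrub the memory, which is the caveat mentioned above.
    del a, b, shared_client, shared_server
    return (A, B), key

transcript1, key1 = run_session()
transcript2, key2 = run_session()
print(key1.hex())
print(key2.hex())
```

Each session ends up with an unrelated key, and once a and b are gone, nothing kept on either machine can rebuild an old key from the recorded A and B values. With RSA key exchange, by contrast, the long-term private key stays around and can still decrypt old recorded sessions.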