It's about two years ago that I played with some code for factoring small RSA moduli. About a month ago I was checking if it's possible to crack moduli in bulk. Can you collect a million of them, and crack them a million times as fast? It can speed up the process, but it doesn't really help. Normal RSA keys are so large that even with a big bunch of them, normal factoring algorithms don't 'get lucky'.
Searching bulk-cracking attacks brought me to this page with a great explanation on cracking RSA moduli that share a common GCD. To put it short: an RSA modulus is a multiple of two primes: p and q. If two moduli share a common factor (p or q), you can find this value extremely fast, because we have an efficient algorithm for finding the GCD of two numbers.
One of the best resources on this topic is the paper Mining Your Ps and Qs. Not only does it explain the attack in great detail, it also shows amazing results of successful large-scale cracking attempts. A summary of the results can be found on their website, which also hosts the source code for their tool fastgcd, that can be used to find GCD pairs really fast. However, their websites states: "We have not released our code for distributed scanning or key collection". I wanted to make my own tool to mine moduli, and check if this attack still works in 2017.
The first few days I spent creating some tool using Python, with Queue for threading, sqlite for a database, normal threading for mutex locks, OpenSSL with Crypto.Util for parsing certificates and started scanning some AWS and Dutch IP ranges. It was just awful. I used special scripts to create backups, export stuff from the database and parse results. The only useful part of it is this code snippet to retrieve a modulus from a server.
After some days painfully running scans I noticed that I really just created a crappy portscanner. So I looked over at nmap. As it turns out, nmap parses TLS certificates with the function get_ssl_certificate, and for RSA certificates "it will also have an exponent member, containing the public exponent as a bignum.". That's way better. I forked nmap and changed the source code to return the modulus instead of the exponent, with a small script to show the modulus. Now I can do nmap scans and just parse the results like a sane person.
After some more days running scans I retrieved about 250k moduli, and fastgcd found that 6 of them were vulnerable! After checking out the devices, all 6 seem to be Juniper Netscreen VPN firewalls. These devices contained a known backdoor that allowed for the decryption of VPN traffic. I think it's interesting to see that these properties are also detectable from their generated TLS certificates. Anyway, I showed these results to some co-workers, and one informed me that all certificates in public IPv4 address space can be downloaded from the Censys Data Collection!
After downloading some huge certificate files from Censys, I started extracting, sorting and factoring moduli. I used fastgcd on blocks of millions at a time, because the tool requires a lot of memory. By the way, this cool paper talks about a tool to utilize the GPU for performing this task at high speed. I spent some time finding a bunch of vulnerable moduli, mainly from Fritz!Box's for some reason, and it got boring. I thought about creating a tool to test new moduli against all the certificates from Censys, but this tool already exists.
To sum it up, I spent two resourceful weeks re-inventing wheels. Some devices have trouble creating proper random numbers, and this introduces cryptographic flaws. The internet is hard at work to rub it in their faces.