Massively collecting CRL and OCSP information

November 3, 2011  |  Jaime Blasco

As part of the IP reputation project, we are writing a small engine that avoids false positives by whitelisting some common IPs and networks.

Usually, when you execute a signed binary in a sandbox, you see a lot of connections to the servers hosting the Certificate Revocation Lists (CRL) and to the Online Certificate Status Protocol (OCSP) responders, because the system checks whether the signing certificate has been revoked.

To avoid processing these IPs, we use some scripts that extract the most used CRL and OCSP servers from certificates.

Right now we are using the EFF SSL Observatory dataset and also the Alexa Top 1M list.

Let’s begin with the SSL Observatory database. Once the MySQL database is ready, execute the following SQL query to extract the OCSP URIs:

select `X509v3 extensions:Authority Information Access:OCSP - URI` as ocsp, count(*) as total
INTO OUTFILE '/tmp/ocsp.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
from all_certs
where `X509v3 extensions:Authority Information Access:OCSP - URI` is not NULL
group by ocsp order by total desc;

Then you can use this script http://alienvault-labs-garage.googlecode.com/svn/trunk/certs/ocsp.py [no longer available] to parse the file:

tor@tor-VirtualBox:~$ python ocsp.py /tmp/ocsp.csv

ocsp.godaddy.com

ocsp.starfieldtech.com

ocsp.startssl.com

ocsp.cacert.org

ocsplevel101.ipsca.com

...
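Since ocsp.py is no longer available, here is a minimal sketch of what such a parser might look like. It is not the original script. It assumes the CSV rows have the shape produced by the query above (a URI, then a count), and the sample counts are made up purely for illustration; `hosts_from_csv` is a hypothetical name.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlparse


def hosts_from_csv(lines):
    """Sum certificate counts per hostname from rows of (uri, total)."""
    counts = Counter()
    for row in csv.reader(lines):
        if len(row) < 2:
            continue
        uri, total = row[0].strip(), row[1].strip()
        if not total.isdigit():
            continue
        # Some entries may be a bare host rather than a full URI (assumption)
        try:
            parsed = urlparse(uri if "://" in uri else "http://" + uri)
        except ValueError:
            continue  # skip malformed URIs
        if parsed.hostname:
            counts[parsed.hostname.lower()] += int(total)
    return counts


# Made-up sample rows standing in for /tmp/ocsp.csv
sample = io.StringIO(
    '"http://ocsp.godaddy.com/",120\n'
    '"http://ocsp.godaddy.com",30\n'
    '"http://ocsp.startssl.com/sub/class1/server/ca",45\n'
)
for host, total in hosts_from_csv(sample).most_common():
    print(host, total)
```

To run it against the real export, pass an open file instead of the sample, for example `hosts_from_csv(open("/tmp/ocsp.csv", newline=""))`.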

We can do the same for CRL entries using this other script http://alienvault-labs-garage.googlecode.com/svn/trunk/certs/crl.py [no longer available]:

tor@tor-VirtualBox:~$ python crl.py /tmp/crls.csv

crl.geotrust.com

crl.comodoca.com

crl.comodo.net

SVRIntl-crl.verisign.com

...

The other script I want to share, http://alienvault-labs-garage.googlecode.com/svn/trunk/certs/alexa_top_certs.py [no longer available], parses the Alexa Top 1M list, retrieves each site's SSL certificate if HTTPS is supported, and then extracts the OCSP/CRL URIs:

jaimes-MacBook-Pro:PKIS jaime$ python2.7 alexa_top_certs.py

http://crl.thawte.com/ThawteSGCCA.crl

http://ocsp.thawte.com

http://SVRIntl-crl.verisign.com/SVRIntl.crl

http://ocsp.verisign.com

http://www.gstatic.com/GoogleInternetAuthority/GoogleInternetAuthority.crl

http://crl.geotrust.com/crls/secureca.crl

...
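The per-site step can be sketched with Python 3's standard library alone. This is an assumption-laden illustration, not the original alexa_top_certs.py (which ran under Python 2.7): it relies on `ssl.SSLSocket.getpeercert()`, which exposes the `OCSP` and `crlDistributionPoints` entries only for certificates that pass validation, so hosts with untrusted certificates would be skipped. The names `revocation_uris` and `fetch_cert` are hypothetical, and the sample dictionary just reuses two URIs from the output above.

```python
import socket
import ssl


def revocation_uris(cert):
    """Collect OCSP and CRL URIs from a dict shaped like getpeercert()."""
    uris = []
    for key in ("OCSP", "crlDistributionPoints"):
        uris.extend(cert.get(key, ()))
    return uris


def fetch_cert(host, port=443, timeout=5.0):
    """Connect over TLS and return the validated peer certificate as a dict."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Illustrative dict in the shape getpeercert() returns (no network needed)
sample_cert = {
    "OCSP": ("http://ocsp.thawte.com",),
    "crlDistributionPoints": ("http://crl.thawte.com/ThawteSGCCA.crl",),
}
for uri in revocation_uris(sample_cert):
    print(uri)
```

For a live host, `revocation_uris(fetch_cert(host))` would do the same over the network; wrapping that call in `try/except OSError` lets a scan continue past sites that do not support HTTPS or fail validation.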

Combining these outputs gives us a list of the most used PKI servers, whose traffic we can classify as normal activity.
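One way to combine the outputs, sketched under the assumption that each script's output has been saved as lines of either bare hostnames or full URIs (both forms appear above). `merge_hosts` and `to_host` are hypothetical helper names, not part of the original scripts.

```python
from urllib.parse import urlparse


def to_host(entry):
    """Normalize a bare hostname or a full URI to a lowercase hostname."""
    entry = entry.strip()
    if not entry or entry == "...":
        return None
    try:
        parsed = urlparse(entry if "://" in entry else "http://" + entry)
    except ValueError:
        return None  # skip malformed entries
    return parsed.hostname.lower() if parsed.hostname else None


def merge_hosts(*outputs):
    """Union the hostnames found in several iterables of output lines."""
    hosts = set()
    for output in outputs:
        for line in output:
            host = to_host(line)
            if host:
                hosts.add(host)
    return hosts


# A few lines taken from the script outputs shown earlier
ocsp_out = ["ocsp.godaddy.com", "ocsp.startssl.com"]
crl_out = ["crl.geotrust.com", "SVRIntl-crl.verisign.com"]
alexa_out = ["http://ocsp.thawte.com", "http://crl.geotrust.com/crls/secureca.crl"]

whitelist = merge_hosts(ocsp_out, crl_out, alexa_out)
print(sorted(whitelist))
print("crl.geotrust.com" in whitelist)
```

Lowercasing the hostnames keeps entries such as `SVRIntl-crl.verisign.com` from being listed twice when they appear with different casing.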
