As part of the IP reputation project, we are writing a small engine that avoids false positives by whitelisting some common IPs and networks.
When you execute a signed binary in a sandbox, you usually see a lot of connections to the servers hosting the Certificate Revocation Lists (CRL) and the Online Certificate Status Protocol (OCSP) responders.
To avoid processing these IPs, we use some scripts that extract the most commonly used CRL and OCSP servers from certificates.
Right now we are using the EFF SSL Observatory dataset and also the Alexa Top 1M list.
Let’s begin with the SSL Observatory database. Once the MySQL database is ready, run the following SQL query to extract the OCSP URIs:
select `X509v3 extensions:Authority Information Access:OCSP - URI` as ocsp, count(*) as total INTO OUTFILE '/tmp/ocsp.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' from all_certs where `X509v3 extensions:Authority Information Access:OCSP - URI` is not NULL group by ocsp order by total desc;
Then you can use this script http://alienvault-labs-garage.googlecode.com/svn/trunk/certs/ocsp.py [no longer available] to parse the file:
tor@tor-VirtualBox:~$ python ocsp.py /tmp/ocsp.csv
ocsp.godaddy.com
ocsp.starfieldtech.com
ocsp.startssl.com
ocsp.cacert.org
ocsplevel101.ipsca.com
...
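Since the original ocsp.py is no longer available, here is a minimal sketch of what such a parser might look like. It is not the original script: the function name hosts_from_csv and the sample rows (including their counts) are made up for illustration, and it assumes each CSV row has the shape "<uri>",<count> as written by the query above.

```python
import csv
import io
from urllib.parse import urlparse


def hosts_from_csv(lines):
    """Yield unique hostnames from rows shaped like: "<uri>",<count>."""
    seen = set()
    for row in csv.reader(lines):
        if not row or not row[0].strip():
            continue
        uri = row[0].strip()
        # urlparse only fills in the hostname when a scheme or "//" is present.
        if "://" not in uri:
            uri = "//" + uri
        host = urlparse(uri).hostname  # lower-cased by urlparse
        if host and host not in seen:
            seen.add(host)
            yield host


# Made-up sample rows standing in for /tmp/ocsp.csv.
sample = io.StringIO(
    '"http://ocsp.godaddy.com/",3\n'
    '"http://ocsp.godaddy.com",2\n'
    '"ocsp.startssl.com/sub/class1/server/ca",1\n'
)
for host in hosts_from_csv(sample):
    print(host)
```

Because the query already sorts by total, keeping the order of first appearance leaves the most common servers at the top. To read the real export, pass an open file object instead of the sample.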
We can do the same for CRL entries using this other script http://alienvault-labs-garage.googlecode.com/svn/trunk/certs/crl.py [no longer available]:
tor@tor-VirtualBox:~$ python crl.py /tmp/crls.csv
crl.geotrust.com
crl.comodoca.com
crl.comodo.net
SVRIntl-crl.verisign.com
...
The last script I want to share, http://alienvault-labs-garage.googlecode.com/svn/trunk/certs/alexa_top_certs.py [no longer available], parses the Alexa Top 1M list, fetches the SSL certificate of each site that supports HTTPS, and then extracts the OCSP/CRL URIs:
jaimes-MacBook-Pro:PKIS jaime$ python2.7 alexa_top_certs.py
http://crl.thawte.com/ThawteSGCCA.crl
http://ocsp.thawte.com
http://SVRIntl-crl.verisign.com/SVRIntl.crl
http://ocsp.verisign.com
http://www.gstatic.com/GoogleInternetAuthority/GoogleInternetAuthority.crl
http://crl.geotrust.com/crls/secureca.crl
...
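The original alexa_top_certs.py targeted Python 2.7 and is no longer available, so the following is only a rough Python 3 sketch of the same idea, using the standard ssl module. The function names fetch_peer_cert and pki_uris are hypothetical. It relies on the dictionary returned by getpeercert(), which exposes the OCSP and crlDistributionPoints entries only for certificates that pass validation, so sites with invalid certificates would raise an error and have to be skipped.

```python
import socket
import ssl


def fetch_peer_cert(host, port=443, timeout=5.0):
    """Return the validated server certificate as a dict (see ssl.getpeercert)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def pki_uris(cert):
    """Collect the CRL and OCSP URIs present in a getpeercert() dictionary."""
    uris = []
    for key in ("crlDistributionPoints", "OCSP"):
        # Both keys are optional; each holds a tuple of URI strings.
        uris.extend(cert.get(key, ()))
    return uris
```

A scan over the list could then call pki_uris(fetch_peer_cert(host)) for each host and catch OSError (which also covers ssl.SSLError and timeouts) to skip sites that do not answer on HTTPS or fail validation.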
Combining these outputs gives us a list of the most commonly used PKI servers, whose traffic we can classify as normal activity.
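As an illustration of that last step, here is a small sketch that merges hostnames and full URIs into a single set of whitelisted hostnames. The helper name build_whitelist is hypothetical and the inputs are a few entries copied from the outputs above; how the real engine stores or matches its whitelist is not covered here.

```python
from urllib.parse import urlparse


def build_whitelist(*sources):
    """Merge iterables of hostnames or URIs into a set of lower-case hostnames."""
    hosts = set()
    for source in sources:
        for entry in source:
            entry = entry.strip()
            if not entry:
                continue
            # Bare hostnames need a leading "//" for urlparse to see a netloc.
            if "://" not in entry:
                entry = "//" + entry
            host = urlparse(entry).hostname
            if host:
                hosts.add(host)
    return hosts


from_observatory = ["ocsp.godaddy.com", "crl.geotrust.com"]
from_alexa = ["http://crl.geotrust.com/crls/secureca.crl", "http://ocsp.thawte.com"]

whitelist = build_whitelist(from_observatory, from_alexa)
for host in sorted(whitelist):
    print(host)
# prints:
# crl.geotrust.com
# ocsp.godaddy.com
# ocsp.thawte.com
```

Normalising everything to hostnames removes duplicates that differ only in path or letter case, such as the same CRL server appearing in both datasets.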