Friday, February 11, 2011

XSS Awareness #2: Internet Explorer, MHTML and the case of Twitter's promo service

If you missed either of the other articles in this series, they can be found here:

MHTML: Microsoft's black sheep web page archive format

The first advisory I could find related to the latest wave of exploits through the MHTML protocol handler dates back to 2004. The protocol itself was proposed in 1999, which means that this problem has been known for more than half its lifetime. That's pretty spectacular. I'll admit I had barely heard of MHTML, let alone dug into what it was and why it existed, until the reports in early January this year.

In a nutshell, the MHTML protocol handler will look for a second, HTTP-header-like structure in what's being sent from the server. The regular HTTP header is followed by a double carriage return and line feed, "\r\n\r\n", after which the HTTP body begins. Provided that the URL is prefixed with "mhtml:", the MHTML handler kicks in at this point.

The MHTML handler will parse the HTTP body for MHTML headers, stopping when it finds a completely blank line ("\n\n" or "\r\n\r\n"). The headers it looks for include "Content-Location" and "Content-Transfer-Encoding". Having successfully found these, it will attempt to decode the MHTML resource, which follows the MHTML header. An HTTP body can include several MHTML resources, which are named through the Content-Location header. Which specific MHTML resource is executed is decided by an argument passed through the URL, following a trailing exclamation mark.
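To make the two steps above concrete, here is a minimal Python sketch of the parsing as described: splitting an "mhtml:" URL at its trailing exclamation mark, and reading a header block up to the first completely blank line. The function names and the example URL are made up for illustration, and this is only a rough model of the description, not Internet Explorer's actual implementation.

```python
import re

# A completely blank line: "\n\n" or "\r\n\r\n" (mixed forms also accepted here).
BLANK_LINE = re.compile(r"\r?\n\r?\n")


def split_mhtml_url(url):
    """Split 'mhtml:<resource-url>!<name>' into (resource_url, name).

    Hypothetical helper: the name after the last '!' selects which
    Content-Location is requested. Returns name=None if there is no '!'.
    """
    scheme = "mhtml:"
    if not url.lower().startswith(scheme):
        raise ValueError("not an mhtml: URL")
    rest = url[len(scheme):]
    resource_url, sep, name = rest.rpartition("!")
    if not sep:
        return rest, None
    return resource_url, name


def read_header_block(body):
    """Parse 'Name: value' lines up to the first blank line.

    Returns (headers, remainder), where remainder is whatever follows
    the blank line (the still-encoded resource, in the simplest case).
    """
    match = BLANK_LINE.search(body)
    head = body if match is None else body[:match.start()]
    remainder = "" if match is None else body[match.end():]
    headers = {}
    for line in head.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers, remainder
```

Calling `split_mhtml_url("mhtml:http://example.com/api?cb=x!foo")` under these assumptions yields the resource URL and the name `foo`, which would then be compared against the `Content-Location` values found by `read_header_block`.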

A server response with a single MHTML document, containing a javascript to alert "hi", could look like:
HTTP/1.0 200 OK
Date: Fri, 11 Feb 2011 08:07:05 GMT
Status: 200 OK
Content-Length: [...]
Pragma: no-cache
Expires: Tue, 31 Mar 1981 05:00:00 GMT
Connection: close

Content-Type: Multipart/related; boundary=_bounds

Content-Location: foo
Content-Transfer-Encoding: base64

What makes the MHTML protocol handler dangerous, is a combination of several factors.
  1. Internet Explorer is prone to disregard content type in the first place, but the MHTML protocol handler will even disregard nosniff and MIME type.
  2. Seeing as the script part of the document can be base64 encoded, the browser's XSS prevention mechanism won't kick in.
  3. No quotes or other commonly escaped / encoded characters are present in the MHTML payload. I have seen some examples where semicolons are replaced as well, but they aren't required in the MHTML header's short form in either case.
This effectively means that web pages which otherwise follow best practices, but don't strip carriage return and newline from the input they pass on, could be vulnerable. That is, if they don't output any blank lines anywhere prior to the MHTML header's output point. This happens to be why I've previously said that ASP.NET is accidentally (nearly) immune to the attack: the commonly used markup, with page directives and such on top, will cause an extra blank line to be written between the HTTP header and the HTTP body -- abruptly disabling the MHTML protocol handler.
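The blank-line condition can be expressed as a small check. The sketch below is an illustration only: `mhtml_header_reachable` and its arguments are hypothetical names, and it encodes nothing more than the rule stated above, namely that an injected MHTML header only matters if no completely blank line precedes the point where the input is reflected.

```python
def mhtml_header_reachable(body, reflect_offset):
    """Return True if no completely blank line occurs in the HTTP body
    before reflect_offset, the position where user input is echoed.

    Simplification: str.splitlines() also splits on a few uncommon
    line-break characters besides "\n" and "\r\n".
    """
    preceding_lines = body[:reflect_offset].splitlines()
    return all(line != "" for line in preceding_lines)


# A response that echoes input at the very start of the body.
echo_body = 'cb({"promo": 1})'

# A page whose body starts with an extra blank line, as described for ASP.NET.
padded_body = "\r\n<html><body>reflected input here</body></html>"
```

With these two bodies, `mhtml_header_reachable(echo_body, 0)` is true, while the same check at the reflection point of `padded_body` is false because of the leading empty line.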

While many pages are affected by this, one specific kind of web page has been more prone to vulnerability than others: JSON web services with JSONP callback parameters. Few such services bother washing their input, and they are relatively safe in not doing so, provided that they pass the right content type, MIME type and preferably a nosniff header back to the browser. No sane browser would run the output from the page, should it be requested directly, in either case. Such is not the case with the MHTML handler, however.
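To see why JSONP endpoints are such a natural fit, consider a deliberately naive service that reflects the callback name verbatim. This is a hypothetical Python sketch (`naive_jsonp` and the sample data are invented for illustration, and no encoded resource is included): the callback is the very first thing in the body, so any CR/LF it carries turns into header-like lines before the first blank line.

```python
import json


def naive_jsonp(callback, data):
    """Deliberately unsafe: the callback name is echoed without any washing."""
    return f"{callback}({json.dumps(data)})"


# A callback carrying CR/LF and the two MHTML header names mentioned earlier.
callback = (
    "cb\r\n"
    "Content-Location: foo\r\n"
    "Content-Transfer-Encoding: base64\r\n"
    "\r\n"
)

body = naive_jsonp(callback, {"promo": 1})
lines = body.split("\r\n")
# lines[1] and lines[2] now look like MHTML headers, lines[3] is the blank
# line that ends them, and the JSON data trails behind.
```

A regular script include of this response would just be a syntax error, which is why the service looks safe until the mhtml: prefix enters the picture.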

Let me just say: MHTML is an epically bad idea

It really is essential that organizations of IE users understand just how ludicrous the MHTML protocol handler is. Let me add to the examples above by mentioning that IE doesn't care at all whether the MIME type you're loading implies a binary file or not. It could just as well be a jpg, png, gif or whatever else, so long as there are (as mentioned above) no two successive newlines prior to the MHTML header. Open a URL such as, and you'd get a picture. Open mhtml:!foo, and your browser would happily execute scripts and whatnot.

Then there are the content hosting sites, such as Github, who really have no option but to host raw, unmodified views of files. Or even Gists. Since MHTML's reach stretches beyond MIME type, Content-Type and anything else you can pass in the HTTP header, it goes without saying that *any* file with a valid MHTML header hosted on such a site could be executed by the browser. And when it is, any private listings you have there are at risk of being stolen.

There really is no sense to the MHTML handler. At all.

If you're still using Internet Explorer (and yes, that includes the newly released IE 9 Release Candidate), go patch it:

The case of Twitter's promo service

Twitter had one specific web service, which returned json-formatted information about promotions. You could freely pass a callback to it, and it would return a javascript which called said function, with the json data as an argument (jsonp in a nutshell, yeah). This service happened to not wash carriage return and line-feed from the callback name, and consequently was XSS vulnerable to injected MHTML.

One thing that complicated a proof of concept against the service -- but ultimately pointed out something important -- was a specific piece of javascript on Twitter's main page.

Most MHTML XSS attacks I've seen deliver an iframe, which in turn loads up a page which holds CSRF tokens. The MHTML's script would extract the token from the iframe's document, and the browser would happily allow it, seeing as the two were within the same domain. Finally the script would post to whatever service it wanted to be naughty towards.

Because of Twitter's script, however, this proved difficult. First, upon opening the main page, the script would check whether the page was currently the topmost frame and, if not, replace the URL of the top frame, effectively breaking out of the iframe. Second, the script would immediately set document.domain to "", which isn't possible from MHTML, and that would also break the iframe approach.

What I ended up doing was delivering a script which would use Microsoft's XMLHttp object to get the token, and post that to the tweet page. From within MHTML, this also proved slightly difficult, as the XHR would regularly throw "access denied" errors. The curious part here, though: it wouldn't *always* throw them. There appeared to be some timing to it. The access error couldn't be because of the document.domain restriction either, as no page had yet been requested with that on it. It seemed to be entirely an effect of where in the MHTML lifecycle the request was fired. That digression about a possible bug aside, bunny hopping between different scripts upon error eventually made both token retrieval and cross-tweeting go through:

XSS proof of concept against the promo service.

Internet Explorer's XSS filter gone autoimmune

So, while that approach did work, I mentioned that the whole iframe redirect / document.domain spectacle did point out something important. While I was discussing the anomaly in MHTML's XMLHttp request with a friend of mine, Erling Ellingsen, he pointed out that Internet Explorer's own XSS filter often can be used against itself, in order to disable pieces of scripts on a page.

Imagine that you have a page with a CSRF token, let's call it secret.html, at
<script>document.domain = ""</script>
On this server, there would be a page somehow MHTML XSS exploitable. The following is then injected into it (quoted-printable, rather than base64 encoded, for readability):
Content-Type: Multipart/related; boundary=_bounds

Content-Location: foo
Content-Transfer-Encoding: QUOTED-PRINTABLE

Now because of the document.domain="" of the secret.html page, the script wouldn't be able to retrieve the CSRF token. For the service provider, this is definitely a good thing, as it (usually) adds another layer of protection. What the service provider didn't think of, however, was Internet Explorer's hyperactive XSS filter.

If you changed the url requested in the iframe to:
src=3D"<script>document.domain =3D</script>"></iframe></body>
Then Internet Explorer would parse the page, looking for any signs of scripts similar to the block in the query string. Upon finding this in secret.html, it would believe that it got there because of the query string, and throw a bunch of pound signs into the tags and code -- rendering it useless. The iframe's document would consequently no longer be restricted to, and would be readable from the MHTML's script.

Final thoughts

There's a definite lesson to be learnt here. If you rely on scripts to restrict requests to specific domains, prevent iframe embedding, or similar: be certain to throw some random garbage into your script block while it is generated server side, so that an attacker can't predict its contents and echo them in a query string for the XSS filter to match. That said, in browsers other than Internet Explorer, the iframe could be declared sandboxed, to prevent script execution within it. If you still want to go down this obscure security trail, countering that would in some browsers be possible with
<noscript><meta http-equiv="X-Frame-Options" content="deny" /></noscript>
... but I won't follow you down there.
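As for the "random garbage" advice, a server-side sketch in Python could look like the following. The helper name `guarded_script_block` and the choice of a random comment are my own assumptions; I have not verified how much unpredictability Internet Explorer's filter heuristics actually require, so treat it as an illustration of the idea rather than a tested countermeasure.

```python
import secrets


def guarded_script_block(script_body):
    """Wrap script_body in a <script> block with per-response random comments.

    Assumption: since the exact text of the block changes on every response,
    it can't be reproduced verbatim in a query string ahead of time.
    """
    nonce = secrets.token_hex(8)  # 16 random hex characters
    return f"<script>/* {nonce} */ {script_body} /* {nonce} */</script>"


# Example with a made-up frame-busting one-liner.
block = guarded_script_block("if (top !== self) { top.location = self.location; }")
```

The script's behaviour is unchanged, since the random part lives in comments only.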

When it comes to MHTML, there's not much to do, short of either disabling \r\n in query input, specifically removing the MHTML headers or simply adding an extra \r\n right after the HTTP header. Generally speaking, it's a very good idea to be somewhat restrictive even with callbacks given to jsonp services, and especially so if you don't pass the X-Content-Type-Options: nosniff HTTP header.
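A few of these measures combine naturally in a JSONP service. The sketch below is a minimal Python illustration under my own assumptions: the function names, the callback pattern and the returned (headers, body) shape are all hypothetical, not taken from Twitter's service or from any particular framework.

```python
import json
import re

# Restrictive callback pattern: dotted JavaScript identifiers only.
CALLBACK_RE = re.compile(
    r"[A-Za-z_$][A-Za-z0-9_$]*(?:\.[A-Za-z_$][A-Za-z0-9_$]*)*"
)


def strip_crlf(value):
    """Remove carriage returns and line feeds from query input."""
    return value.replace("\r", "").replace("\n", "")


def build_jsonp(callback, data):
    """Return (headers, body) for a JSONP response with the mitigations applied."""
    callback = strip_crlf(callback)
    if CALLBACK_RE.fullmatch(callback) is None:
        raise ValueError("callback name not allowed")
    headers = {
        "Content-Type": "application/javascript",
        "X-Content-Type-Options": "nosniff",
    }
    # The extra "\r\n" puts a blank line right after the HTTP header;
    # it is harmless whitespace to a JavaScript parser.
    body = "\r\n" + f"{callback}({json.dumps(data)});"
    return headers, body
```

Stripping CR/LF and validating the callback overlap on purpose: if either one is dropped later on, the other still stands, and the leading blank line covers input that is reflected elsewhere in the body.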
