Part III: A Glimpse of Things to Come
Of course, objectively measuring the robustness of any sufficiently complex piece of software is an unsolved problem in computing, doubly so if your codebase happens to carry almost two decades' worth of bloat. Therefore, much of the competitive effort goes into inventing and then rapidly deploying new security-themed additions, often with little consideration for how well they actually solve the problem they were supposed to address.
In the meantime, standards bodies, mindful of their earlier misadventures, have ditched much of their academic rigor in favor of just letting a dedicated group of contributors tweak the specifications as they see fit. There is talk of making HTML5 the last numbered version of the standard and transitioning to a living document that changes every day—often radically. This relaxed process has helped keep much of the work around W3C and WHATWG moving, but it has also undermined some of the benefits of having a central standards organization to begin with. Many recent proposals gravitate toward quick, narrowly scoped hacks that do not even try to form a consistent and well-integrated framework. When this happens, no robust feedback mechanism is in place to allow external experts to review reasonably stable specifications and voice concerns before any implementation work takes place. The only way to stay on top of the changes is to immerse oneself in the day-to-day dynamics of the working group.
It is difficult to say if this new approach to standardization is a bad thing. In fact, its benefits may easily outweigh any of the speculative risks; for one, we now have a chance at a standard that is reasonably close to what browsers actually do. Nevertheless, the results of this frantic and largely unsupervised process can be unpredictable, and they require the security community to be very alert.
In this spirit, the last part of the book will explore some of the more plausible and advanced proposals that may shape the future of the Web . . . or that may just as likely end up in the dustbin of history a few years from now.
New and Upcoming Security Features
The dream of inventing a brand-new browser security model is strong within the community, but it is always followed by the realization that it would require rebuilding the entire Web. Therefore, much of the practical work focuses on more humble extensions to the existing approach, necessarily increasing the complexity of the security-critical sections of the browser codebase. This complexity is unwelcome, but its proponents invariably see it as justified, whether because they aim to mitigate a class of vulnerabilities, build a stopgap for some other hard problem that nobody wants to tackle right now, or simply enable new types of applications to be built in the future.
In practice, these tangible benefits usually trump the vague risk of added complexity.
Security Model Extension Frameworks
Some of the most successful security enhancements proposed in the past few years boil down to adding flexibility to the original constraints imposed by the same-origin policy and its friends. For example, one formerly experimental proposal that has now crossed into the mainstream is the postMessage(…) API for communicating across origins, discussed in Chapter 9. Surprisingly, the act of relaxing SOP checks in certain carefully chosen scenarios is more intuitive and less likely to cause problems than locking the policy down. So, to begin on a lighter note, we’ll focus on this class of frameworks first.
Under the original constraints of the same-origin policy, scripts associated with one origin have no clean and secure way to communicate with client-side scripts executing in any other origin and no safe way to retrieve potentially useful data from a willing third-party server.
Web developers have long complained about these constraints, and in recent years, browser vendors have begun to listen to their demands. As you recall, the more pressing task of arranging client-side communications between scripts was solved with postMessage(…). The client-to-server scenario was found to be less urgent and still awaits a canonical solution, but there has been some progress to report.
The most successful attempt to create a method for retrieving documents from non-same-origin servers began in 2005. Under the auspices of W3C, several developers working on VoiceXML, an obscure document format for building Interactive Voice Response (IVR) systems, drafted a proposal for Cross-Origin Resource Sharing (CORS). Between 2007 and 2009, their awkward, XML-based design gradually morphed into a much simpler and more widely useful scheme, which relied on HTTP header–level signaling to communicate consent to cross-origin content retrieval using a natural extension of the XMLHttpRequest API.
CORS Request Types
As specified today, CORS relies on differentiating between two types of calls to the XMLHttpRequest API. When a site attempts to load a cross-origin document through the API, the browser first needs to distinguish between simple requests, where the resulting HTTP traffic is deemed close enough to what can be generated through other, existing methods of navigation, and non-simple requests, which encompass everything else. The operation of these two classes of requests varies significantly, as we'll see.
The current specification says that simple requests must use a method of GET, POST, or HEAD. Additionally, if any custom headers are specified by the caller, they must belong to the following whitelist:

Accept
Accept-Language
Content-Language
Content-Type (limited to the values application/x-www-form-urlencoded, multipart/form-data, and text/plain)
Today, browsers that support CORS simply do not allow methods other than GET, POST, and HEAD. At the same time, they ignore the recommended whitelist of headers, unconditionally demoting any requests with custom header values to non-simple status. The implementation in WebKit also considers any payload-bearing requests to be non-simple. (It is not clear whether this is an intentional design decision or a bug.)
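The classification logic just described can be sketched as follows. The helper function is hypothetical—browsers implement this internally—but it mirrors the actual behavior: unsupported methods are refused outright, and any custom header demotes the request to non-simple status.

```javascript
// Hypothetical sketch of how CORS-capable browsers of this era classify
// an outgoing XMLHttpRequest call; not part of any browser-exposed API.
function classifyCorsRequest(method, customHeaders) {
  var allowedMethods = ['GET', 'POST', 'HEAD'];
  if (allowedMethods.indexOf(method.toUpperCase()) === -1) {
    return 'forbidden';   // browsers simply refuse other methods
  }
  if (customHeaders && customHeaders.length > 0) {
    return 'non-simple';  // any custom header forces a preflight handshake
  }
  return 'simple';        // sent immediately, no preflight
}
```

Note that, per the behavior described above, the recommended header whitelist plays no role here: the mere presence of a custom header is enough to trigger the non-simple path.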
Security Checks for Simple Requests
The CORS specification allows simple requests to be submitted to the destination server immediately, without attempting to confirm whether the destination is willing to engage in cross-domain communications to begin with. This decision is based on the fact that the attacker may initiate fairly similar cookie-authenticated traffic by other means (for example, by automatically submitting a form) and, therefore, that there is no point in introducing an additional handshake specifically for CORS.
The crucial security check is carried out only after the response is retrieved from the server: The data is revealed to the caller through the XMLHttpRequest API only if the response includes a suitable, well-formed Access-Control-Allow-Origin header. To assist the server, the original request will include a mandatory Origin header, specifying the origin associated with the calling script.
To illustrate this behavior, consider the following cross-domain XMLHttpRequest call performed from http://www.bunnyoutlet.com/:
var x = new XMLHttpRequest();
x.open('GET', 'http://fuzzybunnies.com/get_message.php?id=42', false);
x.send(null);
The result will be an HTTP request that looks roughly like this:
GET /get_message.php?id=42 HTTP/1.0
Host: fuzzybunnies.com
Origin: http://www.bunnyoutlet.com
...
To indicate that the response should be readable across domains, the server needs to respond with
HTTP/1.0 200 OK
Access-Control-Allow-Origin: http://www.bunnyoutlet.com
...

The secret message is: "It's a cold day for pontooning."
NOTE: It is possible to use a wildcard (“*”) in Access-Control-Allow-Origin, but do so with care. It is certainly unwise to indiscriminately set Access-Control-Allow-Origin: * on all HTTP responses, because this step largely eliminates any assurances of the same-origin policy in CORS-compliant browsers.
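On the server side, the safe alternative to a blanket wildcard is to echo the caller's Origin value back only when it appears on an explicit whitelist. A minimal sketch, using a hypothetical helper name not tied to any particular framework:

```javascript
// Hypothetical server-side helper: decide which (if any)
// Access-Control-Allow-Origin header to attach to a response.
function corsAllowOriginHeader(requestOrigin, allowedOrigins) {
  // Reveal the response cross-origin only to explicitly trusted callers,
  // rather than reflecting the Origin header blindly or using "*".
  if (allowedOrigins.indexOf(requestOrigin) !== -1) {
    return { 'Access-Control-Allow-Origin': requestOrigin };
  }
  return {};  // no header: the browser withholds the response body
}
```

Returning no header at all for unrecognized origins is the conservative choice: the request still reaches the server, but the calling script never sees the response.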
Non-simple Requests and Preflight
In the early drafts of the CORS protocol, almost all requests were meant to be submitted without first checking to see if the server was actually willing to accept them. Unfortunately, this design undermined an interesting property leveraged by some web applications to prevent cross-site request forgery: Prior to CORS, attackers could not inject arbitrary HTTP headers into cross-domain requests, so the presence of a custom header often served as a proof that the request came from the same origin as the destination and was issued through XMLHttpRequest.
Later CORS revisions corrected this problem by requiring a more complicated two-step handshake for requests that did not meet the strict “simple request” criteria outlined in “CORS Request Types” on page 236. The handshake for non-simple requests aims to confirm that the destination server is CORS compliant and that it wants to receive nonstandard traffic from that particular caller. The handshake is implemented by sending a vanilla OPTIONS request (“preflight”) to the target URL containing an outline of the parameters of the underlying XMLHttpRequest call. The most important information is conveyed to the server in three self-explanatory headers: Origin, Access-Control-Request-Method, and Access-Control-Request-Headers.
This handshake is considered successful only if these parameters are properly acknowledged in the response through the use of Access-Control-Allow-Origin, Access-Control-Allow-Methods, and Access-Control-Allow-Headers. Following a correct handshake, the actual request is made. For performance reasons, the result of the preflight check for a particular URL may be cached by the client for a set period of time.
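The server's side of this acknowledgment might be sketched as follows. The function name and the shape of the policy object are assumptions made for illustration, not part of any framework or specification:

```javascript
// Hypothetical sketch of server-side preflight handling: given the three
// request headers described above and a local policy, either acknowledge
// the handshake or reject it by returning null (no CORS headers at all).
function handlePreflight(req, policy) {
  var origin  = req['Origin'];
  var method  = req['Access-Control-Request-Method'];
  var headers = (req['Access-Control-Request-Headers'] || '')
                  .split(/,\s*/).filter(Boolean);

  var originOk  = policy.origins.indexOf(origin) !== -1;
  var methodOk  = policy.methods.indexOf(method) !== -1;
  var headersOk = headers.every(function (h) {
    return policy.headers.indexOf(h.toLowerCase()) !== -1;
  });
  if (!originOk || !methodOk || !headersOk) return null;  // handshake fails

  return {
    'Access-Control-Allow-Origin':  origin,
    'Access-Control-Allow-Methods': policy.methods.join(', '),
    'Access-Control-Allow-Headers': policy.headers.join(', '),
    'Access-Control-Max-Age':       policy.maxAge  // preflight cache lifetime
  };
}
```

The Access-Control-Max-Age value is what enables the client-side caching mentioned above; a server that sets it generously should keep in mind that policy changes will not reach clients until the cached result expires.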
Current Status of CORS
As of this writing, CORS is available only in Firefox and WebKit-based browsers and is notably absent in Opera and Internet Explorer. The most important factor hindering its adoption may be simply that the API is not as critical as postMessage(…), its client-side counterpart, because it can often be replaced by a content-fetching proxy on the server side. But the scheme also faces three principal, if weak, criticisms, some of which come directly from one of the vendors. Obviously, these criticisms don't help matters.
The first complaint, voiced chiefly by Microsoft developers and echoed by some academics, is that the scheme needlessly abuses ambient authority. They argue that there are very few cases where data shared across domains would need to be tailored based on the credentials available for the destination site. The critics believe that the risks of accidentally leaking sensitive information far outweigh any benefits and that a scheme permitting only nonauthenticated requests to be made would be preferable. In their view, any sites that need a form of authentication should instead rely on explicitly exchanged authentication tokens.
The other, more pragmatic criticism of CORS is that the scheme is needlessly complicated: It extends an already problematic and error-prone API without clearly explaining the benefits of some of the tweaks. In particular, it is not clear if the added complexity of preflight requests is worth the peripheral benefit of being able to issue cross-domain requests with unorthodox methods or random headers.
The last of the weak complaints hinges on the fact that CORS is susceptible to header injection. Unlike some other recently proposed browser features, such as WebSockets (Chapter 17), CORS does not require the server to echo back an unpredictable challenge string to complete the handshake. Particularly in conjunction with preflight caching, this may worsen the impact of certain header-splitting vulnerabilities in the server-side code.
Microsoft’s objection to CORS appears to stem from the aforementioned concerns over the use of ambient authority, but it also bears subtle overtones of their dissatisfaction with interactions with W3C. In 2008, Sunava Dutta, a program manager at Microsoft, offered this somewhat cryptic insight:
During the [Internet Explorer 8] Beta 1 timeframe there were many security based concerns raised for cross domain access of third party data using cross site XMLHttpRequest and the Access Control framework. Since Beta 1, we had the chance to work with other browsers and attendees at a W3C face-to-face meeting to improve the server-side experience and security of the W3C’s Access Control framework.
Instead of embracing the CORS extensions to XMLHttpRequest, Microsoft decided to implement a counterproposal, dubbed XDomainRequest. This remarkably simple, new API differs from the variant available in other browsers in that the resulting requests are always anonymous (that is, devoid of any browser-managed credentials) and that it does not allow for any custom HTTP headers or methods to be used.
The use of Microsoft’s API is otherwise very similar to XMLHttpRequest:
var x = new XDomainRequest();
Borrowing from W3C’s proposal, the resulting request will bear an Origin header, and the response data will be revealed to the caller only if a matching Access-Control-Allow-Origin header is present in the response. Preflight requests and permission caching are not a part of the design.
For all intents and purposes, Microsoft’s solution is far more reasonable than CORS: It is simpler, safer, and probably just as functional in all the plausible uses. That said, it is also unpopular. It is supported only in Internet Explorer 8 and up, and owing to W3C backing CORS, others have no reason to embrace XDomainRequest anytime soon.
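A page that wants to work in both camps therefore has to feature-detect which API is available. In the sketch below, the global object is taken as an explicit parameter purely for testability; in a real page, this would simply be window:

```javascript
// Sketch of the feature detection a portable page might use to choose
// between Microsoft's XDomainRequest and the CORS-enabled XMLHttpRequest.
// The explicit `global` parameter is an illustrative assumption.
function createCrossDomainRequest(global) {
  if (typeof global.XDomainRequest !== 'undefined') {
    return new global.XDomainRequest();   // Internet Explorer 8 and up
  }
  return new global.XMLHttpRequest();     // Firefox, WebKit-based browsers
}
```

The caller still has to account for the semantic differences, of course: the XDomainRequest branch will always produce anonymous requests with no custom methods or headers.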
In the meantime, a separate group of researchers, again acting under the auspices of W3C, has proposed a third solution. Their design, known as Uniform Messaging Policy (complete with a corresponding UniformRequest API), embraces an approach nearly identical to Microsoft's. It is not supported in any existing browser, but there is some talk of unifying it with CORS.
Other Uses of the Origin Header
The Origin header is an essential part of CORS, XDomainRequest, and UMP, but it actually evolved somewhat independently with other uses in mind. In their 2008 paper, Adam Barth, Collin Jackson, and John C. Mitchell advocated the introduction of a new HTTP header that would offer a more reliable and privacy-conscious alternative to Referer. It would also serve as a way to prevent cross-site request vulnerabilities by providing the server with the information needed to identify the SOP-level origin of a request, without disclosing the potentially more sensitive path or query data.
Of course, it was unclear whether the subtle improvement of the proposed header over Referer would actually make a difference for the small but nonnegligible population of users who block that first header on privacy grounds. The proposal consequently ended up in a virtual limbo, not deployed in any existing browsers but also discouraging vendors from pursuing other in-browser defenses against problems such as XSRF or XSSI. (To be fair, the concept was very recently revived under the new name of From-Origin and may not be completely dead yet.)
The fate of the original idea aside, the utility of the Origin header in specialized cases such as CORS was pretty clear. Around 2009, this led to Barth submitting an IETF draft specifying the syntax of the header, while shying away from making any statements about when the header should be sent, or what specific security problems it might solve:
The user agent MAY include an Origin header in any HTTP request.
Whenever a user agent issues an HTTP request from a “privacy-sensitive” context, the user agent MUST send the value “null” in the Origin header.
NOTE: This document does not define the notion of a privacy-sensitive context. Applications that generate HTTP requests can designate contexts as privacy-sensitive to impose restrictions on how user agents generate Origin headers.
The bottom line of this specification is that whatever the decision process may be, once the client chooses to provide the header, the value is required to accurately represent the SOP origin from which the request is being made. For example, when a particular operation takes place from http://www.bunnyoutlet.com:1234/bunny_reports.php, the transmitted value should be Origin: http://www.bunnyoutlet.com:1234.
For origins that do not meaningfully map to a protocol-host-port tuple, the browser must send the value of null instead.
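These serialization rules can be sketched as follows. The helper is hypothetical, and its simplified regular expression handles only http and https URLs; everything else falls through to the null origin:

```javascript
// Sketch of SOP origin serialization: scheme://host, with the port
// appended only when it differs from the scheme's default, and the
// literal string "null" for contexts that do not map to a
// protocol-host-port tuple (hypothetical helper; browsers do this
// internally).
function serializeOrigin(url) {
  var m = /^(https?):\/\/([^\/:]+)(?::(\d+))?/.exec(url);
  if (!m) return 'null';  // e.g., data: or file: URLs
  var scheme = m[1], host = m[2], port = m[3];
  var defaultPort = (scheme === 'https') ? '443' : '80';
  if (!port || port === defaultPort) return scheme + '://' + host;
  return scheme + '://' + host + ':' + port;
}
```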
Despite all of these plans, as of this writing, only WebKit-based browsers include the Origin header on non-CORS navigation, sending it when submitting HTML forms. Firefox seems to be considering a different approach, but nothing specific appears to have been implemented yet.