Fun hacks for faster content

Posted 06 December 2016 - Seems I'm finding everything "fun" right now

A few weeks ago I was at Heathrow airport getting a bit of work done before a flight, and I noticed something odd about the performance of GitHub: It was quicker to open links in a new window than simply click them. Here's a video I took at the time:

GitHub link click vs new tab

Here I click a link, then paste the same link into a fresh tab. The page in the fresh tab renders way sooner, even though it's started later.

Show them what you got

When you load a page, the browser takes a network stream and pipes it to the HTML parser, and the HTML parser is piped to the document. This means the page can render progressively as it's downloading. The page may be 100k, but it can render useful content after only 20k is received.

This is a great, ancient browser feature, but as developers we often engineer it away. Most load-time performance advice boils down to "show them what you got" - don't hold back, don't wait until you have everything before showing the user anything.

GitHub cares about performance so they server-render their pages. However, when navigating within the same tab, navigation is entirely reimplemented using JavaScript. Something like…

// …lots of code to reimplement browser navigation…
const response = await fetch('page-data.inc');
const html = await response.text();
document.querySelector('.content').innerHTML = html;
// …loads more code to reimplement browser navigation…

This breaks the rule, as all of page-data.inc is downloaded before anything is done with it. The server-rendered version doesn't hoard content this way; it streams, which makes it faster. For GitHub's client-side render, a lot of JavaScript was written to make this slow.
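For contrast, here's a minimal sketch of reading the same response as it arrives, rather than waiting for response.text(). The names readTextChunks and onText are hypothetical, not anything GitHub uses, and it assumes a browser or runtime where response.body is a ReadableStream.

```javascript
// Hypothetical helper: hands each decoded chunk to onText as soon as it arrives.
async function readTextChunks(response, onText) {
  const reader = response.body.getReader();
  // One decoder for the whole response. With stream: true it holds back a
  // partial multi-byte character until the bytes that complete it arrive.
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const text = decoder.decode(value, { stream: true });
    if (text) onText(text);
  }

  // Flush anything the decoder was still holding:
  const rest = decoder.decode();
  if (rest) onText(rest);
}
```

Called as readTextChunks(await fetch('page-data.inc'), chunk => …), the callback fires repeatedly while the download is still in progress. What to do with each chunk is the hard part: innerHTML parses whatever it's given as a complete fragment, so feeding it half an element at a time won't work.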

I'm just using GitHub as an example here - this anti-pattern is used by almost every single-page-app.

Switching content in the page can have some benefits, especially if you have some heavy scripts, as you can update content without re-evaluating all that JS. But can we do that without losing streaming? I've often said that JavaScript has no access to the streaming parser, but it kinda does…

Using iframes and document.write to improve performance

The worst hacks involve <iframe>s, and this one uses <iframe>s and document.write(), but it does allow you to stream content to the page. It goes like this:

// Create an iframe:
const iframe = document.createElement('iframe');

// Put it in the document (but hidden):
iframe.style.display = 'none';
document.body.appendChild(iframe);

// Wait for the iframe to be ready:
iframe.onload = () => {
  // Ignore further load events:
  iframe.onload = null;

  // Write a dummy tag:
  iframe.contentDocument.write('<streaming-element>');

  // Get a reference to that element:
  const streamingElement =
    iframe.contentDocument.querySelector('streaming-element');

  // Pull it out of the iframe & into the parent document:
  document.body.appendChild(streamingElement);

  // Write some more content - this should be done async:
  iframe.contentDocument.write('<p>Hello!</p>');

  // Keep writing content like above, and then when we're done:
  iframe.contentDocument.write('</streaming-element>');
  iframe.contentDocument.close();
};

// Initialise the iframe
iframe.src = '';

Although <p>Hello!</p> is written to the iframe, it appears in the parent document! This is because the parser maintains a stack of open elements, which newly created elements are inserted into. It doesn't matter that we moved <streaming-element>, it just works.
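To see why moving the element doesn't confuse the parser, here's a toy model of the stack of open elements. It isn't the real HTML parser, and every name in it (ToyNode, ToyParser) is made up for illustration. The only behaviour it borrows is that new nodes are appended to whichever element is on top of the stack, wherever that element currently lives.

```javascript
class ToyNode {
  constructor(name) {
    this.name = name;
    this.parent = null;
    this.children = [];
  }

  // Like appendChild: a node has one parent, so appending moves it.
  append(child) {
    if (child.parent) {
      const siblings = child.parent.children;
      siblings.splice(siblings.indexOf(child), 1);
    }
    child.parent = this;
    this.children.push(child);
    return child;
  }
}

class ToyParser {
  constructor(root) {
    this.openElements = [root];
  }

  get current() {
    return this.openElements[this.openElements.length - 1];
  }

  // An opening tag: insert into the current element, then push onto the stack.
  open(name) {
    const element = this.current.append(new ToyNode(name));
    this.openElements.push(element);
    return element;
  }

  // A closing tag: pop the stack.
  close() {
    this.openElements.pop();
  }
}

const iframeBody = new ToyNode('iframe-body');
const parentBody = new ToyNode('parent-body');
const parser = new ToyParser(iframeBody);

// "Write" <streaming-element>, then move it into the parent document:
const streamingElement = parser.open('streaming-element');
parentBody.append(streamingElement);

// "Write" <p></p>: the parser still inserts into the top of its stack,
// which is the element we just moved.
parser.open('p');
parser.close();
```

After this runs, the toy `<p>` is a child of streamingElement inside parentBody, and iframeBody is empty. The parser never looked up where the element was, it only held a reference to it.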

Also, this technique processes HTML much closer to the standard page-loading parser than innerHTML. Notably, scripts will download and execute in the context of the parent document, except in Firefox where script doesn't execute at all, ~~but I think that's a bug~~ update: turns out scripts shouldn't be executed (thanks to Simon Pieters for pointing this out), but Edge, Safari & Chrome all do.

Now we just have to stream HTML content from the server and call iframe.contentDocument.write() as each part arrives. Streaming is really efficient with fetch(), but for the sake of Safari support we'll hack it with XHR.
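The XHR hack relies on responseText growing while the request is in flight, so each progress event can pass on whatever text is new since the last one. A rough sketch, where streamXhrText and its callbacks are hypothetical names and the demo's actual code may differ:

```javascript
// Calls onText with each new slice of the response, then onDone at the end.
function streamXhrText(xhr, onText, onDone) {
  // How much of responseText has already been handed to onText:
  let position = 0;

  const flush = () => {
    const text = xhr.responseText;
    if (text.length > position) {
      onText(text.slice(position));
      position = text.length;
    }
  };

  xhr.onprogress = flush;
  xhr.onload = () => {
    // Pick up anything that arrived after the last progress event:
    flush();
    onDone();
  };
}
```

In a page this would sit between xhr.open('GET', url) and xhr.send(), with onText calling iframe.contentDocument.write() and onDone writing the closing tag and calling close(). The cost is that the whole response stays in memory as one growing string, which is part of why fetch() streaming is the more efficient option.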

I've built a little demo where you can compare this to what GitHub does today, and here are the results based on a 3G connection:

Raw test data.

By streaming the content via the iframe, content appears 1.5 seconds sooner. The avatars also finish loading half a second sooner - streaming means the browser finds out about them earlier, so it can download them in parallel with the content.

The above would work for GitHub since the server delivers HTML, but if you're using a framework that wants to manage its own representation of the DOM you'll probably run into difficulties. For that case, here's a less-good alternative:

Newline-delimited JSON

A lot of sites deliver their dynamic updates as JSON. Unfortunately JSON isn't a streaming-friendly format. There are streaming JSON parsers out there, but they aren't easy to use.
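To make that concrete: JSON.parse needs the whole document, so the first part of a response is unusable on its own. A small illustration, using a made-up truncated payload and a hypothetical tryParse helper:

```javascript
// The first bytes of a response, cut off mid-download:
const partial = '{"Comments": [{"author": "Alex", "body": "Hi"},';

// Wraps JSON.parse so a failure is returned instead of thrown.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (error) {
    return { ok: false, error };
  }
}

// The first comment is complete, but JSON.parse rejects the text as a whole
// because the outer array and object haven't been closed yet.
const result = tryParse(partial);
```

Here result.ok is false, even though everything needed to render Alex's comment has already arrived.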

So instead of delivering a chunk of JSON:

{
  "Comments": [
    {"author": "Alex", "body": "…"},
    {"author": "Jake", "body": "…"}
  ]
}

…deliver each JSON object on a new line:

{"author": "Alex", "body": "…"}
{"author": "Jake", "body": "…"}

This is called "newline-delimited JSON" and there's a sort-of standard for it. Writing a parser for the above is much simpler. In 2017 we'll be able to express this as a series of composable transform streams:

Sometime in 2017:

const response = await fetch('comments.ndjson');
const comments = response.body
  // From bytes to text:
  .pipeThrough(new TextDecoderStream())
  // Buffer until newlines:
  .pipeThrough(splitStream('\n'))
  // Parse chunks as JSON:
  .pipeThrough(parseJSON());

for await (const comment of comments) {
  // Process each comment and add it to the page:
  // (via whatever template or VDOM you're using)
  addCommentToPage(comment);
}

…where splitStream and parseJSON are reusable transform streams. But in the meantime, for maximum browser compatibility we can hack it on top of XHR.
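Neither splitStream nor parseJSON is built in. Here's one way they might be written on top of TransformStream. Treat it as a sketch rather than the code behind the demo; it assumes the incoming chunks have already been decoded to strings.

```javascript
// Buffers text and emits it one delimiter-separated piece at a time.
function splitStream(splitOn) {
  let buffer = '';
  return new TransformStream({
    transform(chunk, controller) {
      buffer += chunk;
      const parts = buffer.split(splitOn);
      // The last part may be incomplete, so keep it for the next chunk:
      buffer = parts.pop();
      for (const part of parts) controller.enqueue(part);
    },
    flush(controller) {
      // Emit whatever is left if the stream didn't end with the delimiter:
      if (buffer) controller.enqueue(buffer);
    },
  });
}

// Parses each incoming string as JSON, skipping blank lines.
function parseJSON() {
  return new TransformStream({
    transform(chunk, controller) {
      if (chunk.trim()) controller.enqueue(JSON.parse(chunk));
    },
  });
}
```

Keeping the two steps separate means splitStream can be reused for any delimited text format, and a malformed line surfaces as an error on the stream rather than being silently dropped.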

Again, I've built a little demo where you can compare the two. Here are the 3G results:

Raw test data.

Versus normal JSON, ND-JSON gets content on screen 1.5 seconds sooner, although it isn't quite as fast as the iframe solution. It has to wait for a complete JSON object before it can create elements, so you may lose most of the streaming benefit if your JSON objects are huge.

Don't go single-page-app too soon

As I mentioned above, GitHub wrote a lot of code to create this performance problem. Reimplementing navigations on the client is hard, and if you're changing large parts of the page it might not be worth it.

If we compare our best efforts to a simple browser navigation:

Raw test data.

…a simple no-JavaScript browser navigation to a server-rendered page is roughly as fast. The test page is really simple aside from the comments list, so your mileage may vary if you have a lot of complex content repeated between pages (basically, I mean horrible ad scripts), but always test! You might be writing a lot of code for very little benefit, or even making things slower.

Thanks to Elliott Sprehn for telling me the HTML parser worked this way!
