Generate and view HAR files
A HAR (HTTP Archive) file contains data related to the HTTP transactions that occur between a web browser and a website or web app. Generating and analyzing a HAR file is important when troubleshooting performance issues.
What’s inside a HAR file?
The HAR file format is based on JSON, and the current version of the specification (1.2) can be found here.
Inside a HAR file you can find both coarse data regarding every single page visited—like timings for the DOMContentLoaded and load events—and granular data describing every single HTTP request made by the browser.
Every entry in a HAR file is tied to a page via a pageref field. Thanks to this relationship the browser (or any tool that can process and show HAR files) can reconstruct a waterfall of HTTP requests.
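To make this relationship concrete, here is a minimal Python sketch that groups entries by their pageref. The log, pages, entries, pageref and pageTimings names come from the HAR 1.2 format; the sample data and the group_entries_by_page helper are made up for illustration and are not taken from a real capture.

```python
import json
from collections import defaultdict

# A tiny, hand-written HAR document (illustrative data, not a real capture).
SAMPLE_HAR = json.loads("""
{
  "log": {
    "version": "1.2",
    "pages": [
      {"id": "page_1", "title": "https://example.com/",
       "pageTimings": {"onContentLoad": 420, "onLoad": 910}}
    ],
    "entries": [
      {"pageref": "page_1",
       "startedDateTime": "2024-01-01T10:00:00.000+00:00",
       "request": {"method": "GET", "url": "https://example.com/"}},
      {"pageref": "page_1",
       "startedDateTime": "2024-01-01T10:00:00.150+00:00",
       "request": {"method": "GET", "url": "https://example.com/style.css"}}
    ]
  }
}
""")


def group_entries_by_page(har):
    """Return a dict mapping each page id to the URLs requested for that page."""
    grouped = defaultdict(list)
    for entry in har["log"]["entries"]:
        # pageref is optional in the format, so fall back to None.
        grouped[entry.get("pageref")].append(entry["request"]["url"])
    return dict(grouped)


grouped = group_entries_by_page(SAMPLE_HAR)
for page in SAMPLE_HAR["log"]["pages"]:
    # Coarse, per-page data next to the granular, per-request data.
    print(page["id"], page["pageTimings"], grouped.get(page["id"], []))
```

With a real file you would replace SAMPLE_HAR with the result of json.load on the file you exported.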
Below you can see how Chrome DevTools displays the waterfall chart of the Google homepage. The blue vertical line represents the DOMContentLoaded event, while the red one represents the load event.
Note how Chrome DevTools and Compare share the same color scheme for the horizontal bars. This is usually the case with HAR file viewers, even if there are some tools that use a different color scheme.
The colors of the horizontal bars represent:
- Blocked: the request is blocked for some reason: the browser may be fetching higher-priority requests, there may be no TCP connections available, or there may already be too many TCP connections open (this applies only to HTTP/1.0 and HTTP/1.1). Chrome DevTools refers to this case with the terms Queueing and Stalled.
- DNS: the browser is performing a DNS lookup, i.e. translating your-website.com into your-host-ip-address. DNS responses are cached, so DNS lookup times may differ in subsequent tests. That’s why most tools perform at least three runs before generating a waterfall.
- Connect: the browser is establishing a TCP connection, including TCP handshakes/retries.
- SSL (TLS): browser and server are performing the SSL/TLS handshake, during which they agree on the encryption parameters and the server presents its certificate.
- Send: the browser is sending the request to the server. If it’s a PUT or POST request, then this will also include the time spent uploading any data with that request.
- Wait: the server received the request and it is generating a response, while the browser is waiting for the first byte of a response. Chrome DevTools calls this time TTFB (Time To First Byte). Note that other definitions of TTFB include the DNS lookup time.
- Receive: the browser is receiving the response. Chrome DevTools uses the term Content Download.
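These phases map onto the timings object of each entry. As a rough sketch, and as I read the HAR 1.2 specification, a value of -1 means the phase does not apply to that request, and the ssl time is already included in connect, so it should not be added again when summing. The helper names and the numbers below are invented for illustration.

```python
# Phase names as they appear in the "timings" object of a HAR 1.2 entry.
PHASES = ("blocked", "dns", "connect", "ssl", "send", "wait", "receive")


def phase_durations(timings):
    """Return only the phases that apply to a request, in milliseconds."""
    durations = {}
    for phase in PHASES:
        value = timings.get(phase, -1)
        # Missing, null and -1 values all mean "not applicable".
        if value is not None and value >= 0:
            durations[phase] = value
    return durations


def total_time(timings):
    """Sum the phases without counting the TLS handshake twice."""
    durations = phase_durations(timings)
    # Assumption (HAR 1.2): "ssl" is part of "connect", so leave it out.
    return sum(ms for phase, ms in durations.items() if phase != "ssl")


# Illustrative numbers for a first request to an HTTPS origin.
sample_timings = {
    "blocked": 12.5, "dns": 30, "connect": 85, "ssl": 50,
    "send": 0.5, "wait": 140, "receive": 22,
}
# A later request on the same connection: no DNS lookup, no handshake.
reused_connection = {
    "blocked": 3, "dns": -1, "connect": -1, "ssl": -1,
    "send": 0.5, "wait": 95, "receive": 10,
}

print(phase_durations(reused_connection))
print(total_time(sample_timings))
```

Tools do not always fill these fields the same way, so it is worth checking the sum against the entry's own time field before trusting it.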
By looking at these waterfalls we can understand a lot about the performance of a web page, but first we need to generate a HAR file. So let’s see which tools we can use for that.
Generate a HAR file
You can use several tools to create a HAR file:
- a web browser like Chrome or Firefox. This is convenient, but don’t forget to warm up the DNS cache by making multiple runs. Also, HAR files generated by browsers are not perfect;
- a browser automation tool like Puppeteer, Cypress or Browsertime;
- a web performance tool like WebPageTest or sitespeed.io;
- a library that extracts the HTTP transactions from the Chrome DevTools Protocol, like chrome-har-capturer or chrome-har.
I like to use WebPageTest to generate a HAR file because of its many test locations and the many options for device emulation. Replicating the same options in Chrome would mean having to implement custom profiles for CPU throttling, network throttling and location.
⚠️ — You should always create a HAR file in an incognito tab, to avoid having requests made by Chrome extensions show up in your waterfall. I find this quite annoying and easy to forget. That's another reason why I prefer a tool like WebPageTest for this task.
View and analyze a HAR file
You have several options to view a HAR file:
- a web browser;
- an online tool like WebPageTest, Compare, HAR Analyzer or HAR Viewer;
- a UI component like Network-Viewer for React.
I like Compare because it lets me load either one or two HAR files, and it lets me… well… compare them. But the waterfall chart generated by WebPageTest is by far the one with the most details.
The waterfall chart contains a lot of valuable information, and it’s important to have a look both at individual rows and at the overall picture.
When looking at a single row, you may want to focus on the response size (body and headers), Cache-Control directives, cookies, and the TTFB.
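If you prefer to pull these values out of the file yourself, the sketch below summarizes one entry. The field names (headersSize, bodySize, headers, cookies, timings.wait) follow the HAR 1.2 format as far as I know; the summarize_entry helper and the sample entry are hypothetical.

```python
def known_size(value):
    """HAR uses -1 for sizes the tool could not measure; map that to None."""
    return value if value is not None and value >= 0 else None


def summarize_entry(entry):
    """Collect the per-row values worth checking in a waterfall."""
    response = entry["response"]
    cache_control = [
        header["value"]
        for header in response.get("headers", [])
        if header["name"].lower() == "cache-control"  # header names are case-insensitive
    ]
    return {
        "url": entry["request"]["url"],
        "status": response["status"],
        "headers_bytes": known_size(response.get("headersSize")),
        "body_bytes": known_size(response.get("bodySize")),
        "cache_control": ", ".join(cache_control) or None,
        "cookies": [cookie["name"] for cookie in response.get("cookies", [])],
        # "wait" is the phase Chrome DevTools labels as TTFB.
        "ttfb_ms": entry["timings"]["wait"],
    }


# Illustrative entry, trimmed to the fields used above.
sample_entry = {
    "request": {"url": "https://example.com/app.js"},
    "response": {
        "status": 200,
        "headersSize": 320,
        "bodySize": 48213,
        "headers": [
            {"name": "Cache-Control", "value": "public, max-age=31536000"},
            {"name": "Content-Type", "value": "text/javascript"},
        ],
        "cookies": [{"name": "session", "value": "abc"}],
    },
    "timings": {"wait": 140},
}

print(summarize_entry(sample_entry))
```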
When looking at the overall picture you can take note of how many parallel requests there are, how many TCP connections, how many HTTP redirects, and whether there are wide horizontal gaps in the chart. The waterfall generated by WebPageTest is particularly useful here, because it shows the JavaScript execution in pink. So for example a wide horizontal gap in the network activity, with many bursts of pink, indicates a CPU bottleneck: low-end mobile devices might have a hard time processing this particular request.
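Some of these overall figures can also be estimated from the HAR file alone. The following is only a sketch: it assumes each entry carries startedDateTime, time, response.status and the optional connection field from HAR 1.2, and the 200 ms gap threshold is an arbitrary value I picked for the example. A HAR file holds no JavaScript execution data, so a gap found this way is a hint to look closer, not proof of a CPU bottleneck.

```python
from datetime import datetime, timedelta

REDIRECT_STATUSES = {301, 302, 303, 307, 308}
ONE_MS = timedelta(milliseconds=1)


def parse_started(entry):
    # HAR timestamps are ISO 8601; older Python versions reject a trailing "Z".
    return datetime.fromisoformat(entry["startedDateTime"].replace("Z", "+00:00"))


def overview(entries, min_gap_ms=200):
    """Count redirects and connections, and list idle gaps between requests."""
    redirects = sum(1 for e in entries if e["response"]["status"] in REDIRECT_STATUSES)
    connections = {e["connection"] for e in entries if e.get("connection")}

    # Walk the requests in start order, tracking the latest end time seen so far.
    spans = sorted((parse_started(e), e["time"]) for e in entries)
    gaps_ms = []
    busy_until = None
    for start, duration_ms in spans:
        end = start + timedelta(milliseconds=duration_ms)
        if busy_until is not None:
            idle_ms = (start - busy_until) // ONE_MS
            if idle_ms >= min_gap_ms:
                gaps_ms.append(idle_ms)
        busy_until = end if busy_until is None else max(busy_until, end)
    return {"redirects": redirects, "connections": len(connections), "gaps_ms": gaps_ms}


def make_entry(started, time_ms, status, connection):
    """Build a minimal, made-up entry for the example."""
    return {"startedDateTime": started, "time": time_ms,
            "response": {"status": status}, "connection": connection}


sample_entries = [
    make_entry("2024-01-01T10:00:00.000Z", 100, 301, "101"),
    make_entry("2024-01-01T10:00:00.100Z", 300, 200, "101"),
    make_entry("2024-01-01T10:00:00.150Z", 100, 200, "102"),
    make_entry("2024-01-01T10:00:01.000Z", 50, 200, "102"),
]

print(overview(sample_entries))
```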
For a thorough analysis of visual patterns that can be found in a waterfall chart, I recommend the talk How to read a WebPageTest waterfall chart by Matt Hobbs or the associated blog post.
Other tools
HAR files are mostly used to detect performance issues, but they can also be useful when developing and (stress) testing web apps and websites. Here are a couple of tools that make use of HAR files for these use-cases:
- server-replay: a proxy server that replays HTTP responses. I have never used it, but it seems to me a better solution than mocking HTTP responses with something like nock.
- harhar: an HTTP benchmark tool. It takes all requests recorded in a HAR file and replays them a thousandfold.