Speeding Up NerdWallet 🚀
Our recently spun-up Frontend Infrastructure team now has more than six months under its belt, and site performance has been one of our main focuses.
This post summarizes a handful of the macro-level things we've done to improve and maintain page load performance of nerdwallet.com.
Huge shout-out to the product teams that did the bulk of the work adopting these changes, as well as implementing their own optimizations on individual parts of the site.
NerdWallet was originally created as a PHP monolith. During my time here, one of the largest overarching initiatives has been moving from this monolithic stack to microservices (mostly built in Python) and micro-apps built with Node/React.
The large amount of CSS we used to serve globally was intertwined with these micro-apps and even with our base React components. This effort was tedious tech-debt cleanup, but here's the net impact of removing this unused code that was being served globally:
50kb (gzipped) of JS (190kb parsed) removed
30kb (gzipped) of CSS (200kb parsed) removed
I wrote a full blog post about this so you can read more about it there.
TL;DR: Font optimization is quite complicated nowadays, but a concerted effort can yield a meaningful reduction in time to first paint (in our case, as much as a 30% drop). We subset critical fonts and load them as early as possible via <link rel="preload">.
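As a rough sketch of the subsetting side (the font name, file path, and unicode range here are illustrative):

```css
/* Subset @font-face: unicode-range lets the browser skip the download
   entirely when a page uses none of these characters */
@font-face {
  font-family: 'BrandFont';
  src: url('/fonts/brandfont-subset.woff2') format('woff2');
  unicode-range: U+0020-007F; /* basic Latin: enough for above-the-fold copy */
}
```

The subset file is small enough to preload aggressively (see the <link rel="preload"> snippet later in this post) without crowding out other critical requests.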
When we first migrated to React, one of the must-haves was that we couldn't risk losing search ranking by client-side rendering our applications: we had to use server-side rendering to produce the HTML that GoogleBot consumes most easily.
However, React's render can take a non-trivial amount of time. If you have a really large tree, it's not uncommon for the render function to approach 100ms. And 100ms per request spent in a synchronous, blocking call means a maximum of 10 requests per second per instance. That's not good.
Since React's server render is a pure function, we can memoize it based on all of its inputs (e.g. the redux store state and the react-router location). On a cache hit this can mean an order-of-magnitude or more reduction in render time, improving performance for that page as well as our throughput.
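The shape of this looks something like the following (a minimal sketch, not our production code: the cache key construction and lru-cache usage are illustrative):

```js
const LRU = require('lru-cache');
const { renderToString } = require('react-dom/server');

// Bounded cache so memory use stays predictable across many unique pages
const cache = new LRU({ max: 500 });

function renderApp(element, storeState, location) {
  // The render is a pure function of these inputs, so identical inputs
  // always produce identical HTML. (A real key should hash the state
  // rather than stringify it wholesale.)
  const key = `${location}::${JSON.stringify(storeState)}`;

  let html = cache.get(key);
  if (!html) {
    html = renderToString(element);
    cache.set(key, html);
  }
  return html;
}
```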
Webpack configuration can be fairly complex. Since we have dozens of individual web apps that each run webpack themselves, we devised a solution to share webpack configs across all of these applications so they can be centrally managed. We landed on a light wrapper around Electrode's webpack-config-composer because the configuration can be expressed as a plain object.
With this tool in place, whenever we make a performance optimization via our webpack config, we update it in one place and the benefits propagate to all apps when they upgrade and redeploy.
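As a rough sketch (the package name and option shape here are illustrative, not our exact API), an individual app's webpack config ends up looking something like this:

```js
// webpack.config.js in an individual app
const composeConfig = require('@nerdwallet/webpack-config-composer');

module.exports = composeConfig({
  // Plain-object overrides are merged into the shared defaults; here we
  // customize how CommonsChunkPlugin decides what lands in the shared chunk
  commonsChunk: {
    minChunks: 2,
  },
});
```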
The above produces a webpack config with sane defaults for browser consumption while allowing complete customization, in this case specifying a custom value for the minChunks option passed into CommonsChunkPlugin.
Upgrading Webpack
As a part of creating this centralized webpack interface, we upgraded our version of webpack. One challenge we ran into when upgrading from webpack 1 to webpack 3 (this was prior to webpack 4 being released) is that the DedupePlugin was removed, so our bundles got bigger.
This was unexpected, since we had assumed the newer version would be better for performance across the board. We ended up rolling our own webpack dedupe plugin to restore the same functionality.
Babel
When transpiling with Babel, we use babel-preset-env, browserslist, and our site's Google Analytics data to compile our JavaScript for the browsers our traffic actually uses.
When we refresh the traffic data from Google Analytics, apps automatically pick up the new browser targets the next time they build and deploy.
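Concretely, the setup looks something like this .babelrc, assuming the analytics data has been exported with browserslist-ga into a browserslist-stats.json file (the 0.5% threshold is illustrative). The "in my stats" query tells browserslist to read that usage data, so babel-preset-env only transpiles features unsupported by browsers that actually show up in our traffic:

```json
{
  "presets": [
    ["env", {
      "targets": {
        "browsers": ["> 0.5% in my stats"]
      }
    }]
  ]
}
```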
We built a React Image component to codify best practices (e.g. <picture>, srcset, sizes) and support lazy loading to improve perceived performance.
By lazy loading we ensure images are only downloaded when the user will actually see them, and by using an aspect-ratio box we avoid image reflows.
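A simplified sketch of the idea (not our actual component; prop names are illustrative): the padding-top box reserves space at the image's aspect ratio so nothing reflows when it loads, and the real <img> is only rendered once the box nears the viewport.

```jsx
import React from 'react';

class LazyImage extends React.Component {
  constructor(props) {
    super(props);
    this.state = { visible: false };
  }

  componentDidMount() {
    if ('IntersectionObserver' in window) {
      this.observer = new IntersectionObserver(([entry]) => {
        if (entry.isIntersecting) {
          this.setState({ visible: true });
          this.observer.disconnect();
        }
      });
      this.observer.observe(this.node);
    } else {
      // No observer support: load eagerly rather than never
      this.setState({ visible: true });
    }
  }

  componentWillUnmount() {
    if (this.observer) this.observer.disconnect();
  }

  render() {
    const { src, srcSet, sizes, alt, width, height } = this.props;
    return (
      <div
        ref={(node) => { this.node = node; }}
        style={{
          position: 'relative',
          // padding-top percentages are relative to width, giving a fixed ratio
          paddingTop: `${(height / width) * 100}%`,
        }}
      >
        {this.state.visible && (
          <img
            src={src}
            srcSet={srcSet}
            sizes={sizes}
            alt={alt}
            style={{ position: 'absolute', top: 0, left: 0, width: '100%' }}
          />
        )}
      </div>
    );
  }
}

export default LazyImage;
```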
Godspeed those who attempt the server-rendered, code-split apps
This is a real quote from react-router that was live on their site until very recently. Code splitting a server-side-rendered app is not as battle-tested, and solid tooling to do it has only recently emerged.
We ended up leveraging the suite of react-universal-component packages to handle both CSS and JS code splitting in a way that works server side or client side.
We've had some challenges around setting this up, code splitting from within a nested package, and code splitting and CSS modules not playing nicely together. However, it has allowed us to do route-based or component-based code splitting, and to split things like large visualization libraries into their own lazy-loaded bundles.
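A minimal example of component-based splitting (the './HeavyChart' module is illustrative; babel-plugin-universal-import is required so the matching JS and CSS chunks can be flushed during the server render):

```jsx
import React from 'react';
import universal from 'react-universal-component';

// webpack emits HeavyChart (and its CSS) as a separate lazy-loaded bundle
const UniversalChart = universal(import('./HeavyChart'), {
  loading: () => <div>Loading chart…</div>,
});

// Used like any other component, on the server or in the browser
const RatesPage = () => (
  <div>
    <h1>Rates over time</h1>
    <UniversalChart />
  </div>
);

export default RatesPage;
```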
This was probably the single biggest optimization that improved our page load performance site-wide.
Thanks to a large effort from our DevOps team, we put Amazon's CloudFront in front of all traffic on our entire website. This was a huge win: CloudFront operates many data centers, so our customers now open connections to the CloudFront edge location closest to them rather than going all the way to nerdwallet.com's origin, which is hosted in far fewer locations. Additionally, even after the connection has been established, since nerdwallet.com is hosted on Amazon's S3/ECS, we can leverage Amazon's internal routing rather than the public internet to get content to users as fast as possible. Overall, this helped drop our site-wide time to first byte by ~20%.
The second large change associated with this is that we were able to consolidate all of our assets onto our top-level domain. For example, instead of referencing assets on the CDN at cdn.nerdwallet.com, we can now use www.nerdwallet.com/cdn. This results in faster downloads of assets in the critical render path over HTTP/2, since we don't pay for a new DNS lookup, TCP connection, and SSL handshake: the browser can reuse the connection it already opened to the top-level domain.
CSS / Fonts / Images
<link rel="preload" as="font" type="font/woff2" href="..."/>
For assets in the critical render path that affect SpeedIndex, we use rel="preload" hints where we can, and we have moved our JavaScript out of the <head> so the browser can paint as much of the page as possible without making many requests in serial.
JavaScript
<script src="..." defer></script>
We use defer for our core JS bundles over async because defer guarantees execution order and guarantees the main thread won't be blocked until just before the DOMContentLoaded event. In practice we saw async scripts sometimes being evaluated much earlier than we'd like: ahead of image paint or similar.
Challenges
We have seen some unexpected behavior in the HTTP/2 implementation of Chrome and/or CloudFront. In particular, Chrome makes lower-priority assets linearly dependent on higher-priority ones. What this means, as an example, is that tiny image files might sit waiting for much larger JavaScript files to completely finish downloading: these assets download in serial, not in parallel.
This forced us to stop preloading our JavaScript bundles: Chrome gives preloaded JS a higher priority than images, and since our bundles can be large, we didn't want them to block the loading of much smaller images.
In order to maintain and improve site performance, we needed to build a culture that values it and measures it on a regular basis.
This is probably the biggest change we've made to stay on top of our performance. We've adopted SpeedCurve and made SpeedIndex one of the key metrics we monitor across all of our frontend teams. If you don't use SpeedCurve, we highly recommend it.
Vlad Silin, an intern on the FEI team for the past quarter, built an awesome tool that provides performance-related feedback in the form of a GitHub comment when developers open PRs. Read more about this in Vlad's post.
Culture
We've held performance-related workshops and talks, captured performance data in our data warehouse to correlate page performance with business performance, and generally advocated wherever possible for making performance a core part of frontend engineering at NerdWallet.
We still have a long list of things we want to do. Some things we plan to work on in the near future:
Inline CSS under a minimum size threshold via Google's Nginx PageSpeed module
Improve our image processing pipeline: support requiring an image (require('../my-image.png')) from within JS and generate responsive image sizes either at build time or on the fly (see the sketch after this list)
Service Workers for better offline support
Upgrade versions of some of our key packages (webpack 4, React 16)
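For the image pipeline item, a rough sketch of what this could look like using something like responsive-loader (illustrative, not a committed design):

```js
// webpack.config.js (sketch): generate responsive variants at build time
module.exports = {
  module: {
    rules: [
      {
        test: /\.(png|jpe?g)$/,
        loader: 'responsive-loader',
        options: {
          sizes: [320, 640, 1280], // widths to generate for each image
        },
      },
    ],
  },
};
```

With a rule like this, require('../my-image.png') returns an object whose srcSet could be passed straight into the Image component described earlier.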
Enjoy this type of work? Check out our job openings at NerdWallet.