Measuring Web Page Performance at Zoopla
The performance of our website is vital to our success as a business. It is important to delight our consumers with a fast and reliable service so they continue to return and favour our site. More than this, as a marketplace, our property portal relies heavily upon organic search traffic, and performance plays a key role in maintaining our high ranking on search engines. So, how exactly do we measure and quantify performance? And why is it so important that we do so?
If it is not already apparent from the context of this article, "performance" here broadly means page load speed and, importantly, the reliability and consistency of that speed across a multitude of connection types (3G, 4G, 5G, cable) and devices (desktop, mobile, tablet).
Zoopla is growing, and our stack has been re-imagined to meet our future ambitions and goals. This new stack unlocks far greater flexibility, scalability, and developer experience, with the ultimate goal of allowing us to build a better product for our consumers. However, our move to a more resource-heavy (React/Next.js) solution has brought with it new challenges and trade-offs when optimising for web page performance.
Having undergone several focused performance initiatives, Zoopla has gradually shifted our performance "golden signal" away from the Google Lighthouse Performance (LHP) score to the relatively new Core Web Vitals (CWV). Of course these two measures go hand in hand, with the CWV contributing to the overall LHP, but it was an important distinction for us, as it allows us to focus more keenly on improvements that move the smaller "needles" rather than one big "needle". CWV scales and measurements are far more transparent and reliable than the many inputs and weightings considered in the LHP.
For those unfamiliar, the CWV are three signals that indicate the quality of the user experience on your site:
- Largest Contentful Paint (LCP) measures loading speed: the time until the largest content element is rendered. Bounce rates climb quickly once loading takes longer than 2.5 seconds, so we want our pages to load within that threshold.
- First Input Delay (FID) measures how quickly your site responds to input; we want to avoid that "laggy" and "unresponsive" feeling. We want our pages to have an input delay of no more than 100ms, as under this threshold interactions still feel instantaneous.
- Finally, Cumulative Layout Shift (CLS) measures the stability of the page layout, i.e. no nasty shifts or content jumping around the screen. Google's scoring algorithm weighs each shift by the size of the moving element and the distance it moves; we want a score of less than 0.1.
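As a sketch of how that scoring works (assuming the impact-fraction and distance-fraction definitions from Google's public CLS documentation; the function name is our own), a single layout shift's score is the product of the two fractions:

```javascript
// Sketch of Google's layout shift scoring: score = impact fraction * distance fraction.
// impactFraction: share of the viewport affected by the shifting element
// (union of its before and after positions). distanceFraction: the greatest
// distance moved, divided by the viewport's largest dimension.
function layoutShiftScore(impactFraction, distanceFraction) {
  return impactFraction * distanceFraction;
}

// Example: an element covering 50% of the viewport moves by 25% of the
// viewport height.
console.log(layoutShiftScore(0.5, 0.25)); // 0.125 -- already above the 0.1 "good" threshold
```

The page's CLS is then an accumulation of such shift scores, which is why even a single large banner popping in late can push a page out of the "good" band.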
Google collects real user data on these three metrics and segments it across mobile and desktop devices. Each metric has "good", "needs improvement" and "poor" thresholds, measured at the 75th percentile of real user experiences.
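The published threshold values can be captured in a small helper (a sketch; the threshold numbers are Google's documented values, but the helper itself is illustrative):

```javascript
// Google's published CWV thresholds: values at or below "good" pass, values
// above "poor" fail, and everything in between "needs improvement".
const THRESHOLDS = {
  LCP: { good: 2500, poor: 4000 }, // milliseconds
  FID: { good: 100, poor: 300 },   // milliseconds
  CLS: { good: 0.1, poor: 0.25 },  // unitless score
};

// Classify a single 75th-percentile value for one metric.
function classify(metric, p75) {
  const t = THRESHOLDS[metric];
  if (p75 <= t.good) return 'good';
  if (p75 <= t.poor) return 'needs improvement';
  return 'poor';
}

console.log(classify('LCP', 2300)); // "good"
console.log(classify('CLS', 0.12)); // "needs improvement"
```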
“Users are 24% less likely to abandon page loads with an overall good score”
CWV directly impacts our search rankings. As a marketplace we must ensure our portal pages are relevant and our content of high quality, so that searches like "flats for sale in Devon" will match our site. However, when competitor pages are of similar quality and relevancy, Google's algorithm uses the CWV metrics to further rank the matching results. Google segments mobile and desktop metrics, which means our mobile search ranking can differ from our desktop ranking if the CWV differ across the two channels. The majority of our traffic is mobile-based and the majority of our page referrals come from organic search results (i.e. not adverts). We therefore place huge importance on achieving and maintaining good CWV scores, particularly on mobile.
At Zoopla, we track our CWV performance in several ways:
- A self-hosted sitespeed.io agent crawls (on an hourly schedule) a curated list of our pages for both desktop and mobile, using emulated 3G and cable connections. This gives us a synthetic "lab" score and a baseline to compare over time. It is not perfect and does not directly match the data gathered by Google, but it allows us to observe trends and is an early indicator of something being drastically impacted by a recent change.
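  As an illustration (not our actual setup), a mobile 3G run of that kind could be expressed as a sitespeed.io JSON config passed via `--config`; the URL is a placeholder and the option names follow sitespeed.io's documented configuration:

  ```json
  {
    "urls": ["https://www.zoopla.co.uk/"],
    "mobile": true,
    "browsertime": {
      "iterations": 3,
      "connectivity": { "profile": "3g" }
    }
  }
  ```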
- webpagetest.org is a fantastic tool for further diagnosing issues and reliably profiling the impact of changes. The results are easy to gather and share. It is most frequently used to benchmark planned changes on an ephemeral "production like" development environment so that we can quantify the impact prior to shipping.
- The Google Search Console also offers a Core Web Vitals dashboard powered by the Chrome UX Report. These dashboards provide us with the most accurate CWV data as it is the very same data used by Google's algorithm and it uses Real User Metrics (RUM). However the data can lag behind by a few days and does not offer a "live" representation of our position.
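  The same CrUX dataset is also queryable programmatically via the Chrome UX Report API. As a sketch (the endpoint and field names follow the public API; the helper itself is hypothetical), the request body for a page's mobile metrics can be built like this:

  ```javascript
  // Build a request body for the Chrome UX Report API
  // (POST https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=API_KEY).
  // Field names follow the public API; this helper is illustrative.
  function cruxRequestBody(url) {
    return {
      url,                  // or use "origin" for site-wide data
      formFactor: 'PHONE',  // CrUX segments by device, like Search Console
      metrics: [
        'largest_contentful_paint',
        'first_input_delay',
        'cumulative_layout_shift',
      ],
    };
  }

  console.log(JSON.stringify(cruxRequestBody('https://www.zoopla.co.uk/'), null, 2));
  ```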
- Chrome Lighthouse, while useful during local development, returns results that are far too variable between runs to serve as a reliable signal. It can really struggle to provide dependable feedback on smaller tweaks and fine-tuning.
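One common mitigation for that run-to-run variability (a general technique, not a Zoopla-specific tool) is to repeat the run several times and compare medians rather than single scores:

```javascript
// Reduce run-to-run noise by taking the median of several scores,
// e.g. Lighthouse performance scores from repeated local runs.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 !== 0
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Five noisy runs of the same page; the median damps the outliers.
console.log(median([78, 91, 84, 85, 83])); // 84
```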
To help illustrate how we use these sources consider the following example.
- Our SEO team notices in the Google Search Console that the number of our mobile pages reporting as "needs improvement" has spiked dramatically. From the Search Console we can see a breakdown of which page URLs are at fault and which CWV metric is of concern.
- With this information we can now consult our sitespeed.io graphs for the pages in question. Here we may notice that the performance of the page has slowly degraded over time, but we notice a particular jump on a certain day for the metric in question. We then check the application commit and deployment history around this time frame and focus on a change we believe to be the culprit.
- Next we deploy a feature build (in an ephemeral production-like environment) and run webpagetest.org against it to give us a baseline. We then change and deploy code in an attempt to rectify the issue and rerun webpagetest.org. Hopefully we confirm that our change will have a positive impact and we can proceed to promote to production.
- After the production deployment our sitespeed.io graphs should show a similar improvement to the page metric on its next run, giving us the confidence that we have made some level of improvement.
- Finally after a lag time of a few days we should see the impact of this change trickle back into the Google Search Console dashboard and hopefully push our pages back into the "good" threshold.
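The before/after comparison in that loop can be sketched as a simple check over the two sets of runs (a hypothetical helper; real numbers would be parsed from WebPageTest's results):

```javascript
// Compare a candidate build's metric against a baseline using the
// median of several runs of each build (e.g. from WebPageTest).
function medianOf(runs) {
  const sorted = [...runs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Positive result: the candidate is faster by that many milliseconds.
function improvementMs(baselineRuns, candidateRuns) {
  return medianOf(baselineRuns) - medianOf(candidateRuns);
}

// LCP in milliseconds across three runs of each build (illustrative values).
const baseline = [3100, 2950, 3200];
const candidate = [2600, 2550, 2700];
console.log(improvementMs(baseline, candidate)); // 500 -- candidate is ~0.5s faster
```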
This methodology has allowed us to improve our pages in a controlled and systematic manner, and brings clarity to a process that can sometimes feel like a game of whack-a-mole. This approach allows us to clearly conclude and evidence the impact of our changes and helps grow knowledge in the teams about patterns to avoid and best practices to follow.
There is always room for improvement and there is a degree of human intervention as well as trial and error involved in the process. We hope to improve and shorten the feedback loop over time, with our ultimate goal being an automated deployment process that canary releases to x% of traffic, and compares CWV data in situ. If the canary release introduces an unacceptable shift in CWV the release is aborted and the code owner notified.
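A sketch of what that automated gate could look like (entirely hypothetical; the names and the 5% tolerance are illustrative): compute the 75th percentile of each CWV from canary and baseline traffic slices, and abort if the canary regresses beyond the tolerance:

```javascript
// Hypothetical canary gate: abort the release if the canary's 75th-percentile
// value for a Core Web Vital regresses beyond a small tolerance relative to
// the baseline. The 5% tolerance is an illustrative choice.
function p75(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.ceil(0.75 * sorted.length) - 1];
}

function shouldAbort(baselineSamples, canarySamples, tolerance = 0.05) {
  return p75(canarySamples) > p75(baselineSamples) * (1 + tolerance);
}

// LCP samples (ms) from matching slices of traffic (illustrative values).
const baselineLcp = [2100, 2300, 2400, 2600];
const canaryLcp = [2500, 2700, 2900, 3100];
console.log(shouldAbort(baselineLcp, canaryLcp)); // true -- regression detected, abort
```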
Performance is important not just for the end user experience but also the business as a whole. We will continue to focus on keeping our CWV healthy by diagnosing, measuring and fixing regressions.
[Post image by Maico Amorim on Unsplash]