The data is in - Your website sucks!

Black and white pencil drawing, an elephant is straddling a race car, its stomach resting on the roof. The race car is slightly crushed.

I have a guilty pleasure.

No, it’s not watching terrible reality shows.

When I get an ad for a company that makes websites for others, I like to do a performance review of that company’s own site.

Why?

Because I kinda enjoy seeing how badly they perform. It’s like watching motorsports for the crashes. But, you know, without the deaths.

And I can’t help but feel like I’m being lied to when a company that wants to sell me a high-quality, performant website has so clearly failed to build its own home on the web to the same standard.

While my unsolicited performance reviews don’t amount to any sort of empirical evidence for the sad state of the modern web, their consistent findings have nevertheless made me worried.

So, instead of worrying without a reason, let’s actually find out! In this post I’ll collect a whole bunch of data to finally conclude that yes, indeed, most websites suck!

The data

The dataset consists of 3373 audits run with the PageSpeed Insights API. All audits were mobile-only.

The audited websites were a subset of 27350 sites collected from dk.trustpilot.com - the Danish subdomain of the review website Trustpilot - with at least 1 review published.

To get the number of sites down to a more manageable size, those with fewer than 100 reviews on Trustpilot were excluded.

Of the remaining 5241 sites, 528 returned errors when the audit was run, resulting in 4713 completed audits. Each audit took between 10 and 40 seconds to finish.
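In case you want to run similar audits yourself, fetching one looks roughly like this. It's a minimal sketch, assuming Node 18+ for the built-in fetch and an API key in a (hypothetical) PSI_API_KEY environment variable; retries for the 528 failures are left out.

```js
// Minimal sketch: request one mobile-only PageSpeed Insights audit and save it
// as JSON. Assumes Node 18+ (built-in fetch) and an API key in the hypothetical
// PSI_API_KEY environment variable.
import { writeFile } from 'node:fs/promises';

const ENDPOINT = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed';

async function auditSite(url) {
  const params = new URLSearchParams({
    url,
    strategy: 'mobile',        // mobile-only, as in this dataset
    category: 'performance',
    key: process.env.PSI_API_KEY,
  });

  const res = await fetch(`${ENDPOINT}?${params}`);
  if (!res.ok) throw new Error(`Audit failed for ${url}: ${res.status}`);

  const audit = await res.json();
  const file = `${new URL(url).hostname}.json`;
  await writeFile(file, JSON.stringify(audit));
  return file;
}

// auditSite('https://example.com').then(console.log).catch(console.error);
```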

Part of the PageSpeed Insights audit is a report on real-world performance metrics collected from actual users as part of CrUX. However, not every site had enough data collected on it to compute these metrics, leaving some or all of these fields empty in the audit. Websites without all metrics available were filtered out, leaving 3373.

Each audit was delivered as a JSON file and then parsed with Node.
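The parsing and filtering is little more than reading the files and checking for the CrUX fields. A sketch rather than the exact script, assuming the field data sits under loadingExperience.metrics with keys like those below:

```js
// Sketch of the filtering step: keep only audits where the CrUX field metrics
// are all present. Key names follow the PSI v5 response (loadingExperience.metrics);
// the full filter would also check the INP and TTFB keys used by the API.
import { readdir, readFile } from 'node:fs/promises';

const REQUIRED_METRICS = [
  'CUMULATIVE_LAYOUT_SHIFT_SCORE',
  'FIRST_CONTENTFUL_PAINT_MS',
  'FIRST_INPUT_DELAY_MS',
  'LARGEST_CONTENTFUL_PAINT_MS',
];

async function loadCompleteAudits(dir) {
  const complete = [];
  for (const file of await readdir(dir)) {
    if (!file.endsWith('.json')) continue;
    const audit = JSON.parse(await readFile(`${dir}/${file}`, 'utf8'));
    const metrics = audit.loadingExperience?.metrics ?? {};
    if (REQUIRED_METRICS.every((key) => key in metrics)) complete.push(audit);
  }
  return complete;
}
```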

27350 sites on dk.trustpilot.com → 5241 sites with at least 100 reviews → 4713 audits run successfully → 3373 audits with all metrics
Overview of data selection

The results

So, with the numbers in hand, is it as bad as I feared?

Yes!

And no. It’s complicated. But mostly yes, unfortunately. Let’s start with the results from the simulated Moto G4 on a slow 4G network.

Histogram of performance scores

The average performance score across the audited sites was 36. Thirty-six. A whopping 78% scored below 50 - what Google regards as “poor” performance. Only 1.2% scored above 89, achieving the label of “good”. To provide a good user experience, sites should strive to have a good score (90-100), as Google puts it.
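If you want to recompute those numbers from the audits, the tally is simple. A sketch assuming the PSI v5 layout for the score:

```js
// Sketch: average performance score plus the share of "poor" (<50) and
// "good" (>=90) sites. Assumes the PSI v5 layout, where the Lighthouse score
// sits at lighthouseResult.categories.performance.score as a 0-1 value.
function scoreStats(audits) {
  const scores = audits.map((a) =>
    Math.round(a.lighthouseResult.categories.performance.score * 100)
  );
  const average = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const poor = scores.filter((s) => s < 50).length / scores.length;
  const good = scores.filter((s) => s >= 90).length / scores.length;
  return { average, poor, good };
}
```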

Let’s have a look at the six metrics that make up the overall performance score:

First Contentful Paint (FCP): 3.6 s
Speed Index (SI): 8.3 s
Largest Contentful Paint (LCP): 10 s
Time to Interactive (TTI): 14 s
Total Blocking Time (TBT): 1.5 s
Cumulative Layout Shift (CLS): 0.22
Average values for the six core metrics

Here, only the average cumulative layout shift (CLS) gets anywhere close to acceptable levels. The rest sit firmly in the red. Imagine having to wait 14 seconds for a site to become interactive! That’s of course if you even bother to wait around almost 4 seconds for anything to appear on the page in the first place!
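For reference, these lab metrics can be read straight out of each audit file. Another sketch assuming the v5 layout:

```js
// Sketch: pull the six lab metrics out of a single audit. Audit IDs and the
// numericValue field follow the PSI v5 layout; values are in milliseconds,
// except cumulative-layout-shift, which is unitless.
function labMetrics(audit) {
  const audits = audit.lighthouseResult.audits;
  const value = (id) => audits[id].numericValue;
  return {
    fcp: value('first-contentful-paint'),
    si: value('speed-index'),
    lcp: value('largest-contentful-paint'),
    tti: value('interactive'),
    tbt: value('total-blocking-time'),
    cls: value('cumulative-layout-shift'),
  };
}
```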

When possible, the audit also tries to offer suggestions for specific improvements. Among these are the potential savings from removing unused JavaScript and CSS, and from serving right-sized images in modern formats like WebP.

Potential for savings across CSS, JS and images

On average the sites stand to save 1.2 MB. Do you have any idea how much website you can get for one-point-two megabytes?! Well, it’s a lot! Even an image-heavy landing page should be able to fit within that budget if below-the-fold content is deferred.
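Those savings figures come from Lighthouse's opportunity audits, summed per category. Roughly like this, assuming the v5 audit IDs:

```js
// Sketch: sum the potential byte savings reported by Lighthouse's opportunity
// audits. Audit IDs and details.overallSavingsBytes follow the PSI v5 layout;
// opportunities a site doesn't trigger simply count as zero.
const OPPORTUNITIES = {
  css: ['unused-css-rules'],
  js: ['unused-javascript'],
  img: ['modern-image-formats', 'uses-optimized-images', 'uses-responsive-images'],
};

function potentialSavings(audit) {
  const audits = audit.lighthouseResult.audits;
  const sum = (ids) =>
    ids.reduce((total, id) => total + (audits[id]?.details?.overallSavingsBytes ?? 0), 0);
  const css = sum(OPPORTUNITIES.css);
  const js = sum(OPPORTUNITIES.js);
  const img = sum(OPPORTUNITIES.img);
  return { css, js, img, total: css + js + img };
}
```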

There’s a lot of data to look at in a single audit, let alone 3373, and there’s no doubt potential for many interesting comparisons and analyses. For now, however, let’s move briskly along to the performance numbers collected from real-world users. Again we're looking at data from mobile only.

PSI classifies the quality of user experiences into three buckets - Poor, Needs Improvement, Good - and reports the proportion of users in each for six different metrics. The specific criteria used are available here. Ideally, all users would get the Good experience, but alas, the real world is complicated. The newest iPhone on an excellent 5G connection is going to perform better than a 10-year-old Android phone on slow 3G. Google uses the 75th percentile to give a metric its overall classification. That is, at least 75% of users had that experience or better.
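In the audit files, each field metric comes with both its 75th percentile and a ready-made category, so reading the classification is trivial. A sketch assuming the v5 response shape:

```js
// Sketch: read the field-data result for one metric. In the PSI v5 response,
// each loadingExperience.metrics entry carries the 75th percentile and a
// ready-made category: FAST, AVERAGE or SLOW (i.e. Good, Needs Improvement, Poor).
function fieldMetric(audit, key) {
  const metric = audit.loadingExperience.metrics[key];
  return { p75: metric.percentile, category: metric.category };
}

// e.g. fieldMetric(audit, 'LARGEST_CONTENTFUL_PAINT_MS')
//      -> { p75: 2400, category: 'FAST' }
```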

The graphic below shows the distribution of overall classifications across the audited sites.

User experience classification across all audited sites for six metrics - Cumulative Layout Shift, Interaction to Next Paint, Time to First Byte, First Contentful Paint, First Input Delay, Largest Contentful Paint (left to right: Poor, Needs Improvement, Good)

Suddenly the results aren’t too shabby! In all metrics the majority of sites get the classification Good. Great! The proportion of sites earning the dreaded Poor classification never exceeds 17%, and First Input Delay is clearly a complete non-issue.

Not so fast, though. It’s possible for a specific site to excel in most metrics, but offer such a low score in one or two that it nevertheless makes for a terrible experience for the majority of users.

That's what Google accounts for in the overall classification of user experience. Here, the result equals the lowest-performing metric among Cumulative Layout Shift, Largest Contentful Paint and First Input Delay (the so-called core web vitals). If a site scores Good in 2 out of 3, but Poor in the last, the overall classification will be Poor.
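In code, that rule amounts to picking the worst of the three. A sketch assuming the v5 key names:

```js
// Sketch: the overall classification equals the worst of the three core web
// vitals. Categories as reported by PSI v5, ordered here from best to worst;
// the response also carries this directly as loadingExperience.overall_category.
const ORDER = ['FAST', 'AVERAGE', 'SLOW']; // Good, Needs Improvement, Poor

function overallCategory(audit) {
  const metrics = audit.loadingExperience.metrics;
  const vitals = [
    'LARGEST_CONTENTFUL_PAINT_MS',
    'FIRST_INPUT_DELAY_MS',
    'CUMULATIVE_LAYOUT_SHIFT_SCORE',
  ];
  return vitals
    .map((key) => metrics[key].category)
    .reduce((worst, c) => (ORDER.indexOf(c) > ORDER.indexOf(worst) ? c : worst));
}
```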

Let's take a look at the distribution of those overall classifications:

Distribution of overall user experience classification (Poor, Needs Improvement, Good)

The number of sites that can offer a Good experience to most users shrinks considerably. Now only 18% get that. At least Poor remains the minority!

It should also be noted that these real-world metrics come from a very particular subset of users: those who use Chrome (the iOS version not included), have enabled usage statistics reporting and sync their browser history.

Struggling to convert mobile users?

It can be tempting to ignore the terrible results from the simulated Moto G4 and focus on the real world user metrics. After all, it’s the real users who count, right? Clearly, they’re mostly getting a decent experience. And only real users can turn into real customers!

Black and white pencil drawing, an elephant is sitting in the cargo bed of a pickup truck, barely fitting in.

But before you do so, go ahead and look up your conversion rates by device type. Maybe you already know what I’m hinting at. Could it be that your much lower conversion rate on mobile is in part due to the terrible performance those mobile users are likely to experience?

Perhaps the conversion rates you see aren't even accurate. Is a mobile user on a slow device and connection going to stick around long enough for your slew of analytics scripts to even count them?

And sure, you probably have a super fancy responsive design that adjusts itself according to screen size - even the smallest ones! That's a good start. But a mobile friendly website requires more than media queries.

Article illustrations generated by DALL-E 2

Get the data

Feel like poring over the data yourself? Please do! It’s about 900 MB (2 GB uncompressed) in total and comes in the form of 4713 JSON files assembled in a ZIP archive.

Audits Zip-archive