User-Agent reduction

In the fourth chapter of “The problem with User-Agent strings”, we look at some of the privacy advances that browsers have made in the last couple of years.

Privacy on the Internet has become much more important over the last decade. All the prominent browser vendors have informally agreed to limit the potentially user-identifying information in the User-Agent string.

As the creator of a popular browser detection library, I might say that browser vendors have been ‘plotting and scheming’ to reduce the amount of helpful information in the User-Agent string, and that makes my life quite a bit more complicated. 

But no. Privacy is essential, and there are certainly some excellent arguments in favour of this effort. 

I’ve seen cases where some Chinese browsers included a unique identifier in their User-Agent string, which acts like a global super cookie that allows everybody to track that user. The same thing still sometimes happens with in-app browsers in mobile apps. But it is not just that.

There was a time when the Internet Explorer User-Agent string included the versions of all the installed .NET runtimes.

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E)

And until recently, the User-Agent string on Android contained the phone’s model number, so every website you visited knew precisely which phone you used.

Mozilla/5.0 (Linux; Android 11; Pixel 4a (5G)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.79 Mobile Safari/537.36

Apart from directly leaking private information, there is another argument in favour of reducing the information in the User-Agent string: fingerprinting. Fingerprinting tries to uniquely identify users by examining all kinds of information readily available in the browser. Using that fingerprint, you can track users across multiple websites.

One of the components used in fingerprinting is the User-Agent string. Reducing the unique information it contains makes it less useful as a fingerprinting signal, which in turn makes users harder to fingerprint.

As you’ve probably guessed, you can’t remove anything from the User-Agent string because doing so causes websites to break. The only way to reduce the information is to use fixed values for every component that you don’t think is necessary to include. And that is what browsers have been doing for the last couple of years.

So, let’s review all the information that is no longer included – or accurate – in the User-Agent string. 

Browser build number

Chrome 101 and later no longer include the build number in the browser version number. Only the major version number remains, and the other components are set to zero. So Chrome/101.0.4951 becomes Chrome/101.0.0.0. Other Chromium browsers like Edge and Opera have also dropped the build number from their own version.
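
To make the effect concrete, here is a minimal sketch in Python that applies the same kind of rewrite to a full Chrome version token: the major version is kept and the other three components are replaced with zeros. The function name `reduce_chrome_version` is made up for this example, and this is not how Chromium implements the change; it only mimics the outcome on a string.

```python
import re

# Matches a four-part Chrome version token such as "Chrome/100.0.4896.79".
CHROME_VERSION = re.compile(r"Chrome/(\d+)\.\d+\.\d+\.\d+")


def reduce_chrome_version(user_agent: str) -> str:
    """Keep the major version and zero out the remaining components.

    Hypothetical helper for illustration only.
    """
    return CHROME_VERSION.sub(r"Chrome/\g<1>.0.0.0", user_agent)


full = (
    "Mozilla/5.0 (Linux; Android 11; Pixel 4a (5G)) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/100.0.4896.79 Mobile Safari/537.36"
)
print(reduce_chrome_version(full))
```

Running it on the Pixel 4a string from earlier leaves everything in place except the version token, which becomes Chrome/100.0.0.0.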

In the past, this version number allowed the website to see if the user was running a nightly build, canary, beta or release version of the browser. Given that the number of users for a specific pre-release build is minimal, somebody could use this to track those users.

So this is definitely a good thing. 

I can’t think of any good use case for including this, except maybe the browser’s bug tracker. Without the build number in the User-Agent string, the bug tracker can no longer auto-populate the build number in the bug submission form. 

CPU architecture

Chrome, Edge, Firefox and Safari on desktop machines no longer include accurate information about the CPU architecture. Everybody now uses Intel forever. 

That means that Safari running on an Apple Silicon-based Mac still says it is running on Intel:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15

But it is not just on Mac; it also happens on Windows. Take a look at the User-Agent string for Edge on the Windows Dev Kit running on an ARM processor:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36 Edg/

Instead of an ARM processor, it lies and tells us it runs on a 64-bit Intel processor. But given the minimal number of Windows users running ARM, this will be fine, right?

Actually, it is a problem. It is not only lying about the processor’s brand. It says it is a 64-bit processor running a 64-bit version of Windows. It will say that even if you are running a 32-bit processor or a 64-bit processor running a 32-bit version of Windows.

Unlike the previous example, this affects users, and I consider the removal of this information potentially a mistake. Consider the following: you want to download some software. You go to the vendor’s website, and there is no longer a single link; instead, you must choose between 32-bit and 64-bit versions. You might know which one to pick, but will the general public?

The same applies to macOS users. They must choose between an Apple Silicon or an Intel build of the application. This is certainly not ideal. If you pick the wrong one, it will either not work, which can be fixed by downloading the right one, or it will run much slower under emulation. Both experiences are bad for the user.

Previously, the download page could detect what kind of system you had by looking at the User-Agent string. That worked really well. 
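
A download page used to make this decision with a check along the lines of the sketch below. It is an illustration only: the function name `guess_windows_architecture` and its return labels are hypothetical, the 32-bit example string is made up, and the check only looks at the `WOW64` and `Win64` tokens that appear in the examples in this chapter.

```python
def guess_windows_architecture(user_agent: str) -> str:
    """Guess the Windows architecture from legacy User-Agent tokens.

    Hypothetical helper; the labels it returns are made up for this sketch.
    """
    if "Windows NT" not in user_agent:
        return "not-windows"
    # "Win64; x64" was sent by 64-bit browsers, "WOW64" by 32-bit
    # browsers running on 64-bit Windows.
    if "Win64" in user_agent or "WOW64" in user_agent:
        return "64-bit"
    # No 64-bit token at all used to imply a 32-bit system.
    return "32-bit"


# Platform part of the Internet Explorer string shown earlier.
legacy_ie = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0)"
# Made-up string for a 32-bit system, for illustration only.
legacy_32bit = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)"
# Start of a reduced string; the remainder is omitted here.
reduced = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)"

for ua in (legacy_ie, legacy_32bit, reduced):
    print(guess_windows_architecture(ua))
```

With reduced strings, every Windows browser sends `Win64; x64`, so a check like this answers 64-bit for everyone, including users on 32-bit Windows or on ARM.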

I run a software company and encounter this with our users. Once in a while, we get a bug report from one of our users that the application on the download page is broken. It isn’t, but it turns out they are running a 32-bit version of Windows and just downloaded the 64-bit version of the application. 

For users on macOS, we decided to offer a Universal version for download. It is twice the size because it contains both the code for Apple Silicon and Intel. The download is huge, but at least it will always work and run without emulation. 

Removal of this information makes users’ lives more difficult. But I am afraid this ship has sailed, and we now have to deal with it.

Operating system version

In the previous chapter, we discussed macOS and why the operating system’s version was fixed to 10.15.7. 

We see the same thing happening with other operating systems as well. When you look at a Windows 11-based PC and open up Edge, it says you are using Windows 10:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36 Edg/

And so do Chrome and Firefox. There is no way to determine if a user uses Windows 11 based on the User-Agent string. Since both sets of users are now grouped together, we also lost the ability to say a user is using Windows 10. The best we can say is a user uses Windows 10 or higher.
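
In a parser, that ends up looking something like the following sketch. The function name `describe_windows` and the lookup table are assumptions for illustration; the table covers only a few well-known releases and is not meant to be complete.

```python
import re
from typing import Optional

WINDOWS_NT = re.compile(r"Windows NT (\d+\.\d+)")

# NT versions that still map to exactly one Windows release.
LEGACY_WINDOWS = {
    "6.1": "Windows 7",
    "6.2": "Windows 8",
    "6.3": "Windows 8.1",
}


def describe_windows(user_agent: str) -> Optional[str]:
    """Return the most specific Windows name the string still supports.

    Hypothetical helper; returns None when no Windows NT token is present.
    """
    match = WINDOWS_NT.search(user_agent)
    if match is None:
        return None
    nt_version = match.group(1)
    if nt_version == "10.0":
        # Windows 10 and Windows 11 both report NT 10.0.
        return "Windows 10 or higher"
    return LEGACY_WINDOWS.get(nt_version, f"Windows NT {nt_version}")


print(describe_windows("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

The important design choice is the label itself: returning "Windows 10" for NT 10.0 would silently misreport every Windows 11 user.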

Moving over to mobile, Chrome 110 and later use a fixed version number for Android. No matter which version you use, it is reported as Android 10. As Chrome is an evergreen browser on Android, this also affects the User-Agent string retroactively. Devices that previously reported Android 11 or 12 now report Android 10 instead.

iOS is the exception here. Safari on iOS still reports the correct version of iOS in its User-Agent string. Given that each version of Safari is coupled to one specific version of iOS, removing it does not have any privacy benefits. One could always deduce the operating system’s version based on the browser’s version.

Device model

That leads us to the most egregious privacy violation in the User-Agent string. Honestly, I do not know why this wasn’t removed much, much earlier.

Why would a website – any website – need to know what kind of phone you have? That says something about how much money you have, or at minimum, how much you are willing to spend on a phone. That is something quite personal that websites do not need to know. 

Imagine a website seeing you have the new Samsung Galaxy S24 Ultra, a 1500-euro phone. That means you have some money to spend. Compare that to a user of a Xiaomi Redmi A1, which you can buy for around 75 euros. Some websites might use that knowledge to ‘optimise’ their prices, to ‘extract more value’. You know what I mean, and no, this is not a hypothetical. 

The history of this goes back a long time, back to a time when there were actually technical reasons why a website needed to know the model of your phone. Back in the days of WAP, there was a separate header that included a link to a file on the manufacturer’s website, called a User-Agent profile, containing all kinds of information about the phone. Even Android Browser used this header, but when Chrome replaced it, it mostly disappeared from use. 

Nevertheless, this profile from the Samsung Galaxy J1 from 2015 is still very much alive:

Including the model number directly in the User-Agent string was also quite common.

Mozilla/5.0 (Series40; Nokia110/03.04; Profile/MIDP-2.1 Configuration/CLDC-1.1) Gecko/20100401 S40OviBrowser/

BlackBerry8100/4.2.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/114

Mozilla/4.0 (PDA; PalmOS/sony/model mdrd/Revision:1.1.47) NetFront/3.0

Given that it was common practice, it wasn’t a weird decision to include the model in the User-Agent string when Google released Android. It was expected. The User-Agent string of the first Android phone, the T-Mobile G1 aka HTC Dream, contained the string “dream”.

Mozilla/5.0 (Linux; U; Android 1.0; en-us; dream) AppleWebKit/525.10 (KHTML, like Gecko) Version/3.0.4 Mobile Safari/523.12.2

And Chrome on Android simply continued with that tradition. 

Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.0 Mobile Safari/537.36

But seriously. This practice should have stopped with Android and the advent of proper mobile browsers. Apple never felt the need to include a model number. Thankfully, as of Chrome 110[1], it is gone on Android too. Good riddance. Instead, all devices will have a model “K” for backward compatibility.

Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Mobile Safari/537.36

And it is not just Chrome. Samsung Internet also dropped the device model, and so will other Chromium-based browsers once they update beyond Chromium 110.

Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/25.0 Chrome/ Mobile Safari/537.36
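
If you parse these strings yourself, it makes sense to treat the frozen values as "unknown" rather than as a real device. The following is a minimal sketch, assuming Chrome-style Android strings and a hypothetical helper called `android_device_model`. It cannot tell a reduced string apart from a real Android 10 device whose model happens to be "K", which is a trade-off this sketch simply accepts.

```python
import re
from typing import Optional

# Captures the Android version and the model that follows it. The model may
# contain parentheses itself, as in "Pixel 4a (5G)", so we match lazily up to
# the ") AppleWebKit" that closes the platform section.
ANDROID_PLATFORM = re.compile(r"Android ([\d.]+); (.+?)\) AppleWebKit")


def android_device_model(user_agent: str) -> Optional[str]:
    """Return the device model, or None if it is absent or the frozen "K".

    Hypothetical helper for illustration only.
    """
    match = ANDROID_PLATFORM.search(user_agent)
    if match is None:
        return None
    version, model = match.groups()
    if version == "10" and model == "K":
        # Reduced string: both values are fixed placeholders.
        return None
    return model


old_style = (
    "Mozilla/5.0 (Linux; Android 11; Pixel 4a (5G)) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/100.0.4896.79 Mobile Safari/537.36"
)
# Start of a reduced string; the remainder is omitted here.
reduced = "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko)"

print(android_device_model(old_style))
print(android_device_model(reduced))
```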

What is left?

Well, not that much, to be honest. 

The three most important things remain available: the browser’s name, the version, and the operating system’s name. That should be enough for statistics and logging issues. 
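
As a rough sketch of how little parsing those three values need, the snippet below pulls them out of a few of the strings shown in this chapter. The names `Summary`, `BROWSERS`, `OPERATING_SYSTEMS` and `summarise` are hypothetical, the token lists are deliberately short, and a real detection library handles far more cases than this.

```python
import re
from typing import NamedTuple, Optional


class Summary(NamedTuple):
    browser: Optional[str]
    major_version: Optional[str]
    os_name: Optional[str]


# Order matters: Chromium-based browsers also carry "Chrome/" and "Safari/",
# so the more specific tokens have to be checked first.
BROWSERS = [
    ("Edge", re.compile(r"Edg/(\d+)")),
    ("Samsung Internet", re.compile(r"SamsungBrowser/(\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("Safari", re.compile(r"Version/(\d+).*Safari/")),
]

# "Android" must come before "Linux" because Android strings contain both.
OPERATING_SYSTEMS = [
    ("Android", "Android"),
    ("iOS", "iPhone"),
    ("Windows", "Windows NT"),
    ("macOS", "Macintosh"),
    ("Linux", "Linux"),
]


def summarise(user_agent: str) -> Summary:
    """Extract browser name, major version and operating system name."""
    browser = major = None
    for name, pattern in BROWSERS:
        match = pattern.search(user_agent)
        if match:
            browser, major = name, match.group(1)
            break
    os_name = next(
        (name for name, token in OPERATING_SYSTEMS if token in user_agent),
        None,
    )
    return Summary(browser, major, os_name)


safari = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.5 Safari/605.1.15"
)
print(summarise(safari))
```

Note that the result says nothing about the operating system's version, the CPU or the device: the sketch only reports what a reduced string can still be trusted for.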

Feature detection can replace most other uses of browser sniffing. Using browser sniffing to infer which features a browser supports is a terrible practice and is prone to failure.

Of course, there are use cases where more information would be helpful. In the next instalment of this series, we’ll look at the User-Agent Client Hints API and how it can solve some of the issues we’re seeing with these reduced User-Agent strings.


  1. This change rolled out gradually between February and May 2023 for Chrome 110 and later. Given that Chrome 114 was released after the rollout finished, it should be the first version that does not contain the device model at all anymore.