How to inspect any URL from any website in Google Search Console

Today I bring you a tool that will let you take your URL tests in Google Search Console to the next level. We’re going to see how a script that acts as a reverse proxy and/or redirector helps you test URLs from any website, and gives you a competitive edge in analyzing your SEO strategies.

The idea came to me thanks to Reddit’s ongoing battle against search engines that don’t pay it and are therefore blocked, such as Bing.

Let’s get started!

What is this script for?

This script is designed to run on Cloudflare Workers, although it can be adapted to any JavaScript-compatible environment, provided you have a domain fully at your disposal.

You can find the entire script on GitHub. Any feedback or suggestions for improvement are welcome.

Its main purpose is to act as a reverse proxy and/or dynamic redirector, allowing you to simulate different devices and user agents when accessing URLs, both your own and, of course, those of your competitors or any other website.

Why is this interesting? Because it allows you, among other things:

  1. To check if a website is doing cloaking (showing different content to bots and users).
  2. To verify whether certain elements are hidden specifically from Googlebot (for example, the cookie banner).
  3. To perform competitive analysis more effectively.
  4. To test how a third-party website looks and behaves for Googlebot on different devices and browsers.

On a domain you control, this script builds a fully dynamic infrastructure for testing in GSC any existing (or non-existing!) URL on the entire internet.

Detailed analysis of how the script works

Let’s break down the most important parts of the script:

1. Request handling

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

This is the entry point of the script. It listens for all incoming requests and passes them to the handleRequest function.
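
The snippet uses the classic Service Worker syntax. If your project uses the newer ES module format for Workers, the equivalent entry point would be:

export default {
  // Module Worker syntax, equivalent to the addEventListener version above
  async fetch(request) {
    return handleRequest(request)
  }
}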

2. Main function: handleRequest

This function is the heart of the script. It does the following:

  • Serves a robots.txt file.
  • Handles requests under the /proxy path.
  • Checks the required parameters (url, device, redirect).
  • Makes the request to the destination site with the appropriate user agent.
  • Modifies HTML responses to rewrite relative URLs. This is necessary because relative URLs don’t include the target’s domain; left untouched, they would resolve against the root of our own domain and fail to load.
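
Putting those steps together, here is a simplified sketch of what handleRequest could look like. This is a reconstruction from the list above, not the published script (that lives on GitHub); rewriteUrls matches the function shown later, while getUserAgent stands in for the device switch described below:

async function handleRequest(request) {
  const { pathname, searchParams } = new URL(request.url)

  // Serve a robots.txt with placeholder rules. Googlebot must be
  // allowed to fetch /proxy, or GSC live tests would report a block.
  if (pathname === '/robots.txt') {
    return new Response('User-agent: *\nDisallow:', {
      headers: { 'Content-Type': 'text/plain' }
    })
  }

  if (pathname !== '/proxy') {
    return new Response('Not found', { status: 404 })
  }

  const targetUrl = searchParams.get('url')
  const device = searchParams.get('device') || 'mobile'
  const redirect = searchParams.get('redirect') === 'true'
  if (!targetUrl) {
    return new Response('Missing url parameter', { status: 400 })
  }

  // Redirect mode: Google itself follows the redirect and fetches
  // the destination, so no rewriting is needed.
  if (redirect) {
    return Response.redirect(targetUrl, 302)
  }

  // Proxy mode: fetch the target with the simulated user agent.
  const response = await fetch(targetUrl, {
    headers: { 'User-Agent': getUserAgent(device) }
  })

  // Only HTML responses need their relative URLs rewritten.
  const contentType = response.headers.get('Content-Type') || ''
  if (contentType.includes('text/html')) {
    const html = await response.text()
    return new Response(rewriteUrls(html, targetUrl), {
      status: response.status,
      headers: { 'Content-Type': contentType }
    })
  }
  return response
}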

3. Device simulation

The script can simulate different devices by changing the User-Agent:

switch(device) {
  case 'desktop':
    // ...
  case 'windows':
    // ...
  case 'android':
    // ...
  case 'iphone':
    // ...
  default:
    // ...
}

Desktop and mobile (the default) use the Googlebot user agent; the remaining options use regular, non-Googlebot user agents.
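
As a reference, that mapping could look like the sketch below. The function name and exact strings are assumptions on my part (Google documents its Googlebot user agents; the rest are generic browser strings), so check the script on GitHub for the real values:

function getUserAgent(device) {
  switch (device) {
    case 'desktop':
      // Googlebot Desktop
      return 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/120.0.0.0 Safari/537.36'
    case 'windows':
      // Plain Chrome on Windows, no Googlebot token
      return 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
    case 'android':
      // Plain Chrome on Android
      return 'Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36'
    case 'iphone':
      // Plain Safari on iOS
      return 'Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1'
    default:
      // 'mobile', the default: Googlebot Smartphone
      return 'Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
  }
}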

This is crucial for testing how a site behaves on different devices and platforms.

4. URL rewriting

The rewriteUrls function is essential to ensure that all resources (images, CSS, JavaScript) load correctly through the proxy:

function rewriteUrls(html, baseUrl) {
  // ...
}

This function modifies all relative URLs in the HTML to point to the proxy, keeping the site’s functionality intact.
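
As a sketch of the idea (the real function on GitHub covers more cases, such as protocol-relative URLs or srcset attributes; the regex here is deliberately simple):

function rewriteUrls(html, baseUrl) {
  // Rewrite relative href/src attributes so links and assets resolve
  // against the target site and load back through /proxy.
  return html.replace(
    /(href|src)=["'](?!https?:|\/\/|data:|#|mailto:)([^"']+)["']/gi,
    (match, attr, path) => {
      const absolute = new URL(path, baseUrl).href
      return `${attr}="/proxy?url=${encodeURIComponent(absolute)}"`
    }
  )
}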

Natzir Turrado sent me this article by Oliver H.G. Mason, and I realized that redirects were a better way to do this exercise. With the reverse proxy, some websites (especially large ones) are likely to block us, and rewriting relative URLs at scale is complex and prone to missing edge cases; although the script is quite flexible here, I’ve seen cases it doesn’t correct well. The redirect method, by contrast, works perfectly, since Google itself makes the request to the destination, so there’s nothing more to worry about.

How to deploy the script on Cloudflare Workers

  • Log in to your Cloudflare account.
  • Go to the “Workers” section.
  • Click on “Create a Worker”.
  • Copy and paste the complete script into the editor.
  • Click on “Save and Deploy”.

Finally, you’ll need to verify that property in Google Search Console to have full access. The easiest way, since your domain is ideally already on Cloudflare, is to add the TXT record Google provides in order to validate it.

Once deployed, you’ll have a unique URL for your Worker. Use it as the base for your proxy requests.

To use this, you’ll need full control over the domain or subdomain involved, which is why the default workers.dev domains won’t do: you’ll have to use one of your own.

Important: all of this is free, except for the domain, which you’ll have to buy (if you use a subdomain of one you already own, everything is free). Cloudflare Workers’ free tier covers up to 100,000 requests per day.

Practical use cases

This solution works for any URL on any domain, with no limits: you can inspect pages from sites you don’t own, something GSC alone doesn’t allow.

Here are some use cases. You’ll see I’ve used a domain I had lying around, aimarketshare.org; you can see it working in a real case here: https://aimarketshare.org/proxy?device=desktop&redirect=true&url=https://marca.com/ or, without redirect, https://aimarketshare.org/proxy?device=desktop&url=https://marca.com:

1. Cloaking detection on a competitor

To check whether a site is showing different content to bots, inspect a URL like this in GSC:

https://your-domain.com/proxy?url=https://suspicious-site.com/suspicious-page&device=desktop

Compare this with what you see when accessing the site directly.

2. Verification of hidden elements for Googlebot

To see if there are elements that are only hidden from Googlebot:

https://your-domain.com/proxy?url=https://site-to-verify.com/page-1&device=mobile

Or do the same with the redirect method. I recommend always using this method for more reliable results; in other words, always add redirect=true to the request:

https://your-domain.com/proxy?url=https://site-with-redirects.com&redirect=true

3. Competitive analysis

To analyze how a competitor’s site looks on different devices:

https://your-domain.com/proxy?url=https://competitor.com/page-i-like&redirect=true&device=iphone

Conclusion

This script can be a valuable addition to your arsenal as an SEO or web developer. It gives you insights into how websites behave in Google’s eyes across different scenarios, which can be crucial for optimizing your own sites and for understanding your competitors’ best-hidden strategies.

Always remember to use this tool ethically and responsibly.

Happy testing!