Subscribe to receive notifications of new posts:

SEO Best Practices with Cloudflare Workers, Part 2: Implementing Subdomains

2019-02-15

2 min read

Recap

In Part 1, the merits and tradeoffs of subdirectories and subdomains were discussed.  The subdirectory strategy is typically superior to subdomains because subdomains suffer from keyword and backlink dilution.  The subdirectory strategy more effectively boosts a site's search rankings by ensuring that every keyword is attributed to the root domain instead of diluting across subdomains.

Subdirectory Strategy without the NGINX

In the first part, our friend Bob set up a hosted Ghost blog at bobtopia.coolghosthost.com that he connected to blog.bobtopia.com using a CNAME DNS record.  But what if he wanted his blog to live at bobtopia.com/blog to gain the SEO advantages of subdirectories?

A reverse proxy like NGINX is normally needed to route traffic from subdirectories to remotely hosted services.  We'll demonstrate how to implement the subdirectory strategy with Cloudflare Workers and eliminate our dependency on NGINX. (Cloudflare Workers are serverless functions that run on the Cloudflare global network.)

Back to Bobtopia

Let's write a Worker that proxies traffic from a subdirectory – bobtopia.com/blog – to a remotely hosted platform – bobtopia.coolghosthost.com.  This means that if I go to bobtopia.com/blog, I should see the content of bobtopia.coolghosthost.com, but my browser should still think it's on bobtopia.com.

Configuration Options

In the Workers editor, we'll start a new script with some basic configuration options.

// keep track of all our blog endpoints here
const myBlog = {
  hostname: "bobtopia.coolghosthost.com",
  targetSubdirectory: "/articles",
  assetsPathnames: ["/public/", "/assets/"]
}

The script will proxy traffic from myBlog.targetSubdirectory to Bob's hosted Ghost endpoint, myBlog.hostname.  We'll talk about myBlog.assetsPathnames a little later.

Requests are proxied from bobtopia.com/articles to bobtopia.coolghosthost.com (Uh oh... is because the hosted Ghost blog doesn't actually exist)

Request Handlers

Next, we'll add a request handler:

async function handleRequest(request) {
  return fetch(request)
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
})

So far we're just passing requests through handleRequest unmodified.  Let's make it do something:


async function handleRequest(request) { 
  ...

  // if the request is for blog html, get it
  if (requestMatches(myBlog.targetSubdirectory)) {
    console.log("this is a request for a blog document", parsedUrl.pathname)
    const targetPath = formatPath(parsedUrl)
    
    return fetch(`https://${myBlog.hostname}/${targetPath}`)
  }

  ...
  
  console.log("this is a request to my root domain", parsedUrl.pathname)
  // if its not a request blog related stuff, do nothing
  return fetch(request)
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
})

In the above code, we added a conditional statement to handle traffic to myBlog.targetSubdirectory.  Note that we've omitted our helper functions here.  The relevant code lives inside the if block near the top of the function. The requestMatches helper checks if the incoming request contains targetSubdirectory.  If it does, a request is made to myBlog.hostname to fetch the HTML document which is returned to the browser.

When the browser parses the HTML, it makes additional asset requests required by the document (think images, stylesheets, and scripts).  We'll need another conditional statement to handle these kinds of requests.

// if its blog assets, get them
if ([myBlog.assetsPathnames].some(requestMatches)) {
    console.log("this is a request for blog assets", parsedUrl.pathname)
    const assetUrl = request.url.replace(parsedUrl.hostname, myBlog.hostname);

    return fetch(assetUrl)
  }

This similarly shaped block checks if the request matches any pathnames enumerated in myBlog.assetPathnames and fetches the assets required to fully render the page.  Assets happen to live in /public and /assets on a Ghost blog.  You'll be able to identify your assets directories when you fetch the HTML and see logs for scripts, images, and stylesheets.

Logs show the various scripts and stylesheets required by Ghost live in /assets and /public

The full script with helper functions included is:


// keep track of all our blog endpoints here
const myBlog = {
  hostname: "bobtopia.coolghosthost.com",
  targetSubdirectory: "/articles",
  assetsPathnames: ["/public/", "/assets/"]
}

async function handleRequest(request) {
  // returns an empty string or a path if one exists
  const formatPath = (url) => {
    const pruned = url.pathname.split("/").filter(part => part)
    return pruned && pruned.length > 1 ? `${pruned.join("/")}` : ""
  }
  
  const parsedUrl = new URL(request.url)
  const requestMatches = match => new RegExp(match).test(parsedUrl.pathname)
  
  // if its blog html, get it
  if (requestMatches(myBlog.targetSubdirectory)) {
    console.log("this is a request for a blog document", parsedUrl.pathname)
    const targetPath = formatPath(parsedUrl)
    
    return fetch(`https://${myBlog.hostname}/${targetPath}`)
  }
  
  // if its blog assets, get them
  if ([myBlog.assetsPathnames].some(requestMatches)) {
    console.log("this is a request for blog assets", parsedUrl.pathname)
    const assetUrl = request.url.replace(parsedUrl.hostname, myBlog.hostname);

    return fetch(assetUrl)
  }

  console.log("this is a request to my root domain", parsedUrl.host, parsedUrl.pathname);
  // if its not a request blog related stuff, do nothing
  return fetch(request)
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
})

Caveat

There is one important caveat about the current implementation that bears mentioning. This script will not work if your hosted service assets are stored in a folder that shares a name with a route on your root domain.  For example, if you're serving assets from the root directory of your hosted service, any request made to the bobtopia.com home page will be masked by these asset requests, and the home page won't load.

The solution here involves modifying the blog assets block to handle asset requests without using paths.  I'll leave it to the reader to solve this, but a more general solution might involve changing myBlog.assetPathnames to myBlog.assetFileExtensions, which is a list of all asset file extensions (like .png and .css).  Then, the assets block would handle requests that contain assetFileExtensions instead of assetPathnames.

Conclusion

Bob is now enjoying the same SEO advantages as Alice after converting his subdomains to subdirectories using Cloudflare Workers.  Bobs of the world, rejoice!


Interested in deploying a Cloudflare Worker without setting up a domain on Cloudflare? We’re making it easier to get started building serverless applications with custom subdomains on workers.dev. If you’re already a Cloudflare customer, you can add Workers to your existing website here.

Reserve a workers.dev subdomain


Cloudflare's connectivity cloud protects entire corporate networks, helps customers build Internet-scale applications efficiently, accelerates any website or Internet application, wards off DDoS attacks, keeps hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you're looking for a new career direction, check out our open positions.
Cloudflare WorkersSEONGINXServerlessDeveloper PlatformDevelopers

Follow on X

Cloudflare|@cloudflare

Related posts

October 31, 2024 1:00 PM

Moving Baselime from AWS to Cloudflare: simpler architecture, improved performance, over 80% lower cloud costs

Post-acquisition, we migrated Baselime from AWS to the Cloudflare Developer Platform and in the process, we improved query times, simplified data ingestion, and now handle far more events, all while cutting costs. Here’s how we built a modern, high-performing observability platform on Cloudflare’s network. ...