RegEx in SEO

Ideas from this article were originally published in the Women in Tech SEO Knowledge Hub.

As SEOs, we often need to analyze large sets of search data to identify high-performing areas and improvement opportunities that will boost our SEO strategy.

Luckily for us, there are shortcuts (such as RegEx) that we can use to identify patterns and extract data more accurately.

In this article, I’ll be focusing entirely on the use of RegEx in SEO and some of its use cases, such as:

  1. Data filtering in Google Search Console
  2. Report building in Looker Studio (formerly Google Data Studio)
  3. RegEx formula creation via ChatGPT prompts

Happy learning!

What is RegEx?

As Dan Taylor perfectly describes in his Regex for SEO article for Search Engine Journal, “Regular expressions, or ‘regex’, are like an in-line programming language for text searches that allow you to include complex search strings, partial matches and wildcards, case-insensitive searches, and other advanced instructions.”

In other words, RegEx is a series of characters that represent a pattern, and it is one of the most useful tools used to match, manage, or filter texts. A pattern can be a phone number, a URL, a search query, or even an identifier such as a product reference.

Here is an example of a RegEx string:

(?i)\b(what|where|when|how|who)\b

This RegEx string can be used to identify all question queries in Google Search Console.
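To see what this pattern does before pasting it into a filter, it can be tried on a handful of queries. The snippet below is a minimal sketch: the sample queries are invented for illustration, and Python's re module is used as a stand-in for the RE2 engine that Google Search Console runs.

```python
import re

# Pattern from the article. (?i) makes it case-insensitive and \b marks
# word boundaries, so "what" matches but "whatsapp" does not.
QUESTION_PATTERN = re.compile(r"(?i)\b(what|where|when|how|who)\b")

# Hypothetical queries, invented for illustration.
queries = [
    "how to write a meta description",
    "What is a canonical tag",
    "seo audit checklist",
    "whatsapp web login",
]

# search() looks for the pattern anywhere in the query (partial match).
question_queries = [q for q in queries if QUESTION_PATTERN.search(q)]
print(question_queries)
```

Only the first two queries should be kept, because they contain a question word as a whole word.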

Basic RegEx Operators
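As a quick reference, here is a minimal sketch of the most common operators, each with one string that matches and one that does not. The sample strings are made up for illustration, and Python's re module is assumed as the test engine; these basic operators behave the same way in the RE2 syntax used by Google tools.

```python
import re

# Each entry: (operator, example pattern, text that matches, text that does not).
examples = [
    (".",   r"s.o",     "seo",      "so"),        # any single character
    ("*",   r"se*o",    "seeeo",    "sao"),       # zero or more of the previous token
    ("+",   r"se+o",    "seo",      "so"),        # one or more of the previous token
    ("?",   r"colou?r", "color",    "colr"),      # previous token is optional
    ("|",   r"seo|ppc", "ppc",      "smm"),       # either alternative
    ("^",   r"^seo",    "seo tips", "best seo"),  # start of the string
    ("$",   r"seo$",    "best seo", "seo tips"),  # end of the string
    ("[]",  r"[0-9]",   "top 10",   "top ten"),   # any character in the set
    ("\\b", r"\bseo\b", "seo tips", "seoul"),     # word boundary
]

for operator, pattern, hit, miss in examples:
    # Fail loudly if an example does not behave as described.
    assert re.search(pattern, hit), (operator, hit)
    assert not re.search(pattern, miss), (operator, miss)
    print(f"{operator:>3}  {pattern:<10} matches {hit!r} but not {miss!r}")
```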

How do I use Regex for SEO?

1. Filter Keywords in Google Search Console (Brand vs Non-Brand)

Regex is a great method to filter keywords in Google Search Console. One of my favourite use cases is to filter Branded & Non-Branded keywords. By using Regex, you can include multiple variations of your branded searches, such as brand name, partial brand name, abbreviations, or typos.

For example: .*domain name.*|.*domain.*|.*name.*|.*dm.*

Each variation is separated by a pipe (|), which means “or”.

Step 1: Go to Google Search Console – Performance – Query

Step 2: Click on Query – Custom (regex)

Step 3: Matches regex (for Branded Terms) & Doesn’t match regex (for Non-Brand)
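The same brand split can be rehearsed offline on an exported query list. This is a minimal sketch, assuming the brand variations are joined with | as alternatives; the queries are hypothetical and Python's re module stands in for the Search Console regex engine.

```python
import re

# Brand variations joined with | ("or"). The variations themselves are
# placeholders; swap in your own brand name, abbreviations and typos.
BRAND_PATTERN = re.compile(r".*domain name.*|.*domain.*|.*name.*|.*dm.*")

# Hypothetical queries, invented for illustration.
queries = [
    "domain name login",
    "domainname reviews",
    "dm discount code",
    "best running shoes",
    "cheap laptops",
]

# "Matches regex" corresponds to branded, "Doesn't match regex" to non-branded.
branded = [q for q in queries if BRAND_PATTERN.search(q)]
non_branded = [q for q in queries if not BRAND_PATTERN.search(q)]

print("Branded:", branded)
print("Non-branded:", non_branded)
```

Short variations such as dm match anywhere inside a query, so it is worth scanning the branded list for false positives before relying on the split.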

2. Filter URL patterns in Google Search Console

Regex can be used in Google Search Console to filter URL patterns, such as articles from the same category or subcategory, only category pages, articles containing one specific name, etc. One of my favourite use cases is to filter out everything except category pages.

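For example: ^https://domainname\.com/[^/]+/$

Here [^/]+ stands for exactly one folder name, so URLs that sit deeper in the site structure are excluded.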

Step 1: Go to Google Search Console – Performance – Page

Step 2: Click on Page  – Custom (regex)

Step 3: Matches regex
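A category-only filter can be checked against a few URLs first. The sketch below assumes category pages sit exactly one folder below the root and end with a trailing slash; the URLs are hypothetical, and the pattern is anchored with ^ and $ and uses [^/]+ so that deeper pages are not picked up.

```python
import re

# ^ and $ anchor the whole URL; [^/]+ matches one folder name without slashes.
CATEGORY_PATTERN = re.compile(r"^https://domainname\.com/[^/]+/$")

# Hypothetical URLs, invented for illustration.
urls = [
    "https://domainname.com/",
    "https://domainname.com/shoes/",
    "https://domainname.com/shoes/running/",
    "https://domainname.com/shoes/running/blue-runner",
]

category_urls = [u for u in urls if CATEGORY_PATTERN.search(u)]
print(category_urls)
```

Without the anchors and with .* in place of [^/]+, the subcategory and product URLs would match as well.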

3. Create Custom Reports in Looker Studio (ex-Google Data Studio)

Regular expressions in Looker Studio are commonly used for creating more flexible filters or custom reports.

One of my favourite use cases is filtering data by Page Type, such as ‘Category’, ‘Subcategory’, ‘Product’.

For example:

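CASE
  WHEN REGEXP_MATCH(Landing Page, "https://domainname.com/[^/]+/[^/]+/[^/]+/[^/]+") THEN "Article"
  WHEN REGEXP_MATCH(Landing Page, "https://domainname.com/[^/]+/") THEN "Category"
  WHEN REGEXP_MATCH(Landing Page, "https://domainname.com/[^/]+/[^/]+/") THEN "Subcategory"
  WHEN REGEXP_MATCH(Landing Page, "https://domainname.com/$") THEN "Home Page"
  ELSE "Other"
END

Use straight quotes in the formula, and [^/]+ rather than .* for each folder level: .* also matches slashes, so the “Category” rule would otherwise catch subcategory URLs before the “Subcategory” rule is reached.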
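The page-type logic of the formula above can be prototyped in Python before building the calculated field. This is a sketch under two assumptions: each page type corresponds to a fixed folder depth, and REGEXP_MATCH requires the whole value to match, which re.fullmatch imitates here. The PAGE_TYPES list and page_type function are hypothetical helpers, not part of Looker Studio.

```python
import re

# (label, pattern) pairs checked in order; [^/]+ stands for exactly one folder.
PAGE_TYPES = [
    ("Home Page",   r"https://domainname\.com/"),
    ("Category",    r"https://domainname\.com/[^/]+/"),
    ("Subcategory", r"https://domainname\.com/[^/]+/[^/]+/"),
    ("Article",     r"https://domainname\.com/[^/]+/[^/]+/[^/]+/[^/]+"),
]

def page_type(url: str) -> str:
    """Return the first label whose pattern matches the entire URL."""
    for label, pattern in PAGE_TYPES:
        if re.fullmatch(pattern, url):
            return label
    return "Other"

# Hypothetical landing pages, invented for illustration.
for url in [
    "https://domainname.com/",
    "https://domainname.com/shoes/",
    "https://domainname.com/shoes/running/",
    "https://domainname.com/blog/seo/regex/regex-for-seo",
    "https://domainname.com/about",
]:
    print(page_type(url), url)
```

Because every pattern pins down the folder depth, the order of the rules no longer affects the result.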

In order to create a Custom Field in Looker Studio, you have to:

1. Add Data Source

2. Data – Add a field

3. Add calculated field

4. Name your custom field, for example: Category

  • Add Regex formula

5. Save and Finish

Other Regex SEO Use cases include:

  • Creating advanced redirect rules
  • Filtering data and creating goals in Google Analytics

ChatGPT prompts for RegEx formulas:

ChatGPT is a very useful tool for creating small pieces of code or formulas for SEO purposes. Creating RegEx formulas is one of my favourite use cases for it.

Google Search Console Question queries

ChatGPT prompt to use:

Create a regex for Google Search Console that can find all search queries that include the words What, Where, When, How, Who

Result: (?i)\b(what|where|when|how|who)\b

Google Search Console Category URLs

ChatGPT prompt to use:

Create a regex for Google Search Console that can find all Category URLs. Here is the URL structure of my domain: www.domain.com/category/subcategory/product

Result: ^https:\/\/www\.domain\.com\/category\/[^\/]+\/[^\/]+$
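A generated pattern is worth running against a few known URLs before it goes into a filter. The sketch below does that for the result above, using hypothetical URLs and Python's re module as a stand-in for the Search Console engine. It shows that this pattern treats category as a literal folder name and expects exactly two more path segments after it.

```python
import re

# Result returned for the prompt above, copied as-is.
GENERATED_PATTERN = re.compile(r"^https:\/\/www\.domain\.com\/category\/[^\/]+\/[^\/]+$")

# Hypothetical URLs, invented for illustration.
urls = [
    "https://www.domain.com/category/shoes/runner-x",
    "https://www.domain.com/category/shoes",
    "https://www.domain.com/category/shoes/runner-x/",
    "https://www.domain.com/shoes/",
]

# Map each URL to True/False so mismatches are easy to spot.
results = {u: bool(GENERATED_PATTERN.search(u)) for u in urls}
for url, matched in results.items():
    print("match   " if matched else "no match", url)
```

If the matches are not the pages you meant by “category”, refine the prompt with real example URLs and test again.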

Google Search Console Short-Tail Keywords

ChatGPT prompt to use:

Create a regex for Google Search Console that can find all search queries between 1 and 4 words.

Result: ^\b\w+\b(\s\b\w+\b){0,3}$
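The word-count pattern can be checked the same way. This is a minimal sketch with made-up queries, again assuming Python's re module as a stand-in for the Search Console engine: the first \w+ matches one word and the group (\s\b\w+\b){0,3} allows up to three more.

```python
import re

# Result returned for the prompt above: one word plus zero to three more.
SHORT_TAIL_PATTERN = re.compile(r"^\b\w+\b(\s\b\w+\b){0,3}$")

# Hypothetical queries, invented for illustration.
queries = [
    "seo",
    "regex for seo",
    "how to use regex",
    "how to use regex in google search console",
]

short_tail = [q for q in queries if SHORT_TAIL_PATTERN.search(q)]
print(short_tail)
```

The eight-word query is the only one left out.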

Key Takeaways

RegEx is a powerful tool for SEOs due to its versatile applicability. We can use RegEx for Keyword Research, Competitor Analysis, Planning, and, most importantly, Reporting. Also, RegEx is supported by key SEO tools such as Google Search Console and Google Analytics, making our lives easier and our processes faster.

Plus, with the rise of AI tools, we can create functional RegEx formulas quicker and more efficiently, while still ensuring they capture the essence of our needs!

More on this topic on the Women in Tech SEO Knowledge Hub

Check out my article for the Women in Tech SEO Knowledge Hub: RegEx for SEOs: ready-to-implement use cases