
Canonical Element Crawling

Canonical URLs inform search engines how to handle duplicate content.

Including a canonical link element within the <head></head> tags of your webpage declares that URL the "source of truth":

<link rel="canonical" href="https://example.com/tea/peppermint" />

These tags tell search engine crawlers that:

"The original source of this content is: https://example.com/tea/peppermint."

Thus, if the content appears anywhere else, search engines will grant authority to the correct page.

The Site Search Crawler will obey these tags, too.

Be warned: an incorrect implementation will prevent your pages from being indexed.

Two common cases are:

Canonical link elements should include the precise URI: https://example.com/tea/peppermint.

For example, your homepage URI is https://example.com.

A common mistake is to include the homepage URL as the canonical link element on every page.

The crawler will follow the link and assume there is only one page - https://example.com - and the other pages will not be indexed.

This is incorrect!

Each page should have its own unique URI.

The URL seen when you browse the page must match the one within the canonical URL.
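One way to catch the homepage mistake before the crawler does is to compare each page's URL against the canonical link element it declares. The following is a minimal sketch using only the Python standard library; the names CanonicalFinder and canonical_matches are hypothetical and not part of Site Search, and the comparison is a strict string match with no URL normalization.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link element is recorded.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def canonical_matches(page_url, html):
    """Return True when the page declares its own URL as canonical.

    Hypothetical helper: strict string comparison, so a trailing
    slash or a different scheme counts as a mismatch.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical == page_url
```

Calling canonical_matches("https://example.com/tea/peppermint", html) on a page whose canonical link element points at https://example.com would return False, which flags the page for review.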

Redirect loops

If you have set up redirects, be sure that your canonical URLs do not contradict your redirects.

For example, your pages have https://example.com/tea/peppermint/ as the canonical link element.

But the page is set to redirect to https://example.com/tea/peppermint.

Note the absence of the trailing slash: /peppermint/ vs. /peppermint.

The crawler will become stuck in a loop.

It requests ../peppermint, the canonical link element directs it to ../peppermint/, the redirect sends it back to ../peppermint, and so on, again and again...

And it will be stuck in this loop until it gives up.
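The contradiction described above can be checked offline if you already know your redirect rules and canonical URLs. The sketch below is a simplified model, not a description of how the Site Search Crawler works internally: find_loop, the redirects mapping, the canonicals mapping, and max_hops are all hypothetical names chosen for illustration.

```python
def find_loop(start_url, redirects, canonicals, max_hops=10):
    """Follow redirects and canonical hints from start_url.

    redirects:  dict mapping a URL to the URL it redirects to.
    canonicals: dict mapping a URL to the canonical URL its page declares.
    Returns the chain of visited URLs if one repeats, otherwise None.
    """
    visited = [start_url]
    current = start_url
    for _ in range(max_hops):
        # Assumption: a redirect is applied first; otherwise the
        # canonical link element decides where to go next.
        next_url = redirects.get(current) or canonicals.get(current)
        if next_url is None or next_url == current:
            return None  # settled on a stable page, no loop
        if next_url in visited:
            return visited + [next_url]  # the same URL came up again
        visited.append(next_url)
        current = next_url
    return None


# The example from the text: canonical has a trailing slash,
# but that URL redirects to the version without it.
redirects = {
    "https://example.com/tea/peppermint/": "https://example.com/tea/peppermint",
}
canonicals = {
    "https://example.com/tea/peppermint": "https://example.com/tea/peppermint/",
}
loop = find_loop("https://example.com/tea/peppermint", redirects, canonicals)
```

Here loop holds the chain ../peppermint, ../peppermint/, ../peppermint, which shows the two settings contradicting each other. Changing the canonical to the URL without the trailing slash makes find_loop return None.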


Stuck? Looking for help? Contact support or check out the Site Search community forum!