Including and Excluding Content by URL

Whitelist and blacklist rules tell the Swiftype Crawler which parts of your domain to include or exclude. To configure these rules, visit the Manage Domain page of your Swiftype dashboard. You can also control which pages get indexed with a robots.txt file.

As you type a whitelist or blacklist rule, you will see a sample of URLs that will be affected.

Whitelist - Including only certain paths

Whitelist rules allow you to specify which parts of your domain you want the Swiftype Crawler to index. If you add rules to the whitelist, the Swiftype Crawler will include only the parts of your domain that match these rules. If the whitelist is empty, Swiftype will include every page on your domain that is not excluded by the blacklist or your robots.txt file.

Whitelist Options

| Option | Description | Example |
| --- | --- | --- |
| begin with | Include URLs that begin with this text. | Setting this to /doc would include only paths like /documents and /doctors, and would exclude paths like /down or /help unless another whitelist rule includes them. |
| contain | Include URLs that contain this text. | Setting this to doc would include paths like /example/docs/ and /my-doctor. |
| end with | Include URLs that end with this text. | Setting this to docs would include paths like /example/docs and /docs but exclude paths like /docs/example. |
| match regex | Include URLs that match a regular expression. Advanced users only. | Setting this to /archives/\d+/\d+ would include paths like /archives/2012/07 and /archives/123/9 but exclude paths like /archives/december-2009. |
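
The four options can be approximated locally to sanity-check a rule before entering it in the dashboard. The following is a minimal sketch, not Swiftype's implementation: the function name `matches_rule` is hypothetical, and it assumes that rules are compared against the URL path and that a regex may match anywhere in the path (unanchored), neither of which this page confirms.

```python
import re


def matches_rule(path: str, option: str, value: str) -> bool:
    """Return True if `path` matches a rule of the given option type.

    Hypothetical helper; assumes unanchored regex matching on the path.
    """
    if option == "begin with":
        return path.startswith(value)
    if option == "contain":
        return value in path
    if option == "end with":
        return path.endswith(value)
    if option == "match regex":
        # re.search finds the pattern anywhere in the path (assumption).
        return re.search(value, path) is not None
    raise ValueError(f"unknown option: {option!r}")


# Examples taken from the table above.
print(matches_rule("/documents", "begin with", "/doc"))
print(matches_rule("/docs/example", "end with", "docs"))
print(matches_rule("/archives/2012/07", "match regex", r"/archives/\d+/\d+"))
```

The sample of affected URLs shown in the dashboard as you type remains the authoritative check; this sketch only mirrors the behavior described in the table.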

Blacklist - Excluding certain paths

Blacklist rules allow you to tell the Swiftype Crawler not to index parts of your domain. The rules you create in the blacklist will be applied to everything allowed by the whitelist rules. If there is no whitelist, everything on your domain is assumed to be allowed.
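
The order of evaluation described above (whitelist first, then blacklist on whatever the whitelist allows) can be sketched as follows. This is an illustration under stated assumptions, not the crawler's actual code: `is_indexed` and the `Rule` type are hypothetical names, rules are modeled as plain predicates on the URL path, and robots.txt handling is left out.

```python
from typing import Callable, Iterable

# A rule is modeled as any function that takes a path and returns True on a match.
Rule = Callable[[str], bool]


def is_indexed(path: str,
               whitelist: Iterable[Rule] = (),
               blacklist: Iterable[Rule] = ()) -> bool:
    """Decide whether `path` would be indexed (robots.txt is ignored here)."""
    whitelist = list(whitelist)
    # With no whitelist rules, everything is assumed to be allowed.
    allowed = any(rule(path) for rule in whitelist) if whitelist else True
    # Blacklist rules apply to everything the whitelist allowed.
    return allowed and not any(rule(path) for rule in blacklist)


# Whitelist: begin with /doc. Blacklist: end with docs.
whitelist = [lambda p: p.startswith("/doc")]
blacklist = [lambda p: p.endswith("docs")]

for path in ["/documents", "/docs", "/down"]:
    print(path, is_indexed(path, whitelist, blacklist))
```

Here /documents is indexed, /docs is whitelisted but then removed by the blacklist, and /down never passes the whitelist. With an empty whitelist, /down would be indexed.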

Blacklist Options

| Option | Description | Example |
| --- | --- | --- |
| begin with | Exclude URLs that begin with this text. | Setting this to /doc would exclude paths like /documents and /docs/examples but would allow paths like /down. |
| contain | Exclude URLs that contain this text. | Setting this to doc would exclude paths like /example/docs/ and /my-doctor. |
| end with | Exclude URLs that end with this text. | Setting this to docs would exclude paths like /example/docs and /docs but allow paths like /docs/example. |
| match regex | Exclude URLs that match a regular expression. Advanced users only. | Setting this to /archives/\d+/\d+ would exclude paths like /archives/2012/07 but allow paths like /archives/december-2009. Be careful with regex exclusions because you can easily exclude more than you intended. |
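
One way a regex exclusion can catch more than intended is when the pattern is allowed to match anywhere in the path. The sketch below compares a loose pattern with an anchored one on a few sample paths. It assumes unanchored matching (Python's `re.search`), which this page does not confirm for Swiftype; the helper `excluded_paths` and the sample paths other than those in the table are made up for illustration.

```python
import re


def excluded_paths(pattern: str, paths: list[str]) -> list[str]:
    """Return the paths that a regex blacklist rule would exclude.

    Hypothetical helper; assumes the pattern may match anywhere in the path.
    """
    return [p for p in paths if re.search(pattern, p)]


sample = [
    "/archives/2012/07",
    "/archives/2012/07/photos",
    "/blog/archives/123/9",
    "/archives/december-2009",
]

# Loose pattern: also catches deeper pages and other sections.
print(excluded_paths(r"/archives/\d+/\d+", sample))

# Anchored pattern: only paths that are exactly /archives/<digits>/<digits>.
print(excluded_paths(r"^/archives/\d+/\d+$", sample))
```

Under these assumptions the loose pattern excludes the first three sample paths, while the anchored one excludes only /archives/2012/07. Checking the sample of affected URLs in the dashboard shows how the crawler itself interprets the pattern.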