Robots.txt Support
The Site Search Crawler supports the features of the robots.txt file standard and will respect all of its rules.
A robots.txt file is not required for Site Search to function, but it can help direct the crawler where you do or do not want it to go.
Disallow the Crawler
The robots.txt file can exclude portions of your site from Site Search by disallowing access to the Swiftbot user agent.
Careful! If your robots.txt disallows content that has already been crawled, that content will remain in your Engine but will no longer be updated!
See Troubleshooting: Removing Documents if you run into that scenario.
The example below keeps Swiftbot out of the /mobile/ path and blocks all other crawlers from the entire site:
User-agent: Swiftbot
Disallow: /mobile/
User-agent: *
Disallow: /
Allow the Crawler
Use the Disallow rule to permit Swiftbot into places which you do not want other crawlers to go. This is useful when you want to block other User-agents, like those belonging to major search engines, while still allowing Swiftbot: specifying a User-agent overrides the wildcard (*). In the example below, the empty Disallow rule gives Swiftbot access to the whole site, while all other crawlers are blocked:
User-agent: Swiftbot
Disallow:
User-agent: *
Disallow: /
The next example allows Swiftbot to crawl everything except the /documentation/ path, while disallowing any other User-agent access to all pages:
User-agent: Swiftbot
Disallow: /documentation/
User-agent: *
Disallow: /
Control the Crawler
You can control the rate at which the Crawler accesses your website by using the Crawl-delay directive with a number indicating seconds.
A crawl is web traffic, so limiting it can reduce bandwidth. Limiting it too much, however, can slow the uptake of new documents!
A Crawl-delay of 5 seconds allows at most 17,280 crawls per day (86,400 seconds in a day divided by 5).
User-agent: Swiftbot
Crawl-delay: 5
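Crawl-delay can also sit alongside Disallow rules in the same Swiftbot group. A minimal sketch, reusing the /mobile/ path from the earlier example:
User-agent: Swiftbot
Crawl-delay: 5
Disallow: /mobile/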
For fine-grained control over how your pages are indexed, you can configure Meta Tags. We even support robots Meta Tags.
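For example, a standard robots Meta Tag placed in a page's <head> looks like this; the noindex value asks crawlers not to add the page to their index:
<meta name="robots" content="noindex">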