The Site Search Crawler supports the features of the robots.txt file standard and will respect all of its rules.
A robots.txt file is not required for Site Search to function, but it can help direct the crawler where you do or do not want it to go.
Disallow the Crawler
The robots.txt file can exclude portions of your site from Site Search by disallowing access to the Swiftbot user agent.
Careful! If your robots.txt is set to disallow content that has already been crawled, it will stay in your Engine but no longer be updated!
See Troubleshooting: Removing Documents if you run into that scenario.
User-agent: Swiftbot
Disallow: /mobile/

User-agent: *
Disallow: /
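If you want to sanity-check how a standards-compliant parser reads rules like these before deploying them, the sketch below uses Python's standard urllib.robotparser module. This is only a local illustration, not part of Site Search; the example.com URLs and the OtherBot name are hypothetical placeholders.

from urllib.robotparser import RobotFileParser

# The same rules as above, pasted into a string for local testing.
robots_txt = """\
User-agent: Swiftbot
Disallow: /mobile/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Swiftbot is kept out of /mobile/ but may crawl other paths.
print(parser.can_fetch("Swiftbot", "https://example.com/mobile/app"))  # False
print(parser.can_fetch("Swiftbot", "https://example.com/blog/post"))   # True

# Every other crawler matches the wildcard group and is blocked everywhere.
print(parser.can_fetch("OtherBot", "https://example.com/blog/post"))   # False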
Allow the Crawler
You can also use the Disallow rule to permit Swiftbot access to places which you do not want other crawlers to go. A wildcard (*) applies a rule to all User-agents, like those belonging to major search engines. Specifying a User-agent overrides the wildcard (*) rule.

The following example allows Swiftbot to access every page while disallowing all other User-agents from the entire site:

User-agent: Swiftbot
Disallow:

User-agent: *
Disallow: /

Similarly, the next example allows Swiftbot everywhere except /documentation/ while still disallowing any other User-agent access to all pages:

User-agent: Swiftbot
Disallow: /documentation/

User-agent: *
Disallow: /
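To see that the Swiftbot group really does override the wildcard group, the same urllib.robotparser sketch (again only a local check with placeholder example.com URLs, not Swiftbot's own logic) can evaluate the second example:

from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("""\
User-agent: Swiftbot
Disallow: /documentation/

User-agent: *
Disallow: /
""".splitlines())

# Swiftbot follows its own group, not the wildcard group.
print(parser.can_fetch("Swiftbot", "https://example.com/pricing"))            # True
print(parser.can_fetch("Swiftbot", "https://example.com/documentation/api"))  # False

# Other crawlers fall back to the wildcard group and are blocked.
print(parser.can_fetch("Googlebot", "https://example.com/pricing"))           # False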
Control the Crawler
You can control the rate at which the Crawler accesses your website by using the Crawl-delay directive, whose value is the number of seconds the crawler should wait between requests.
A crawl is web traffic, so limiting it can reduce bandwidth. Limiting it too much, however, can slow the uptake of new documents!
For example, a Crawl-delay of 5 seconds limits the crawler to at most 17,280 requests per day (86,400 seconds in a day ÷ 5).
User-agent: Swiftbot
Crawl-delay: 5
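The arithmetic behind that figure, plus a quick check that a parser picks up the directive, can be sketched with the same Python standard library. This assumes Python 3.6+ for crawl_delay() and is only an illustration, not Swiftbot's own scheduler.

from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("""\
User-agent: Swiftbot
Crawl-delay: 5
""".splitlines())

delay = parser.crawl_delay("Swiftbot")    # 5 seconds between requests
requests_per_day = 24 * 60 * 60 // delay  # 86,400 / 5 = 17,280
print(delay, requests_per_day)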