PDF Crawling

Pro and Premium plans can index PDFs up to 10MB in size.
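To catch files that exceed the size limit before a crawl, a local pre-flight check along these lines may help. This is a minimal sketch, not part of the Crawler: the function name `oversized_pdfs` is hypothetical, and it assumes "10MB" means 10 * 1024 * 1024 bytes, which this page does not specify.

```python
import os

# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes. The limit may be
# decimal (10,000,000 bytes) instead; adjust if your results disagree.
MAX_PDF_BYTES = 10 * 1024 * 1024


def oversized_pdfs(root, limit=MAX_PDF_BYTES):
    """Return sorted paths of .pdf files under `root` larger than `limit` bytes."""
    too_big = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Match the extension case-insensitively (.pdf, .PDF, ...).
            if name.lower().endswith(".pdf"):
                path = os.path.join(dirpath, name)
                if os.path.getsize(path) > limit:
                    too_big.append(path)
    return sorted(too_big)
```

Calling `oversized_pdfs("public/")` on a local copy of your site lists the PDFs that would likely be skipped under that assumption.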

The PDF URLs need to be discoverable within your site’s HTML pages or included in a sitemap.
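If your PDFs are not linked from any HTML page, listing them in a sitemap is one way to make them discoverable. The sketch below builds a minimal sitemap in the standard sitemaps.org format using only the Python standard library; the helper name `build_sitemap` and the example URL are illustrative assumptions, not something this product provides.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes special characters such as "&" in the URL.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical PDF URL, for illustration only.
    print(build_sitemap(["https://example.com/docs/guide.pdf"]))
```

Save the output as a sitemap file on your site so that the PDF URLs can be found even when no page links to them.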

The Crawler can extract text from:

  1. The body of the PDF document.
  2. Any values within the PDF file's standard metadata fields:
    • title
    • author
    • subject
    • keywords

By default, the Crawler will try to flatten all the content of the PDF into a body text field.

Images and OCR are not supported. Custom and non-standard fonts can be embedded in the PDF file.

If you'd like more flexibility, please contact support and ask about PDF Extraction Rules in our Premium plan.
