Swiftype Search Engine Schema Design

Understanding how to design a search schema is important for a successful Swiftype API integration. You can add fields to your schema as your app changes, but you can't change the types of existing fields or remove fields (you may stop using unneeded fields, but they are still there).

Suggest versus search queries

Before choosing your field types, consider the difference between suggest queries and search queries.

Suggest queries match prefixes of terms, allowing for autocompletion of search terms. For example, if you have a Document with a field containing "Autocomplete Example", suggest queries for "aut", "auto", "autoc", and so forth will match "Autocomplete".
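
To make the prefix behavior concrete, the following shell sketch mimics it locally. It is only an illustration and not Swiftype's matching code: suggest_matches is a hypothetical helper, and both sides are lower-cased on the assumption (implied by the example above) that matching ignores case.

```shell
# Hypothetical helper: succeeds when any word in the field value
# starts with the query. Not Swiftype's implementation.
suggest_matches() {
  query=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  value=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  for term in $value; do
    case "$term" in
      "$query"*) return 0 ;;
    esac
  done
  return 1
}

# "aut", "auto" and "autoc" are prefixes of "autocomplete";
# "complete" is not a prefix of either word in the field.
for q in aut auto autoc complete; do
  if suggest_matches "$q" "Autocomplete Example"; then
    echo "$q: match"
  else
    echo "$q: no match"
  fi
done
```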

Search queries match complete terms and prefixes. If you have a large body text field, such as the content from a PDF, it will be matched via search but not via suggest.

When designing your schema, consider which fields you want to match for autocompletion and which should match for full-text search. For example, the title of an article is a good candidate for autocompletion, but the body text is not. Depending on your application, you may or may not want the author of the article's name to be autocompleted. Swiftype gives you the flexibility to design your search experience so that your application works how you want it to.

Field type overview

The key distinguishing feature between Swiftype's field types is whether they are used for searching or not. Textual fields (string and text) can be searched. The other field types are used to filter results, change the relevance of results (with functional boosts), change the sorting, or provide faceted counts of results.

Field Type Use Cases

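| Type     | Search | Optimized for Autocomplete | Functional Boosts | Filtering | Sorting | Facets |
|----------|--------|----------------------------|-------------------|-----------|---------|--------|
| string   | Yes    | Yes                        | No                | Yes       | Yes     | Yes    |
| text     | Yes    | No                         | No                | No        | No      | No     |
| enum     | Yes    | No                         | No                | Yes       | Yes     | Yes    |
| integer  | Yes    | No                         | Yes               | Yes       | Yes     | Yes    |
| float    | Yes    | No                         | Yes               | Yes       | Yes     | Yes    |
| date     | No     | No                         | No                | Yes       | Yes     | Yes    |
| location | No     | No                         | No                | Yes       | No      | No     |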

string fields

Short textual fields like title or headers are appropriate for the string field type, which is used for both suggest and search queries. Structured data, such as a database ID or a URL, is not appropriate for string fields (use enum instead, see below). Textual fields longer than a few hundred characters should use the text type (see below).

A match in a string field for a full-text search will be returned in the highlights.

string fields cannot be used for functional boosts.

text fields

For longer fields (like the body text of an article), use the text field type.

text fields are not used for prefix matching so they will not match autocomplete queries, but they are used in full-text search. A match in a text field will be returned in the highlights.

text fields cannot be used for filtering, sorting, functional boosts, or faceting.

enum fields

To store bits of text that should not be processed by the search engine (for example, URLs or email addresses), use the enum field type.

Each enum value is treated as a single piece of data. It is not tokenized or analyzed. For example, an enum of "AppleCart" will not be lower-cased or split on case changes.
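
As a rough picture of what "not tokenized or analyzed" means, the sketch below contrasts a hypothetical analysis step (split on case changes, then lower-case) with the untouched enum value. The analyze function is an assumption made for illustration; this tutorial does not describe Swiftype's actual analyzer.

```shell
# Hypothetical analysis step: insert a space at each lower-to-upper
# case change, then lower-case the result. Illustration only.
analyze() {
  printf '%s\n' "$1" \
    | sed 's/\([a-z]\)\([A-Z]\)/\1 \2/g' \
    | tr '[:upper:]' '[:lower:]'
}

value="AppleCart"
echo "analyzed text: $(analyze "$value")"   # analyzed text: apple cart
echo "enum value:    $value"                # enum value:    AppleCart
```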

You can use enum fields to filter data and for faceting. You can also sort by enum fields, but be aware that the sort is by string comparison. For example, "apple" will sort before "bear", but "100" will sort before "99" because the first character of "100" is less than the first character of "99". If you need numerical sorting, use an integer or float field instead.
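
The difference between the two orderings can be reproduced locally with the standard sort utility. This is a sketch of string versus numeric comparison in general, not of Swiftype's sorting code; string_sort and numeric_sort are hypothetical helper names.

```shell
# Plain string (byte-wise) comparison, the kind of ordering described for enum fields.
string_sort() { printf '%s\n' "$@" | LC_ALL=C sort; }

# Comparison by numeric value, the kind of ordering expected from integer or float fields.
numeric_sort() { printf '%s\n' "$@" | LC_ALL=C sort -n; }

string_sort 99 100 bear apple    # prints 100, 99, apple, bear (one per line)
numeric_sort 99 100              # prints 99, 100 (one per line)
```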

Additionally, enum fields are returned for full-text and autocomplete queries if they match exactly. For fuzzy matches, use a string field instead.

One special enum field is the external_id which ties the Document in Swiftype's system to an object in your system. All Swiftype documents have an external_id. You do not have to define it in your schema.

Numeric fields (integer and float)

To store numbers, use the integer or float field type. Numeric fields are appropriate for data you may want to use in scoring or filtering, but not searching. Examples include the number of "Likes" on a post or the average review score for a product.

Numeric fields (integer and float) are not used for autocomplete or full-text search, but they can be used for filtering (including by range), functional boosts, sorting, and faceting.

date fields

To store dates, use a date field. For example, you could store an article's publication date and allow users to search for only articles published in the last 30 days with a range filter.

date fields are not used for autocomplete or searching but can be used for filtering (including by range), sorting, and faceting. When sent to the API, dates must be in ISO 8601 format (for example, "2013-02-27T18:09:19"). We recommend using UTC representations for dates.
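
One way to produce such timestamps from a shell script is sketched below. The variable names are hypothetical, the filter fragment follows the range filter shape used in the example queries later in this tutorial, and the date arithmetic relies on non-POSIX options (GNU date first, BSD/macOS date as a fallback).

```shell
# Current time in UTC, formatted like "2013-02-27T18:09:19".
now_utc=$(date -u +%Y-%m-%dT%H:%M:%S)

# Timestamp 30 days ago. Try the GNU syntax, then the BSD/macOS syntax.
since=$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S 2>/dev/null \
  || date -u -v-30d +%Y-%m-%dT%H:%M:%S)

# Range filter fragment for a date field named published_at.
filter=$(printf '{"published_at": {"type": "range", "from": "%s"}}' "$since")

echo "$now_utc"
echo "$filter"
```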

location fields

To store the geographic location of a Document, use a location field. Using a location field allows filtering by distance from a specified point. For example, a store could have a location field and users could search for stores near their location.

The location field type can be used only for filtering by location. It is not used in autocomplete or full-text search. The location is specified using a JSON object containing the longitude and latitude, for example {"lat": 56.2,"lon": 44.7}.
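
As a sketch, a location value can be wrapped in the same name/value/type field entry that the Document example later in this tutorial uses for other types. The field name "location" and the coordinates are placeholders, and the snippet only builds the JSON without sending it anywhere.

```shell
# Placeholder coordinates taken from the example above.
lat=56.2
lon=44.7

# Field entry in the name/value/type shape used for Document fields.
location_field=$(printf '{"name": "location", "value": {"lat": %s, "lon": %s}, "type": "location"}' "$lat" "$lon")

echo "$location_field"
```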

Multi-valued fields

Multi-valued fields are useful for storing fields like tags or categories, with more than one distinct value. You cannot mix multiple types in the same field (for example, an integer and a string cannot be stored in the same field).

Multi-valued fields are transparent in the search and suggest API calls. If the field type is searchable (string and text), multi-valued fields can be searched. If the field type is sortable, they can be sorted on, and so on.

To specify multiple values for a tag, simply pass a JSON array of the values, for example ["ruby", "rails", "json", "programming"].
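
If the values live in shell variables, a small helper can assemble the array. The json_array function below is hypothetical and deliberately minimal: it assumes the values contain no double quotes, backslashes, or control characters, so a real JSON tool is a better choice for untrusted input.

```shell
# Hypothetical helper: turn its arguments into a JSON array of strings.
# Assumes no value needs JSON escaping.
json_array() {
  out=""
  for v in "$@"; do
    out="${out:+$out, }\"$v\""
  done
  printf '[%s]' "$out"
}

# Multi-valued string field in the name/value/type shape used for Document fields.
tags_field=$(printf '{"name": "tags", "value": %s, "type": "string"}' \
  "$(json_array ruby rails json programming)")

echo "$tags_field"
```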

Schema design example

Let's say you are designing a schema for YouTube, and you want to search over the videos. Videos have properties like title, caption, length, and so on. You can view a complete list of attributes that a YouTube video has in the developer documentation.

The first step in schema design is determining which attributes you want to search, sort, and filter by. Keep in mind that you only need to store data in Swiftype that you want to search, sort, or filter. Swiftype is not a database!

For a YouTube video, we might want to store these attributes:

| Attribute             | Purpose                                                                              | Recommended Data Type |
|-----------------------|--------------------------------------------------------------------------------------|-----------------------|
| ID                    | identifies a unique video; links a record in your database to a Swiftype Document   | external_id           |
| URL                   | search results link                                                                  | enum                  |
| thumbnail URL         | display with search results                                                          | enum                  |
| channel ID            | filtering                                                                            | enum                  |
| title                 | autocomplete and full-text search                                                    | string                |
| caption               | full-text search                                                                     | text                  |
| tags                  | autocomplete and full-text search                                                    | string (multi-value)  |
| category name         | autocomplete and full-text search                                                    | string                |
| category ID           | filtering by category                                                                | enum                  |
| published at date     | filtering by date range                                                              | date                  |
| duration (in seconds) | filtering                                                                            | integer               |
| number of views       | filtering, functional boosts                                                         | integer               |
| number of likes       | functional boosts                                                                    | integer               |

Note how the schema contains both the category name as a string (for searching) and the category ID as an enum (for filtering).

Creating the schema

Now that you've decided on the fields to index and what types they should be, you can index content with the Swiftype API.

First, create the videos DocumentType to hold the Documents:

curl -X POST 'https://api.swiftype.com/api/v1/engines/youtube/document_types.json' \
  -H 'Content-Type: application/json' \
  -d '{
        "auth_token": "YOUR_API_KEY",
        "document_type": {"name": "videos"}
      }'

Next, create a Document in the videos DocumentType that conforms to the schema:

curl -X POST 'https://api.swiftype.com/api/v1/engines/youtube/document_types/videos/documents.json' \
  -H 'Content-Type: application/json' \
  -d '{
        "auth_token": "YOUR_API_KEY",
        "document": {
          "external_id": "v1uyQZNg2vE",
          "fields": [
            {"name": "url", "value": "http://www.youtube.com/watch?v=v1uyQZNg2vE", "type":  "enum"},
            {"name": "thumbnail_url", "value": "https://i.ytimg.com/vi/v1uyQZNg2vE/mqdefault.jpg", "type": "enum"},
            {"name": "channel_id", "value": "UCK8sQmJBp8GCxrOtXWBpyEA", "type": "enum"},
            {"name": "title", "value": "How It Feels [through Glass]", "type": "string"},
            {"name": "caption", "value": "Want to see how Glass actually feels?...", "type": "text"},
            {"name": "tags", "value": ["glass", "wearable computing", "google"], "type": "string"},
            {"name": "category_name", "value": "Science & Technology", "type": "string"},
            {"name": "category_id", "value": 28, "type": "enum"},
            {"name": "published_at", "value": "2013-02-20T10:47:18", "type": "date"},
            {"name": "duration", "value": 136, "type": "integer"},
            {"name": "view_count", "value": 14599202, "type": "integer"},
            {"name": "like_count", "value": 75952, "type": "integer"}
          ]
        }
      }'

It may seem strange that you define the fields in the Document instead of the DocumentType, but this is because Swiftype schemas are flexible. Individual documents in a DocumentType do not need to share all the same fields. You can add new fields over time simply by creating Documents that contain them. However, you must be careful to index fields as the same type as prior documents: string fields should not become enums and so forth.
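
Because nothing in the request itself stops a field from being sent with a different type, it can help to check types on your side before indexing. The sketch below is one possible local guard, not part of the Swiftype API: known_types and check_field_type are hypothetical, and the recorded types mirror the example schema above.

```shell
# Hypothetical local record of field types already indexed, as name:type pairs.
known_types="title:string caption:text category_id:enum view_count:integer"

# Succeeds if the field is new or its type matches the recorded one.
check_field_type() {
  for entry in $known_types; do
    if [ "${entry%%:*}" = "$1" ]; then
      if [ "${entry#*:}" = "$2" ]; then
        return 0
      else
        return 1
      fi
    fi
  done
  return 0
}

check_field_type title string && echo "title as string: ok"
check_field_type title enum || echo "title as enum: type conflict"
check_field_type rating float && echo "rating as float: ok (new field)"
```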

Example queries

Here are some examples of queries that are possible with this schema.

Find videos about cats and boost the score by the number of likes:

curl -X GET 'https://api.swiftype.com/api/v1/public/engines/search.json?engine_key=swiftype-api-example' \
  -H 'Content-Type: application/json' \
  -d '{
        "q": "cats",
        "document_types": ["videos"],
        "functional_boosts": {
          "videos": {
            "like_count": "linear"
          }
        }
      }'

Find videos in the Pets & Animals category sorted by number of views:

curl -X GET 'https://api.swiftype.com/api/v1/public/engines/search.json?engine_key=swiftype-api-example' \
  -H 'Content-Type: application/json' \
  -d '{
        "document_types": ["videos"],
        "filters": {"videos": {"category_id": "15"}},
        "sort_field": {"videos": "view_count"},
        "sort_direction": {"videos": "desc"}
      }'

Find recent videos over a minute in length with more than 1,000,000 views:

curl -X GET 'https://api.swiftype.com/api/v1/public/engines/search.json?engine_key=swiftype-api-example' \
  -H 'Content-Type: application/json' \
  -d '{
        "document_types": ["videos"],
        "filters": {
          "videos": {
            "published_at": {"type": "range", "from": "2013-02-01"},
            "view_count": {"type": "range", "from": 1000000},
            "duration": {"type": "range", "from": 60}
          }
        }
      }'

Try it yourself!

You can use the engine key swiftype-api-example to execute queries against some sample data from YouTube based on this example. The queries above really work!

Want to try out more with an engine you can read and write to? Use our swiftype-api-example script to create your own engine to play around with. In addition to the videos example shown above, the swiftype-api-example Engine includes Documents in multiple DocumentTypes so you can examine how those are indexed and searched as well.