Document Indexing
Add, delete, update, or retrieve documents from within your API-based Engine.
Using an API-based Engine forgoes the Site Search Crawler.
This document covers four indexing topics: API-based Engines, Bulk Indexing, Asynchronous Indexing, and Document Expiration.
Engine Type | Supported? |
---|---|
Crawler-based Engine | NO |
API-based Engine | YES |
Read more about API-based Engines and API resources within the API Overview.
API-based Engines
An API-based Engine provides manual control over document indexing.
Requests require your private API Key and your Engine Slug.
This section includes DocumentTypes and Documents.
DocumentTypes
DocumentTypes specify the structure of a set of documents and are the entry point for searches.
Add a DocumentType
Add a new DocumentType to your Engine.
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types.json' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"document_type": {"name": "books"}
}'
GET a DocumentType
List all DocumentTypes or list a specific DocumentType by id.
List all DocumentTypes
curl -X GET 'https://api.swiftype.com/api/v1/engines/bookstore/document_types.json?auth_token=YOUR_API_KEY'
List a specific DocumentType
curl -X GET 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books.json?auth_token=YOUR_API_KEY'
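The two list URLs above differ only in the optional DocumentType slug. As a minimal sketch (the helper name `document_types_url` is hypothetical and not part of any official client), they could be built in Python like this:

```python
from urllib.parse import quote, urlencode

# Base URL copied from the curl examples above.
API_ROOT = "https://api.swiftype.com/api/v1"


def document_types_url(engine_slug, api_key, document_type=None):
    """Return the URL that lists all DocumentTypes, or one DocumentType by slug."""
    path = f"{API_ROOT}/engines/{quote(engine_slug, safe='')}/document_types"
    if document_type is not None:
        path += f"/{quote(document_type, safe='')}"
    # The API key is passed as the auth_token query parameter, as in the curl calls.
    return f"{path}.json?{urlencode({'auth_token': api_key})}"


print(document_types_url("bookstore", "YOUR_API_KEY"))
print(document_types_url("bookstore", "YOUR_API_KEY", "books"))
```

The helper only builds the URL; sending the GET request is left to whichever HTTP client you prefer.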
Delete a DocumentType
Delete a DocumentType from your Engine.
curl -X DELETE 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books?auth_token=YOUR_API_KEY'
Documents
Documents represent all of the pieces of content in an Engine.
They conform to a DocumentType.
When you perform a search on a DocumentType, you will receive documents as results.
Create a document
Add a new document to your Engine, or create or update an existing one.
- external_id (required): Can be any value, such as a number or BSON id.
Create a new document
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents.json' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"document": {
"external_id": "2",
"fields": [
{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}
]}
}'
Create or update a document
If a document with the given external_id exists, it is updated.
If it does not, a new document is created.
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/create_or_update.json' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"document": {
"external_id": "2",
"fields": [
{"name": "title", "value": "my new title", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}
]}
}'
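The request bodies for both create endpoints share the same shape. The following sketch assembles that body in Python; the function names are hypothetical, and converting `external_id` to a string is an assumption that mirrors the curl examples rather than a documented requirement.

```python
import json


def build_document(external_id, fields):
    """Build a document object from {field_name: (value, field_type)} pairs."""
    return {
        # Assumption: external_id is sent as a string, as in the curl examples.
        "external_id": str(external_id),
        "fields": [
            {"name": name, "value": value, "type": field_type}
            for name, (value, field_type) in fields.items()
        ],
    }


def build_request_body(api_key, document):
    """Wrap a document in the JSON body used by the create endpoints."""
    return json.dumps({"auth_token": api_key, "document": document})


doc = build_document(2, {"title": ("my new title", "string"),
                         "page_type": ("user", "enum")})
print(build_request_body("YOUR_API_KEY", doc))
```

The resulting string can be sent as the POST body to either documents.json or documents/create_or_update.json.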
GET a document
List all documents within a DocumentType
List all documents within a DocumentType.
curl -X GET 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents.json?auth_token=YOUR_API_KEY'
List a specific document by id
List a specific document within a DocumentType by id.
Example: get the document with external_id=1 from the books DocumentType.
curl -X GET 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/1?auth_token=YOUR_API_KEY'
DELETE a document
DELETE document by external_id
Delete a document by external_id value.
Example: delete the document with external_id=1 from the books DocumentType.
curl -X DELETE 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/1?auth_token=YOUR_API_KEY'
Update document fields
Updates fields on an existing document.
Fields that are not listed in the request are left unchanged.
Example: update the fields of the document with external_id 1.
curl -X PUT 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/1/update_fields' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"fields": {"title": "my title 2", "page_type": "user2"}
}'
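Because unlisted fields are left unchanged, a client only needs to send the fields that actually differ. A minimal sketch of that idea follows; the helper names are hypothetical, and the sketch only prepares the request body without sending it.

```python
import json

# Sentinel so that a field stored as None is still compared correctly.
_MISSING = object()


def changed_fields(current, desired):
    """Return only the name/value pairs in `desired` that differ from `current`."""
    return {
        name: value
        for name, value in desired.items()
        if current.get(name, _MISSING) != value
    }


def update_fields_body(api_key, fields):
    """Build the JSON body for the update_fields request shown above."""
    return json.dumps({"auth_token": api_key, "fields": fields})


current = {"title": "my title 1", "page_type": "user"}
desired = {"title": "my title 2", "page_type": "user"}
print(update_fields_body("YOUR_API_KEY", changed_fields(current, desired)))
```

Note that, unlike the create endpoints, update_fields takes a plain name-to-value mapping rather than a list of name/value/type objects.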
Bulk Indexing
Bulk operations perform rapid document updates and avoid the latency overhead of repeated requests.
You can:
- Bulk create or update (verbose)
- Bulk create
- Bulk update
- Bulk destroy
Bulk Create or Update, Verbose
To create or update documents in bulk, use bulk_create_or_update_verbose.
Returns an array of responses, one for each document.
If a document was created or updated successfully, its response will be true.
Documents will be created if they do not exist or updated if they already exist.
Example: create or update documents in bulk (all succeeded)
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/bulk_create_or_update_verbose' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents":
[
{"external_id": "3", "fields": [{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}]},
{"external_id": "4", "fields": [{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}]}
]
}'
[true, true]
If a create or update fails, the response for that document will be a string containing an error message.
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/bulk_create_or_update_verbose' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents":
[
{"fields": [{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}]},
{"external_id": "4", "fields": [{"name": "title", "value": "my title 1", "type": "bogus"},
{"name": "page_type", "value": "user", "type": "enum"}]}
]
}'
["Missing external_id", "Invalid field type: Invalid type for \"title\": \"bogus\""]
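Since each entry in the response is either true or an error string, a client can pair the responses with the submitted documents to find out which ones need attention. The sketch below assumes the responses come back in the same order as the submitted documents; the source says there is one response per document but does not state the order explicitly. The function name is hypothetical.

```python
def split_bulk_results(documents, responses):
    """Pair each submitted document with its response.

    Returns (succeeded, failed), where `failed` holds (document, error_message)
    tuples. Assumes responses are ordered like the submitted documents.
    """
    if len(documents) != len(responses):
        raise ValueError("expected exactly one response per submitted document")
    succeeded, failed = [], []
    for document, response in zip(documents, responses):
        if response is True:
            succeeded.append(document)
        else:
            # Anything other than true is treated as an error message.
            failed.append((document, response))
    return succeeded, failed


submitted = [
    {"fields": [{"name": "title", "value": "my title 1", "type": "string"}]},
    {"external_id": "4",
     "fields": [{"name": "title", "value": "my title 1", "type": "bogus"}]},
]
api_response = ["Missing external_id",
                'Invalid field type: Invalid type for "title": "bogus"']
ok, errors = split_bulk_results(submitted, api_response)
for document, message in errors:
    print(document.get("external_id"), "->", message)
```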
Bulk Create
Create documents in bulk.
Will return an error if a document with a given external_id already exists.
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/bulk_create' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents":
[
{"external_id": "3", "fields": [{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}]},
{"external_id": "4", "fields": [{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}]}
]
}'
Bulk Update
Update documents in bulk.
Will return an error if a document with a given external_id does not exist.
curl -X PUT 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/bulk_update' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents":
[
{"external_id": "1", "fields": {"title": "my title 5", "page_type": "user2"}},
{"external_id": "2", "fields": {"title": "my title 4", "page_type": "user2"}}
]
}'
Bulk Destroy
Remove documents in bulk.
The output is an array of true/false values.
If an individual destroy succeeds, its entry in the result array will be true.
If it fails, its entry in the result array will be false.
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/bulk_destroy' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents": ["1","2"]
}'
[true, true]
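To see which deletions did not go through, the result array can be matched against the submitted ids. This is a small sketch under the assumption that results are returned in submission order; `failed_destroys` is a hypothetical helper name.

```python
def failed_destroys(external_ids, results):
    """Return the external_ids whose bulk_destroy result was not true."""
    if len(external_ids) != len(results):
        raise ValueError("expected one result per submitted external_id")
    return [eid for eid, ok in zip(external_ids, results) if ok is not True]


print(failed_destroys(["1", "2"], [True, False]))
```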
Asynchronous Indexing
Asynchronous indexing efficiently indexes large numbers of documents.
Asynchronous indexing consists of two stages:
- Submission
- Result
Both stages support batch operations.
This section covers:
- Submitting a batch of documents
- Checking the status of document receipts
- Checking the status of a single document receipt
- Handling errors at submission time
Submitting a batch of documents
Submit a list of one or more documents to the asynchronous indexing API.
This API follows create-or-update semantics.
The documents will be created if they do not exist, or updated if they already exist.
This endpoint performs only basic parameter validation; whether each document was indexed successfully is reported in its document receipt resource.
A link to the receipt is included in the response.
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/async_bulk_create_or_update' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents":
[
{"external_id": "3", "fields": [{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}]},
{"external_id": "4", "fields": [{"name": "title", "value": "my title 1", "type": "enum"},
{"name": "page_type", "value": "user", "type": "enum"}]}
]
}'
{
"batch_link": "https://api.swiftype.com/api/v1/document_receipts.json?ids=5473d6142ed96065a9000001,5473d6142ed96065a9000002",
"document_receipts": [
{
"id": "5473d6142ed96065a9000001",
"external_id": "3",
"status": "pending",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000001.json",
"document": null
}
},
{
"id": "5473d6142ed96065a9000002",
"external_id": "4",
"status": "pending",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000002.json",
"document": null
}
}
]
}
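The submission response carries everything needed for the later status check. As a rough sketch, the receipt ids and per-document receipt links can be pulled out of the parsed JSON like this (the function names are hypothetical, and the sample below is an abbreviated copy of the response above):

```python
def receipt_ids(submission):
    """Return the receipt ids from an async_bulk_create_or_update response."""
    return [receipt["id"] for receipt in submission["document_receipts"]]


def receipt_links_by_external_id(submission):
    """Map each document's external_id to the URL of its receipt."""
    return {
        receipt["external_id"]: receipt["links"]["document_receipt"]
        for receipt in submission["document_receipts"]
    }


# Abbreviated version of the response shown above, already parsed from JSON.
submission = {
    "document_receipts": [
        {"id": "5473d6142ed96065a9000001", "external_id": "3", "status": "pending",
         "errors": [], "links": {
             "document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000001.json",
             "document": None}},
        {"id": "5473d6142ed96065a9000002", "external_id": "4", "status": "pending",
         "errors": [], "links": {
             "document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000002.json",
             "document": None}},
    ]
}
print(receipt_ids(submission))
```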
Checking the status of Document receipts
Check the indexing status by using the document receipts API endpoint.
Each receipt will have a status value.
Check the receipts periodically, because documents are processed asynchronously.
There is a limit of 1000 document receipts per API request.
Supports GET and POST requests.
Status | Meaning |
---|---|
pending | The Document has not yet been processed. |
complete | The Document was successfully created. |
failed | The Document could not be created. Check the errors array for details. |
expired | The receipt is over 24 hours old and can no longer be checked. |
unknown | The document receipt could not be found. |
curl -X GET 'https://YOUR_API_KEY:@api.swiftype.com/api/v1/document_receipts.json?ids=5473d6142ed96065a9000001,5473d6142ed96065a9000002'
[
{
"id": "5473d6142ed96065a9000001",
"external_id": "3",
"status": "pending",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000001.json",
"document": null
}
},
{
"id": "5473d6142ed96065a9000002",
"external_id": "4",
"status": "complete",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000002.json",
"document": "https://api.swiftype.com/api/v1/engine/xyz/document_type/abc/document/547e513f2ed9609fa7000001.json"
}
}
]
curl -X POST 'https://api.swiftype.com/api/v1/document_receipts.json' \
-H 'Content-Type: application/json' \
-d '{"ids": ["5473d6142ed96065a9000001", "5473d6142ed96065a9000002"]}'
[
{
"id": "5473d6142ed96065a9000001",
"external_id": "3",
"status": "pending",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000001.json",
"document": null
}
},
{
"id": "5473d6142ed96065a9000002",
"external_id": "4",
"status": "complete",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000002.json",
"document": "https://api.swiftype.com/api/v1/engine/xyz/document_type/abc/document/547e513f2ed9609fa7000001.json"
}
}
]
When a document cannot be created due to incorrect input, the receipt will have a status of failed.
It will include a list of error messages:
curl -X GET 'https://YOUR_API_KEY:@api.swiftype.com/api/v1/document_receipts.json?ids=5473d6142ed96065a9000002'
[
{
"id": "5473d6142ed96065a9000002",
"external_id": "4",
"status": "failed",
"errors": ["Invalid type for title: Got enum, but title is already defined as string in schema."],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000002.json",
"document": null
}
}
]
Receipts are only available for 24 hours.
If you request a receipt more than 24 hours old, you will see the expired status.
curl -X GET 'https://YOUR_API_KEY:@api.swiftype.com/api/v1/document_receipts.json?ids=5473d6142ed96065a9000002'
[
{
"id": "5473d6142ed96065a9000002",
"external_id": null,
"status": "expired",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000002.json",
"document": null
}
}
]
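Taken together, the statuses above suggest a simple polling loop: keep re-checking receipts that are still pending and stop once every receipt has reached another status. The sketch below is one possible shape for such a loop, not an official client. The `fetch` argument is a hypothetical callable that takes a list of receipt ids, calls the document receipts endpoint, and returns the parsed list of receipts; the interval and round count are arbitrary defaults.

```python
import time

# Statuses from the table above that will not change on a later check.
TERMINAL_STATUSES = {"complete", "failed", "expired", "unknown"}
# Documented limit on receipts per API request.
MAX_RECEIPTS_PER_REQUEST = 1000


def poll_receipts(ids, fetch, interval=1.0, max_rounds=10, sleep=time.sleep):
    """Poll until no receipt is pending or max_rounds is reached.

    Returns (done, pending): a dict of finished receipts keyed by id, and the
    list of ids that were still pending when polling stopped.
    """
    done = {}
    pending = list(ids)
    for round_number in range(max_rounds):
        still_pending = []
        # Respect the per-request limit by checking ids in chunks.
        for start in range(0, len(pending), MAX_RECEIPTS_PER_REQUEST):
            chunk = pending[start:start + MAX_RECEIPTS_PER_REQUEST]
            for receipt in fetch(chunk):
                if receipt["status"] in TERMINAL_STATUSES:
                    done[receipt["id"]] = receipt
                else:
                    still_pending.append(receipt["id"])
        pending = still_pending
        if not pending:
            break
        if round_number < max_rounds - 1:
            sleep(interval)
    return done, pending
```

After polling, receipts with a status of failed can be inspected through their errors list, and any ids left in the pending list can be retried later, within the 24 hour receipt lifetime.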
Checking the status of a single document receipt
Checking multiple receipt IDs is most efficient, but you can also check the status of an individual receipt.
This is especially useful for debugging.
Check receipt status via a GET request to the link in the async_bulk_create_or_update response.
curl -X GET 'https://YOUR_API_KEY:@api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000001.json'
{
"id": "5473d6142ed96065a9000001",
"external_id": "3",
"status": "complete",
"errors": [],
"links": {
"document_receipt": "https://api.swiftype.com/api/v1/document_receipts/5473d6142ed96065a9000001.json",
"document": "https://api.swiftype.com/api/v1/engines/5025a35185307b737f000008/document_types/5025a35485307b737f00000a/documents/5025a3036052f6b650000006.json"
}
}
Handling errors at submission time
Basic parameter validation occurs at submission time.
Requests must contain a documents key whose value is a non-empty array.
A single batch may contain at most 100 documents.
The maximum request size is 10 megabytes.
If a request cannot be validated, an error response will be returned.
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/async_bulk_create_or_update.json' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents": []
}'
{
"error": "Invalid field type: documents"
}
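These limits can be checked on the client before a request is sent, which avoids a round trip for batches that would be rejected. The following sketch splits a document list into batches of at most 100 and validates each batch. The helper names are hypothetical, and reading "10 megabytes" as 10,000,000 bytes is a conservative assumption, since the source does not say whether it means decimal or binary megabytes.

```python
import json

MAX_DOCUMENTS_PER_BATCH = 100
# Assumption: 10 megabytes read conservatively as 10,000,000 bytes.
MAX_REQUEST_BYTES = 10_000_000


def batches(documents, size=MAX_DOCUMENTS_PER_BATCH):
    """Yield consecutive slices of `documents` with at most `size` items each."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def validate_batch(api_key, documents):
    """Return the encoded request body, or raise ValueError if a limit is broken."""
    if not isinstance(documents, list) or not documents:
        raise ValueError("documents must be a non-empty list")
    if len(documents) > MAX_DOCUMENTS_PER_BATCH:
        raise ValueError("a batch may contain at most 100 documents")
    body = json.dumps({"auth_token": api_key, "documents": documents}).encode("utf-8")
    if len(body) > MAX_REQUEST_BYTES:
        raise ValueError("request body exceeds the 10 megabyte limit")
    return body


all_documents = [{"external_id": str(n), "fields": []} for n in range(250)]
print([len(batch) for batch in batches(all_documents)])
```

Each body returned by `validate_batch` can then be posted to the async_bulk_create_or_update endpoint.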
Document Expiration
Automatically delete documents after a given time period.
- Specifying document expiration when creating or updating documents
- Searching with precise expiration
Specifying document expiration when creating or updating documents
Create (or update) a document with an expiration date, using the asynchronous bulk indexing API.
Add an expires_after key to the document object with the time to expire as a Unix Epoch timestamp.
The expires_after value must be in the future.
Documents are not guaranteed to be deleted exactly at their expires_after time. However, it is guaranteed that a document will not be automatically deleted before its expires_after time. When querying for documents that may have expired within the last 24 hours, use precise_expiration: true in your queries (see below).
curl -X POST 'https://api.swiftype.com/api/v1/engines/bookstore/document_types/books/documents/async_bulk_create_or_update' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"documents":
[
{
"external_id": "3",
"fields": [{"name": "title", "value": "my title 1", "type": "string"},
{"name": "page_type", "value": "user", "type": "enum"}],
"expires_after": 1450574556
},
{
"external_id": "4",
"fields": [{"name": "title", "value": "my title 2", "type": "enum"},
{"name": "page_type", "value": "user", "type": "enum"}],
"expires_after": 1450574556
}
]
}'
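The expires_after value is a plain integer number of seconds since the Unix epoch. One way to compute it from a time-to-live in Python is sketched below; the function name and the `now` parameter (included to make the calculation reproducible) are illustrative choices, not part of the API.

```python
from datetime import datetime, timedelta, timezone


def expires_after(ttl, now=None):
    """Return a Unix epoch timestamp `ttl` after `now` (default: current UTC time)."""
    if ttl <= timedelta(0):
        # The API requires expires_after to be in the future.
        raise ValueError("ttl must be positive")
    now = now or datetime.now(timezone.utc)
    return int((now + ttl).timestamp())


document = {
    "external_id": "3",
    "fields": [{"name": "title", "value": "my title 1", "type": "string"}],
    # Expire this document 30 days from now.
    "expires_after": expires_after(timedelta(days=30)),
}
print(document["expires_after"])
```

Using a timezone-aware datetime avoids the common mistake of converting a naive local time to the wrong epoch value.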
Searching with precise expiration
Documents are not guaranteed to be deleted exactly at their expiration time.
Ensure that expired documents do not appear in search results by using the precise_expiration search option.
This option is available for both the search and suggest API endpoints.
Returns all documents that:
- Are not expired, i.e. contain an expires_after value in the future.
  Note: due to performance optimizations, a document may be returned in results for an additional 10 minutes beyond its expires_after value.
- Do not have an expires_after value set
In other words, it filters all expired documents from the results.
The format of the precise_expiration option is as follows:
"precise_expiration": {
"<document_type_1_slug>": true,
"<document_type_2_slug>": true
}
The default value is false for all DocumentTypes; documents that are expired but not yet deleted are returned in the results.
Example: search with precise_expiration enabled for a DocumentType.
curl -X GET 'https://search-api.swiftype.com/api/v1/engines/bookstore/document_types/books/search.json' \
-H 'Content-Type: application/json' \
-d '{
"auth_token": "YOUR_API_KEY",
"q": "brothers",
"precise_expiration": {
"books": true
}
}'
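The same search body can be assembled programmatically when the set of DocumentTypes varies. This is a minimal sketch with a hypothetical helper name; it only builds the JSON body and leaves the request itself to your HTTP client.

```python
import json


def search_body(api_key, query, precise_document_types=()):
    """Build a search body, enabling precise_expiration for the given slugs."""
    body = {"auth_token": api_key, "q": query}
    if precise_document_types:
        # One true entry per DocumentType slug, matching the format above.
        body["precise_expiration"] = {slug: True for slug in precise_document_types}
    return json.dumps(body)


print(search_body("YOUR_API_KEY", "brothers", ["books"]))
```

Omitting the slugs leaves precise_expiration out of the body, which corresponds to the default of false.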