What is a Search Index?
A search index is a body of structured data that a search engine consults when looking for results relevant to a specific query. Indexes are a critical piece of any search system because they must be tailored to the information retrieval method used by the search engine’s algorithm. In this way, the algorithm and the index are inextricably linked. “Index” can also be used as a verb (indexing), referring to the process of collecting unstructured website data and storing it in a structured format tailored to the search engine’s algorithm.
One way to think about indexes is to consider an analogy between a search infrastructure and an office filing system. Imagine you hand an intern a stack of thousands of pieces of paper (documents) and ask them to organize the papers in a filing cabinet (index) so the company can find information more efficiently. The intern first has to sort through the papers and get a sense of the information they contain. Next, they have to decide on a system for arranging the papers in the filing cabinet. Finally, they need to decide on the most effective way to search through and select from the files once they are in the cabinet. In this example, organizing and filing the papers corresponds to indexing website content, and the method for searching across the organized files and finding the most relevant ones corresponds to the search algorithm.
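To make the link between index and algorithm concrete, here is a minimal sketch of an inverted index, one common index structure. It is an illustration only and not a description of how any particular search engine works. The function names and sample documents are hypothetical.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Look up each token's set of documents; an unknown token gives an empty set.
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Hypothetical sample documents, used only for illustration.
documents = {
    "memo-1": "Quarterly budget report for the sales team",
    "memo-2": "Sales team offsite agenda",
    "memo-3": "Budget approval checklist",
}

index = build_index(documents)
print(sorted(search(index, "sales team")))  # ['memo-1', 'memo-2']
print(sorted(search(index, "budget")))      # ['memo-1', 'memo-3']
```

Here build_index plays the role of the intern filing the papers, and search plays the role of the lookup method. The search function only works because the index is keyed by token, which is the sense in which an index is tailored to its algorithm.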
In Swiftype
Swiftype Site Search uses a high-performance web crawler that automatically indexes your website’s content in a structured format optimized for our search algorithm. To customize the fields that make up their website schema, site owners can use Swiftype’s custom meta tags or the API.
Site Search users can also control the scope of their search engine index in the Swiftype Site Search dashboard, either by adding domains with blacklist or whitelist rules or by adding and removing individual pages.
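The effect of whitelist and blacklist rules on index scope can be sketched as a simple filter. This is not Swiftype’s implementation. It assumes, purely for illustration, that rules are path prefixes and that a blacklist rule always wins. The function name, URLs, and rules are hypothetical.

```python
from urllib.parse import urlparse


def in_scope(url, whitelist=(), blacklist=()):
    """Decide whether a URL belongs in the index.

    Rules are plain path prefixes, an assumption made for this sketch.
    A blacklisted prefix always excludes the page. If a whitelist is given,
    only pages under one of its prefixes are included.
    """
    path = urlparse(url).path or "/"
    if any(path.startswith(prefix) for prefix in blacklist):
        return False
    if whitelist:
        return any(path.startswith(prefix) for prefix in whitelist)
    return True


# Hypothetical URLs and rules, for illustration only.
urls = [
    "https://example.com/blog/search-basics",
    "https://example.com/blog/drafts/unfinished",
    "https://example.com/admin/login",
]
kept = [
    u for u in urls
    if in_scope(u, whitelist=["/blog"], blacklist=["/blog/drafts"])
]
print(kept)  # ['https://example.com/blog/search-basics']
```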
App Search provides programmatic control over search. The documents API endpoint lets a user send documents to their App Search engine for indexing. A default, text-based schema is created at first, but after that the user has full control over the fields through either the dashboard or the API. With App Search, you can write simple functions that index data in near real time.
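A function that submits documents for indexing might look like the sketch below. It only builds the HTTP request and does not send it. The endpoint path, the Bearer-token header, the host, the engine name, and the key are all assumptions made for illustration, so check them against the App Search API documentation before use.

```python
import json
import urllib.request


def build_index_request(base_url, engine_name, api_key, documents):
    """Build (but do not send) a request that submits documents for indexing."""
    # Assumed endpoint layout; confirm it in the App Search API documentation.
    url = f"{base_url}/api/as/v1/engines/{engine_name}/documents"
    body = json.dumps(documents).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed authentication scheme: a private API key as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Hypothetical values, for illustration only.
request = build_index_request(
    base_url="https://example.invalid",
    engine_name="my-engine",
    api_key="private-placeholder",
    documents=[{"id": "doc-1", "title": "What is a search index?"}],
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. Keeping request construction separate from sending makes the function easy to test without network access.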