Search Concepts

Autocomplete
Autocomplete - also known as typeahead or autosuggest - is a language prediction tool that many search interfaces use to provide suggestions for users as they type in a query. In general, autocomplete menus drop down below the search bar as users type and change with each keys...

Bigram matching
Bigram matching is a language analysis tool that advanced search engines use to find results for multi-word queries that are similar to - but not exactly the same as - the text in the searchable index. For example, imagine an ecommerce website with many products that have ...

Clickthrough behavior
Clickthrough behavior is a type of data that records which results users click on from a Search Engine Results Page (SERP). Advanced search engines use this data to learn from user behavior and improve over time. Site owners can use these search analytics to determine ...

Constant Crawl
Constant Crawl is a Swiftype Site Search feature that allows your search engine index to be updated in real time. This means that anytime a page is created or updated, Swiftbot will immediately index the new content and make it available in search results. To enable Constant ...

Conversions/Conversion rates
In the world of search, conversions occur when users perform a query and click on a result from the SERP. Higher quality search engines will have a higher conversion rate than lower quality search engines, because a higher conversion rate indicates that users are clicking on r...

Corpus
The corpus is the entire body of searchable text in a search index.

Document
The term document refers to a single page or item in a website search index. On a publishing website, for example, a document could be a single article. On an ecommerce website, a document could be a product listing. SERPs list documents that are relevant to a given query, sor...

Document frequency
Document frequency is the number of times a given term or query appears in a specific document within a larger search index.
Document frequency is an important measurement for determining how relevant a particular document is for a given query, and is an essential piece of the...

Document type
Documents in a search index are divided into different document types based on their different schemas. For example, imagine a website that has blog entries, product listings, and support articles all under one domain. A well-structured search engine will index these documents...

Enterprise Search
Enterprise search is a technology solution that unifies data from disparate data sources and cloud storage libraries in a single search experience, making it possible for employees to locate information quickly and easily from a single location. Aside from indexing many differ...

Exit rate
Exit rate is a measurement in search analytics that records how often users perform a query and then leave the Search Engine Results Page (SERP) without clicking on a result. In many ways, exit rate can be thought of as the bounce rate for a search engine. A high exit rate indicat...

Facets/Faceting
Faceting (also known as faceted search) is a search tool that allows users to narrow down a large set of results on a SERP by selecting one or more criteria that the results must adhere to. For instance, a user who searches for "sweater" on an ecommerce search engine will at ...

Fields
In the world of search, the term fields refers to the various components or elements which form the schema of a document in a search engine index. Different document types have different fields. For example, a news article schema may be broken into the following fields: title,...

Filtered search
Filtering is a search tool that lets users restrict their search to a certain section of a website or a specific document type. For example, a reader on a publishing website may want to restrict their search for "election speeches" so that it returns only videos.

Indexing
Indexing refers to the process of collecting website data in a structured format (i.e.
a search index) that is optimized for a search algorithm. When developing a search engine infrastructure, developers need to devise a system for indexing existing website content and new conten...

Meta Tags
Meta Tags are HTML elements, typically placed in the <head> of a page, that contain information about a web page that is not displayed to users. Meta tags provide metadata about a page, such as a description, keywords, or other information that is not provided elsewhere....

Phrase matching
Phrase matching is a language-dependent process which advanced search engines use to identify sets of words that should be treated as a cohesive unit when scanning across a search index for the most relevant documents. For example, an advanced search engine will read "Apple wa...

Relevance score
A relevance score is a numerical value assigned to a document that indicates how relevant that document is for a given query. Relevance scores are used to order results on a SERP, and can be calculated through a wide range of different relevance models. Relevance scores are d...

SERP
A SERP, or search engine results page, is a page that shows a collection of search results, typically ordered by relevance to the query. In many cases, SERPs have multiple options for sorting results, ranging from document publication date on a publishing website to prod...

Schema
Schema refers to the structure or format of a document or set of documents in a search engine index. For example, a news article schema might have the following fields: title, subtitle, author, sections, body, and publication date. This structure is extracted from the...

Search algorithm
A search algorithm is a procedure or method for scanning over a search index to find relevant documents.

Search analytics
Search analytics is the practice of collecting, organizing, and communicating data about user interaction with search engines.
Some of the most important and commonly recorded pieces of search analytics data are total queries, most common queries, most common queries with no r...

Search index
A search index is a body of structured data that a search engine refers to when looking for results that are relevant to a specific query. Indexes are a critical piece of any search system, since they must be tailored to the specific information retrieval method of the search ...

Search query
Queries are the strings of text that users type into a search bar when looking for a specific result or set of results. Queries can be one word or several. Search engines input user-generated queries to their algorithm to surface documents that feature portions of or all o...

Search relevance
Search relevance is a measurement of how closely related a document is to a query. Search relevance can be determined in a wide variety of ways, ranging from simple binary relevance to a weighted relevance algorithm such as TF-IDF, which assigns a relevance score to documents.

Sorting
Sorting is a search tool that lets users change the order of results on a SERP based on a specific criterion. For example, an ecommerce website may offer customers the option to sort product results by relevance, price (low to high or high to low), date added, etc.

Spelling correction
Spelling correction is a language analysis tool that identifies and corrects common spelling mistakes as users perform queries. Spelling correction can be configured to suggest alternative spellings on the SERP ("did you mean ...?") or the SERP can automatically correct mistak...

Stemming
Stemming is a language-dependent process of removing suffixes from words so that words with the same root match each other. For example, "walking" and "walked" can be stemmed to the same root word, "walk".

Swiftbot Web Crawler
Swiftbot is the Swiftype web crawler, which is used to crawl your website in order to create your search index.
To learn more about Swiftbot, visit our Swiftbot details page.

Synonyms
Synonyms are two or more words which have the same meaning. Advanced search engines recognize when a query has synonyms in the corpus and return documents which are relevant to the original query as well as documents relevant to the synonym(s).

TF-IDF Search Relevance Model
TF-IDF (term frequency weighted by inverse document frequency) is a relevance model that determines how relevant a particular document is for a given query by weighting the number of times that query appears within the searchable text corpus (TF) by the number of times that qu...

Term frequency
Term frequency is the number of times a given term or query appears within a search index. Term frequency is a key component in determining the relevance of a given document for a particular query, and is an essential piece of the widely used TF-IDF relevancy algorithm.

Typeahead
Typeahead - also known as autocomplete or autosuggest - is a language prediction tool that many search interfaces use to provide suggestions for users as they type in a query. In general, typeahead menus drop down below the search bar as users type and change with each keystr...

Web Crawler
A web crawler is a piece of software that visits a website and indexes all the content on a webpage. Once it lands on a page, it follows all the links on that page and then does the same to each page it can find. Web crawlers will also follow sitemaps to discover and index all...
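
The link-following behavior described in the Web Crawler entry can be illustrated with a small sketch. This is not how Swiftbot or any particular crawler is implemented. It assumes a hypothetical in-memory site (the `SITE` dictionary and the `crawl` function are invented for illustration) instead of real HTTP requests, and it does not cover sitemaps.

```python
from collections import deque

# Hypothetical in-memory "website": each URL maps to its text and outgoing links.
# A real crawler would fetch pages over HTTP and parse links out of the HTML.
SITE = {
    "/": {"text": "Home", "links": ["/blog", "/products"]},
    "/blog": {"text": "Blog index", "links": ["/blog/post-1", "/"]},
    "/products": {"text": "Product list", "links": ["/"]},
    "/blog/post-1": {"text": "First post", "links": ["/blog", "/missing"]},
}


def crawl(site, start):
    """Breadth-first crawl from `start`; returns {url: text} for each page found."""
    index = {}
    seen = {start}          # URLs already queued, so no page is visited twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        page = site.get(url)
        if page is None:
            continue        # broken link: nothing to index
        index[url] = page["text"]
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl(SITE, "/")
print(list(index))
```

The `seen` set is what keeps the crawl from looping forever when pages link back to each other, as "/blog" and "/" do here.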