mlt: More like this request.

Description

Usage

mlt(index, type, id, doc_type = NULL, body = NULL, boost_terms = NULL,
  include = NULL, max_doc_freq = NULL, max_query_terms = NULL,
  max_word_length = NULL, min_doc_freq = NULL, min_term_freq = NULL,
  min_word_length = NULL, mlt_fields = NULL,
  percent_terms_to_match = NULL, routing = NULL, search_from = NULL,
  search_indices = NULL, search_query_hint = NULL, search_scroll = NULL,
  search_size = NULL, search_source = NULL, search_type = NULL,
  search_types = NULL, stop_words = NULL, like_text = NULL, ...)

Arguments

index

(character) The name of the index

type

(character) A document type

id

(numeric) The document ID

doc_type

(character) The type of the document (use _all to fetch the first document matching the ID across all types)

body

A specific search request definition

boost_terms

(numeric) The boost factor

include

(logical) Whether to include the queried document in the response

max_doc_freq

(numeric) The word occurrence frequency as count: words with higher occurrence in the corpus will be ignored

max_query_terms

(numeric) The maximum query terms to be included in the generated query

max_word_length

(numeric) The maximum length of the word: longer words will be ignored

min_doc_freq

(numeric) The word occurrence frequency as count: words with lower occurrence in the corpus will be ignored

min_term_freq

(numeric) The term frequency as percent: terms with lower occurrence in the source document will be ignored

min_word_length

(numeric) The minimum length of the word: shorter words will be ignored

mlt_fields

(character) Specific fields to perform the query against

percent_terms_to_match

(numeric) The fraction of terms that must match for a document to be considered a match (default: 0.3)

routing

Specific routing value

search_from

(numeric) The offset from which to return results

search_indices

(character) A comma-separated list of indices to perform the query against (default: the index containing the document)

search_query_hint

(character) The search query hint

search_scroll

A scroll search request definition

search_size

(numeric) The number of documents to return (default: 10)

search_source

A specific search request definition (instead of using the request body)

search_type

(character) Specific search type (e.g., dfs_then_fetch, count)

search_types

(character) A comma-separated list of types to perform the query against (default: the same type as the document)

stop_words

(character) A list of stop words to be ignored

like_text

(character) Like text...

...

Curl options passed on to GET

Details

This function currently uses an HTTP GET request, so all parameters are passed in the URL. An alternative is the "more like this query", which passes the query in the body of a POST request; it may be added later.
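To illustrate what passing parameters in the URL means, here is a minimal sketch in base R. It assumes the request targets an `_mlt` endpoint under `index/type/id` and that arguments left as NULL are simply omitted. The helper name `mlt_url` and the `base` address are hypothetical and not part of the package; the real function may build its request differently.

```r
# Hypothetical helper (not part of the package): builds the URL that a
# GET-based more-like-this request might use. `base` is an assumed address.
mlt_url <- function(index, type, id, ..., base = "http://localhost:9200") {
  args <- list(...)
  # Drop parameters left at NULL so they do not appear in the URL
  args <- Filter(Negate(is.null), args)
  path <- sprintf("%s/%s/%s/%s/_mlt", base, index, type, id)
  if (length(args) == 0) {
    return(path)
  }
  # Collapse vector values with commas and percent-encode each value
  values <- vapply(args, function(x) {
    utils::URLencode(paste(as.character(x), collapse = ","), reserved = TRUE)
  }, character(1))
  paste0(path, "?", paste0(names(args), "=", values, collapse = "&"))
}

# Only the non-NULL parameters end up in the query string
url_plain <- mlt_url("plos", "article", 800)
url_sized <- mlt_url("plos", "article", 800, search_size = 2, routing = NULL)
print(url_plain)
print(url_sized)
```

Because every option travels in the query string, long values such as a large stop word list make the URL grow accordingly, which is one reason a body-based POST query can be preferable.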

Examples

mlt(index = "plos", type = "article", id = 5)$hits$total
mlt(index = "plos", type = "article", id = 5, min_doc_freq=12)$hits$total
mlt(index = "plos", type = "article", id = 800)$hits$total

# Return different number of results
mlt(index = "plos", type = "article", id = 800, search_size=1)$hits$hits
mlt(index = "plos", type = "article", id = 800, search_size=2)$hits$hits

# Exclude stop words
mlt(index = "plos", type = "article", id = 800)$hits$total
mlt(index = "plos", type = "article", id = 800, stop_words="the,and")$hits$total

# Specify percent of terms that have to match
mlt(index = "plos", type = "article", id = 800, percent_terms_to_match=0.1)$hits$total
mlt(index = "plos", type = "article", id = 800, percent_terms_to_match=0.7)$hits$total

# Maximum query terms to be included in the generated query
mlt(index = "plos", type = "article", id = 800, max_query_terms=1)$hits$total
mlt(index = "plos", type = "article", id = 800, max_query_terms=2)$hits$total
mlt(index = "plos", type = "article", id = 800, max_query_terms=3)$hits$total

# Restrict the query to specific fields and set a boost factor
mlt(index = "plos", type = "article", id = 800, mlt_fields="title", boost_terms=1)$hits$total
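The examples above index into the result with `$hits$total` and `$hits$hits`. As a sketch of how the individual hits might be tabulated, the following uses a mock response so it runs without a server. The mock values are made up for illustration, and it assumes each hit is a list carrying `_id` and `_score` entries, as in a standard Elasticsearch search response.

```r
# Mock response with the shape of a search result; the IDs and scores
# below are invented for illustration only.
res <- list(hits = list(
  total = 2,
  hits = list(
    list(`_index` = "plos", `_type` = "article", `_id` = "12", `_score` = 1.4),
    list(`_index` = "plos", `_type` = "article", `_id` = "57", `_score` = 0.9)
  )
))

# Collect the ID and score of each similar document into a data frame
similar <- data.frame(
  id = vapply(res$hits$hits, function(h) h[["_id"]], character(1)),
  score = vapply(res$hits$hits, function(h) h[["_score"]], numeric(1)),
  stringsAsFactors = FALSE
)
print(similar)
```

With a live index, `res` would be replaced by the value returned from a call such as `mlt(index = "plos", type = "article", id = 800)`.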
