mscsweblm4r (version 0.1.2)

weblmCalculateConditionalProbability: Calculates the conditional probability that a word follows a sequence of words.

Description

This function calculates the conditional probability that a particular word will follow a given sequence of words. The input string must be in ASCII format.

Internally, this function invokes the Microsoft Cognitive Services Web Language Model REST API documented at https://www.microsoft.com/cognitive-services/en-us/web-language-model-api/documentation.

You MUST have a valid Microsoft Cognitive Services account and an API key for this function to work properly. See https://www.microsoft.com/cognitive-services/en-us/pricing for details.

Usage

weblmCalculateConditionalProbability(precedingWords, continuations, modelToUse = "body", orderOfNgram = 5L)

Arguments

precedingWords
(character) Character string for which to calculate continuation probabilities. Must be in ASCII format.
continuations
(character vector) Vector of words following precedingWords for which to calculate conditional probabilities.
modelToUse
(character) Language model to use. Supported values: "title", "anchor", "query", or "body". Optional; default: "body".
orderOfNgram
(integer) Order of N-gram to use. Supported values: 1L, 2L, 3L, 4L, or 5L. Optional; default: 5L.
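
The constraints listed above can be checked locally before a request is sent. The following is a minimal sketch only: check_weblm_args is a hypothetical helper, not part of mscsweblm4r, and the package may perform its own validation differently.

```r
# Hypothetical helper (NOT part of mscsweblm4r): validates arguments against
# the constraints documented in the Arguments section.
check_weblm_args <- function(precedingWords, continuations,
                             modelToUse = "body", orderOfNgram = 5L) {

  # TRUE for each string whose code points are all in the ASCII range
  is_ascii <- function(x) {
    vapply(x, function(s) all(utf8ToInt(enc2utf8(s)) < 128L),
           logical(1), USE.NAMES = FALSE)
  }

  stopifnot(
    is.character(precedingWords), length(precedingWords) == 1L,
    is.character(continuations), length(continuations) >= 1L
  )

  if (!all(is_ascii(c(precedingWords, continuations)))) {
    stop("precedingWords and continuations must be ASCII")
  }

  if (length(modelToUse) != 1L ||
      !modelToUse %in% c("title", "anchor", "query", "body")) {
    stop('modelToUse must be "title", "anchor", "query", or "body"')
  }

  if (length(orderOfNgram) != 1L || !orderOfNgram %in% 1:5) {
    stop("orderOfNgram must be 1L, 2L, 3L, 4L, or 5L")
  }

  invisible(TRUE)
}

# Passes silently for valid input
check_weblm_args("hello world wide", c("web", "range", "open"), "title", 4L)
```

Failing early in this way avoids spending an API call on a request that the documented constraints already rule out.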

Value

An S3 object of class weblm. The results are stored in the results data frame inside this object, which holds one row per continuation word together with its log probability (columns words, word, and probability in the example output).
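
Because the returned values are log probabilities, a common follow-up is to rank the continuations or convert them back to a linear scale. The sketch below rebuilds a results data frame by hand from the sample output in the Examples section, so it runs without an API key. The logarithm base is not stated on this page; base 10 is an assumption here and log_base should be adjusted if the service uses a different base.

```r
# Values copied from the printed table in the Examples section;
# in real use this would be conditionalProbabilities$results.
results <- data.frame(
  words       = rep("hello world wide", 3),
  word        = c("web", "range", "open"),
  probability = c(-0.32, -2.403, -2.97),
  stringsAsFactors = FALSE
)

# ASSUMPTION: base-10 logarithms. Change log_base if the service differs.
log_base <- 10
results$linear <- log_base ^ results$probability

# Rank continuations from most to least likely (higher log value first)
ranked <- results[order(results$probability, decreasing = TRUE), ]
best <- ranked$word[1]

print(ranked[, c("word", "probability", "linear")])
```

Sorting on the log values directly gives the same ranking as sorting on the linear values, since the logarithm is monotonic.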

Examples

## Not run: 
#  tryCatch({
# 
#    # Calculate conditional probability a particular word will follow a given sequence of words
#    conditionalProbabilities <- weblmCalculateConditionalProbability(
#      precedingWords = "hello world wide",       # ASCII only
#      continuations = c("web", "range", "open"), # ASCII only
#      modelToUse = "title",                      # "title"|"anchor"|"query"|"body"(default)
#      orderOfNgram = 4L                          # 1L|2L|3L|4L|5L(default)
#    )
# 
#    # Class and structure of conditionalProbabilities
#    class(conditionalProbabilities)
#    #> [1] "weblm"
# 
#    str(conditionalProbabilities, max.level = 1)
#    #> List of 3
#    #>  $ results:'data.frame':  3 obs. of  3 variables:
#    #>  $ json   : chr "{"results":[{"words":"hello world wide","word":"web", __truncated__ }]}
#    #>  $ request:List of 7
#    #>   ..- attr(*, "class")= chr "request"
#    #>  - attr(*, "class")= chr "weblm"
# 
#    # Print results
#    pander::pandoc.table(conditionalProbabilities$results)
#    #> -------------------------------------
#    #>      words        word   probability
#    #> ---------------- ------ -------------
#    #> hello world wide   web      -0.32
#    #>
#    #> hello world wide range     -2.403
#    #>
#    #> hello world wide  open      -2.97
#    #> -------------------------------------
# 
#  }, error = function(err) {
# 
#    # Print the message carried by the caught error condition
#    message(conditionMessage(err))
# 
#  })
# ## End(Not run)