weight


Parameter objects related to text analysis.

These are objects that can be used for modeling, especially in conjunction with the textrecipes package.

Keywords
datasets
Usage
weight

weight_scheme

token

max_times

min_times

max_tokens

Details

These objects are pre-made parameter sets that are useful in a variety of models.

  • min_times, max_times: the frequency of word occurrences used to filter tokens for removal. See ?step_tokenfilter.

  • max_tokens: the number of tokens that will be retained. See ?step_tokenfilter.

  • weight: a parameter for "double normalization" when creating token counts. See ?step_tf.

  • weight_scheme: the method for term frequency calculations. Possible values are: "binary", "raw count", "term frequency", "log normalization", or "double normalization". See ?step_tf.

  • token: the type of token, with possible values: "characters", "character_shingle", "lines", "ngrams", "paragraphs", "ptb", "regex", "sentences", "skip_ngrams", "tweets", "words", "word_stems". See ?step_tokenize.
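As a sketch of how these parameters map onto textrecipes steps, the recipe below ties each one to the step argument it corresponds to. The data frame `reviews` and its `text` column are hypothetical placeholders, and the pipeline assumes the recipes and textrecipes packages are installed:

```r
library(recipes)
library(textrecipes)

# `reviews` and its character column `text` are illustrative placeholders.
rec <- recipe(~ text, data = reviews) %>%
  # `token`: how the text is split into tokens (see ?step_tokenize)
  step_tokenize(text, token = "words") %>%
  # `min_times`, `max_times`, `max_tokens`: token filtering (see ?step_tokenfilter)
  step_tokenfilter(text, min_times = 2, max_tokens = 100) %>%
  # `weight_scheme` (and `weight`, for "double normalization"): see ?step_tf
  step_tf(text, weight_scheme = "term frequency")
```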

Value

Each object is generated by either new_quant_param or new_qual_param.
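For illustration, a quantitative parameter such as max_tokens could be built with new_quant_param roughly as follows; the range and label below are assumptions for the sketch, not the values dials itself uses:

```r
library(dials)

# Hypothetical reconstruction of an integer-valued parameter object;
# the range and label are illustrative assumptions.
max_tokens_demo <- new_quant_param(
  type = "integer",
  range = c(0L, 1000L),
  inclusive = c(TRUE, TRUE),
  label = c(max_tokens = "# Retained Tokens")
)
```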

Format

An object of class quant_param (inherits from param) of length 7.

Aliases
  • weight
  • text_parameters
  • weight_scheme
  • token
  • max_times
  • min_times
  • max_tokens
Documentation reproduced from package dials, version 0.0.2, License: GPL-2
