plot_wordcloud: Create wordcloud of terms associated with a miRNA name

Description

Create wordcloud of terms associated with a miRNA name.

Usage

plot_wordcloud(
  df,
  mir,
  min.freq = 1,
  max.terms = 20,
  tf.idf = FALSE,
  token = "words",
  ...,
  stopwords = stopwords_miretrieve,
  stopwords_ngram = TRUE,
  colours = "black",
  random.colour = TRUE,
  ordered.colour = FALSE,
  col.mir = miRNA,
  col.abstract = Abstract,
  col.pmid = PMID
)

Arguments

df

Data frame containing miRNA names, abstracts, and PubMed-IDs.

mir

String. miRNA name of interest.

min.freq

Integer. Minimum number of times a term must be associated with mir to be plotted.

max.terms

Integer. Maximum number of terms to plot.

tf.idf

Boolean. If tf.idf = TRUE, terms are weighted in a tf-idf fashion: miRNA names are treated as separate documents, and terms that are often associated with one miRNA but not with other miRNAs get more weight.

token

String. Specifies how abstracts shall be split up. Taken from unnest_tokens() in the tidytext package: "Unit for tokenizing, or a custom tokenizing function. Built-in options are "words" (default), "characters", "character_shingles", "ngrams", "skip_ngrams", "sentences", "lines", "paragraphs", "regex", (...), and "ptb" (Penn Treebank). If a function, should take a character vector and return a list of character vectors of the same length."

...

Additional arguments for tokenization, if necessary.

stopwords

Data frame containing stop words.

stopwords_ngram

Boolean. Specifies if stop words shall be removed from abstracts when using ngrams. Only applied when token = 'ngrams'.

colours

Vector of strings. Colours for wordcloud.

random.colour

Boolean. Taken from wordcloud() in the wordcloud package: "Choose colours randomly from colours. If false, the colour is chosen based on the frequency."

ordered.colour

Boolean. Taken from wordcloud() in the wordcloud package: "If true, then colours are assigned to words in order."

col.mir

Symbol. Column containing miRNA names.

col.abstract

Symbol. Column containing abstracts.

col.pmid

Symbol. Column containing PubMed-IDs.
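
As an illustration of how the arguments above fit together, the following is a minimal sketch of a call, not an example taken from the package. It assumes the miRetrieve package is installed; the data frame, its abstracts, and its PubMed-IDs are invented for the purpose of the example, and the column names simply match the defaults of col.mir, col.abstract, and col.pmid.

```r
# Toy input; the abstracts and PubMed-IDs below are invented for illustration.
df <- data.frame(
  miRNA    = c("miR-21", "miR-21", "miR-155"),
  Abstract = c(
    "miR-21 promotes tumour growth and inhibits apoptosis.",
    "High miR-21 expression is linked to tumour invasion.",
    "miR-155 regulates inflammation in macrophages."
  ),
  PMID = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Run the plot only if miRetrieve is available (assumption: it is installed).
if (requireNamespace("miRetrieve", quietly = TRUE)) {
  library(miRetrieve)
  plot_wordcloud(
    df,
    mir = "miR-21",          # miRNA name of interest
    min.freq = 1,            # keep terms mentioned at least once
    max.terms = 10,          # plot at most 10 terms
    colours = c("grey40", "steelblue", "firebrick"),
    random.colour = FALSE    # colour by frequency instead of at random
  )
}
```

With random.colour = FALSE, the colours vector is used to encode term frequency, as described for wordcloud() above.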

Value

Wordcloud of terms associated with a miRNA name.

Details

Create a wordcloud of terms associated with a miRNA name. miRNA names must be in a data frame df, and terms are taken from the abstracts contained in df. The number of terms to plot is limited by max.terms, while min.freq sets the minimum number of times a term must be mentioned to be plotted. Terms can either be evaluated by their raw count, i.e. how often they are mentioned in conjunction with the miRNA of interest, or weighted in a tf-idf fashion. If tf.idf = TRUE, miRNA names are treated as separate documents, and terms that are often associated with one miRNA but not with other miRNAs get more weight. plot_wordcloud() is based on the tools available in the wordcloud package.
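
The tf-idf weighting can be made concrete with a small base R sketch. This is not the code used inside plot_wordcloud(); it only illustrates the common definition of tf-idf (term frequency times the natural log of the inverse document frequency) with each miRNA treated as one document. The miRNA names, terms, and counts are made up.

```r
# Toy term counts per miRNA; all values are invented for illustration.
counts <- data.frame(
  mirna = c("miR-21", "miR-21", "miR-21", "miR-155", "miR-155"),
  term  = c("cancer", "apoptosis", "expression", "inflammation", "expression"),
  n     = c(4, 2, 4, 3, 3),
  stringsAsFactors = FALSE
)

# Term frequency: share of a miRNA's term mentions taken up by each term.
counts$tf <- counts$n / ave(counts$n, counts$mirna, FUN = sum)

# Inverse document frequency: ln(number of miRNAs / number of miRNAs with the term).
n_docs <- length(unique(counts$mirna))
docs_with_term <- ave(counts$n, counts$term, FUN = length)
counts$idf <- log(n_docs / docs_with_term)

# tf-idf: terms shared by every miRNA get weight 0.
counts$tf_idf <- counts$tf * counts$idf
counts[order(-counts$tf_idf), ]
```

In this toy data, "expression" occurs with both miRNAs and therefore receives a weight of zero, while "cancer" and "inflammation", which each occur with only one miRNA, are ranked highest for their respective miRNA.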
