Find words associated with one or more given words or phrases. Results can be output as a network graph and/or a wordcloud.
word_associate(
  text.var,
  grouping.var = NULL,
  match.string,
  text.unit = "sentence",
  extra.terms = NULL,
  target.exclude = NULL,
  stopwords = NULL,
  network.plot = FALSE,
  wordcloud = FALSE,
  cloud.colors = c("black", "gray55"),
  title.color = "blue",
  nw.label.cex = 0.8,
  title.padj = -4.5,
  nw.label.colors = NULL,
  nw.layout = NULL,
  nw.edge.color = "gray90",
  nw.label.proportional = TRUE,
  nw.title.padj = NULL,
  nw.title.location = NULL,
  title.font = NULL,
  title.cex = NULL,
  nw.edge.curved = TRUE,
  cloud.legend = NULL,
  cloud.legend.cex = 0.8,
  cloud.legend.location = c(-0.03, 1.03),
  nw.legend = NULL,
  nw.legend.cex = 0.8,
  nw.legend.location = c(-1.54, 1.41),
  legend.override = FALSE,
  char2space = "~~",
  ...
)
Returns a list:
  Word frequency matrices for each grouping variable.
  A list of dataframes for each word list (each vector supplied to match.string) and a final dataframe of all combined text units that contain any match string.
  A list of vectors of word lists (each vector supplied to match.string).
Optionally, returns a word cloud and/or a network plot of the text units containing the match.string terms.
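As a rough illustration of the return value described above, the sketch below calls word_associate on the DATA2 data set that ships with qdap and lists the top-level components of the result. It assumes qdap is installed (the guard skips the call otherwise). No component names are asserted here; they are best read directly from names().

```r
# Sketch: inspect the structure of the list returned by word_associate().
# Assumption: the qdap package (with its DATA2 data set) is installed.
has_qdap <- requireNamespace("qdap", quietly = TRUE)

if (has_qdap) {
  out <- qdap::word_associate(
    qdap::DATA2$state,
    qdap::DATA2$person,
    match.string = c(" I ", "you")
  )
  print(names(out))        # top-level components of the result
  str(out, max.level = 1)  # one-level overview of each component
}
```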
text.var: The text variable.
grouping.var: The grouping variables. Default NULL generates one word list for all text. Also takes a single grouping variable or a list of 1 or more grouping variables.
match.string: A vector of terms, or a list of vectors of terms, to associate in the text.
text.unit: The text unit (either "sentence" or "tot", i.e. turn of talk). This argument determines the unit within which the match string words are searched for. For example, if "sentence" is chosen, the function pulls all text for the sentences in which the match string terms are found.
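To make the "sentence" unit concrete, here is a minimal base R sketch of the idea: split the text into sentences and keep those containing any match term. This is not qdap's implementation. The helper sentences_with_terms and the sample text are hypothetical, matching is case-insensitive by assumption, and the sentence splitting is deliberately naive.

```r
# Hypothetical helper (not part of qdap): keep the sentences that
# contain at least one of the supplied terms.
sentences_with_terms <- function(text, terms) {
  # Naive split: break after ., ! or ? when followed by whitespace
  sentences <- unlist(strsplit(text, "(?<=[.!?])\\s+", perl = TRUE))
  # Pad with spaces so terms such as " I " can match at sentence edges;
  # lower-case both sides so matching ignores case
  padded <- tolower(paste0(" ", sentences, " "))
  terms <- tolower(terms)
  hit <- vapply(
    padded,
    function(s) any(vapply(terms, grepl, logical(1), x = s, fixed = TRUE)),
    logical(1)
  )
  sentences[hit]
}

txt <- "I am fine. You are not. What is it?"
found <- sentences_with_terms(txt, c(" I ", " wh"))
print(found)
```

The leading and trailing spaces in the terms act as crude word boundaries: " I " matches the standalone word, while " wh" matches any word that starts with "wh".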
extra.terms: Other terms to color beyond the match string.
target.exclude: A vector of words to exclude from the match.string.
stopwords: Words to exclude from the analysis.
network.plot: Logical. If TRUE, plots a network plot of the words.
wordcloud: Logical. If TRUE, plots a wordcloud plot of the words.
cloud.colors: A vector of colors whose length equals the length of match.string + 1.
title.color: A character vector of length one corresponding to the color of the title.
nw.label.cex: The magnification to be used for network plot labels relative to the current setting of cex. Default is 0.8.
title.padj: Adjustment for the title. For strings parallel to the axes, padj = 0 means right or top alignment, and padj = 1 means left or bottom alignment.
nw.label.colors: A vector of colors whose length equals the length of match.string + 1.
nw.layout: Layout types supported by igraph. See layout.
nw.edge.color: A character vector of length one corresponding to the color of the plot edges.
nw.label.proportional: Logical. If TRUE, scales the network plots across grouping.var to allow plot-to-plot comparisons.
nw.title.padj: Adjustment for the network plot title. For strings parallel to the axes, padj = 0 means right or top alignment, and padj = 1 means left or bottom alignment.
nw.title.location: The side of the network plot on which the title is placed (1=bottom, 2=left, 3=top, 4=right).
title.font: The font family of the cloud title.
title.cex: Character expansion factor for the title. NULL and NA are equivalent to 1.0.
nw.edge.curved: Logical. If TRUE, edges will be curved rather than straight paths.
cloud.legend: A character vector of names corresponding to the number of vectors in match.string. Both nw.legend and cloud.legend can be set separately; or one may be set and by default the other will assume those legend labels. If the user does not desire this behavior, use the legend.override argument.
cloud.legend.cex: Character expansion factor for the wordcloud legend. NULL and NA are equivalent to 1.0.
cloud.legend.location: The x and y co-ordinates to be used to position the wordcloud legend. The location may also be specified by setting x to a single keyword from the list "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right" and "center". This places the legend on the inside of the plot frame at the given location.
nw.legend: A character vector of names corresponding to the number of vectors in match.string. Both nw.legend and cloud.legend can be set separately; or one may be set and by default the other will assume those legend labels. If the user does not desire this behavior, use the legend.override argument.
nw.legend.cex: Character expansion factor for the network plot legend. NULL and NA are equivalent to 1.0.
nw.legend.location: The x and y co-ordinates to be used to position the network plot legend. The location may also be specified by setting x to a single keyword from the list "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right" and "center". This places the legend on the inside of the plot frame at the given location.
legend.override: By default, if legend labels are supplied to only one of cloud.legend or nw.legend and the other remains NULL, the other assumes the supplied vector. If this behavior is not desired, legend.override should be set to TRUE.
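The defaulting rule for the two legend arguments can be sketched in a few lines of base R. The helper resolve_legends is hypothetical and only mirrors the behavior described above; it is not qdap's internal code.

```r
# Hypothetical helper (not part of qdap): mirrors the documented rule that
# one legend argument borrows the other's labels unless overridden.
resolve_legends <- function(cloud.legend = NULL, nw.legend = NULL,
                            legend.override = FALSE) {
  if (!legend.override) {
    if (is.null(nw.legend) && !is.null(cloud.legend)) {
      nw.legend <- cloud.legend      # network legend borrows cloud labels
    } else if (is.null(cloud.legend) && !is.null(nw.legend)) {
      cloud.legend <- nw.legend      # cloud legend borrows network labels
    }
  }
  list(cloud.legend = cloud.legend, nw.legend = nw.legend)
}

# Only cloud.legend supplied: nw.legend takes the same labels
shared <- resolve_legends(cloud.legend = c("A", "B"))

# With the override, the unset legend stays NULL
separate <- resolve_legends(cloud.legend = c("A", "B"), legend.override = TRUE)
```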
char2space: Currently a road to nowhere. Eventually this will allow the retention of characters, as is already allowed in trans_cloud.
...: Other arguments supplied to trans_cloud.
See also: trans_cloud, word_network_plot, wordcloud, graph.adjacency
if (FALSE) {
  library(qdap)  # provides word_associate, DATA2, raj.act.1, mraja1spl

  # Match string and extra terms supplied as plain vectors
  ms <- c(" I ", "you")
  et <- c(" it", " tell", "tru")
  out1 <- word_associate(DATA2$state, DATA2$person, match.string = ms,
      wordcloud = TRUE, proportional = TRUE,
      network.plot = TRUE, nw.label.proportional = TRUE, extra.terms = et,
      cloud.legend = c("A", "B", "C"),
      title.color = "blue", cloud.colors = c("red", "purple", "gray70"))

  #======================================
  # Note: the vectors in the lists need not be named; names are used for clarity
  ms <- list(
      list1 = c(" I ", " you", "not"),
      list2 = c(" wh")
  )
  et <- list(
      B = c(" the", "do", "tru"),
      C = c(" it", " already", "we")
  )
  out2 <- word_associate(DATA2$state, DATA2$person, match.string = ms,
      wordcloud = TRUE, proportional = TRUE,
      network.plot = TRUE, nw.label.proportional = TRUE, extra.terms = et,
      cloud.legend = c("A", "B", "C", "D"),
      title.color = "blue", cloud.colors = c("red", "blue", "purple", "gray70"))

  # Two grouping variables supplied as a list
  out3 <- word_associate(DATA2$state, list(DATA2$day, DATA2$person), match.string = ms)

  #======================================
  m <- list(
      A1 = c("you", "in"),  # list 1
      A2 = c(" wh")         # list 2
  )
  n <- list(
      B = c(" the", " on"),
      C = c(" it", " no")
  )
  out4 <- word_associate(DATA2$state, list(DATA2$day, DATA2$person),
      match.string = m)
  out5 <- word_associate(raj.act.1$dialogue, list(raj.act.1$person),
      match.string = m)
  out6 <- with(mraja1spl, word_associate(dialogue, list(fam.aff, sex),
      match.string = m))
  names(out6)
  lapply(out6$dialogue, htruncdf, n = 20, w = 20)

  #======================================
  # Multiword phrases: fill their spaces first so each is treated as one term
  DATA2$state2 <- space_fill(DATA2$state, c("is fun", "too fun"))
  ms <- list(
      list1 = c(" I ", " you", "is fun", "too fun"),
      list2 = c(" wh")
  )
  et <- list(
      B = c(" the", " on"),
      C = c(" it", " no")
  )
  out7 <- word_associate(DATA2$state2, DATA2$person, match.string = ms,
      wordcloud = TRUE, proportional = TRUE,
      network.plot = TRUE, nw.label.proportional = TRUE, extra.terms = et,
      cloud.legend = c("A", "B", "C", "D"),
      title.color = "blue", cloud.colors = c("red", "blue", "purple", "gray70"))

  # Restore the original DATA2 (drops the added state2 column)
  DATA2 <- qdap::DATA2
}
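The last example prepares multiword phrases with space_fill before matching. The base R sketch below illustrates the general idea of joining the words of a phrase with a placeholder so the phrase behaves as a single token. The helper fill_phrase_spaces is hypothetical, and the "~~" separator is an assumption borrowed from the char2space default in the usage above; space_fill itself may differ in detail.

```r
# Hypothetical helper (not part of qdap): replace the spaces inside each
# phrase with a placeholder so the phrase is kept together as one token.
fill_phrase_spaces <- function(text, phrases, sep = "~~") {
  for (p in phrases) {
    filled <- gsub(" ", sep, p, fixed = TRUE)   # "is fun" -> "is~~fun"
    text <- gsub(p, filled, text, fixed = TRUE) # substitute in the text
  }
  text
}

filled_text <- fill_phrase_spaces(
  "Computer is fun. Not too fun.",
  c("is fun", "too fun")
)
print(filled_text)
# "Computer is~~fun. Not too~~fun."
```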