
Using prop.test(), ci adds three columns to a data frame:
relative frequency (f)
lower bound of a confidence interval (conf.low)
upper bound of a confidence interval (conf.high)
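To make the three columns concrete, here is a minimal base R sketch of an equivalent calculation. It is not the package's implementation: the counts are made up, and the column names totalResults and total are simply taken from the defaults in the usage line.

```r
# Hypothetical counts: observed hits (totalResults) and corpus size (total) per year
df <- data.frame(
  year = 2000:2002,
  totalResults = c(120, 150, 90),
  total = c(1e6, 1.2e6, 8e5)
)

# Run one prop.test() per row and keep only the confidence interval bounds.
# mapply() returns a 2 x n matrix, so transpose it to n x 2.
bounds <- t(mapply(
  function(x, n) prop.test(x, n, conf.level = 0.95)$conf.int,
  df$totalResults, df$total
))

df$f <- df$totalResults / df$total   # relative frequency
df$conf.low <- bounds[, 1]           # lower bound of the interval
df$conf.high <- bounds[, 2]          # upper bound of the interval
df
```

Each row gets its own interval, so the bounds widen for rows with fewer observations.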
Convenience function for converting frequency tables to instances per million.
Convenience function for converting frequency tables of alternative variants (generated with as.alternatives=TRUE) to percent.
Converts a vector of query or vc strings to typically appropriate legend labels by clipping off prefixes and suffixes that are common to all query strings.
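The clipping idea can be sketched in base R as follows. The helper clip_common_tokens is hypothetical and works on whitespace-separated tokens, which is an assumption made for readability; queryStringToLabel itself may clip differently and also handles the publication date options.

```r
# Hypothetical helper: drop leading and trailing tokens shared by all strings
clip_common_tokens <- function(s) {
  tokens <- strsplit(s, " ", fixed = TRUE)
  n <- min(lengths(tokens))
  # TRUE if the token selected by pick() is identical in every string
  same_at <- function(pick) {
    length(unique(vapply(tokens, pick, character(1)))) == 1
  }
  lead <- 0
  while (lead < n && same_at(function(tk) tk[lead + 1])) lead <- lead + 1
  trail <- 0
  while (lead + trail < n && same_at(function(tk) tk[length(tk) - trail])) {
    trail <- trail + 1
  }
  # Keep only the tokens between the shared prefix and the shared suffix
  vapply(tokens, function(tk) {
    paste(tk[seq_len(length(tk) - lead - trail) + lead], collapse = " ")
  }, character(1))
}

labels <- clip_common_tokens(c("wegen dem [tt/p=NN]", "wegen des [tt/p=NN]"))
labels
```

With these two inputs, only the tokens that differ between the queries remain, which is what makes the result usable as a legend label.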
Experimental convenience function for plotting typical frequency by year graphs with confidence intervals using ggplot2. Warning: This function may be moved to a new package.
ci(df, x = totalResults, N = total, conf.level = 0.95)
ipm(df)
percent(df)
queryStringToLabel(data, pubDateOnly = FALSE, excludePubDate = FALSE)
geom_freq_by_year_ci(mapping = aes(ymin = conf.low, ymax = conf.high), ...)
original table with additional column ipm and converted columns conf.low and conf.high
original table with converted columns f, conf.low and conf.high
string or vector of strings with clipped off common prefixes and suffixes
df: table returned from frequencyQuery()
x: column with the observed absolute frequency.
N: column with the total frequencies
conf.level: confidence level of the returned confidence interval. Must be a single number between 0 and 1.
data: string or vector of query or vc definition strings
pubDateOnly: discard all but the publication date
excludePubDate: discard publication date constraints
mapping: Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.
...: Other arguments passed to geom_ribbon, geom_line, and geom_click_point.
Given a table with columns f, conf.low, and conf.high, ipm adds a column ipm and multiplies conf.low and conf.high by 10^6.
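The conversion described here amounts to a rescaling, sketched below in base R with made-up values. This only illustrates the arithmetic and is not the package's code.

```r
# Hypothetical relative frequencies with confidence bounds
df <- data.frame(
  f = c(1.20e-4, 1.25e-4),
  conf.low = c(1.00e-4, 1.06e-4),
  conf.high = c(1.43e-4, 1.47e-4)
)

# Add ipm (instances per million) and put the bounds on the same scale
scaled <- transform(
  df,
  ipm = f * 1e6,
  conf.low = conf.low * 1e6,
  conf.high = conf.high * 1e6
)
scaled
```

Scaling the bounds together with the frequency keeps ipm inside its interval, so the three columns can be plotted on one axis.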
ci is already included in frequencyQuery()
if (FALSE) {
  library(RKorAPClient)
  library(dplyr)    # bind_cols(), mutate(), %>%
  library(tidyr)    # expand_grid()
  library(ggplot2)
  kco <- new("KorAPConnection", verbose=TRUE)
  # Query each spelling variant per year, then add f, conf.low and conf.high
  expand_grid(year=2015:2018, alternatives=c("Hate Speech", "Hatespeech")) %>%
    bind_cols(corpusQuery(kco, .$alternatives, sprintf("pubDate in %d", .$year))) %>%
    mutate(total=corpusStats(kco, vc=vc)$tokens) %>%
    ci() %>%
    ggplot(aes(x=year, y=f, fill=query, color=query, ymin=conf.low, ymax=conf.high)) +
    geom_point() + geom_line() + geom_ribbon(alpha=.3)
}
if (FALSE) {
  new("KorAPConnection") %>%
    frequencyQuery("Test", paste0("pubDate in ", 2000:2002)) %>%
    ipm()
}
if (FALSE) {
  new("KorAPConnection") %>%
    frequencyQuery(c("Tollpatsch", "Tolpatsch"),
                   vc=paste0("pubDate in ", 2000:2002),
                   as.alternatives = TRUE) %>%
    percent()
}
queryStringToLabel(paste("textType = /Zeit.*/ & pubDate in", c(2010:2019)))
queryStringToLabel(c("[marmot/m=mood:subj]", "[marmot/m=mood:ind]"))
queryStringToLabel(c("wegen dem [tt/p=NN]", "wegen des [tt/p=NN]"))
if (FALSE) {
  library(RKorAPClient)
  library(tidyr)    # expand_grid()
  library(ggplot2)
  kco <- new("KorAPConnection", verbose=TRUE)
  # Compare business texts with all other texts, year by year
  expand_grid(condition = c("textDomain = /Wirtschaft.*/", "textDomain != /Wirtschaft.*/"),
              year = (2005:2011)) %>%
    cbind(frequencyQuery(kco, "[tt/l=Heuschrecke]",
                         paste0(.$condition, " & pubDate in ", .$year))) %>%
    ipm() %>%
    ggplot(aes(year, ipm, fill = condition, color = condition)) +
    geom_freq_by_year_ci()
}