exams.forge (version 1.0.10)

cor_data: Correlation and Data Generation

Description

Generates a data set based on x and y for a given target correlation r, computed according to stats::cor(). The algorithm only modifies the order of the y values, so the (marginal) distributions of x and y are guaranteed to remain unchanged. Please note that the final correlation is not guaranteed to equal the desired correlation, because the algorithm modifies the order iteratively and stops after at most maxit iterations. If you are unsatisfied with the result, it might help to increase maxit.
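The idea of reaching a target correlation purely by reordering y can be illustrated with a minimal sketch. The function reorder_to_cor() below is hypothetical and is not the exams.forge implementation; it assumes a simple random-swap search that keeps a swap only when it brings the correlation closer to r.

```r
# Hypothetical helper, NOT the exams.forge algorithm: a random-swap search
# that only permutes y, so the values of x and y themselves never change.
reorder_to_cor <- function(x, y, r, method = "pearson", maxit = 1000) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  best <- abs(stats::cor(x, y, method = method) - r)
  for (i in seq_len(maxit)) {
    idx  <- sample.int(length(y), 2)  # two distinct positions to swap
    cand <- y
    cand[idx] <- cand[rev(idx)]       # swap the two y values
    d <- abs(stats::cor(x, cand, method = method) - r)
    if (d < best) {                   # keep the swap only if it helps
      y    <- cand
      best <- d
    }
  }
  y
}

set.seed(1)
x  <- runif(20)
y  <- runif(20)
y2 <- reorder_to_cor(x, y, r = 0.6)

stats::cor(x, y2)             # closer to 0.6 than cor(x, y), not necessarily equal
identical(sort(y), sort(y2))  # TRUE: same values, only the order differs
```

Because only permutations of y are tried, the set of reachable correlations is finite, which is one reason the target may be missed for short vectors.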

Usage

cor_data(
  x,
  y,
  r,
  method = c("pearson", "kendall", "spearman"),
  ...,
  maxit = 1000
)

dcorr(x, y, r, method = c("pearson", "kendall", "spearman"), ..., maxit = 1000)

Value

A matrix with two columns and an attribute interim that holds the intermediate values as a matrix. The rows of the interim matrix contain:

  • if method=="pearson": \(x_i\), \(y_i\), \(x_i-\bar{x}\), \(y_i-\bar{y}\), \((x_i-\bar{x})^2\), \((y_i-\bar{y})^2\), and \((x_i-\bar{x})(y_i-\bar{y})\).

  • if method=="kendall":

    • \(x_i\): The original x values.

    • \(y_i\): The original y values.

    • \(p_i\): The number of concordant pairs.

    • \(q_i\): The number of discordant pairs.

  • if method=="spearman": \(x_i\), \(y_i\), \(p_i\) (concordant pairs), and \(q_i\) (discordant pairs). In a final step, a vector with the row sums is appended as a further column.
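To make the Pearson rows concrete, the following sketch rebuilds such a table by hand in base R. It does not call cor_data(); the row names and the assumption that the row-sum column is appended in the Pearson case as well are illustrative only.

```r
# Hand-built table of Pearson intermediate values (illustrative names).
x  <- c(1, 2, 4, 7)
y  <- c(2, 1, 5, 6)
dx <- x - mean(x)   # deviations from the mean of x
dy <- y - mean(y)   # deviations from the mean of y

tab <- rbind(x = x, y = y, dx = dx, dy = dy,
             dx2 = dx^2, dy2 = dy^2, dxdy = dx * dy)
tab <- cbind(tab, sum = rowSums(tab))  # row sums as a further column

# Pearson correlation from the row sums:
# sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
r_manual <- tab["dxdy", "sum"] / sqrt(tab["dx2", "sum"] * tab["dy2", "sum"])
all.equal(unname(r_manual), stats::cor(x, y))
```

The last line compares the hand-computed value with stats::cor(), which shows how the row sums feed directly into the correlation formula.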

Arguments

x

numeric: given x values

y

numeric: given y values

r

numeric: desired correlation

method

character: indicates which correlation coefficient is to be computed (default: "pearson")

...

further parameters given to stats::cor()

maxit

numeric: maximum number of iterations (default: 1000)

Examples

x <- runif(6)
y <- runif(6)
xy <- cor_data(x, y, r=0.6)
cbind(x, y, xy)
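
Since the target correlation is not guaranteed, it can be worth checking what was actually achieved. The sketch below assumes that the first column of the result holds x and the second the reordered y; the documentation only states that the matrix has two columns, so treat the column indexing as an assumption. The call is guarded so the snippet also runs where exams.forge is not installed.

```r
achieved <- NA_real_
if (requireNamespace("exams.forge", quietly = TRUE)) {
  set.seed(42)
  x  <- runif(6)
  y  <- runif(6)
  xy <- exams.forge::cor_data(x, y, r = 0.6, maxit = 5000)
  # Assumption: column 1 holds x and column 2 the reordered y.
  achieved <- stats::cor(xy[, 1], xy[, 2])
  print(achieved)                           # compare with the target 0.6
  print(all.equal(sort(xy[, 2]), sort(y)))  # are the y values only reordered?
}
```

With only six observations there are just 720 possible orderings of y, so the achieved value may differ noticeably from the target regardless of maxit.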