exams.forge (version 1.0.10)

pearson_data: Pearson Data

Description

Generates an integer data set for computing a correlation, using sumofsquares(). If n > 100 and nmax > 6, it is better to use one of the precomputed solutions, because the search may otherwise take up to maxt seconds. Note that the correlation of the generated data set may differ from the desired correlation.

Usage

pearson_data(r, n = 100, nmax = 6, maxt = 30, xsos = NULL, ysos = NULL)

dpearson(r, n = 100, nmax = 6, maxt = 30, xsos = NULL, ysos = NULL)

Value

A matrix with two columns. Its attribute interim holds the intermediate values as a matrix whose rows contain \(x_i\), \(y_i\), \(x_i-\bar{x}\), \(y_i-\bar{y}\), \((x_i-\bar{x})^2\), \((y_i-\bar{y})^2\), and \((x_i-\bar{x})(y_i-\bar{y})\). In a final step, the vector of row sums is appended as a further column.
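The layout of the intermediate values can be reproduced by hand in base R. The following is a minimal sketch, assuming hypothetical integer data and hypothetical row names (dx, dy, dx2, dy2, dxdy); it follows the description above, and the actual interim attribute returned by pearson_data() may be labelled or ordered differently.

```r
# Hypothetical integer data, chosen only for illustration
x <- c(-2, -1, 0, 1, 2)
y <- c(-1, -2, 0, 2, 1)

# Deviations from the means
dx <- x - mean(x)
dy <- y - mean(y)

# One row per intermediate quantity, one column per observation
interim <- rbind(x = x, y = y, dx = dx, dy = dy,
                 dx2 = dx^2, dy2 = dy^2, dxdy = dx * dy)

# Append the row sums as a further column
interim <- cbind(interim, sum = rowSums(interim))

# Pearson correlation from the three sums: 8 / sqrt(10 * 10) = 0.8
r <- interim["dxdy", "sum"] /
  sqrt(interim["dx2", "sum"] * interim["dy2", "sum"])
print(r)
```

The value computed from the sums agrees with cor(x, y), which is why such a table is convenient when the calculation has to be shown step by step.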

Arguments

r

numeric: desired correlation

n

integer: number to decompose as a sum of squares, see sumofsquares().

nmax

integer: maximal number of squares in the sum, see sumofsquares().

maxt

numeric: maximal number of seconds the routine should run, see sumofsquares().

xsos

sos matrix: precomputed sum-of-squares matrix for x, for example sos100 from data(sos)

ysos

sos matrix: precomputed sum-of-squares matrix for y
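To make the roles of n and nmax concrete, the sketch below enumerates every way of writing n as a sum of at most nmax positive squares. The function sos_decompositions() is a hypothetical helper written for this illustration only; it is a naive recursive search, not the algorithm used by sumofsquares(), and it has no time limit corresponding to maxt.

```r
# Hypothetical helper, not part of exams.forge:
# returns a list of non-increasing integer vectors v with sum(v^2) == n
# and length(v) <= nmax
sos_decompositions <- function(n, nmax, largest = floor(sqrt(n))) {
  if (n == 0) return(list(integer(0)))  # nothing left to decompose
  if (nmax == 0) return(list())         # no squares left to use
  out <- list()
  # Try each candidate for the next (largest remaining) square
  for (k in seq_len(min(largest, floor(sqrt(n))))) {
    rest <- sos_decompositions(n - k^2, nmax - 1, k)
    for (tail in rest) out[[length(out) + 1]] <- c(k, tail)
  }
  out
}

# 25 as a sum of at most two squares: 4^2 + 3^2 and 5^2
dec <- sos_decompositions(25, 2)
print(dec)
```

The number of candidates grows quickly with n and nmax, which is consistent with the advice to prefer precomputed solutions for larger values.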

Examples

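library(exams.forge)
# load the precomputed sum-of-squares solutions
data(sos)
# integer data set with a desired correlation of 0.7
xy <- pearson_data(0.7, xsos=sos100)
# column sums, column sums of squares, and sum of cross products
colSums(xy)
colSums(xy^2)
sum(xy[,1]*xy[,2])
# my data: shift and rescale the integer values
x <- 100+5*xy[,1]
y <- 100+5*xy[,2]
cor(x, y)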
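The last step of the example shifts and rescales the generated integers. Because the Pearson correlation is unchanged by a linear transformation with a positive slope, the rescaled data keep the correlation of the integer data. A minimal base R check, assuming hypothetical integer values x0 and y0 in place of the output of pearson_data():

```r
# Hypothetical integer data standing in for the columns of xy
x0 <- c(-2, -1, 0, 1, 2)
y0 <- c(-1, -2, 0, 2, 1)

# Same transformation as in the example: shift by 100, scale by 5
x <- 100 + 5 * x0
y <- 100 + 5 * y0

# Correlations before and after the transformation agree
same <- isTRUE(all.equal(cor(x0, y0), cor(x, y)))
print(same)
```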