e.step: performs the E step of the EM algorithm for a single pedigree for both cases with and without familial dependence

Description

Computes triplet and individual weights in the E step of the EM algorithm for all pedigrees in the data, in both cases with and without familial dependence. This is an internal function not meant to be called by the user.

Usage

e.step(ped, probs, param, dens, peel, x = NULL, var.list = NULL, 
       famdep = TRUE)

Arguments

ped

a matrix representing pedigrees and measurements: ped[,1] family ID, ped[,2] subject ID, ped[,3] dad ID, ped[,4] mom ID, ped[,5] sex, ped[,6] symptom status (2: symptomatic, 1: without symptoms, 0: missing), ped[,7:ncol(ped)] measurements, where each column corresponds to a phenotypic measurement,

probs

a list of probability parameters of the model, see below for more details,

param

a list of measurement distribution parameters of the model, see below for more details,

dens

distribution of the measurements used in the model (multinormal, multinomial, ...),

peel

a list of pedigree peeling containing connectors by peeling order and couples of parents,

x

covariates, if any. Default is NULL,

var.list

a list of integers indicating which covariates (taken from x) are used for a given type of measurement. Default is NULL,

famdep

a logical variable indicating whether the familial dependence model is used. Default is TRUE. In models without familial dependence, individuals are treated as independent and the pedigree structure is ignored. In models with familial dependence, a child's class depends on his parents' classes via a triplet-transition probability.
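
The column layout of ped described above can be sketched with a toy matrix. This is only an illustration: the values, the column names, the use of 0 for a missing parent ID and the sex coding are assumptions, not taken from the package data.

```r
# Hypothetical toy pedigree: one nuclear family, two measurements per subject.
ped <- cbind(
  fam    = c(1, 1, 1),       # ped[,1] family ID
  id     = c(1, 2, 3),       # ped[,2] subject ID
  dad    = c(0, 0, 1),       # ped[,3] dad ID (0 for founders is an assumption)
  mom    = c(0, 0, 2),       # ped[,4] mom ID
  sex    = c(1, 2, 1),       # ped[,5] sex (coding is an assumption)
  status = c(1, 2, 0),       # ped[,6] 2: symptomatic, 1: without symptoms, 0: missing
  y1     = c(0.3, 1.7, NA),  # ped[,7:ncol(ped)] one column per measurement
  y2     = c(-0.2, 2.1, NA)
)

# Extract the measurement columns, keeping the matrix shape
measurements <- ped[, 7:ncol(ped), drop = FALSE]
```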

Value

The function returns a list of 3 elements:

ww

triplet posterior probabilities, an array of dimension n times 2 times K+1 times K+1 times K+1, where n is the number of individuals and K is the total number of latent classes of the model. For an individual i, the triplet probability ww[i,s,c,c_1,c_2] is the posterior probability that individual i belongs to class c when his symptom status is s, given that his parents' classes are c_1 and c_2, where s takes two values: 1 for affected and 2 for unaffected. In particular, all ww[,2,,,] are zeros for affected individuals and all ww[,1,,,] are zeros for unaffected individuals. For individuals with unknown symptom status, both ww[,1,,,] and ww[,2,,,] are populated,

w

individual posterior probabilities, an array of dimension n times 2 times K+1, where n is the number of individuals, such that w[i,s,c] is the posterior probability that individual i belongs to class c when his symptom status is s, where s takes two values: 1 for affected and 2 for unaffected. In particular, all w[,2,] are zeros for affected individuals and all w[,1,] are zeros for unaffected individuals. For individuals with unknown symptom status, both w[,1,] and w[,2,] are populated,

log-likelihood of the considered model and parameters.
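
The array shapes described above can be sketched as follows. The sizes n and K and the placeholder weights are assumptions chosen for illustration; this does not reproduce what e.step computes, only the dimensions and the zero-slice convention for affected and unaffected individuals.

```r
n <- 3  # assumed number of individuals
K <- 2  # assumed number of latent classes

# Empty containers with the documented dimensions
ww <- array(0, dim = c(n, 2, K + 1, K + 1, K + 1))
w  <- array(0, dim = c(n, 2, K + 1))

# Placeholder weights (not real posteriors):
# individual 1 affected, individual 2 unaffected, individual 3 unknown status
w[1, 1, ] <- 0.5
w[2, 2, ] <- 0.5
w[3, , ]  <- 0.5

# Affected individuals have an all-zero s = 2 slice,
# unaffected individuals have an all-zero s = 1 slice
affected.ok   <- all(w[1, 2, ] == 0)
unaffected.ok <- all(w[2, 1, ] == 0)
```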

Details

probs is a list of initial probability parameters:

For models with familial dependence:

p: a probability vector, each p[c] is the probability that a symptomatic founder is in class c for c>=1,
p0: the probability that a founder without symptoms is in class 0,
p.trans: an array of dimension K times K+1 times K+1, where K is the number of latent classes of the model, and is such that p.trans[c_i,c_1,c_2] is the conditional probability that a symptomatic individual i is in class c_i given that his parents are in classes c_1 and c_2,
p0connect: a vector of length K, where p0connect[c] is the probability that a connector without symptoms is in class 0, given that one of his parents is in class c>=1 and the other in class 0,
p.found: the probability that a founder is symptomatic,
p.child: the probability that a child is symptomatic,
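
As a rough sketch, a probs list with the components listed above could be built by hand as follows. K = 2 and all numeric values are arbitrary placeholders, and the normalization of p.trans over its first index is an assumption drawn from its description as a conditional probability.

```r
K <- 2  # assumed number of latent classes

# p.trans[c_i, c_1, c_2]: uniform placeholder, so that for every pair of
# parental classes the K entries over c_i sum to 1 (assumed normalization)
p.trans <- array(1 / K, dim = c(K, K + 1, K + 1))

probs.dep <- list(
  p         = rep(1 / K, K),  # class probabilities for symptomatic founders
  p0        = 0.9,            # founder without symptoms is in class 0
  p.trans   = p.trans,        # triplet-transition probabilities
  p0connect = rep(0.8, K),    # connector without symptoms is in class 0
  p.found   = 0.2,            # a founder is symptomatic
  p.child   = 0.3             # a child is symptomatic
)
```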

For models without familial dependence, all individuals are independent:

p: a probability vector, each p[c] is the probability that a symptomatic individual is in class c for c>=1,
p0: the probability that an individual without symptoms is in class 0,
p.aff: the probability that an individual is symptomatic,
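
For the model without familial dependence, the list is shorter. A minimal sketch, again with K = 2 and arbitrary placeholder values:

```r
K <- 2  # assumed number of latent classes

probs.indep <- list(
  p     = rep(1 / K, K),  # class probabilities for symptomatic individuals
  p0    = 0.9,            # individual without symptoms is in class 0
  p.aff = 0.25            # an individual is symptomatic
)
```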

param is a list of measurement density parameters: the coefficients alpha (cumulative logistic coefficients, see alpha.compute) in the case of discrete or ordinal data, and means mu and variance-covariance matrices sigma in the case of continuous data.

References

TAYEB et al.: Solving Genetic Heterogeneity in Extended Families by Identifying Sub-types of Complex Diseases. Computational Statistics, 2011, DOI: 10.1007/s00180-010-0224-2.

Examples

# NOT RUN {
#data
data(ped.cont)
data(peel)
#probs and param
data(probs)
data(param.cont)
#the function
e.step(ped.cont,probs,param.cont,dens.norm,peel,x=NULL,var.list=NULL,
       famdep=TRUE)
# }
