sequoia (version 2.0.7)

ComparePairs: Comparison of all pairwise relationships in 2 pedigrees

Description

Compare, count and identify different types of relative pairs between two pedigrees. The matrix returned by DyadCompare [Deprecated] is a subset of the matrix returned here using default settings.

Usage

ComparePairs(
  Ped1 = NULL,
  Ped2 = NULL,
  Pairs2 = NULL,
  GenBack = 1,
  patmat = FALSE,
  DumPrefix = c("F0", "M0"),
  Return = "Counts"
)

Arguments

Ped1

(Original/reference) pedigree, dataframe with 3 columns: id-dam-sire

Ped2

Second (inferred) pedigree

Pairs2

dataframe with relationship categories between pairs of individuals, used instead of or in addition to Ped2, e.g. as returned by GetMaybeRel. The first three columns are taken as ID1-ID2-relationship; column names and any additional columns are ignored.

GenBack

Number of generations back to consider; 1 returns parent-offspring and sibling relationships, 2 also returns grandparental, avuncular and first cousins. GenBack >2 is not implemented.

patmat

logical, distinguish between paternal versus maternal relative pairs?

DumPrefix

character vector of length 2 with the dummy prefixes in Ped1 and/or Ped2. IDs starting with these prefixes will be excluded, but individuals with dummy parents are compared. Use GetRelCat on a single pedigree to find relationships with dummies.

Return

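One of "Counts", "Summary", "Array", "Dataframe" or "All". "Counts" returns a matrix with counts and "Summary" a summary of the number of identical relationships and mismatches per relationship; "Array" and "Dataframe" return detailed results as a 2xNxN array or as a dataframe. "All" returns a list with all four.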

Value

a matrix with counts, a 3D array or a 4-column dataframe, depending on Return, with by default (GenBack=1, patmat=FALSE) the following 7 relationships:

S

Self (not in counts)

MP

Parent

O

Offspring (not in counts)

FS

Full sibling

HS

Half sibling

U

Unrelated, or otherwise related

X

Either or both individuals not occurring in both pedigrees

Where in the array and dataframe, 'MP' indicates that the second (column) individual is the parent of the first (row) individual, and 'O' indicates the reverse.

When GenBack=2, patmat=TRUE, the following relationships are distinguished:

S

Self (not in counts)

M

Mother

P

Father

O

Offspring (not in counts)

FS

Full sibling

MHS

Maternal half-sibling

PHS

Paternal half-sibling

MGM

Maternal grandmother

MGF

Maternal grandfather

PGM

Paternal grandmother

PGF

Paternal grandfather

GO

Grand-offspring (not in counts)

FA

Full avuncular; maternal or paternal aunt or uncle

HA

Half avuncular

FN

Full nephew/niece (not in counts)

HN

Half nephew/niece (not in counts)

FC1

Full first cousin

DFC1

Double full first cousin

U

Unrelated, or otherwise related

X

Either or both individuals not occurring in both pedigrees

Note that for avuncular and cousin relationships no distinction is made between paternal versus maternal, as this may differ between the two individuals and would generate a large number of subclasses. When a pair is related via multiple paths, the first-listed relationship is returned.

When GenBack=1, patmat=TRUE the categories are (S)-M-P-(O)-FS-MHS-PHS-U-X. When GenBack=2, patmat=FALSE, MGM, MGF, PGM and PGF are combined into GP, with the rest of the categories analogous to the above.

Note that in the dataframe each pair is listed twice, e.g. once as P and once as O, or twice as FS.

When Return = "Counts" (the default), a matrix with counts is returned, with the classification in Ped1 on rows and that in Ped2 on columns. Counts for 'symmetrical' pairs ("FS", "HS", "MHS", "PHS", "FC1", "DFC1", "U", "X") are divided by two.

When Return = "Summary", the counts table is distilled down into a matrix with four columns, whose names assume that Ped1 is the true pedigree:

n

total number of pairs with that relationship in Ped1

OK

Number of pairs with same relationship in Ped2 as in Ped1

lo

Number of pairs with a 'lower' relationship in Ped2 than in Ped1 (see ranking above), but not unrelated in Ped2

hi

Number of pairs with a 'higher' relationship in Ped2 than in Ped1
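The four summary columns can be illustrated with a small base-R sketch on an invented counts matrix. This is one reading of the definitions above, not the package's implementation: the numbers are made up, the categories are assumed to be ranked in the order listed (MP highest), and it is an assumption that n sums over all Ped2 columns, including U and X.

```r
# Toy counts matrix: Ped1 classification on rows, Ped2 on columns.
# Categories run from 'highest' to 'lowest'; all numbers are invented.
cats <- c("MP", "FS", "HS", "U", "X")
Counts <- matrix(c(90,  0,  0,   8, 2,
                    0, 40,  5,   3, 2,
                    0,  4, 70,  20, 6,
                    1,  2, 10, 900, 0,
                    0,  0,  0,   0, 0),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(Ped1 = cats, Ped2 = cats))

rel  <- c("MP", "FS", "HS")            # relationships to summarise
rank <- seq_along(cats)
names(rank) <- cats

Summary <- t(sapply(rel, function(r) {
  others <- setdiff(rel, r)            # related categories other than r
  c(n  = sum(Counts[r, ]),             # assumption: all Ped2 columns count
    OK = Counts[r, r],
    lo = sum(Counts[r, others[rank[others] > rank[r]]]),
    hi = sum(Counts[r, others[rank[others] < rank[r]]]))
}))

print(Summary)
print(Summary[, "OK"] / Summary[, "n"])   # pairwise assignment rate
```

Pairs classified as U or X in Ped2 are counted in n but in none of OK, lo or hi, so the three columns need not sum to n.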

When Return = "Array", the first dimension is 1=Ped1, 2=Ped2, the 2nd and 3rd dimension are the two individuals of the pair.
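As a rough illustration of this layout, the base-R sketch below builds a toy 2xNxN character array by hand and indexes it by ID. The individuals, the relationships and the labels "Ped1"/"Ped2" on the first dimension are invented for the example; the dimnames of the real output may differ.

```r
# Toy stand-in for the Return = "Array" output; IDs and labels are made up.
ids <- c("dam1", "off1", "off2")
RC <- array("U", dim = c(2, 3, 3),
            dimnames = list(c("Ped1", "Ped2"), ids, ids))
for (i in ids) RC[, i, i] <- "S"                  # self on the diagonal

# Ped1: dam1 is the parent of both offspring, which are half siblings
RC["Ped1", c("off1", "off2"), "dam1"] <- "MP"     # row = offspring, column = parent
RC["Ped1", "dam1", c("off1", "off2")] <- "O"      # the reverse direction
RC["Ped1", "off1", "off2"] <- RC["Ped1", "off2", "off1"] <- "HS"

# Ped2: only off1 is assigned to dam1
RC["Ped2", "off1", "dam1"] <- "MP"
RC["Ped2", "dam1", "off1"] <- "O"

pair <- RC[, "off2", "dam1"]   # both classifications for one pair
print(pair)
```

Leaving the first index empty returns the classification in both pedigrees at once, which is the same indexing pattern used on RC.A in the Examples.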

When Return = "Dataframe", the columns are

id.A

First individual of the pair

id.B

Second individual of the pair

RC1

the relationship category in Ped1, as a factor with all considered categories as levels, including those with 0 count

RC2

the relationship category in Ped2
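A minimal base-R sketch of working with such a dataframe, using invented rows and the column names listed above (RC1, RC2). The Examples section refers to these columns as Ped1 and Ped2, so it may be worth checking names() on the real output first.

```r
# Toy stand-in for the Return = "Dataframe" output; the rows are invented.
lvls <- c("S", "MP", "O", "FS", "HS", "U", "X")
RC.toy <- data.frame(
  id.A = c("off1", "dam1", "off1", "off2", "off2", "dam1"),
  id.B = c("dam1", "off1", "off2", "off1", "dam1", "off2"),
  RC1  = factor(c("MP", "O", "HS", "HS", "MP", "O"), levels = lvls),
  RC2  = factor(c("MP", "O", "U",  "U",  "U",  "U"), levels = lvls),
  stringsAsFactors = FALSE
)

# Cross-tabulate; unused levels such as "FS" show up with zero counts
tab <- table(Ped1 = RC.toy$RC1, Ped2 = RC.toy$RC2)

# Rows where the two pedigrees disagree
mismatch <- RC.toy[as.character(RC.toy$RC1) != as.character(RC.toy$RC2), ]

print(tab)
print(mismatch)
```

Because each pair is listed twice, the half-sibling pair contributes two rows to the HS-versus-U cell of the table.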

Details

If Pairs2 is as returned by GetMaybeRel (identified by the additional column names 'LLR' and 'OH'), these relationship categories have a '?' appended in the output, to distinguish them from those derived from Ped2.

When Pairs2$TopRel contains values other than the ones listed among the return values for the combination of patmat and GenBack, they are prioritised in decreasing order of factor levels, or in decreasing alphabetical order, and before the default (Ped2-derived) levels.

See Also

PedCompare for individual-based comparison; GetRelCat for pairs of relatives within a single pedigree.

Examples

# NOT RUN {
data(Ped_HSg5, SimGeno_example, LH_HSg5, package="sequoia")
SeqOUT <- sequoia(GenoM = SimGeno_example, LifeHistData = LH_HSg5,
                  MaxSibIter = 0)
ComparePairs(Ped1=Ped_HSg5, Ped2=SeqOUT$Pedigree, Return="Counts")
# matrix with counts of pairs
RC.A <- ComparePairs(Ped1=Ped_HSg5, Ped2=SeqOUT$Pedigree, Return="Array")
RC.A[, "a05017", "b05018"] # check specific pairs

RC.DF <- ComparePairs(Ped1=Ped_HSg5, Ped2=SeqOUT$Pedigree,
  Return="Dataframe")
RC.DF[RC.DF$id.A=="a05017" & RC.DF$id.B=="b05018", ] # check specific pairs
table(RC.DF$Ped1, RC.DF$Ped2)
# incl. S,O,GO,FN,HN; duplicated counts for FS,HS,FC1,DFC1,U,X
Mismatches <- RC.DF[RC.DF$Ped1 != RC.DF$Ped2, ]

Maybe <- GetMaybeRel(SimGeno_example, SeqList=SeqOUT, ParSib="sib")
cp <- ComparePairs(Ped1=Ped_HSg5, Ped2=SeqOUT$Pedigree,
                   Pairs2=Maybe$MaybeRel, Return="All")
cp$Counts[, colSums(cp$Counts)>0]
cp$Summary[,"OK"] / cp$Summary[,"n"]  # pairwise assignment rate

# }