multistageoptimal.pb: Optimizing two-stage selection in plant breeding with grid algorithm

Description

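This function optimizes two-stage selection for the testcross progenies of DH lines in plant breeding. It evaluates a grid of candidate allocations and reports the one that maximizes the selection gain $\Delta G$ within the given budget.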

Usage

multistageoptimal.pb(N.upper, N.lower, num.grid, Budget, 
CostC, CostTv, V, L1, Rep, N.fs, detail, fig, alg)

Arguments

N.upper

Vector with length n. It is the vector of upper limits of candidates X.

N.lower

Vector with length n. It is the vector of lower limits of candidates X.

num.grid

An integer value. It is the number of equally spaced points that divide the axis of $x_1$ into $num.grid-1$ intervals, so that there are $(num.grid-1)^n$ grid cells in an n-dimensional hypercube. More detail see

Budget

A double value. It contains the value of total budget.

CostC

A double value. It contains the costs of producing and identifying a candidate.

CostTv

Vector with length n. It contains the cost of evaluating a candidate in the tests performed at stage i, i=1,...,n. The cost may vary between stages.

V

Vector of variance components. For more detail see multistagecor.

L1

Scalar. It is the number of test locations in the first stage, $L_1$.

Rep

Vector with length n. It contains the number of replications per location in each stage, i.e. $R_i$ for stage i (the examples below use Rep=c(1,1)).

N.fs

An integer value. It is the number of final selected candidates.

detail

Logical. If TRUE, the results for all grid points are returned; otherwise only the maximum is returned.

alg

An object used to switch between two algorithms. For more detail see multistagegain.

fig

Logical. If TRUE, a figure of contour plot will be saved in the default folder.
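
As a rough illustration of how N.lower, N.upper and num.grid could span a search grid, the following sketch builds equally spaced points for two stages. It is an assumption about the grid layout based on the description of num.grid, not the package's internal code; the names axes, grid.points, step.size and n.cells are hypothetical.

```r
# Hypothetical sketch of an equally spaced two-stage grid (not package code).
N.lower  <- c(1, 1)      # lower limits for N1 and N2
N.upper  <- c(401, 401)  # upper limits for N1 and N2
num.grid <- 21           # number of points per axis

# One axis of num.grid equally spaced points per stage
axes <- lapply(seq_along(N.lower), function(i) {
  seq(N.lower[i], N.upper[i], length.out = num.grid)
})

# All combinations of the axis points
grid.points <- expand.grid(N1 = axes[[1]], N2 = axes[[2]])

step.size <- diff(axes[[1]])[1]              # distance between neighbouring points
n.cells   <- (num.grid - 1)^length(N.lower)  # (num.grid-1)^n grid cells

step.size          # 20
n.cells            # 400
nrow(grid.points)  # 441 grid points (21 * 21)
```

With 21 points per axis the spacing between neighbouring values of $N_i$ is 20, so a finer search needs a larger num.grid at the price of more evaluations.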

Value

If detail = FALSE, the output of this function is a vector with the optimal number of candidates $(N_1,N_2)$, locations $(L_1,L_2)$, replicates $(R_1,R_2)$ and the maximum $\Delta G(y)$. Otherwise, the result for all the grid points, which have been calculated, will be exported as a table.

Details

This function is designed specifically for plant breeding, where field tests of candidates create a more complicated scenario. Here, the candidates are usually tested in replicated multi-location trials over several years, corresponding to the stages of selection. Thus, besides $\textbf{N}$, the numbers of candidates to be tested in each stage, the breeder must also decide on the intensity of testing, as reflected by the vectors for the number of test locations $\textbf{L}=\{L_1,...,L_n\}$ and replications $\textbf{R}=\{R_1,...,R_n\}$, where $L_i$ and $R_i$ refer to the number of test locations and replications per location, respectively, in stage $i$. If there is no upper limit on $L_i$, then $R_i=1$ is optimal for maximizing $\Delta G$ (Longin et al. 2007). Normally a large number of candidates will be tested in few locations at the first stage, i.e., $L_1=1$ or $2$.

Under this scenario, the elements in $\bm{\Sigma}^{*}$ are rational functions of $L$, $R$ and the vector of variance components $\textbf{V}=\{Vg, Vgl, Vgy, Vgly, Ve\}$, where the latter refer to the variance among genotypes ($Vg$), genotype $\times$ location interactions ($Vgl$), genotype $\times$ year interactions ($Vgy$), genotype $\times$ year $\times$ location interactions ($Vgly$) and plot error ($Ve$). Here, $Vg$ corresponds to $\sigma_y^2$ and is set equal to 1.

Likewise, the costs are not only a function of $\textbf{N}$, but also of $\textbf{L}$ and $\textbf{R}$, because each test plot in field trials is associated with costs. Hence, the set of admissible allocations of resources $\Omega (B)$ can be described as $$\Omega (B) := \{ \omega = (\textbf{N},\textbf{L},\textbf{R}) \mid C(\omega) \leq B \}.$$ In the simplest case, $$C(\omega) = \sum_{i=1}^{n} N_i \cdot L_i \cdot R_i \cdot CostT_i + N_1 \cdot CostC \leq B,$$ where $CostC$ refers to the costs of producing or identifying a candidate and $CostT_i$ refers to the costs of testing a candidate in a test unit for stage $i$.
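
The simplest-case cost function $C(\omega)$ and the budget condition can be written out as a small R sketch. The helpers allocation.cost and is.admissible are hypothetical names introduced only for illustration and are not part of the package; the allocation values are made up to show the arithmetic.

```r
# Hypothetical helper (not part of the package): cost of one allocation
# omega = (N, L, R), i.e. sum_i N_i * L_i * R_i * CostT_i + N_1 * CostC.
allocation.cost <- function(N, L, R, CostTv, CostC) {
  stopifnot(length(N) == length(L),
            length(N) == length(R),
            length(N) == length(CostTv))
  sum(N * L * R * CostTv) + N[1] * CostC
}

# Hypothetical helper: TRUE if the allocation lies in Omega(B)
is.admissible <- function(N, L, R, CostTv, CostC, Budget) {
  allocation.cost(N, L, R, CostTv, CostC) <= Budget
}

# Illustrative two-stage allocation (made-up numbers)
N <- c(200, 20)  # candidates tested in stage 1 and stage 2
L <- c(2, 5)     # test locations per stage
R <- c(1, 1)     # replications per location

cost <- allocation.cost(N, L, R, CostTv = c(1, 1), CostC = 0.5)
cost  # 200*2*1*1 + 20*5*1*1 + 200*0.5 = 600

admissible <- is.admissible(N, L, R, CostTv = c(1, 1), CostC = 0.5,
                            Budget = 1000)
admissible  # TRUE, because 600 <= 1000
```

Under these assumptions, doubling $L_2$ to 10 would raise the cost to 700, which is still admissible for a budget of 1000.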

Examples

# examples for the JSS paper

multistageoptimal.pb(N.upper=rep(401,2), N.lower=c(1,1), num.grid=21, Budget=1000, 
 CostC=0.5, CostTv=c(1,1), V="VC2", L1=2, Rep=c(1,1), N.fs=1, alg=GenzBretz())

# grid

# six breeding schemes, one per row of gain.table
dim=6

gain.table= array(0,c(dim,7))
result.nlm= array(0,c(dim,9))
result.grid= array(0,c(dim,9))
rownames(gain.table)= c(1:dim)
colnames(gain.table)= c("NumSelected","Budget","maxN1","maxN2",
                        "Location1","steplength","Calcu.Gain")

# "steplength" is passed to num.grid and "Location1" to L1 in the loop below
gain.table[1,]=c(1,200, 101, 101,1,21,0)
gain.table[2,]=c(1,1000,401, 401,2,21,0)
gain.table[3,]=c(1,5000,2001,2001,2,41,0)
gain.table[4,]=c(4,200, 101, 101,1,21,0)
gain.table[5,]=c(4,1000,601, 601,1,21,0)
gain.table[6,]=c(4,5000,2001,2001,2,41,0)

#######  
# IMPORTANT 
#######

# To reduce the CRAN check time, only the first breeding scheme is run by default.

# To run all 6 schemes, change dim6=1 to dim6=6 in the line below.

dim6=1

#######  
# change the code above
#######


for (i in 1:dim6 )
{   
    maxn=gain.table[i,"maxN1"]
    length=gain.table[i,"steplength"]
    Budget=gain.table[i,"Budget"]
    Location1= gain.table[i,"Location1"]
    NumSelected=gain.table[i,"NumSelected"]    

    temp<-multistageoptimal.pb(N.upper=rep(maxn,2), N.lower=c(1,1), num.grid=length, 
    Budget=Budget,  CostC=0.5, CostTv=c(1,1), V="VC2", L1=Location1, Rep=c(1,1), 
    N.fs=NumSelected, alg=GenzBretz(),detail=TRUE)    
    result.grid[i,]=temp[[1]][1,]
}
  colnames(result.grid)<-c("NumSelected","Budget","Location1","Location2",
  "N1","N2","Rep1","Rep2","gain")
  rownames(result.grid)<-rep("grid",6)
  result.grid


# round-nlm

  dim=6
  gain.table= array(0,c(dim,7))
  result.nlm= array(0,c(dim,9))
  result.grid= array(0,c(dim,9))
  rownames(gain.table)= c(1:dim)
  colnames(gain.table)= c("NumSelected","Budget","maxN1","maxN2",
  "Location1","steplength","Calcu.Gain")

gain.table[1,]=c(1,200, 101, 101,1,21 ,0)
gain.table[2,]=c(1,1000,401, 401,2,21,0)
gain.table[3,]=c(1,5000,2001,2001,2,41,0)
gain.table[4,]=c(4,200, 101, 101,1,21, 0)
gain.table[5,]=c(4,1000,601, 601,1,21, 0)
gain.table[6,]=c(4,5000,2001,2001,2,41,0) 

# name changed in certain version

gainmatrix=gain.table

for (i in 1:dim6 )
{   
    maxn=gainmatrix[i,"maxN1"]
    length=gainmatrix[i,"steplength"]
    Budget=gainmatrix[i,"Budget"]
    Location1= gainmatrix[i,"Location1"]
    NumSelected=gainmatrix[i,"NumSelected"]    
    temp<-multistageoptimal.pb(N.upper=rep(maxn,2), N.lower=c(1,1), 
    num.grid=length, Budget=Budget,  CostC=0.5, CostTv=c(1,1), V="VC2", 
    L1=Location1, Rep=c(1,1), N.fs=NumSelected, alg=GenzBretz(),detail=FALSE)
    result.nlm[i,]=temp[[1]][2,]
}


  colnames(result.nlm)<-c("NumSelected","Budget","Location1","Location2",
  "N1","N2","Rep1","Rep2","gain")
  rownames(result.nlm)<-rep("round-nlm",6)
  result.nlm