geneNetBP (version 2.0.1)

fit-methods: Fit a Conditional Gaussian Bayesian Network or Discrete Bayesian Network to QTL data

Description

Learn the structure of a genotype-phenotype network from quantitative trait loci (QTL) data and the conditional probability table for each node in the network.

Usage

## Fit a conditional gaussian or a discrete bayesian network using RHugin.
fit.gnbp(geno, pheno, constraints, learn = "TRUE", graph, type = "cg", alpha = 0.001, tol = 1e-04, maxit = 0)

## Fit a discrete bayesian network using bnlearn.
fit.dbn(geno, pheno, graph, learn = "TRUE", method = "hc", whitelist, blacklist)

Arguments

geno
a data frame of column vectors of class factor (or one that can be coerced to that class) and non-empty column names.

pheno
a data frame of column vectors with non-empty column names. The columns must be of class numeric for fit.gnbp with type = "cg", and of class factor for fit.gnbp with type = "db" and for fit.dbn.
constraints
an optional list of constraints specifying required and forbidden edges for fit.gnbp. See details.
learn
a boolean value. If TRUE (default), the network structure will be learnt. If FALSE, only conditional probabilities will be learnt (a graph must be provided in this case.)
graph
graph structure of class "graphNEL", or a data frame with two columns (labeled "from" and "to") containing the set of edges to be included in the graph. Must be provided if learn is FALSE. See details.
type
specify the type of network for fit.gnbp. "cg" for Conditional Gaussian (default) and "db" for Discrete Bayesian.
method
a character string. The score-based or constraint-based algorithms available in the package bnlearn. Valid options are "hc", "tabu", "gs", "iamb", "fast.iamb", "inter.iamb", "mmhc". See details below.
whitelist
a data frame with two columns (labeled "from" and "to"), containing a set of edges to be included in the graph.
blacklist
a data frame with two columns (labeled "from" and "to"), containing a set of edges NOT to be included in the graph.
alpha
a single numeric value specifying the significance level (for use with RHugin). Default is 0.001.
tol
a positive numeric value (optional) specifying the tolerance for EM algorithm to learn conditional probability tables (for use with RHugin). Default value is 1e-04. See learn.cpt for details.

maxit
a positive integer value (optional) specifying the maximum number of iterations of EM algorithm to learn conditional probability tables (for use with RHugin). See learn.cpt for details.
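
The class requirements on geno and pheno listed above can be checked before fitting. The following is a minimal base R sketch, not part of geneNetBP; the marker names (M1, M2), trait names (T1, T2) and the helper variable net_type are made up for illustration.

```r
# Toy data with made-up marker and trait names (not from the package datasets).
geno <- data.frame(
  M1 = c("AA", "AB", "BB", "AB"),
  M2 = c("AB", "AB", "AA", "BB"),
  stringsAsFactors = FALSE
)
pheno <- data.frame(
  T1 = c(1.2, 0.7, 3.1, 2.4),
  T2 = c(0.3, 0.9, 1.8, 1.1)
)

# Coerce every genotype column to a factor, keeping the column names.
geno[] <- lapply(geno, factor)

# Phenotypes should be all numeric (type = "cg") or all factor (type = "db").
all_numeric <- all(vapply(pheno, is.numeric, logical(1)))
all_factor  <- all(vapply(pheno, is.factor, logical(1)))
net_type <- if (all_numeric) "cg" else if (all_factor) "db" else NA_character_

# Stop early on mixed phenotype classes or empty column names.
stopifnot(!is.na(net_type), all(nzchar(names(geno))), all(nzchar(names(pheno))))
```

With real data, net_type could then be passed as the type argument of fit.gnbp.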

Value

fit.gnbp returns an object of class "gpfit". fit.dbn returns an object of class "dbnfit".

Details

The function fit.gnbp fits a conditional gaussian bayesian network or a discrete bayesian network at the specified level of significance alpha, to genotype-phenotype (QTL) data by the PC algorithm implemented in the RHugin package. The conditional probability tables are learnt for each node in the domain by the EM algorithm implemented in the RHugin package.

Edges between the genotypes at SNP markers are not allowed, and the genotypes are constrained to precede the phenotypes. The phenotypes should be either all numeric or all discrete; the function does not currently support a mixture of discrete and continuous phenotypes. Additional domain knowledge about edges should be provided as a list of constraints, the structure of which is described in detail in learn.structure. Briefly, the constraints argument is a list of two elements: directed and undirected. Each of these elements should in turn be a list with two elements: required and forbidden. Each element of required and forbidden must be a character vector of length two specifying the names of the nodes at the two ends of an edge. See learn.structure for details.
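
The nested structure of the constraints argument described above can be sketched in base R as follows. The node names (M1, T1, T2, T3) are placeholders, and reading a directed pair as "first node points to second node" is an assumption here; learn.structure in RHugin is the authoritative reference.

```r
# Placeholder node names; in practice use column names from geno and pheno.
constraints <- list(
  directed = list(
    required  = list(c("M1", "T1")),  # assumed to mean: require M1 -> T1
    forbidden = list(c("T2", "T1"))   # assumed to mean: forbid T2 -> T1
  ),
  undirected = list(
    required  = list(c("T2", "T3")),  # require an edge between T2 and T3
    forbidden = list(c("T1", "T3"))   # forbid an edge between T1 and T3
  )
)

# Sanity check: every constraint is a character vector of length two.
pairs <- unlist(unlist(constraints, recursive = FALSE), recursive = FALSE)
ok <- all(vapply(pairs, function(p) is.character(p) && length(p) == 2L, logical(1)))
stopifnot(ok)

# The list would then be passed on, for example (requires geneNetBP and RHugin):
# fit <- fit.gnbp(geno, pheno, constraints = constraints)
```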

Note that this function works on Hugin domains. Since Hugin domains are external pointers and cannot be saved in an R workspace, the RHugin package provides the functions read.rhd and write.rhd for loading and saving Hugin domains. See the RHugin documentation for more information.

The function fit.dbn infers a discrete bayesian network structure from categorical genotype-phenotype (QTL) data using the score-based and constraint-based algorithms of the bnlearn package. The conditional probability tables are learnt for each node in the inferred network. The phenotypes should ALL be discrete variables. Additional domain knowledge about edges should be provided as a whitelist and a blacklist. Edges between the genotypes at SNP markers are not allowed, and the genotypes are constrained to precede the phenotypes.
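
A whitelist or blacklist is a two-column data frame of edges. The base R sketch below builds small examples with placeholder node names (M1, M2, T1, T2, T3). It also shows what the restrictions on genotype edges look like when written in blacklist form; this page does not say how fit.dbn applies those restrictions internally, so that part is only an illustration of the data frame layout.

```r
# Placeholder node names; in practice use column names from geno and pheno.
geno_names  <- c("M1", "M2")
pheno_names <- c("T1", "T2", "T3")

# User knowledge: force M1 -> T1, and forbid T2 -> T1.
whitelist <- data.frame(from = "M1", to = "T1", stringsAsFactors = FALSE)
blacklist <- data.frame(from = "T2", to = "T1", stringsAsFactors = FALSE)

# Illustration only: "no edges between genotypes" in blacklist form
# (all ordered marker pairs, excluding a marker paired with itself).
gg <- expand.grid(from = geno_names, to = geno_names, stringsAsFactors = FALSE)
gg <- gg[gg$from != gg$to, ]

# Illustration only: "genotypes precede phenotypes" in blacklist form
# (every phenotype -> genotype edge).
pg <- expand.grid(from = pheno_names, to = geno_names, stringsAsFactors = FALSE)

structural <- rbind(gg, pg)
rownames(structural) <- NULL

# The user-supplied tables would be passed on, for example (requires geneNetBP):
# fit <- fit.dbn(geno, pheno, whitelist = whitelist, blacklist = blacklist)
```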

The supported algorithms from bnlearn are

  1. Score-based: Hill-Climbing (hc, default), Tabu Search (tabu)
  2. Constraint-based: Grow-Shrink (gs), Incremental Association (iamb), Fast Incremental Association (fast.iamb), Interleaved Incremental Association (inter.iamb)
  3. Hybrid: Max-Min Hill-Climbing (mmhc).

The algorithm is specified by the method argument. The structure learning functions are run with their default parameters. If different parameter values are desired, it is recommended to learn the network structure independently using the bnlearn package. The inferred structure can then be passed to fit.dbn as the graph argument, with learn set to FALSE.
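
Learning the structure separately with non-default parameters might look like the sketch below. It uses the learning.test dataset shipped with bnlearn as a stand-in for genotype-phenotype data, and the tabu list length of 20 is an arbitrary choice. Note that a structure learnt this way does not get the genotype restrictions described above unless they are supplied explicitly, for instance through the blacklist argument of the bnlearn function.

```r
# Runs only if bnlearn is installed.
if (requireNamespace("bnlearn", quietly = TRUE)) {
  # Stand-in discrete data; with real data this would be cbind(geno, pheno).
  data(learning.test, package = "bnlearn")

  # Tabu search with a non-default tabu list length.
  dag <- bnlearn::tabu(learning.test, tabu = 20)

  # arcs() returns a character matrix with columns "from" and "to";
  # convert it to the two-column data frame described for the graph argument.
  edges <- as.data.frame(bnlearn::arcs(dag), stringsAsFactors = FALSE)
  print(edges)

  # With real data the edges would then be reused (requires geneNetBP):
  # fit <- fit.dbn(geno, pheno, graph = edges, learn = FALSE)
}
```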

See Also

plot.gpfit, plot.dbnfit, absorb.gnbp. For discrete bayesian networks: fit.dbn, absorb.dbn.

Examples

## Not run: 
# ## load the mouse kidney eQTL dataset
# data(mouse)
# 
# ## get genotype and phenotype data
# mousegeno<-mouse[,1:5]
# mousepheno<-mouse[,6:19]
# 
# ## Simple example : Fit a bayesian network to genotype-phenotype data using the default values
# fit.gnbp(mousegeno,mousepheno)
# 
# ## Fit a bayesian network to genotype-phenotype data at a specified significance level and plot it
# mouse.cgbn<-fit.gnbp(mousegeno,mousepheno,alpha = 0.1)
# plot(mouse.cgbn)
# 
# ## load yeast dataset
# data(yeast)
# 
# ## get genotype and phenotype data
# yeastgeno<-yeast[,1:12]
# yeastpheno<-yeast[,13:50]
# 
# ## Simple example : Fit a discrete bayesian network to genotype-phenotype data
# fit.dbn(yeastgeno,yeastpheno)
# 
# ## Fit a discrete bayesian network by Tabu method and plot it.
# yeast.dbn.tabu<-fit.dbn(yeastgeno,yeastpheno,method="tabu")
# plot(yeast.dbn.tabu)
# ## End(Not run)