readTSVmod reads metabolic networks in text files, following a
character-separated value format. Each line should contain one entry; the
default value separator is a tab. Output files from the BiGG database are
compatible.

readTSVmod(prefix, suffix,
reactList, metList = NA, modDesc = NA,
fielddelim = "\t", entrydelim = ", ", extMetFlag = "b",
excludeComments = TRUE,
oneSubSystem = TRUE,
mergeMet = TRUE,
balanceReact = TRUE,
remUnusedMetReact = TRUE,
singletonMet = FALSE,
deadEndMet = FALSE,
remMet = FALSE,
constrMet = FALSE,
tol = SYBIL_SETTINGS("TOLERANCE"),
verboseMode = 2,
loglevel = -1,
logfile = NA,
logfileEnc = NA,
fpath = SYBIL_SETTINGS("PATH_TO_MODEL"),
def_bnd = SYBIL_SETTINGS("MAXIMUM"),
quoteChar = "",
commentChar, ...)

- prefix
{
A single character string giving the prefix for three possible input files
(see Details below).
}
- suffix
{
A single character string giving the file name extension. If missing, the
value of suffix depends on the argument fielddelim, see Details below.
Default: "tsv".
}
- reactList
{
A single character vector giving a file name containing a reaction list.
Only necessary if argument suffix is empty.
}
- metList
{
A single character vector giving a file name containing a metabolite list.
Default: NA.
}
- modDesc
{
A single character vector giving a file name containing a model description.
Default: NA.
}
- fielddelim
{
A single character string giving the value separator.
Default: "\t" (tab).
}
- entrydelim
{
A single character string giving a separator for values containing more
than one entry.
Default: ", ".
}
- extMetFlag
{
A single character string giving the identifier for metabolites which are
outside the system boundary. Only necessary if the model is a closed one.
Default: "b".
}
- excludeComments
{
A boolean value. Sometimes, the reaction abbreviations and/or the metabolite
abbreviations contain comments in square brackets. If set to TRUE, these
comments will be removed. If set to FALSE, whitespace included in comments
in metabolite abbreviations will be removed. Comments in reaction
abbreviations stay unchanged.
Default: TRUE.
}
- oneSubSystem
{
A boolean value. Ignore parameter entrydelim for the field subsystem, if
every reaction belongs to exactly one subsystem.
Default: TRUE.
}
- mergeMet
{
Boolean: if set to TRUE, metabolites used more than once as reactant or
product in a particular reaction are added up, see Details below. If set to
FALSE, the last value is used without warning.
Default: TRUE.
}
- balanceReact
{
Boolean: if set to TRUE, metabolites used as reactant and product in a
particular reaction at the same time are balanced, see Details below. If
set to FALSE, the last value is used without warning (reactants before
products).
Default: TRUE.
}
- remUnusedMetReact
{
Boolean: if set to TRUE, metabolites and reactions which are not used in the
stoichiometric matrix will be removed. A metabolite or a reaction is
considered unused if the corresponding element of rowSums (metabolites) or
colSums (reactions) of the binary version of the stoichiometric matrix is
zero, see Details below. If set to FALSE, only a warning is given.
Default: TRUE.
}
- singletonMet
{
Boolean: if set to TRUE, metabolites appearing only once in the
stoichiometric matrix are identified. A metabolite appears only once if
rowSums of the binary stoichiometric matrix is one in the corresponding
row, see Details below.
Default: FALSE.
}
- deadEndMet
{
Boolean: if set to TRUE, metabolites which are produced but not consumed, or
vice versa, are identified, see Details below. If both arguments
singletonMet and deadEndMet are set to TRUE, the function will first look
for singleton metabolites and exclude them (and the corresponding
reactions) from the search list. Afterwards, dead end metabolites are
searched only in the smaller model.
Default: FALSE.
}
- remMet
{
Boolean: if set to TRUE, metabolites identified as singleton or dead end
metabolites will be removed from the model. Reactions containing such
metabolites will also be removed.
Default: FALSE.
}
- constrMet
{
Boolean: if set to TRUE, reactions containing metabolites identified as
singleton or dead end metabolites will be constrained to zero.
Default: FALSE.
}
- tol
{
A single numeric value, giving the smallest positive floating point number
unequal to zero, see details below.
Default: SYBIL_SETTINGS("TOLERANCE").
}
- verboseMode
{
Single integer value, see sybilLog for possible values.
Default: 2.
}
- loglevel
{
Single integer value, see sybilLog for possible values.
Default: -1.
}
- logfile
{
A single character string giving the filename of the logfile. The value of
logfile gets prefixed with fpath. If logfile is set to NA and loglevel is
> -1, a default filename is used (see sybilLog for details).
Default: NA.
}
- logfileEnc
{
A single character string giving the encoding for the logfile. If set to NA,
the default encoding is used: getOption("encoding"). If logfileEnc is set
to something different, the option useFancyQuotes is turned off:
options(useFancyQuotes = FALSE).
Default: NA.
}
- fpath
{
A single character string giving the path to a certain directory containing
the model files.
Default: SYBIL_SETTINGS("PATH_TO_MODEL").
}
- def_bnd
{
A single numeric value. Absolute value for the upper and lower bounds of
reactions.
Default: SYBIL_SETTINGS("MAXIMUM").
}
- quoteChar
{
Set of quoting characters used for the argument quote in read.table, see
there for details.
Default: "" (disable quoting).
}
- commentChar
{
A single character used for the argument comment.char in read.table, see
there for details. If a comment character is needed, @ (at) seems to be a
good choice.
Default: "".
}
- ...
{
Further arguments passed to read.table, e.g. argument quote, comment.char
or argument fill, if some lines do not have enough elements. If all fields
are in double quotes, for example, set quote to "\"".
}
A metabolic model consists of three input files:
- <prefix>_react.<suffix> containing all reactions.
- <prefix>_met.<suffix> containing all metabolites.
- <prefix>_desc.<suffix> containing a model description.
All of these files must be character-separated value files (for a detailed
format description and examples, see the package vignette). The argument
prefix is the part of the filenames that all three have in common (e.g. if
they were produced by modelorg2tsv). Alternatively, the arguments
reactList, metList and modDesc can be used. A file containing all reactions
must be there, everything else is optional.
If suffix is missing, it is set according to the value of fielddelim:

  fielddelim       suffix
  "\t"             "tsv"
  ";"              "csv"
  ","              "csv"
  "|"              "dsv"
  anything else    "dsv"
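The mapping above can be sketched as a small helper. This is only an illustration: default_suffix is a hypothetical name and is not claimed to be how the package implements the rule.

```r
# Hypothetical helper (not part of the package): pick a file name
# extension from the field delimiter, following the table above.
default_suffix <- function(fielddelim) {
  if (identical(fielddelim, "\t")) {
    "tsv"
  } else if (fielddelim %in% c(";", ",")) {
    "csv"
  } else {
    "dsv"   # "|" and anything else
  }
}

default_suffix("\t")
default_suffix("|")
```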
The argument ... is passed to read.table. It may be necessary to turn off
quoting (quote = ""), e.g. if metabolite names contain quoting characters
such as "'" in 3',5'-bisphosphate nucleotidase.
The input files are read using the function read.table. The argument header
is set to TRUE and the argument sep is set to the value of fielddelim.
Everything else can be passed via the ... argument.
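To illustrate how such a file is parsed, the following sketch calls read.table directly on a tiny, made-up reaction list. The reaction ids and the equation syntax are invented for the example and are not meant to document the exact equation format the package expects.

```r
# A made-up, tab-separated reaction list held in a string instead of a file.
react_txt <- paste(
  "abbreviation\tname\tequation\treversible",
  "R1\t3',5'-bisphosphate nucleotidase\ta + b --> c\tFALSE",
  "R2\tsome other reaction\tc <==> d\tTRUE",
  sep = "\n"
)

# header = TRUE and sep = fielddelim, as described above. Quoting is
# switched off so that the apostrophes in the first name are kept literally.
react <- read.table(text = react_txt,
                    header = TRUE,
                    sep = "\t",
                    quote = "",
                    comment.char = "",
                    stringsAsFactors = FALSE)

react$name
```

With quoting left enabled for "'", the two apostrophes in the first reaction name would be interpreted as a quoted string rather than as part of the name.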
The header for the reactions list may have the following columns:

  "abbreviation"   a unique reaction id
  "name"           a reaction name
  "equation"       the reaction equation
  "reversible"     TRUE, if the reaction is reversible
  "compartment"    reaction compartment(s) (currently unused)
  "lowbnd"         lower bound
  "uppbnd"         upper bound
  "obj_coef"       objective coefficient
  "rule"           gene to reaction association
  "subsystem"      subsystem of the reaction

Every entry except for "equation" is optional.
The header for the metabolites list may have the following columns:

  "abbreviation"   a unique metabolite id
  "name"           a metabolite name
  "compartment"    metabolite compartment (currently unused)

If a metabolite list is provided, it is supposed to contain at least the
entries "abbreviation" and "name".
The header for the model description file may have the following columns:

  "name"           a name for the model
  "id"             a shorter model id
  "description"    a model description
  "compartment"    the compartments
  "abbreviation"   unique compartment abbreviations
  "Nmetabolites"   number of metabolites
  "Nreactions"     number of reactions
  "Ngenes"         number of independent genes
  "Nnnz"           number of non-zero elements in the stoichiometric matrix

If a model description file is provided, it is supposed to contain at least
the entries "name" and "id". Otherwise, the filename of the reactions list
will be used.
The compartments in which a reaction takes place are determined by the
compartment flags of the participating metabolites.
All fields in the output files of modelorg2tsv are in double quotes. In
order to read them, set argument quote to "\"".
Please read the package vignette for detailed information about input formats
and examples.
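A typical call might look like the sketch below. It assumes that the sybil package is installed and that its extdata directory ships the example files Ec_core_react.tsv, Ec_core_met.tsv and Ec_core_desc.tsv. Both are assumptions about the installation, so the call is guarded.

```r
# Only attempt the call if the sybil package is available.
sybil_available <- requireNamespace("sybil", quietly = TRUE)

if (sybil_available) {
  library(sybil)

  # Directory assumed to contain the example files Ec_core_*.tsv.
  mp <- system.file("extdata", package = "sybil")

  # The fields of the example files are assumed to be in double quotes,
  # hence the quoting character is set to "\"".
  mod <- readTSVmod(prefix = "Ec_core", fpath = mp, quoteChar = "\"")
}
```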
If a metabolite is used more than once as product or reactant of a
particular reaction, it is merged:
  a + (2) a
is converted to
  (3) a
and a warning will be given.
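The merging step can be pictured as summing coefficients per metabolite id. The following is a minimal sketch of that idea in base R, with made-up variable names. It does not claim to reproduce the package's internal code.

```r
# One side of a reaction, already split into metabolite ids and
# stoichiometric coefficients: a + (2) a + b
met  <- c("a", "a", "b")
coef <- c(1, 2, 1)

# mergeMet = TRUE: add up the coefficients of repeated metabolites.
merged <- tapply(coef, met, sum)

# mergeMet = FALSE: keep only the last coefficient given for each id.
last <- tapply(coef, met, function(x) x[length(x)])

merged[["a"]]
last[["a"]]
```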
If a metabolite is used first as reactant and then as product of a
particular reaction, the reaction is balanced:
  (2) b + a -> b + c
is converted to
  b + a -> c
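Balancing amounts to computing the net coefficient of each metabolite over both sides of the reaction. The sketch below shows this for the example above. The sign convention (reactants negative, products positive) and the variable names are assumptions made for the illustration.

```r
# (2) b + a -> b + c, with reactants negative and products positive.
reactants <- c(b = -2, a = -1)
products  <- c(b =  1, c =  1)

# Net coefficient per metabolite id over both sides of the reaction.
all_coef <- c(reactants, products)
net <- tapply(all_coef, names(all_coef), sum)

# Metabolites that cancel out completely would drop out of the reaction.
net <- net[net != 0]
net
```

The result has a and b as reactants with coefficient 1 each and c as product, which corresponds to b + a -> c.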
A binary version of the stoichiometric matrix $S$ is constructed
via $\left|S\right| > tol$.
A binary version of the stoichiometric matrix $S$ is scanned for reactions
and metabolites which are not used in $S$. If there are some, a warning will
be given and the corresponding reactions and metabolites will be removed
from the model if remUnusedMetReact is set to TRUE.
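The binary matrix and the test for unused rows and columns can be sketched in a few lines of base R. The matrix and the tolerance value below are made up for the illustration. In the package the tolerance is taken from the argument tol.

```r
tol <- 1e-6   # assumed value for the example only

# Made-up stoichiometric matrix: rows are metabolites, columns reactions.
S <- matrix(c(-1,  0, 0,
               1, -1, 0,
               0,  0, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("R1", "R2", "R3")))

# Binary version of S: TRUE wherever |S| > tol.
S_bin <- abs(S) > tol

# Unused metabolites have a row sum of zero, unused reactions a column
# sum of zero.
unused_met   <- rowSums(S_bin) == 0
unused_react <- colSums(S_bin) == 0

# What removing them would leave behind.
S_clean <- S[!unused_met, !unused_react, drop = FALSE]
```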
The binary version of the stoichiometric matrix $S$ is scanned for
metabolites which are used only once in $S$. If there are some, at least a
warning will be given. If either constrMet or remMet is set to TRUE, the
binary version of $S$ is scanned for paths of singleton metabolites. If
constrMet is set to TRUE, reactions containing those metabolites will be
constrained to zero; if remMet is set to TRUE, the metabolites and the
reactions containing those metabolites will be removed from the network.
In order to find paths of singleton metabolites, a binary version of the
stoichiometric matrix $S$ is used. The row sums give the vector of
metabolite usage; each element is the number of reactions a metabolite
participates in. A single metabolite (singleton) is a metabolite with a row
sum of one. All columns in $S$ (reactions) containing singleton metabolites
will be set to zero. Singleton metabolites are then searched for again,
until none are found.
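The iterative search can be sketched as follows, treating a metabolite as a singleton when its row sum in the binary matrix is one, as in the description of singletonMet. The function name find_singletons and the example matrix are hypothetical. This is an illustration of the idea, not the package's implementation.

```r
# Hypothetical helper: iteratively find paths of singleton metabolites.
find_singletons <- function(S, tol = 1e-6) {
  S_bin <- abs(S) > tol
  singleton_met   <- rep(FALSE, nrow(S_bin))
  singleton_react <- rep(FALSE, ncol(S_bin))
  repeat {
    # Metabolites that currently take part in exactly one reaction.
    new_met <- rowSums(S_bin) == 1
    if (!any(new_met)) break
    # Reactions containing at least one of these metabolites.
    hit <- colSums(S_bin[new_met, , drop = FALSE]) > 0
    singleton_met   <- singleton_met | new_met
    singleton_react <- singleton_react | hit
    # Set those columns to zero and search again.
    S_bin[, hit] <- FALSE
  }
  list(met = singleton_met, react = singleton_react)
}

# Made-up network: a, b and c form a path of singletons.
S <- matrix(c(-1,  0,  0,  0,  0,
               1, -1,  0,  0,  0,
               0,  1, -1,  0,  0,
               0,  0,  1, -1,  1,
               0,  0,  0,  1, -1),
            nrow = 5, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d", "e"),
                            paste0("R", 1:5)))
res <- find_singletons(S)
```

In this example only a is a singleton at first. Zeroing R1 turns b into a singleton, and zeroing R2 does the same for c, so the search flags a, b and c together with R1, R2 and R3.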
The algorithm to find dead end metabolites works in a quite similar way, but
not on the binary version of the stoichiometric matrix. Here, metabolite i
is considered a dead end if it is, for example, produced by reaction j but
not used by any other reaction k.
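A single pass of the dead end test can be sketched on the signed matrix as shown below. The sketch assumes that negative entries mean consumption and positive entries mean production, and it ignores reaction reversibility. Both are simplifications made for the example, and find_dead_ends is a hypothetical name.

```r
# Hypothetical helper: one pass of a dead end test on the signed matrix.
find_dead_ends <- function(S, tol = 1e-6) {
  produced <- rowSums(S >  tol) > 0   # appears as product somewhere
  consumed <- rowSums(S < -tol) > 0   # appears as reactant somewhere
  # Dead end: produced but never consumed, or consumed but never produced.
  xor(produced, consumed)
}

# Made-up network: a -> b -> c
S <- matrix(c(-1,  0,
               1, -1,
               0,  1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("R1", "R2")))
dead <- find_dead_ends(S)
```

Here a is only consumed and c is only produced, so both are flagged, while b is not.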
An instance of class modelorg.
The BiGG database, http://bigg.ucsd.edu/.
Schellenberger, J., Park, J. O., Conrad, T. C., and Palsson, B. Ø. (2010)
BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic
reconstructions. BMC Bioinformatics 11, 213.
Becker, S. A., Feist, A. M., Mo, M. L., Hannum, G., Palsson, B. Ø. and
Herrgard, M. J. (2007) Quantitative prediction of cellular metabolism with
constraint-based models: the COBRA Toolbox. Nat Protoc 2, 727-738.
Schellenberger, J., Que, R., Fleming, R. M. T., Thiele, I., Orth, J. D.,
Feist, A. M., Zielinski, D. C., Bordbar, A., Lewis, N. E., Rahmanian, S.,
Kang, J., Hyduke, D. R. and Palsson, B. Ø. (2011) Quantitative prediction of
cellular metabolism with constraint-based models: the COBRA Toolbox v2.0.
Nat Protoc 6, 1290-1307.
read.table, modelorg2tsv, modelorg, sybilLog.