Extract one or more sets of features from traits observed over time, the result being traits that have a single value for each individual. The sets of features are:
single times -- the value for each individual for a single time.
(uses getTimesSubset)
growth rates for a time interval -- the average growth rate (AGR
and/or RGR) over a time interval for each individual.
(uses byIndv4Intvl_GRsDiff or byIndv4Intvl_GRsAvg)
water use traits for a time interval -- the total water use (WU),
the water use rate (WUR) and the water use index (WUI) over a time
interval for each individual. (uses byIndv4Intvl_WaterUse)
growth rates for the imaging period overall -- the average growth rate (AGR
and/or RGR) over the whole imaging period for each individual.
(uses byIndv4Intvl_GRsDiff or byIndv4Intvl_GRsAvg)
water use traits for the imaging period overall -- the total water use (WU),
the water use rate (WUR) and the water use index (WUI) for the whole imaging
period for each individual. (uses byIndv4Intvl_WaterUse)
totals for the imaging period overall -- the total over the whole imaging
period of a trait for each individual. (uses byIndv4Intvl_ValueCalc)
maximum for the imaging period overall -- the maximum value over the whole
imaging period, and the time at which it occurred, for each individual.
(uses byIndv4Intvl_ValueCalc)
The Tomato vignette illustrates the use of traitSmooth and
traitExtractFeatures to carry out the SET procedure for the example
presented in Brien et al. (2020).
Use vignette("Tomato", package = "growthPheno") to access it.
traitExtractFeatures(data, individuals = "Snapshot.ID.Tag", times = "DAP",
starts.intvl = NULL, stops.intvl = NULL,
suffices.intvl = NULL,
responses4intvl.rates = NULL,
growth.rates = NULL,
growth.rates.method = "differences",
suffices.growth.rates = NULL,
water.use4intvl.traits = NULL,
responses4water = NULL,
water.trait.types = c("WU", "WUR", "WUI"),
suffix.water.rate = "R", suffix.water.index = "I",
responses4singletimes = NULL, times.single = NULL,
responses4overall.rates = NULL,
water.use4overall.water = NULL,
responses4overall.water = NULL,
responses4overall.totals = NULL,
responses4overall.max = NULL,
intvl.overall = NULL, suffix.overall = NULL,
sep.times.intvl = "to", sep.suffix.times = ".",
sep.growth.rates = ".", sep.water.traits = "",
mergedata = NULL, ...)
A data.frame that contains an individuals column and a
column for each extracted trait, in addition to any columns in mergedata.
The number of rows in the data.frame will equal the number of
unique elements of the individuals column in data, except when
there are extra values in the individuals column in mergedata. If the
latter applies, then the number of rows will equal the number of unique
values in the combined individuals columns from mergedata and
data.
The names of the columns produced by the function are constructed as follows:
single times -- A name for a single-time trait is formed by appending
a full stop to an element of responses4singletimes, followed by the value of
times at which the values were observed.
growth rates for a time interval -- The name for an interval growth
rate is constructed by concatenating the relevant element of responses4intvl.rates,
growth.rates and a suffix for the time interval, each separated by a full
stop. The interval suffix is formed by joining its starts.intvl and
stops.intvl values, separating them by the value of sep.times.intvl.
growth rates for the whole imaging period -- The name for an overall growth
rate is constructed by concatenating the relevant element of responses4overall.rates,
growth.rates and suffix.overall, each separated by a full
stop.
water use traits for a time interval -- Construction of the names for
the three water traits begins with the value of water.use4intvl.traits. The rate (WUR)
has either R or the value of suffix.water.rate added to the value of
water.use4intvl.traits. Similarly the index (WUI) has either I or the value of
suffix.water.index added to it. The WUI also has the element of
responses4water used in calculating the WUI prefixed to its name.
All three water use traits have a suffix for the interval appended to their names.
This suffix is constructed by joining its starts.intvl and stops.intvl,
separated by the value of sep.times.intvl.
water use traits for the whole imaging period -- Construction of the names for
the three water traits begins with the value of water.use4intvl.traits. The rate (WUR)
has either R or the value of suffix.water.rate added to the value of
water.use4intvl.traits. Similarly the index (WUI) has either I or the value of
suffix.water.index added to it. The WUI also has the element of
responses4water used in calculating the WUI prefixed to its name.
All three water use traits have suffix.overall appended to their names.
totals for the whole imaging period -- The name for a whole-of-imaging
total is formed by combining an element of responses4overall.totals with suffix.overall,
separating them by a full stop.
maximum for the whole imaging period -- The name of the column with the
maximum values will be the result of concatenating the responses4overall.max, "max"
and suffix.overall, each separated by a full stop. The name of the column with
the value of times at which the maximum occurred will be the result of
concatenating the responses4overall.max, "max" and the value of times,
each separated by a full stop.
The data.frame is returned invisibly.
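The naming rules above can be sketched with base R's paste. This is only an illustration under the default separators: the response name sPSA, the water use name sWU and the interval end points are hypothetical inputs, and the code mimics the described rules rather than calling traitExtractFeatures itself.

```r
# Hypothetical inputs (not taken from any data set)
response   <- "sPSA"
rates      <- c("AGR", "RGR")
water.use  <- "sWU"
starts     <- c(18, 22)
stops      <- c(22, 27)

# Default separators and suffixes from the usage
sep.times.intvl    <- "to"
sep.suffix.times   <- "."
sep.growth.rates   <- "."
sep.water.traits   <- ""
suffix.water.rate  <- "R"
suffix.water.index <- "I"

# Interval suffices: start and stop joined by sep.times.intvl
intvl.suffices <- paste(starts, stops, sep = sep.times.intvl)

# Single-time traits: response, full stop, time
single.names <- paste(response, unique(c(starts, stops)),
                      sep = sep.suffix.times)

# Interval growth rates: response, rate and interval suffix
rate.names <- as.vector(outer(paste(response, rates, sep = sep.growth.rates),
                              intvl.suffices, paste, sep = sep.suffix.times))

# Water use traits: WU, WUR and WUI, each with the interval suffix
wu.names  <- paste(water.use, intvl.suffices, sep = sep.suffix.times)
wur.names <- paste(paste0(water.use, sep.water.traits, suffix.water.rate),
                   intvl.suffices, sep = sep.suffix.times)
wui.names <- paste(paste0(paste(response, water.use, sep = sep.growth.rates),
                          sep.water.traits, suffix.water.index),
                   intvl.suffices, sep = sep.suffix.times)

print(c(single.names, rate.names, wu.names, wur.names, wui.names))
```

Changing a separator or suffix argument changes only the corresponding paste step, which is a convenient way to predict column names before running the extraction.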
A data.frame containing the columns specified by
individuals, times, the various responses
arguments and the water.use4intvl.traits argument.
A character giving the name of the
factor that defines the subsets of the data
for which each subset corresponds to the response values for
an individual (e.g. plant, pot, cart, plot or unit).
A character giving the name of the column in
data containing the times at which the data was
collected, either as a numeric, factor, or
character. It will be used in identifying the intervals and,
if a factor or character, the values should
be numerics stored as characters.
A numeric giving the times, in terms of values in
times, that are the initial times for a set of intervals for which
growth.rates and water.use traits are to be obtained.
These times may also be used to obtain values for single-time traits
(see responses4singletimes).
A numeric giving the times, in terms of values in
times, that are the end times for a set of intervals for which
growth.rates and water.use traits are to be obtained.
These times may also be used to obtain values for single-time traits
(see responses4singletimes).
A character giving the suffices for intervals
specified using starts.intvl and stops.intvl. If NULL,
the suffices are automatically generated using starts.intvl,
stops.intvl and sep.times.intvl.
A character specifying the names of the columns
containing responses for which growth rates are to be obtained for
the intervals specified by starts.intvl and stops.intvl.
For growth.rates.method set to differences, the growth rates will
be computed from the column of the response values whose name is
listed in responses4intvl.rates. For growth.rates.method set to
ratesaverages, the growth rates will be computed from a column with
the growth rates computed for each time. The name of that column should be
a response listed in responses4intvl.rates to which is appended an
element of suffices.growth.rates.
A character specifying the method to use
in calculating the growth rates over an interval for the responses
specified by responses4intvl.rates.
The two possibilities are "differences" and "ratesaverages".
For differences, the growth rate for an interval is computed
by taking differences between the values of a response for pairs
of times. For ratesaverages, the growth rate for an interval
is computed by taking weighted averages of growth rates for times within
the interval. That is, differences operates on the response and
ratesaverages operates on the growth rates previously calculated
from the response, so that the appropriate one of these must be in
data. The ratesaverages option is most appropriate when the
growth rates are calculated using the derivatives of a fitted curve.
Note that, for responses for which the AGR has been calculated using
differences, both methods will give the same result, but the
differences option will be more efficient than ratesaverages.
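The equivalence noted above for the AGR can be checked with a small base R sketch. The times and response values are made up, and the weights used in the average (the lengths of the time steps) are an assumption made for illustration; the weighting actually used is described in byIndv4Intvl_GRsAvg.

```r
# Hypothetical observations for one individual
times <- c(18, 22, 27, 33)
resp  <- c(10, 18, 33, 57)

# "differences"-style AGR over the whole interval 18 to 33:
# change in the response divided by the length of the interval
agr.diff <- (resp[length(resp)] - resp[1]) / (times[length(times)] - times[1])

# AGRs previously calculated by differencing consecutive times
dt    <- diff(times)
agr.t <- diff(resp) / dt

# "ratesaverages"-style AGR: weighted average of the per-time AGRs,
# with weights assumed to be the lengths of the time steps
agr.avg <- sum(agr.t * dt) / sum(dt)

print(c(differences = agr.diff, ratesaverages = agr.avg))
```

When the per-time rates come from derivatives of a fitted curve instead of differences, the two quantities need not agree, which is why the choice of method matters in that case.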
A character specifying which growth rates are
to be obtained for the intervals specified by starts.intvl and
stops.intvl. It should contain one or both of "AGR" and
"RGR".
A character giving the suffices appended to
responses4intvl.rates in constructing the column names for
storing the growth rates specified by growth.rates. If
suffices.growth.rates is NULL, then "AGR" and
"RGR" will be used.
A character giving the names of the columns in
data that contain the water use values that are to be used
in computing the water use traits (WU, WUR, WUI) for the intervals
specified by starts.intvl and stops.intvl. If there is
only one column name, then the WUI will be calculated using
this name for all column names in responses4water. If there
are several column names in water.use4intvl.traits, then there must be
either one or the same number of names in responses4water.
If both have the same number of names, then the two lists of column names
will be processed in parallel, so that a single WUI will be
produced for each pair of responses4water and water.use4intvl.traits
values.
A character giving the names of the columns
in data that are to provide the numerator in calculating a
WUI for the intervals specified using starts.intvl and
stops.intvl. The denominator will be the values in the columns
in data whose names are those given by water.use4intvl.traits.
If there is only one column name in responses4water, then the
WUI will be calculated using this name for all column names in
water.use4intvl.traits. If there are several column names in
responses4water, then there must be either one or the same
number of names in water.use4intvl.traits. If both have the same number of
names, then the two lists of column names will be
processed in parallel, so that a single WUI will be produced
for each pair of responses4water and water.use4intvl.traits values.
See the Value section for a description of how
responses4water is incorporated into the names constructed for
the water use traits.
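As a rough illustration of how the three water use traits relate for a single interval, the base R sketch below uses made-up daily values. The exact definitions are given in the Details of byIndv4Intvl_WaterUse; here it is assumed that WU sums the water used after the start time up to and including the stop time, that WUR is WU divided by the interval length, and that WUI is the change in the response (numerator) divided by WU (denominator).

```r
# Hypothetical daily records for one individual
DAP  <- 18:22
sWU  <- c(20, 22, 25, 27, 30)       # water use recorded at each time
sPSA <- c(100, 112, 126, 141, 158)  # response used for the WUI numerator

start <- 18
stop  <- 22

# Assumption: water use at the start time is excluded from the total
in.intvl <- DAP > start & DAP <= stop

WU  <- sum(sWU[in.intvl])                                  # total water use
WUR <- WU / (stop - start)                                 # water use rate
WUI <- (sPSA[DAP == stop] - sPSA[DAP == start]) / WU       # water use index

print(c(WU = WU, WUR = WUR, WUI = WUI))
```

With one water use column and several responses, the same WU would be reused as the denominator for each response, in line with the pairing rules described above.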
A character listing the trait types to compute
and return. It should be some combination of WU, WUR
and WUI. See Details in byIndv4Intvl_WaterUse for
how each is calculated.
A character giving the label to be appended
to the value of water.use4intvl.traits to form the name of the WUR.
A character giving the label to be appended
to the value of water.use4intvl.traits to form the name of the WUI.
A character specifying the names of the
columns containing responses for which a column of the values is
to be formed for each response for each of the times values specified in
times.single. If times.single is NULL, then the
unique values in the combined starts.intvl and stops.intvl
will be used.
A numeric giving the times of imaging, for each of
which, the values of each responses4singletimes will be stored in
a column of the resulting data.frame. If NULL, then
the unique values in the combined starts.intvl and stops.intvl
will be used.
A character specifying the names of the
columns containing responses for which growth rates are to be obtained
for the whole imaging period i.e. the interval specified by
intvl.overall. The settings of growth.rates.method,
growth.rates, suffices.growth.rates,
sep.growth.rates, suffix.overall and intvl.overall
will be used in producing the growth rates. See responses4intvl.rates
for more information about how these arguments are used.
A logical indicating whether the overall
water.traits are to be obtained. The settings of water.trait.types,
suffix.water.rate, suffix.water.index, sep.water.traits,
suffix.overall and intvl.overall will be used in producing the
overall water traits. See water.use4intvl.traits for more information about
how these arguments are used.
A character giving the names of the columns
in data that are to provide the numerator in calculating a
WUI for the interval corresponding to the whole imaging period.
See responses4water
for more information about how this argument is processed.
A character specifying the names of the
columns containing responses for which a column of the values is
to be formed by summing the response for each individual over the
whole of the imaging period.
A character specifying the names of the
columns containing responses for which columns of the values are
to be formed for the maximum of the response for each
individual over the whole of the imaging period and the times value
at which the maximum occurred.
A numeric giving the start and stop times of imaging.
If NULL, the start time will be the minimum of starts.intvl
and the stop time will be the maximum of stops.intvl.
A character giving the suffix to be appended to
the names of traits that apply to the whole imaging period. It applies to
responses4overall.rates, water.use4overall.water,
responses4overall.water and responses4overall.totals.
If NULL, then nothing will be added.
A character giving the separator to use in
combining a starts.intvl with a stops.intvl in constructing
the suffix to be appended to an interval trait. If set to NULL and
there is only one value for each of starts.intvl and
stops.intvl, then no suffix will be added; otherwise
sep.times.intvl set to NULL will result in an error.
A character giving the separator to use in
appending a suffix for times to a trait. For no separator, set to
"".
A character giving the character(s) to be
used to separate the suffices.growth.rates value from the
responses4intvl.rates values in constructing the name for a
new rate. It is also used for separating responses4water
values from the suffix.water.index. For no separator, set to
"".
A character giving the character(s) to be
used to separate the suffix.water.rate and suffix.water.index values
from the response value in constructing the name for a new
rate/index. The default of "" results in no separator.
A data.frame containing a column with the name given
in individuals and for which there is only one row for each value
given in this column. In general, it will be that the number of rows in
mergedata is equal to the number of unique values in the column in
data labelled by the value of individuals, but this is not
mandatory. If mergedata is not NULL, the values extracted by
traitExtractFeatures will be merged with it.
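The effect of mergedata on the number of rows can be pictured with base R's merge. This is a stand-in for illustration only, not necessarily how traitExtractFeatures combines the data internally; the data frames and the trait column sPSA.max are hypothetical.

```r
# Hypothetical mergedata: one row per individual, including an
# individual "C" for which no traits were extracted
indv.dat <- data.frame(Snapshot.ID.Tag = c("A", "B", "C"),
                       Block = c(1, 1, 2))

# Hypothetical extracted traits, available only for "A" and "B"
traits <- data.frame(Snapshot.ID.Tag = c("A", "B"),
                     sPSA.max = c(5.1, 6.3))

# Keep every individual found in either data frame
merged <- merge(indv.dat, traits, by = "Snapshot.ID.Tag", all = TRUE)

print(merged)
```

The result has one row for each unique individual across both inputs, with missing values for traits of individuals that appear only in the mergedata stand-in.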
allows passing of arguments to other functions; not used at present.
Chris Brien
Brien, C., Jewell, N., Garnett, T., Watts-Williams, S. J., & Berger, B. (2020). Smoothing and extraction of traits in the growth analysis of noninvasive phenotypic data. Plant Methods, 16, 36. doi:10.1186/s13007-020-00577-6.
getTimesSubset, byIndv4Intvl_GRsAvg,
byIndv4Intvl_GRsDiff, byIndv4Intvl_WaterUse,
byIndv_ValueCalc.
#Load data
data(tomato.dat)
#Define DAP constants
DAP.endpts <- c(18,22,27,33,39,43,51)
nDAP.endpts <- length(DAP.endpts)
DAP.starts <- DAP.endpts[-nDAP.endpts]
DAP.stops <- DAP.endpts[-1]
DAP.segs <- list(c(DAP.endpts[1]-1, 39),
c(40, DAP.endpts[nDAP.endpts]))
#Add PSA rates and smooth PSA, also producing sPSA rates
tom.dat <- byIndv4Times_SplinesGRs(data = tomato.dat,
response = "PSA", response.smoothed = "sPSA",
times = "DAP", rates.method = "differences",
smoothing.method = "log",
spline.type = "PS", lambda = 1,
smoothing.segments = DAP.segs)
#Smooth WU
tom.dat <- byIndv4Times_SplinesGRs(data = tom.dat,
response = "WU", response.smoothed = "sWU",
rates.method = "none",
times = "DAP",
smoothing.method = "direct",
spline.type = "PS", lambda = 10^(-0.5),
smoothing.segments = DAP.segs)
#Extract single-valued traits for each individual
indv.cols <- c("Snapshot.ID.Tag", "Lane", "Position", "Block", "Cart", "AMF", "Zn")
indv.dat <- subset(tom.dat, subset = DAP == DAP.endpts[1],
select = indv.cols)
indv.dat <- traitExtractFeatures(data = tom.dat,
starts.intvl = DAP.starts, stops.intvl = DAP.stops,
responses4singletimes = "sPSA",
responses4intvl.rates = "sPSA",
growth.rates = c("AGR", "RGR"),
water.use4intvl.traits = "sWU",
responses4water = "sPSA",
responses4overall.totals = "sWU",
responses4overall.max = "sPSA.AGR",
mergedata = indv.dat)