The function generates trajectories of probabilistic population projection for all countries for which input data is available, or any subset of them.
pop.predict(end.year = 2100, start.year = 1950, present.year = 2015,
    wpp.year = 2017, countries = NULL,
    output.dir = file.path(getwd(), "bayesPop.output"),
    inputs = list(popM=NULL, popF=NULL, mxM=NULL, mxF=NULL, srb=NULL,
        pasfr=NULL, mig.type=NULL, migM=NULL, migF=NULL,
        e0F.file=NULL, e0M.file=NULL, tfr.file=NULL,
        e0F.sim.dir=NULL, e0M.sim.dir=NULL, tfr.sim.dir=NULL,
        migMtraj = NULL, migFtraj = NULL),
    nr.traj = 1000, keep.vital.events = FALSE,
    fixed.mx = FALSE, fixed.pasfr = FALSE,
    my.locations.file = NULL, replace.output = FALSE, verbose = TRUE)
End year of the projection.
First year of the historical data.
Year for which initial population data is to be used.
Year for which WPP data is used. The function loads a package called wpp\(x\), where \(x\) is the wpp.year, and uses its datasets as defaults if the corresponding inputs element is missing (see below).
Array of country codes or country names for which a projection is generated. If it is NULL, all available countries are used. If it is NA and there is an existing projection in output.dir and replace.output=FALSE, then a projection is performed for all countries that are not included in the existing projection. Names of countries are matched to those in the UNlocations dataset (or in the dataset loaded from my.locations.file if used).
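As a rough illustration of how a mixed vector of codes and names could be resolved to country codes, the sketch below matches names against a small locations table. The helper resolve.countries and the toy table are hypothetical and written only for this example; they are not part of bayesPop, and the column names country_code and name are assumed to mirror those of UNlocations.

```r
# Toy stand-in for the UNlocations dataset (column names are an assumption).
locations <- data.frame(
  country_code = c(528, 218, 450),
  name = c("Netherlands", "Ecuador", "Madagascar"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: entries that parse as numbers are taken as codes,
# everything else is looked up by name.
resolve.countries <- function(countries, locations) {
  countries <- as.character(countries)
  as.code <- suppressWarnings(as.numeric(countries))
  by.name <- locations$country_code[match(countries, locations$name)]
  codes <- ifelse(is.na(as.code), by.name, as.code)
  if (any(is.na(codes)))
    stop("Unknown country: ", paste(countries[is.na(codes)], collapse = ", "))
  codes
}

# c() coerces the mixed input to character, as in a call to pop.predict
codes <- resolve.countries(c("Netherlands", 218, "Madagascar"), locations)
print(codes)
```

An unknown name stops with an error here; how the package itself reports unmatched names is not covered by this sketch.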
Output directory of the projection. If there is an existing projection in output.dir and replace.output=TRUE, everything in the directory will be deleted.
A list of file names where input data is stored. It contains the following elements (unless otherwise noted, these are tab-delimited ASCII files; names of the default datasets from the corresponding wpp package, which are used if the corresponding element is NULL, are shown in brackets):
Initial male/female age-specific population (at time present.year) [popM, popF].
Historical data and (optionally) projections of male/female age-specific death rates [mxM, mxF] (see also argument fixed.mx).
Projection of sex ratio at birth [sexRatio].
Historical data and (optionally) projections of percentage age-specific fertility rate [percentASFR] (see also argument fixed.pasfr).
Migration type and base year of the migration. In addition, this dataset gives information on country specifics regarding mortality and fertility age patterns as defined in [vwBaseYear]. patterns and mig.type have the same meaning and can be used interchangeably.
Projection of male/female age-specific migration as net counts on the same scale as the initial population [migrationM, migrationF]. If not available, the migration schedules are reconstructed from total migration counts derived from migration using the age.specific.migration function.
Comma-delimited CSV file with results of female life expectancy (generated using bayesLife, function convert.e0.trajectories, file “ascii_trajectories.csv”). Required columns are “LocID”, “Year”, “Trajectory”, and “e0”. If this element is not NULL, the argument e0F.sim.dir is ignored. If both e0F.file and e0F.sim.dir are NULL, data from the corresponding wpp package is taken, namely the median projections as one trajectory and the low and high variants (if available) as second and third trajectory.
Comma-delimited CSV file containing results of male life expectancy (generated using bayesLife, function convert.e0.trajectories, file “ascii_trajectories.csv”). Required columns are “LocID”, “Year”, “Trajectory”, and “e0”. If this element is not NULL, the argument e0M.sim.dir is ignored. As in the female case, if both e0M.file and e0M.sim.dir are NULL, data from the corresponding wpp package is taken.
Comma-delimited CSV file with results of total fertility rate (generated using bayesTFR, function convert.tfr.trajectories, file “ascii_trajectories.csv”). Required columns are “LocID”, “Year”, “Trajectory”, and “TF”. If this element is not NULL, the argument tfr.sim.dir is ignored. If both tfr.file and tfr.sim.dir are NULL, data from the corresponding wpp package is taken (median and the low and high variants as three trajectories). Alternatively, this argument can be the keyword “median_” in which case only the wpp median is taken.
Simulation directory with results of female life expectancy (generated using bayesLife). It is only used if e0F.file is NULL.
Simulation directory with results of male life expectancy (generated using bayesLife). Alternatively, it can be the string “joint_”, in which case it is assumed that the male life expectancy was projected jointly from the female life expectancy (see joint.male.predict) and is thus contained in the e0F.sim.dir directory. The argument is only used if e0M.file is NULL.
Simulation directory with results of total fertility rate (generated using bayesTFR). It is only used if tfr.file is NULL.
Comma-delimited CSV file with male/female age-specific migration trajectories. If present, it replaces deterministic projections given by the migM and migF items. Its format is similar to that of, e.g., e0M.file, with columns “LocID”, “Year”, “Trajectory”, “Age” and “Migration”. The “Age” column must have values “0-4”, “5-9”, “10-14”, …, “95-99”, “100+”.
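To make the trajectory file layouts above concrete, the following sketch builds a migration trajectory table and a TFR trajectory table with the required columns and writes the former to a CSV file. All numbers are invented placeholders, and the Year values in particular are an assumption; files produced by bayesTFR or bayesLife should be consulted for the exact year convention.

```r
# Age labels required in the "Age" column of migration trajectory files
ages <- c(paste(seq(0, 95, by = 5), seq(4, 99, by = 5), sep = "-"), "100+")

# Placeholder design: one location, two years, three trajectories
grid <- expand.grid(Age = ages, Trajectory = 1:3, Year = c(2020, 2025),
                    LocID = 528, stringsAsFactors = FALSE)
set.seed(1)
mig.traj <- data.frame(
  LocID = grid$LocID,
  Year = grid$Year,
  Trajectory = grid$Trajectory,
  Age = grid$Age,
  Migration = round(rnorm(nrow(grid)), 3)  # invented net migration counts
)

# A TFR trajectory table has no Age column and stores its values in "TF"
tfr.grid <- expand.grid(Trajectory = 1:3, Year = c(2020, 2025), LocID = 528)
tfr.traj <- data.frame(
  LocID = tfr.grid$LocID,
  Year = tfr.grid$Year,
  Trajectory = tfr.grid$Trajectory,
  TF = round(runif(nrow(tfr.grid), 1.5, 2), 3)  # invented TFR values
)

# Write the migration table as a comma-delimited file
mig.file <- tempfile(fileext = ".csv")
write.csv(mig.traj, mig.file, row.names = FALSE)
```

A file written this way could then be referenced through the migMtraj or migFtraj element of inputs.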
Number of trajectories to be generated. If this number is smaller than the number of available trajectories of the probabilistic components (TFR, life expectancy and migration), the trajectories are equidistantly thinned.
If all of those components contain fewer trajectories than nr.traj, the value is adjusted to the maximum number of available trajectories among the components. For components that have fewer trajectories than the adjusted number, the available trajectories are re-sampled, so that all components have the same number of trajectories.
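The thinning and re-sampling rule can be sketched as follows. This is an illustration of the described behaviour, not the package's internal code; the helper select.trajectories and the component counts are hypothetical.

```r
# Hypothetical helper: pick indices of trajectories to use for one component
select.trajectories <- function(n.available, nr.traj) {
  if (n.available >= nr.traj) {
    # enough trajectories: thin them at equal distances
    return(round(seq(1, n.available, length.out = nr.traj)))
  }
  # too few: keep all of them and re-sample the remainder with replacement
  c(seq_len(n.available),
    sample(n.available, nr.traj - n.available, replace = TRUE))
}

# Invented numbers of available trajectories per component
available <- c(tfr = 1000, e0 = 600, mig = 50)

nr.traj <- 2000
# No component has 2000 trajectories, so the value is lowered to the maximum
nr.traj <- min(nr.traj, max(available))

set.seed(42)
idx <- lapply(available, select.trajectories, nr.traj = nr.traj)
print(lengths(idx))
```

After this step every component contributes the same number of trajectory indices, so they can be paired one to one.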
Logical. If TRUE, age- and sex-specific vital events of births and deaths as well as other objects are stored in the prediction object, see Details.
Logical. If TRUE, it is assumed that the datasets of death rates (mxM and mxF) include data for projection years, and these are then used instead of the life expectancy.
Logical. If TRUE, it is assumed that the dataset on percent age-specific fertility rate (percentASFR) includes data for projection years, and these are then used instead of computing it on the fly.
Name of a tab-delimited ASCII file with a set of all locations for which a projection is generated. Use this argument if you are projecting for a country/region that is not included in the standard UNlocations dataset. It must have the same structure.
Logical. If TRUE, everything in the directory output.dir is deleted prior to the prediction.
Logical controlling the amount of output messages.
Object of class bayesPop.prediction with the following elements:
Full path to the base directory output.dir.
Sub-directory relative to base.directory with the projections.
The actual number of trajectories of the projections.
Three-dimensional array of projection quantiles (countries x number of quantiles x projection periods). The second dimension corresponds to the following quantiles: \(0.025,0.05,0.1,0.25,0.5,0.75,0.9,0.95,0.975\).
Three-dimensional array of projection mean and standard deviation (countries x 2 x projection periods). The first and second entries of the second dimension are the mean and the standard deviation, respectively.
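The layout of these two arrays can be illustrated with simulated data. The sketch below does not call bayesPop; the trajectories are random numbers, and the dimension names (country codes, quantile values, period labels) are an assumption made for readability.

```r
set.seed(123)
countries <- c("528", "218")
periods <- c("2020", "2025", "2030")
probs <- c(0.025, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.975)
n.traj <- 200

# Simulated total population: countries x periods x trajectories
traj <- array(rlnorm(length(countries) * length(periods) * n.traj, meanlog = 9),
              dim = c(length(countries), length(periods), n.traj),
              dimnames = list(countries, periods, NULL))

# apply() puts the quantiles first; aperm() reorders the result to
# countries x quantiles x periods
quantiles <- aperm(apply(traj, c(1, 2), quantile, probs = probs), c(2, 1, 3))
dimnames(quantiles)[[2]] <- as.character(probs)

# countries x 2 x periods, with mean and standard deviation in dimension 2
traj.mean.sd <- aperm(
  apply(traj, c(1, 2), function(x) c(mean = mean(x), sd = sd(x))),
  c(2, 1, 3))

# Median and 95% interval bounds for one country across all periods
print(quantiles["528", c("0.025", "0.5", "0.975"), ])
```

Indexing by position instead of by name works the same way, for example quantiles[1, 5, ] for the median of the first country.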
Quantiles of male and female projection, respectively. Same structure as quantiles.
Same as traj.mean.sd corresponding to male and female projection, respectively.
Four-dimensional array of age-specific quantiles of male and female projection, respectively (countries x age groups x number of quantiles x projection periods). The same quantiles are used as in quantiles.
Array of age-specific quantiles of male and female projection, respectively, divided by the total population. The dimensions are the same as in quantilesMage.
Vector of time for which historical data was used in the projections.
Vector of projection time periods starting with the present period.
The wpp year used.
List of input data used for the projection.
Content of the inputs argument passed to the function.
Matrix of countries for which projection exists. It contains two columns: code, name.
Vector of age groups.
This component is added by get.pop.prediction and modified and used by pop.map and write.pop.projection.summary. It is an environment for caching and re-using results of expressions.
Logical determining if cache should be modified.
Logical determining if this object is a result of pop.predict or pop.aggregate.
The population projection is computed using the cohort component method and is based on an algorithm used by the United Nations Population Division (see also Sevcikova et al. (2015) in the References below). For each country, one projection is calculated for each trajectory of male and female life expectancy, TFR and possibly migration. This results in a set of population projection trajectories that forms its posterior distribution. The trajectories of life expectancy and TFR can be given either in their binary form generated by the packages bayesLife and bayesTFR, respectively (as directories e0M.sim.dir, e0F.sim.dir, tfr.sim.dir of the inputs argument), or as ASCII tables in CSV format, see above. The numbers of trajectories for male and female life expectancy must match, as must those for male and female migration.
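The idea of running one cohort component projection per trajectory can be sketched with a heavily simplified example. It is not the UN algorithm or the package's implementation: it uses a single sex, three age groups, and takes surviving births as given instead of deriving them from TFR and age-specific fertility. The function project.step and all numbers are hypothetical.

```r
# One simplified projection step.
# pop:    population by age group (last group is open-ended)
# surv:   survival ratios; surv[a] moves group a into group a + 1,
#         and surv[A] keeps survivors of the open group in that group
# births: births surviving to the end of the period
# mig:    net migrants by age group
project.step <- function(pop, surv, births, mig) {
  A <- length(pop)
  new.pop <- numeric(A)
  new.pop[1] <- births
  new.pop[2:A] <- pop[1:(A - 1)] * surv[1:(A - 1)]
  new.pop[A] <- new.pop[A] + pop[A] * surv[A]
  new.pop + mig
}

pop0 <- c(100, 80, 60)
mig <- c(1, 0, -1)

# Each trajectory pairs one set of survival ratios with one birth count
surv.traj <- list(c(0.9, 0.8, 0.5), c(0.95, 0.85, 0.6))
births.traj <- c(50, 40)

# One projection per trajectory: age groups in rows, trajectories in columns
pop.traj <- mapply(project.step, surv = surv.traj, births = births.traj,
                   MoreArgs = list(pop = pop0, mig = mig))
print(pop.traj)
```

Summarising the columns of such a matrix by quantiles gives the kind of distribution that the prediction object stores.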
The projection is generated sequentially country by country. Results are stored in a sub-directory of output.dir called prediction. There is one binary file per country, called totpop_country\(x\).rda, where \(x\) is the country code. It contains six objects: totp, totpf, totpm (trajectories of total population, age-specific female and age-specific male, respectively), totp.hch, totpf.hch, totpm.hch (the UN half-child variant for total population, age-specific female and age-specific male, respectively). Optionally, if keep.vital.events is TRUE, there is an additional file per country, called vital_events_country\(x\).rda, containing the following objects: btm, btf (trajectories for births by age of mothers for male and female child, respectively), deathsm, deathsf (trajectories for age-specific male and female deaths, respectively), asfert (trajectories of age-specific fertility), mxm, mxf (trajectories of male and female age-specific mortality rates), migm, migf (if used, these are trajectories of male and female age-specific migration), btm.hch, btf.hch, deathsm.hch, deathsf.hch, asfert.hch, mxm.hch, mxf.hch (the UN half-child variant for age- and sex-specific births, deaths, fertility rates and mortality rates). An object of class bayesPop.prediction is stored in the same directory in a file prediction.rda. It is updated every time a country projection is finished.
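Given the file naming described above, the per-country files can be located and inspected with base R alone. The helper country.files is hypothetical and only assembles paths; the load step runs only if a projection has actually been written to that directory.

```r
# Hypothetical helper that builds the per-country file paths
country.files <- function(output.dir, code) {
  pred.dir <- file.path(output.dir, "prediction")
  c(totpop = file.path(pred.dir, paste0("totpop_country", code, ".rda")),
    vital.events = file.path(pred.dir,
                             paste0("vital_events_country", code, ".rda")))
}

# Default output directory of pop.predict, country code 528 (Netherlands)
files <- country.files(file.path(getwd(), "bayesPop.output"), 528)
print(files)

# Load the trajectories into a separate environment to avoid
# overwriting objects in the workspace
env <- new.env()
if (file.exists(files[["totpop"]])) {
  load(files[["totpop"]], envir = env)
  print(ls(env))  # names of the objects stored in the file
}
```

For regular use, the accessor functions of the package are the safer route, since the file layout is an implementation detail.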
See pop.trajectories for extracting trajectories.
To access a previously stored prediction object, use get.pop.prediction.
H. Sevcikova, A. E. Raftery (2016). bayesPop: Probabilistic Population Projections. Journal of Statistical Software, 75(5), 1-29. doi:10.18637/jss.v075.i05
A. E. Raftery, N. Li, H. Sevcikova, P. Gerland, G. K. Heilig (2012). Bayesian probabilistic population projections for all countries. Proceedings of the National Academy of Sciences 109:13915-13921.
P. Gerland, A. E. Raftery, H. Sevcikova, N. Li, D. Gu, T. Spoorenberg, L. Alkema, B. K. Fosdick, J. L. Chunn, N. Lalic, G. Bay, T. Buettner, G. K. Heilig, J. Wilmoth (2014). World Population Stabilization Unlikely This Century. Science 346:234-237.
H. Sevcikova, N. Li, V. Kantorova, P. Gerland and A. E. Raftery (2015). Age-Specific Mortality and Fertility Rates for Probabilistic Population Projections. arXiv:1503.05215. http://arxiv.org/abs/1503.05215
pop.trajectories.plot, pop.pyramid, pop.trajectories, get.pop.prediction, age.specific.migration
# NOT RUN {
library(bayesPop)
sim.dir <- tempfile()
# Countries can be given as a combination of numerical codes and names
# (218 is the UN code for Ecuador)
pred <- pop.predict(countries=c("Netherlands", 218, "Madagascar"), nr.traj=3,
                    output.dir=sim.dir)
pop.trajectories.plot(pred, "Ecuador", sum.over.ages=TRUE)
unlink(sim.dir, recursive=TRUE)
# }