Get the births file from a DHS .dta file and convert it to person-month format.
getBirths(filepath = NULL, data = NULL, surveyyear = NA,
  variables = c("caseid", "v001", "v002", "v004", "v005", "v021", "v022",
    "v023", "v024", "v025", "v139", "bidx"),
  strata = c("v024", "v025"), dob = "b3", alive = "b5", age = "b7",
  date.interview = "v008", month.cut = c(1, 12, 24, 36, 48, 60),
  year.cut = seq(1980, 2020, by = 5), cmc.adjust = 0, compact = FALSE,
  compact.by = c("v001", "v024", "v025", "v005"))
filepath: file path of the raw .dta file from DHS. Only used when a data frame is not provided in the function call.
data: data frame of a DHS survey.
surveyyear: year of the survey.
variables: vector of variables to be used in obtaining the person-month files. The variables correspond to the DHS Recode Manual VI. For early DHS data, the variable names may need to be changed.
strata: vector of variable names used for strata. If a single variable is specified, that variable is used as the strata indicator. If multiple variables are specified, the interaction of these variables is used as the strata indicator.
dob: variable name for the date of birth.
alive: variable name for the indicator of whether the child was alive or dead at the time of interview.
age: variable name for the age at death of the child in completed months.
date.interview: variable name for the date of interview.
month.cut: the cutoffs of the age group bins, in months. Default values are 1, 12, 24, 36, 48, and 60, representing the age groups [0, 1), [1, 12), [12, 24), ..., [48, 60).
year.cut: the cutoffs of the time period bins, including both boundaries. Default values are 1980, 1985, ..., 2020, representing the time periods 80-84, 85-89, ..., 15-19. Note that if each bin contains one year, the last year in the output is max(year.cut)-1. For example, if year.cut = 1980:2020, the last year in the output is 2019.
cmc.adjust: number of months to add to the recorded month in the dataset. Some DHS surveys do not use the Gregorian calendar (the calendar used in most of the world). For example, the Ethiopian calendar is in general 92 months behind the Gregorian calendar. Setting cmc.adjust to 92 adds 92 months to all dates in the dataset, effectively transforming the Ethiopian calendar to the Gregorian calendar.
compact: logical indicator of whether the compact format is returned. In the compact output, person-months are aggregated by cluster, age, and time. The total number of person-months and deaths in each group are returned instead of the raw person-months.
compact.by: vector of variables to summarize the compact form by.
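To make the date and binning arguments above concrete, the following base R sketch shows how a century month code (CMC) relates to calendar dates and how month.cut and year.cut partition ages and years. It is an illustration only and is not taken from the getBirths source. It assumes the DHS convention that CMC = (year - 1900) * 12 + month; the helper names cmc_to_year and cmc_to_month and all input values are hypothetical.

```r
# Hypothetical helpers, assuming CMC = (year - 1900) * 12 + month.
cmc_to_year  <- function(cmc) 1900 + (cmc - 1) %/% 12
cmc_to_month <- function(cmc) (cmc - 1) %% 12 + 1

# A date recorded as year 2008, month 4 in a calendar that runs 92 months behind.
cmc_recorded <- (2008 - 1900) * 12 + 4   # 1300
cmc_adjusted <- cmc_recorded + 92        # the shift described for cmc.adjust = 92
cmc_to_year(cmc_adjusted)                # 2015
cmc_to_month(cmc_adjusted)               # 12

# Age groups implied by the default month.cut: [0,1), [1,12), ..., [48,60).
month.cut <- c(1, 12, 24, 36, 48, 60)
age_lower <- c(0, head(month.cut, -1))
age_labels <- sprintf("[%d,%d)", age_lower, month.cut)
age_months <- c(0, 5, 12, 47, 59)        # made-up ages in completed months
age_bin <- cut(age_months, breaks = c(0, month.cut),
               right = FALSE, labels = age_labels)

# Time periods implied by the default year.cut: 80-84, 85-89, ..., 15-19.
year.cut <- seq(1980, 2020, by = 5)
period_start <- head(year.cut, -1)
period_end <- year.cut[-1] - 1
period_labels <- sprintf("%02d-%02d", period_start %% 100, period_end %% 100)
years <- c(1980, 1984, 1985, 2019)       # made-up calendar years
period <- cut(years, breaks = year.cut, right = FALSE, labels = period_labels)

data.frame(age_months, age_bin)
data.frame(years, period)
```

Both calls to cut use right = FALSE so that each bin includes its lower cutoff and excludes its upper one, which matches the half-open intervals listed for month.cut.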
This function returns a new data frame in which each row represents a person-month, with the additional variables specified in the function arguments.
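As a rough illustration of the two output shapes, the base R sketch below expands a few children into one row per month observed and then collapses those rows into counts of person-months and deaths. This is not the getBirths implementation. The toy data, the column names (child, cluster, months_observed, died), and the rule that a death is attached to the last observed month are all assumptions made for the example, and the time period dimension is left out for brevity.

```r
# Toy input (hypothetical): months each child was observed, and whether the
# observation ended in death.
children <- data.frame(
  child = 1:3,
  cluster = c(1, 1, 2),
  months_observed = c(3, 14, 1),
  died = c(TRUE, FALSE, TRUE)
)

# Expand to one row per child-month; age_month is age in completed months.
pm <- do.call(rbind, lapply(seq_len(nrow(children)), function(i) {
  n <- children$months_observed[i]
  data.frame(
    child = children$child[i],
    cluster = children$cluster[i],
    age_month = seq_len(n) - 1,
    # Assumption: a death is attached to the child's last observed month.
    died = as.integer(children$died[i] & seq_len(n) == n)
  )
}))

# Assign each person-month to an age group using the default month.cut.
month.cut <- c(1, 12, 24, 36, 48, 60)
pm$age_group <- cut(pm$age_month, breaks = c(0, month.cut), right = FALSE)

# Compact-style summary: total person-months and deaths per cluster and age group.
pm$total <- 1
compact <- aggregate(cbind(total, died) ~ cluster + age_group,
                     data = pm, FUN = sum)
compact
```

The long table pm has one row per child-month, while compact has one row per cluster and age group, which is the kind of reduction the compact argument describes.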
# NOT RUN {
my_fp <- "/myExampleFilepath/surveyData.DTA"
DemoData <- getBirths(filepath = my_fp, surveyyear = 2015)
# }