This dataset is borrowed from "Flexible parametric survival analysis using Stata: beyond the Cox model" (Royston and Lambert, 2011). It contains follow-up data on 2982 women with breast cancer who have undergone breast surgery. The women are followed from the time of surgery until death, relapse or censoring.
data(rott2)
The dataset rott2 contains the following variables:
patient ID number.
year of breast surgery (i.e. year of enrollment into the study), between the years 1978-1993.
relapse free interval measured in months.
relapse indicator.
metastasis free.
metastasis status.
overall survival.
overall survival indicator.
age at surgery measured in years.
menopausal status with levels "pre" and "post".
tumor size in three classes: <=20mm, >20-50mm and >50mm.
differentiation grade with levels 2 or 3.
progesterone receptors, fmol/l.
oestrogen receptors, fmol/l.
the number of positive lymph nodes.
hormonal therapy with levels "no" and "yes".
categorical variable indicating whether the patient received chemotherapy or not, with levels "no" and "yes".
a numeric indicator of whether the tumor was discovered recently with levels "1978-87" and "1988-93".
a numeric indicator of whether the patient did not receive chemotherapy. Recoded version of "chemo", where "yes" is recoded as 0 and "no" is recoded as 1.
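As an illustration of how a three-class size variable like the one described above can be built, the following is a minimal sketch. It assumes hypothetical raw tumor diameters in millimetres (`size_mm` is made up and is not part of rott2, which already stores the classes) and uses base R's `cut()` with right-closed intervals.

```r
# Hypothetical raw tumor diameters in mm (illustration only, not in rott2).
size_mm <- c(15, 20, 35, 50, 60)

# cut() uses right-closed intervals by default:
# (-Inf, 20], (20, 50], (50, Inf)
size_class <- cut(
  size_mm,
  breaks = c(-Inf, 20, 50, Inf),
  labels = c("<=20mm", ">20-50mm", ">50mm")
)

# Count of observations per size class.
table(size_class)
```

With right-closed intervals, a diameter of exactly 20 falls in the "<=20mm" class and exactly 50 in ">20-50mm", which matches the class boundaries given in the variable description.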
The following changes have been made to the original data in Royston and Lambert (2011):
- The variable "chemo" is recoded into the numeric indicator variable "no.chemo":
rott2$no.chemo <- as.numeric(rott2$chemo == "no")
- The following variables have been removed from the original dataset: enodes, pr_1, enodes_1, _st, _d, _t, _t0, since they are recodings of existing variables and are not used in this analysis.
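The two changes listed above can be sketched on a small stand-in data frame. This is only an illustration under stated assumptions: `toy`, the column name `pid`, and all values are hypothetical, and the original Stata data is not loaded here.

```r
# Toy stand-in for the original data; names and values are made up.
toy <- data.frame(
  pid = 1:4,
  chemo = factor(c("no", "yes", "no", "yes"), levels = c("no", "yes")),
  enodes = c(0.1, 0.5, 0.9, 0.3),
  `_st` = c(1, 1, 1, 1),
  check.names = FALSE  # keep the Stata-style name "_st" unchanged
)

# Recode chemo into a numeric indicator: "no" -> 1, "yes" -> 0.
toy$no.chemo <- as.numeric(toy$chemo == "no")

# Drop the derived variables; names absent from toy are simply ignored.
drop_cols <- c("enodes", "pr_1", "enodes_1", "_st", "_d", "_t", "_t0")
toy <- toy[, !(names(toy) %in% drop_cols), drop = FALSE]

names(toy)
```

Selecting columns with `%in%` rather than by position keeps the step safe when some of the listed variables are not present in the data frame.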
Royston, Patrick & Lambert, Paul C. (2011). Flexible parametric survival analysis using Stata: beyond the Cox model. College Station, Texas, U.S., Stata Press.