create_synthetic_data: Generates a survival data set for synthetic streaming service subscription data

Description

Generates a synthetic survival data set that simulates streaming service subscriptions. The survival event is cancellation of the subscription, modeled as a function of household income and the average number of hours watched in the prior month. Users can adjust the level of censoring and the variance in the data with the supplied parameters, or call the function with no parameters to get a default distribution.

Usage

create_synthetic_data(
  sample_size = 250,
  minimum_income = 5000,
  median_income = 50000,
  income_variance = 10000,
  min_watchhours = 0,
  max_watchhours = 6,
  censor_percentage = 0,
  min_censor_amount = 0,
  max_censor_amount = 0,
  study_time_in_months = 48,
  perturbation_shift = 0
)

Value

A survival data set suitable for modeling using spect_train.

Arguments

sample_size: optional - size of the sample population to generate
minimum_income: optional - minimum household income used to generate the distribution
median_income: optional - median household income used to generate the distribution
income_variance: optional - variance to use when generating the household income distribution
min_watchhours: optional - minimum average number of hours watched used to generate the distribution
max_watchhours: optional - maximum average number of hours watched used to generate the distribution
censor_percentage: optional - percentage of population to artificially censor
min_censor_amount: optional - minimum number of months of censoring to apply to the censored population
max_censor_amount: optional - maximum number of months of censoring to apply to the censored population
study_time_in_months: optional - observation horizon in months
perturbation_shift: optional - bound on the amount by which the formulaic result is randomly perturbed. Use zero for no perturbation

Author

Stephen Abrams, stephen.abrams@louisville.edu

Examples

data <- create_synthetic_data()
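
The default call above can be extended by overriding individual arguments from the Usage section. The following is a minimal sketch rather than an official example: it assumes the function is exported by the spect package (inferred from the reference to spect_train) and that the generator draws from R's standard random number generator, so that set.seed makes the result repeatable. The name custom_data is illustrative only.

```r
# Assumption: create_synthetic_data() is exported by the 'spect' package.
library(spect)

# Fix the random seed so repeated runs should draw the same sample
# (assumes the generator uses R's standard random number generator).
set.seed(42)

# Override a few documented defaults; every other argument keeps its default.
custom_data <- create_synthetic_data(
  sample_size = 500,
  median_income = 60000,
  study_time_in_months = 36
)

# Inspect the result without assuming particular column names.
str(custom_data)
head(custom_data)
```

Inspecting the result with str and head before modeling is a quick way to confirm the columns and the observation horizon before passing the data to spect_train.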
