
Rcan (version 1.3.70)

csu_group_cases: csu_group_cases

Description

csu_group_cases groups individual data into 5-year age groups and by other user-defined variables (sex, registry, etc.). Optionally, it can also group cancers based on a standard ICD-10 coding and extract the year from a custom date format.

Usage

csu_group_cases(df_data,
	var_age,
	group_by = NULL,
	var_cases = NULL,
	df_ICD = NULL,
	var_ICD = NULL,
	var_year = NULL,
	all_cancer = FALSE)

Arguments

df_data

Individual data. Must be an R data.frame (see the examples for importing a csv file).

var_age

Age variable (numeric). Values greater than 150 are treated as missing age.

group_by

(Optional) A vector of variables defining the different populations (sex, country, etc.).

var_cases

(Optional) Cases variable, for data that already contain a variable with the number of cases.

df_ICD

(Optional) ICD file with the ICD grouping information. Must have 2 fields: "ICD" and "LABEL". Must be provided if the var_ICD argument is defined. Two formats are possible.

Each ICD code on its own row, labelled with its ICD group:

ICD   LABEL
C82   NHL
C83   NHL
C84   NHL
C85   NHL
C96   NHL

ICD codes already grouped:

ICD_group    LABEL
C82-85,C96   NHL

Two ICD codes separated by "-" include all the ICD codes in between. Two ICD codes separated by "," include only those 2 codes. For instance, C82-85,C96 (or C82-C85, C96) includes C82, C83, C84, C85 and C96.

Example: ICD_group_GLOBOCAN
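The range and comma rules for the grouped format can be illustrated with a small base R sketch. The helper expand_icd_group is hypothetical and not part of Rcan. It assumes codes made of one letter followed by two digits (no decimal subcodes) and is not meant to reproduce how csu_group_cases parses df_ICD internally.

```r
# Hypothetical helper (not part of Rcan): expand one grouped ICD string,
# e.g. "C82-85,C96", into the individual 3-character codes it covers.
# Assumes codes of the form letter + two digits, with no decimal subcodes.
expand_icd_group <- function(icd_group) {
  # Split on commas and drop surrounding whitespace
  parts <- trimws(strsplit(icd_group, ",", fixed = TRUE)[[1]])
  codes <- lapply(parts, function(part) {
    bounds <- strsplit(part, "-", fixed = TRUE)[[1]]
    letter <- substr(bounds[1], 1, 1)
    # Keep only the digits, so "C85" and "85" are both read as 85
    nums <- as.integer(gsub("[^0-9]", "", bounds))
    # A single code gives from == to; a range gives every code in between
    sprintf("%s%02d", letter, seq(nums[1], nums[length(nums)]))
  })
  unlist(codes)
}

expand_icd_group("C82-85,C96")
expand_icd_group("C82-C85, C96")
```

Both calls should list C82, C83, C84, C85 and C96, matching the rule stated above.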

var_ICD

(Optional) ICD variable in the individual data. Must be provided if the df_ICD argument is defined.

var_year

(Optional) Year variable. Extracts the year from a custom date format, as long as the year is expressed with 4 digits (e.g. "yyyymmdd", "ddmmyyyy", "yyyy/mm", "dd-mm-yyyy", etc.), and groups the data by year.
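As an illustration of the idea only, the sketch below pulls a 4-digit year out of delimited date strings with a regular expression. The helper extract_year is hypothetical and not part of Rcan. It covers only formats where the year is set off by a separator (such as "yyyy/mm" or "dd-mm-yyyy"). Compact formats such as "yyyymmdd" need positional information that this sketch does not attempt, and how csu_group_cases handles them is not described here.

```r
# Hypothetical helper (not part of Rcan): extract a 4-digit year from
# delimited date strings. Returns NA where no isolated 4-digit run exists.
extract_year <- function(x) {
  x <- as.character(x)
  # A run of exactly four digits that does not touch any other digit
  m <- regexpr("(?<![0-9])[0-9]{4}(?![0-9])", x, perl = TRUE)
  hit <- !is.na(m) & m > 0
  year <- rep(NA_integer_, length(x))
  year[hit] <- as.integer(regmatches(x, m))
  year
}

extract_year(c("2011/05", "23-11-2009", "n/a"))
```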

all_cancer

(Optional) If TRUE, also calculates the number of cases for all cancers (C00-97) and for all cancers but non-melanoma skin cancer (C00-97 but C44). Requires the var_ICD and df_ICD arguments to be defined.
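The two totals can be pictured with a short base R sketch on made-up codes. It assumes 3-character ICD-10 codes and only illustrates which codes fall in each total; it is not the package's own implementation.

```r
# Made-up ICD-10 codes for illustration (3-character form assumed)
icd <- c("C18", "C44", "C50", "D05", "C97")

# Numeric part of each code; non-numeric parts become NA
site <- suppressWarnings(as.integer(substr(icd, 2, 3)))

# All cancers: C00 to C97
in_all <- substr(icd, 1, 1) == "C" & !is.na(site) & site <= 97

# All cancers but non-melanoma skin cancer: C00 to C97, excluding C44
in_all_but_c44 <- in_all & site != 44

sum(in_all)          # number of cases counted as "all cancers"
sum(in_all_but_c44)  # same, without C44
```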

Value

Returns a data frame.

Details

For most analyses, a database of individual cases needs to be grouped by category. This function groups data by 5-year age group and by other user-defined variables. The next step is to add 5-year population data (see csu_merge_cases_pop).
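To make the grouping step concrete, here is a minimal base R sketch on toy data. It shows the general idea (one row per case turned into counts per sex and 5-year age band) and is not the code used by csu_group_cases. The column names, the band numbering and the handling of missing ages are assumptions made for this illustration.

```r
# Toy individual-level data: one row per case (made-up values)
cases <- data.frame(
  sex = c(1, 1, 2, 2, 2, 1),
  age = c(3, 4, 62, 64, 999, 87)
)

# Ages above 150 are treated as missing, as described for var_age
cases$age[cases$age > 150] <- NA

# Index of the 5-year band: ages 0-4 -> 0, 5-9 -> 1, and so on
# (illustrative numbering, not necessarily the one Rcan uses)
cases$age_band <- cases$age %/% 5

# One row per sex and age band, with the number of cases in each.
# Note: aggregate() drops rows with a missing age by default;
# the package may treat missing ages differently.
cases$n <- 1
grouped <- aggregate(n ~ sex + age_band, data = cases, FUN = sum)
grouped
```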

See Also

csu_merge_cases_pop, csu_asr, csu_eapc, csu_ageSpecific, csu_ageSpecific_top, csu_bar_top, csu_time_trend, csu_trendCohortPeriod

Examples

# NOT RUN {
# you can import your data from a csv file using read.csv:
# mydata <-  read.csv("mydata.csv", sep=",")

data(ICD_group_GLOBOCAN)
data(data_individual_file)

#group individual data by 
# 5 year age group 
df_data_age <- csu_group_cases(data_individual_file,
  var_age="age",
  group_by=c("sex", "regcode", "reglabel", "site")) 

#group individual data by 
# 5 year age group 
# ICD grouping from dataframe ICD_group_GLOBOCAN

df_data_icd <- csu_group_cases(data_individual_file,
  var_age="age",
  group_by=c("sex", "regcode", "reglabel"),
  df_ICD = ICD_group_GLOBOCAN,
  var_ICD  ="site") 

#group individual data by 
# 5 year age group 
# ICD grouping from dataframe ICD_group_GLOBOCAN
# year (extract from date of incidence)

df_data_year <- csu_group_cases(data_individual_file,
  var_age="age",
  group_by=c("sex", "regcode", "reglabel"),
  df_ICD = ICD_group_GLOBOCAN,
  var_ICD  ="site",
  var_year = "doi")       
	

# you can export a result as a csv file using write.csv:
# write.csv(df_data_year, file="result.csv")
				  		  
# }
