Colossus

The goal of Colossus is to provide an open-source means of performing survival analysis on big data with complex risk formulas. Colossus is designed to perform Cox proportional hazards regressions and Poisson regressions on datasets loaded as data.tables or data.frames. The risk models allowed are sums or products of linear, log-linear, or several other radiation dose-response formulas highlighted in the vignettes. Additional plotting capabilities are available.

By default, a fully portable version of the code is compiled, which does not support OpenMP on every system. Colossus requires OpenMP support to perform parallel calculations, and OpenMP support is currently not configured for Linux when compiling with clang. The environment variable “R_COLOSSUS_NOT_CRAN” is checked to determine whether OpenMP should be disabled in that case: the number of cores is set to 1 if the variable is empty, the operating system is detected as Linux, and the default compiler or the R compiler is clang. Separately, Colossus testing checks the “NOT_CRAN” variable to determine whether additional tests should be run; setting “NOT_CRAN” to “false” disables the longer tests.
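The single-core rule described above can be restated as a small base R sketch. The helper `limits_to_one_core` is hypothetical and is not part of Colossus; it only mirrors the three conditions named in the text and takes a single compiler string for simplicity. How Colossus itself reads these values is not shown here.

```r
# Hypothetical helper (not part of Colossus): restates the rule from the text.
# TRUE when the variable is empty, the OS is Linux, and the compiler is clang.
limits_to_one_core <- function(env_value, os, compiler) {
  !nzchar(env_value) && identical(os, "linux") && identical(compiler, "clang")
}

# Read the variable for the current session; "" means unset or empty.
colossus_flag <- Sys.getenv("R_COLOSSUS_NOT_CRAN", unset = "")

# Evaluate the rule for an example Linux + clang system.
print(limits_to_one_core(colossus_flag, os = "linux", compiler = "clang"))

# Disable the longer tests for this session, as described in the text.
Sys.setenv(NOT_CRAN = "false")
print(Sys.getenv("NOT_CRAN"))
```

Environment variables set with `Sys.setenv` apply only to the running R session; to affect compilation they would need to be set before the package is installed.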

Example

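This basic example, reproduced from the starting-description vignette, fits a Cox proportional hazards regression to a seven-row dataset with age intervals. The run shown stops at the 100-iteration limit without converging, so the printed estimates are illustrative only: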

library(data.table)
library(parallel)
library(Colossus)
## basic example code reproduced from the starting-description vignette

df <- data.table(
  "UserID" = c(112, 114, 213, 214, 115, 116, 117),
  "Starting_Age" = c(18, 20, 18, 19, 21, 20, 18),
  "Ending_Age" = c(30, 45, 57, 47, 36, 60, 55),
  "Cancer_Status" = c(0, 0, 1, 0, 1, 0, 0),
  "a" = c(0, 1, 1, 0, 1, 0, 1),
  "b" = c(1, 1.1, 2.1, 2, 0.1, 1, 0.2),
  "c" = c(10, 11, 10, 11, 12, 9, 11),
  "d" = c(0, 0, 0, 1, 1, 1, 1)
)
# For the interval case
time1 <- "Starting_Age"
time2 <- "Ending_Age"
event <- "Cancer_Status"

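# Covariate columns, the term number each is assigned to, and each subterm type
names <- c("a", "b", "c", "d")
term_n <- c(0, 1, 1, 2)
tform <- c("loglin", "lin", "lin", "plin")
modelform <- "M"

# Starting parameter values, one per covariate
a_n <- c(0.1, 0.1, 0.1, 0.1)

# 0 leaves the corresponding parameter free to vary
keep_constant <- c(0, 0, 0, 0)
der_iden <- 0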

control <- list(
  "lr" = 0.75, "maxiter" = 100, "halfmax" = 5, "epsilon" = 1e-9,
  "deriv_epsilon" = 1e-9, "abs_max" = 1.0,
  "verbose" = 2, "ties" = "breslow"
)

e <- RunCoxRegression(df, time1, time2, event, names, term_n, tform, keep_constant, a_n, modelform, control = control)
Interpret_Output(e)
#> |-------------------------------------------------------------------|
#> Final Results
#>    Covariate Subterm Term Number Central Estimate Standard Error 2-tail p-value
#>       <char>  <char>       <int>            <num>          <num>          <num>
#> 1:         a  loglin           0         42.10452            NaN            NaN
#> 2:         b     lin           1         98.72266    3781273.501      0.9999792
#> 3:         c     lin           1         96.82311    3698137.325      0.9999791
#> 4:         d    plin           2        101.10000       2326.871      0.9653437
#> 
#> Cox Model Used
#> -2*Log-Likelihood: 1.35,  AIC: 9.35
#> Iterations run: 100
#> maximum step size: 1.00e+00, maximum first derivative: 1.92e-04
#> Analysis did not converge, check convergence criteria or run further
#> Run finished in 0.25 seconds
#> |-------------------------------------------------------------------|
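
As a rough check on how to read the table above, the printed p-values and AIC can be reproduced in base R. This sketch assumes the "2-tail p-value" column is a two-sided Wald test under a normal approximation and that AIC is -2*log-likelihood plus twice the number of parameters; both are assumptions that happen to agree with the printed numbers, not a statement of how `Interpret_Output` computes them.

```r
# Values copied from the printed results table above.
estimate  <- c(b = 98.72266, c = 96.82311, d = 101.10000)
std_error <- c(b = 3781273.501, c = 3698137.325, d = 2326.871)

# Two-sided Wald p-value under a normal approximation (assumed, not confirmed).
z <- estimate / std_error
p_value <- 2 * pnorm(-abs(z))
print(signif(p_value, 7))

# AIC = -2*log-likelihood + 2 * (number of parameters), using the printed values.
neg2_loglik <- 1.35
n_params <- 4
aic <- neg2_loglik + 2 * n_params
print(aic)
```

Standard errors that are several orders of magnitude larger than the estimates, as here, are another sign that the fit did not settle on a stable solution.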

Install

install.packages('Colossus')

Monthly Downloads: 279
Version: 1.3.0
License: GPL (>= 3)

Maintainer: Eric Giunta
Last Published: June 5th, 2025

Functions in Colossus (1.3.0)

Joint_Multiple_Events
Automates creating data for a joint competing risks analysis

Def_modelform_fix
Automatically assigns geometric-mixture values and checks that a valid modelform is used

Event_Count_Gen
Uses a table, list of categories, and list of event summaries to generate person-count tables

Def_model_control
Automatically assigns missing model control values

Gather_Guesses_CPP
Performs checks to gather a list of guesses and iterations

RunCoxNull
Performs basic Cox Proportional Hazards regression with the null model

RunCoxRegression
Performs basic Cox Proportional Hazards regression without special options

OMP_Check
Checks the OMP flag

RunCoxPlots
Performs Cox Proportional Hazards model plots

PoissonCurveSolver
Calculates the likelihood curve for a Poisson model directly

RunCaseControlRegression_Omnibus
Performs Matched Case-Control Conditional Logistic Regression

Rcpp_version
Checks default R C++ compiler

Linked_Dose_Formula
Calculates full parameter list for special dose formula

Linked_Lin_Exp_Para
Calculates the additional parameter for a linear-exponential formula with known maximum

RunCoxRegression_Omnibus
Performs Cox Proportional Hazards regression using the omnibus function

RunCoxRegression_Guesses_CPP
Performs basic Cox Proportional Hazards regression, generates multiple starting guesses on C++ side

RunCoxRegression_Omnibus_Multidose
Performs Cox Proportional Hazards regression using the omnibus function with multiple column realizations

RunCoxRegression_Single
Performs basic Cox Proportional Hazards calculation with no derivative

RunCoxRegression_Basic
Performs basic Cox Proportional Hazards regression with a multiplicative log-linear model

Rcomp_version
Checks how R was compiled

RunCoxRegression_CR
Performs basic Cox Proportional Hazards regression with competing risks

RunCoxRegression_Strata
Performs basic Cox Proportional Hazards regression with strata effect

RunCoxRegression_Tier_Guesses
Performs basic Cox regression, with multiple guesses, starts with solving for a single term

Replace_Missing
Automatically assigns missing values in listed columns

RunPoissonRegression_Strata
Performs Poisson regression with strata effect

RunPoissonRegression_Tier_Guesses
Performs basic Poisson regression, with multiple guesses, starts with a single term

RunPoissonRegression
Performs basic Poisson regression

RunPoissonRegression_Residual
Calculates Poisson residuals

gcc_version
Checks default C++ compiler

RunPoissonRegression_Single
Performs Poisson regression with no derivative calculations

RunPoissonRegression_Guesses_CPP
Performs basic Poisson regression, generates multiple starting guesses on C++ side

gen_time_dep
Applies time dependence to parameters

System_Version
Checks OS, compilers, and OMP

Time_Since
Automates creating a date since a reference column

get_os
Checks system OS

RunPoissonEventAssignment
Predicts how many events are due to baseline vs excess

RunPoissonRegression_Omnibus
Performs basic Poisson regression using the omnibus function

RunPoissonRegression_Joint_Omnibus
Performs joint Poisson regression using the omnibus function

factorize
Splits a parameter into factors

RunPoissonEventAssignment_bound
Predicts how many events are due to baseline vs excess at the confidence bounds of a single parameter

interact_them
Defines interactions

factorize_par
Splits a parameter into factors in parallel

Check_Trunc
Applies time duration truncation limits to create columns for Cox model

Date_Shift
Automates creating a date difference column

Cox_Relative_Risk
Calculates hazard ratios for a reference vector

Def_Control
Automatically assigns missing control values

CoxCurveSolver
Calculates the likelihood curve for a Cox model directly

Correct_Formula_Order
Corrects the order of terms/formula/etc

Def_Control_Guess
Automatically assigns missing guessing control values

Convert_Model_Eq
Converts a string equation to regression model inputs

Check_Verbose
General purpose verbosity check

Check_Dupe_Columns
Checks for duplicated column names

Likelihood_Ratio_Test
Defines the likelihood ratio test

Interpret_Output
Prints a regression output clearly

GetCensWeight
Calculates and returns data for time by hazard and survival to estimate censoring rate

Event_Time_Gen
Uses a table, list of categories, list of summaries, list of events, and person-year information to generate person-time tables