iai: Interpretable AI R Interface

iai is a package providing an interface to the algorithms of Interpretable AI from the R programming language, including:

  • Optimal Trees for classification, regression, prescription and survival analysis
  • Optimal Imputation for missing data imputation and outlier detection
  • Optimal Feature Selection for exact sparse regression
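
The same basic workflow applies across these learner types: create a learner, tune it with a grid search, fit it to training data, and query the fitted model. The following is a minimal sketch of that pattern for an Optimal Classification Tree on the built-in iris data; the parameter values are illustrative only.

# Split the data into training and test sets
X <- iris[, 1:4]
y <- iris$Species
split <- iai::split_data("classification", X, y, seed = 1)

# Tune the depth of an Optimal Classification Tree with a grid search
grid <- iai::grid_search(
  iai::optimal_tree_classifier(random_seed = 1),
  max_depth = 1:3
)
iai::fit(grid, split$train$X, split$train$y)

# Evaluate and query the best tree found by the search
lnr <- iai::get_learner(grid)
iai::score(lnr, split$test$X, split$test$y, criterion = "misclassification")
iai::predict(lnr, split$test$X)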

Installation and Usage

Please refer to the official Interpretable AI documentation for information on setting up and using the package.
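
For reference, a first-time setup typically installs the package from CRAN and then downloads the Julia dependencies. The sketch below uses the helper functions listed later on this page and assumes an IAI license is configured as described in the official documentation.

install.packages("iai")
iai::install_julia()         # download and install Julia automatically
iai::install_system_image()  # download and install the IAI system image
iai::iai_setup()             # initialize Julia and the IAI package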

Install: install.packages('iai')
Monthly Downloads: 399
Version: 1.7.0
License: MIT + file LICENSE
Maintainer: Jack Dunn
Last Published: December 6th, 2021

Functions in iai (1.7.0)

categorical_classification_reward_estimator

Learner for conducting reward estimation with categorical treatments and classification outcomes
as.mixeddata

Convert a vector of values to IAI mixed data format
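
For instance, a numeric column containing some categorical codes could be marked as mixed data like this (the column name and level are hypothetical):

# Treat "Unknown" entries of an otherwise numeric column as a categorical level
X$duration <- iai::as.mixeddata(X$duration, categorical_levels = c("Unknown"))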
autoplot.similarity_comparison

Construct a ggplot2::ggplot object plotting the results of the similarity comparison
apply

Return the leaf index in a tree model into which each point in the features falls
add_julia_processes

Add additional Julia worker processes to parallelize workloads
apply_nodes

Return the indices of the points in the features that fall into each node of a trained tree model
autoplot.roc_curve

Construct a ggplot2::ggplot object plotting the ROC curve
autoplot.grid_search

Construct a ggplot2::ggplot object plotting grid search results for Optimal Feature Selection learners
autoplot.stability_analysis

Construct a ggplot2::ggplot object plotting the results of the stability analysis
categorical_survival_reward_estimator

Learner for conducting reward estimation with categorical treatments and survival outcomes
cleanup_installation

Remove all traces of automatic Julia/IAI installation
delete_rich_output_param

Delete a global rich output parameter
fit_cv

Fit a grid search to the training data with cross-validation
categorical_regression_reward_estimator

Learner for conducting reward estimation with categorical treatments and regression outcomes
categorical_reward_estimator

Learner for conducting reward estimation with categorical treatments
get_policy_treatment_outcome

Return the quality of the treatments at a node of a tree
get_survival_expected_time

Return the predicted expected survival time at a node of a tree
all_treatment_combinations

Return a dataframe containing all treatment combinations of one or more treatment vectors, ready for use as treatment candidates in `fit_predict!` or `predict`
get_survival_hazard

Return the predicted hazard ratio at a node of a tree
get_policy_treatment_rank

Return the treatments ordered from most effective to least effective at a node of a tree
fit_predict

Fit a reward estimation model on features, treatments and outcomes and return predicted counterfactual rewards for each observation, as well as the score of the internal estimators.
imputation_learner

Generic learner for imputing missing values
clone

Return an unfitted copy of a learner with the same parameters
fit

Fit a model to the training data
get_machine_id

Return the machine ID for the current computer.
get_depth

Get the depth of a node of a tree
get_split_weights

Return the weights for numeric and categoric features used in the hyperplane split at a node of a tree
equal_propensity_estimator

Learner that estimates equal propensity for all treatments.
get_estimation_densities

Return the total kernel density surrounding each treatment candidate for the propensity/outcome estimation problems in a fitted learner.
get_num_fits

Return the number of fits along the path in the trained learner
get_stability_results

Return the trained trees in order of increasing objective value, along with their variable importance scores for each feature
fit_and_expand

Fit an imputation learner with training features and create adaptive indicator features to encode the missing pattern
fit_transform

Fit an imputation model using the given features and impute the missing values in these features
is_hyperplane_split

Check if a node of a tree applies a hyperplane split
multi_questionnaire.grid_search

Construct an interactive tree questionnaire using multiple tree learners from the results of a grid search
get_grid_result_summary

Return a summary of the results from the grid search
get_split_feature

Return the feature used in the split at a node of a tree
get_train_errors

Extract the training objective value for each candidate tree in the comparison, where a lower value indicates a better solution
get_tree

Return a copy of the learner that uses a specific tree rather than the tree with the best training objective.
get_grid_results

Return a summary of the results from the grid search
fit_transform_cv

Train a grid using cross-validation with features and impute all missing values in these features
get_split_threshold

Return the threshold used in the split at a node of a tree
copy_splits_and_refit_leaves

Copy the tree split structure from one learner into another and refit the models in each leaf of the tree using the supplied data
multi_tree_plot

Generic function for constructing an interactive tree visualization of multiple tree learners
optimal_feature_selection_regressor

Learner for conducting Optimal Feature Selection on regression problems
predict_expected_survival_time

Return the expected survival time estimate made by a model for each point in the features.
random_forest_classifier

Learner for training random forests for classification problems
predict_hazard

Return the fitted hazard coefficient estimate made by a model for each point in the features.
optimal_tree_classifier

Learner for training Optimal Classification Trees
get_best_params

Return the best parameter combination from a grid
get_classification_label

Return the predicted label at a node of a tree
decision_path

Return a matrix where entry (i, j) is true if the ith point in the features passes through the jth node in a trained tree model.
get_cluster_assignments

Return the indices of the trees assigned to each cluster, under the clustering of a given number of trees
get_classification_proba

Return the predicted probabilities of class membership at a node of a tree
convert_treatments_to_numeric

Convert `treatments` from symbol/string format into numeric values.
get_cluster_distances

Return the distances between the centroids of each pair of clusters, under the clustering of a given number of trees
get_cluster_details

Return the centroid information for each cluster, under the clustering of a given number of trees
get_features_used

Return the names of the features used by the learner
random_forest_regressor

Learner for training random forests for regression problems
get_num_nodes

Return the number of nodes in a trained learner
get_learner

Return the fitted learner using the best parameter combination from a grid
get_num_samples

Get the number of training points contained in a node of a tree
get_lower_child

Get the index of the lower child at a split node of a tree
get_regression_weights

Return the weights for each feature in the regression prediction at a node of a tree
get_rich_output_params

Return the current global rich output parameter settings
get_regression_constant

Return the constant term in the regression prediction at a node of a tree
get_params

Return the value of all parameters on a learner
get_prescription_treatment_rank

Return the treatments ordered from most effective to least effective at a node of a tree
get_parent

Get the index of the parent node at a node of a tree
score

Generic function for calculating scores
multi_questionnaire

Generic function for constructing an interactive questionnaire using multiple tree learners
impute

Impute missing values using either a specified method or through validation
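
As a sketch, imputing an entire data frame `X` in one step might look like:

# Fit an imputation method and fill all missing values in X
X_imputed <- iai::impute(X)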
multi_questionnaire.default

Construct an interactive questionnaire using multiple tree learners as specified by questions
get_survival_curve_data

Extract the underlying data from a survival curve
get_survival_curve

Return the survival curve at a node of a tree
get_split_categories

Return the categoric/ordinal information used in the split at a node of a tree
get_grid_result_details

Return a vector of lists detailing the results of the grid search
get_prediction_weights

Return the weights for numeric and categoric features used for prediction in the trained learner
get_prediction_constant

Return the constant term in the prediction in the trained learner
get_roc_curve_data

Extract the underlying data from an ROC curve
transform

Impute missing values in a dataframe using a fitted imputation model
score.default

Calculate the score for a set of predictions on the given data
set_params

Set all supplied parameters on a learner
set_julia_seed

Set the random seed in Julia
stability_analysis

Conduct a stability analysis of the trees in a tree learner
glmnetcv_regressor

Learner for training GLMNet models for regression problems with cross-validation
get_upper_child

Get the index of the upper child at a split node of a tree
optimal_tree_policy_maximizer

Learner for training Optimal Policy Trees where the policy should aim to maximize outcomes
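
Policy tree learners are trained on a matrix of estimated rewards (one column per candidate treatment) rather than on observed outcomes. A minimal sketch, assuming data frames `X`, `rewards` and `new_X` prepared beforehand:

# Fit a policy tree that assigns each subgroup the treatment with the highest estimated reward
lnr <- iai::optimal_tree_policy_maximizer(max_depth = 2, random_seed = 1)
iai::fit(lnr, X, rewards)

iai::predict(lnr, new_X)                 # best treatment for each new point
iai::predict_treatment_rank(lnr, new_X)  # full ranking of treatments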
opt_knn_imputation_learner

Learner for conducting optimal k-NN imputation
random_forest_survival_learner

Learner for training random forests for survival problems
predict_reward

Return counterfactual rewards estimated using learner parameters for each observation in the supplied data and predictions
opt_svm_imputation_learner

Learner for conducting optimal SVM imputation
optimal_tree_policy_minimizer

Learner for training Optimal Policy Trees where the policy should aim to minimize outcomes
predict_shap

Calculate SHAP values for all points in the features using the learner
is_leaf

Check if a node of a tree is a leaf
impute_cv

Impute missing values using cross-validation
install_julia

Download and install Julia automatically.
glmnetcv_survival_learner

Learner for training GLMNet models for survival problems with cross-validation
numeric_reward_estimator

Learner for conducting reward estimation with numeric treatments
grid_search

Controls grid search over parameter combinations
is_ordinal_split

Check if a node of a tree applies an ordinal split
glmnetcv_classifier

Learner for training GLMNet models for classification problems with cross-validation
is_parallel_split

Check if a node of a tree applies a parallel split
iai_setup

Initialize Julia and the IAI package.
xgboost_classifier

Learner for training XGBoost models for classification problems
write_svg

Output a learner as an SVG image
mean_imputation_learner

Learner for conducting mean imputation
read_json

Read in a learner or grid saved in JSON format
multi_tree_plot.default

Construct an interactive tree visualization of multiple tree learners as specified by questions
plot.stability_analysis

Plot a stability analysis
predict_treatment_outcome

Return the estimated quality of each treatment in the trained model of the learner for each point in the features
predict

Return the predictions made by the model for each point in the features
predict_treatment_rank

Return the treatments in ranked order of effectiveness for each point in the features
multi_tree_plot.grid_search

Construct an interactive tree visualization of multiple tree learners from the results of a grid search
set_display_label

Show the probability of a specified label when visualizing a learner
score.learner

Calculate the score for a model on the given data
transform_and_expand

Transform features with a trained imputation learner and create adaptive indicator features to encode the missing pattern
install_system_image

Download and install the IAI system image automatically.
reward_estimator

Learner for conducting reward estimation with categorical treatments
is_mixed_ordinal_split

Check if a node of a tree applies a mixed ordinal/categoric split
is_mixed_parallel_split

Check if a node of a tree applies a mixed parallel/categoric split
is_categoric_split

Check if a node of a tree applies a categoric split
missing_goes_lower

Check if points with missing values go to the lower child at a split node of a tree
write_dot

Output a learner in .dot format
single_knn_imputation_learner

Learner for conducting heuristic k-NN imputation
roc_curve

Generic function for constructing an ROC curve
split_data

Split the data into training and test datasets
tree_plot

Specify an interactive tree visualization of a tree learner
write_pdf

Output a learner as a PDF image
write_json

Output a learner or grid in JSON format
numeric_classification_reward_estimator

Learner for conducting reward estimation with numeric treatments and classification outcomes
numeric_survival_reward_estimator

Learner for conducting reward estimation with numeric treatments and survival outcomes
optimal_tree_regressor

Learner for training Optimal Regression Trees
optimal_feature_selection_classifier

Learner for conducting Optimal Feature Selection on classification problems
opt_tree_imputation_learner

Learner for conducting optimal tree-based imputation
optimal_tree_prescription_minimizer

Learner for training Optimal Prescriptive Trees where the prescriptions should aim to minimize outcomes
optimal_tree_prescription_maximizer

Learner for training Optimal Prescriptive Trees where the prescriptions should aim to maximize outcomes
numeric_regression_reward_estimator

Learner for conducting reward estimation with numeric treatments and regression outcomes
write_questionnaire

Output a learner as an interactive questionnaire in HTML format
write_png

Output a learner as a PNG image
write_html

Output a learner as an interactive browser visualization in HTML format
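
For example, a fitted learner `lnr` could be saved as a standalone interactive visualization (the filename is arbitrary):

iai::write_html("learner.html", lnr)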
print_path

Print the decision path through the learner for each sample in the features
prune_trees

Use the trained trees in a learner along with the supplied validation data to determine the best value for the `cp` parameter and then prune the trees according to this value
set_threshold

For a binary classification problem, update the predicted labels in the leaves of the learner to predict a label only if the predicted probability is at least the specified threshold.
show_in_browser

Show interactive visualization of an object (such as a learner or curve) in the default browser
optimal_tree_survivor

Learner for training Optimal Survival Trees
predict_outcomes

Return the predicted outcome for each treatment made by a model for each point in the features
plot.grid_search

Plot grid search results for Optimal Feature Selection learners
predict_proba

Return the probabilities of class membership predicted by a model for each point in the features
variable_importance

Generate a ranking of the variables in the learner according to their importance during training. The results are normalized so that they sum to one.
tune_reward_kernel_bandwidth

Conduct the reward kernel bandwidth tuning procedure for a range of starting bandwidths and return the final tuned values.
xgboost_survival_learner

Learner for training XGBoost models for survival problems
xgboost_regressor

Learner for training XGBoost models for regression problems
roc_curve.default

Construct an ROC curve from predicted probabilities and true labels
questionnaire

Specify an interactive questionnaire of a tree learner
roc_curve.learner

Construct an ROC curve using a trained model on the given data
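
As a sketch, assuming a fitted binary classifier `lnr`, held-out data `test_X`/`test_y`, and a positive class labelled "A" (all placeholders):

roc <- iai::roc_curve(lnr, test_X, test_y, positive_label = "A")
plot(roc)  # or autoplot(roc) for a ggplot2 version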
rand_imputation_learner

Learner for conducting random imputation
show_questionnaire

Show an interactive questionnaire based on a learner in the default browser
similarity_comparison

Conduct a similarity comparison between the final tree in a learner and all trees in a new learner to consider the tradeoff between training performance and similarity to the original tree
optimal_tree_survival_learner

Learner for training Optimal Survival Trees
variable_importance_similarity

Calculate the similarity between the final tree in a tree learner and all trees in a new tree learner using variable importance scores.
write_booster

Write the internal booster saved in the learner to file
plot.roc_curve

Plot an ROC curve
refit_leaves

Refit the models in the leaves of a trained learner using the supplied data
zero_imputation_learner

Learner for conducting zero-imputation
set_reward_kernel_bandwidth

Save a new reward kernel bandwidth inside a learner, and return new reward predictions generated using this bandwidth for the original data used to train the learner.
set_rich_output_param

Set a global rich output parameter
reset_display_label

Reset the predicted probability displayed to be that of the predicted label when visualizing a learner
plot.similarity_comparison

Plot a similarity comparison