# Luis Torgo

#### 6 packages on CRAN

This package includes functions and data accompanying the book "Data Mining with R, learning with case studies" by Luis Torgo, CRC Press 2010.

Functions and data accompanying the second edition of the book "Data Mining with R, learning with case studies" by Luis Torgo, published by CRC Press.

An infrastructure for estimating the predictive performance of predictive models. It can also be used to compare and/or select among alternative ways of solving one or more predictive tasks. The main goal of the package is to provide a generic infrastructure for estimating the values of different predictive performance metrics using different estimation procedures. These estimation tasks can be applied to any solution (workflow) to the predictive tasks. The package provides easy-to-use standard workflows that allow the use of any available R modeling algorithm together with some pre-defined data pre-processing steps and prediction post-processing methods. It also provides means for assessing the statistical significance of the observed differences.
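To make the idea of "estimating a metric for a workflow with an estimation procedure" concrete, here is a minimal sketch in plain Python. It is not the package's API: the names `cv_estimate`, `mean_workflow`, `linear_workflow`, and `mse` are hypothetical, the data is synthetic, and k-fold cross-validation is only one of the possible estimation procedures the description alludes to.

```python
import random
from statistics import mean

def mean_workflow(train, test):
    """Toy workflow: predict the training-set mean target for every test row."""
    baseline = mean(y for _, y in train)
    return [baseline for _ in test]

def linear_workflow(train, test):
    """Toy workflow: fit a least-squares line on train, apply it to test."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in train) / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x for x, _ in test]

def mse(truth, preds):
    """Mean squared error between true and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(truth, preds))

def cv_estimate(data, workflow, metric, k=5, seed=0):
    """Return one metric score per fold of a k-fold cross-validation."""
    rows = list(data)
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        preds = workflow(train, test)
        scores.append(metric([y for _, y in test], preds))
    return scores

# Synthetic, noise-free data (an assumption made purely for illustration).
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
workflows = {"mean": mean_workflow, "linear": linear_workflow}
results = {name: cv_estimate(data, wf, mse) for name, wf in workflows.items()}
```

Because every workflow is just a function from `(train, test)` to predictions, the same estimation routine can score any of them, which is the property the description emphasizes. Comparing the per-fold scores in `results` is where a significance test would come in.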

A collection of miscellaneous basic statistical functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries, and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules, etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'BigCamelCase' style was consistently applied to functions borrowed from contributed R packages as well.

A framework for dynamically combining forecasting models for time series forecasting tasks. It leverages machine learning models from other packages to automatically combine expert advice using metalearning and other state-of-the-art forecast combination approaches. The predictive methods receive a data matrix as input, representing an embedded time series, and return a predictive ensemble model. The ensemble uses the generic functions 'predict()' and 'forecast()' to forecast future values of the time series. Moreover, an ensemble can be updated using methods such as 'update_weights()' or 'update_base_models()'. A complete description of the methods can be found in: Cerqueira, V., Torgo, L., Pinto, F., and Soares, C. "Arbitrated Ensemble for Time Series Forecasting." To appear at: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer International Publishing, 2017; and Cerqueira, V., Torgo, L., and Soares, C. "Arbitrated Ensemble for Solar Radiation Forecasting." International Work-Conference on Artificial Neural Networks. Springer, 2017 <doi:10.1007/978-3-319-59153-7_62>.
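The two ideas in this description, embedding a series into a data matrix and re-weighting forecasters over time, can be sketched as follows. This is an illustration under stated assumptions, not the package's interface and not the arbitrated ensemble from the cited papers: the functions `embed`, `naive`, `lag_mean`, `drift`, and `combine` are hypothetical, and the weighting rule (inverse of recent mean absolute error) is a deliberately simple stand-in.

```python
def embed(series, k):
    """Time-delay embedding: each row holds k consecutive values of the series."""
    return [series[i:i + k] for i in range(len(series) - k + 1)]

# Three toy base forecasters; each maps a row of lags to a one-step forecast.
def naive(lags):
    return lags[-1]

def lag_mean(lags):
    return sum(lags) / len(lags)

def drift(lags):
    return lags[-1] + (lags[-1] - lags[-2])

def combine(series, k, models, window=3):
    """One-step-ahead forecasts from a dynamically weighted combination.

    Each model's weight is the inverse of its mean absolute error over the
    last `window` steps, so weights are revised after every observation.
    """
    rows = embed(series, k + 1)  # k lag columns plus the target column
    errors = {name: [] for name in models}
    forecasts = []
    for row in rows:
        lags, target = row[:-1], row[-1]
        preds = {name: model(lags) for name, model in models.items()}
        weights = {}
        for name in models:
            recent = errors[name][-window:]
            mae = sum(recent) / len(recent) if recent else 1.0
            weights[name] = 1.0 / (mae + 1e-9)  # small constant avoids division by zero
        total = sum(weights.values())
        forecasts.append(sum(weights[n] * preds[n] for n in models) / total)
        # Only after forecasting do we observe the target and record errors.
        for name in models:
            errors[name].append(abs(preds[name] - target))
    return forecasts

models = {"naive": naive, "lag_mean": lag_mean, "drift": drift}
series = [float(i) for i in range(12)]  # synthetic linear trend
forecasts = combine(series, k=3, models=models)
```

On this synthetic trend the first forecast is an equal-weight average, and later forecasts are dominated by the drift model because it is the only one with zero recent error.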

Provides a set of functions that can be used to obtain better predictive performance on cost-sensitive and cost/benefit tasks (for both regression and classification). This includes re-sampling approaches that modify the original data set, biasing it towards the user's preferences.
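As a rough illustration of what "biasing the data set towards the user's preferences" can mean, the sketch below applies random over-sampling to a classification data set. It is a generic technique chosen for simplicity, not a reproduction of the package's functions: `random_oversample`, its parameters, and the toy data are all hypothetical, and the regression case is not covered.

```python
import random
from collections import Counter

def random_oversample(rows, label_of, target_counts, seed=0):
    """Duplicate randomly chosen rows until each listed label reaches its target count.

    `target_counts` encodes the user's preference, e.g. {"rare": 18} asks for
    at least 18 rows labelled "rare". Labels not listed are left unchanged.
    """
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)
    out = list(rows)
    for label, wanted in target_counts.items():
        pool = by_label.get(label, [])
        if not pool:
            continue  # nothing to copy from for this label
        extra = max(0, wanted - len(pool))
        out.extend(rng.choice(pool) for _ in range(extra))
    return out

# Toy imbalanced data set: 18 common cases and 2 rare ones.
data = [("a", "common")] * 18 + [("b", "rare")] * 2
balanced = random_oversample(data, lambda row: row[1], {"rare": 18})
counts = Counter(label for _, label in balanced)
```

A model trained on `balanced` sees the rare class as often as the common one, which shifts its errors towards the cases the user cares about at the price of duplicated rows.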