Yachen Yan

5 packages on CRAN

99.99th percentile

Estimation and regularization of the covariance matrix of asset returns. For covariance matrix estimation, three major types of factor models are included: the macroeconomic factor model, the fundamental factor model, and the statistical factor model. For covariance matrix regularization, four regularized estimators are included: banding, tapering, hard-thresholding, and soft-thresholding. The tuning parameters of these regularized estimators are selected via cross-validation.
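The four regularized estimators named above can be sketched in a few lines. This is a minimal pure-Python illustration, not the package's R code: the function names are hypothetical, the linear taper weights are one common choice and an assumption here, and the tuning parameters (`thr`, `k`) are fixed by hand rather than selected via cross-validation.

```python
def apply_offdiag(cov, fn):
    """Apply fn(value, distance_from_diagonal) to every off-diagonal entry."""
    n = len(cov)
    return [[cov[i][j] if i == j else fn(cov[i][j], abs(i - j))
             for j in range(n)] for i in range(n)]

def hard_threshold(cov, thr):
    """Zero out off-diagonal entries whose magnitude is below thr."""
    return apply_offdiag(cov, lambda v, d: v if abs(v) >= thr else 0.0)

def soft_threshold(cov, thr):
    """Shrink off-diagonal entries toward zero by thr."""
    def shrink(v, d):
        sign = 1.0 if v > 0 else -1.0
        return sign * max(abs(v) - thr, 0.0)
    return apply_offdiag(cov, shrink)

def band(cov, k):
    """Keep only entries within k positions of the diagonal."""
    return apply_offdiag(cov, lambda v, d: v if d <= k else 0.0)

def taper_weight(d, k):
    """Assumed linear taper: 1 up to k/2, then decaying to 0 at distance k."""
    half = k / 2.0
    if d <= half:
        return 1.0
    if d < k:
        return 2.0 - d / half
    return 0.0

def taper(cov, k):
    """Down-weight entries smoothly as they move away from the diagonal."""
    return apply_offdiag(cov, lambda v, d: v * taper_weight(d, k))

# Toy 3x3 covariance matrix (made-up numbers, for illustration only).
cov = [[1.0, 0.5, 0.1],
       [0.5, 1.0, -0.3],
       [0.1, -0.3, 1.0]]

hard = hard_threshold(cov, 0.2)
soft = soft_threshold(cov, 0.2)
banded = band(cov, 1)
tapered = taper(cov, 2)
```

Banding and tapering assume the variables have a meaningful ordering, so that entries far from the diagonal are expected to be small. Thresholding makes no such assumption.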

MLmetrics

CRAN, 99.99th percentile

A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.
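To make the three metric families concrete, here is a small pure-Python sketch with one regression metric (RMSE), two classification metrics (accuracy and log loss), and one ranking metric (AUC). The function names and signatures are illustrative assumptions and are not the package's R API.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error for regression."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise AUC above is quadratic in the number of samples, which is fine for illustration but slow on large datasets; rank-based formulas are the usual alternative.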

99.99th percentile

A pure R implementation of Bayesian global optimization with Gaussian processes.

99.99th percentile

An efficient C++-based implementation of the "Follow The (Proximally) Regularized Leader" online learning algorithm, proposed in McMahan et al. (2013) <DOI:10.1145/2487575.2488200>.
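For readers unfamiliar with the algorithm, the per-coordinate FTRL-Proximal update for logistic regression described in McMahan et al. (2013) can be sketched as follows. This is a pure-Python illustration, not the package's C++ code; the class name, the sparse-dict input format, and the default hyperparameters are assumptions made for the example.

```python
import math

class FTRLProximalSketch:
    """Per-coordinate FTRL-Proximal for logistic regression (illustrative)."""

    def __init__(self, dim, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # accumulated adjusted gradients
        self.n = [0.0] * dim  # accumulated squared gradients

    def weight(self, i):
        """Closed-form weight; the L1 term yields exact zeros."""
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        """x is a sparse example: dict mapping feature index to value."""
        s = sum(self.weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # guard against overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example (x, y) with y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for coordinate i
            w = self.weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p

# Toy stream: feature 0 is a bias, feature 1 marks positives, feature 2 negatives.
model = FTRLProximalSketch(dim=3)
first_prediction = model.update({0: 1.0, 1: 1.0}, 1)
for _ in range(20):
    model.update({0: 1.0, 2: 1.0}, 0)
    model.update({0: 1.0, 1: 1.0}, 1)
```

Only the coordinates present in an example are touched, which is what makes the method practical for very sparse, high-dimensional data.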

lightgbm

CRAN, 99.99th percentile

Tree-based algorithms can be improved by introducing boosting frameworks. 'LightGBM' is one such framework, based on Ke, Guolin et al. (2017) <https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision>. This package offers an R interface to work with it. It is designed to be distributed and efficient with the following advantages: 1. Faster training speed and higher efficiency. 2. Lower memory usage. 3. Better accuracy. 4. Parallel learning supported. 5. Capable of handling large-scale data. In recognition of these advantages, 'LightGBM' has been widely used in many winning solutions of machine learning competitions. Comparison experiments on public datasets suggest that 'LightGBM' can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. In addition, parallel experiments suggest that in certain circumstances, 'LightGBM' can achieve a linear speed-up in training time by using multiple machines.
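The general idea of boosting trees can be shown with a deliberately tiny example: each round fits a one-split tree (a stump) to the current residuals and adds a damped copy of it to the ensemble. This is a generic gradient boosting sketch for squared error on a single feature, written in pure Python under those simplifying assumptions. It does not reproduce the specific techniques of 'LightGBM', and all names are hypothetical.

```python
def fit_stump(x, residual):
    """Find the threshold split on a 1-D feature that minimizes squared error."""
    best = None
    for thr in sorted(set(x))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for xi, r in zip(x, residual) if xi <= thr]
        right = [r for xi, r in zip(x, residual) if xi > thr]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

def boost(x, y, rounds=50, lr=0.3):
    """Fit an additive model of stumps to the residuals, one stump per round."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        thr, lm, rm = fit_stump(x, residual)
        stumps.append((thr, lm, rm))
        pred = [p + lr * (lm if xi <= thr else rm) for xi, p in zip(x, pred)]
    return {"base": base, "lr": lr, "stumps": stumps}

def predict(model, xi):
    """Sum the base value and every stump's damped contribution."""
    out = model["base"]
    for thr, lm, rm in model["stumps"]:
        out += model["lr"] * (lm if xi <= thr else rm)
    return out

# Toy step function (made-up data): y jumps from 1 to 5 between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [1, 1, 1, 5, 5, 5]
model = boost(x, y)
```

On this toy data every round picks the same split, so the residuals shrink by a factor of `1 - lr` each round and the fitted values approach 1 and 5.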