# Reinhard Furrer

#### 14 packages on CRAN

Several statistical test functions as well as a function for exploratory data analysis to investigate classifiers allocating individuals to one of three disjoint and ordered classes. In a single classifier assessment the discriminatory power is compared to classification by chance. In a comparison of two classifiers the null hypothesis corresponds to equal discriminatory power of the two classifiers.

Bayesian network analysis is a form of probabilistic graphical modelling which derives from empirical data a directed acyclic graph (DAG) describing the dependency structure between random variables. An additive Bayesian network model consists of a DAG in which each node comprises a generalized linear model (GLM). Additive Bayesian network models are equivalent to Bayesian multivariate regression using graphical modelling; they generalise the usual multivariable regression, GLM, to multiple dependent variables. 'abn' provides routines to help determine optimal Bayesian network models for a given data set, where these models are used to identify statistical dependencies in messy, complex data. The additive formulation of these models is equivalent to multivariate generalised linear modelling (including mixed models with iid random effects). The usual term to describe this model selection process is structure discovery. The core functionality is concerned with model selection: determining the most robust empirical model of data from interdependent variables. Laplace approximations are used to estimate goodness-of-fit metrics and model parameters, and wrappers to the INLA package, which can be obtained from <http://www.r-inla.org>, are also included. A comprehensive set of documented case studies, numerical accuracy/quality assurance exercises, and additional documentation are available from the 'abn' website <http://r-bayesian-networks.org>.

Provides .C64(), which is an enhanced version of .C() and .Fortran() from the foreign function interface. .C64() supports long vectors, arguments of type 64-bit integer, and provides a mechanism to avoid unnecessary copies of read-only and write-only arguments. This makes it a convenient and fast interface to C/C++ and Fortran code.

An implementation of Bayesian hierarchical models for faecal egg count data to assess anthelmintic efficacy. Bayesian inference is done via MCMC sampling using 'Stan' <https://mc-stan.org/>.

For curve, surface and function fitting with an emphasis on splines, spatial data, geostatistics, and spatial statistics. The major methods include cubic and thin plate splines, Kriging, and compactly supported covariance functions for large data sets. The splines and Kriging methods are supported by functions that can determine the smoothing parameter (nugget and sill variance) and other covariance function parameters by cross validation and also by restricted maximum likelihood. For Kriging there is an easy-to-use function that also estimates the correlation scale (range parameter). A major feature is that any covariance function implemented in R and following a simple format can be used for spatial prediction. There are also many useful functions for plotting and working with spatial data as images. This package also contains an implementation of sparse matrix methods for large spatial data sets and currently requires the sparse matrix (spam) package. Use help(fields) to get started and for an overview. The fields source code is deliberately commented and provides useful explanations of numerical details as a companion to the manual pages. The commented source code can be viewed by expanding the source code version and looking in the R subdirectory. The reference for fields can be generated by the citation function in R and has DOI <doi:10.5065/D6W957CT>. Development of this package was supported in part by the National Science Foundation Grant 1417857 and the National Center for Atmospheric Research. See the Fields URL for a vignette on using this package and some background on spatial statistics.

Flexible implementation of a structural MCMC sampler for Directed Acyclic Graphs (DAGs). It supports the new edge reversal move from Grzegorczyk and Husmeier (2008) <doi:10.1007/s10994-008-5057-7> and the Markov blanket resampling from Su and Borsuk (2016) <http://jmlr.org/papers/v17/su16a.html>. It supports three priors: a prior controlling for structure complexity from Koivisto and Sood (2004) <http://dl.acm.org/citation.cfm?id=1005332.1005352>, an uninformative prior and a user-defined prior. The three main problems that can be addressed by this R package are selecting the most probable structure based on a cache of pre-computed scores, controlling for overfitting, and sampling the landscape of high-scoring structures. It allows us to quantify the marginal impact of relationships of interest by marginalizing over structures or nuisance dependencies. Structural MCMC seems an elegant and natural way to estimate the true marginal impact, so one can determine whether its magnitude is large enough to be considered a worthwhile intervention.

A method for the multiresolution analysis of spatial fields and images to capture scale-dependent features. mrbsizeR is based on scale space smoothing and uses differences of smooths at neighbouring scales for finding features on different scales. To infer which of the captured features are credible, Bayesian analysis is used. The scale space multiresolution analysis has three steps: (1) Bayesian signal reconstruction. (2) Using differences of smooths, scale-dependent features of the reconstructed signal can be found. (3) Posterior credibility analysis of the differences of smooths created. The method was first proposed by Holmstrom, Pasanen, Furrer, Sain (2011) <DOI:10.1016/j.csda.2011.04.011>. Matlab code is available under <http://cc.oulu.fi/~lpasanen/MRBSiZer/>.

Generate and analyze Optimal Channel Networks (OCNs): oriented spanning trees reproducing all scaling features characteristic of real, natural river networks. As such, they can be used in a variety of numerical experiments in the fields of hydrology, ecology and epidemiology. See Carraro et al. (2020) <doi:10.1101/2020.02.17.948851> for a presentation of the package; Rinaldo et al. (2014) <doi:10.1073/pnas.1322700111> for a theoretical overview on the OCN concept; Furrer and Sain (2010) <doi:10.18637/jss.v036.i10> for the construct used.

Provides a simulation framework to simulate streamflow time series whose main characteristics are similar to those of observed data. These characteristics include the distribution of daily streamflow values and their temporal correlation as expressed by short- and long-range dependence. The approach is based on the randomization of the phases of the Fourier transform or the phases of the wavelet transform. The function prsim() is applicable to single site simulation and uses the Fourier transform. The function prsim.wave() extends the approach to multiple sites and is based on the complex wavelet transform. We further use the flexible four-parameter Kappa distribution, which allows for the extrapolation to yet unobserved low and high flows. Alternatively, the empirical or any other distribution can be used. A detailed description of the simulation approach for single sites and an application example can be found in <https://www.hydrol-earth-syst-sci.net/23/3175/2019/>. A detailed description and evaluation of the wavelet-based multi-site approach can be found in <https://www.hydrol-earth-syst-sci-discuss.net/hess-2019-658/>.
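
The idea of Fourier phase randomization can be illustrated with a minimal Python sketch. It is not the code behind prsim(): the function names `dft`, `inverse_dft` and `phase_randomize` and the toy series are assumptions made for illustration, and distribution fitting and everything else the package does are left out. The sketch only shows that replacing the phases with random ones, while keeping the amplitudes, yields a new real-valued series with the same amplitude spectrum and hence the same autocorrelation structure.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for a short demo."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def phase_randomize(x, rng):
    """Return a surrogate series with the same amplitude spectrum as x."""
    n = len(x)
    coeffs = dft(x)
    surrogate = [0j] * n
    surrogate[0] = coeffs[0]  # zero-frequency term: keeps the mean unchanged
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        surrogate[k] = abs(coeffs[k]) * cmath.exp(1j * phase)
        # Mirror the conjugate so that the inverse transform is real-valued.
        surrogate[n - k] = surrogate[k].conjugate()
    if n % 2 == 0:
        surrogate[n // 2] = coeffs[n // 2]  # Nyquist term is real; keep it
    return [value.real for value in inverse_dft(surrogate)]


# Hypothetical toy series standing in for observed data.
series = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) + 0.1 * t for t in range(32)]
surrogate = phase_randomize(series, random.Random(1))
```

Comparing `[abs(c) for c in dft(series)]` with the same quantity for `surrogate` shows matching amplitudes, although the two series differ point by point.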

Various utilities are provided that might be used in spatial statistics and elsewhere. It delivers a method for solving linear equations that checks the sparsity of the matrix before any algorithm is used. Furthermore, it includes the Struve functions.

Multivariate modelling of geostatistical (point), lattice (areal) and point pattern data in a unifying spatial fusion framework. Details are given in Wang and Furrer (2019) <arXiv:1906.00364>. Model inference is done using either 'Stan' <https://mc-stan.org/> or 'INLA' <http://www.r-inla.org>.

Applies Monte Carlo permutation to generate a pointwise variogram envelope and checks for spatial dependence at different scales using a permutation test. The empirical Brown's method and Fisher's method are used to compute an overall p-value for the hypothesis test.
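
Fisher's method for combining p-values can be written in a few lines. The following is a minimal Python sketch, not the package's implementation; the function names `chi2_sf_even_df` and `fisher_combined_p` are assumptions made for illustration. For k p-values, the statistic -2 Σ log(p_i) is compared with a chi-square distribution with 2k degrees of freedom, whose survival function has a closed form because the degrees of freedom are even. The method assumes independent p-values; the empirical Brown's method addresses dependent ones and is not sketched here.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Closed form: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Combine independent p-values into one overall p-value (Fisher's method)."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(statistic, 2 * len(p_values))
```

With a single p-value the function returns that p-value, which is a quick sanity check of the formula.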

A computational toolbox of heuristic approaches for performing variable ranking and feature selection based on mutual information, well adapted for multivariate system epidemiology datasets. The core function is a general implementation of the minimum redundancy maximum relevance model; see R. Battiti (1994) <doi:10.1109/72.298224>. Continuous variables are discretized using a large choice of rules. Variable ranking can be learned with a sequential forward/backward search algorithm. The two main problems that can be addressed by this package are the selection of the most representative variable within a group of variables of interest (i.e. dimension reduction) and variable ranking with respect to a set of features of interest.
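
A forward search driven by mutual information can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the package's API: the names `mutual_information` and `mrmr_select`, the toy data, and the specific score (relevance to the target minus the mean mutual information with already selected variables, one common variant of the criterion) are all chosen here for illustration. Variables are assumed to be already discretized.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def mrmr_select(features, target, n_select):
    """Greedy forward selection: maximise relevance minus mean redundancy."""
    relevance = {
        name: mutual_information(column, target)
        for name, column in features.items()
    }
    selected = []
    remaining = list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < n_select:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "b" duplicates "a"; "c" carries independent information.
a = [0, 0, 0, 0, 1, 1, 1, 1]
c = [0, 1, 0, 1, 0, 1, 0, 1]
target = [2 * x + y for x, y in zip(a, c)]
features = {"a": a, "b": list(a), "c": c}
chosen = mrmr_select(features, target, 2)
```

In this toy setting the duplicated column is passed over in favour of `c`, because its redundancy with the first selected variable cancels its relevance.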

Implements a maximum likelihood estimation (MLE) method for estimation and prediction in spatially varying coefficient (SVC) models (Dambon et al. (2020) <arXiv:2001.08089>). Covariance tapering (Furrer et al. (2006) <doi:10.1198/106186006X132178>) can be applied such that the method scales to large data.
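
Covariance tapering multiplies a covariance function elementwise by a compactly supported correlation function, so that entries for distant pairs become exactly zero and sparse matrix methods apply. The following minimal Python sketch illustrates the idea and is not the package's code: the exponential covariance, the Wendland-type taper (1 - x)^4 (1 + 4x) for x < 1, the one-dimensional locations, and all function names are assumptions chosen for illustration.

```python
import math


def exponential_cov(h, range_par=1.0, sill=1.0):
    """Exponential covariance as a function of distance h."""
    return sill * math.exp(-h / range_par)


def wendland1_taper(h, taper_range):
    """Compactly supported taper: exactly zero for h >= taper_range."""
    x = h / taper_range
    return (1.0 - x) ** 4 * (1.0 + 4.0 * x) if x < 1.0 else 0.0


def tapered_cov_matrix(locations, taper_range):
    """Elementwise product of the covariance and the taper (dense, for clarity)."""
    n = len(locations)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            h = abs(locations[i] - locations[j])
            matrix[i][j] = exponential_cov(h) * wendland1_taper(h, taper_range)
    return matrix


# Hypothetical 1-D locations on a regular grid with spacing 0.5.
locations = [0.5 * i for i in range(20)]
tapered = tapered_cov_matrix(locations, taper_range=2.0)
nonzero = sum(1 for row in tapered for value in row if value != 0.0)
print(f"{nonzero} of {len(locations) ** 2} entries are nonzero")
```

A real implementation would store only the nonzero entries in a sparse format; the dense list of lists here is only meant to make the zero pattern easy to inspect.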