sparr (version 0.2-2)

sparr-package: The sparr Package

Description

Provides functions to compute fixed and adaptive kernel-smoothed relative risk estimates via the density-ratio method and to perform subsequent inference.

Dependencies

The sparr package depends on several other contributions to CRAN in order to operate; their uses here are indicated:
spatstat - Fast Fourier transform assistance with fixed and adaptive density estimation, as well as region handling; see Baddeley and Turner (2005).
sm - Provision of LSCV bandwidth calculation; see Bowman and Azzalini (1997, 2010).
rgl - Interactive 3D plotting of densities and surfaces; see Adler and Murdoch (2009).
MASS - Utility support for internal functions; see Venables and Ripley (2002).

Citation

To cite use of sparr, the user may refer to the following work: Davies, T.M., Hazelton, M.L. and Marshall, J.C. (2011), sparr: Analyzing spatial relative risk using fixed and adaptive kernel density estimation in R, Journal of Statistical Software 39(1), 1-14.

Details

Package: sparr
Version: 0.2-2
Date: 2011-02-28
License: GPL (>= 2)

Kernel smoothing, and the flexibility afforded by this methodology, provides an attractive approach to estimating complex probability density functions. This is of particular interest in geographical epidemiology, the study of disease dispersion throughout some spatial region, given a population. The so-called 'relative risk surface', constructed as a ratio of estimated case to control densities (Bithell, 1990; 1991), describes the variation in the 'risk' of the disease, given the underlying at-risk population. The technique has been applied successfully, mainly for exploratory purposes, in a number of studies (see for example Sabel et al., 2000; Prince et al., 2001; Wheeler, 2007).
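As a rough illustration of the density-ratio idea, the following base R sketch estimates case and control densities with a fixed-bandwidth isotropic Gaussian kernel on a grid and takes the difference of their logarithms. It is not the sparr implementation: the helper kde_grid, the synthetic data and the bandwidth h = 0.1 are assumptions made for this example, and no edge correction is applied.

```r
# Hypothetical helper (not part of sparr): fixed-bandwidth isotropic Gaussian
# KDE evaluated on the grid gx x gy, without edge correction.
kde_grid <- function(xy, h, gx, gy) {
  # An isotropic Gaussian kernel factorises into a product of two
  # univariate normal densities, one per coordinate.
  kx <- outer(gx, xy[, 1], function(u, x) dnorm(u, mean = x, sd = h))
  ky <- outer(gy, xy[, 2], function(v, y) dnorm(v, mean = y, sd = h))
  # Entry (i, j) sums the kernel contributions of all points at (gx[i], gy[j]).
  (kx %*% t(ky)) / nrow(xy)
}

set.seed(1)
# Synthetic data (assumed): cases cluster near (0.6, 0.5), controls are uniform.
cases    <- cbind(rnorm(100, 0.6, 0.15), rnorm(100, 0.5, 0.15))
controls <- cbind(runif(300), runif(300))

gx <- gy <- seq(0, 1, length.out = 50)
h  <- 0.1  # common fixed bandwidth, chosen arbitrarily for illustration

f_hat  <- kde_grid(cases, h, gx, gy)     # estimated case density
g_hat  <- kde_grid(controls, h, gx, gy)  # estimated control density
log_rr <- log(f_hat) - log(g_hat)        # log relative risk surface
```

Calling image(gx, gy, log_rr) draws a simple heat map of the result; positive values mark locations where cases are over-represented relative to controls.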

This package provides functions for bivariate kernel density estimation (KDE), implementing both fixed and 'variable' or 'adaptive' (Abramson, 1982) smoothing parameter options (see the function documentation for more information). Two isotropic bandwidth calculators for bivariate KDE are provided: one based on the maximal smoothing principle (Terrell, 1990), the other using a least-squares cross-validation approach adapted from an existing R package function (see below). Also provided are functions to construct asymptotically derived p-value surfaces, 'tolerance' contours of which signal statistically significant sub-regions of 'extremity' in a risk surface (Hazelton and Davies, 2009; Davies and Hazelton, 2010), and some flexible visualisation tools.
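To make the fixed versus adaptive distinction concrete, here is a minimal base R sketch of Abramson-style variable bandwidths: each observation receives a bandwidth proportional to the inverse square root of a pilot density at that point, rescaled by the geometric mean so that the global bandwidth keeps its scale. The function names, the Gaussian pilot and the synthetic data are assumptions for this example and are not taken from the sparr source.

```r
# Hypothetical helper: fixed-bandwidth Gaussian pilot density evaluated at the
# observations themselves (no edge correction).
pilot_at_data <- function(xy, h_pilot) {
  d2 <- as.matrix(dist(xy))^2  # squared pairwise distances
  rowMeans(exp(-d2 / (2 * h_pilot^2))) / (2 * pi * h_pilot^2)
}

# Hypothetical helper: one bandwidth per observation, following the
# inverse-square-root rule of Abramson (1982).
abramson_bandwidths <- function(xy, h_pilot, h_global) {
  w     <- pilot_at_data(xy, h_pilot)^(-1/2)
  gamma <- exp(mean(log(w)))  # geometric mean of the raw factors
  h_global * w / gamma
}

set.seed(2)
# Synthetic data (assumed): a tight cluster of 150 points plus 50 uniform points.
xy <- rbind(cbind(rnorm(150, 0.3, 0.05), rnorm(150, 0.3, 0.05)),
            cbind(runif(50), runif(50)))

h_i <- abramson_bandwidths(xy, h_pilot = 0.1, h_global = 0.1)
```

Observations in the dense cluster receive smaller bandwidths than the sparse background points, which is the intended behaviour: less smoothing where data are plentiful, more where they are scarce.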

The content of sparr can be broken up as follows:

Datasets
PBC - a case/control dataset concerning liver disease in northern England.
Also available is the case/control dataset chorley of the spatstat package, which concerns the distribution of laryngeal cancer in an area of Lancashire, England.

Bandwidth calculators
OS - estimation of an isotropic smoothing parameter for bivariate KDE, based on the oversmoothing principle introduced by Terrell (1990).
CV.sm - a least-squares cross-validated (LSCV) estimate of an isotropic bandwidth for bivariate KDE, based on the h.select function in the package sm.

Bivariate functions
KBivN - bivariate normal (Gaussian) kernel
KBivQ - bivariate quartic (biweight) kernel
bivariate.density - kernel density estimate of bivariate data; fixed or adaptive smoothing

Relative risk and p-value surfaces
risk - estimation of a (log) relative risk function
tolerance - calculation of asymptotic p-value surface

Printing and summarising objects
S3 methods (print.bivden, print.rrs, summary.bivden and summary.rrs) are available for the bivariate density and risk function objects.

Visualisation
Most applications of the relative risk function in practice require plotting the relative risk within the study region (especially for an inspection of tolerance contours). To this end, sparr provides a number of different ways to achieve attractive and flexible visualisation. The user may produce a heat plot, a perspective plot, a contour plot, or an interactive 3D perspective plot (that the user can pan around and zoom, courtesy of the powerful rgl package; see below) for either an estimated relative risk function or a bivariate density estimate. These capabilities are available through S3 support of the plot function; see plot.bivden for visualising a single bivariate density estimate from bivariate.density, and plot.rrs for visualisation of an estimated relative risk function from risk.
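For reference, the two kernel shapes named under 'Bivariate functions' can be written out in their standard textbook forms and checked numerically. The sketch below uses hypothetical function names and unit bandwidth; it does not reproduce the signatures or scaling conventions of KBivN and KBivQ, which should be taken from their own help pages.

```r
# Illustrative definitions (hypothetical names, not the sparr functions).
# Both take the distance r = ||u|| from the kernel centre.
kern_gauss   <- function(r) exp(-r^2 / 2) / (2 * pi)
kern_quartic <- function(r) ifelse(r <= 1, (3 / pi) * (1 - r^2)^2, 0)

# A bivariate kernel must integrate to one over the plane. In polar
# coordinates this is the integral of K(r) * 2 * pi * r over r >= 0.
mass_gauss   <- integrate(function(r) kern_gauss(r) * 2 * pi * r, 0, Inf)$value
mass_quartic <- integrate(function(r) kern_quartic(r) * 2 * pi * r, 0, 1)$value
```

The Gaussian kernel has unbounded support, whereas the quartic kernel vanishes outside the unit disc, so with the quartic kernel only observations within one bandwidth of a location contribute to the estimate there.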

References

Abramson, I. (1982), On bandwidth variation in kernel estimates - a square root law, Annals of Statistics, 10(4), 1217-1223.
Adler, D. and Murdoch, D. (2009), rgl: 3D visualization device system (OpenGL). R package version 0.87; URL: http://CRAN.R-project.org/package=rgl
Baddeley, A. and Turner, R. (2005), Spatstat: an R package for analyzing spatial point patterns, Journal of Statistical Software, 12(6), 1-42.
Bithell, J.F. (1990), An application of density estimation to geographical epidemiology, Statistics in Medicine, 9, 691-701.
Bithell, J.F. (1991), Estimation of relative risk functions, Statistics in Medicine, 10, 1745-1751.
Bowman, A.W. and Azzalini, A. (1997), Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford University Press Inc., New York. ISBN 0-19-852396-3.
Bowman, A.W. and Azzalini, A. (2010), R package 'sm': nonparametric smoothing methods (version 2.2-4), URL: http://www.stats.gla.ac.uk/~adrian/sm; http://azzalini.stat.unipd.it/Book_sm
Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23), 2423-2437.
Hazelton, M.L. and Davies, T.M. (2009), Inference based on kernel estimates of the relative risk function in geographical epidemiology, Biometrical Journal, 51(1), 98-109.
Prince, M.I., Chetwynd, A., Diggle, P.J., Jarner, M., Metcalf, J.V. and James, O.F.W. (2001), The geographical distribution of primary biliary cirrhosis in a well-defined cohort, Hepatology, 34, 1083-1088.
Sabel, C.E., Gatrell, A.C., Loytonen, M., Maasilta, P. and Jokelainen, M. (2000), Modelling exposure opportunities: estimating relative risk for motor neurone disease in Finland, Social Science & Medicine, 50, 1121-1137.
Terrell, G.R. (1990), The maximal smoothing principle in density estimation, Journal of the American Statistical Association, 85, 470-477.
Venables, W.N. and Ripley, B.D. (2002), Modern Applied Statistics with S, Fourth Edition, Springer, New York.
Wheeler, D.C. (2007), A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, 1996-2003, International Journal of Health Geographics, 6(13).