mvbutils.utils: Miscellaneous utilities

Description

Miscellaneous utilities.

Usage

as.cat( x)
clip( x, n=1)
cq( ...)
deparse.names.parsably( x)
empty.data.frame( ...)
env.name.string( env)
expanded.call( nlocal=sys.parent())
everyth( x, by=1, from=1)
find.funs(pos=1, ..., exclude.mcache = TRUE, mode="function")
find.lurking.envs(obj, delve=FALSE, trace=FALSE)
index( lvector)
integ(expr, lo, hi, what = "x", ..., args.to.integrate = list())
is.dir( dir)
isF( x)
isT( x)
legal.filename( name)
lsall( ...)
masked( pos)
masking( pos=1)
mkdir( dirlist)
most.recent( lvec)
my.all.equal( x, y, ...)
named( x)
nscat( fmt, ..., sep='\n', file='')
nscatn( fmt, ..., sep='\n', file='')
option.or.default( opt.name, default=NULL)
pos( substrs, mainstrs, any.case = FALSE, names.for.output)
put.in.session( ...)
returnList( ...)
safe.rbind( df1, df2) # Deprecated in 2013
scatn( fmt, ..., sep='\n', file='')
to.regexpr( x)
yes.no( prompt, default)

Value

as.cat

character vector of class cat

clip

vector of the same mode as x

cq

character vector

empty.data.frame

data.frame

env.name.string

a string

expanded.call

a call object

everyth

same type as x

find.funs

a character vector of function names

find.lurking.envs

a data.frame with columns "what" and "size"

integ

scalar

is.dir

logical vector

is.nonzero

TRUE or FALSE

isF, isT

TRUE or FALSE

legal.filename

character( 1)

masked

character vector

masking

character vector

mkdir

logical vector of success/failure

nscat

NULL

nscatn

NULL

most.recent

integer vector the same length as lvec, with values from 0 to length(lvec) inclusive.

named

vector of the same mode as x

option.or.default

option's value

pos

numeric matrix with as many columns as the maximum number of matches found; at least one column guaranteed

returnList

list or single object

safe.rbind

data.frame

scatn

NULL

to.regexpr

character

yes.no

TRUE or FALSE

Arguments by function

as.cat: x: character vector that you want to be displayed via cat( x, sep="\n")
clip: x: a vector or list
clip: n: integer saying how many elements to clip from the end of x
cq: ...: quoted or unquoted character strings, to be substituted and then concatenated
deparse.names.parsably: x: any object for deparse- name objects treated specially
empty.data.frame: ...: named length-1 vectors of appropriate mode, e.g. "first.col=''"
env.name.string: env: environment
expanded.call: nlocal: frame to retrieve arguments from. Normally, use the default; see mlocal.
everyth: x: subsettable thing. by: step between values to extract. from: first position.
find.funs: ...: extra arguments for objects. Usually just "pattern" for regexp searches.
find.funs: exclude.mcache: if TRUE (default), don't look at mlazy objects
find.funs: mode: "function" to look for functions, "environment" to look for environments, etc
find.lurking.envs: delve: whether to recurse into function arguments and function bodies
find.lurking.envs: trace: just a debugging aid-- leave as FALSE
index: lvector: vector of TRUE/FALSE/NA
integ: expr: an expression; what: a string, the argument of expr to be integrated over; lo, hi: limits; ...: other variables to be set in the expression; args.to.integrate: a list of other things to pass to integrate
is.dir: dir: character vector of files to check existence and directoriness of.
isF, isT: x: anything, but meant to be a logical scalar
legal.filename: name: character string to be modified
find.funs: pos: list of environments, or vector of char or numeric positions in search path.
lsall: ...: as for ls, except that all.names will be coerced to TRUE
masking, masked: pos: position in search path
mkdir: dirlist: character vector of directories to create
most.recent: lvec: logical vector
my.all.equal: x, y: anything; ...: passed to all.equal
named: x: character vector which will become its own names attribute
nscat, nscatn: see scatn
option.or.default: opt.name: character(1)
option.or.default: default: value to be returned if there is no option called "opt.name"
pos: substrs: character vector of patterns (literal not regexpr)
pos: mainstrs: character vector to search for substrs in.
pos: any.case: logical- ignore case?
pos: names.for.output: character vector to label rows of output matrix; optional
put.in.session: ...: a named set of objects, to be assigned into the mvb.session.info search environment
returnList: ...: named or un-named arguments, just as for return before R 1.8.
safe.rbind: df1, df2: data.frame or list
scatn, nscat: fmt, ...: as per sprintf; file, sep: as per cat
to.regexpr: x: character vector
yes.no: prompt: string to put before asking for input
yes.no: default: value to return if user just presses <ENTER>

Details

as.cat makes a character vector print as if it was catted rather than printed (one element per line, no extra quotes or backslashes, no [1] etc prefixes).

clip removes the last n elements of x.
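
A rough base-R equivalent, for illustration only; clip_sketch is a hypothetical name and this is not the mvbutils source:

```r
# Hypothetical stand-in for clip(); not the mvbutils implementation.
clip_sketch <- function(x, n = 1) {
  # Keep the first length(x) - n elements (none if n exceeds the length)
  x[seq_len(max(0, length(x) - n))]
}

clip_sketch(1:5, 2)        # 1 2 3
clip_sketch(letters[1:3])  # "a" "b"
```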

cq is handy for typing cq( alpha, beta, gamma) instead of c( "alpha", "beta", "gamma"). Certain strings DO still require quotes around them, e.g. cq( "NULL", "1-2").

deparse.names.parsably is like deparse except that name objects get wrapped in a call to as.name, so that they won't be evaluated accidentally.

empty.data.frame creates a template data frame with 0 rows but with all columns of the appropriate type. Useful for rbinding to later.
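
The same idea can be imitated in base R by building a one-row data frame and keeping zero rows of it. This is a sketch of the concept, not the mvbutils code; the column names a and b are just examples:

```r
# A one-row data frame fixes the column types; [0, ] then drops the row
template <- data.frame(a = 1, b = "yes", stringsAsFactors = FALSE)[0, ]

n_template <- nrow(template)             # 0
col_classes <- sapply(template, class)   # a: "numeric", b: "character"

# Rows can be rbind-ed on later without the column types drifting
grown <- rbind(template, data.frame(a = 2, b = "no", stringsAsFactors = FALSE))
```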

env.name.string returns a string naming an environment; its name attribute if there is one, or the name of its path attribute if applicable, concatenated with the first line of what would be shown if you printed the argument. Unlike environmentName, this will always return a non-empty string.

expanded.call returns the full argument list available to its caller, including defaults where arguments were not set explicitly. The arguments may not be those originally passed, if they were modified before the invocation of expanded.call. Default arguments which depend on calculations after the invocation of expanded.call will lead to an error.

everyth extracts every by-th element of x, starting at position from.
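
A minimal base-R sketch of the same extraction, assuming by is a positive whole number; everyth_sketch is a hypothetical name:

```r
# Hypothetical stand-in for everyth(); not the mvbutils implementation.
everyth_sketch <- function(x, by = 1, from = 1) {
  if (from > length(x)) return(x[0])   # nothing to extract
  x[seq(from, length(x), by = by)]
}

everyth_sketch(1:10, 3, 5)   # 5 8
```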

find.funs finds "function" objects (or objects of other modes, via the "mode" arg) in one or more environments, optionally matching a pattern.

find.lurking.envs( myobj) will search through myobj and all its attributes, returning the size of each sub-object. The size of environments is returned as Inf. The search is completely recursive, except for environments and by default the inner workings of functions; attributes of the entire function are always recursed. Changing the delve parameter to TRUE ensures full recursion of function arguments and function bodies, which will show e.g. the srcref structure; try it to see why the default is FALSE. find.lurking.envs can be very useful for working out e.g. why the result of a model-fitting function is taking up 1000000MB of disk space; sometimes this is due to unnecessary environments in well-concealed places.

index returns the position(s) of TRUE elements. Unlike which: attributes are lost; NA elements map to NAs; index(<<length 0 object>>) is numeric(0); index( <<non-logical>>) is NA.
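
The difference in NA handling can be imitated in base R with logical subsetting. This only mimics the NA behaviour described above and is not the mvbutils code:

```r
lvector <- c(TRUE, NA, FALSE, TRUE)

from_which <- which(lvector)                  # 1 4    (the NA is dropped)
index_like <- seq_along(lvector)[lvector]     # 1 NA 4 (the NA is kept)
```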

integ is a handy wrapper for integrate, that takes an expression rather than a function--- so integ( sin(x), 0, 1) "just works".
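
One way such a wrapper could be written in base R is sketched below. integ_sketch is a hypothetical name; it omits args.to.integrate and is not the mvbutils implementation:

```r
# Hypothetical sketch of an expression-based wrapper around integrate();
# not the mvbutils implementation.
integ_sketch <- function(expr, lo, hi, what = "x", ...) {
  expr <- substitute(expr)   # capture the unevaluated expression
  extras <- list(...)        # other variables used by the expression
  caller <- parent.frame()   # where to look up anything else
  integrand <- function(v) {
    # Bind the integration variable under the name given by 'what'
    vars <- c(setNames(list(v), what), extras)
    eval(expr, vars, caller)
  }
  integrate(integrand, lo, hi)$value
}

integ_sketch(sin(x), 0, 1)                          # about 0.4597
integ_sketch(sin(y + a), 0, 1, what = "y", a = 5)   # about -0.6765
```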

is.dir tests for directoriness.

isF and isT test a logical scalar in the obvious way, with NA (and non-logicals) failing the test, to avoid teeeedious repetition of if( !is.na( my.complicated.expression) & my.complicated.expression) .... They are deliberately not vectorized (contrary to some versions of mvbutils documentation); arguments with non-1 length trigger a warning.
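
For comparison, base R's isTRUE() and isFALSE() (the latter needs R 3.5.0 or later) give the same answers for scalars, although they return FALSE silently for longer vectors rather than warning:

```r
x <- NA

# The long-hand test that isT is meant to replace
long_hand <- !is.na(x) && x     # FALSE

# Base-R near-equivalents for scalars
short_hand <- isTRUE(x)         # FALSE
also_false <- isFALSE(x)        # FALSE: NA is neither TRUE nor FALSE

isTRUE(c(TRUE, TRUE))           # FALSE, and no warning
```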

legal.filename coerces its character argument into a similar-looking string that is a legal filename on any (?) system.

lsall is like ls but coerces all.names=TRUE.

masked checks which objects in search()[pos] are masked by identically-named objects higher in the search path. masking checks which objects in search()[pos] mask identically-named objects lower in the search path. Namespaces may make the results irrelevant.

mkdir makes directories; unlike dir.create, it can do several levels at once.
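
In current versions of R, dir.create() also accepts recursive = TRUE, so multi-level creation can be sketched in base R as follows. The paths are throwaway temporary ones, and dir.exists() stands in for is.dir:

```r
# Build a nested path under the session's temporary directory
root <- tempfile("mkdir_sketch_")
target <- file.path(root, "level1", "level2")

created <- dir.create(target, recursive = TRUE)  # creates all missing levels
is_directory <- dir.exists(target)               # base-R analogue of is.dir

unlink(root, recursive = TRUE)                   # tidy up
```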

most.recent returns the highest-so-far position of TRUE within a logical vector, or 0 if TRUE has not occurred yet; most.recent( c(F,T,F,T)) returns c(0,2,2,4).
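
The running "last TRUE position" can be reproduced in base R with cummax. This is a sketch that assumes lvec contains no NAs; most_recent_sketch is a hypothetical name:

```r
# Hypothetical stand-in for most.recent(); assumes no NAs in lvec.
most_recent_sketch <- function(lvec) {
  positions <- seq_along(lvec)
  positions[!lvec] <- 0L   # only TRUE elements keep their position
  cummax(positions)        # carry the latest TRUE position forward
}

most_recent_sketch(c(FALSE, TRUE, FALSE, TRUE))   # 0 2 2 4
```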

my.all.equal is like all.equal, except that it returns FALSE in cases where all.equal returns a non-logical-mode result.
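
The problem being worked around, and the usual base-R idiom for it, look like this:

```r
raw_result <- all.equal(1, 1.1)             # a character string describing the difference
safe_result <- isTRUE(all.equal(1, 1.1))    # FALSE, safe to use inside if()
close_enough <- isTRUE(all.equal(1, 1 + 1e-10))   # TRUE: within default tolerance
```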

named(x) is just names(x) <- as.character( x); x; useful for lapply etc.
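
The effect on lapply can be seen with base R alone, spelling out the assignment that named performs:

```r
x <- c("alpha", "beta")

# Without names, lapply() returns an unnamed list
plain <- lapply(x, nchar)
names(plain)      # NULL

# With the vector named after itself, the results are labelled
names(x) <- as.character(x)
labelled <- lapply(x, nchar)
names(labelled)   # "alpha" "beta"
```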

nscat, nscatn: see scatn

option.or.default is obsolete; use the equivalent getOption() instead.
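
The getOption() replacement takes the fallback as its second argument; the option name below is made up for the example:

```r
# "mvb.hypothetical.option" is an invented name that is not set anywhere
fallback <- getOption("mvb.hypothetical.option", default = 42)   # 42

# An option that is set wins over the default
old <- options(mvb.hypothetical.option = "set")
from_option <- getOption("mvb.hypothetical.option", default = 42)   # "set"
options(old)   # restore the previous state
```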

pos is probably to be eschewed in new code, in favour of gregexpr with fixed=TRUE, which is likely faster. (And I should rewrite it to use gregexpr.) It's one of a few legacy functions in mvbutils that pre-date improvements in base R. pos will either search for several literal patterns in a single target, or vice versa-- but not both. It returns a matrix showing the positions of the matching substrings, with as many columns as the maximum number of matches. 0 signifies "no match"; there is always at least one column even if there are no matches at all.
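
A sketch of the gregexpr route, padded into the same zero-filled matrix layout. This is one possible translation rather than a drop-in replacement for pos, and note that gregexpr reports non-overlapping matches only:

```r
mainstrs <- c("first quick", "second quick quick", "third")

# One list element per target string; -1 marks "no match"
hits <- gregexpr("quick", mainstrs, fixed = TRUE)
starts <- lapply(hits, function(h) {
  s <- as.integer(h)   # drop the match.length attributes
  s[s > 0]             # keep real match positions only
})

# Pad with zeros to a matrix, mimicking the layout described for pos
width <- max(1L, lengths(starts))
padded <- lapply(starts, function(s) c(s, rep(0L, width - length(s))))
pos_like <- matrix(unlist(padded), ncol = width, byrow = TRUE)
pos_like
#      [,1] [,2]
# [1,]    7    0
# [2,]    8   14
# [3,]    0    0
```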

returnList returns a list corresponding to old-style (pre-R 1.8) return syntax. Briefly: a single argument is returned as itself. Multiple arguments are returned in a list. The names of that list are the argument names if provided; or, for any unnamed argument that is just a symbolic name, that symbolic name; or no name at all, for other unnamed arguments. You can duplicate pre-1.8 behaviour of return(...) via return(returnList(...)).

safe.rbind ( Deprecated in 2013 ) mimics rbind, but works round an R bug (I reckon) where a column appears to be a numeric in one data.frame but a factor in the other. But I now think you should just sort your column classes/types properly in advance, rather than mixing types and relying on somewhat arbitrary conversion rules.

scatn is just cat( sprintf( fmt, ...), "", file=file, sep=sep). scatn prints a newline afterwards, but not before; nscat does the opposite; nscatn does both. If you're just displaying a "title" before calling print, use nscat.
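
Written out as a function, the definition above looks like this; scatn_sketch is a hypothetical name used to avoid clashing with the real function:

```r
# Direct transcription of the one-line definition given above
scatn_sketch <- function(fmt, ..., sep = "\n", file = "") {
  cat(sprintf(fmt, ...), "", file = file, sep = sep)
}

scatn_sketch("Things %i", 1:3)
# Things 1
# Things 2
# Things 3
```

The trailing "" is what produces the final newline: cat places sep between every pair of elements, including the last string and the empty one.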

to.regexpr converts literal strings to their equivalent regexps, e.g. by doubling backslashes. Useful if you want "fixed=TRUE" to apply only to a portion of your regexp.
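
A sketch of this kind of escaping in base R follows. escape_regex_sketch is a hypothetical helper covering the common metacharacters; it is not the mvbutils implementation and may differ from it in which characters are escaped:

```r
# Hypothetical helper; escapes common regexp metacharacters with a backslash.
escape_regex_sketch <- function(x) {
  gsub("([.\\\\|()\\[\\]{}^$*+?])", "\\\\\\1", x, perl = TRUE)
}

escape_regex_sketch("a{{")   # "a\\{\\{"

# Literal prefix combined with a genuine regexp suffix
pattern <- paste0("^", escape_regex_sketch("1+1"), "=[0-9]+$")
grepl(pattern, c("1+1=2", "11=2"))   # TRUE FALSE
```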

yes.no cats its "prompt" argument and waits for user input. If the user input pmatches "yes" or "YES", then yes.no returns TRUE; if the input pmatches "no" or "NO", then yes.no returns FALSE; if the input is '' and default is set, then yes.no returns default; otherwise it repeats the question. You probably want to put a space at the end of prompt.

Examples

# NOT RUN {
# as.cat
ugly.bugly <- c( 'A rose by any other name', 'would annoy taxonomists')
ugly.bugly
#[1] "A rose by any other name"                 "would annoy taxonomists"
as.cat( ugly.bugly) # calls print.cat--- no clutter
#A rose by any other name
#would annoy taxonomists
clip( 1:5, 2) # 1:3
cq( alpha, beta) # c( "alpha", "beta")
empty.data.frame( a=1, b="yes")
# data.frame with 0 rows of columns "a" (numeric) and "b" (character)
empty.data.frame( a=1, b=factor( c( "yes", "no")))$b
# factor with levels c( "no", "yes")
everyth( 1:10, 3, 5) # c( 5, 8)
f <- function( a=9, b) expanded.call(); f( 3, 4) # list( a=3, b=4)
find.funs( "package:base", patt="an") # "transform" etc.
find.lurking.envs( cd)
#                                     what  size
#1                     attr(obj, "source")  5368
#2                                     obj 49556
#3 environment(obj) <: namespace:mvbutils>   Inf
# }
# NOT RUN {
eapply( .GlobalEnv, find.lurking.envs)
# }
# NOT RUN {
integ( sin(x), 0, 1) # [1] 0.4597
integ( sin(x+a), a=5, 0, 1) # [1] -0.6765; 'a' is "passed" to 'expr'
integ( sin(y+a), what='y', 0, 1, a=0) # [1] 0.4597; arg is 'y' not 'x'
is.dir( getwd()) # TRUE
isF( FALSE) # TRUE
isF( NA) # FALSE
isF( c( FALSE, FALSE)) # FALSE, with a warning
sapply( c( FALSE, NA, TRUE), isF)
# [1]  TRUE FALSE FALSE
sapply( c( FALSE, NA, TRUE), isT)
# [1] FALSE FALSE  TRUE
legal.filename( "a:b\\c/d&f") # "a.b.c.d&f"
most.recent( c( FALSE,TRUE,FALSE,TRUE)) # c( 0, 2, 2, 4)
sapply( named( cq( alpha, beta)), nchar)  # c( alpha=5, beta=4)
pos( cq( quick, lazy), "the quick brown fox jumped over the lazy dog")
# matrix( c( 5, 37), nrow=2)
pos( "quick", c( "first quick", "second quick quick", "third"))
# matrix( c( 7,8,0, 0,14,0), nrow=3)
pos( "quick", "slow") # matrix( 0)
f <- function() { a <- 9; return( returnList( a, a*a, a2=a+a)) }
f() # list( a=9, 81, a2=18)
scatn( 'Things %i', 1:3)
nscat( 'Things %i', 1:3)
nscatn( 'Things %i', 1:3)
to.regexpr( "a{{") # "a\\{\\{"
# }
# NOT RUN {
mkdir( "subdirectory.of.getwd")
yes.no( "OK (Y/N)? ")
masking( 1)
masked( 5)
# }
