get1stOfRepeatedByCol: Get first of repeated by column
Description
get1stOfRepeatedByCol sorts matrix 'mat' and extracts only the 1st occurrence of each value in column 'sortBy'.
It then returns the non-redundant matrix (i.e. non-redundant with respect to column 'sortBy'); if 'markIfAmbig' specifies existing columns, ambiguous entries are marked there.
Note: problems may arise when 'sortSupl' or 'sortBy' are not present (or not intended for use).
Depending on argument 'asList', this function returns either a list with the non-redundant ('unique') and the removed lines ('repeats'), or only the non-redundant matrix.
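The core idea described above (sort, then keep only the first row per value of 'sortBy') can be sketched in base R. This is only an illustration under stated assumptions, not the function's actual implementation: the helper name firstBySortCol is hypothetical, and tie-breaking through the supplemental column is simplified to a plain alphabetical sort.

```r
## Hypothetical helper (not part of the package): keep the 1st row per value of 'sortBy'
firstBySortCol <- function(mat, sortBy="seq", sortSupl="ty") {
  ## order rows by 'sortBy', then by 'sortSupl' (plain alphabetical order, an assumption)
  ord <- order(mat[,sortBy], mat[,sortSupl])
  mat <- mat[ord, , drop=FALSE]
  ## duplicated() flags every occurrence after the first one
  isRep <- duplicated(mat[,sortBy])
  list(unique=mat[!isRep, , drop=FALSE], repeats=mat[isRep, , drop=FALSE])
}

toy <- cbind(no=as.character(1:5), seq=c("B","A","B","C","A"),
  ty=c("inter","full","full","Nter","inter"))
res <- firstBySortCol(toy)
res$unique[,"no"]       # "2" "3" "4"
res$repeats[,"no"]      # "5" "1"
```

The returned list mirrors the 'unique' / 'repeats' naming mentioned for 'asList'; marking of ambiguous entries is left out of this sketch.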
Arguments
mat
(matrix or data.frame) input data to be sorted and made non-redundant
sortBy
(character) name of the column (numeric or character) whose elements should be made unique
sortSupl
(character) additional column name used to always select a specific 1st entry among repeats, default="ty"
asFirstLast
(character, length=2) to force specific strings from column 'sortSupl' as first and last when selecting the 1st of repeated terms, default=c("full","inter")
markIfAmbig
(character, length=2) column names: the 1st column will be set to 'TRUE' if ambiguous/repeated, the 2nd will get a (heading) prefix, default=c("ambig","seqNa")
asList
(logical) if TRUE, return a list with the non-redundant ('unique') and the removed lines ('repeats')
abmiPref
(character) prefix to note ambiguous entries/terms, default="_"
silent
(logical) suppress messages
debug
(logical) additional messages for debugging
callFrom
(character) allow easier tracking of messages produced
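How 'asFirstLast' can steer which entry counts as 1st may be pictured with a factor whose levels place the first string at the front and the second at the end. The following is a minimal sketch under that assumption; the internal mechanism is not documented here, and rankFirstLast is a hypothetical helper.

```r
## Hypothetical helper: rank values so that asFirstLast[1] sorts first and asFirstLast[2] last
rankFirstLast <- function(x, asFirstLast=c("full","inter")) {
  other <- sort(setdiff(unique(x), asFirstLast))    # remaining terms, alphabetical
  lev <- c(asFirstLast[1], other, asFirstLast[2])
  as.integer(factor(x, levels=lev))
}

ty <- c("inter","Nter","full","inter","full")
rankFirstLast(ty)            # 3 2 1 3 1
order(rankFirstLast(ty))     # 3 5 2 1 4
```

Sorting by such a rank before removing repeats would make a "full" entry win over "Nter", and "inter" would only be kept when nothing else is available.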
See Also
firstOfRepeated for (more basic) treatment of a simple vector; nonAmbiguousNum for numeric use (much faster)
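Examples
## toy matrix with repeated values in column 'seq' and 'ty' as supplemental sort column
aa <- cbind(no=as.character(1:20), seq=sample(LETTERS[1:15],20,replace=TRUE),
  ty=sample(c("full","Nter","inter"),20,replace=TRUE), ambig=rep(NA,20), seqNa=1:20)
get1stOfRepeatedByCol(aa)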