
votesys (version 0.1.1)

check_dup_wrong: Check Ballots with Duplicated Values, Mistakes, or without Any Valid Entry

Description

The function checks the validity of ballots and reports the result. For a one-step clean, set clean to TRUE and a set of cleaned ballots will be returned. Here, duplicated values mean that a voter writes the same candidate more than once or, when assigning scores, gives the same score to more than one candidate. Mistakes are names that do not appear in the candidate list, or score values that are illegal (e.g., if voters are required to assign 1-5 to candidates, then 6 is an illegal value). Ballots without a valid entry (that is, all entries are NA) are also picked out. The function accepts several input formats; see Details.

Usage

check_dup_wrong(x, xtype = 2, candidate = NULL, vv = NULL, isna = NULL,
  clean = FALSE)

Arguments

x

a data.frame, matrix or list of raw ballots. See Details.

xtype

should be 1, 2 (default) or 3, designating the type of x. See Details.

candidate

if xtype is 1, this argument is ignored. If xtype is 2 or 3, candidate names must be given as a character or numeric vector. If a name is not given here but still appears on a ballot, that ballot is labelled as wrong.

vv

if xtype is 2 or 3, this argument is ignored. If xtype is 1, it gives the valid score values for x.

isna

entries which should be taken as NAs. NA in x is always taken as a missing value; however, you can add more (e.g., you may use 99, 999 as missing values). If x contains characters, this argument should be a character vector; if x is numeric, it should be a numeric vector. Do not add NA to isna, because the default (NULL) means NA is already included.

clean

the default is FALSE, that is, it does not return the cleaned data. If it is TRUE, a set of ballots without duplicated values, without mistakes and with at least one valid value, is returned.
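
The effect of isna can be pictured with a small base R sketch. The score matrix, the codes 99 and 999, and the variable names below are made up for illustration, and the recoding shown is only an assumption about what treating extra values as NA amounts to, not the package's own code.

```r
# Hypothetical score matrix (xtype = 1 layout): 3 voters, 3 candidates.
# 99 and 999 are used by this made-up data set to mean "missing".
scores <- matrix(c( 1,  99,   3,
                    5,   5, 999,
                   99, 999,  NA),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(NULL, c("x1", "x2", "x3")))

isna <- c(99, 999)            # extra missing codes; NA itself is not listed

# Recode the extra missing codes to NA (illustration only).
recoded <- scores
recoded[recoded %in% isna] <- NA

# Ballots (rows) with no valid entry at all.
all_na_rows <- which(rowSums(!is.na(recoded)) == 0)

# With votesys installed, the corresponding call would be:
# check_dup_wrong(scores, xtype = 1, vv = 1:5, isna = c(99, 999))
```

In this sketch only the third row ends up without any valid entry, which is the kind of ballot the function reports as all-NA.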

Value

a list with 3 or 4 elements: row_with_dup gives the row numbers (not row names) of ballots that have duplicated values; row_with_wrong gives the rows that contain illegal names or whose length is larger than the number of candidates (the latter can only happen when x is a list); row_all_na gives the rows whose entries are all NA. For a list, NULL elements are also treated as all-NA ballots. When clean is TRUE, the list additionally contains the cleaned ballots.
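
Because the result reports row numbers rather than row names, the three index vectors can be combined to drop problem ballots by position. The sketch below uses a hand-written stand-in for the returned list (the values in res are hypothetical, not output produced by the function) and plain base R indexing.

```r
# Four made-up ballots in list form (xtype = 3 layout).
ballots <- list(c("a", "b"), c("a", "a"), c(NA, NA), c("b", "a"))

# Hypothetical stand-in for the list returned by check_dup_wrong();
# only the three element names come from the documentation.
res <- list(row_with_dup   = 2L,
            row_with_wrong = integer(0),
            row_all_na     = 3L)

# Union of all flagged positions.
bad <- sort(unique(unlist(
  res[c("row_with_dup", "row_with_wrong", "row_all_na")],
  use.names = FALSE)))

# Guard against an empty index: ballots[-integer(0)] would drop everything.
kept <- if (length(bad) > 0) ballots[-bad] else ballots
```

Setting clean = TRUE is the documented way to get cleaned ballots in one step; manual indexing like this is mainly useful when only some kinds of problem ballots should be dropped.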

Details

The function accepts the following input:

  • (1) when xtype is 1, x must be a matrix. Column names are candidate names (if column names are NULL, they will be created: x1, x2, x3...). Candidate number is the number of columns of the matrix. Entry ij is the numeric score assigned by the ith voter to the jth candidate.

  • (2) when xtype is 2, x can be a matrix or data.frame. Candidate number is the length of candidate. Entries are names (character or numeric) of candidates. The i1, i2, i3... entries are the 1st, 2nd, 3rd... preferences of voter i.

  • (3) when xtype is 3, x should be a list. Each element of the list is a ballot, that is, a vector containing the names (character or numeric) of candidates. The 1st preference is in the 1st position of the vector, the 2nd preference is in the 2nd position, and so on. The number of candidates is the length of candidate; as a result, a ballot with more names than the number of candidates is labelled as wrong.
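
The three checks described for list input (xtype 3) can be sketched in base R as follows. This is an illustration of the rules as stated above, not the package's implementation: flag_ballot is a hypothetical helper, and counting NA slots toward the ballot length is an assumption.

```r
# Hypothetical helper: flag one list-type ballot against a candidate vector.
flag_ballot <- function(ballot, candidate) {
  valid <- ballot[!is.na(ballot)]          # entries that are not missing
  c(
    dup    = anyDuplicated(valid) > 0,     # same name written more than once
    # Assumption: the length check counts every slot, including NA.
    wrong  = length(ballot) > length(candidate) ||
             !all(valid %in% candidate),   # a name outside the candidate list
    all_na = length(valid) == 0            # NULL or only NA
  )
}

ballots <- list(
  c("a", "c", "b"),   # fine
  c("a", "a", "b"),   # duplicated name
  c("a", "z"),        # "z" is not a candidate
  c(NA, NA),          # no valid entry
  NULL                # treated like an all-NA ballot
)

# One row per ballot, one column per check.
flags <- t(sapply(ballots, flag_ballot, candidate = c("a", "b", "c")))
```

A ballot can fail more than one check at once, which is why the sketch returns one flag per check instead of a single label.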

Examples

# NOT RUN {
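library(votesys)

raw <- list(
    c('a', 'e', 'c', 'd', 'b'),
    c('b', 'a', 'e'),
    c('c', 'd', 'b'),
    c('d', 'a', 'b'),
    c('a', 'a', 'b', 'b', 'b'),           # duplicated names
    c(NA, NA, NA, NA),                    # no valid entry
    v7 = NULL,                            # NULL is treated as an all-NA ballot
    v8 = c('a', NA, NA, NA, NA, NA, NA),  # 7 entries, more than the number of candidates
    v9 = rep(" ", 3)                      # " " is not a candidate name
)
# Five candidates: 'a' to 'e' are all legal names
y <- check_dup_wrong(raw, xtype = 3, candidate = letters[1:5])
# Four candidates: 'e' is no longer a legal name
y <- check_dup_wrong(raw, xtype = 3, candidate = letters[1:4])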
# }
