Classifies occurrence records in levels of confidence in species identification
classify_occ(
occ,
spec = NULL,
na.rm.coords = TRUE,
crit.levels = c("det_by_spec", "not_spec_name", "image", "sci_collection", "field_obs",
"no_criteria_met"),
ignore.det.names = NULL,
spec.ambiguity = "not.spec",
institution.code = "institutionCode",
collection.code = "collectionCode",
catalog.number = "catalogNumber",
year = "year",
date.identified = "dateIdentified",
species = "species",
identified.by = "identifiedBy",
decimal.latitude = "decimalLatitude",
decimal.longitude = "decimalLongitude",
basis.of.record = "basisOfRecord",
media.type = "mediaType",
occurrence.id = "occurrenceID",
institution.source,
year.event,
scientific.name,
determined.by,
latitude,
longitude,
basis.of.rec,
occ.id
)
The occ data frame plus the classification of each record
in a new column, named naturaList_levels.
occ
data frame with occurrence records information.
spec
data frame with specialists' names. See details.
na.rm.coords
logical. If TRUE, remove occurrences with NA
in decimal.latitude or decimal.longitude.
crit.levels
character. Vector with levels of confidence in decreasing
order. The criteria allowed are det_by_spec, not_spec_name,
image, sci_collection, field_obs, no_criteria_met.
See details.
ignore.det.names
character vector indicating strings in identified.by that
should not be treated as the name of a taxonomist. See details.
spec.ambiguity
character. Indicates how to deal with ambiguity in
specialists' names. not.spec resolves ambiguity by classifying the
identification as done by a non-specialist; is.spec assumes the
identification was done by a specialist; manual.check enables the
user to manually check all ambiguous names. Default is not.spec.
institution.code
column name of occ with the name (or acronym)
in use by the institution having custody of the object(s) or information
referred to in the record.
collection.code
column name of occ with the name, acronym,
code, or initials identifying the collection or data set from which the
record was derived.
catalog.number
column name of occ with an identifier
(preferably unique) for the record within the data set or collection.
year
column name of occ with the four-digit year in which the
Event occurred, according to the Common Era Calendar.
date.identified
column name of occ with the date on which the
subject was determined as representing the Taxon.
species
column name of occ with the species names.
identified.by
column name of occ with the name of who
determined the species.
decimal.latitude
column name of occ with the latitude in decimal
degrees.
decimal.longitude
column name of occ with the longitude in decimal
degrees.
basis.of.record
column name of occ with the specific nature of the data record. See details.
media.type
column name of occ with the media type of recording.
See details.
occurrence.id
column name of occ with the link or code for the
occurrence record. See the Darwin Core format.
institution.source
deprecated, use institution.code instead.
year.event
deprecated, use year instead.
scientific.name
deprecated, use species instead.
determined.by
deprecated, use identified.by instead.
latitude
deprecated, use decimal.latitude instead.
longitude
deprecated, use decimal.longitude instead.
basis.of.rec
deprecated, use basis.of.record instead.
occ.id
deprecated, use occurrence.id instead.
Arthur V. Rodrigues
The spec data frame must have separate columns for LastName,
Name and Abbrev. See the create_spec_df
function for an easy way to produce this data frame.
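As a rough illustration of the layout described above, a spec data frame could be assembled by hand in base R. The specialist names below are invented placeholders, and the exact column names and types that classify_occ expects are an assumption here; compare against the output of create_spec_df or the bundled speciaLists data before relying on this.

```r
# Hypothetical specialists, one row per person (names are made up).
spec_sketch <- data.frame(
  LastName = c("Silva", "Oliveira"),
  Name     = c("Maria", "Joao"),
  Abbrev   = c("M", "J"),
  stringsAsFactors = FALSE
)

# Check that the three kinds of columns named in the text are present.
required <- c("LastName", "Name", "Abbrev")
has_required <- all(required %in% names(spec_sketch))
```

A check like has_required is cheap to run before passing a hand-built table as the spec argument.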
When ignore.det.names = NULL (default), the function ignores
the strings "RRC ID Flag", "NA", "", "-" and "_". When a character
vector is provided, the function adds the default strings to the provided
character vector and ignores all of these strings as the name of a taxonomist.
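The way user-supplied strings are combined with the defaults can be sketched in base R. This is only an illustration of the behaviour described above, not the package's internal code; the user strings and identifier values are invented, and exact (whole-string) matching is an assumption.

```r
# Default strings ignored by the function, as listed in the text above.
default_ignore <- c("RRC ID Flag", "NA", "", "-", "_")

# Hypothetical user-supplied strings (an assumption for illustration).
user_ignore <- c("Unknown", "s.n.")

# The effective ignore list is the user vector plus the defaults.
all_ignore <- union(user_ignore, default_ignore)

# Made-up identifiedBy values; only entries that look like names should remain.
identified_by <- c("Silva, M.", "Unknown", "-", "", "Oliveira, J.")
is_ignored <- identified_by %in% all_ignore
kept_names <- identified_by[!is_ignored]
```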
The function classifies the occurrence records into six levels of confidence in species identification. The six levels are:
det_by_spec - the identification was made by a specialist
who is present in the list of specialists provided in the spec
argument;
not_spec_name - the identification was made by someone whose name is
not among the specialists provided in spec;
image - the occurrence has no identifier name, but has
an associated image;
sci_collection - the occurrence has no identifier name,
but is preserved in a scientific collection;
field_obs - the occurrence has no identifier name,
but was identified in a field observation;
no_criteria_met - no other criterion was met.
The (decreasing) order of the levels in the character vector determines the classification level order.
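To make the effect of the level order concrete, the following base R sketch assigns each record the first level, in the order given, whose criterion it meets. It is a simplified stand-in written for illustration, not the code used by classify_occ; the helper assign_level and the toy criteria table are hypothetical.

```r
# Toy records: which criteria each record satisfies (made-up data).
criteria <- data.frame(
  det_by_spec    = c(TRUE,  FALSE, FALSE, FALSE),
  image          = c(TRUE,  TRUE,  FALSE, FALSE),
  sci_collection = c(FALSE, TRUE,  TRUE,  FALSE)
)

# Assign each record the first level in `levels_order` whose criterion is met.
assign_level <- function(criteria, levels_order) {
  out <- rep("no_criteria_met", nrow(criteria))
  # Walk from lowest to highest priority so higher levels overwrite lower ones.
  for (lev in rev(intersect(levels_order, names(criteria)))) {
    out[criteria[[lev]]] <- lev
  }
  out
}

# Same records, two different orders for image and sci_collection.
image_first <- assign_level(
  criteria, c("det_by_spec", "image", "sci_collection", "no_criteria_met")
)
collection_first <- assign_level(
  criteria, c("det_by_spec", "sci_collection", "image", "no_criteria_met")
)
```

In this toy table the second record has both an image and a collection specimen, so its level changes from image to sci_collection when the two levels are swapped in the order vector.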
The basis.of.record column is a character vector containing one of the following
types of record: PRESERVED_SPECIMEN, PreservedSpecimen,
HUMAN_OBSERVATION or HumanObservation, as in GBIF data
'basisOfRecord'.
media.type uses the same pattern as GBIF mediaType column,
indicating the existence of an associated image with stillImage.
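Before classifying, it can help to inspect these two columns. The base R sketch below flags rows whose basis of record is one of the listed types and rows that carry an image; the toy table is invented, and treating any mediaType value containing stillImage as an image is an assumption rather than the package's documented rule.

```r
# Toy occurrence table using GBIF-style column names (made-up rows).
occ_sketch <- data.frame(
  basisOfRecord = c("PRESERVED_SPECIMEN", "HumanObservation", "FOSSIL_SPECIMEN"),
  mediaType     = c("stillImage", NA, "Sound"),
  stringsAsFactors = FALSE
)

# Values of basis.of.record named in the text above.
known_basis <- c("PRESERVED_SPECIMEN", "PreservedSpecimen",
                 "HUMAN_OBSERVATION", "HumanObservation")

# Flag rows whose basis of record is one of the listed types.
basis_ok <- occ_sketch$basisOfRecord %in% known_basis

# Flag rows with an associated image; NA media types count as no image.
has_image <- !is.na(occ_sketch$mediaType) &
  grepl("stillImage", occ_sketch$mediaType, fixed = TRUE)
```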
speciaLists
library(naturaList)
data("A.setosa")
data("speciaLists")
# Classify the example occurrences using the example specialists table
occ.class <- classify_occ(A.setosa, speciaLists)