CITAN (version 2011.08-1)

Scopus_ReadCSV: Import bibliography entries from a 40-column CSV file.

Description

Reads bibliography entries from a 40-column CSV file, such as one exported from SciVerse Scopus (Complete format).

Usage

Scopus_ReadCSV(filename, stopOnErrors=TRUE, dbIdentifier="Scopus", ...)

Arguments

filename
the name of the file which the data are to be read from, see read.csv.
stopOnErrors
logical; if TRUE, stop on any potential parse error; otherwise only issue a warning.
dbIdentifier
single character value; database identifier, helps detect parse errors, see Details.
...
further arguments to be passed to read.csv.

Value

A data frame (data.frame) containing the following 14 columns:

  • Authors: author name(s), comma-separated, surnames first.
  • Title: document title.
  • Year: year of publication.
  • UniqueId: unique document identifier.
  • SourceTitle: title of the source containing the document.
  • Volume: volume.
  • Issue: issue.
  • ArticleNumber: article number (identifier).
  • PageStart: start page; numeric.
  • PageEnd: end page; numeric.
  • Citations: number of citations.
  • ISSN: ISSN of the source.
  • Language: language of the document.
  • DocumentType: type of the document; see Details.

Such an object may be imported into a local bibliometric storage with lbsImportDocuments.

Details

The bibliometric database is read with read.csv. Its behavior can be adjusted by passing further arguments through ...; see the manual page of read.table for details.
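As an illustration of how arguments given through ... reach read.csv, here is a minimal sketch. The wrapper read_forwarding is hypothetical and is not CITAN code; it only mimics the forwarding described above, and the three-column file is made up for the example.

```r
# Hypothetical wrapper (not part of CITAN): forwards extra arguments to read.csv.
read_forwarding <- function(filename, ...) {
  read.csv(filename, stringsAsFactors = FALSE, ...)
}

# Write a tiny two-record file to a temporary location.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Authors,Title,Year",
             "Kovalsky John,First paper,1990",
             "Smith B. W.,Second paper,1991"), tmp)

all_rows  <- read_forwarding(tmp)             # read.csv defaults only
first_row <- read_forwarding(tmp, nrows = 1)  # nrows is passed through to read.csv
unlink(tmp)
```

With Scopus_ReadCSV itself, the same kind of argument would be given after the named parameters, assuming read.csv accepts it.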

The CSV file should consist of exactly 40 variables. Here are their meanings (in order of appearance):

  1. Author name(s) (surname first; multiple names are comma-separated, e.g. Kovalsky John, Smith B. W.),
  2. Document title,
  3. Year,
  4. Source title,
  5. Volume,
  6. Issue,
  7. Article number,
  8. Page start,
  9. Page end,
  10. not used,
  11. Number of citations received,
  12. String containing the unique document identifier, of the form ...id=UNIQUE_ID&...,
  13. not used,
  14. not used,
  15. not used,
  16. not used,
  17. not used,
  18. not used,
  19. not used,
  20. not used,
  21. not used,
  22. not used,
  23. not used,
  24. not used,
  25. not used,
  26. not used,
  27. not used,
  28. not used,
  29. not used,
  30. not used,
  31. not used,
  32. Source ISSN,
  33. not used,
  34. not used,
  35. not used,
  36. not used,
  37. Language of original document,
  38. not used,
  39. Document type, one of: Article, Article in Press, Book, Conference Paper, Editorial, Erratum, Letter, Note, Report, Review, Short Survey, or NA (other categories are interpreted as NA),
  40. Database identifier; should be the same as the value of the dbIdentifier parameter, otherwise an exception is thrown.
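The column layout above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's implementation: the pairing of positions with output column names is inferred from the two lists in this page, and extract_id, the sample link, and the sample type values are hypothetical.

```r
# Positions of the 14 used columns in the 40-column file, keyed by the names
# of the returned data frame (pairing inferred from this page, an assumption).
used <- c(Authors = 1, Title = 2, Year = 3, SourceTitle = 4, Volume = 5,
          Issue = 6, ArticleNumber = 7, PageStart = 8, PageEnd = 9,
          Citations = 11, UniqueId = 12, ISSN = 32, Language = 37,
          DocumentType = 39)

# Hypothetical helper: pull UNIQUE_ID out of a string of the form ...id=UNIQUE_ID&...
# Note that regmatches() silently drops elements that contain no match.
extract_id <- function(x) {
  m <- regmatches(x, regexpr("id=[^&]+", x))
  sub("^id=", "", m)
}
link <- "http://example.org/record?id=ABC-123&origin=test"  # made-up link
id <- extract_id(link)

# Document types outside the recognised set become NA.
known_types <- c("Article", "Article in Press", "Book", "Conference Paper",
                 "Editorial", "Erratum", "Letter", "Note", "Report",
                 "Review", "Short Survey")
raw_types <- c("Article", "Poster")
doc_types <- ifelse(raw_types %in% known_types, raw_types, NA_character_)
```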

Such a CSV file may be generated e.g. with SciVerse Scopus (Export format=comma separated file, .csv (e.g. Excel), Output=Complete format). Note that the exported CSV file needs manual corrections in a few cases (wrong page numbers, a single double-quote character inside a string instead of two consecutive double quotes, etc.). We suggest making them in a Notepad-like application (in plain text). The function tries to report the line numbers that cause potential problems. However, a spreadsheet program can sometimes be helpful as well.
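Before editing a file by hand, it can help to locate suspicious lines with base R. The sketch below is not part of CITAN; it uses a made-up three-column miniature instead of a real 40-column export. It counts the fields on each line with count.fields and shows that a correctly doubled quote is read back as a single quote character.

```r
# Made-up miniature export: the last line is missing a field.
lines <- c("Authors,Title,Year",
           "Kovalsky John,\"A \"\"quoted\"\" word\",1990",
           "Smith B. W.,Second paper")

# Count the fields on every line; for a real export the expected count is 40.
tc <- textConnection(lines)
n_fields <- count.fields(tc, sep = ",", quote = "\"", comment.char = "")
close(tc)
bad_lines <- which(n_fields != 3)  # line numbers worth inspecting

# A doubled double quote inside a quoted field is read as one quote character.
ok <- read.csv(text = paste(lines[1:2], collapse = "\n"),
               stringsAsFactors = FALSE)
ok$Title
```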

See Also

Scopus_ASJC, Scopus_SourceList, lbsConnect, Scopus_ImportSources, read.table, lbsImportDocuments

Examples

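library("CITAN")  # provides lbsConnect, Scopus_ReadCSV and lbsImportDocuments

# Open a connection to the local bibliometric storage
conn <- lbsConnect("Bibliometrics.db")
## ...
# Read the Scopus export, then import it under the label "Poland_MATH"
data <- Scopus_ReadCSV("db_Polish_MATH/Poland_MATH_1987-1993.csv")
lbsImportDocuments(conn, data, "Poland_MATH")
## ...
dbDisconnect(conn)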
