scholar

The scholar R package provides functions to extract citation data from Google Scholar. In addition to retrieving basic information about a single scholar, the package also allows you to compare multiple scholars and predict future h-index values.

Development of the scholar package has resumed and a new maintainer should be confirmed shortly. Please continue to file issues and make pull requests against https://github.com/jkeirstead/scholar going forwards.

Basic features

Individual scholars are referenced by a unique character string, which can be found by searching for an author and inspecting the resulting scholar homepage. For example, the profile of physicist Richard Feynman is located at http://scholar.google.com/citations?user=B7vSqZsAAAAJ and so his unique id is B7vSqZsAAAAJ.
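If you already have a profile URL, the id is simply the value of its user query parameter. A minimal base R sketch follows; extract_scholar_id is a hypothetical helper written for illustration and is not part of the scholar package.

```r
# Hypothetical helper (not part of the scholar package): pull the value of
# the `user` query parameter out of a Google Scholar profile URL.
extract_scholar_id <- function(url) {
  # Capture everything after "user=" up to the next "&" or "#"
  m <- regmatches(url, regexec("[?&]user=([^&#]+)", url))[[1]]
  if (length(m) < 2) {
    stop("No 'user' parameter found in URL: ", url)
  }
  m[2]
}

feynman_url <- "http://scholar.google.com/citations?user=B7vSqZsAAAAJ"
feynman_id <- extract_scholar_id(feynman_url)
print(feynman_id)
```

The helper stops with an error when the URL carries no user parameter, rather than silently returning an empty id.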

Basic information on a scholar can be retrieved as follows:

# Define the id for Richard Feynman
id <- 'B7vSqZsAAAAJ'

# Get his profile and print his name
l <- get_profile(id)
l$name 

# Get his citation history, i.e. citations to his work in a given year 
get_citation_history(id)

# Get his publications (a large data frame)
get_publications(id)

Additional functions allow the user to query the publications list, e.g. get_num_articles, get_num_distinct_journals, get_oldest_article, get_num_top_journals. Note that Google doesn't explicitly categorize publications as journal articles, book chapters, etc., and so "journal" or "article" in these function names is just a generic term for a publication.
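To give a feel for what these summaries represent, here is a rough base R sketch that computes similar quantities from a small hand-made data frame. The column names (title, journal, year, cites) and every value are illustrative assumptions, and this is not the package's own implementation.

```r
# Toy stand-in for a publications data frame; the column names and all
# values below are made up for illustration only.
pubs <- data.frame(
  title   = c("Paper A", "Paper B", "Paper C", "Paper D"),
  journal = c("Journal X", "Journal Y", "Journal X", "Book Z"),
  year    = c(1999, 2004, 2004, NA),
  cites   = c(120, 45, 8, 0),
  stringsAsFactors = FALSE
)

# Number of publications (every row counts, whatever its type)
num_articles <- nrow(pubs)

# Number of distinct venues, treating "journal" as a generic label
num_distinct_journals <- length(unique(pubs$journal))

# Year of the oldest publication, ignoring entries with no recorded year
oldest_year <- min(pubs$year, na.rm = TRUE)

print(c(articles = num_articles,
        journals = num_distinct_journals,
        oldest   = oldest_year))
```

Note that the book entry is counted like any other row, which mirrors the point above that "journal" and "article" are used loosely.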

Comparing scholars

You can also compare multiple scholars, as shown below. Note that these two particular scholars are rather prolific, so these queries will take a very long time to run.

# Compare Feynman and Stephen Hawking
ids <- c('B7vSqZsAAAAJ', 'qj74uXkAAAAJ')

# Get a data frame comparing the number of citations to their work in
# a given year 
compare_scholars(ids)

# Compare their career trajectories, based on year of first citation
compare_scholar_careers(ids)
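Comparing careers "based on year of first citation" amounts to putting each scholar on a common time axis that starts at their first cited year. The sketch below shows that alignment in base R on a made-up data frame; the column names (id, year, cites, career_year) and the values are assumptions for illustration, not the package's actual output.

```r
# Toy yearly citation counts for two hypothetical scholars "A" and "B"
cites <- data.frame(
  id    = c("A", "A", "A", "B", "B"),
  year  = c(1990, 1991, 1992, 2001, 2002),
  cites = c(3, 10, 25, 7, 12),
  stringsAsFactors = FALSE
)

# Years elapsed since each scholar's first recorded citation year:
# ave() returns the per-scholar minimum year, repeated for every row
cites$career_year <- cites$year - ave(cites$year, cites$id, FUN = min)

print(cites)
```

With this offset, scholars who started decades apart can be plotted or tabulated side by side by career_year instead of calendar year.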

Predicting future h-index values

Finally, users can predict the future h-index of a scholar, based on the method of Acuna et al. Because the method was originally calibrated on data from neuroscientists, results for scholars in other disciplines should be taken with a large pinch of salt. A more general critique of the original paper is also available. Still, it's a bit of fun.

## Predict h-index of original method author, Daniel Acuna
id <- 'GAi23ssAAAAJ'
predict_h_index(id)
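For context, the quantity being predicted is the h-index: the largest h such that h publications each have at least h citations. The sketch below computes the current h-index from a vector of citation counts in base R. h_index is a hypothetical helper, not a function from the scholar package, and it does not reproduce the prediction model itself.

```r
# Hypothetical helper (not part of the scholar package): compute the
# h-index from a numeric vector of per-publication citation counts.
h_index <- function(cites) {
  # Drop missing values and rank publications from most to least cited
  cites <- sort(cites[!is.na(cites)], decreasing = TRUE)
  # Count the publications whose citations are at least their rank;
  # because the vector is sorted, this count equals the h-index
  sum(cites >= seq_along(cites))
}

example_cites <- c(10, 8, 5, 4, 3, 0)
h <- h_index(example_cites)
print(h)  # 4: four papers have at least 4 citations each
```

An empty vector gives an h-index of 0, which is a sensible result for a scholar with no recorded publications.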

Install

install.packages('scholar')

Monthly Downloads: 1,765
Version: 0.1.7
License: MIT + file LICENSE
Maintainer: Guangchuang Yu
Last Published: July 3rd, 2018

Functions in scholar (0.1.7)

get_article_cite_history: Gets the citation history of a single article
get_coauthors: Gets the network of coauthors of a scholar
get_citation_history: Get historical citation data for a scholar
get_num_top_journals: Gets the number of top journals in which a scholar has published
scholar: scholar
get_profile: Gets profile information for a scholar
plot_coauthors: Plot a network of coauthors
get_impactfactor: Get journal metrics.
get_num_distinct_journals: Gets the number of distinct journals in which a scholar has published
compare_scholar_careers: Compare the careers of multiple scholars
get_complete_authors: Get the Complete list of authors for a Publication
compare_scholars: Compare the citation records of multiple scholars
tidy_id: Ensures that specified IDs are correctly formatted
get_oldest_article: Gets the year of the oldest article for a scholar
predict_h_index: Predicts the h-index for a researcher
get_publications: Gets the publications for a scholar
get_num_articles: Calculates how many articles a scholar has published