Data Science for Psychologists (ds4psy)

Welcome to the R package ds4psy — a software companion to the book and course Data Science for Psychologists.

This R package provides datasets and functions used in the ds4psy book and course. The book and course introduce the principles and methods of data science for students of psychology and other biological or social sciences.

Installation

The current release of ds4psy is available from CRAN at https://CRAN.R-project.org/package=ds4psy:

install.packages('ds4psy')  # install ds4psy from CRAN client
library('ds4psy')           # load to use the package

The current development version of ds4psy can be installed from its GitHub repository at https://github.com/hneth/ds4psy/:

# install.packages('devtools')  # (if not installed yet)
devtools::install_github('hneth/ds4psy')
library('ds4psy')  # load to use the package
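Because the CRAN release and the development version can differ, it can help to check which version (if any) is currently installed. The following is a minimal sketch in base R; the helper name installed_version is hypothetical and not part of ds4psy:

```r
# Report the installed version of a package as a string,
# or NA if the package is not installed.
# (Hypothetical helper: nothing is installed or attached.)
installed_version <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA_character_)
  }
  as.character(utils::packageVersion(pkg))
}

v <- installed_version("ds4psy")
if (is.na(v)) {
  message("ds4psy is not installed; run install.packages('ds4psy') first.")
} else {
  message("ds4psy version ", v, " is installed.")
}
```

The check uses requireNamespace() rather than library(), so it does not attach the package as a side effect.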

The most recent version of the ds4psy book is available at https://bookdown.org/hneth/ds4psy/.

Course Coordinates

Description

This book and course provide an introduction to data science that is tailored to the needs of psychologists, but is also suitable for students of the humanities and other biological or social sciences. This audience typically has some knowledge of statistics, but rarely knows how data is prepared and shaped to allow for statistical testing. By using various data types and working with many examples, we teach tools for transforming, summarizing, and visualizing data. By staying alert to the perils of misleading representations, we foster fundamental skills of data literacy and cultivate reproducible research practices that enable and precede any practical use of statistics.

Audience

Students of psychology and other social sciences are trained to analyze data. But the data they learn to work with (e.g., in courses on statistics and empirical research methods) is typically provided to them and structured in a (rectangular or "tidy") format that presupposes many steps of data processing regarding the aggregation and spatial layout of variables. When beginning to collect their own data, students inevitably struggle with these pre-processing steps which — even for experienced data scientists — tend to require more time and effort than choosing and conducting statistical tests.

This course develops the foundations of data analysis that allow students to collect data from real-world sources and to transform and shape such data to answer scientific and practical questions. Although there are many good introductions to data science (e.g., Wickham & Grolemund, 2017), they typically do not take into account the special needs (and often the anxieties and reservations) of psychology students. As social scientists are not computer scientists, we introduce new concepts and commands without assuming a mathematical or computational background. Adopting a task-oriented perspective, we begin with a specific problem and then solve it with some combination of data collection, manipulation, and visualization.
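To illustrate the kind of pre-processing described in this section, the following sketch reshapes a small table from a wide layout into a long ("tidy") layout and then aggregates it. It uses only base R (the course itself relies on the tidyverse), and the data are made up for illustration:

```r
# Made-up example data in wide format: one row per participant,
# one column per measurement occasion (hypothetical scores).
wide <- data.frame(
  id = 1:3,
  t1 = c(5, 3, 4),
  t2 = c(6, 4, 4)
)

# Reshape to long format: one row per participant and occasion.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("t1", "t2"),  # columns holding the repeated measure
  v.names   = "score",        # name of the new value column
  timevar   = "time",         # name of the new occasion column
  times     = c("t1", "t2"),  # labels stored in the occasion column
  idvar     = "id"
)
rownames(long) <- NULL        # drop the compound row names

# Aggregate: mean score per measurement occasion.
means <- aggregate(score ~ time, data = long, FUN = mean)
print(means)
```

In the long layout, every row is a single observation, which is the shape that most summary and plotting functions expect.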

Goals

Our main goal is to develop a set of useful skills in analyzing real-world data and conducting reproducible research. Upon completing this course, students will be able to use R to read, transform, analyze, and visualize data of various types. Many interactive exercises allow them to continuously check their understanding, practice their skills, and monitor their progress.
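As a rough sketch of what reading, transforming, analyzing, and visualizing data can look like in practice, the following base R example walks through the four steps on a tiny data set. The data and variable names are made up for illustration and do not come from the package:

```r
# Made-up example data (hypothetical reaction times in milliseconds).
csv_text <- "id,condition,rt_ms
1,control,512
2,control,488
3,control,530
4,treatment,455
5,treatment,470
6,treatment,449"

# Read: parse the CSV text into a data frame.
dat <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Transform: add the reaction time in seconds.
dat$rt_s <- dat$rt_ms / 1000

# Analyze: mean reaction time per condition.
means <- aggregate(rt_s ~ condition, data = dat, FUN = mean)
print(means)

# Visualize: draw a boxplot (only in an interactive session).
if (interactive()) {
  boxplot(rt_s ~ condition, data = dat, ylab = "Reaction time (s)")
}
```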

Requirements

This course assumes some basic familiarity with statistics and the R programming language, but enthusiastic programming novices are also welcome.

Resources

This package and the corresponding book are still being developed and are updated as new materials become available.

References

Course materials

The script was originally based on the following textbook:

  • Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. Sebastopol, CA: O'Reilly Media, Inc. [Available online at http://r4ds.had.co.nz.]

Software

Please install the following open-source programs on your computer:

# Tidyverse packages: 
install.packages('tidyverse')

# Course packages: 
install.packages('ds4psy')  # datasets and functions
install.packages('unikn')   # color palettes and functions

Other resources

Online

R manuals and books

About

If you find these materials useful, or want to adopt or alter them for your purposes, please let me know.

Citation

To cite ds4psy in derivations and publications, please use:

  • Neth, H. (2020). ds4psy: Data Science for Psychologists. Social Psychology and Decision Sciences, University of Konstanz, Germany. Textbook and R package (version 0.4.0, July 6, 2020). Retrieved from https://bookdown.org/hneth/ds4psy/.

A BibTeX entry for LaTeX users is:

@Manual{ds4psy,
  title = {ds4psy: Data Science for Psychologists},
  author = {Hansjörg Neth},
  year = {2020},
  organization = {Social Psychology and Decision Sciences, University of Konstanz},
  address = {Konstanz, Germany},
  note = {Textbook and R package (version 0.4.0, July 6, 2020)},
  url = {https://bookdown.org/hneth/ds4psy/} 
}

The URL of the ds4psy R package is https://CRAN.R-project.org/package=ds4psy.

License

Data science for psychologists (ds4psy) by Hansjörg Neth is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

[Updated 2020-07-06 by hn.]

Monthly Downloads

345

Version

0.4.0

License

CC BY-SA 4.0

Maintainer

Hansjoerg Neth

Last Published

July 6th, 2020

Functions in ds4psy (0.4.0)

  • cclass: cclass provides character classes (as a named vector).
  • change_tz: Change time zone (without changing represented time).
  • capitalize: capitalize converts the case of each word's n initial characters (typically to upper) in a string of text x.
  • change_time: Change time and time zone (without changing time display).
  • coin: Flip a fair coin (with 2 sides "H" and "T") n times.
  • count_chars: count_chars counts the frequency of characters in a string of text x.
  • caseflip: caseflip flips the case of characters in a string of text x.
  • Bushisms: Data: Bushisms.
  • Umlaut: Umlaut provides German Umlaut letters (as Unicode characters).
  • Trumpisms: Data: Trumpisms.
  • data_t1: Data table data_t1.
  • data_1: Data import data_1.
  • data_2: Data import data_2.
  • exp_wide: Data exp_wide.
  • data_t1_de: Data import data_t1_de.
  • dice_2: Throw a questionable dice (with a given number of sides) n times.
  • cur_date: Current date (in yyyy-mm-dd or dd-mm-yyyy format).
  • dice: Throw a fair dice (with a given number of sides) n times.
  • cur_time: Current time (in hh:mm or hh:mm:ss format).
  • exp_num_dt: Data from an experiment with numeracy and date-time variables.
  • falsePosPsy_all: False Positive Psychology data.
  • ds4psy.guide: Opens user guide of the ds4psy package.
  • num_as_char: Convert a number into a character sequence.
  • plot_text: Plot text characters (from file or user input).
  • fame: Data table fame.
  • metachar: metachar provides R metacharacters (as a character vector).
  • plot_tiles: Plot n-by-n tiles.
  • is.wholenumber: Test for whole numbers (i.e., integers).
  • data_t1_tab: Data import data_t1_tab.
  • data_t2: Data table data_t2.
  • flowery: Data: Flowery phrases.
  • dt_10: Data from 10 Danish people.
  • read_ascii: read_ascii parses text (from a file) into a table.
  • sample_char: Draw a sample of n random characters (from given characters).
  • theme_ds4psy: ds4psy default plot theme (using ggplot2 and unikn).
  • is_leap_year: Is some year a so-called leap year?
  • sample_date: Draw a sample of n random dates (from a given range).
  • posPsy_AHI_CESD: Positive Psychology: AHI CESD data.
  • sample_time: Draw a sample of n random times (from a given range).
  • make_grid: Generate a grid of x-y coordinates.
  • posPsy_long: Positive Psychology: AHI CESD corrected data (in long format).
  • fruits: Data: Names of fruits.
  • l33t_rul35: l33t_rul35 provides rules for translating text into leet/l33t slang.
  • plot_fun: Another function to plot some plot.
  • plot_n: Plot n tiles.
  • t_3: Data t_3.
  • t_2: Data t_2.
  • count_words: count_words counts the frequency of words in a string of text x.
  • t_1: Data t_1.
  • transl33t: transl33t translates text into leet slang.
  • t_4: Data t_4.
  • plot_fn: A function to plot a plot.
  • text_to_sentences: text_to_sentences splits a string of text x (consisting of one or more character strings) into a vector of its constituting sentences.
  • pi_100k: Data: 100k digits of pi.
  • num_as_ordinal: Convert a number into an ordinal character sequence.
  • countries: Data: Names of countries.
  • data_t3: Data table data_t3.
  • data_t4: Data table data_t4.
  • what_week: What week is it?
  • text_to_words: text_to_words splits a string of text x (consisting of one or more character strings) into a vector of its constituting words.
  • what_month: What month is it?
  • outliers: Outlier data.
  • posPsy_wide: Positive Psychology: All corrected data (in wide format).
  • what_time: What time is it?
  • posPsy_p_info: Positive Psychology: Participant data.
  • table8: Data table8.
  • pal_n_sq: Get n-by-n dedicated colors of a color palette.
  • pal_ds4psy: ds4psy default color palette.
  • what_year: What year is it?
  • what_date: What date is it?
  • table6: Data table6.
  • what_day: What day (of the week) is it?
  • tb: Data table tb.
  • table7: Data table7.
  • t4: Data table t4.
  • t3: Data table t3.
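
An index like the one above can also be browsed from within R by querying an installed package. The sketch below uses only base R; the helper names list_exports and list_datasets are hypothetical and not part of ds4psy, and both return an empty vector if the package is not installed:

```r
# List the exported objects of an installed package, sorted by name.
# (Hypothetical helper, not part of ds4psy.)
list_exports <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))  # package not installed: nothing to list
  }
  sort(getNamespaceExports(pkg))
}

# List the names of the datasets that an installed package provides.
# (Hypothetical helper, not part of ds4psy.)
list_datasets <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  unname(utils::data(package = pkg)$results[, "Item"])
}

# Works for any installed package, e.g., ds4psy (if installed):
head(list_exports("ds4psy"))
head(list_datasets("ds4psy"))
```

For the documentation of an individual entry, help(package = "ds4psy") opens the package index, and ?coin (after loading the package) opens the help page of a single function.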