ds4psy (version 1.2.0)

i2ds_survey: Data from the i2ds online survey

Description

i2ds_survey contains pre-processed data from the i2ds online survey.

Usage

i2ds_survey

Format

As of 2025-11-02, this dataset contains 60 participants (rows) and 116 variables (columns).

Details

Prefix codes

Many variable names have prefixes that indicate a particular type of variable:

  • rv: A random variable

  • c(#): A choice variable (with # alternatives)

  • t: A text variable (with any input)

  • tn: A text variable (with numeric input)

  • crs: A course-related variable

  • combined: A composite variable created by averaging either 4 or 5 individual Likert-scale items. Depending on the item set, the resulting score was normalized (i.e., divided by 4 or 5), and stored as a new variable.
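The prefix codes above lend themselves to selecting groups of variables by name. The following is a minimal base R sketch: the data frame toy is a hypothetical stand-in that borrows a few column names from the list of variables, and the regular expressions are assumptions derived from the prefixes described here. With the actual data, names(i2ds_survey) would take the place of names(toy).

```r
# Hypothetical toy data: only the column names matter for this sketch.
toy <- data.frame(
  c4_gender        = c("female", "male"),
  tn_year          = c(1999, 2001),
  t_height         = c("1.80", "180 cm"),
  rv_wait_time     = c("short", "long"),
  crs_i2ds_1       = c(TRUE, FALSE),
  combined_admired = c(3.5, 4),
  stringsAsFactors = FALSE
)

# One regular expression per prefix code (assumed patterns).
prefix_patterns <- c(
  rv           = "^rv_",
  choice       = "^c[0-9]+_",   # c(#): "c" followed by the number of alternatives
  text         = "^t_",
  text_numeric = "^tn_",
  course       = "^crs_",
  combined     = "^combined_"
)

# For each prefix, collect the matching variable names.
vars_by_prefix <- lapply(prefix_patterns, function(p) {
  grep(p, names(toy), value = TRUE)
})

vars_by_prefix$choice   # names of all choice variables in toy
```

Anchoring each pattern at the start of the name and including the trailing underscore keeps "t_" from also matching "tn_" variables, and keeps the choice pattern from matching "crs_" or "combined_" variables.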

List of variables

After pre-processing the raw data and re-arranging its variables (columns), the variable names and their contents in the i2ds_survey tibble are as follows:

Key person-related variables:

  1. c4_gender A categorical (character) variable indicating the participant’s gender identity, with possible values including "female", "male", "non-binary" or "do not wish to respond". This variable is used for demographic analysis.

  2. tn_year A numeric (double) variable indicating the year of birth (e.g., 1999, 2000, 2001, etc.).

  3. tn_month A numeric (double) variable indicating the participant’s birth month (1–12). This variable also supports demographic profiling.

  4. tn_day A numeric (double) variable indicating the day of birth provided by the participant (1–31). Used for demographic purposes and potential exploratory analyses. DOB-related variables can be used to calculate age and analyze age-related trends.

  5. t_height A character variable indicating a participant's self-described height, using various formats and units (e.g., "1.80", "180 cm", "1,80m", or "5'11"). This variable requires pre-processing for analysis.

  6. t_pid An optional character variable capturing a participant ID, pseudonym, or other identifying entry. This variable allows participants to recognize their own data without disclosing their identity.

Variables indicating informed consent and willingness to share data:

  7. c2_informed_consent A logical variable indicating whether the participant provided informed consent before starting the study (TRUE = consent provided, FALSE = no consent provided). This variable is a pre-requisite for ethical compliance (i.e., should be TRUE for all participants).

  8. c2_use_data_2 A logical variable indicating whether a participant still agrees to allow their data to be shared after having finished the survey (TRUE = consent provided, FALSE = no consent provided). This variable is a pre-requisite for data re-usability in research (and should be TRUE for all cases included here).

Variables indicating course membership:

  9. crs_i2ds_1 A logical variable indicating whether a participant is currently enrolled in the course Introduction to Data Science 1: Basics (i2ds 1: TRUE = enrolled).

  10. crs_i2ds_2 A logical variable indicating whether a participant is enrolled in the course Introduction to Data Science 2: Applications (i2ds 2: TRUE = enrolled).

  11. crs_ds4psy A logical variable indicating whether a participant is enrolled in the course Data Science for Psychology (ds4psy: TRUE = enrolled).

  12. crs_diff_kn A logical variable indicating whether a participant is enrolled in a different course at the University of Konstanz (TRUE = yes).

  13. crs_diff_else A logical variable indicating whether a participant is enrolled in a course not at the University of Konstanz (TRUE = yes). This variable helps identify external learners.

  14. crs_self_study A logical variable indicating whether a participant is engaging with course materials without formal enrollment (TRUE = yes). This variable reflects informal learning engagement.

  15. crs_only_study A logical variable indicating whether a participant is taking the survey only, without engaging with course materials (TRUE = yes). This variable identifies participants not studying R or data science.

  16. t_crs_other A character variable capturing free-text input describing any other course a participant is taking.

  17. v_crs_other_dept A character variable indicating the department of the other course(s) mentioned in t_crs_other. This variable may facilitate grouping participants by academic discipline.

Variables indicating (randomized) survey conditions:

  18. rv_anchor_high_low A randomized (character) variable that indicates whether a person is to keep a relatively large or small number in memory (i.e., assignment to either 242 or 42, respectively). This manipulation is used to examine anchoring effects on later responses.

  19. rv_scale_randomization A randomized (character) variable that indicates whether a person was asked to rate their personality (from "serious" to "humorous") on a 4-point or on a 5-point Likert scale. The variable controls for the influence of scale granularity on ratings.

  20. rv_barnum_pos_neg A randomized (character) variable that indicates whether the participant is to receive a positive or negative Barnum statement ("positive" vs. "negative"). This is used to measure sensitivity to vague or generic personality feedback.

  21. rv_sc_false_dicho_3 A randomized (character) variable indicating which version of the scale is to be shown: a dichotomous comparison between admiration vs. respect, fear vs. love, admiration vs. love, or fear vs. respect, or a single undivided scale (values: "admir_resp", "fear_love", "admir_love", "fear_resp", "single_scale"). Used to examine how scale format affects evaluative judgments.

  22. rv_wait_time A randomized (character) variable that indicates whether the participant waited 10 seconds ("short") or 30 seconds ("long") before continuing. This manipulation aims to examine whether a longer waiting period increases the perceived credibility or value of a following personality feedback, in line with mechanisms underlying the Barnum effect.

  23. rv_political_orientation A randomized (character) variable indicating the order in which the two political orientation scales ("left–right" and "liberal–conservative") were presented. Possible values include "left_right, lib_cons", "left_cons, lib_right", etc. This variable is used to control for potential order effects in political self-placement tasks.

  24. rv_thinkingstyle A randomized (character) variable that indicates the order in which pairs of thinking styles are to be presented ("deliberative vs. intuitive"; "reflective vs. spontaneous"; "deliberative vs. spontaneous"; "reflective vs. intuitive"). The order is counterbalanced to reduce presentation bias in self-assessment tasks.

Binary choices on art preference:

  25. c2_img_sel_1 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 1. The binary variable indicates the participant's image preference:

    • 1 corresponds to the cubist painting Les Baigneurs (the bathers), by Roger de La Fresnaye, 1912

    • 2 corresponds to the expressionist painting Badende Mädchen (bathing girls), by August Macke, 1913

  26. c2_img_sel_2 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 2. The binary variable indicates the participant's image preference:

    • 1 corresponds to the cubist painting Le Gouter (the taster, aka. tea time), by Jean Metzinger, 1911

    • 2 corresponds to the expressionist painting La petite Jeanne, by Amedeo Modigliani, 1909

  27. c2_img_sel_3 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 3. The binary variable indicates the participant's image preference:

    • 1 corresponds to the cubist painting Edtaonisl Ecclesiastic (the 1st word being an acronym made by alternating the French words for 'star' and 'dance'), by Francis Picabia, 1913

    • 2 corresponds to the impressionist painting Femme avec parasol dans un jardin (woman with parasol in a garden), by Pierre-Auguste Renoir, 1875

  28. c2_img_sel_4 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 4. The binary variable indicates the participant's image preference:

    • 1 corresponds to the expressionist painting Solitude, by Alexej von Jawlensky, 1912

    • 2 corresponds to the impressionist painting Pont dans le Jardin de Monet (bridge in Monet’s garden), by Claude Monet, 1895–96

Variables describing habits and preferences:

  29. c7_eating_habits A categorical (character) variable that indicates which dietary lifestyle participants assign to themselves (1 = "vegetarian"; 2 = "omnivore"; 3 = "vegan"; 4 = "pescetarian"; 5 = "flexitarian"; 6 = "carnivore"; 7 = "other").

  30. t_eating_habits_other A character variable intended to capture free-text input for other dietary descriptions; usually NA unless "other" was selected. May appear as logical if no responses were entered.

  31. c7_apple A numeric (double) variable indicating how much a participant likes apples on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).

  32. c7_cherry A numeric (double) variable indicating how much a participant likes cherries on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).

  33. c7_broccoli A numeric (double) variable indicating how much a participant likes broccoli on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).

  34. c7_asparagus A numeric (double) variable indicating how much a participant likes asparagus on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).

  35. c7_spinach A numeric (double) variable indicating how much a participant likes spinach on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).

  36. c7_mud A numeric (double) variable indicating how much a participant likes mud on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).

  37. c7_banana A numeric (double) variable indicating how much a participant likes bananas on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).

    Note: Variables c7_apple to c7_banana were derived from a sorting/ranking task in which each participant sorted/ranked food items by preference. Each item was subsequently coded as a numeric value between 1 and 7 (0 if not ranked).

Responses to binary choice items:

  38. c2_decsleep_instant A categorical (character) variable indicating whether a participant prefers to sleep before making important decisions ("sleep") or to make them instantly ("instant").

  39. c2_shopperson_online A categorical (character) variable indicating whether a participant prefers shopping in person ("person") or online ("online").

  40. c2_town_city A categorical (character) variable indicating whether a participant prefers living in a town ("town") or in a city ("city").

  41. c2_club_house A categorical (character) variable indicating whether a participant prefers to party in a club ("club") or to attend a house party ("house").

  42. c2_hotel_camping A categorical (character) variable capturing a participant's preference for staying in a hotel ("hotel") versus going camping ("camping").

  43. c2_photo_being A categorical (character) variable indicating whether a participant prefers photographing ("photo") or being in a moment ("being").

  44. c2_spring_fall A categorical (character) variable indicating whether a participant prefers the spring season ("spring") or the fall/autumn season ("fall").

  45. c2_beach_mount A categorical (character) variable reflecting whether a participant prefers the beach ("beach") or the mountains ("mount").

  46. c2_cats_dogs A categorical (character) variable indicating preference for cats ("cats") versus dogs ("dogs").

  47. c2_indiv_team A categorical (character) variable indicating whether a participant prefers individual ("indiv") or team sports ("team").

  48. c2_movies_books A categorical (character) variable indicating a participant's preference for movies ("movies") or books ("books").

  49. c2_board_video A categorical (character) variable indicating whether a participant prefers board games ("board") or video games ("video").

  50. c2_ios_android A categorical (character) variable indicating whether a participant prefers iOS ("ios") or Android ("android") as a mobile operating system.

  51. c2_text_voice A categorical (character) variable indicating whether a participant prefers texting ("text") or sending voice messages ("voice").

  52. c2_cook_bake A categorical (character) variable indicating whether a participant prefers cooking ("cook") or baking ("bake").

  53. c2_pinapple_no A categorical (character) variable that records whether a participant likes pineapple on pizza ("yes") or not ("no").

  54. c2_ketchup_mayo A categorical (character) variable indicating whether a participant prefers ketchup ("ketchup") or mayonnaise ("mayo").

  55. c2_coffee_tea A categorical (character) variable indicating whether a participant prefers coffee ("coffee") or tea ("tea").

  56. c2_math_lang A categorical (character) variable indicating whether a participant prefers mathematics ("math") or language-related subjects ("lang").

  57. c2_odd_even A categorical (character) variable indicating whether a participant prefers odd numbers ("odd") or even numbers ("even").

  58. c3_diff_bin A categorical (character) variable indicating how difficult it was for a participant to make their previous preference decisions (the binary choice items c2_decsleep_instant to c2_odd_even). Response options include "yes", "a little", and "no". This item captures perceived decisional difficulty and may serve as an indicator of response certainty, thinking style, or task engagement.

Variables on political opinions:

  59. politics_left A numeric (double) variable representing the participant’s self-placement on a left–right political spectrum. Values range from 1 (left) to 6 (right).

  60. politics_liberal A numeric (double) variable representing self-placement on a liberal to conservative scale, ranging from 1 (liberal) to 6 (conservative).

Miscellaneous estimates, choices, opinions, and preferences:

  61. tn_estimate_sun A numeric (double) variable capturing the participant’s estimate of how many times larger the sun’s diameter is compared to that of the earth. This item serves as a manipulation check for the anchoring effect, based on previously presented numeric anchors (e.g., 42 or 242).

  62. t_att_check_1 A character variable containing the participant’s open-text response to an attention check prompt ("Please type: 'I read the instructions'"). This attention check allows detecting inattentive or automated responses.

  63. c2_fly_invisible A categorical (character) variable indicating whether the participant would prefer the superpower of flying ("fly") or becoming invisible ("invisible").

  64. t_fly_invisible_explain A character variable where participants explain their choice between flying and invisibility. This free text answer allows for qualitative analysis of a participant's justifications and motivations.

  65. combined_c_ser_hum_self A numeric (double) variable reflecting a participant’s self-assessment on a "serious vs. humorous" scale. The score is based on a 4-point or 5-point Likert scale, depending on random assignment. This variable is used to test how perspective (self vs. others) and scale format (presence vs. absence of a middle option) influence self-ratings.

  66. combined_c_ser_hum_others A combined numeric (double) variable reflecting how humorous or serious participants believe others to perceive them. This score is derived from either a 4-point or 5-point scale and is used to examine the effect of perspective and scale design on perceived external ratings.

  67. c4_chronotype A categorical (character) variable indicating whether the participant identifies as a morning person ("morning"), evening person ("evening"), mid-day person ("mid-day"), or a never person ("never").

  68. tn_sleep A numeric (double) variable indicating the number of hours the participant typically sleeps per night.

  69. tn_bedtime A character variable representing the participant’s usual bedtime, to be entered in 24-hour format (e.g., "22:30", "00:00").

  70. tn_anchor_recall_1 A numeric (double) variable recording the number (either 42 or 242) that the participant was previously asked to memorize and later recall. It is used to test memory for the anchor manipulation.

  71. combined_admired A combined numeric (double) variable reflecting how much a participant wants to be admired by others, rated on a 1–6 Likert scale (1 = not at all, 6 = very much).

  72. combined_feared A combined numeric (double) variable reflecting how much a participant wants to be feared by others, rated on a 1–6 Likert scale (1 = not at all, 6 = very much).

  73. combined_loved A combined numeric (double) variable reflecting how much a participant wants to be loved by others, rated on a 1–6 Likert scale (1 = not at all, 6 = very much).

  74. combined_respected A combined numeric (double) variable reflecting how much a participant wants to be respected by others, rated on a 1–6 Likert scale (1 = not at all, 6 = very much).

  75. c7_pess_opti A numeric (double) variable capturing a participant’s self-rated tendency toward pessimism versus optimism, on a 7-point scale (1 = very pessimistic, 7 = very optimistic).

  76. c7_story_list A numeric (double) variable indicating how much a participant enjoys listening to or reading stories, rated from 1 (not at all) to 7 (very much).

  77. c7_stab_adv A numeric (double) variable indicating a participant’s self-assessed position on a stability versus adventurousness spectrum, rated on a scale from 1 (very stable) to 7 (very adventurous). This variable may indicate personality traits related to risk-taking.

  78. think_reflect A numeric (double) variable representing a participant’s placement on a bipolar scale ranging from 1 ("reflective") to 6 (either "spontaneous" or "intuitive"). The specific version of the 2nd scale anchor is randomly assigned.

  79. think_delib A numeric (double) variable representing a participant’s placement on a bipolar scale ranging from 1 ("deliberative") to 6 (either "intuitive" or "spontaneous"). The specific version of the 2nd scale anchor is randomly assigned.

  80. c4_intro_extrovert A categorical (character) variable indicating a participant's self-rated social orientation: "introverted", "extroverted", or mixed variants such as "extro-intro" or "intro-extro".

  81. tn_favorit_number A numeric (double) variable capturing a participant’s favorite number, in free answer format.

  82. c3_cutlery A categorical (character) variable indicating which piece of cutlery a participant most identifies with. The 3 possible values include "knife", "fork", and "spoon".

  83. c3_rock_paper_scissors A categorical (character) variable capturing a participant's selection in a rock–paper–scissors scenario. The 3 possible values are "rock", "paper", or "scissors".

  84. c5_att_check_2 A numeric (double) variable used as an attention check. Participants were asked to select the number that most resembles the shape of a circle. The correct response is 0, which corresponds to scale option 5. Responses deviating from this may indicate inattentiveness.

  85. c6_barnum_accuracy A numeric (double) variable indicating how accurately a participant rated a generic personality description (i.e., a Barnum statement), on a scale from 1 (poor) to 6 (perfect). This variable is used to assess susceptibility to the so-called Barnum effect (i.e., the tendency to perceive vague and general statements as highly accurate).

  86. t_anchor_recall_2 A numeric (double) variable recording whether a participant correctly remembered a previously presented number (either 42 or 242). This assesses memory and anchoring manipulation success (for a 2nd time).

Other person-related variables:

  87. c9_occupation A categorical (character) variable indicating a participant’s current occupational status (e.g., "student", "employed", "other"). This variable may be used for demographic segmentation.

  88. t_occupation_other A logical variable for free-text input if a participant selected "other" for occupation. This variable captures detailed occupational descriptions not covered by the pre-defined options.

  89. c7_education A categorical (character) variable indicating a participant’s highest completed education level (e.g., "high school", "bachelor", "master"). This variable may be used for demographic segmentation.

  90. t_education_other A logical variable to allow participants to enter their education level in free text (if "other" was selected). This variable enables open-format responses for less common education paths.

  91. c3_current_degree A categorical (character) variable indicating the type of academic degree a participant is currently pursuing (e.g., "bachelor", "master"). This variable provides educational context for other academic measures.

  92. tn_semester A numeric (double) variable indicating the current semester of study reported by a participant (e.g., 1, 6, 10). This variable helps contextualize course experience and academic progress.

  93. c14_studyfield A categorical (character) variable indicating the participant’s field of study (e.g., "psychology", "data science"). This variable is used to examine field-specific attitudes and skills.

  94. t_studyfield_other A character variable capturing free-text responses if the participant selected "other" as their study field. This variable allows classification of less common disciplines.

Preferences for course contents:

  95. c5_pref_stats A numeric (double) variable indicating a participant’s interest in preparing data for statistical analysis, rated on a scale from 1 (no interest) to 5 (absolutely essential).

  96. c5_pref_visualize A numeric (double) variable indicating a participant's interest in data visualization in R, rated on a scale from 1 (no interest) to 5 (absolutely essential).

  97. c5_pref_sims A numeric (double) variable indicating a participant’s interest in using R for simulations and modeling, rated on a scale from 1 (no interest) to 5 (absolutely essential).

  98. c5_pref_shiny A numeric (double) variable capturing how essential a participant considers learning to build interactive web applications using R Shiny. Responses range from 1 (no interest) to 5 (absolutely essential).

  99. c5_pref_scrape A numeric (double) variable capturing how essential a participant considers learning web scraping with R. Responses range from 1 (no interest) to 5 (absolutely essential).

  100. c5_pref_arts A numeric (double) variable capturing how essential a participant considers exploring artistic or creative aspects of data science (e.g., generative art in R). Responses range from 1 (no interest) to 5 (absolutely essential).

Course-related expectations and worries:

  101. t_crs_expect_i2ds_1 A character variable containing free-text input describing a participant’s expectations and hopes for the course Introduction to Data Science 1: Basics (i2ds 1).

  102. t_crs_worry_i2ds_1 A character variable capturing free-text responses describing a participant’s worries and reservations related to the course Introduction to Data Science 1: Basics (i2ds 1).

  103. t_crs_expect_i2ds_2 A character variable containing free-text input describing a participant’s expectations and hopes for the course Introduction to Data Science 2: Applications (i2ds 2).

  104. t_crs_worry_i2ds_2 A character variable capturing free-text input describing a participant’s worries and reservations related to the course Introduction to Data Science 2: Applications (i2ds 2).

  105. t_crs_expect_ds4psy A logical variable containing free-text input describing a participant’s expectations and hopes for the course Data Science for Psychology (ds4psy).

  106. t_crs_worry_ds4psy A logical variable describing a participant’s worries and reservations regarding the course Data Science for Psychology (ds4psy), in free text format.

Variables on prior experience:

  107. c6_exp_math A numeric (double) variable indicating a participant’s self-assessed experience with mathematics, rated on a scale from 1 (no experience) to 6 (extremely experienced).

  108. c6_exp_statistics A numeric (double) variable measuring a participant’s self-assessed experience with statistics, rated on a scale from 1 (no experience) to 6 (extremely experienced).

  109. c6_exp_program A numeric (double) variable indicating a participant’s experience with programming (any programming language), rated on a scale from 1 (no experience) to 6 (extremely experienced).

  110. c6_exp_r A numeric (double) variable indicating a participant’s experience with R programming, rated on a scale from 1 (no experience) to 6 (extremely experienced).

  111. c6_exp_datavisual A numeric (double) variable capturing a participant’s prior experience with data visualization, rated on a scale from 1 (no experience) to 6 (extremely experienced).

Survey feedback:

  112. t_feedback An optional character variable containing general feedback provided by the participant regarding the survey or course. This is an open-ended text field for final comments, impressions, or suggestions.

Session info:

  113. referer URL of referring page.

  114. datetime Date and time of initial survey access.

  115. duration Session duration (in seconds).

  116. date_of_last_access Date and time of final survey access.
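The entries for tn_year, tn_month, and tn_day note that the DOB-related variables can be used to calculate age. A minimal base R sketch of one way to do this follows. The values in dob are hypothetical, the helper age_years() is not part of ds4psy, and the reference date is an assumption (the date given under Format).

```r
# Hypothetical values standing in for tn_year, tn_month, and tn_day.
dob <- data.frame(
  tn_year  = c(1999, 2001, 2000),
  tn_month = c(5, 11, 2),
  tn_day   = c(17, 2, 30)     # 30 February is not a valid date
)

# ISOdate() yields NA for impossible dates, so invalid entries drop out here.
birth_date <- as.Date(ISOdate(dob$tn_year, dob$tn_month, dob$tn_day))

# Assumed reference date for computing age.
ref_date <- as.Date("2025-11-02")

# Age in completed years: year difference, minus 1 if the birthday
# has not yet occurred in the reference year.
age_years <- function(birth, ref) {
  b <- as.POSIXlt(birth)
  r <- as.POSIXlt(ref)
  had_birthday <- (r$mon > b$mon) | (r$mon == b$mon & r$mday >= b$mday)
  (r$year - b$year) - ifelse(had_birthday, 0, 1)
}

dob$age <- age_years(birth_date, ref_date)
dob
```

Building a proper Date first has the side benefit of flagging impossible day and month combinations as NA instead of silently producing an age.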

See the codebook and print version for additional coding details.
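The entry for t_height notes that this variable requires pre-processing because participants used various formats and units. The sketch below shows one possible approach in base R, using only the example formats quoted in that entry. The helper parse_height_cm() is hypothetical, and the rule that values below 3 are metres is an assumption; real responses may contain formats it does not handle.

```r
# Hypothetical helper: convert free-text heights to centimetres.
parse_height_cm <- function(x) {
  x <- trimws(tolower(x))
  out <- rep(NA_real_, length(x))

  # Feet and inches (e.g., "5'11"): convert to centimetres.
  ft_in <- grepl("^[0-9]+'[0-9]*", x)
  feet  <- as.numeric(sub("^([0-9]+)'.*$", "\\1", x[ft_in]))
  inch  <- sub("^[0-9]+'([0-9]*).*$", "\\1", x[ft_in])
  inch  <- ifelse(inch == "", 0, as.numeric(inch))
  out[ft_in] <- (feet * 12 + inch) * 2.54

  # Metric entries: use a decimal point, then strip unit labels.
  metric <- gsub(",", ".", x[!ft_in], fixed = TRUE)
  value  <- suppressWarnings(as.numeric(gsub("[^0-9.]", "", metric)))

  # Assumption: values below 3 are metres, larger values are centimetres.
  out[!ft_in] <- ifelse(value < 3, value * 100, value)
  out
}

# The example formats from the t_height description.
heights <- c("1.80", "180 cm", "1,80m", "5'11")
height_cm <- parse_height_cm(heights)
round(height_cm, 1)
```

Entries that cannot be interpreted come back as NA, which makes them easy to inspect by hand before any analysis.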

Missing values are represented as NA values in the data. These can be due to a participant not providing a response to an item or to an item not being applicable to this participant.
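A short base R sketch of how such missing values might be inspected follows. The data frame toy is a hypothetical stand-in with three of the column names listed above. Recoding the 0 ("not ranked") code of the ranking items to NA is an optional analysis choice assumed here, not something the dataset does itself.

```r
# Hypothetical toy data using column names from the list of variables.
toy <- data.frame(
  c7_apple = c(1, 0, 3),
  c7_mud   = c(7, 0, NA),
  tn_sleep = c(7.5, NA, 8)
)

# Number of missing values per variable.
na_counts <- colSums(is.na(toy))

# Ranking items code "not ranked" as 0. Treating 0 as missing
# (an assumption of this sketch) keeps it out of rank averages.
rank_vars <- c("c7_apple", "c7_mud")
toy[rank_vars] <- lapply(toy[rank_vars], function(x) {
  x[x %in% 0] <- NA   # %in% returns FALSE for NA, so existing NAs are untouched
  x
})

# Mean rank per item, ignoring missing values.
mean_rank <- colMeans(toy[rank_vars], na.rm = TRUE)
mean_rank
```

Most summary functions in R return NA when any input is NA, so passing na.rm = TRUE (or filtering rows first) is usually needed when summarizing these variables.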

See Also

Other datasets: Bushisms, Trumpisms, countries, data_1, data_2, data_t1, data_t1_de, data_t1_tab, data_t2, data_t3, data_t4, dt_10, exp_num_dt, exp_wide, falsePosPsy_all, fame, flowery, fruits, outliers, pi_100k, posPsy_AHI_CESD, posPsy_long, posPsy_p_info, posPsy_wide, t3, t4, t_1, t_2, t_3, t_4, table6, table7, table8, table9, tb