NaileR (version 1.2.3)

nail_sort: Sort textual data

Description

Group textual data according to their similarity, in a setting where assessors have each commented on a set of stimuli. The grouping is done separately for each assessor.

Usage

nail_sort(
  dataset,
  name_size = 3,
  stimulus_id = "individual",
  introduction = NULL,
  measure = NULL,
  request = NULL,
  model = "llama3.1",
  nb.clusters = 4,
  generate = FALSE,
  max.attempts = 5
)

Value

A list consisting of:

  • prompt_llm: a list of prompts (one per assessor);

  • res_llm: a list of results (one per assessor);

  • dta_sort: a data frame with the group names.

Arguments

dataset

a data frame where each row is a stimulus and each column is an assessor.

name_size

the maximum number of words in a group name created by the LLM.

stimulus_id

the nature of the stimulus. Customizing it is highly recommended.

introduction

the introduction to the LLM prompt.

measure

the type of measure used in the experiment.

request

the request of the LLM prompt.

model

the model name ('llama3.1' by default).

nb.clusters

the maximum number of clusters the LLM can form per assessor.

generate

a boolean that indicates whether to generate the LLM response. If FALSE, the function only returns the prompt.

max.attempts

the maximum number of attempts made for each column (that is, for each assessor) to obtain the right number of groups from the LLM.
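
To make the expected layout of the dataset argument concrete, here is a minimal sketch of a wide data frame with one row per stimulus and one column per assessor. The object name toy_wide, the column names and the comments themselves are made up for illustration; using row names as stimulus labels is an assumption, not something this page confirms.

```r
# Hypothetical toy data: 4 stimuli (rows) described by 3 assessors (columns).
# Each cell holds one assessor's free-text comment on one stimulus.
toy_wide <- data.frame(
  assessor_1 = c("long and bushy", "short, neat", "thick, untamed", "trimmed stubble"),
  assessor_2 = c("wild", "tidy", "wild", "tidy"),
  assessor_3 = c("lumberjack style", "office look", "viking", "casual"),
  row.names = c("beard_A", "beard_B", "beard_C", "beard_D"),  # assumed stimulus labels
  stringsAsFactors = FALSE
)

dim(toy_wide)       # number of stimuli, then number of assessors
rownames(toy_wide)  # stimulus labels
```

A subset of assessors can then be selected by column, as the beard example does with beard_wide[, 1:5].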

Details

This function uses a while loop to ensure that the LLM gives the right number of groups. Therefore, customizing the stimulus ID, prompt introduction and measure is highly recommended; a clear prompt can help the LLM finish its task faster.
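
The retry idea described above can be sketched in a few lines of base R. This is not the package's implementation: make_fake_llm() and ask_until_valid() are hypothetical helpers, the fake model is deterministic, and the validity check (one label per stimulus, no more groups than allowed) is an assumption about what "the right number of groups" means.

```r
# Hypothetical stand-in for an LLM call: returns a character vector of group labels.
# It gives a malformed answer on the first two calls and a usable one on the third.
make_fake_llm <- function() {
  calls <- 0
  function(prompt) {
    calls <<- calls + 1
    if (calls < 3) c("A", "B") else c("A", "B", "A", "C")
  }
}

# Query repeatedly until the answer has one label per stimulus and at most
# nb_clusters distinct groups, or until max_attempts queries have been made.
ask_until_valid <- function(ask, prompt, n_stimuli, nb_clusters, max_attempts = 5) {
  attempt <- 0
  answer <- NULL
  valid <- FALSE
  while (!valid && attempt < max_attempts) {
    attempt <- attempt + 1
    answer <- ask(prompt)
    valid <- length(answer) == n_stimuli &&
      length(unique(answer)) <= nb_clusters
  }
  list(answer = if (valid) answer else NULL, attempts = attempt, valid = valid)
}

out <- ask_until_valid(make_fake_llm(), "Group these beards.",
                       n_stimuli = 4, nb_clusters = 3)
out$attempts
out$valid
```

Capping the loop with a maximum number of attempts keeps a persistently malformed response from blocking the whole run, which is the role max.attempts appears to play here.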

Examples

if (FALSE) {
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(beard_wide)

intro_beard <- "As a barber, you make
recommendations based on consumers comments.
Examples of consumers descriptions of beards
are as follows."
intro_beard <- gsub('\n', ' ', intro_beard) |>
  stringr::str_squish()

req_beard <- "Each group should contain beards with descriptions
that relate to a similar type of person - not
necessarily the same person, but sharing common traits.
Each group must have a short,
meaningful name that characterizes the person."
req_beard <- gsub('\n', ' ', req_beard) |>
  stringr::str_squish()

res <- nail_sort(beard_wide[, 1:5],
                 name_size = 3,
                 stimulus_id = "beard",
                 introduction = intro_beard,
                 measure = 'the description was',
                 request = req_beard,
                 nb.clusters = 6,
                 generate = TRUE)

cat(res$prompt_llm[[1]])
cat(res$res_llm[[1]])
res$dta_sort
}