This function uses a Large Language Model (LLM) to automatically classify the variables
in a dataset by role (quasi-identifiers, sensitive variables, numerical variables, and others)
and passes the resulting classification to createSdcObj(). A data sharing policy can be
supplied as context. A codebook argument is also accepted, but it is currently not parsed.
AI_createSdcObj(
  dat,
  codebook = NULL,
  policy = c("open", "restricted", "confidential"),
  model = NULL,
  api_key = NULL,
  provider = c("openai", "anthropic", "custom"),
  base_url = NULL,
  confirm = TRUE,
  info = TRUE,
  ...
)

Returns an object of class sdcMicroObj.
dat: A data.frame containing the microdata.
codebook: Optional path to a codebook file. Currently not parsed; a placeholder for future use.
policy: Data sharing policy context: "open" (default), "restricted", or "confidential".
model: The LLM model to use. If NULL, a default is chosen per provider.
api_key: API key. If NULL, it is auto-detected from environment variables.
provider: LLM provider: "openai" (default), "anthropic", or "custom" for any OpenAI-compatible endpoint (Ollama, Azure, vLLM, Groq, etc.).
base_url: Base URL for the API endpoint. Required when provider = "custom".
confirm: Logical; if TRUE (default) and the session is interactive, shows the proposed classification and asks for confirmation before creating the sdcMicroObj.
info: Logical; if TRUE (default), prints the LLM classification result and reasoning.
...: Additional arguments passed to createSdcObj().
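
The defaulting rules described for the provider, API key, and base URL arguments can be pictured with a small base R sketch. This is not the sdcMicro implementation: resolve_llm_settings() is a hypothetical helper, and the environment variable names (OPENAI_API_KEY, ANTHROPIC_API_KEY, LLM_API_KEY) are assumptions based on common convention, since the documentation does not name them.

```r
# Hypothetical helper, NOT part of sdcMicro: illustrates how the documented
# defaults for provider, api_key, and base_url could be resolved.
resolve_llm_settings <- function(provider = c("openai", "anthropic", "custom"),
                                 api_key = NULL,
                                 base_url = NULL) {
  # match.arg() selects the first choice ("openai") when provider is not given
  provider <- match.arg(provider)

  # Documented rule: a custom endpoint needs an explicit base URL
  if (provider == "custom" && is.null(base_url)) {
    stop("base_url is required when provider = \"custom\"")
  }

  # Assumed environment variable names (conventional, not confirmed by the docs)
  env_var <- switch(provider,
    openai    = "OPENAI_API_KEY",
    anthropic = "ANTHROPIC_API_KEY",
    custom    = "LLM_API_KEY"
  )

  # Fall back to the environment only when no key was passed explicitly
  if (is.null(api_key)) {
    api_key <- Sys.getenv(env_var, unset = NA_character_)
    if (is.na(api_key)) api_key <- NULL
  }

  list(provider = provider, api_key = api_key, base_url = base_url)
}

# Usage: an explicit key takes precedence over the environment
settings <- resolve_llm_settings(provider = "anthropic", api_key = "my-key")
settings$provider
```

Using match.arg() mirrors the way the usage signature lists the allowed values as a character vector, with the first entry acting as the default.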
Author: Matthias Templ
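if (FALSE) {
  # Not run: the call contacts an LLM provider and needs a valid API key
  library(sdcMicro)
  data(testdata)
  # Let the LLM propose variable roles, then build the sdcMicroObj
  sdc <- AI_createSdcObj(dat = testdata, policy = "open")
  sdc
}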
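
For comparison, the step that AI_createSdcObj() automates can be done by hand by passing variable roles directly to createSdcObj(). The sketch below assumes the sdcMicro package is installed and uses its bundled testdata. The role assignment is hand-picked for illustration and is not the output of the LLM, which may propose a different classification.

```r
library(sdcMicro)

data(testdata)

# Hand-picked roles for illustration only; an LLM may propose something different
key_vars <- c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex")
num_vars <- c("expend", "income", "savings")

sdc_manual <- createSdcObj(
  dat       = testdata,
  keyVars   = key_vars,          # categorical quasi-identifiers
  numVars   = num_vars,          # continuous key variables
  weightVar = "sampling_weight", # sampling weights
  hhId      = "ori_hid"          # household identifier
)

sdc_manual
```

Both routes end in the same kind of object, so downstream sdcMicro functions can be applied to either result in the same way.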