Learn R Programming

causalDisco (version 1.0.1)

tfci_run: Run the TFCI Algorithm for Causal Discovery

Description

Use a modification of the FCI algorithm that makes use of background knowledge in the format of a partial ordering. This may, for instance, come about when variables can be assigned to distinct tiers or periods (i.e., a temporal ordering).

Usage

tfci_run(
  data = NULL,
  knowledge = NULL,
  alpha = 0.05,
  test = reg_test,
  suff_stat = NULL,
  method = "stable.fast",
  na_method = "none",
  orientation_method = "conservative",
  directed_as_undirected = FALSE,
  varnames = NULL,
  num_cores = 1,
  ...
)

Value

A Disco object (a list with a caugi and a knowledgeobject).

Arguments

data

A data frame with the observed variables.

knowledge

A knowledge object describing tiers/periods and optional forbidden/required edges. This replaces the legacy order interface and is the preferred way to supply temporal background knowledge.

alpha

The alpha level used as the per-test significance threshold for conditional independence testing.

test

A conditional independence test. The default reg_test uses a regression-based information-loss test. Another available option is cor_test which tests for vanishing partial correlations. User-supplied functions may also be used; see details for the required interface.

suff_stat

A sufficient statistic. If supplied, it is passed directly to the test and no statistics are computed from data. Its structure depends on the chosen test.

method

Skeleton construction method, one of "stable", "original", or "stable.fast" (default). See pcalg::skeleton() for details.

na_method

Handling of missing values, one of "none" (default; error on any NA), "cc" (complete-case analysis), or "twd" (test-wise deletion).

orientation_method

Method for handling conflicting separating sets when orienting edges; must be one of "standard", "conservative" (the default) or "maj.rule". See pcalg::pc() for further details.

directed_as_undirected

Logical; if TRUE, treat any directed edges in knowledge as undirected during skeleton learning. This is due to the fact that pcalg does not allow directed edges in fixedEdges or fixedGaps. Default is FALSE.

varnames

Character vector of variable names. Only needed when data is not supplied and all information is passed via suff_stat.

num_cores

Integer number of CPU cores to use for parallel skeleton learning.

...

Additional arguments passed to pcalg::skeleton() during skeleton construction.

Details

The temporal/tiered background information enters several places in the TFCI algorithm: (1) In the skeleton construction phase, when looking for separating sets \(Z\) between two variables \(X\) and \(Y\), \(Z\) is not allowed to contain variables that are strictly after both \(X\) and \(Y\) in the temporal order (as specified by the knowledge tiers). (2) This also applies to the subsequent phase where the algorithm searches for possible D-SEP sets. (3) Prior to other orientation steps, any cross-tier edges get an arrowhead placed at their latest node.

After this, the usual FCI orientation rules are applied; see pcalg::udag2pag() for details.

Examples

Run this code
data(tpc_example)

kn <- knowledge(
  tpc_example,
  tier(
    child ~ tidyselect::starts_with("child"),
    youth ~ tidyselect::starts_with("youth"),
    oldage ~ tidyselect::starts_with("oldage")
  )
)

# Recommended path using disco()
my_tfci <- tfci(engine = "causalDisco", test = "fisher_z", alpha = 0.05)

disco(tpc_example, my_tfci, knowledge = kn)

# or using my_tfci directly
my_tfci <- my_tfci |> set_knowledge(kn)
my_tfci(tpc_example)

# Also possible: using tfci_run()
tfci_run(tpc_example, test = cor_test, knowledge = kn)

Run the code above in your browser using DataLab