searchAnalyzeR (version 0.1.0)

calc_tes: Calculate Term Effectiveness Score

Description

Calculates a balanced effectiveness score for individual search terms using the harmonic mean of precision and coverage. This provides a single metric to evaluate how well each term performs in retrieving relevant articles.

Usage

calc_tes(term_analysis, score_name = "tes")

Value

The input data frame with one added column, named by score_name, holding the effectiveness score for each term

Arguments

term_analysis

Data frame returned by term_effectiveness()

score_name

Name for the new score column (default: "tes")

Details

The Term Effectiveness Score (TES) is calculated as: $$TES = 2 \times \frac{precision \times coverage}{precision + coverage}$$

Where:

  • Precision: Proportion of retrieved articles that are relevant

  • Coverage: Proportion of term-specific relevant articles that were retrieved

This differs from the traditional F1 score in that it uses coverage (term-specific relevance) rather than recall (overall strategy relevance).
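As a rough illustration of the formula above, the harmonic mean can be computed directly in base R. This is a minimal sketch, not the package's implementation: `tes_sketch` is a hypothetical helper, and returning 0 when both inputs are 0 is an assumption made here to avoid dividing by zero.

```r
# Hypothetical helper (not part of searchAnalyzeR):
# harmonic mean of precision and coverage, vectorized over both inputs.
tes_sketch <- function(precision, coverage) {
  denom <- precision + coverage
  # Assumption: score is 0 when precision and coverage are both 0
  ifelse(denom > 0, 2 * precision * coverage / denom, 0)
}

tes_sketch(0.5, 0.5)  # balanced inputs: 0.5
tes_sketch(0.9, 0.1)  # unbalanced inputs: 0.18
```

Because the harmonic mean is pulled toward the smaller input, a term with high precision but low coverage (or the reverse) scores well below the arithmetic mean of the two values.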

Key Differences from F1 Score:

  • F1 Score: Harmonic mean of precision and recall (strategy-level performance)

  • TES: Harmonic mean of precision and coverage (term-level performance)

  • Recall: Relevant articles found / All relevant articles

  • Coverage: Relevant articles found / Term-specific relevant articles
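The two ratios above share a numerator and differ only in the denominator, which a small arithmetic sketch can make concrete. All counts below are invented for illustration and do not come from the package or any real search.

```r
# Hypothetical counts for a single search term (illustrative only)
relevant_found         <- 4   # relevant articles the term retrieved
all_relevant           <- 20  # all relevant articles for the whole strategy
term_specific_relevant <- 5   # relevant articles specific to this term

recall   <- relevant_found / all_relevant            # 0.2
coverage <- relevant_found / term_specific_relevant  # 0.8
```

With these counts the term looks weak when judged by recall but strong when judged by coverage, which is the distinction the TES is meant to capture.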

See Also

term_effectiveness for calculating term precision and coverage

calc_precision_recall for strategy-level F1 scores

Examples

library(searchAnalyzeR)

# Fix the random seed so the sampled titles and abstracts are reproducible
set.seed(123)

# Create sample term analysis
terms <- c("diabetes", "treatment", "clinical")
search_results <- data.frame(
  id = paste0("art", 1:20),
  title = paste("Study on", sample(terms, 20, replace = TRUE)),
  abstract = paste("Research about", sample(terms, 20, replace = TRUE))
)
gold_standard <- paste0("art", c(1, 3, 5, 7, 9))

# Analyze term effectiveness
term_analysis <- term_effectiveness(terms, search_results, gold_standard)

# Calculate effectiveness scores
term_scores <- calc_tes(term_analysis)
print(term_scores[order(term_scores$tes, decreasing = TRUE), ])