The coraAI data consists of a response, a journal indicator
  matrix, and a co-citation network. This data is a subset of the Cora
  text mining project (see the reference).
  
  The observations are text documents: 879 published
  papers about either Artificial Intelligence (AI) or Machine
  Learning (ML).  The journal name for each document is available
  (8 journals and an "other" category). The observed co-citation graph
  is also available, where each vertex is a document (observation) and
  each edge weight is the count of citations that the two documents it
  joins have in common.  The goal is to use both the text information
  and the co-citation information to predict the paper subject (AI or ML).
  Another interesting problem might be to predict the journal of the
  paper given the text information and the categorization.
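
  To make the edge definition concrete, the following is a minimal sketch of
  one way co-citation counts can be computed. The source does not state how
  the graph in this data set was built, so the incidence matrix B, its
  values, and the reading of "citations in common" as shared cited works
  are all assumptions made for illustration.

```r
# Made-up incidence matrix (assumption, not the coraAI data):
# rows are documents, columns are cited works; 1 means "document cites work".
B <- matrix(c(1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), paste0("ref", 1:4)))

# Entry (i, j) of B %*% t(B) counts the works cited by both document i and j.
W <- tcrossprod(B)

# Remove self-loops so the diagonal does not count a document against itself.
diag(W) <- 0

print(W)
```

  Here W plays the role of a weighted adjacency matrix: a larger entry means
  the two documents share more citations.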
data(coraAI)

The coraAI data consists of three objects, each discussed next.

  class: categorization of the document (observation) as either
  AI or ML.  Typically the response.

  journals: indication of the document as published in a specific
  journal (other, artificial-intelligence, machine-learning,
  nueral-computing, ieee-trans-Nnet, ieee-tpami,
  j-artificial-intelligence-research, ai-magazine, JASA).

  cite: the adjacency matrix of the co-citation network for these
  879 documents.

The spa is particularly appealing for this data since it fits a function directly to the graph and a coefficient vector to the journals. Other approaches require conversion of the journal information into a graph for processing, and it is unclear how to do that when the data is a binary design matrix.
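
Since the three objects describe the same documents, a quick consistency check can be useful before modelling. The sketch below uses a hypothetical helper, check_shapes, which is not part of the spa package, and runs it on small made-up stand-ins. It assumes the response has one entry per document, the journal matrix has one row per document, and the adjacency matrix is square and symmetric.

```r
# Hypothetical helper (not part of spa): checks that response, journal
# indicators, and graph all refer to the same number of documents.
check_shapes <- function(y, x, g) {
  n <- length(y)
  stopifnot(
    nrow(x) == n,                      # one row of indicators per document
    nrow(g) == n, ncol(g) == n,        # square adjacency matrix
    isSymmetric(unname(as.matrix(g)))  # co-citation counts are symmetric
  )
  invisible(n)
}

# Made-up stand-ins: 4 documents, 3 journal columns (values are illustrative).
y <- factor(c("AI", "ML", "ML", "AI"))
x <- diag(3)[c(1, 2, 2, 3), ]          # each row flags exactly one journal
g <- matrix(c(0, 2, 0, 1,
              2, 0, 3, 0,
              0, 3, 0, 0,
              1, 0, 0, 0), nrow = 4, byrow = TRUE)

n_docs <- check_shapes(y, x, g)

# With the spa package installed, the same check could be tried on the data
# described above (assuming the objects load under these names):
# library(spa); data(coraAI); check_shapes(class, journals, cite)
```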
M. Culp (2011). spa: A Semi-Supervised R Package for Semi-Parametric Graph-Based Estimation. Journal of Statistical Software, 40(10), 1-29. URL http://www.jstatsoft.org/v40/i10/.