## Parse XML

Work with XML files using a simple, consistent interface. Built on top of the 'libxml2' C library.

# xml2

The xml2 package is a binding to libxml2, making it easy to work with HTML and XML from R. The API is somewhat inspired by jQuery.

## Installation

You can install xml2 from CRAN,

    install.packages("xml2")

or you can install the development version from GitHub, using devtools:

    # install.packages("devtools")
    devtools::install_github("r-lib/xml2")


## Usage

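    library("xml2")

    # Parse an XML string and inspect the resulting document
    x <- read_xml("<foo> <bar> text <baz/> </bar> </foo>")
    x

    xml_name(x)
    xml_children(x)
    xml_text(x)
    xml_find_all(x, ".//baz")

    # Parse an HTML string; the same accessors work on HTML documents
    h <- read_html("<html><p>Hi <b>!</b></p></html>")
    h

    xml_name(h)
    xml_text(h)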


There are three key classes:

• xml_node: a single node in a document.

• xml_document: the complete document. Acting on a document is usually the same as acting on the root node of the document.

• xml_nodeset: a set of nodes within the document. Operations on xml_nodesets are vectorised: the operation is applied to each node in the set.
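
The three classes can be seen with a small sketch like the one below. The sample XML string and the variable names (`doc`, `first_bar`, `bars`) are made up for illustration; only the xml2 functions themselves come from the package.

```r
library(xml2)

# Illustrative input; any well-formed XML would do
doc <- read_xml("<foo><bar>one</bar><bar>two</bar></foo>")

# The parsed document as a whole
class(doc)

# A single node: the first <bar> element
first_bar <- xml_find_first(doc, ".//bar")
class(first_bar)

# A nodeset: every <bar> element in the document
bars <- xml_find_all(doc, ".//bar")
class(bars)

# Accessors are vectorised over a nodeset and return one value per node
bar_text <- xml_text(bars)
bar_text
```

Because the accessors are vectorised, there is usually no need to loop over a nodeset by hand.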

## Compared to the XML package

xml2 has similar goals to the XML package. The main differences are:

• xml2 takes care of memory management for you. It will automatically free the memory used by an XML document as soon as the last reference to it goes away.

• xml2 has a very simple class hierarchy, so you don't need to think about exactly what type of object you have; xml2 will just do the right thing.

• More convenient handling of namespaces in XPath expressions - see xml_ns() and xml_ns_strip() to get started.
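
As a rough illustration of the namespace helpers mentioned in the last point, the sketch below queries a document that declares a default namespace. The document content and the `http://example.com/ns` URI are invented for the example.

```r
library(xml2)

# Illustrative document with a default namespace (the URI is hypothetical)
doc <- read_xml(
  '<root xmlns="http://example.com/ns"><item>a</item><item>b</item></root>'
)

# xml_ns() lists the namespaces in use; default namespaces get
# generated prefixes such as d1
ns <- xml_ns(doc)
ns

# An unprefixed XPath name does not match elements in a default namespace
n_plain <- length(xml_find_all(doc, ".//item"))

# Using the generated prefix together with the namespace map does match
n_prefixed <- length(xml_find_all(doc, ".//d1:item", ns))

# Alternatively, strip the default namespace (this modifies doc in place)
# and query with plain names
xml_ns_strip(doc)
n_stripped <- length(xml_find_all(doc, ".//item"))

c(plain = n_plain, prefixed = n_prefixed, stripped = n_stripped)
```

Stripping is convenient for quick exploration, while passing the namespace map keeps the document unchanged.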

## Functions in xml2

| Name | Description |
| --- | --- |
| xml_new_document | Create a new document, possibly with a root node |
| xml_serialize | Serializing XML objects to connections. |
| xml_dtd | Construct a document type definition |
| xml_find_all | Find nodes that match an XPath expression. |
| xml_replace | Modify a tree by inserting, replacing or removing nodes |
| xml_children | Navigate around the family tree. |
| xml_set_namespace | Set the node's namespace |
| xml_comment | Construct a comment node |
| xml_structure | Show the structure of an HTML/XML document. |
| xml2_example | Get path to an xml2 example |
| xml_attr | Retrieve an attribute. |
| xml_cdata | Construct a CDATA node |
| xml_url | The URL of an XML document |
| xml_validate | Validate XML schema |
| xml_missing | Construct a missing xml object |
| xml_path | Retrieve the XPath to a node |
| xml_ns_strip | Strip the default namespaces from a document |
| xml_name | The (tag) name of an XML element. |
| xml_text | Extract or modify the text |
| xml_type | Determine the type of a node. |
| xml_ns | XML namespaces. |
| url_parse | Parse a URL into its component pieces. |
| as_xml_document | Coerce an R list to XML nodes. |
| url_escape | Escape and unescape URLs. |
| read_xml | Read HTML or XML. |
| download_xml | Download an HTML or XML file |
| as_list | Coerce XML nodes to a list. |
| url_absolute | Convert between relative and absolute URLs. |
| write_xml | Write XML or HTML to disk. |
| xml_document-class | Register S4 classes |
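
A few of the functions listed above can be combined as in the following sketch. The sample document, the attribute name `id`, and the temporary file are assumptions made for the example, not part of the package.

```r
library(xml2)

# Illustrative document
doc <- read_xml(
  '<library><book id="b1"><title>R</title></book><book id="b2"><title>XML</title></book></library>'
)

books <- xml_find_all(doc, ".//book")

# xml_attr(): read the id attribute of each <book>
ids <- xml_attr(books, "id")
ids

# xml_path(): the XPath that locates each node
xml_path(books)

# xml_structure(): print an outline of the document
xml_structure(doc)

# write_xml() and read_xml(): round trip through a temporary file
tmp <- tempfile(fileext = ".xml")
write_xml(doc, tmp)
doc2 <- read_xml(tmp)
titles <- xml_text(xml_find_all(doc2, ".//title"))
titles
```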