
XPath Expression - ( XPath Tutorial )

XPath is a query language for navigating an XML document and selecting specific nodes and attributes within it. Many programming languages provide XPath support for extracting information from XML documents. This tutorial covers the basics of XPath expressions.

Syntax

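An absolute XPath expression starts with a forward slash (/), which denotes the document root, followed by a series of steps (usually node names) separated by slashes. A relative expression omits the leading slash and is evaluated from the current context node. Predicates in square brackets, together with operators and functions, filter the nodes selected by a step. Here is the basic syntax of an absolute XPath expression: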

/first-node/second-node[@attribute='value']/third-node
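
The pattern above can be tried out with a small sketch. It assumes Python's standard-library xml.etree.ElementTree module, which supports only a subset of XPath and evaluates paths relative to an element, so the leading /first-node becomes "." because the parsed root already is that element. The document and its contents are hypothetical and exist only to mirror the names in the pattern.

```python
import xml.etree.ElementTree as ET

# Hypothetical document whose element names mirror the syntax pattern above.
DOC = """
<first-node>
  <second-node attribute="value"><third-node>kept</third-node></second-node>
  <second-node attribute="other"><third-node>skipped</third-node></second-node>
</first-node>
"""

root = ET.fromstring(DOC)  # root is the <first-node> element

# Relative form of /first-node/second-node[@attribute='value']/third-node
matches = root.findall("./second-node[@attribute='value']/third-node")
matched_text = [node.text for node in matches]
print(matched_text)
```

Only the third-node under the second-node whose attribute equals 'value' is selected; the predicate filters the middle step before the path continues.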

Example

Suppose we have the following XML document:

<books>
   <book id="1">
      <title>Harry Potter and the Philosopher's Stone</title>
      <author>J.K. Rowling</author>
      <price>12.99</price>
   </book>
   <book id="2">
      <title>The Lord of the Rings</title>
      <author>J.R.R. Tolkien</author>
      <price>20.99</price>
   </book>
   <book id="3">
      <title>The Hitchhiker's Guide to the Galaxy</title>
      <author>Douglas Adams</author>
      <price>9.99</price>
   </book>
</books>

Here are some examples of XPath expressions:

Expression                      Result
/books/book                     Selects all book nodes
/books/book[2]                  Selects the second book node
/books/book[@id='1']            Selects the book node with id="1"
/books/book/price[text()>10]    Selects all price nodes that have a value greater than 10

Explanation

In the first example, the XPath expression /books/book selects all book nodes in the XML document. The second example, /books/book[2], selects only the second book node; positions in XPath are counted from 1, not 0. In the third example, /books/book[@id='1'], the predicate [@id='1'] keeps only the book node whose id attribute equals "1". Finally, the fourth example, /books/book/price[text()>10], uses the text() node test to take the text content of each price node, which XPath converts to a number for the comparison. It selects only the price nodes that have a value greater than 10, here 12.99 and 20.99.
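
As a rough check, the four expressions can be run against the sample document. This sketch assumes Python's standard-library xml.etree.ElementTree, which implements only part of XPath: paths are written relative to the root element (so /books/book becomes ./book), and numeric comparisons such as text()>10 are not supported, so the last selection is filtered in Python instead. Variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<books>
   <book id="1">
      <title>Harry Potter and the Philosopher's Stone</title>
      <author>J.K. Rowling</author>
      <price>12.99</price>
   </book>
   <book id="2">
      <title>The Lord of the Rings</title>
      <author>J.R.R. Tolkien</author>
      <price>20.99</price>
   </book>
   <book id="3">
      <title>The Hitchhiker's Guide to the Galaxy</title>
      <author>Douglas Adams</author>
      <price>9.99</price>
   </book>
</books>"""

root = ET.fromstring(BOOKS_XML)  # root is the <books> element

all_books = root.findall("./book")           # /books/book
second_book = root.findall("./book[2]")      # /books/book[2]
book_id_1 = root.findall("./book[@id='1']")  # /books/book[@id='1']

# ElementTree cannot evaluate text()>10, so select every price node
# and apply the numeric comparison in Python (a substitute, not XPath).
expensive = [p for p in root.findall("./book/price") if float(p.text) > 10]

print(len(all_books))
print(second_book[0].findtext("title"))
print(book_id_1[0].findtext("author"))
print([p.text for p in expensive])
```

A library with full XPath 1.0 support would accept the fourth expression directly; the Python-side filter here is only a workaround for the standard library's limited dialect.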

Use

XPath expressions are used in many programming languages to extract information from XML documents. For example, in Python, we can use the lxml library to parse an XML document and extract information using XPath expressions. XPath expressions can also be used in web scraping to extract information from HTML documents.
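
A typical extraction task might look like the sketch below. lxml exposes full XPath 1.0 through its xpath() method; to stay free of third-party dependencies, this example instead assumes the standard-library xml.etree.ElementTree and its smaller XPath subset. The extract_books helper and the shortened two-book document are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# Shortened version of the sample document (two books only).
XML = """<books>
   <book id="2">
      <title>The Lord of the Rings</title>
      <price>20.99</price>
   </book>
   <book id="3">
      <title>The Hitchhiker's Guide to the Galaxy</title>
      <price>9.99</price>
   </book>
</books>"""


def extract_books(xml_text):
    """Turn each <book> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    records = []
    for book in root.iterfind("./book"):  # relative form of /books/book
        records.append({
            "id": book.get("id"),                     # the id attribute
            "title": book.findtext("title"),          # text of <title>
            "price": float(book.findtext("price")),   # text of <price> as a number
        })
    return records


records = extract_books(XML)
cheapest = min(records, key=lambda r: r["price"])
print(cheapest["title"])
```

Converting the selected nodes into dictionaries early keeps the XPath-specific code in one place, so the rest of a program can work with ordinary Python data.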

Important Points

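  • Absolute XPath expressions start with a forward slash (/); relative expressions are evaluated from the current node.
  • Nodes are selected using node names separated by slashes.
  • Nodes can be filtered by attribute value using a predicate ([@attribute='value']); @attribute on its own selects the attribute itself.
  • Functions and operators can be used inside predicates to filter and select nodes.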

Summary

XPath is a powerful language for selecting nodes in an XML document. It provides a way to access specific nodes and attributes. XPath expressions can be used in many programming languages to extract information from XML documents and HTML documents.
