XML Full-Text Search and Scoring
- Sihem Amer-Yahia | AT&T Labs Research - USA
One of the key benefits of XML is its ability to represent a mix of structured and text data. Querying XML is a well-explored topic, with powerful database-style query languages such as XPath and XQuery set to become W3C standards. However, these languages are not powerful enough to express full-text search queries. For this reason, we developed TeXQuery, a full-text extension to XPath/XQuery. It provides a rich set of fully composable full-text search primitives, such as keyword and Boolean search, proximity distance, stemming, and regular expressions, and combines them gracefully with structured search in XPath/XQuery. TeXQuery is the precursor of XQuery Full-Text, the full-text extension to XPath 2.0 and XQuery 1.0 currently being developed by the W3C. TeXQuery also supports a flexible scoring construct that lets users express queries such as “return the top 20 elements, ranked by their relevance to some structural conditions, that contain 3 occurrences of some keywords within some distance of each other”. I will present a family of scoring methods for XML that are inspired by tf*idf and that take both content and structure into account when scoring answers to XML queries.
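To make the idea of tf*idf-inspired scoring over XML elements concrete, here is a minimal Python sketch. It is not TeXQuery, XQuery Full-Text, or the scoring methods presented in the talk: the sample document, the `rank` function, and the smoothed idf formula are all assumptions chosen for illustration. A simple path selects the candidate elements (the structural condition), and each element is then scored by the keywords found in its text content.

```python
import math
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this illustration.
DOC = """<library>
  <book><title>XML search</title><p>full-text search over XML data</p></book>
  <book><title>Databases</title><p>query processing and indexing</p></book>
  <book><title>Cooking</title><p>search for recipes</p></book>
</library>"""


def tokens(elem):
    """Lowercased word tokens from all text nested inside an element."""
    return re.findall(r"[a-z0-9]+", " ".join(elem.itertext()).lower())


def rank(root, path, keywords, k):
    """Return the top-k (score, element) pairs among elements matching path.

    Each element is scored with a plain tf*idf sum, treating every
    matching element as one "document". This is an assumed, simplified
    formula, not the scoring described in the talk.
    """
    elems = root.findall(path)          # structural condition
    bags = [tokens(e) for e in elems]   # content of each candidate
    n = len(bags)
    scored = []
    for elem, bag in zip(elems, bags):
        score = 0.0
        for kw in keywords:
            tf = bag.count(kw)                        # term frequency
            df = sum(1 for b in bags if kw in b)      # element frequency
            if tf and df:
                score += tf * math.log(1 + n / df)    # smoothed idf
        if score > 0:
            scored.append((score, elem))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


root = ET.fromstring(DOC)
for score, book in rank(root, "book", ["xml", "search"], k=20):
    print(round(score, 3), book.findtext("title"))
```

In this sketch the structure only filters candidates, whereas the talk describes methods in which structure also contributes to the score. Proximity and occurrence-count conditions, such as “3 occurrences within some distance”, are likewise left out.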
Speaker Details
Sihem Amer-Yahia is a Member of Technical Staff at AT&T Labs Research. She received her Ph.D. in Computer Science from the University Paris XI-Orsay and INRIA. She has worked on various issues related to XML query processing. Sihem is a co-editor of the XQuery Full-Text language specification and use cases, published in April 2005 by the Full-Text Task Force in the W3C, whose charter is to extend XQuery with full-text search and ranking capabilities. She is currently involved in the GalaTex project (www.galaxquery.org/galatex), a conformance implementation of XQuery Full-Text.