eXist XPath Extensions


One of the really cool things about eXist is the XPath extensions for fulltext searching. They mimic (using XPath) the stuff that is done in XStreamDB via XQuery.

I can do stuff like:

document(*)//text() &= "*image*"

and eXist will return every XML document (from its entire set of collections) that contains the string “image” somewhere in it. The match could be in /lom/general/title/langstring/Images Of Bangalore, or /lom/technical/format/image/jpeg, or wherever. It doesn’t care. And it’s very fast.

What’s more, I can do stuff like:

document(*)/*[ //format &= "*image*" and //text() &= "*earth*"]

which says: find me XML documents that have “image” somewhere in a “format” element (could be, say, /lom/technical/format) and contain the string “earth” somewhere (like, say, /lom/general/title/langstring/Earth At Night or /lom/general/title/langstring/Earthquakes).
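To make the logic of that predicate concrete, here is a rough Python sketch of what the query is asking for. It is only a toy model of the matching rule, not how eXist evaluates anything: the sample records, the function names, and the word-splitting rule are all my own assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase


def words(text):
    """Split text into lowercase word tokens (an assumed, simplified tokenizer)."""
    return re.findall(r"\w+", text.lower())


def text_matches(text, pattern):
    """True if any word in the text matches the wildcard pattern."""
    return any(fnmatchcase(word, pattern.lower()) for word in words(text))


def matches_query(xml_source):
    """Toy model of: /*[ //format &= "*image*" and //text() &= "*earth*" ]"""
    root = ET.fromstring(xml_source)
    # "image" must appear inside some <format> element...
    in_format = any(
        text_matches("".join(el.itertext()), "*image*")
        for el in root.iter("format")
    )
    # ...and "earth" must appear in some text node anywhere in the document.
    anywhere = any(text_matches(t, "*earth*") for t in root.itertext())
    return in_format and anywhere


def make_record(title, fmt):
    """Build a hypothetical, heavily trimmed LOM-like record."""
    return (
        "<lom><general><title><langstring>" + title + "</langstring></title>"
        "</general><technical><format>" + fmt + "</format></technical></lom>"
    )


earth_image = make_record("Earth At Night", "image/jpeg")
other_image = make_record("Images Of Bangalore", "image/jpeg")
earth_page = make_record("Earthquakes", "text/html")

for record in (earth_image, other_image, earth_page):
    print(matches_query(record))
```

Only the first record satisfies both conditions: the second has an image format but no “earth”, and the third mentions “earth” but is not an image.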

I can also do something like:

document(*)//text() &= "*image* *kyoto*"

Which will give me different results than

document(*)//text() &= "*image* *kyoto* *relig*"

because the second query also requires a match for “relig” (religion, religious, whatever). In this case, a Buddhist temple in Kyoto is returned, as opposed to the Kyoto Accord presentations at the University of Calgary, which are returned by the first query.

The fulltext extension-based queries are amazingly fast. (The &= operator means “boolean and”, so every search term has to match; you can also use the |= operator for “boolean or”, where any one term is enough.) I’m getting results from rather complicated test queries on the entire 3600+ CAREO record set in a fraction of a second. Nice.
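If the difference between “boolean and” and “boolean or” matching is hard to picture, this small Python sketch models it. It is a toy illustration under my own assumptions (the function names, the sample strings, and the word-splitting rule are made up), and says nothing about how eXist's index actually works.

```python
import re
from fnmatch import fnmatchcase


def tokens(text):
    """Split text into lowercase word tokens (an assumed, simplified tokenizer)."""
    return re.findall(r"\w+", text.lower())


def term_hits(text, term):
    """True if any word in the text matches one wildcard search term."""
    return any(fnmatchcase(word, term) for word in tokens(text))


def match_all(text, query):
    """'Boolean and': every whitespace-separated term must match some word."""
    return all(term_hits(text, term) for term in query.lower().split())


def match_any(text, query):
    """'Boolean or': at least one term must match some word."""
    return any(term_hits(text, term) for term in query.lower().split())


# Hypothetical text content standing in for two records.
temple = "Images of a Buddhist temple in Kyoto, an important religious site"
accord = "Kyoto Accord presentation with images from the University of Calgary"

for text in (temple, accord):
    print(match_all(text, "*image* *kyoto*"),
          match_all(text, "*image* *kyoto* *relig*"),
          match_any(text, "*relig* *accord*"))
```

Adding a term to an “and” query can only shrink the result set, which is why the three-term query drops the Kyoto Accord text, while an “or” query keeps anything that hits at least one term.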


Creative Commons License
This work is licensed under a Creative Commons Attribution 2.5 Canada License.