Jan
27
(2005)
Teaching Resources Database
Filed under: Uncategorized. Tags: databases, lcwebsite, php, xml.
I’ve just updated our Teaching Resources database to use a copy of the lightweight asset management system I built for the Pachyderm project.
Previously, the TR database had been developed as a WebObjects application, connecting to an XStreamDB XML database. That performed really well, and made for nice reliable queries, but meant an editing interface was more difficult to develop.
Now that it’s just a simple MySQL database, and a simple PHP script running the queries and interface, it’s easy to manage, and performs quite well.
There are currently 622 teaching resources (books, websites, documents) across 28 teaching-related topics. It is a collection of links gathered by us and the VP Academic’s office from relevant sources around the internet.
Aug
23
(2004)
I’m clearing my whiteboard, and need the space this was occupying, so I’m dumping it here for future reference. The following table compares the time it takes to retrieve XML from various sources (XML databases and the like) as well as to perform various types of processing (nothing, save as file, convert to DOM…). This table was very useful when we were coming up with our current XML storage strategy.
| XML Store | Retrieving | Process | Time (ms) (289 records per query) | Time (ms) (per record) |
|---|---|---|---|---|
| XStreamDB | Minimal XML (handful of elements) | DOM via DOM4JKVC | 3038.9 | 10.52 |
| XStreamDB | Minimal XML (handful of elements) | NSDictionary | 2684.6 | 9.28 |
| XStreamDB | Full LOM (entire XML Document) | DOM4JKVC | 3371.5 | 11.66 |
| XStreamDB | Full LOM (entire XML Document) | null (no processing) | 1854.3 | 6.41 |
| XStreamDB | Full LOM (entire XML Document) | W3CDOM | 3513.8 | 12.16 |
| XStreamDB | Full LOM (entire XML Document) | NSDictionary | N/A * | 14.28 * |
| JUD | Full LOM (entire XML Document) | Save File | 33 minutes / 4056 records | 488.17 |
* This process was flakey at best, and refused to convert some records to NSDictionary objects, so the multiple conversion method failed.
All XStreamDB tests were performed using a WebObjects application that ran some Java code to perform an XQuery against an XStreamDB database containing a copy of the CAREO repository (4056 records).
DOM4JKVC: a simple version of what has become the JavaEOXMLSupport.framework. It uses DOM4J to provide a Key Value Coding interface around a DOM Element, and involves parsing an XML string into a DOM Element.
NSDictionary: Uses WebObjects’ built-in XML-NSDictionary conversion (without mapping file)
W3CDOM: Simple conversion of the XML document into a W3CDOM Document.
Save File: Just save the XML string to the filesystem with no additional processing.
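To make the DOM4JKVC idea above a little more concrete, here is a minimal sketch of a key-value wrapper around a DOM Element. It is not the actual DOM4JKVC or JavaEOXMLSupport code: it assumes the JDK's built-in W3C DOM instead of DOM4J, and the class name `ElementKVC`, the method `valueForKeyPath`, and the `@attribute` convention are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical key-value wrapper around a DOM Element (not the real DOM4JKVC). */
final class ElementKVC {
    private final Element element;

    ElementKVC(Element element) {
        this.element = element;
    }

    /** Parses an XML string and wraps its root element. */
    static ElementKVC parse(String xml) {
        try {
            return new ElementKVC(DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /**
     * Resolves a dotted key path relative to the wrapped element, e.g.
     * "general.title". A key starting with "@" reads an attribute instead.
     * Returns null when any step of the path is missing.
     */
    String valueForKeyPath(String keyPath) {
        Element current = element;
        for (String key : keyPath.split("\\.")) {
            if (key.startsWith("@")) {
                String name = key.substring(1);
                return current.hasAttribute(name) ? current.getAttribute(name) : null;
            }
            current = firstChild(current, key);
            if (current == null) {
                return null;
            }
        }
        return current.getTextContent().trim();
    }

    /** Finds the first child element with the given name, or null. */
    private static Element firstChild(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

With a wrapper like this, calling `valueForKeyPath("general.title")` on a parsed record would walk from the root to the `title` element, which is roughly the kind of access a WebObjects binding expects.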
The JUD test was performed by using a Python script that pulled every document out of the live CAREO repository, and saved them to the local filesystem as .xml files.
All tests were performed on my PowerBook, with the XML being retrieved from a separate server (XStreamDB on our commons webserver, JUD on the U of C IT appserver)
This was nowhere near a fully empirical test, with extra variables popping in all over the place. The goal of this was to give an idea at the order-of-magnitude level of which strategy was fastest, and which was slowest. Basically, I needed to see if pulling the full LOM from XStreamDB and converting the whole shebang into a DOM would kill us. Turns out it’s over an order of magnitude faster than what CAREO had been doing… I also needed to get an idea of the additional time it would take to wrap a DOM Element with the Key Value Coding interface. Turns out that didn’t add anything, and somehow actually shaved some time off (although that is due to the DOM4J vs. W3C class performance).
I was surprised to find that the combination of XStreamDB + DOM was approximately 41 times faster than the JUD (MySQL database to store any XML document by breaking it into Elements, Attributes, and some other meta stuff, and reconstituting it on the fly via a PHP script).
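The per-record figures and the rough speed-up can be rechecked from the totals in the table. This is only a small arithmetic sketch; the class and method names are made up for illustration, and the inputs are the table's own numbers (33 minutes for 4056 JUD records, and the two full-LOM DOM rows at 289 records per query).

```java
/** Recomputes per-record times and the JUD vs. XStreamDB + DOM ratio. */
final class TimingMath {
    /** Milliseconds per record for a batch that took totalMs in total. */
    static double perRecord(double totalMs, int records) {
        return totalMs / records;
    }

    public static void main(String[] args) {
        double judMs = perRecord(33 * 60 * 1000.0, 4056); // JUD: 33 minutes, 4056 records
        double dom4jMs = perRecord(3371.5, 289);          // XStreamDB, full LOM, DOM4JKVC
        double w3cMs = perRecord(3513.8, 289);            // XStreamDB, full LOM, W3CDOM

        System.out.printf("JUD:             %.2f ms/record%n", judMs);
        System.out.printf("JUD vs DOM4JKVC: %.1fx slower%n", judMs / dom4jMs);
        System.out.printf("JUD vs W3CDOM:   %.1fx slower%n", judMs / w3cMs);
    }
}
```

Depending on which DOM row is used as the baseline, the ratio comes out between roughly 40 and 42, which is consistent with the "approximately 41 times" figure.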
Once I’ve got JavaXStreamDBAdaptor.framework and JavaEOXMLSupport.framework polished off a bit more, I’ll add the metrics for their performance to this chart.
Mar
10
(2003)
XML Databases
Filed under: Uncategorized. Tags: xml.
Just came across some really good info on XML databases on Ronald Bourret’s XML website (thanks, Scott!).
XML And Databases
XML Database Products
Feb
7
(2003)
XSLT Theme Processor continues
Filed under: Uncategorized. Tags: xml.
Just cut the XSLT code to a shockingly small 42 lines. Wow.
The XSLT that converts the repository theme xml fragments into WebObjects .wod strings is only 16 lines… The XSLT that produces the .html strings from the same xml fragments weighs in at 26 lines.
Smaller is better. Less code, less to maintain, less to go wrong…
Jan
28
(2003)
Optimus Prime – XSLT Transformation App
Filed under: Uncategorized. Tags: xml.
I’m working on a simple Java application to manage transformations using XSLT via Xalan-J. Not sure if I’ll use Cocoa-Java or Swing. Don’t care a lot about cross-platformability, since it really only has to work on my machine…
I don’t have a lot of experience with Swing (did a simple app a couple of years ago), and have no experience with Cocoa, so I’m kinda ambivalent (although having an excuse to learn Cocoa would be cool, I’m not sure I have the time to take that on). Since I have a brain-dead-simple command line Java app working already, I might just do it in Swing.
Actually, the command-line version works just dandy, but doesn’t have the ability to remember state – which source xml file to use? which xslt? where to put the output? any parameters? open output in another application? which one? …
Update: It seems as though Marc Liyanage has beaten me to the punch with TestXSLT. It seems like it’s almost exactly what I was planning on building. Thanks for saving me some time, Marc! He’s even released the source, so I might be able to tweak if needed. Cool.
I’ll be playing around with TestXSLT (and his BBEdit XSL vocab) over the next couple of days. Should be interesting…
Update 2: It looks like the 2 XSLT libraries used in TestXSLT (Libxslt and Sablotron) handle xsl:import and xsl:include differently than Xalan-J. That is, they appear to fail with relative URIs… Not sure what’s causing this, but it’s not too fatal (yet).
Update 3: Marc has an older version of TestXSLT that he wrote in Java, and which uses Xalan-J, so for testing stuff out in the _real_ xsl library, there’s a tool there. It’s not as polished as TestXSLT 2.5, but it’s better than command-line. I might try to graft a Xalan option into TestXSLT 2.5 using the Cocoa-Java bridge…
Nov
9
(2002)
XSLT = XML the WO Way?
Filed under: Uncategorized. Tags: xml.
I need to be able to present a (potentially interactive) HTML page that is created from an XML file (such as an IMS, DublinCore, METS…). The way I had been doing it was with object modelling – convert the XML file into a set of objects, similar to EOs – relations and all – and then feeding those objects into custom WOComponents for display.
Turns out, there is a much better way. Simply create an .xsl file that contains the logic for each schema (or set of schemas, if similar enough), and just feed the results of the XSLT transformation into a single WOComponent. Very cool stuff.
This basically means that I can have themed metadata presentation (and navigation of content) without having to modify any Java code. Customization can be done on the fly, by people who don’t have access to the source code.
I’ve got a simple test case where I convert any IMS record into an HTML page presenting the details about that record. Easy peasy. Next, to embed the transformation into a WebObjects application, and test it out there.
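As a rough illustration of the transformation step, here is a minimal sketch using the JDK's standard JAXP API (`javax.xml.transform`). The stylesheet and the `record`/`title`/`description` element names are hypothetical stand-ins, far simpler than a real IMS record (which is namespaced), and the class name `RecordToHtml` is invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Turns a (simplified, hypothetical) metadata record into an HTML fragment. */
final class RecordToHtml {
    // Hypothetical stylesheet: one template per schema would live in its own .xsl file.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='/record'>"
        + "<div class='record'><h1><xsl:value-of select='title'/></h1>"
        + "<p><xsl:value-of select='description'/></p></div>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the result as a String. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<record><title>Sample</title>"
                   + "<description>A test record.</description></record>";
        System.out.println(transform(xml, XSL));
    }
}
```

The resulting String is the kind of thing that could be handed to a single display component, so supporting another schema would mean swapping the stylesheet rather than the Java code.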
I’ve come across two challenges with this approach so far. First, the IMS schema (1.2.2) contains an invalid namespace, so Java XML parsers choke when validating. I should be able to turn off validation once in WO. I’m using jEdit to create and test the .xsl file, so I don’t have access to the namespace property.
Second, most of the metadata schemas use the non-xml version of the VCARD specification. Unless I can figure out a way to process VCARD data within the XSL file, I may be hooped here. The Old Way, I just created a java class that understood the VCARD spec and ripped out relevant bits of data. I’ll keep plugging away at this…
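For reference, the Java-side approach to the VCARD text can be sketched in a few lines. This is not the original class, just a simplified illustration under stated assumptions: it treats each line as `NAME[;params]:value`, ignores parameters, keeps only the first value per property, and does not handle folded (continued) lines or grouping. The class name `SimpleVCard` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Very small, incomplete reader for plain-text vCard data. */
final class SimpleVCard {
    /** Parses "NAME[;params]:value" lines into a map keyed by upper-cased property name. */
    static Map<String, String> parse(String vcard) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String line : vcard.split("\\r?\\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a property line
            }
            String name = line.substring(0, colon).trim();
            int semi = name.indexOf(';');
            if (semi >= 0) {
                name = name.substring(0, semi); // drop parameters such as ;TYPE=INTERNET
            }
            name = name.toUpperCase(Locale.ROOT);
            if (name.equals("BEGIN") || name.equals("END")) {
                continue; // structural markers, not data
            }
            props.putIfAbsent(name, line.substring(colon + 1).trim());
        }
        return props;
    }

    public static void main(String[] args) {
        String card = "BEGIN:VCARD\nFN:Jane Doe\nEMAIL;TYPE=INTERNET:jane@example.org\nEND:VCARD";
        System.out.println(parse(card));
    }
}
```

Doing the same inside a stylesheet would likely require string functions such as `substring-before` and `substring-after`, or an extension function that calls out to a class like this one.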
Jul
3
(2002)
I got an updated version of a generic XML document displayer working in WebObjects this morning. Takes an XML document (source as a String) and parses it into a DOM Document.
Then, I have a recursive WOComponent that accepts a Node (say, the root node of the Document, perhaps?) and displays all Elements and Attributes in the hierarchy starting at that Node.
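The recursion itself is simple enough to sketch outside WebObjects. The following is a plain-Java stand-in for the recursive component, assuming the JDK's W3C DOM classes: instead of rendering HTML, it builds an indented text outline of every Element and its Attributes below a given Node. The class name `NodeOutline` and the output format are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Builds an indented outline of the elements and attributes under a node. */
final class NodeOutline {
    /** Parses an XML string (source as a String) into a DOM Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    static String outline(Node start) {
        StringBuilder sb = new StringBuilder();
        append(start, 0, sb);
        return sb.toString();
    }

    /** Emits one line per element, then recurses into the children. */
    private static void append(Node node, int depth, StringBuilder sb) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                sb.append(" @").append(attr.getNodeName())
                  .append('=').append(attr.getNodeValue());
            }
            sb.append('\n');
            depth++;
        }
        // Text, comment, and other non-element nodes are skipped in this sketch.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            append(child, depth, sb);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline(parse("<a id=\"1\"><b/><c><d/></c></a>")));
    }
}
```

Passing the Document starts at the root; passing any other Node displays just that subtree, which mirrors how the component accepts an arbitrary starting Node.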
Works pretty well, and is reasonably fast. Once the classes are “warmed up”, parsing a big document takes 50-100ms, with more time for display. Not too bad, but there’s probably lots of room for improvement.
I need to figure out how to really ignore white space. I have set the factory to setIgnoresWhiteSpace(true), but that doesn’t seem to work for some reason… I still get the padded lines and spaces/tabs between nested elements. I’ll look into that a little more… I think it has something to do with not being able to locate a DTD/schema for the document…
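If the factory in question is JAXP's `DocumentBuilderFactory`, the corresponding setter is `setIgnoringElementContentWhitespace(true)`, and its documentation notes that it relies on the content model and so requires the parser to be in validating mode. That would fit the suspicion about the missing DTD/schema. One workaround, sketched below under the assumption that a plain W3C DOM is being used, is to remove whitespace-only text nodes after parsing. The class name `WhitespaceStripper` is hypothetical, and note that this also drops significant whitespace between inline elements in mixed content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Removes whitespace-only text nodes from a DOM tree after parsing. */
final class WhitespaceStripper {
    /** Parses without validation, so indentation text nodes are kept by the parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Recursively deletes text nodes that contain nothing but whitespace. */
    static void strip(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                strip(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a>\n  <b>text</b>\n  <c/>\n</a>");
        strip(doc);
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

Running the stripper once right after parsing keeps the display code free of special cases for padding nodes.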

