Andrew Solymosi

Related Topics: XML Magazine

Aggregation with XMLSPY

Aggregation in XML is not trivial. Altova's XMLSPY offers a number of features facilitating this process. This article presents an example, including best practices and practical programming techniques - especially useful for those who don't like typing a lot of angle brackets.

Aggregation of XML (or HTML) documents means collecting the content of several XML files into one XML (or HTML) document (see Figure 1).

A portal product, for example, would aggregate the content of several data sources into one HTML page and present their contents in boxes in the user's browser. Most portals do this with programs written in Java, Perl, or some other programming language; however, XSLT includes the function document(), which is suitable for this purpose.

XMLSPY is a high-level XML editing tool, offering many visual capabilities for creating, changing, formatting, and presenting XML documents. The following example shows how XMLSPY's features can be used for aggregation.

XMLSPY Features
XMLSPY uses the following file name extensions:

  • .xml: XML data
  • .xsd: XML Schema Definition
  • .xsl: Extensible Stylesheet Language
  • .xslt: XSL Transformations
  • .sps: Stylesheet Designer's internal format

    An .xml document contains the data. XMLSPY can check (with the function key F7) if it is well formed, i.e., if it satisfies XML's syntax. With its data structure defined in an XSD document, XMLSPY can check (with F8) if it is valid (if it uses only the structures defined in the schema). The reference to XSD can be written into the XML document.

    XMLSPY can automatically generate XSD for an existing XML document; however, it must be reviewed because it might contain undesired constraints.

    An XSL document usually contains formatting information for XML data. It can be visually developed with Stylesheet Designer for an existing XSD document, and formatting can be assigned to every structure element contained there. The result can be viewed in HTML preview if a working XML file (with data) has been assigned. Stylesheet Designer stores the result in its internal .sps format (a special XML language) for further processing by XMLSPY's Stylesheet Designer for the "authentic view." Stylesheet Designer can also generate an XSL (or XSLT) document, which can be used by any XML processor (e.g., XMLSPY or an XML-enabled browser such as Netscape 6 or Internet Explorer 6.0). For this the XML document needs to contain a reference to the XSL document or vice versa (see Figure 2).

    XMLSPY can process the XSL document either way with F10, and can even debug it with F11. Similarly, a link to SPS can be put into XML (used only by XMLSPY for the "authentic view").

    Stylesheet Designer also allows you to take an HTML document, instead of an XSD, as the base for the design. The result can likewise be saved as .sps and .xsl.

    XMLSPY can present XML data in the following views:

  • Text view (raw XML, editable)
  • Browser view (uneditable), with or without a stylesheet reference
  • Enhanced grid view (the best choice for initial data entry)
  • Authentic view (editable, the best choice for additional data entry)

    The authentic view requires an SPS document defined with Stylesheet Designer. XMLSPY will then present the XML data in the defined format, which makes editing and extending it very convenient. However, only data elements that exist in the original XML file can be changed or extended. The steps for working with XMLSPY are therefore:
    1.   Create sample XML data in XMLSPY's enhanced grid (or text) view.
    2.   Generate (and review) XSD reflecting the document's structure.
    3.   Visually create a design (on the basis of XSD) with Stylesheet Designer as SPS.
    4.   Generate XSL from SPS with Stylesheet Designer.
    5.   Connect XML with XSD (for validation) and SPS (for the authentic view) in XMLSPY.
    6.   Edit visually (and create more) XML data in the authentic view.
    7.   Generate HTML for presentation in any browser, or connect XML with XSL for XML-enabled browsers.

    Alternative designs (e.g., through modifying the SPS file) would present the same data in a different style.

    The steps presented here solve most of the simple problems in working with XML files. The problem of aggregation, however, is more complex. Here we must develop an XSLT document containing a program that performs the aggregation by calling the standard XSLT function document(). An XSLT program consists of a set of rules that can be applied to parts of the input XML document. The rules define what (text) data should appear in the output document.

    There is no difference in principle between XSL and XSLT: both document types are processed against an XML document. However, it's a good convention to put formatting information into XSL documents (as stylesheets) and transformation logic into XSLT documents. XSLT should output XML data, and XSL can output XML or HTML.

    The XSLT document aggregate.xslt performs the actual aggregation (see Listing 1). aggregate.xslt contains two rules: the first one, xsl:template match, matches the whole input XML document ("/"), generates a <result> tag (with an XSD reference), and then looks for <families> tags in the XML document (xsl:apply-templates select). The second rule matches all the <location> tags in the input XML document and copies the document with the URL given in the tag's data.
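    Listing 1 itself is not reproduced in this text, but a minimal sketch of the two rules just described might look like the following. It is an illustration only, not the author's listing: it assumes that index.xml has a <families> root with <location> children holding relative URLs, it selects families/location directly for brevity, and it omits the XSD reference on <result>.

```xml
<?xml version="1.0"?>
<!-- aggregate.xslt: hypothetical sketch, not the article's Listing 1 -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Rule 1: match the whole input document, emit <result>,
       then visit every <location> entry below <families> -->
  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="families/location"/>
    </result>
  </xsl:template>

  <!-- Rule 2: treat the text of each <location> as a URL, load that
       document with document(), and copy its root element to the output -->
  <xsl:template match="location">
    <xsl:copy-of select="document(.)/*"/>
  </xsl:template>
</xsl:stylesheet>
```

    Because document() receives the <location> node itself rather than a string, relative URLs are resolved against the location of index.xml and not against the stylesheet.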

    An example with families shows how this aggregation works - the idea is to have several XML documents describing families (with country, last name, father, mother, and children):

    <!-- orlando.xml -->

    and one index file listing all the family documents (named by the residences of the author's family members, suggesting that those XML files can be scattered all around the world, just like today's families):

    <!-- index.xml -->

    The goal of the aggregation is to present all the families on one page.
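    To make the structure concrete, one family document could look roughly like the sketch below. The element names (family, country, lastName, father, mother, children, child) and all values are placeholders chosen for illustration; only the list of fields comes from the description above. The schema attribute and the two processing instructions are the optional references to family.xsd, family.xsl, and family.sps.

```xml
<?xml version="1.0"?>
<!-- orlando.xml: hypothetical sketch; element names and values are placeholders -->
<?xml-stylesheet type="text/xsl" href="family.xsl"?>
<?xmlspysps family.sps?>
<family xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="family.xsd">
  <country>USA</country>
  <lastName>Example</lastName>
  <father>Father's first name</father>
  <mother>Mother's first name</mother>
  <children>
    <child>First child</child>
    <child>Second child</child>
  </children>
</family>
```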

    Table 1 shows the result of aggregating the files and presenting their content in a table with a stylesheet. There are two main steps in this process: aggregation and presentation. They are described in the two files aggregate.xslt and style.xsl. In aggregate.xslt we program the aggregation (i.e., pulling the content of the documents together); in style.xsl we store formatting information.

    All these files can be downloaded from www.solymosi.com/Andreas/Family/Aggregation.html.

    Example: Aggregation with XMLSPY
    Create documents to aggregate

    In this section we're going to create XML documents with data to aggregate. Our goal is to use XMLSPY's features and work as little as possible on the XML level (and to avoid typing <angle brackets>!).
    1.   Create the sample data file orlando.xml (it does not have to list all four children, just the first two).
    2.   Generate schema file family.xsd (XMLSPY's Menu: DTD/Schema, Generate DTD/Schema, W3C Schema, OK, family.xsd), edit it, and delete constraints for the children's names - otherwise no more children can be added.
    3.   Open family.xsd with the Stylesheet Designer.
    4.   Design and save family.sps (test it with data in orlando.xml).
    5.   Generate and save stylesheet family.xsl.
    6.   Open the data file orlando.xml with XMLSPY.
    7.   Connect it with the stylesheet family.xsl (for the browser view) and with family.sps (for the authentic view).
    8.   Add the rest of the children in the authentic view, correct misspelled names, etc.
    9.   orlando.xml can now also be viewed in any XML-enabled browser because it contains a stylesheet reference (see Figure 3).

    The document family.xsl is not necessary if the data documents are not going to be presented in a browser. family.sps is necessary only if they are going to be visually edited in XMLSPY (a very convenient feature). family.xsd is necessary whenever an XML processor is going to validate a data document (as XMLSPY does when opening it).
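    As a rough idea of what the reviewed schema from step 2 could end up looking like, here is a hand-written sketch. The element names are hypothetical, and the article does not say which constraints the generator actually emits; an enumeration of the sample names or a fixed occurrence count on the child element are two possibilities that would prevent adding further children. In the sketch the child element is a plain string that may repeat without limit.

```xml
<?xml version="1.0"?>
<!-- family.xsd: hypothetical sketch after review; element names are placeholders -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="family">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="country" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
        <xs:element name="father" type="xs:string"/>
        <xs:element name="mother" type="xs:string"/>
        <xs:element name="children">
          <xs:complexType>
            <xs:sequence>
              <!-- no enumeration of names and no upper bound,
                   so more children can be added later -->
              <xs:element name="child" type="xs:string"
                          minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```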

    Create aggregation file
    In a similar way we can now create our aggregation file containing the information about which documents will be aggregated. It is a kind of "table of contents" and is going to be the starting point of the aggregation. This is why it's called index.xml:
    10.   Create the sample index file index.xml (as before, but with two location tags).
    11.   Create the metadata documents (index.xsd, index.sps, perhaps index.xsl) as before.
    12.   Add the rest of the data (list of documents to be aggregated) in the authentic view.

    Note: In step 10 we created the index file with two location tags. We did not use four (or more), because the remaining entries are easier to add in the authentic view (step 12); and we did not use just one, so that the generated schema file index.xsd records that the tag can repeat. If the complete index file is created in step 10, steps 11-12 can be omitted.
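    A sample index file for step 10 might be as small as the sketch below. The <families> and <location> tags are the ones named in this article; the second file name is a placeholder, and treating each location as a relative URL is an assumption.

```xml
<?xml version="1.0"?>
<!-- index.xml: hypothetical sketch; second-family.xml is a placeholder name -->
<families>
  <location>orlando.xml</location>
  <location>second-family.xml</location>
</families>
```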

    Program aggregation
    Now aggregation can be programmed and executed:
    13.   Write aggregate.xslt (as in Listing 1, which can be found at www.sys-con.com/xml/sourcec.cfm) - this is the only step XMLSPY doesn't offer great support for.
    14.   Set in XMLSPY's menu: Tools, Options, XSL, Default file extension for output file: .xml.
    15.   Connect in XMLSPY index.xml with aggregate.xslt (menu XSL, Assign XSL, OK, Browse, aggregate.xslt).
    16.   Run aggregation with F10 or with menu XSL/Transformation.
    17.   Save output.xml.
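    After step 17, output.xml holds the copied family documents under one <result> element. Assuming family documents with the hypothetical element names family, country, lastName, father, mother, children, and child, it might look roughly like this (placeholder values, XSD reference omitted):

```xml
<?xml version="1.0"?>
<!-- output.xml: hypothetical sketch of the aggregated result -->
<result>
  <family>
    <country>USA</country>
    <lastName>Example</lastName>
    <father>Father's first name</father>
    <mother>Mother's first name</mother>
    <children>
      <child>First child</child>
      <child>Second child</child>
    </children>
  </family>
  <family>
    <country>Another country</country>
    <lastName>Example</lastName>
    <father>Another father</father>
    <mother>Another mother</mother>
    <children>
      <child>Only child</child>
    </children>
  </family>
</result>
```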

    The aggregation process (step 16) can also be debugged with Alt-F11 and F11. The aggregation can be started either with index.xml (as described) or with aggregate.xslt. In the latter case a working XML file must be assigned (menu XSL, assign sample XML). The assignment

    <?xmlspysamplexml index.xml?>

    can be written into the document aggregate.xslt (after the ?xml instruction) - this is a processing instruction evaluated by XMLSPY. Alternatively, the document index.xml can contain the reference

    <?xml-stylesheet type="text/xsl" href="aggregate.xslt"?>

    in order to eliminate step 15 (see Figure 4).

    Design presentation
    Now we need a stylesheet for the aggregated document. The hardest part is learning how to handle Stylesheet Designer with more complex schema documents (step 20) - but there's a good tutorial.
    18.   Generate the schema document output.xsd.
    19.   Open output.xsd with Stylesheet Designer.
    20.   Design and create style.sps (test it with output.xml).
    21.   Generate the stylesheet document style.xsl.
    22.   Assign style.xsl to output.xml with XMLSPY's menu XSL, Assign XSL, style.xsl.
    23.   Now output.xml can be seen in browser view.
    24.   Save output.xml and open it in any XML-enabled browser.
    25.   Set in XMLSPY's menu: Tools, Options, XSL, Default file extension for output file: .html.
    26.   Process output.xml with F10, save output.html, open it in any browser.
    27.   Assign style.sps to output.xml with XMLSPY's menu Authentic, Assign configuration file, style.sps.
    28.   Now output.xml (not the original family files!) can be edited in authentic view (see Figure 5).
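    Stylesheet Designer generates style.xsl for you, so there is normally no need to write it by hand. Still, it can help to see what such a stylesheet boils down to. The sketch below is hand-written, not generated output; it assumes an aggregated document with a <result> root containing <family> elements with the hypothetical children country, lastName, father, mother, and children/child, and it renders one table row per family.

```xml
<?xml version="1.0"?>
<!-- style.xsl: hypothetical hand-written sketch, not Stylesheet Designer output -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Page frame and table header -->
  <xsl:template match="/result">
    <html>
      <head><title>Families</title></head>
      <body>
        <table border="1">
          <tr>
            <th>Country</th><th>Last name</th>
            <th>Father</th><th>Mother</th><th>Children</th>
          </tr>
          <xsl:apply-templates select="family"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- One table row per aggregated family -->
  <xsl:template match="family">
    <tr>
      <td><xsl:value-of select="country"/></td>
      <td><xsl:value-of select="lastName"/></td>
      <td><xsl:value-of select="father"/></td>
      <td><xsl:value-of select="mother"/></td>
      <td>
        <!-- List the children separated by commas -->
        <xsl:for-each select="children/child">
          <xsl:value-of select="."/>
          <xsl:if test="position() != last()">, </xsl:if>
        </xsl:for-each>
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```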

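    Step 22 (assigning style.xsl) can be eliminated if the assignment is generated by aggregate.xslt; it should then contain the following instruction:

    <xsl:processing-instruction name="xml-stylesheet">type="text/xsl" href="style.xsl"</xsl:processing-instruction>

    before the line that generates the <result> tag.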

    Additional Considerations
    XMLSPY's XSLT processor (like that of any browser) doesn't allow XSLT pipelining, i.e., it is not able to process more than one XSL or XSLT document at one time. This is why we had to save output.xml (after aggregation) and complete formatting in a second step (either in the browser or in producing output.html). Some other XSLT processors (like Cocoon or AxKit) follow W3C recommendations and process XSLT documents step by step. With such processors, index.xml may contain references to more than one XSL/XSLT (and also SPS and XSD) document:

    <!-- index.xml -->
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl"
    <?xml-stylesheet type="text/xsl"
    <?xmlspysps index.sps?>
    <families xmlns:xsi="http://www.w3.

    XMLSPY and popular XML-enabled browsers (like Netscape 6 or Internet Explorer 6.0) would perform only the first step (aggregate.xslt) and present the result without formatting by the stylesheet. This is because they are not designed for actual XSLT processing, just for formatting with a stylesheet. Many XSLT programmers might tend to solve the problem by combining aggregate.xslt and style.xsl into one stylesheet document. However, I believe it's better to separate transforming XML information (which is structural) from formatting HTML information (which is presentational).

    Maybe the greatest advantage of using XMLSPY for those who don't like angle brackets (i.e., low-level XML editing) is its authentic view for visual editing of XML data. Altova also offers a plug-in for popular browsers, allowing XML editing in Web clients (without XMLSPY installed).

    Though XMLSPY isn't intended to be an XML programming tool, its features can be very useful in preparing and testing XSL and XSLT documents. Aggregation of XML files is an interesting example where a number of techniques and best practices can be introduced.
