This chapter concludes our exploration of XML technologies and their
application to data integration with a discussion of XSLT, a very rich and
complex tool. We cover only its salient characteristics, namely, the fact
that XSLT uses XPath constructs, and the intimate connection between the
models of the data to be transformed (XSDs) and the formulation of the
transformations themselves.
Using a handful of XSLT instructions, we learn how to normalize data so that
the resulting data sets can be readily imported into a database. We also see
how new data can be built from the source data and, lastly, how to process
multiple records using the same template.
In the preceding chapters, we learned not only to build well-formed XML
documents and to specify their structure for validation via XSDs but, more
importantly, to 'model' the source data using this powerful technology.
However, as we also saw there, the same data can be modeled by different
people in vastly different ways. Consequently, absent some agreement among
the data providers and users, we must be prepared to recast one XML
vocabulary into another. Fortunately, there is already an XML technology
that provides exactly that capability: Extensible Stylesheet Language
Transformations, or XSLT for short.
The best way to begin learning about XSLT is to look at
a simple case. Let's assume
that Healthcare
Provider A already has modeled its data and has a schema for all its XML
documents as shown in the listing below.
<?xml version="1.0" encoding="UTF-8"?>
<!--W3C Schema generated by XMLSPY v2004 rel. 3 U (http://www.xmlspy.com)-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="cases">
<xs:complexType>
<xs:sequence>
<xs:element ref="case" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="case">
<xs:complexType>
<xs:sequence>
<xs:element ref="physician"/>
<xs:element ref="diagnosis"/>
<xs:element ref="visit"/>
</xs:sequence>
<xs:attribute name="patient" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
<xs:element name="diagnosis">
<xs:complexType>
<xs:attribute name="code" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
<xs:element name="physician">
<xs:complexType>
<xs:attribute name="name" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
<xs:element name="visit">
<xs:complexType>
<xs:attribute name="date" type="xs:date" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
Documents built according to this schema would look
like this:
<?xml version="1.0"?>
<cases>
<case patient="John Doe">
<physician name="Peter Alonzo"/>
<diagnosis code="heart arrhythmia"/>
<visit date="2004-03-22"/>
</case>
</cases>
Let's suppose that the data model chosen by Healthcare
Provider B for its data is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSPY v2004 rel. 3 U (http://www.xmlspy.com)
by DR. FRANCISCO LOAIZA (IDA) -->
<!--W3C Schema generated by XMLSPY v2004 rel. 3 U (http://www.xmlspy.com)-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="Case">
<xs:complexType>
<xs:all>
<xs:element ref="Patient" minOccurs="0"/>
<xs:element ref="Clinician" minOccurs="0"/>
<xs:element ref="Date" minOccurs="0"/>
<xs:element ref="Diagnosis" minOccurs="0"/>
<xs:element ref="Treatment" minOccurs="0"/>
</xs:all>
</xs:complexType>
</xs:element>
<xs:element name="CaseReports">
<xs:complexType>
<xs:sequence>
<xs:element ref="Case"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Clinician" type="xs:string"/>
<xs:element name="Date" type="xs:date"/>
<xs:element name="Diagnosis" type="xs:string"/>
<xs:element name="Patient" type="xs:string"/>
<xs:element name="Treatment" type="xs:string"/>
</xs:schema>
XML documents built according to this schema would look
like this:
<?xml version="1.0"?>
<CaseReports>
<Case>
<Patient>Jane Doe</Patient>
<Clinician>Dr. Phillips</Clinician>
<Date>2004-03-22</Date>
<Diagnosis>herniated disk</Diagnosis>
<Treatment>physiotherapy</Treatment>
</Case>
</CaseReports>
If Healthcare Provider B wants to use the data collected by Healthcare
Provider A, we need a way to process A's data so that the resulting XML
document conforms to the schema that B has already adopted.
As we look at both schemas we can see that there are
substantial areas of overlap. For
example, it is fair to assume that the value assigned to physician
in the first schema is the same as the one covered by Clinician
in the target schema. We also
see that both schemas have the concept of 'patient',
'diagnosis', as well as 'date'.
The complete mapping showing the specifics of each XSD
would look like this:
Schema A concept | Schema A XSD construct | Schema B concept | Schema B XSD construct
name | Attribute of element physician | Clinician | Element
code | Attribute of element diagnosis | Diagnosis | Element
patient | Attribute of element case | Patient | Element
date | Attribute of element visit | Date | Element
N/A | N/A | Treatment | Element
What is needed now is a way to extract the appropriate
pieces of data from the source document and put them into the right
container of the target document. The
way XSLT accomplishes this is by specifying via XPath expressions how to
fetch the content from the source document and then providing the XML tags
that will go with that content.
For the first row in the table above we could express this in words as
follows:
- Traverse the source document and find the attribute 'name' in the element 'physician',
- Fetch the value it currently has,
- Make that value the content of the element 'Clinician'.
We are now almost ready to begin writing our first XSLT.
All we need now is to learn what the proper XPath expressions are
that will accomplish the steps delineated above.
But before we do that we need to understand one more concept.
As we learned at the very beginning of the course, an XML document is
essentially equivalent to what is known in graph theory as a 'tree'.
Trees are made of nodes and the lines connecting them.
In an XML document there are no explicit lines connecting the tags
(i.e., the element nodes); instead we use nesting.
A tag placed inside another tag is connected to it as a child to its
parent. In addition to the element nodes (i.e., the XML tags) we also have
attribute nodes, text nodes, comment nodes, processing instruction nodes and
namespace nodes, plus a single root node that stands for the document as a
whole. Every XML document maps to a tree structure made of the kinds of
nodes just listed.
The reason for spending some time thinking about this is that, in order to
use XPath effectively, we need to understand the concept of the path
operator.
If you have ever used the command line interface in MS-DOS or in Unix you
already know what a path looks like.
In XPath it is a string separated by forward slashes (as in Unix; MS-DOS
uses backslashes), where each chunk in between the slashes corresponds to a
node of our XML tree.
The XPath convention is that if a node is an attribute node it must be
prefixed by the symbol '@'. In XSLT the path is supplied as the value of a
select attribute. Once the select expression has been correctly built, we
need to invoke the appropriate XSLT instruction that will act on the node or
nodes the path identifies.
In practice, two XSLT instructions are enough to accomplish most of the
transformations we are interested in.
The first one is xsl:value-of, which, as its name suggests, emits the string
value of the node identified by its select expression. The anatomy of the
instruction is then:
<xsl:value-of select="/path/to/node"/>
Thus, for example, the instruction for fetching the name of the physician
would be:
<xsl:value-of select="cases/case/physician/@name"/>
This path has no leading slash, so it is evaluated relative to the current
node; inside a template that matches the root of the document, that node is
the root itself.
To write our first XSLT we need to make sure the processing application
(e.g., XMLSpy) can properly identify it as such.
To that effect, just as with XSDs, one places the stylesheet root element at
the beginning:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
Next we put <xsl:template match="/"> to indicate that this template matches
the root of the document and therefore governs the processing of the whole
document.
After that we put the tag or tags that will show up in the resulting
document, and between the tags we put the appropriate XSLT instructions.
For our first example we are only going to output an XML document
consisting of the root tag, the record delimiter tag and one tag, namely,
<Clinician>. Therefore, our first complete XSLT is:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<CaseReports>
<Case>
<Clinician><xsl:value-of select="cases/case/physician/@name"/></Clinician>
</Case>
</CaseReports>
</xsl:template>
</xsl:stylesheet>
To carry out the transformation we add a processing instruction to the
source XML document that links it to the XSLT that we just created:
<?xml-stylesheet type="text/xsl" href="C:\Data\04_Word\HSCI720\A2B.xsl"?>
The modified XML document now looks like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="C:\Data\04_Word\HSCI720\A2B.xsl"?>
<cases>
<case patient="John Doe">
<physician name="Peter Alonzo"/>
<diagnosis code="heart arrhythmia"/>
<visit date="2004-03-22"/>
</case>
</cases>
Executing the transformation produces the following
output:
<?xml version="1.0" encoding="UTF-8"?>
<CaseReports>
<Case>
<Clinician>Peter Alonzo</Clinician>
</Case>
</CaseReports>
We can now add the path to the schema used by
Healthcare Provider B to validate the document.
<?xml version="1.0" encoding="UTF-8"?>
<CaseReports xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="C:\Data\04_Word\HSCI720\XSL_02.xsd">
<Case>
<Clinician>Peter Alonzo</Clinician>
</Case>
</CaseReports>
Using XMLSpy one can test that the file is in fact valid.
The preceding example was created using the commercial application XMLSpy.
There are, however, free applications that accomplish the same task. A very
popular one is XALAN, which can be downloaded from http://xml.apache.org/xalan.
An even simpler XSLT engine is the command-line utility msxsl offered by
Microsoft. The executable can be placed in any directory of your choice and
invoked by typing msxsl at the prompt:
C:\Data\04_Word\HSCI720>msxsl
Microsoft (R) XSLT Processor Version 4.0
Usage: MSXSL source stylesheet [options] [param=value...] [xmlns:prefix=uri...]
Options:
    -?            Show this message
    -o filename   Write output to named file
    -m startMode  Start the transform in this mode
    -xw           Strip non-significant whitespace from source and stylesheet
    -xe           Do not resolve external definitions during parse phase
    -v            Validate documents during parse phase
    -t            Show load and transformation timings
    -pi           Get stylesheet URL from xml-stylesheet PI in source document
    -u version    Use a specific version of MSXML: '2.6', '3.0', '4.0'
    -             Dash used as source argument loads XML from stdin
    -             Dash used as stylesheet argument loads XSL from stdin
For our example we can type:
C:\Data\04_Word\HSCI720>msxsl XSLT_01.xml A2B.xsl -o A2B_2.xml
Here XSLT_01.xml is the file we want to transform, A2B.xsl is the XSLT we
are going to use, and -o A2B_2.xml indicates the name of the output file.
After running the application the resulting file looks like this:
<?xml version="1.0" encoding="UTF-16"?>
<CaseReports>
<Case>
<Clinician>Peter Alonzo</Clinician>
</Case>
</CaseReports>
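The two engines report different encodings in the XML declaration (UTF-8 from XMLSpy, UTF-16 from msxsl) because our stylesheet does not ask for one; XSLT 1.0 then leaves the choice between UTF-8 and UTF-16 to the processor. The sketch below shows one way the stylesheet itself could request an encoding through the top-level xsl:output instruction. The indent setting is an optional extra added here for readability, and how a particular engine honors the request when writing a file is something to verify rather than assume.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Top-level instruction: request UTF-8 output and indented markup -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <CaseReports>
      <Case>
        <Clinician><xsl:value-of select="cases/case/physician/@name"/></Clinician>
      </Case>
    </CaseReports>
  </xsl:template>
</xsl:stylesheet>
```

Only the xsl:output line differs from the first stylesheet; the template is unchanged.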
Class exercise:
Complete the XSLT for all the remaining nodes in the source XML and test the
transformation. Use Microsoft's msxsl.exe for the transformation, and then
validate the file using XMLSpy.
The previous sections have shown the basic concepts of an XSL
transformation, namely, the use of the processing instruction for
stylesheets, path expressions supplied through the select attribute, the
concept of a template using something like <xsl:template match="/">, and the
instruction xsl:value-of.
In addition, we have tested two engines that can carry out XSL
transformations. In real life, though, the XML sources will not consist of
just one record.
In fact, there would be little point in writing and testing a whole
transformation script to process one instance alone.
The real power lies in being able to process thousands of records from a
source and have them recast in an XML vocabulary that conforms to the target
XSD.
As we mentioned in the preceding section, the second most used XSLT
instruction is the one that lets us process multiple records in a single
pass. This is the xsl:for-each instruction. Its specification is as follows:
<xsl:for-each select = node-set-expression>
<!-- Content: (xsl:sort*, template) -->
</xsl:for-each>
When xsl:for-each is invoked, it evaluates the template against each node in
the node-set returned by the select expression. The order of evaluation can
be influenced using one or more xsl:sort elements.
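As a minimal sketch of how xsl:sort fits inside xsl:for-each, the stylesheet below assumes a Provider A document holding several case elements and processes them in order of visit date. Sorting by the date attribute is an illustrative choice, not something the original transformation requires; it works as a plain text sort because dates written as YYYY-MM-DD compare correctly as strings.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <CaseReports>
      <xsl:for-each select="cases/case">
        <!-- xsl:sort elements must come first inside xsl:for-each -->
        <xsl:sort select="visit/@date" order="ascending"/>
        <Case>
          <!-- Paths here are relative to the case being processed -->
          <Clinician><xsl:value-of select="physician/@name"/></Clinician>
        </Case>
      </xsl:for-each>
    </CaseReports>
  </xsl:template>
</xsl:stylesheet>
```

Removing the xsl:sort line leaves the records in document order, which is the default.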
With this in mind let's look at an XML source data
example from Healthcare Provider A containing more than one record:
<cases>
<case patient="John Doe">
<physician name="Peter Alonzo"/>
<diagnosis code="heart arrhythmia"/>
<visit date="2004-03-22"/>
</case>
<case patient="Mary Jones">
<physician name="Hans Allers"/>
<diagnosis code="high cholesterol"/>
<visit date="2004-03-20"/>
</case>
<case patient="Sandy Mullens">
<physician name="Emily Lang"/>
<diagnosis code="dislocated shoulder"/>
<visit date="2004-02-19"/>
</case>
</cases>
To transform it, all we need is to let the transformation engine know that
there are multiple instances of the tag <case> and that all of them need to
be processed in the same way.
A possible solution for accomplishing this is shown below:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<CaseReports>
<xsl:for-each select="cases/case">
<Case>
<Clinician>
<xsl:value-of select="physician/@name"/>
</Clinician>
</Case>
</xsl:for-each>
</CaseReports>
</xsl:template>
</xsl:stylesheet>
The msxsl transformation engine can be invoked using a command like this:
C:\Data\04_Word\HSCI720>msxsl XSLT_01B.xml A2B.xsl -o A2B_4.xml
The resulting file is shown below:
<?xml version="1.0" encoding="UTF-16"?>
<CaseReports>
<Case>
<Clinician>Peter Alonzo</Clinician>
</Case>
<Case>
<Clinician>Hans Allers</Clinician>
</Case>
<Case>
<Clinician>Emily Lang</Clinician>
</Case>
</CaseReports>
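The same result can also be reached with template rules instead of an explicit loop, which is the sense in which several records are processed "using the same template". The sketch below is an alternative formulation, not the approach used in this chapter: xsl:apply-templates hands each case element to the template whose match pattern fits it.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <CaseReports>
      <!-- Hand every case element to the best-matching template rule -->
      <xsl:apply-templates select="cases/case"/>
    </CaseReports>
  </xsl:template>

  <!-- Applied once per case element; paths are relative to that case -->
  <xsl:template match="case">
    <Case>
      <Clinician><xsl:value-of select="physician/@name"/></Clinician>
    </Case>
  </xsl:template>
</xsl:stylesheet>
```

For a source this regular the two styles are interchangeable; template rules tend to pay off when different kinds of records need different handling.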
The sections above have shown how we can extract data from one XML source
and recast it into a new XML vocabulary.
XSLT also allows us to insert new tags at the places we desire.
We can use this capability to transform a flat table into a series of
tables, i.e., to normalize the data as required.
The listing below shows a single record from a notional CDC report on an
anthrax-related incident.
<?xml version="1.0"?>
<CDC_REPORT>
<INCIDENT>
<CITY>New York City</CITY>
<STATE>New York</STATE>
<SEVERITY>High</SEVERITY>
<PEOPLE_QY>139</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
<FATALITIES>23</FATALITIES>
<FEMALE_ADULT>87</FEMALE_ADULT>
<MALE_ADULT>32</MALE_ADULT>
<CHILDREN>20</CHILDREN>
<TREATMENT>ANTIBIOTICS</TREATMENT>
<YEAR>2001</YEAR>
</INCIDENT>
</CDC_REPORT>
As we can see, the structure of the record is similar
to that of a row in a flat table. One
could begin to normalize the data by breaking it into its logical
components. For example, the
data could be recast to look like the listing below:
<?xml version="1.0" encoding="UTF-8"?>
<CDC_REPORT>
<INCIDENT>
<LOCATION>
<CITY>New York City</CITY>
<STATE>New York</STATE>
</LOCATION>
<DESCRIPTION>
<YEAR>2001</YEAR>
<SEVERITY>High</SEVERITY>
<PEOPLE_QY>139</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
</DESCRIPTION>
<STATISTICS>
<FATALITIES>23</FATALITIES>
<FEMALE_ADULT>87</FEMALE_ADULT>
<MALE_ADULT>32</MALE_ADULT>
<CHILDREN>20</CHILDREN>
</STATISTICS>
</INCIDENT>
</CDC_REPORT>
To accomplish this we can use an XSLT like the one
listed below:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<CDC_REPORT>
<xsl:for-each select="CDC_REPORT/INCIDENT">
<INCIDENT>
<LOCATION>
<CITY>
<xsl:value-of select="CITY"/>
</CITY>
<STATE>
<xsl:value-of select="STATE"/>
</STATE>
</LOCATION>
<DESCRIPTION>
<YEAR>
<xsl:value-of select="YEAR"/>
</YEAR>
<SEVERITY>
<xsl:value-of select="SEVERITY"/>
</SEVERITY>
<PEOPLE_QY>
<xsl:value-of select="PEOPLE_QY"/>
</PEOPLE_QY>
<CONTAMINANT>
<xsl:value-of select="CONTAMINANT"/>
</CONTAMINANT>
</DESCRIPTION>
<STATISTICS>
<FATALITIES>
<xsl:value-of select="FATALITIES"/>
</FATALITIES>
<FEMALE_ADULT>
<xsl:value-of select="FEMALE_ADULT"/>
</FEMALE_ADULT>
<MALE_ADULT>
<xsl:value-of select="MALE_ADULT"/>
</MALE_ADULT>
<CHILDREN>
<xsl:value-of select="CHILDREN"/>
</CHILDREN>
</STATISTICS>
</INCIDENT>
</xsl:for-each>
</CDC_REPORT>
</xsl:template>
</xsl:stylesheet>
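Because the element names do not change in this transformation, the regrouping can be written more compactly with xsl:copy-of, which copies the selected nodes as they are. The following is a sketch of that alternative under the same source structure, not a replacement for the listing above. Note that a node-set is copied in document order, so YEAR, which comes last in the source, is copied on its own to place it first in DESCRIPTION.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <CDC_REPORT>
      <xsl:for-each select="CDC_REPORT/INCIDENT">
        <INCIDENT>
          <LOCATION>
            <!-- The union operator | selects both children of the current INCIDENT -->
            <xsl:copy-of select="CITY | STATE"/>
          </LOCATION>
          <DESCRIPTION>
            <!-- Copied separately so that YEAR appears before the other elements -->
            <xsl:copy-of select="YEAR"/>
            <xsl:copy-of select="SEVERITY | PEOPLE_QY | CONTAMINANT"/>
          </DESCRIPTION>
          <STATISTICS>
            <xsl:copy-of select="FATALITIES | FEMALE_ADULT | MALE_ADULT | CHILDREN"/>
          </STATISTICS>
        </INCIDENT>
      </xsl:for-each>
    </CDC_REPORT>
  </xsl:template>
</xsl:stylesheet>
```

The explicit tag plus xsl:value-of form remains the one to use whenever a tag has to be renamed on the way through.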
If you paid attention to the discussion on normalization, you are probably
wondering what good it does to break the original record into three distinct
sections if there is no way to relink them. Well, there is a way to solve
this. Remember that we said that in a transformation we can insert new tags
as required. This means that we could add, for example, an <ID> tag to each
of the segments of the resulting record.
Since we don't have any indication of an ID in the original record, we could
simply assign the record count as the new ID. In other words, all segments
from the first record will have ID = 1, those from the second will have
ID = 2, etc. For example, take a source XML document containing two records
as shown in the listing below:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="C:\Data\04_Word\HSCI720\Flat2nonFlat2.xsl"?>
<CDC_REPORT>
<INCIDENT>
<CITY>New York City</CITY>
<STATE>New York</STATE>
<SEVERITY>High</SEVERITY>
<PEOPLE_QY>139</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
<FATALITIES>23</FATALITIES>
<FEMALE_ADULT>87</FEMALE_ADULT>
<MALE_ADULT>32</MALE_ADULT>
<CHILDREN>20</CHILDREN>
<TREATMENT>ANTIBIOTICS</TREATMENT>
<YEAR>2001</YEAR>
</INCIDENT>
<INCIDENT>
<CITY>New York City</CITY>
<STATE>New York</STATE>
<SEVERITY>Medium</SEVERITY>
<PEOPLE_QY>49</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
<FATALITIES>13</FATALITIES>
<FEMALE_ADULT>17</FEMALE_ADULT>
<MALE_ADULT>12</MALE_ADULT>
<CHILDREN>20</CHILDREN>
<TREATMENT>ANTIBIOTICS</TREATMENT>
<YEAR>2002</YEAR>
</INCIDENT>
</CDC_REPORT>
We would like the transformed XML to look like the listing below, because
with an XML instance document such as this it would be easy to load the data
into a database and then to create the foreign key constraints necessary to
rebuild the original data:
<?xml version="1.0" encoding="UTF-8"?>
<CDC_REPORT>
<INCIDENT>
<LOCATION>
<ID>1</ID>
<CITY>New York City</CITY>
<STATE>New York</STATE>
</LOCATION>
<DESCRIPTION>
<ID>1</ID>
<YEAR>2001</YEAR>
<SEVERITY>High</SEVERITY>
<PEOPLE_QY>139</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
</DESCRIPTION>
<STATISTICS>
<ID>1</ID>
<FATALITIES>23</FATALITIES>
<FEMALE_ADULT>87</FEMALE_ADULT>
<MALE_ADULT>32</MALE_ADULT>
<CHILDREN>20</CHILDREN>
</STATISTICS>
</INCIDENT>
<INCIDENT>
<LOCATION>
<ID>2</ID>
<CITY>New York City</CITY>
<STATE>New York</STATE>
</LOCATION>
<DESCRIPTION>
<ID>2</ID>
<YEAR>2002</YEAR>
<SEVERITY>Medium</SEVERITY>
<PEOPLE_QY>49</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
</DESCRIPTION>
<STATISTICS>
<ID>2</ID>
<FATALITIES>13</FATALITIES>
<FEMALE_ADULT>17</FEMALE_ADULT>
<MALE_ADULT>12</MALE_ADULT>
<CHILDREN>20</CHILDREN>
</STATISTICS>
</INCIDENT>
</CDC_REPORT>
The XSLT instruction that allows us to do this is xsl:number.
Its specification is as follows:
<xsl:number
  level = "single" | "multiple" | "any"
  count = pattern
  from = pattern
  value = number-expression
  format = { string }
  lang = { nmtoken }
  letter-value = { "alphabetic" | "traditional" }
  grouping-separator = { char }
  grouping-size = { number } />
When value is supplied, the instruction emits a number based on that XPath
number expression. When value is omitted, as in our case, it numbers the
current node according to its position among the sibling nodes that match
the count pattern. All that is required now is to add the new tag, namely
<ID>, to each segment of the original XSLT and to assign the proper value
using xsl:number.
The listing would look like this:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<CDC_REPORT>
<xsl:for-each select="CDC_REPORT/INCIDENT">
<INCIDENT>
<LOCATION>
<ID><xsl:number level="single" count="CDC_REPORT/INCIDENT" format="1"/></ID>
<CITY>
<xsl:value-of select="CITY"/>
</CITY>
<STATE>
<xsl:value-of select="STATE"/>
</STATE>
</LOCATION>
<DESCRIPTION>
<ID><xsl:number level="single" count="CDC_REPORT/INCIDENT" format="1"/></ID>
<YEAR>
<xsl:value-of select="YEAR"/>
</YEAR>
<SEVERITY>
<xsl:value-of select="SEVERITY"/>
</SEVERITY>
<PEOPLE_QY>
<xsl:value-of select="PEOPLE_QY"/>
</PEOPLE_QY>
<CONTAMINANT>
<xsl:value-of select="CONTAMINANT"/>
</CONTAMINANT>
</DESCRIPTION>
<STATISTICS>
<ID><xsl:number level="single" count="CDC_REPORT/INCIDENT" format="1"/></ID>
<FATALITIES>
<xsl:value-of select="FATALITIES"/>
</FATALITIES>
<FEMALE_ADULT>
<xsl:value-of select="FEMALE_ADULT"/>
</FEMALE_ADULT>
<MALE_ADULT>
<xsl:value-of select="MALE_ADULT"/>
</MALE_ADULT>
<CHILDREN>
<xsl:value-of select="CHILDREN"/>
</CHILDREN>
</STATISTICS>
</INCIDENT>
</xsl:for-each>
</CDC_REPORT>
</xsl:template>
</xsl:stylesheet>
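An alternative way to obtain the same record count is the XPath function position(), which inside xsl:for-each returns the 1-based index of the node currently being processed. The sketch below stores that value once in a variable (the name id is an arbitrary choice) and reuses it in each segment; only two segments and a few fields are shown to keep it short.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <CDC_REPORT>
      <xsl:for-each select="CDC_REPORT/INCIDENT">
        <!-- position() is the index of the current INCIDENT within the loop -->
        <xsl:variable name="id" select="position()"/>
        <INCIDENT>
          <LOCATION>
            <ID><xsl:value-of select="$id"/></ID>
            <CITY><xsl:value-of select="CITY"/></CITY>
            <STATE><xsl:value-of select="STATE"/></STATE>
          </LOCATION>
          <STATISTICS>
            <ID><xsl:value-of select="$id"/></ID>
            <FATALITIES><xsl:value-of select="FATALITIES"/></FATALITIES>
          </STATISTICS>
        </INCIDENT>
      </xsl:for-each>
    </CDC_REPORT>
  </xsl:template>
</xsl:stylesheet>
```

One difference is worth keeping in mind: position() follows the processing order, so adding an xsl:sort changes the numbers it yields, whereas xsl:number with count reflects where the record sits in the source document.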
(1) Modify the XSLT above that uses xsl:number so that there is also an
index tag <IDX> in the description section.
This will permit entering additional description information related
to the same incident, for example updates on the severity and the number of
individuals affected.
(2) The example below shows the use of the concat() function. As shown, an
XML document where the name of the person is broken into three pieces can be
recast in the form of a single concatenated string. The source document is:
<?xml version="1.0"?>
<Persons>
<Person>
<FirstName>Farrokh</FirstName>
<MiddleInitial>M</MiddleInitial>
<LastName>Alemi</LastName>
</Person>
</Persons>
The transformed document is:
<?xml version="1.0" encoding="UTF-8"?>
<Persons>
<Person>
<PersonName>Farrokh M. Alemi</PersonName>
</Person>
</Persons>
The XSLT to accomplish this is shown below:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<Persons>
<Person>
<PersonName><xsl:value-of select="concat(Persons/Person/FirstName, ' ', Persons/Person/MiddleInitial, '. ', Persons/Person/LastName)"/></PersonName>
</Person>
</Persons>
</xsl:template>
</xsl:stylesheet>
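The stylesheet above reads the name parts through full paths from the root, so it is suited to a document with a single Person. As a sketch of how concat() combines with xsl:for-each, the variant below assumes the source may hold several Person elements and builds one PersonName for each of them, using paths relative to the current Person.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <Persons>
      <xsl:for-each select="Persons/Person">
        <Person>
          <!-- Inside the loop the paths are relative to the current Person -->
          <PersonName>
            <xsl:value-of select="concat(FirstName, ' ', MiddleInitial, '. ', LastName)"/>
          </PersonName>
        </Person>
      </xsl:for-each>
    </Persons>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the single-person document shown earlier, it yields the same PersonName value, Farrokh M. Alemi.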
Use the same technique to
transform the input file shown below:
<?xml version="1.0"?>
<CDC_REPORT>
<INCIDENT>
<CITY>New York City</CITY>
<STATE>New York</STATE>
<SEVERITY>High</SEVERITY>
<PEOPLE_QY>139</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
<FATALITIES>23</FATALITIES>
<FEMALE_ADULT>87</FEMALE_ADULT>
<MALE_ADULT>32</MALE_ADULT>
<CHILDREN>20</CHILDREN>
<TREATMENT>ANTIBIOTICS</TREATMENT>
<YEAR>2001</YEAR>
</INCIDENT>
</CDC_REPORT>
The output file should contain
an incident section made up of the <ID> and a name for the incident
made up of the concatenation of the city and the year values (see listing
below):
<?xml version="1.0" encoding="UTF-8"?>
<CDC_REPORT>
<REPORT>
<INCIDENT>
<ID>1</ID>
<NAME>New York City -- 2001</NAME>
</INCIDENT>
<LOCATION>
<ID>1</ID>
<CITY>New York City</CITY>
<STATE>New York</STATE>
</LOCATION>
<DESCRIPTION>
<ID>1</ID>
<YEAR>2001</YEAR>
<SEVERITY>High</SEVERITY>
<PEOPLE_QY>139</PEOPLE_QY>
<CONTAMINANT>Anthrax</CONTAMINANT>
</DESCRIPTION>
<STATISTICS>
<ID>1</ID>
<FATALITIES>23</FATALITIES>
<FEMALE_ADULT>87</FEMALE_ADULT>
<MALE_ADULT>32</MALE_ADULT>
<CHILDREN>20</CHILDREN>
</STATISTICS>
</REPORT>
</CDC_REPORT>
Submit your work by email to your instructor.
Appendix: Listing of XSLT Instructions

xsl:copy-of
  Syntax:
    <xsl:copy-of select = expression />
  Description: Emits the node-set corresponding to the select expression.

xsl:value-of
  Syntax:
    <xsl:value-of select = string-expression
      disable-output-escaping = "yes" | "no" />
  Description: Emits the string corresponding to the select expression.

xsl:if
  Syntax:
    <xsl:if test = boolean-expression>
      <!-- Content: template -->
    </xsl:if>
  Description: Evaluates the template if and only if the test expression evaluates to true.

xsl:choose
  Syntax:
    <xsl:choose>
      <!-- Content: (xsl:when+, xsl:otherwise?) -->
    </xsl:choose>
  Description: Evaluates the template from the first xsl:when clause whose test expression evaluates to true. If none of the test expressions evaluate to true, then the template contained in the xsl:otherwise clause is evaluated.

xsl:for-each
  Syntax:
    <xsl:for-each select = node-set-expression>
      <!-- Content: (xsl:sort*, template) -->
    </xsl:for-each>
  Description: Evaluates the template against each node in node-set returned by the select expression. The order of evaluation can be influenced using one or more xsl:sorts.

xsl:call-template
  Syntax:
    <xsl:call-template name = qname>
      <!-- Content: xsl:with-param* -->
    </xsl:call-template>
  Description: Invokes the template rule named by name.

xsl:variable
  Syntax:
    <xsl:variable name = qname select = expression>
      <!-- Content: template -->
    </xsl:variable>
  Description: Declares a variable named name and initializes it using the select expression or template.

xsl:text
  Syntax:
    <xsl:text disable-output-escaping = "yes" | "no">
      <!-- Content: #PCDATA -->
    </xsl:text>
  Description: Emits the text found in #PCDATA. Escaping of the five built-in entities is controlled using disable-output-escaping.

xsl:number
  Syntax:
    <xsl:number
      level = "single" | "multiple" | "any"
      count = pattern
      from = pattern
      value = number-expression
      format = { string }
      lang = { nmtoken }
      letter-value = { "alphabetic" | "traditional" }
      grouping-separator = { char }
      grouping-size = { number } />
  Description: Emits a number based on the XPath number expression found in value.

xsl:copy
  Syntax:
    <xsl:copy use-attribute-sets = qnames>
      <!-- Content: template -->
    </xsl:copy>
  Description: Copies the current context node (and associated namespace nodes) to the result tree fragment.

xsl:apply-templates
  Syntax:
    <xsl:apply-templates select = node-set-expression mode = qname>
      <!-- Content: (xsl:sort | xsl:with-param)* -->
    </xsl:apply-templates>
  Description: Invokes the best-match template rules against the node-set returned by the select expression.

xsl:apply-imports
  Syntax:
    <xsl:apply-imports />
  Description: Promotes the current stylesheet in import precedence.

xsl:message
  Syntax:
    <xsl:message terminate = "yes" | "no">
      <!-- Content: template -->
    </xsl:message>
  Description: Emits a message in a processor-dependent manner.

xsl:fallback
  Syntax:
    <xsl:fallback>
      <!-- Content: template -->
    </xsl:fallback>
  Description: Evaluates the template when the parent instruction/directive is not supported by the current processor.

xsl:comment
  Syntax:
    <xsl:comment>
      <!-- Content: template -->
    </xsl:comment>
  Description: Emits an XML comment containing the template as its character data.

xsl:processing-instruction
  Syntax:
    <xsl:processing-instruction name = { ncname }>
      <!-- Content: template -->
    </xsl:processing-instruction>
  Description: Emits an XML processing instruction whose [target] is name and whose [children] are based on template.

xsl:element
  Syntax:
    <xsl:element name = { qname } namespace = { uri-reference } use-attribute-sets = qnames>
      <!-- Content: template -->
    </xsl:element>
  Description: Emits an XML element whose [local name] is name, whose [namespace URI] is namespace, and whose [children] are based on template.

xsl:attribute
  Syntax:
    <xsl:attribute name = { qname } namespace = { uri-reference }>
      <!-- Content: template -->
    </xsl:attribute>
  Description: Emits an XML attribute whose [local name] is name, whose [namespace URI] is namespace, and whose [children] are based on template.

(From XSL Transformations: XSLT Alleviates XML Schema Incompatibility
Headaches, by Don Box, Aaron Skonnard, John
Lam. Modified excerpt from Essential
XML (Chapter 5), by Don Box, Aaron Skonnard, and John Lam © 2001
Addison Wesley Longman.)
Recommended
Reading
XSLT by Doug Tidwell, O'Reilly 2001, ISBN 0-596-00053-7
This page is part of the course on
Data Integration, the lecture on
XSLT
transformations.
This page was first prepared on 3/24/2004 and last revised on
10/22/2011.