How to: view XML as customized HTML with XSLT (you only need a browser)

salaminizer

I started modding recently and finally got to the point of editing the Tech XML file, which became easier with the help of a PHP script. However, checking every field is still messy. My first idea was to write a parser in Python that would generate HTML, which seems like a good idea until you remember that you've already worked with XSL, STILL work with it and WILL work with it for a long time! In the end I spent more time finding out how to parse XML in Python than creating the XSL stylesheet. :p

Objective: from an XML file, generate an HTML page containing the fields selected in the XSL stylesheet, so we can see our Civ4 XML files in a nice way in our browser :).
What do I need?: a browser, the input file and an XSL stylesheet file.
What do I need to KNOW?: you need to know HTML and you SHOULD know XSL and/or XPath. If you don't know XSL, you'll learn the basic commands in this tutorial, which will probably be enough for simple tasks. I hope that after reading this tutorial you'll think XSL is wonderful and will look forward to learning it :)

1. What is XSLT and XPath?

If you already know what they ARE, you can skip to section 2.

From W3Schools:

XSL stands for EXtensible Stylesheet Language.

The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based style sheet language.

XSLT stands for XSL Transformations

An XSL file is a bit like a CSS file in that it is attached to an XML file, but it goes further: it "intercepts" the XML and TRANSFORMS it, generating whatever content you want: plain text, another XML document, HTML, XHTML, etc. With XSLT we can reorder elements, filter them, modify attributes and content, and so on.

From the same W3Schools:

XPath is a language for finding information in an XML document. XPath is used to navigate through elements and attributes in an XML document.

XPath is a major element in the W3C's XSLT standard - and XQuery and XPointer are both built on XPath expressions.

So an understanding of XPath is fundamental to a lot of advanced XML usage.
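
To make the XPath idea a bit more concrete, here is a small sketch of a stylesheet that does nothing but evaluate a few XPath expressions against the tech file. It borrows the civ: prefix and the Civ4 element names that are explained further down in this tutorial, and the cost threshold of 100 is an arbitrary number picked only for illustration.

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<xsl:output method="text"/>

<xsl:template match="/">
	<!-- "//" searches at every depth, so this counts all TechInfo elements -->
	<xsl:text>Techs: </xsl:text>
	<xsl:value-of select="count(//civ:TechInfo)"/>
	<xsl:text>&#10;</xsl:text>

	<!-- a full path from the root down to the Type of the first tech -->
	<xsl:text>First: </xsl:text>
	<xsl:value-of select="/civ:Civ4TechInfos/civ:TechInfos/civ:TechInfo[1]/civ:Type"/>
	<xsl:text>&#10;</xsl:text>

	<!-- a predicate in [ ] filters nodes: techs costing more than 100 (arbitrary) -->
	<xsl:text>Expensive: </xsl:text>
	<xsl:value-of select="count(//civ:TechInfo[civ:iCost &gt; 100])"/>
</xsl:template>

</xsl:stylesheet>

The three select attributes are the XPath part; everything else is XSLT wrapped around them.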


2. The root of all documents

Every stylesheet begins with <xsl:stylesheet> or <xsl:transform>, which is what defines the document as a stylesheet.

We'll create a new xsl file named "techs.xsl" and it will look like this:

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

</xsl:stylesheet>

The version is 1.0 because that is the version of XSLT that browsers implement, and nothing in this tutorial needs more than that. I won't get into details about namespaces, but we need to add xmlns:xsl in order to use XSLT elements, attributes and features (the purpose of this tutorial is getting an HTML file from an XML file, not teaching XSLT; to do your own stuff after reading this, you won't need to know much about namespaces).

I said that XSLT "intercepts" XML. To do that, we have xsl:template elements, which match XML elements using XPath. Too complicated? Consider our CIV4TechInfos.xml:

Code:
<Civ4TechInfos xmlns="x-schema:CIV4TechnologiesSchema.xml">
	<TechInfos>
		<TechInfo>
			<Type>TECH_MILITARY_0</Type>
			<Description>TXT_KEY_TECH_MILITARY_0</Description>
			<Civilopedia>TXT_KEY_TECH_MILITARY_0_PEDIA</Civilopedia>
..
		</TechInfo>
	</TechInfos>
</Civ4TechInfos>

What we want to do is intercept the Civ4TechInfos tag and process the descendants we're interested in (TechInfos, TechInfo, Type, OrPreReqs, PrereqTech). Our templates must match these elements. NOTE: I will use a PREFIX in our matches, so that we don't have to touch the xmlns attribute in the XML file. Remember the namespaces? Think of a namespace as ownership, or a context. The xmlns attribute on Civ4TechInfos puts every element of the file in the "x-schema:CIV4TechnologiesSchema.xml" namespace, and the prefix is how our stylesheet says that the elements it matches belong to that same namespace.

It goes like this:

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<xsl:output method="html" indent="yes"/>

<xsl:template match="/">
<html>
<head>
</head>
<body>
<table border="1">
</table>
</body>
</html>
</xsl:template> 



</xsl:stylesheet>

Above we've added the xmlns:civ namespace, which is the same xmlns (XML namespace) that CIV4TechInfos.xml uses. So when our stylesheet refers to civ:TechInfos, the processor knows we're talking about the TechInfos element from that file. There's also xsl:output, which defines the type of output and a few options (indentation is one).
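
A quick way to convince yourself that the prefix matters is a sketch like the one below. It is only a test stylesheet, not part of the tutorial's final result: it asks for TechInfo elements both with and without the prefix and labels each one according to the template that caught it.

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<xsl:output method="html" indent="yes"/>

<xsl:template match="/">
<html>
<body>
<ul>
<xsl:apply-templates select="//TechInfo | //civ:TechInfo"/>
</ul>
</body>
</html>
</xsl:template>

<!-- never used for our file: TechInfo without a prefix means "in no namespace" -->
<xsl:template match="TechInfo">
	<li>no namespace: <xsl:value-of select="civ:Type"/></li>
</xsl:template>

<!-- used for every tech: same namespace URI as the xmlns in the XML file -->
<xsl:template match="civ:TechInfo">
	<li>civ namespace: <xsl:value-of select="civ:Type"/></li>
</xsl:template>

</xsl:stylesheet>

With the unmodified Civ4 file, every list item should come from the second template. The prefix name itself is arbitrary (civ is just what I picked); what has to be identical in both files is the namespace URI.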

We've also added an xsl:template matching "/", which is the document root (the node that contains the whole document). Our stylesheet now "holds" the whole document, which means we can start processing the first element of the XML file, which is Civ4TechInfos.

As you can see, the result of the XSL we have right now will be an empty table, because we "intercept" the whole document but don't process its descendants. We will now use the apply-templates command, which is again kinda self-explanatory: it takes the children of the current node one by one and finds an appropriate template for each. For the root, that means it will look for a template matching Civ4TechInfos. If there is no such template, a built-in rule moves on to that element's children, and any text that no template handles is written to the output as-is.
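
The behaviour for nodes without a template of their own comes from XSLT's built-in template rules. Written out by hand they would look roughly like the sketch below; you never have to type them, they are always there behind the scenes.

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- elements and the root: just keep walking down into the children -->
<xsl:template match="*|/">
	<xsl:apply-templates/>
</xsl:template>

<!-- text and attribute nodes: copy their value to the output -->
<xsl:template match="text()|@*">
	<xsl:value-of select="."/>
</xsl:template>

<!-- comments and processing instructions: output nothing -->
<xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>

If you'd rather have unhandled text dropped, an empty template matching text() overrides the second rule. In this tutorial we keep the built-in rules and simply add templates for the elements we care about.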

And that's what we'll do:

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<xsl:output method="html" indent="yes"/>

<xsl:template match="/">
<html>
<head>
</head>
<body>
<table border="1">
<tr><th>Type</th><th>Cost</th><th>Goody</th><th>OrPrereqs</th></tr>
<xsl:apply-templates />
</table>
</body>
</html>
</xsl:template> 

<xsl:template match="civ:TechInfos">
	<xsl:apply-templates>
		<xsl:sort select="civ:iCost" data-type="number"/>
	</xsl:apply-templates>
</xsl:template>

</xsl:stylesheet>

I've gone far enough to add the sort command, which sorts the children of TechInfos by iCost. This new template matches TechInfos and applies the appropriate template to each child, putting them in iCost order before processing them.
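
xsl:sort has a few more options that are handy here. The sketch below is a variation, not what the tutorial uses: it lists the most expensive techs first and breaks ties alphabetically by Type, writing plain text so the order is easy to check.

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<xsl:output method="text"/>

<xsl:template match="civ:TechInfos">
	<xsl:apply-templates select="civ:TechInfo">
		<!-- first key: most expensive first -->
		<xsl:sort select="civ:iCost" data-type="number" order="descending"/>
		<!-- second key: ties broken alphabetically by Type -->
		<xsl:sort select="civ:Type"/>
	</xsl:apply-templates>
</xsl:template>

<!-- one line per tech: the cost, a space, then the Type -->
<xsl:template match="civ:TechInfo">
	<xsl:value-of select="civ:iCost"/>
	<xsl:text> </xsl:text>
	<xsl:value-of select="civ:Type"/>
	<xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>

Note that the sort key needs the civ: prefix just like a match does, because it is an XPath expression evaluated on each TechInfo.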

3. The complete stylesheet

From now it's pretty straightforward so I'll complete our example before explaining it:

Code:
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:civ="x-schema:CIV4TechnologiesSchema.xml"
    version="1.0">

<xsl:output method="html" indent="yes"/>

<xsl:template match="/">
<html>
<head>
</head>
<body>
<table border="1">
	<tr><th>Type</th><th>Cost</th><th>Goody</th><th>OrPrereqs</th></tr>
	<xsl:apply-templates />
</table>
</body>
</html>
</xsl:template> 

<xsl:template match="civ:TechInfos">
	<xsl:apply-templates>
		<xsl:sort select="civ:iCost" data-type="number"/>
	</xsl:apply-templates>
</xsl:template>

<xsl:template match="civ:TechInfo">
	<tr>
	<xsl:apply-templates select="civ:Type"/>
	<xsl:apply-templates select="civ:iCost"/>
	<xsl:apply-templates select="civ:bGoodyTech"/>
	<xsl:apply-templates select="civ:OrPreReqs"/>
	</tr>
</xsl:template>

<xsl:template match="civ:Type">
	<td>
		<a>
			<xsl:attribute name="name">
				<xsl:value-of select="."/>
			</xsl:attribute>
			<b>
				<xsl:value-of select="."/>
			</b>
		</a>
	</td>
</xsl:template>

<xsl:template match="civ:iCost">
	<td>
		<xsl:value-of select="."/>
	</td>
</xsl:template>

<xsl:template match="civ:OrPreReqs">
	<td>
		<xsl:apply-templates />
	</td>
</xsl:template>

<xsl:template match="civ:PrereqTech">
	<a>
		<xsl:attribute name="href">
			<xsl:text>#</xsl:text><xsl:value-of select="."/>
		</xsl:attribute>
		<xsl:value-of select="."/>
	</a>
</xsl:template>

<xsl:template match="civ:bGoodyTech">
	<td><xsl:value-of select="."/></td>
</xsl:template>


</xsl:stylesheet>

After matching TechInfos, we applied templates to the sorted TechInfo elements (not TechInfos!). We defined the template that matches TechInfo, so every TechInfo is now processed by it. Inside it we apply templates to the elements we want to show on the screen: Type, iCost, bGoodyTech and OrPreReqs. All of them eventually use the xsl:value-of command to output the CONTENTS of the element (OrPreReqs does so through the PrereqTech template).

Type and OrPreReqs need special attention because we use a new command: xsl:attribute, which does exactly what it says: it adds an attribute to the element that encloses it. For every Type tag we've added a name attribute to the <a> element (an anchor), and for every PrereqTech we've added an href that links to the anchor of the required tech. Just because it needed special attention doesn't mean it's NOT obvious :p
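
XSLT also has a shorthand for this called an attribute value template: inside an attribute of an output element, anything between { and } is evaluated as XPath. As a sketch, the two templates could be written like this, which should give the same HTML as the xsl:attribute versions:

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<!-- shorthand for the civ:Type template: {.} becomes the element's text -->
<xsl:template match="civ:Type">
	<td><a name="{.}"><b><xsl:value-of select="."/></b></a></td>
</xsl:template>

<!-- shorthand for the civ:PrereqTech template: "#" followed by the tech name -->
<xsl:template match="civ:PrereqTech">
	<a href="#{.}"><xsl:value-of select="."/></a>
</xsl:template>

</xsl:stylesheet>

Only these two templates are shown; to try them, swap them in for the ones with the same match in the complete stylesheet. Which form you use is mostly a matter of taste. xsl:attribute is still the one to reach for when the attribute should only be added under some condition.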

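Now we just need to add one line to our XML file, right below the <?xml version="1.0"?> declaration, so that when we open it in our browser we'll see the resulting HTML. The href must be the file name of our stylesheet (techs.xsl), which should sit in the same folder as the XML file:

Code:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="techs.xsl"?>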
<!-- edited with XMLSpy v2005 rel. 3 U (http://www.altova.com) by Soren Johnson (Firaxis Games) -->
<!-- edited with XMLSPY v2004 rel. 2 U (http://www.xmlspy.com) by Jon Shafer (Firaxis Games) -->
<!-- Sid Meier's Civilization 4 -->
<!-- Copyright Firaxis Games 2005 -->
<!-- -->
<!-- Tech Infos -->

If you want to show other elements, all you need to do is add an xsl:apply-templates command selecting the element that you want and the appropriate template matching it (and remember the "civ:" prefix).
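
For example, to add a column for the Description element (it appears in the XML excerpt in section 2), the changes could look like this sketch. Only the two templates involved are shown; the rest of the stylesheet stays as it is.

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<!-- the TechInfo row from the complete stylesheet, with one extra cell at the end -->
<xsl:template match="civ:TechInfo">
	<tr>
	<xsl:apply-templates select="civ:Type"/>
	<xsl:apply-templates select="civ:iCost"/>
	<xsl:apply-templates select="civ:bGoodyTech"/>
	<xsl:apply-templates select="civ:OrPreReqs"/>
	<xsl:apply-templates select="civ:Description"/>
	</tr>
</xsl:template>

<!-- the new template that turns Description into a table cell -->
<xsl:template match="civ:Description">
	<td><xsl:value-of select="."/></td>
</xsl:template>

</xsl:stylesheet>

Don't forget to add a matching <th>Description</th> to the header row in the "/" template, or the columns won't line up with their titles.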

And that's it. We can also do more complex stuff, such as previewing iGridX and iGridY by outputting the techs to a table (which I will try to do later). And of course we're not restricted to Tech files :p This will work with every XML file out there, and with a bit of work you can link between different XML files. I'm looking forward to playing around with the text files too.
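
As a very rough sketch of the grid idea, here is one possible approach. Note that it swaps the table for absolutely positioned boxes, it assumes every TechInfo has numeric iGridX and iGridY children, and the 220 and 30 pixel factors are arbitrary values you would have to tune.

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<xsl:output method="html" indent="yes"/>

<xsl:template match="/">
<html>
<body>
<xsl:apply-templates select="//civ:TechInfo"/>
</body>
</html>
</xsl:template>

<!-- one box per tech, placed from its grid position (220 and 30 are arbitrary) -->
<xsl:template match="civ:TechInfo">
	<div style="position:absolute; left:{civ:iGridX * 220}px; top:{civ:iGridY * 30}px; border:1px solid black">
		<xsl:value-of select="civ:Type"/>
	</div>
</xsl:template>

</xsl:stylesheet>

The { } parts in the style attribute are XPath expressions that get evaluated for each tech, so each box ends up with its own left and top values.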

One thing that you might have noticed: XSLT is a FUNCTIONAL language. A stylesheet works as a pipeline: it receives the XML document and outputs another document, without changing any state along the way. Variables are immutable once set, and there's a for-each command which works much like apply-templates. Some processors offer extensions that give you mutable variables, but natively XSLT is meant to be fully functional.
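
Here is a small sketch of those two features together, xsl:variable and xsl:for-each. The variable name techs is my own choice; the output is a simple list instead of the tutorial's table.

Code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:civ="x-schema:CIV4TechnologiesSchema.xml">

<xsl:output method="html" indent="yes"/>

<xsl:template match="/">
	<!-- bound once; it cannot be reassigned afterwards -->
	<xsl:variable name="techs" select="//civ:TechInfo"/>
	<html>
	<body>
	<p>Total techs: <xsl:value-of select="count($techs)"/></p>
	<ul>
	<!-- for-each handles the nodes inline instead of dispatching to other templates -->
	<xsl:for-each select="$techs">
		<xsl:sort select="civ:iCost" data-type="number"/>
		<li><xsl:value-of select="civ:Type"/> (<xsl:value-of select="civ:iCost"/>)</li>
	</xsl:for-each>
	</ul>
	</body>
	</html>
</xsl:template>

</xsl:stylesheet>

For something this small for-each is shorter; separate templates pay off once the output for each element gets more complicated, as in the table above.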

And that's it. I didn't read everything I wrote so there might be some errors.

Note: instead of using the civ: prefix, we could have removed the xmlns attribute from CIV4TechInfos.xml and matched the plain element names. As we're working on a copy of CIV4TechInfos.xml that already has the xml-stylesheet line added, that shouldn't be a problem. I haven't tested whether the game loads a CIV4TechInfos.xml that contains the xml-stylesheet line, so *maybe* it works perfectly with Civ.

Below there's a reference of terms and commands you will see again if you decide to learn XPath and XSL.

Reference of XPath terms used:
Root/Document
Descendant
Preceding

Reference of XSL commands used:

Code:
<xsl:stylesheet>
<xsl:template>
<xsl:output>
<xsl:value-of>
<xsl:apply-templates>
<xsl:sort>
<xsl:attribute>
<xsl:text>

And a final note: originally I did this using the open-source version of SAXON, an XML/XSLT processor. There are many processors out there, which means you're not restricted to a browser. However, SAXON is run from the command line, so I thought it would be much easier to use your favourite browser, and you can view your results from any PC with a "modern" browser too :)