JavaML, XSLT, and XSLTC
Greg J. Badros
gregb@go2net.com
8 August 2000

Overview
JavaML
A complementary XML-based representation for Java source code
XSLT
XSL Transformations, the transformation part of the eXtensible Stylesheet Language (XSL)
XSLTC
An XSLT to Java byte code compiler

Conceiving JavaML
Key observations
XML represents arbitrary graphs reasonably well (and represents trees exceptionally well)
Program source code gets represented internally as a tree—the Abstract Syntax Tree (AST)
Obvious conclusion:
We can use XML to represent programs

Two questions
So what? (i.e., Who cares?)
Why Java?

Lexical tools stink
% grep FirstApplet  < FirstApplet.java
public class FirstApplet extends Applet {
g.drawString("FirstApplet", 25, 50);

Benefits of XML
XML lets us reason unambiguously about program-level concepts
(instead of just about syntactic constructs)
Consider casts:
(double) x              vs.
<cast-expr>
  <type name="double"/>
  <var-ref name="x"/>
</cast-expr>
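
As a rough illustration of querying program-level concepts rather than text, the sketch below finds every variable that is cast to a given type. The `<exprs>` wrapper element and the second cast are invented for the example; only `cast-expr`, `type`, and `var-ref` come from the slide, and the real JavaML DTD may nest them differently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CastQuery {
    // Hypothetical fragment: the <exprs> wrapper and the second cast are assumptions.
    static final String JAVAML =
        "<exprs>"
      + "<cast-expr><type name=\"double\"/><var-ref name=\"x\"/></cast-expr>"
      + "<cast-expr><type name=\"int\"/><var-ref name=\"y\"/></cast-expr>"
      + "</exprs>";

    /** Returns the names of variables cast to typeName (assumed to be a plain identifier). */
    static List<String> varsCastTo(String typeName) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                "//cast-expr[type/@name='" + typeName + "']/var-ref/@name",
                new InputSource(new StringReader(JAVAML)),
                XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(hits.item(i).getNodeValue());
            }
            return names;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(varsCastTo("double")); // prints [x]
    }
}
```

Unlike grep, the query cannot be fooled by the text "(double)" appearing inside a comment or string literal, because those are different node types in the tree.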

The Big Win
Leverage and build on the existing expertise, tools, technologies, and experience of the software engineering, languages, database, web, and document communities.
(Instead of having to re-invent lots of wheels.)

JavaML Example
<java-class-file name="FirstApplet.java">
    <import module="java.applet.*"/>
    <import module="java.awt.*"/>
    <class name="FirstApplet" visibility="public">
       <superclass name="Applet"/>
       <method name="paint" visibility="public">
          <type name="void" primitive="true"/>
          <formal-args>
             <formal-arg name="g" id="frmarg-13">
                <type name="Graphics"/></formal-arg>
          </formal-args>
……

Implementation
JavaML Document Type Definition (DTD)
Robust, complete implementation using Jikes Java compiler
(open-sourced compiler originally from IBM)
XSLT stylesheets to view JavaML as classical source code, or as pretty-printed HTML
Example queries, transformations, tools

For more information…
JavaML home page:
http://www.go2net.com/people/gregb/JavaML
WWW9 paper: “JavaML: A Markup Language for Java Source Code”, Amsterdam, May 2000.
SeaJUG talk, Wednesday August 16th

XML Tools
ltxml – sggrep, sgcount, sgrpg
Perl – XML::Parser::PerlSAX, XML::DOM
perlSGML – dtd2html, dtddiff, dtdtree
xmldiff, XML Notepad, XML Spy
Java
XP, Xerces [xml4j] (XML Parsers)
XT, Saxon, LotusXSL, Xalan (XSLT Interpreters)

XSL: eXtensible Stylesheet Language
XSL FO (Formatting Objects)
W3C working draft, 27 March 2000
XSLT (Transformations)
W3C recommendation, 16 November 1999

XSL FO
Mostly interesting if/when web browsers support XSL FO
Until then, interesting for page layout applications

XSLT
Expressive, largely declarative language for specifying transformations from XML to other formats
Output can be XML (e.g., XHTML, other XML instances), HTML, plain text, etc.
Client-side (in-browser) support is still limited, but useful server-side and for batch transformations

Null XSLT Style Sheet
Also an XML application
Uses the xsl: namespace

Hello world stylesheet
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:text>Hello world!</xsl:text>
</xsl:template>
</xsl:stylesheet>
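
One way to try this stylesheet is through the standard JAXP API (`javax.xml.transform`) that ships with the JDK. This is a minimal sketch: the class name `HelloXslt` and the dummy input document are assumptions, and any XSLT 1.0 processor would do equally well.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class HelloXslt {
    // The Hello world stylesheet from the slide, as a string.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\"><xsl:text>Hello world!</xsl:text></xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to inputXml and returns the result as a string. */
    static String run(String inputXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The template matches the document root, so the input content is irrelevant.
        System.out.println(run("<anything/>")); // prints Hello world!
    }
}
```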

Structure of XSLT stylesheet
xsl:stylesheet defines namespaces, version
Imports, inclusions, output declarations, parameter and variable declarations
Multiple template rules

Literal result stylesheets
<html xsl:version="1.0"
xmlns:xsl="http://www…">
  <body>Hello world</body>
</html>

input.xml
<?xml version="1.0" encoding="UTF-8"?>
<person>
  <name>Greg</name>
  <email username="gregb"
     domain="go2net.com"/>
  <employer>Go2Net</employer>
</person>

Classic substitution style templates (a la JSP, GSP)
<html xsl:version="1.0"
          xmlns:xsl="http://www…">
<body>
    <xsl:value-of select="person/name"/>,
    <xsl:value-of select="person/email/@username"/>
    @
    <xsl:value-of select="person/email/@domain"/>
    (<xsl:value-of select="person/employer"/>)  </body>
</html>
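
The sketch below runs a compact version of this substitution-style stylesheet against input.xml using JAXP. It assumes the XSLT namespace URI from the Hello world stylesheet, and it tightens the whitespace between the `xsl:value-of` elements so the output is easy to read; the class name is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SubstitutionTemplate {
    // Literal result element used as a stylesheet; whitespace condensed (an assumption).
    static final String STYLESHEET =
        "<html xsl:version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<body>"
      + "<xsl:value-of select=\"person/name\"/>, "
      + "<xsl:value-of select=\"person/email/@username\"/>@"
      + "<xsl:value-of select=\"person/email/@domain\"/>"
      + " (<xsl:value-of select=\"person/employer\"/>)"
      + "</body></html>";

    static final String INPUT =
        "<person><name>Greg</name>"
      + "<email username=\"gregb\" domain=\"go2net.com\"/>"
      + "<employer>Go2Net</employer></person>";

    /** Fills the template from inputXml and returns the generated markup. */
    static String render(String inputXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setOutputProperty(OutputKeys.INDENT, "no"); // keep the text on one line
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(INPUT));
    }
}
```

The body of the result should contain the text `Greg, gregb@go2net.com (Go2Net)`.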

XPath expressions
XML Path language, Version 1.0
W3C Recommendation 16 Nov 1999
Used by both XSLT and XPointer
Domain-specific plain-text language for describing sets of nodes
XPath expressions may also evaluate to booleans, numbers, and strings
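
To make the four result types concrete, here is a small sketch using the JDK's `javax.xml.xpath` API against the input.xml document. The class and method names are hypothetical; the expressions use only core XPath 1.0 functions.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathTypes {
    static final String INPUT =
        "<person><name>Greg</name>"
      + "<email username=\"gregb\" domain=\"go2net.com\"/>"
      + "<employer>Go2Net</employer></person>";

    /** Evaluates expr against INPUT and converts the result to the requested XPath type. */
    static Object eval(String expr, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(INPUT)), type);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Number: how many child elements does person have?
        System.out.println(eval("count(/person/*)", XPathConstants.NUMBER));
        // Boolean: does the email domain equal a given string?
        System.out.println(eval("/person/email/@domain = 'go2net.com'", XPathConstants.BOOLEAN));
        // String: build an address from two attributes.
        System.out.println(eval(
            "concat(/person/email/@username, '@', /person/email/@domain)",
            XPathConstants.STRING));
    }
}
```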

Location paths
Most common type of XPath expression
Like file paths, can be absolute (starting with “/”) or relative, and can include multiple “steps” separated by “/”
person/name contains 2 steps,
person/email/@username contains 3

Location path steps
Made up of three parts
Axis specifier, followed by ::
Node test (the name or type of the node)
0 or more predicates enclosed in [ ]
E.g.,
attribute::*
ancestor-or-self::text()
child::para[position()=last()]

Pretty verbose, so there’s shorthand…
attribute:: is abbreviated @, e.g.,
attribute::*  becomes  @*
child:: axis specifier can be omitted
inside [ ], a numeric predicate is implicitly compared with position()
e.g.,
child::para[position()=last()]   becomes
para[last()]
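
A quick way to convince yourself that the long and short forms are equivalent is to evaluate both against the same document. The sketch below does that with `javax.xml.xpath`; the `<doc>`/`<para>` test document and its `id` attributes are made up for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class AbbreviatedPaths {
    // Hypothetical test document.
    static final String DOC =
        "<doc><para id=\"a\">one</para><para id=\"b\">two</para><para id=\"c\">three</para></doc>";

    /** Evaluates expr against DOC and returns the string value of the result. */
    static String eval(String expr) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(DOC)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Unabbreviated and abbreviated forms select the same last para.
        System.out.println(eval("/child::doc/child::para[position()=last()]"));
        System.out.println(eval("/doc/para[last()]"));
        // attribute::* versus @* on the second para.
        System.out.println(eval("/child::doc/child::para[position()=2]/attribute::*"));
        System.out.println(eval("/doc/para[2]/@*"));
    }
}
```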

XPath expressions
Very powerful…
Darn complex…
Often subtle semantics…
Tip: Learn XPath well, and learn it first (before the rest of XSLT)

XSLT templates
A stylesheet is just a list of template rules for matching against the source tree, and substituting partial results
<xsl:template match="…">
    …substitution here…
</xsl:template>
match attribute is a pattern expression (subset of legal XPath expressions)

Stylesheets are declarative
Template rules say what the transformations should be
Rather than how the transformation will be accomplished
Processing model is described in the technical reports

Conflict resolution
Input nodes can match multiple template rules
Imported rules have lower precedence than local rules
Lower priority rules have lower precedence
Priority roughly corresponds to specificity of the matching pattern
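
The following sketch shows the default priorities at work. Three rules can match a `name` element: `*` (default priority -0.5), `name` (0), and `person/name` (0.5), and the most specific one should win. The stylesheet and the bracketed marker strings are invented for the demonstration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ConflictResolution {
    // Hypothetical stylesheet: each rule emits a marker naming itself.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"*\">[generic <xsl:value-of select=\"name()\"/>]"
      +   "<xsl:apply-templates select=\"*\"/></xsl:template>"
      + "<xsl:template match=\"name\">[name]</xsl:template>"
      + "<xsl:template match=\"person/name\">[person/name]</xsl:template>"
      + "</xsl:stylesheet>";

    /** Returns the sequence of markers showing which rule fired for each element. */
    static String trace(String inputXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trace(
            "<person><name>Greg</name><employer>Go2Net</employer></person>"));
    }
}
```

For the sample input, `person` and `employer` fall through to the generic rule, while `name` is handled by the `person/name` rule.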

Identity transformation
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

<xsl:apply-templates …>
Instruction to recursively process all children of the current node
select attribute chooses which nodes to process (default: all children of the current node)

input.xml (revisited)
<?xml version="1.0" encoding="UTF-8"?>
<person>
  <name>Greg</name>
  <email username="gregb"
     domain="go2net.com"/>
  <employer>Go2Net</employer>
</person>

Stripping email elements
Identity transform + use more specific rule for email elements:
<xsl:template match="email"/>
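
Here is a runnable sketch of that combination using JAXP. It uses the common `@*|node()` form of the identity rule, which also copies comments and processing instructions; the empty `email` rule has a higher default priority, so `email` elements produce no output. The class name and the compact input string are assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StripEmail {
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output omit-xml-declaration=\"yes\"/>"
      // Identity rule: copy every node and recurse.
      + "<xsl:template match=\"@*|node()\"><xsl:copy>"
      +   "<xsl:apply-templates select=\"@*|node()\"/>"
      + "</xsl:copy></xsl:template>"
      // More specific rule: email elements are matched and produce nothing.
      + "<xsl:template match=\"email\"/>"
      + "</xsl:stylesheet>";

    static final String INPUT =
        "<person><name>Greg</name>"
      + "<email username=\"gregb\" domain=\"go2net.com\"/>"
      + "<employer>Go2Net</employer></person>";

    /** Copies inputXml, dropping every email element. */
    static String strip(String inputXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(strip(INPUT));
    }
}
```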

JavaML and XSLT
I use XSLT to convert from JavaML to:
Ordinary Java source code
(Plain Old Source Representation, POSR)
HTML pretty-printed view, with index, indentation, colorization
Much easier than doing the conversions using DOM and Perl

Browsing
HTML Pretty-printed
using XSLT
(displayed in IE 5)

<xsl:if …>
No else clause—if you need that, use choose, when, and otherwise
Example:
<xsl:if test="@final">
  <em><font color="{$clr-final}">
    <xsl:text>final </xsl:text>
  </font></em>
</xsl:if>

<xsl:choose …>
<xsl:choose>
  <xsl:when test="@final">
    <em><font color="{$clr-final}">
      <xsl:text>final </xsl:text></font></em>
  </xsl:when>
  <xsl:otherwise>
    <xsl:text>not-final</xsl:text>
  </xsl:otherwise>
</xsl:choose>
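
The sketch below wraps this `xsl:choose` in a complete stylesheet so both branches can be exercised. The `field` element, the value `'red'` for `$clr-final`, and the class name are assumptions made for the example; the slide does not say where `$clr-final` is defined.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FinalMarker {
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      // Assumed value; {$clr-final} in the attribute below is an attribute value template.
      + "<xsl:variable name=\"clr-final\" select=\"'red'\"/>"
      + "<xsl:template match=\"field\">"
      +   "<xsl:choose>"
      +     "<xsl:when test=\"@final\">"
      +       "<em><font color=\"{$clr-final}\"><xsl:text>final </xsl:text></font></em>"
      +     "</xsl:when>"
      +     "<xsl:otherwise><xsl:text>not-final</xsl:text></xsl:otherwise>"
      +   "</xsl:choose>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Renders a single hypothetical field element. */
    static String render(String fieldXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(fieldXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("<field name=\"MAX\" final=\"true\"/>")); // when branch
        System.out.println(render("<field name=\"count\"/>"));              // otherwise branch
    }
}
```

Note that `test="@final"` is true whenever the attribute is present, whatever its value.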

Templates can be subroutines
<xsl:template name="semicolon-nl">
  <xsl:text>;</xsl:text>
  <xsl:call-template name="newline"/>
</xsl:template>
Use the name attribute of xsl:template to define the subroutine, and xsl:call-template to invoke it
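
A complete, runnable version might look like the sketch below. The `newline` template is not shown on the slide, so its definition here is a guess; the `stmts`/`stmt` input elements are also invented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class NamedTemplates {
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"text\"/>"
      // Assumed definition of the newline subroutine: emits one line feed.
      + "<xsl:template name=\"newline\"><xsl:text>&#10;</xsl:text></xsl:template>"
      // Subroutine from the slide: a semicolon followed by a newline.
      + "<xsl:template name=\"semicolon-nl\">"
      +   "<xsl:text>;</xsl:text><xsl:call-template name=\"newline\"/>"
      + "</xsl:template>"
      // Hypothetical caller: print each statement, then terminate it.
      + "<xsl:template match=\"stmt\">"
      +   "<xsl:value-of select=\".\"/><xsl:call-template name=\"semicolon-nl\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Emits each stmt element's text followed by ";" and a line feed. */
    static String emit(String inputXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(emit("<stmts><stmt>x = 1</stmt><stmt>y = 2</stmt></stmts>"));
    }
}
```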

<xsl:for-each …>
Iteration is available:

<xsl:for-each select="implement">
  <xsl:value-of select="@interface"/>
  <xsl:if test="not(position()=last())">
    <xsl:text>, </xsl:text>
  </xsl:if>
</xsl:for-each>
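
To see the comma-separation idiom in action, the sketch below applies this loop to a small class element. The enclosing `class` element and the two interface names are assumptions; only `implement` and its `interface` attribute appear on the slide.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class InterfaceList {
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"class\">"
      +   "<xsl:for-each select=\"implement\">"
      +     "<xsl:value-of select=\"@interface\"/>"
      // Inside for-each, position() and last() refer to the selected node list.
      +     "<xsl:if test=\"not(position()=last())\"><xsl:text>, </xsl:text></xsl:if>"
      +   "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Returns the implemented interfaces of a hypothetical class element, comma-separated. */
    static String list(String classXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(classXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(list(
            "<class name=\"Foo\">"
          + "<implement interface=\"Runnable\"/>"
          + "<implement interface=\"Serializable\"/>"
          + "</class>")); // prints Runnable, Serializable
    }
}
```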

Learning more about XSLT
See my JavaML distribution:
http://www.go2net.com/people/gregb/JavaML
The page also contains a bunch of useful links
Book recommendations listed in index at: http://www.go2net.com/people/gregb/

Performance comparison

LotusXSL Profile

XSLTC
Work by Jacek Ambroziak, et al., at Sun
Compiles an XSL stylesheet into a .class file, called a translet
Partial evaluation of an interpreter w.r.t. a stylesheet
Eliminates XSL parse time, other optimizations possible
Incomplete implementation still, or….
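
The compile-once, run-many idea can be sketched with the JAXP `Templates` interface, which holds a processed stylesheet that can be reused across transformations. This is not XSLTC's own command-line workflow; whether the underlying factory actually compiles to byte code depends on the JAXP implementation in use, which this code does not check. The stylesheet and class name are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CompiledStylesheet {
    // Hypothetical stylesheet: prints the person's name as plain text.
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\"><xsl:value-of select=\"person/name\"/></xsl:template>"
      + "</xsl:stylesheet>";

    // Parsed and processed once; Templates objects are safe to share between threads.
    static final Templates COMPILED = compile(XSL);

    static Templates compile(String xsl) {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Runs the precompiled stylesheet; no stylesheet parsing happens here. */
    static String apply(String inputXml) {
        try {
            StringWriter out = new StringWriter();
            COMPILED.newTransformer().transform(
                new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("<person><name>Greg</name></person>"));
        System.out.println(apply("<person><name>Jacek</name></person>"));
    }
}
```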

A full implementation of XSLT-lite?
Some features of XSLT are especially hard to implement efficiently
Argument: better to disallow wholly inefficient things than to make performance suffer dramatically (without warning)
Best case though: expressiveness + detailed understanding of performance implications