JavaML: A Markup Language for Java Source Code
Copyright (C) 2000 Greg J. Badros
Distributed under the LGPL
See: http://www.cs.washington.edu/homes/gjb/papers/javaml/javaml.html

Abstract

The classical plain-text representation of source code is convenient
for programmers but requires parsing to uncover the deep structure of
the program.  While sophisticated software tools parse source code to
gain access to the program's structure, many lightweight programming
aids such as grep rely instead on only the lexical structure of source
code.  I describe a new XML application that provides an alternative
representation of Java source code.  This XML-based representation,
called JavaML, is more natural for tools and permits easy
specification of numerous software-engineering analyses by leveraging
the abundance of XML tools and techniques.  A robust converter built
with the IBM Jikes Java compiler framework translates from the
classical Java source code representation to JavaML.

(Installation instructions are below)

I've added an XML unparser to Jikes to support the DTD (document type
definition) described in java-ml.dtd.  You need to configure with
--enable-debug:

  ./configure --enable-debug

and then pass the +ux option to the resulting binary.

To get started, try:

  export CLASSPATH=/usr/lib/netscape/java/classes/java40.jar:$CLASSPATH
  ../jikes/src/jikes +ux FirstApplet.java

Then use nsgmls from James Clark's Jade package
(see http://www.jclark.com/jade/) to do:

  % nsgmls-xml xml-unparsed/FirstApplet.java.xml

to check the output against java-ml.dtd.

nsgmls-xml is just:

  #!/bin/sh -
  nsgmls -c/usr/doc/jade-1.2.1/pubtext/xml.soc -wxml "$@"

------------
Files included:

README-JAVAML             this file
ChangeLog-JavaML          list of JavaML-relevant changes I've made to Jikes
gjb-xml-unparse-*.patch   the patch to the latest Jikes CVS source code
xml-unparse.cpp           the one new file I added to the Jikes source
java-ml.dtd               the document type definition for JavaML
jikes-xml                 simple script to execute Jikes w/ options to
                          convert to JavaML
javaml-to-plain-source    (wrapper around javaml-to-plain-source.xsl)
                          convert JavaML representation back into a plain
                          Java source file
javaml-to-html            (wrapper around javaml-to-html.xsl)
                          convert JavaML representation into an indexed,
                          pretty-printed HTML file
javaml-extract-comments   (wrapper around javaml-extract-comments.xsl)
                          display all the comments in a JavaML file
javaml-list-methods       example Perl script using DOM to list methods in
                          a JavaML file
javaml-instrument         example Perl script using DOM to instrument
                          JavaML code
results-to-plain-source   convert JavaML query results back into plain
                          Java source code

------------
Install instructions:

  ftp.cs.washington.edu:/homes/gjb/code/jikes-gjb-xml-unparse-*.tar.gz

Just grab the file, then do:

  cvs -d :pserver:anoncvs@CVS.Sourcery.Org:/cvs/jikes co jikes  # if not already present
  cd jikes/src
  tar xvzf ...../jikes-latest/src/jikes-gjb-xml-unparse-*.tar.gz
  patch < gjb-xml-unparse-*-against-1.11.patch
  ./configure --enable-debug
  make all

Or you can use Jikes-1.11:

  tar xvzf jikes-1.11.tar.gz
  cd jikes/src
  tar xvzf ...../jikes-latest/src/jikes-gjb-xml-unparse-*.tar.gz
  patch < gjb-xml-unparse-*-against-1.11.patch
  ./configure --enable-debug
  make all

The draft of the paper describing
this is at:

  http://www.cs.washington.edu/homes/gjb/papers/javaml/javaml.html

Here's a small example to give you the basic gist of what XML
unparsing results in.  Consider the following file, FirstApplet.java:

  import java.applet.*;   // do not forget this import statement!
  import java.awt.*;      // Or this one for the graphics!

  public class FirstApplet extends Applet {
    // this method displays the applet.
    // the Graphics class is how you do all the drawing in Java
    public void paint(Graphics g) {
      g.drawString("Hello World", 25, 50);
    }
  }

When you run:

  ./jikes +ux FirstApplet.java

not only does FirstApplet.class get created, but the new +ux option
tells jikes to create xml-unparsed/FirstApplet.java.xml.  That file
looks like:

This then lets you do all sorts of nifty static software-engineering
analyses and transformations using all of the XML and SGML tools out
there.  See the paper for more details.

You can add an edited version of the following line to a CATALOG file
(listed in your SGML_CATALOG environment variable) to have various
tools find the DTD more easily (e.g., nsgmls, Jade):

  SYSTEM "java-ml.dtd" "/path/to/java-ml/java-ml.dtd"

Enjoy!  Comments and feedback are appreciated!

Greg J. Badros
gjb@cs.washington.edu
Seattle, WA  USA
http://www.cs.washington.edu/homes/gjb
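
------------
Example: listing methods with DOM (sketch)

To make the "XML tools" point above concrete, here is a minimal Java
sketch in the spirit of javaml-list-methods.  It is not a port of that
Perl script.  It assumes that JavaML represents each method as a
"method" element carrying a "name" attribute; check java-ml.dtd before
relying on that.  The class name JavaMLMethodLister and its helper are
hypothetical names chosen for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the JavaML distribution.
class JavaMLMethodLister {

    // Return the "name" attribute of every <method> element, in document order.
    // ASSUMPTION: the element and attribute names match java-ml.dtd.
    static List<String> listMethodNames(InputStream javaml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve external entities (such as java-ml.dtd) to empty content,
            // so the sketch does not need the DTD next to the input file.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(javaml);

            List<String> names = new ArrayList<>();
            NodeList methods = doc.getElementsByTagName("method");
            for (int i = 0; i < methods.getLength(); i++) {
                names.add(((Element) methods.item(i)).getAttribute("name"));
            }
            return names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse input as XML", e);
        }
    }

    // Usage: java JavaMLMethodLister xml-unparsed/FirstApplet.java.xml
    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            for (String name : listMethodNames(in)) {
                System.out.println(name);
            }
        }
    }
}
```

The sketch skips validation on purpose; use nsgmls-xml as described
earlier when you want the output checked against java-ml.dtd.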